Summary
Add the AZ AIOps Dashboard as an optional observability component of the AI Landing Zone. It is an Azure Monitor Workbook that provides a single-pane-of-glass view of the entire Azure AI/Cognitive Services estate — no agents, no code, no external dependencies.
Source repo: https://github.com/krishna-sunkavalli/az-aiops-dashboard
Problem / Motivation
The AI Landing Zone provides a solid foundation for deploying AI workloads, but there is no built-in operational visibility layer. Once deployed, platform and app teams have limited means to:
- Monitor request volumes, error rates (5xx), and throttling (429) across Azure OpenAI and Models API endpoints
- Inventory all Cognitive Services resources and verify Diagnostic Settings are configured
- Identify which models are being used and at what volume
- Understand the geographic distribution of AI resources
Without this, teams manually stitch together Azure Monitor metrics and write ad hoc KQL queries — adding friction during incidents and day-to-day operations.
Proposed Solution
Add the AZ AIOps Dashboard workbook as an optional observability add-on. It is already packaged as:
- A standalone
.workbook file
- An ARM template (
azuredeploy.json) for one-click deployment
- Deployable via Azure CLI, Azure PowerShell, or manual import
What the workbook shows (9 tabs)
| Area |
Key Visuals |
| Overview |
Request volume split (AOAI vs Models API), resource map by region, status code trend lines, model-level request bar chart, resource inventory with Diagnostic Settings status |
| Request Charts |
HTTP status code breakdowns — 5xx errors and 429 throttling per API type |
| Model Usage |
Per-model request volumes, top models by usage |
| Availability |
Resource health and availability trends |
| + more |
Token usage, latency, content safety (on roadmap) |
Filter Parameters
| Parameter |
Default |
| Subscription |
All |
| Resource Group |
All |
| Cognitive Services Resource |
All |
| Log Analytics Workspace |
All |
| Time Range |
Last 24 hours |
Prerequisites
- Azure Reader role on Cognitive Services resources
- Log Analytics Workspace — already provisioned by the AI Landing Zone
Why it fits the AI Landing Zone
- WAF Operational Excellence — Observability is a core WAF pillar; the AI LZ already references WAF AI workload guidance.
- Zero additional infrastructure — Consumes the Log Analytics Workspace already provisioned by the landing zone.
- CAF alignment — Directly supports the monitoring and management design areas of the AI LZ Design Checklist.
- No new dependencies — Pure Azure Monitor Workbook with no code changes to existing IaC required.
- One-click deployment — ARM template is ready for an optional Deploy to Azure button.
Implementation Proposal
- Add workbook source and ARM template under
observability/aiops-dashboard/
- Add an Observability section to the README with a Deploy to Azure button
- Optionally surface as an optional module in the Bicep/Terraform implementations
Happy to contribute the implementation — PR incoming.
References
Summary
Add the AZ AIOps Dashboard as an optional observability component of the AI Landing Zone. It is an Azure Monitor Workbook that provides a single-pane-of-glass view of the entire Azure AI/Cognitive Services estate — no agents, no code, no external dependencies.
Source repo: https://github.com/krishna-sunkavalli/az-aiops-dashboard
Problem / Motivation
The AI Landing Zone provides a solid foundation for deploying AI workloads, but there is no built-in operational visibility layer. Once deployed, platform and app teams have limited means to:
Without this, teams manually stitch together Azure Monitor metrics and write ad hoc KQL queries — adding friction during incidents and day-to-day operations.
Proposed Solution
Add the AZ AIOps Dashboard workbook as an optional observability add-on. It is already packaged as:
.workbookfileazuredeploy.json) for one-click deploymentWhat the workbook shows (9 tabs)
Filter Parameters
Prerequisites
Why it fits the AI Landing Zone
Implementation Proposal
observability/aiops-dashboard/Happy to contribute the implementation — PR incoming.
References