Monitoring: Azure Monitor, Log Analytics & Alerts
Enterprise monitoring for data engineering with Azure Monitor, Log Analytics, alerts, and workbooks
Monitoring Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                       MONITORING ARCHITECTURE                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DATA SOURCES             COLLECTION           ANALYSIS             │
│  ┌────────────┐           ┌──────────────┐     ┌──────────────┐     │
│  │ Synapse    │──────────>│ Diagnostic   │────>│ Log Analytics│     │
│  │ Databricks │           │ Settings     │     │ Workspace    │     │
│  │ ADF        │──────────>│              │     │              │     │
│  │ ADLS Gen2  │           │ Send to:     │     │ KQL Queries  │     │
│  └────────────┘           │ • Log Analyt.│     │ Dashboards   │     │
│                           │ • Storage    │     │              │     │
│  ┌────────────┐           │ • Event Hub  │     └──────┬───────┘     │
│  │ Custom     │──────────>│              │            │             │
│  │ Metrics    │           └──────────────┘            │             │
│  └────────────┘                                       │             │
│                                                       ▼             │
│  VISUALIZATION            ALERTING             AUTOMATION           │
│  ┌──────────────┐         ┌──────────────┐     ┌──────────────┐     │
│  │ Azure        │<───────>│ Azure        │────>│ Logic Apps   │     │
│  │ Monitor      │         │ Alerts       │     │ (Remediation)│     │
│  │ Workbooks    │         │              │     │              │     │
│  │              │         │ • Metric     │     └──────────────┘     │
│  │ Dashboards   │         │ • Log        │                          │
│  └──────────────┘         │ • Activity   │                          │
│                           └──────────────┘                          │
└─────────────────────────────────────────────────────────────────────┘
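The collection step in the diagram is configured per resource through a diagnostic setting. The sketch below builds a request body shaped like one, routing ADF log categories and all metrics to a Log Analytics workspace. The helper name `build_diagnostic_setting` and the workspace name `law-data` are hypothetical, and the category list should be checked against what your resource actually exposes.

```python
import json


def build_diagnostic_setting(workspace_id, log_categories, include_metrics=True):
    """Build a dict shaped like an Azure Monitor diagnostic setting body.

    Illustrative helper only: it does not call Azure.
    """
    body = {
        "properties": {
            "workspaceId": workspace_id,
            "logs": [{"category": c, "enabled": True} for c in log_categories],
        }
    }
    if include_metrics:
        body["properties"]["metrics"] = [{"category": "AllMetrics", "enabled": True}]
    return body


# Placeholder resource ID; "law-data" is an assumed workspace name.
WORKSPACE_ID = (
    "/subscriptions/xxx/resourceGroups/rg/providers/"
    "Microsoft.OperationalInsights/workspaces/law-data"
)

setting = build_diagnostic_setting(
    WORKSPACE_ID, ["PipelineRuns", "ActivityRuns", "TriggerRuns"]
)
print(json.dumps(setting, indent=2))
```

The same body could instead carry a storage account or Event Hub destination, matching the other "Send to" targets in the diagram.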
KQL Queries for Data Engineering
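// ADF pipeline runs summary (AzureDiagnostics collection mode).
// Column names are case-sensitive; in this mode the run status is usually
// exposed as status_s. Verify against your workspace schema.
// count() counts logged status events, which can exceed the number of distinct runs.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns"
| summarize
    TotalRuns = count(),
    SuccessfulRuns = countif(status_s == "Succeeded"),
    FailedRuns = countif(status_s == "Failed")
    by bin(TimeGenerated, 1h)
| render timechart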
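To make the `summarize ... by bin(TimeGenerated, 1h)` step concrete, here is a minimal Python sketch of the same grouping applied to in-memory records. The sample records and the `Status` key are made up for illustration; this is a local analogue of the query, not a way to run it against Log Analytics.

```python
from collections import defaultdict
from datetime import datetime


def summarize_runs(records):
    """Count total, succeeded, and failed records per 1-hour bin."""
    bins = defaultdict(lambda: {"TotalRuns": 0, "SuccessfulRuns": 0, "FailedRuns": 0})
    for rec in records:
        # Truncate the timestamp to the start of its hour, like bin(TimeGenerated, 1h).
        hour = rec["TimeGenerated"].replace(minute=0, second=0, microsecond=0)
        row = bins[hour]
        row["TotalRuns"] += 1
        if rec["Status"] == "Succeeded":
            row["SuccessfulRuns"] += 1
        elif rec["Status"] == "Failed":
            row["FailedRuns"] += 1
    return {hour: bins[hour] for hour in sorted(bins)}


# Hypothetical sample data.
sample = [
    {"TimeGenerated": datetime(2024, 1, 1, 9, 5), "Status": "Succeeded"},
    {"TimeGenerated": datetime(2024, 1, 1, 9, 40), "Status": "Failed"},
    {"TimeGenerated": datetime(2024, 1, 1, 10, 15), "Status": "Succeeded"},
]

summary = summarize_runs(sample)
for hour, row in summary.items():
    print(hour.isoformat(), row)
```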
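// Synapse SQL pool query performance: requests slower than 1 second, slowest first.
// Category and column names depend on the pool type and collection mode;
// verify them against your workspace schema.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SYNAPSE"
| where Category == "SQLRequest"
| where DurationMs_d > 1000
| project QueryText_s, DurationMs_d, RequestTime_d, User_s
| order by DurationMs_d desc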
// ADLS Gen2 storage usage, daily average.
// UsedCapacity is reported in bytes; dividing by 1024 three times yields GiB.
AzureMetrics
| where ResourceProvider == "MICROSOFT.STORAGE"
| where MetricName == "UsedCapacity"
| summarize AvgStorageGB = avg(Average) / 1024 / 1024 / 1024
    by bin(TimeGenerated, 1d)
| render timechart
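The division by 1024 three times in the storage query converts bytes to binary gigabytes (GiB), which differs from decimal GB by about 7%. A short sketch of the arithmetic, using a hypothetical 5 TiB account:

```python
def bytes_to_gib(num_bytes):
    """Bytes to binary gigabytes (1 GiB = 1024**3 bytes), as in the KQL query."""
    return num_bytes / 1024 ** 3


def bytes_to_gb(num_bytes):
    """Bytes to decimal gigabytes (1 GB = 1000**3 bytes)."""
    return num_bytes / 1000 ** 3


used_bytes = 5 * 1024 ** 4  # hypothetical capacity: 5 TiB

print(bytes_to_gib(used_bytes))  # 5120.0
print(round(bytes_to_gb(used_bytes), 1))
```

Which unit to chart is a presentation choice; the point is to label the axis consistently with the divisor used.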
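// Databricks cluster activity per hour.
// The "clusters" diagnostic category carries audit events, so State_s and
// TotalDBUs_d may only exist if you log them yourself; verify these columns
// are present in your workspace before using this query.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATABRICKS"
| where Category == "clusters"
| summarize
    ActiveClusters = countif(State_s == "RUNNING"),
    TotalDBUs = sum(TotalDBUs_d)
    by bin(TimeGenerated, 1h)
| render timechart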
Alert Rules Configuration
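Metric alert rule (Microsoft.Insights/metricAlerts) that fires when the factory reports any failed pipeline run within the evaluation window:
{
  "properties": {
    "description": "ADF Pipeline Failure Alert",
    "severity": 2,
    "enabled": true,
    "scopes": [
      "/subscriptions/xxx/resourceGroups/rg/providers/Microsoft.DataFactory/factories/adf-prod"
    ],
    "evaluationFrequency": "PT5M",
    "windowSize": "PT15M",
    "criteria": {
      "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
      "allOf": [
        {
          "name": "FailedPipelineRuns",
          "criterionType": "StaticThresholdCriterion",
          "metricNamespace": "Microsoft.DataFactory/factories",
          "metricName": "PipelineFailedRuns",
          "timeAggregation": "Total",
          "operator": "GreaterThan",
          "threshold": 0
        }
      ]
    },
    "actions": [
      {
        "actionGroupId": "/subscriptions/xxx/resourceGroups/rg/providers/Microsoft.Insights/actionGroups/ag-data-team"
      }
    ]
  }
}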
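The two timing fields in the rule above interact: every `evaluationFrequency` (PT5M) the rule looks back over the trailing `windowSize` (PT15M). The sketch below mimics that evaluation locally. The function names are hypothetical, the "more than zero failures" threshold is an assumption, and the duration parser handles only the simple `PTnHnMnS` subset of ISO 8601.

```python
import re
from datetime import datetime, timedelta


def parse_iso_duration(text):
    """Parse simple ISO 8601 time durations such as PT5M or PT1H30M."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def should_fire(failure_times, now, window_size="PT15M", threshold=0):
    """Return True if failures inside the trailing window exceed the threshold."""
    window = parse_iso_duration(window_size)
    failures = sum(1 for t in failure_times if now - window < t <= now)
    return failures > threshold


now = datetime(2024, 1, 1, 12, 0)
recent_failure = [datetime(2024, 1, 1, 11, 50)]  # inside the 15-minute window
old_failure = [datetime(2024, 1, 1, 11, 40)]     # outside the window

print(should_fire(recent_failure, now))
print(should_fire(old_failure, now))
```

A window longer than the frequency means consecutive evaluations overlap, so a single failure can keep the condition true for up to three evaluations in this configuration.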
Pro Tip: Create custom Azure Monitor Workbooks for data engineering dashboards. Include metrics for pipeline runs, data volumes, query performance, and cost trends.
Interview Questions
Q1: What metrics should you monitor for a data engineering platform?
A:
1) Pipeline success/failure rates
2) Data volumes processed
3) Query performance (duration, resources)
4) Storage utilization
5) Cost trends
6) Data freshness
7) Error rates and types
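Two of the metrics in this answer, data freshness and failure rate, reduce to simple arithmetic once the raw timestamps and counts are available. A minimal sketch, with made-up timestamps and counts and an assumed 2-hour freshness target:

```python
from datetime import datetime, timedelta, timezone


def freshness_lag(last_loaded_at, now):
    """Time elapsed since the dataset last received data."""
    return now - last_loaded_at


def failure_rate(succeeded, failed):
    """Fraction of runs that failed; 0.0 when there were no runs."""
    total = succeeded + failed
    return failed / total if total else 0.0


# Hypothetical values for illustration.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)

lag = freshness_lag(last_load, now)
stale = lag > timedelta(hours=2)  # assumed freshness target of 2 hours

print(lag, stale)
print(failure_rate(succeeded=47, failed=3))
```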
Q2: How do you implement end-to-end monitoring for an ADF pipeline?
A:
1) Enable diagnostic settings for ADF
2) Create a Log Analytics workspace
3) Build KQL queries for pipeline metrics
4) Create Azure Monitor Workbooks
5) Set up alert rules for failures
6) Implement custom logging in activities
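The last step, custom logging in activities, is easiest to query later when each record is structured. Below is a minimal sketch using Python's standard `logging` module to emit one JSON object per event. The field names (`pipeline`, `activity`, `rows_written`) and the pipeline and activity names are illustrative assumptions, and shipping the records to Log Analytics is left out.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Fields passed via `extra=`; missing ones become null.
            "pipeline": getattr(record, "pipeline", None),
            "activity": getattr(record, "activity", None),
            "rows_written": getattr(record, "rows_written", None),
        }
        return json.dumps(payload)


def make_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("pipeline.custom")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info(
    "copy finished",
    extra={"pipeline": "pl_daily_load", "activity": "CopySales", "rows_written": 1200},
)
print(buffer.getvalue().strip())
```

Keeping the keys stable across activities lets a single KQL query filter and aggregate the custom records alongside the built-in pipeline logs.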
Q3: What is the difference between metrics and logs in Azure Monitor?
A: Metrics are numerical time-series values sampled at regular intervals (CPU, memory, throughput). Logs are detailed event records (pipeline runs, errors, query text) that you query with KQL. Use metrics for near-real-time alerting and dashboards; use logs for debugging and deeper analysis.