In my opinion, part of monitoring with Azure Monitor is being able to alert at scale. When I say alerting at scale, this means a few things to me. First, it means being able to create alerts in a programmatic fashion. Second, it means creating alerts that cover multiple resources at once: for instance, a CPU alert for all current and future IaaS VMs, or DTU alerts for Azure SQL PaaS across entire resource groups. For alerting at scale with Azure Monitor we have ARM templates and, of course, PowerShell. There are four Azure Monitor PowerShell modules.
Azure Monitor PowerShell Modules
At present, we have four main PowerShell modules.
I’m not going to list all the commands available, because there are a lot. The Application Insights module and the Operational Insights module for Log Analytics both have commands for creating, modifying and managing their respective workspaces. The module of interest for this post is Az.Monitor. There is also the OmsIngestionAPI module, which I’ve covered here. There is a sixth module as well, Az.AlertsManagement. In my view its commands should just be included in the Az.Monitor module. That said, the commands in Az.AlertsManagement are largely related to smart groups and action rules.
Because you cannot create an Alert in Azure Monitor via PowerShell without providing an Action Group, we’re going to start with Action Groups. The available commands for Action Groups in the Az.Monitor module are as follows:
Thanks to PowerShell’s wonderful verb-noun command naming, these are pretty clear on what they do, except for New-AzActionGroup. This command does not do what you think it does. In fact, it’s not even used in the creation of new Action Groups. When looking at the command, the only required parameter is ActionGroupId. What this command does is create a PowerShell object with the Action Group ID in it.
To create a new Action Group you need to use New-AzActionGroupReceiver and Set-AzActionGroup. Your code will look something like this:
$email = New-AzActionGroupReceiver -Name "hasmug" -EmailReceiver -EmailAddress "email@example.com"
Set-AzActionGroup -Name "HASMUG Action Group" -ResourceGroup $rg -ShortName "MUG" -Receiver $email
# Get an existing Action Group, then wrap its ID in the object that the alert command expects
$act = Get-AzActionGroup -ResourceGroupName VS_Monitoring -Name email
$action = New-AzActionGroup -ActionGroupId $act.Id
Azure Monitor Metrics
In my opinion, most alerts should be some sort of metric-based alert. Don’t get me wrong, there are numerous things you can do with log search alerts, and you often may need to. But when it comes to Azure resources, there is a lot Azure Monitor Metrics can cover, from Storage Accounts to Azure SQL PaaS and IaaS VMs. I prefer to use metric alerts where I can because the alerts are stateful within Azure Monitor, meaning Azure Monitor remembers the last state that metric was in. Log search alerts, by contrast, will keep alerting on the same issue each time a log is ingested with the same threshold breach. So for instance, if you set up a CPU alert for a VM and that VM breaches the threshold for 30 minutes, a metric alert is going to alert you one time, and a log search alert is going to alert you at whatever collection interval you have set for performance counters.
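To make the difference concrete, here is a small simulation in plain PowerShell. It makes no Azure calls. The function names and the sample CPU readings are hypothetical, and the logic is only a simplified model of the behaviour described above, not how Azure Monitor is implemented internally.

```powershell
# Simplified model only: no Azure calls are made here.
# Each sample stands for one evaluation of an alert rule.
$cpuSamples = @(40, 95, 97, 96, 50, 92)   # hypothetical CPU % readings
$threshold  = 90

function Get-StatefulAlertCount {
    param([int[]]$Samples, [int]$Threshold)
    $fired   = 0
    $inAlert = $false   # state remembered from the previous evaluation
    foreach ($value in $Samples) {
        $breached = $value -ge $Threshold
        # Notify only when moving from healthy to breached.
        if ($breached -and (-not $inAlert)) { $fired++ }
        $inAlert = $breached
    }
    return $fired
}

function Get-StatelessAlertCount {
    param([int[]]$Samples, [int]$Threshold)
    # No memory of the previous state: every breaching evaluation notifies.
    return @($Samples | Where-Object { $_ -ge $Threshold }).Count
}

$stateful  = Get-StatefulAlertCount  -Samples $cpuSamples -Threshold $threshold
$stateless = Get-StatelessAlertCount -Samples $cpuSamples -Threshold $threshold
"Stateful notifications: $stateful, stateless notifications: $stateless"
```

With these sample readings the stateful model notifies twice (once per breach episode) and the stateless model notifies four times (once per breaching sample).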
Get Resource Specific Metrics
You can get any currently available metric by using Get-AzMetricDefinition. You will also need the Resource ID of the specific resource you want the metrics for. Unfortunately, I am not aware of a way to get available metrics for a specific Azure service without having an existing resource. You can, however, view all the supported metrics based on Azure services here.
So for instance, if you wanted to see all the available metrics for a Log Analytics workspace, you could run the following commands, using Get-AzOperationalInsightsWorkspace and Get-AzMetricDefinition.
$workspace = Get-AzOperationalInsightsWorkspace -Name VS-Sandlot -ResourceGroupName VS_Monitoring
Get-AzMetricDefinition -ResourceId $workspace.ResourceId
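The raw output can be long, so it may help to trim it down to the fields you need when picking a metric to alert on. The sketch below uses a hypothetical helper, Select-MetricSummary. It assumes each metric definition object exposes Name.Value, Unit and PrimaryAggregationType; those property names are an assumption here and may differ between Az.Monitor versions, so check them with Get-Member first.

```powershell
# Hypothetical helper: condenses metric definition objects into a short summary.
# Assumption: each input object has Name.Value, Unit and PrimaryAggregationType.
# Verify the real shape with: Get-AzMetricDefinition -ResourceId <id> | Get-Member
function Select-MetricSummary {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $MetricDefinition
    )
    process {
        [pscustomobject]@{
            Metric      = $MetricDefinition.Name.Value
            Unit        = $MetricDefinition.Unit
            Aggregation = $MetricDefinition.PrimaryAggregationType
        }
    }
}

# Usage against a real workspace (requires an authenticated Az session):
# Get-AzMetricDefinition -ResourceId $workspace.ResourceId |
#     Select-MetricSummary |
#     Sort-Object Metric
```

The value in the Metric column is the string to pass as -MetricName when building alert criteria.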
Create Metric Alerts with PowerShell
To create a Metric Alert we need quite a few things. First for the alert itself, we need:
- Resource Group Name
- Window Size
- Target ResourceId
- Action Group
- Time Aggregation
Second, we need three PowerShell commands: two to create our alert criteria and dimensions, and one that creates the actual alert. To create the criteria we use New-AzMetricAlertRuleV2Criteria. Dimension selection is not a required parameter for the criteria command. However, there are instances where not defining a dimension will give us very generic alert results, so I use New-AzMetricAlertRuleV2DimensionSelection, especially when setting alerts against a Log Analytics workspace.
# Set dimensions of Alert to Computer. This will alert on all current and future computer members of the workspace
$dim = New-AzMetricAlertRuleV2DimensionSelection -DimensionName "Computer" -ValuesToInclude "*"

# Set alert criteria and counter % Processor Time
$criteria = New-AzMetricAlertRuleV2Criteria -MetricName "Average_% Processor Time" `
    -DimensionSelection $dim `
    -TimeAggregation Average `
    -Operator GreaterThanOrEqual `
    -Threshold 90
This code sets up a dimension using the computer name, then sets our criteria: % Processor Time as the metric, average as the time aggregation, greater than or equal as the operator, and 90 as the threshold. Notice that $dim has been passed into the criteria command as the -DimensionSelection parameter.
Finally, we’ll use Add-AzMetricAlertRuleV2 to create the alert.
Add-AzMetricAlertRuleV2 -Name "Windows and Linux CPU Alert" `
    -ResourceGroupName $RGObject.ResourceGroupName `
    -WindowSize 00:05:00 `
    -Frequency 00:01:00 `
    -TargetResourceId $ResourceId `
    -Condition $criteria `
    -ActionGroup $action `
    -Severity $severity
This command creates the alert in the resource group you specify, against the resource ID you specify, with the Action Group we got earlier in the post and the criteria, severity, window size, and frequency we specify.
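Once one alert works, the same three commands can be driven from a list of definitions to create several alerts in one pass. The following is a minimal sketch, not a finished script. The function name New-WorkspaceCpuAlert, the definition list, and the thresholds and severities in it are all hypothetical. Only the three Az.Monitor commands and their parameters come from the examples above.

```powershell
# Hypothetical definitions: same metric as above, different thresholds and severities.
$alertDefinitions = @(
    @{ Name = 'CPU Warning';  Threshold = 80; Severity = 2 }
    @{ Name = 'CPU Critical'; Threshold = 90; Severity = 1 }
)

# Hypothetical wrapper; only the three Az.Monitor commands inside it come from this post.
function New-WorkspaceCpuAlert {
    param(
        [hashtable[]]$Definition,
        [string]$ResourceGroupName,
        [string]$TargetResourceId,
        $ActionGroup
    )

    # One dimension selection is reused by every rule.
    $dim = New-AzMetricAlertRuleV2DimensionSelection -DimensionName "Computer" -ValuesToInclude "*"

    foreach ($def in $Definition) {
        $criteria = New-AzMetricAlertRuleV2Criteria -MetricName "Average_% Processor Time" `
            -DimensionSelection $dim `
            -TimeAggregation Average `
            -Operator GreaterThanOrEqual `
            -Threshold $def.Threshold

        # Splatting keeps the long parameter list readable.
        $alertParams = @{
            Name              = $def.Name
            ResourceGroupName = $ResourceGroupName
            WindowSize        = '00:05:00'
            Frequency         = '00:01:00'
            TargetResourceId  = $TargetResourceId
            Condition         = $criteria
            ActionGroup       = $ActionGroup
            Severity          = $def.Severity
        }
        Add-AzMetricAlertRuleV2 @alertParams
    }
}

# Usage (requires an authenticated Az session and the objects created earlier):
# New-WorkspaceCpuAlert -Definition $alertDefinitions `
#     -ResourceGroupName VS_Monitoring `
#     -TargetResourceId $workspace.ResourceId `
#     -ActionGroup $action
```

Keeping the definitions in a plain list means adding another alert is a one-line change, and the list could just as easily be loaded from a CSV or JSON file.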
When we start looking into automating alerts, there are a number of options available between PowerShell and ARM templates. This post has gone over just a few of the key things you need when using PowerShell to automate alerts. That said, when you start needing to create 5 or 10 alerts for IaaS VMs, and then more alerts for Azure SQL and alerts for Azure Functions or App Services, you’re now looking at potentially dozens or hundreds of alerts. The need to start automating that, just to save time, becomes a lot more apparent.
The follow-up post for IaaS can be found here: https://www.systemcenterautomation.com/2020/01/azure-monitor-alerting-at-scale-iaas/