Major alert volume

When a major incident unfolds, alerts are often the first visible symptom: latency breaches, error rate spikes, failing infrastructure health checks. Before tickets are logged or escalations begin, the telemetry starts to shift. The question becomes: "Are we seeing a short spike, or the start of something bigger?"

This tile visualizes alert volume for major incidents over time, turning raw monitoring data into a clear signal of system pressure. It doesn’t just show how many alerts fired. It shows whether alert activity is rising, stabilizing, or tapering off.

Before you start

This tile works best alongside the ticket creation rate tile, allowing you to correlate alert spikes with increases in incident volume.

Configuring the tile

  1. Data Source: Select Azure.
  2. Data Stream: Select Alerts.
  3. Objects: Select the resources you want alerts for.
  4. Parameters:
    1. Monitor condition: Select Fired.
    2. Severity: Enter sev0 and sev1 to capture only major incidents.
  5. Timeframe: Select the timeframe you want to track. Note that after adding a monitor or configuring a KPI, the Use dashboard timeframe option is disabled.

  6. SQL Analytics: Enable the toggle.
    1. SQL > Query: Enter a query such as the following to generate an hourly timeline of fired alerts. The query builds a complete hourly alert timeline for the current dashboard timeframe, including hours in which no alerts fired.
      -- One row per hour across the dashboard timeframe
      WITH hours AS (
        SELECT UNNEST(
          generate_series(
            DATE_TRUNC('hour', CAST('{{timeframe.start}}' AS TIMESTAMP)),
            DATE_TRUNC('hour', CAST('{{timeframe.end}}'   AS TIMESTAMP)),
            INTERVAL 1 HOUR
          )
        ) AS hour
      ),
      -- Number of alerts that started in each hour
      alert_counts AS (
        SELECT
          DATE_TRUNC('hour', CAST("properties.essentials.startDateTime" AS TIMESTAMP)) AS hour,
          COUNT(*) AS alert_count
        FROM dataset1
        GROUP BY 1
      )
      -- Keep every hour; hours with no alerts get a count of 0
      SELECT
        hours.hour,
        COALESCE(alert_counts.alert_count, 0) AS alert_count
      FROM hours
      LEFT JOIN alert_counts
        ON alert_counts.hour = hours.hour
      ORDER BY hours.hour;
  7. Visualization: Select Line.
    1. Mapping > X-Axis: Select Hour.
    2. Y-Axis: Select Alert Count.
    3. X-Axis: Enable the toggle and enter Date time.
    4. Y-Axis: Enable the toggle and enter Alerts fired.
    5. Options: Enable the following toggles:
      • Shading
      • Grid lines
  8. Click Save.
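
To make the shape of the query's result easier to picture, the following Python sketch reproduces the same idea outside the tile: it buckets alert start times by hour and fills empty hours with zero. It is only an illustration under stated assumptions. The function name hourly_alert_timeline and the sample timestamps are hypothetical and are not part of the product or of any real dataset.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_alert_timeline(alert_starts, start, end):
    """Return [(hour, count), ...] for every hour from start to end, inclusive."""

    def floor_hour(ts):
        # Equivalent in spirit to DATE_TRUNC('hour', ...) in the query.
        return ts.replace(minute=0, second=0, microsecond=0)

    # Count alerts per hour bucket (the alert_counts step).
    counts = Counter(floor_hour(ts) for ts in alert_starts)

    # Walk every hour in the timeframe (the hours step) and
    # default missing hours to 0 (the COALESCE on the LEFT JOIN).
    timeline = []
    hour = floor_hour(start)
    last = floor_hour(end)
    while hour <= last:
        timeline.append((hour, counts.get(hour, 0)))
        hour += timedelta(hours=1)
    return timeline


# Hypothetical sample data: three alerts, none in the 11:00 hour.
sample_alerts = [
    datetime(2024, 1, 1, 10, 5),
    datetime(2024, 1, 1, 10, 40),
    datetime(2024, 1, 1, 12, 15),
]
timeline = hourly_alert_timeline(
    sample_alerts,
    start=datetime(2024, 1, 1, 10, 0),
    end=datetime(2024, 1, 1, 12, 59),
)
for hour, count in timeline:
    print(hour.isoformat(), count)
```

The zero-filled hours matter for the line chart: without them, a quiet hour would be missing from the series rather than plotted as zero, and the line would connect the neighbouring points as if activity had been continuous.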

Adding a monitor

This tile is ideal for monitoring because alert volume often drifts before it explodes. Small but sustained increases in fired alerts are easy to overlook in the moment, especially during an active incident.

The goal here is not to react to a single noisy spike. It is to detect when alert noise is consistently rising and may indicate acceleration, cascading impact, or deteriorating service health.

Configuration

Configure the following in the tile editor:

  1. Monitoring: Enable the Monitoring toggle.
  2. Type: Select Threshold.
  3. Value: Select Top.
  4. Column: Select Alert Count.
  5. Evaluate by: Select Hour.
  6. Conditions:
    Assuming a normal baseline of around two fired alerts per hour, we configure thresholds to catch meaningful change without overreacting to minor fluctuation.
    1. Error: Enable the toggle, set the condition to greater than, and enter an appropriate value. For our example, we'll enter 3.
    2. Warning: Enable the toggle, set the condition to greater than, and enter an appropriate value. For our example, we'll enter 2.
  7. Click Save.
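
As a rough illustration of how the example thresholds partition hourly counts, the following Python sketch applies the two greater-than conditions in order of severity. It is a sketch under assumptions, not the product's evaluation logic: the function name classify and the state labels "ok", "warning", and "error" are hypothetical, and the sketch takes a single hourly count as input without modelling how the monitor selects that value.

```python
# Example thresholds from the steps above.
WARNING_THRESHOLD = 2
ERROR_THRESHOLD = 3


def classify(alert_count, warning=WARNING_THRESHOLD, error=ERROR_THRESHOLD):
    """Map an hourly alert count to a hypothetical monitor state."""
    # Check the more severe condition first so it takes precedence.
    if alert_count > error:
        return "error"
    if alert_count > warning:
        return "warning"
    return "ok"


for count in (2, 3, 4):
    print(count, classify(count))
```

With these values, an hour at the assumed baseline of two alerts stays healthy, exactly three alerts triggers a warning, and four or more triggers an error. If your baseline is noisier, widening the gap between the two thresholds gives the warning state more room before the error state is reached.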
