Learn how alerts work, and how to create and examine them.

With Wavefront, you can create smart alerts that dynamically filter noise and capture true anomalies.

  • Specify one or more alert targets that receive the alert notification(s).
  • Create a multi-threshold alert to notify different targets depending on alert severity.
  • View an image of the chart in the alert notification and click a link to see the alert in context.
  • Examine firing alerts in Alert Viewer to get context.

The end result is fewer false alerts and faster remediation when real issues occur.

Wavefront Alerts

An alert defines:

  • The condition under which metric values indicate a system problem.
  • One or more targets to notify when the condition evaluates to true or false for a specified period of time.
  • Optionally, information about the alert notification format.

Wavefront supports classic alerts, where each alert has one preset severity, and multi-threshold alerts, where an alert can have different severities for different threshold values.

How to Create an Alert – The Basics

You can create an alert from any chart, or from the Create Alert page. The basic process is the same.

  1. Specify the alert condition, for example, CPU utilization is greater than 70%.
  2. Optionally use backtesting to see how often the alert fires and adjust the threshold.
  3. Add an alert target, that is, specify who will receive the alert and how (e.g., email or PagerDuty), then save the alert.

The rest of this page explains:

  • How you can fine-tune the process to get just the right number of alerts to just the right people.
  • How to create alerts and customize the condition and the target.
  • How to create multi-threshold alerts, which can send notifications to different targets based on the severity of the problem.

Alert Condition

The alert condition is a query language expression that defines the threshold for an alert.

  • If an alert’s Condition field is set to a conditional expression, for example ts("requests.latency") > 195, then all data values that satisfy the condition are marked as true (1) and all data values that do not satisfy the condition are marked as false (0).
  • If the Condition field is set to a base ts(), hs(), etc. expression, for example ts("cpu.loadavg.1m"), then all non-zero data values are marked as true and all zero data values are marked as false. If there is no reported data, then values are neither true nor false.
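
The two cases above can be sketched in Python. This only illustrates the true/false mapping and is not how Wavefront implements it; the function names and the use of None for a missing data point are assumptions made for the example.

```python
from typing import Optional

def eval_conditional(value: Optional[float], threshold: float) -> Optional[int]:
    """Mimic a conditional expression such as ts("requests.latency") > 195."""
    if value is None:              # no reported data: neither true nor false
        return None
    return 1 if value > threshold else 0

def eval_base(value: Optional[float]) -> Optional[int]:
    """Mimic a base expression such as ts("cpu.loadavg.1m")."""
    if value is None:              # no reported data: neither true nor false
        return None
    return 1 if value != 0 else 0  # non-zero is true, zero is false

latencies = [180.0, 210.0, None, 195.0]
print([eval_conditional(v, 195) for v in latencies])  # [0, 1, None, 0]
print([eval_base(v) for v in [0.0, 0.7, None]])       # [0, 1, None]
```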

An alert fires when a metric stays at a value that indicates a problem for the specified amount of time.

  • A classic alert sends a notification with the specified severity to all specified targets.
  • A multi-threshold alert allows you to specify multiple severities and a different target for each severity. Each target is notified if the condition is met when the alert changes state.
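
As a rough illustration of "stays at a value that indicates a problem for the specified amount of time", the sketch below checks a window of per-minute condition results. The rule it uses (at least one true value and no false value in the window, with missing points ignored) is an assumption for the example, and `should_fire` is a hypothetical helper, not a Wavefront function.

```python
from typing import Optional, Sequence

def should_fire(values: Sequence[Optional[int]], fire_minutes: int) -> bool:
    """values holds one condition result per minute, oldest first:
    1 (true), 0 (false), or None (no reported data)."""
    # Keep only the reported points from the last fire_minutes minutes.
    window = [v for v in values[-fire_minutes:] if v is not None]
    # Assumed rule: at least one true value and no false value in the window.
    return bool(window) and all(v == 1 for v in window)

print(should_fire([0, 1, 1, 1, 1, 1], fire_minutes=5))   # True
print(should_fire([1, 1, 0, 1, 1, 1], fire_minutes=5))   # False
print(should_fire([None, None, None], fire_minutes=3))   # False
```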

Alert Target

Each alert is associated with one or more alert targets. The alert target specifies who to notify when the alert changes state.

  • For classic alerts, you specify a (single) severity and one or more corresponding alert targets. You can set up email, PagerDuty, and custom alert targets.
  • For multi-threshold alerts, you can specify a different alert target for each threshold, for example, an email target when the alert reaches the INFO threshold and a PagerDuty target when the alert reaches the SEVERE threshold. You can specify only custom alert targets, but it’s easy to set up a custom email or PagerDuty alert target.

The maximum number of email alert targets is 10 for classic alerts and 10 per severity for multi-threshold alerts. If you exceed the number, you receive a message like the following:

{"status":{"result":"ERROR","message":"Invalid notification specified: null","code":400}}
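
If you generate alert definitions from scripts, you might check the limit before submitting anything. The following is a minimal sketch under that assumption; `parse_email_targets` is a hypothetical helper, and the only facts it takes from this page are the limit of 10 and the comma-separated input format.

```python
from typing import List

MAX_EMAIL_TARGETS = 10  # per classic alert, or per severity for multi-threshold alerts

def parse_email_targets(raw: str) -> List[str]:
    """Split a comma-separated list of addresses and enforce the limit."""
    emails = [part.strip() for part in raw.split(",") if part.strip()]
    if len(emails) > MAX_EMAIL_TARGETS:
        raise ValueError(
            f"{len(emails)} email targets given; the limit is {MAX_EMAIL_TARGETS}"
        )
    return emails

print(parse_email_targets("ops@example.com, oncall@example.com"))
# ['ops@example.com', 'oncall@example.com']
```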

Creating an Alert

You can create a classic alert with a single severity level (e.g., SEVERE), or a multi-threshold alert, which allows you to customize alert behavior for different thresholds.

Create a Classic Alert

Required fields for a classic alert are:

  • Alert name (default is New Alert)
  • Alert condition
  • Alert severity

To notify alert targets when the alert changes state, you can specify targets during alert creation or later.

To create a classic alert:

  1. Do one of the following:
    • Alerts Browser - Click Alerting from the taskbar and click the Create Alert button located above the filter bar.
    • Chart - Click the ellipsis icon on the right of the query and select Create Alert.
  2. Specify the following required alert properties.
    Property - Description
    Name - Name of the alert. 1-255 characters.
    Condition - A conditional expression that defines the threshold for the alert. The condition expression can include any valid Wavefront Query Language construct. The condition expression coupled with the Alert fires setting determines when the alert fires.
      • Alert fires - Length of time (in minutes) during which the Condition expression must be true before the alert fires. Minimum is 1. For example, if you enter 5, the alerting engine reviews the value of the condition during the last 5-minute window to determine whether the alert should fire.
      • Alert resolves - Length of time (in minutes) during which the Condition expression must be not true before the alert switches to resolved. Minimum is 1. Omit this setting or pick a value that is greater than or equal to the Alert fires value to avoid resolve-fire cycles.
      For details and examples, see Alert States and Lifecycle.
    Severity - How important the alert is. In decreasing importance: SEVERE, WARN, SMOKE, and INFO.
  3. (Recommended) Specify a Display Expression. Defaults to the value of the condition expression, either 0 or 1. Specify a display expression to get more details when the alert changes state. The display expression can include any valid Wavefront Query Language construct, and typically captures the underlying time series that the condition expression is testing. The results of the display expression are:
    • Shown in the Events Display preview chart on the page for creating or editing the alert.
    • Shown in any chart image that is included in a notification triggered by the alert.
    • Shown in the interactive chart you can visit from a notification triggered by the alert.
    • Used as the basis for any statistics that you might include in a custom notification triggered by the alert.
  4. (Optional) To help you find the alert and information about it in the Alerts Browser, specify Additional Information and Tags.
    Property - Description
    Additional Information - Any additional information, such as a link to a run book.
    Tags - Tags assigned to the alert. You can enter existing alert tags or create new alert tags. See Organizing Related Alerts.
  5. (Recommended) Specify a list of alert targets to notify when the alert changes state, for example, from CHECKING to FIRING, or when the alert is snoozed. You can specify up to ten different targets across the following types. Use commas to separate targets of the same type.
    Property - Description
    Email - Valid email addresses. Alert notifications are sent to these addresses in response to a default set of triggering events, and contain default HTML-formatted content. You can specify up to 10 valid email addresses.
    PagerDuty Key - PagerDuty keys obtained by following the steps for the PagerDuty integration. Alert notifications that use these keys are sent in response to a default set of triggering events, and contain default content.
    Alert Target - Names of custom alert targets that you have previously created to:
      • Configure webhook notifications for pager services and communication channels. Follow the steps for the VictorOps integration, Slack integration, or HipChat integration for notifications on these popular messaging platforms.
      • Configure email or PagerDuty notifications with nondefault content or triggers.
  6. (Optional) If you are protecting metrics with metrics security policies in your environment, select the Secure Metrics Details check box. A simplified alert notification is sent.
    Property - Description
    Secure Metric Details - If selected, alert notifications do not show metric details and alert images.
  7. (Optional) Click the Advanced link to configure the following alert properties. The defaults for those properties are often appropriate.
    Property - Description
    Checking Frequency - Number of minutes between checks of whether Condition is true. Minimum and default is 1. When an alert is in the INVALID state, it is checked approximately every 15 minutes instead of at the specified checking frequency.
    Evaluation Strategy - Allows you to select Real-time Alerting. By default, Wavefront ignores values from the last 1 minute to account for delays. Many data sources are updated only at certain points in time, so using the default evaluation strategy prevents spurious firings. If you select this check box, Wavefront includes values from the last 1 minute and evaluates the alert strictly on the ingested data. See Limiting the Effects of Data Delays.
    Resend Notifications - Whether to resend the notification of a firing alert. If enabled, you can specify the number of minutes to wait before resending the notification.
    Unique PagerDuty Incidents - Select this option to receive separate PagerDuty notifications for each series that meets the alert conditions.
      For example, you get a separate PagerDuty notification for each of the following series because the env tag is different.
      #first series
      app.errors source=machine env=prod

      #second series
      app.errors source=machine env=stage

    Metrics - Whether to include obsolete metrics. By default, alerts don't consider metrics that have not reported data for 4 weeks or more. Include obsolete metrics if you use queries that aggregate data over longer time frames.
  8. Click Save.
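
The constraints listed in the steps above (name length, minimum values, allowed severities) can be collected in a small pre-flight check. This is a sketch only: `ClassicAlertSpec` and its field names are hypothetical and do not correspond to a Wavefront API object, and the resolve/fire comparison simply restates the recommendation for the Alert resolves setting.

```python
from dataclasses import dataclass
from typing import List, Optional

SEVERITIES = ("SEVERE", "WARN", "SMOKE", "INFO")  # decreasing importance

@dataclass
class ClassicAlertSpec:
    """Hypothetical container for the classic alert properties described above."""
    condition: str
    severity: str
    fires_minutes: int
    name: str = "New Alert"
    resolves_minutes: Optional[int] = None  # may be omitted
    checking_frequency: int = 1             # minutes; minimum and default is 1

    def problems(self) -> List[str]:
        found = []
        if not 1 <= len(self.name) <= 255:
            found.append("name must be 1-255 characters")
        if not self.condition.strip():
            found.append("condition is required")
        if self.severity not in SEVERITIES:
            found.append("severity must be one of " + ", ".join(SEVERITIES))
        if self.fires_minutes < 1:
            found.append("fires_minutes must be at least 1")
        if self.checking_frequency < 1:
            found.append("checking_frequency must be at least 1")
        if self.resolves_minutes is not None:
            if self.resolves_minutes < 1:
                found.append("resolves_minutes must be at least 1")
            elif self.resolves_minutes < self.fires_minutes:
                found.append("resolves_minutes below fires_minutes risks resolve-fire cycles")
        return found

spec = ClassicAlertSpec(
    condition='ts("requests.latency") > 195',
    severity="WARN",
    fires_minutes=5,
    resolves_minutes=2,
)
print(spec.problems())
# ['resolves_minutes below fires_minutes risks resolve-fire cycles']
```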

Create a Multi-Threshold Alert

Required fields for a multi-threshold alert are:

  • Alert name (defaults to New Alert)
  • Alert condition and operator (e.g., greater than (>))
  • At least one severity/threshold value pair.

For each severity, you can specify one or more alert targets to notify when the alert changes state. Each target is notified if the condition is met when the alert changes state.

Only custom alert targets are supported, but you can initially create the alert without specifying a target.

For a multi-threshold alert, Wavefront creates a display expression that shows the alert condition.

To create a multi-threshold alert:

  1. Do one of the following:
    • Alerts Browser - Click Alerting from the taskbar and click the Create Alert button located above the filter bar.
    • Chart - Click the ellipsis icon on the right of the query and select Create Alert.
  2. Next to Type, select Threshold.
  3. Fill in the following required alert properties.
    Property - Description
    Name - Name of the alert. 1-255 characters.
    Condition - A query language expression that defines the threshold for the alert. The condition expression can include any valid Wavefront Query Language construct. The condition expression coupled with the Alert fires setting determines when the alert fires.
      • Alert fires - Length of time (in minutes) during which the Condition expression must be true before the alert fires. Minimum is 1. For example, if you enter 5, the alerting engine reviews the value of the condition during the last 5-minute window to determine whether the alert should fire.
      • Alert resolves - Length of time (in minutes) during which the Condition expression must be not true before the alert switches to resolved. Minimum is 1. Omit this setting or pick a value that is greater than or equal to the Alert fires value to avoid resolve-fire cycles.
      For details and examples, see Alert States and Lifecycle.
    Operator - Select one of the operators, for example, greater than (>). The operator determines which values to use for the different severity thresholds. For example, if the operator is greater than, then:
      • You don't have to specify all 4 severities.
      • SEVERE must be the highest number.
      • INFO must be the lowest number.
      • The numbers must increase from INFO to SEVERE.
    Severity - For multi-threshold alerts, specify more than one severity, or create a classic alert instead. Associate a threshold value with each severity. The order must match the operator.
      For example, you can specify the operator >=, SEVERE 6000, and WARN 5000, but you can't specify SEVERE 5000 and WARN 6000 with that operator.
  4. (Recommended) Specify a list of alert targets for each severity. Wavefront notifies the targets when the alert changes state, for example, from CHECKING to FIRING, or when the alert is snoozed. Specify names of custom alert targets that you already created. You can specify up to ten different targets for each severity. You cannot specify an email address or PagerDuty key directly.

  5. (Optional) To help you find the alert and information about it, specify Additional Information and Tags.
    Property - Description
    Additional Information - Any additional information, such as a link to a run book.
    Tags - Tags assigned to the alert. You can enter existing alert tags or create new alert tags. See Organizing Related Alerts with Alert Tags.
  6. (Optional) If you are protecting metrics in your environment with metrics security policies, select the Secure Metrics Details check box. A simplified alert notification is sent.
    Property - Description
    Secure Metric Details - If checked, alert notifications do not show metric details and alert images.
  7. (Optional) Click the Advanced link to configure the following alert properties. The defaults for those properties are often appropriate.
    Property - Description
    Checking Frequency - Number of minutes between checks of whether Condition is true. Minimum and default is 1. When an alert is in the INVALID state, it is checked approximately every 15 minutes instead of at the specified checking frequency.
    Evaluation Strategy - Allows you to select Real-time Alerting. By default, Wavefront ignores values from the last 1 minute to account for delays. Many data sources are updated only at certain points in time, so using the default evaluation strategy prevents spurious firings. If you select this check box, Wavefront includes values from the last 1 minute and evaluates the alert strictly on the ingested data. See Limiting the Effects of Data Delays.
    Resend Notifications - Whether to resend the notification of a firing alert. If enabled, you can specify the number of minutes to wait before resending the notification.
    Unique PagerDuty Incidents - Select this option to receive separate PagerDuty notifications for each series that meets the alert conditions.
      For example, you get a separate PagerDuty notification for each of the following series because the env tag is different.
      #first series
      app.errors source=machine env=prod

      #second series
      app.errors source=machine env=stage

    Metrics - Whether to include obsolete metrics. By default, alerts don't consider metrics that have not reported data for 4 weeks or more. Include obsolete metrics if you use queries that aggregate data over longer time frames.
  8. Click Save.
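
The ordering rule for severity thresholds in step 3 can be checked mechanically. The sketch below encodes the rule stated for the greater-than operators (values increase from INFO to SEVERE); treating the less-than operators as the mirror image is an assumption, and `thresholds_valid` is a hypothetical helper rather than part of any Wavefront tooling.

```python
from typing import Dict

ORDER = ["INFO", "SMOKE", "WARN", "SEVERE"]  # increasing importance

def thresholds_valid(operator: str, thresholds: Dict[str, float]) -> bool:
    """Return True if the threshold values are ordered consistently with the operator."""
    if not thresholds or any(sev not in ORDER for sev in thresholds):
        return False
    # Threshold values sorted from least to most important severity.
    values = [thresholds[sev] for sev in ORDER if sev in thresholds]
    pairs = list(zip(values, values[1:]))
    if operator in (">", ">="):
        return all(low < high for low, high in pairs)
    if operator in ("<", "<="):
        # Assumption: for less-than operators the order is reversed.
        return all(low > high for low, high in pairs)
    raise ValueError(f"unsupported operator: {operator}")

print(thresholds_valid(">=", {"SEVERE": 6000, "WARN": 5000}))  # True
print(thresholds_valid(">=", {"SEVERE": 5000, "WARN": 6000}))  # False
```

Not all four severities need to be present; the check only compares the severities that are supplied.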

Do More!