Learn how to create and manage alerts.

Most Wavefront users examine alerts and drill down to find the problem. A subset of Wavefront users create and manage alerts.

How to Create an Alert – The Basics

You can create an alert from any chart, or from the Create Alert page. The basic process is the same.

  1. Specify the alert condition, for example, CPU utilization is less than 70%.
  2. Optionally use backtesting to see how often the alert fires and adjust the threshold.
  3. Add an alert target, that is, specify who will receive the alert and how (e.g., email or PagerDuty), then save the alert.

The rest of this page explains:

  • How you can fine-tune the process to get just the right number of alerts to just the right people.
  • How to create alerts and customize the condition and the target.
  • How to create multi-threshold alerts, which can send notifications to different targets based on the severity of the problem.

Alert Condition

The alert condition is a query language expression that defines the threshold for an alert.

  • If an alert’s Condition field is set to a conditional expression, for example ts("requests.latency") > 195, then all data values that satisfy the condition are marked as true (1) and all data values that do not satisfy the condition are marked as false (0).
  • If the Condition field is set to a base ts(), hs(), etc. expression, for example ts("cpu.loadavg.1m"), then all non-zero data values are marked as true and all zero data values are marked as false. If there is no reported data, then values are neither true nor false.
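
The two rules above can be sketched in a few lines of Python. This is only an illustration of the true/false marking, not Wavefront's implementation; the function name `condition_truth` and the sample values are assumptions made for the example.

```python
from typing import Optional

def condition_truth(value: Optional[float],
                    threshold: Optional[float] = None) -> Optional[bool]:
    """Mark one data value as true, false, or neither.

    value is None when no data was reported for that point.
    threshold is None for a base expression such as ts("cpu.loadavg.1m").
    """
    if value is None:
        return None              # no reported data: neither true nor false
    if threshold is None:
        return value != 0        # base expression: non-zero values are true
    return value > threshold     # conditional expression such as ts(...) > 195

# Hypothetical latency samples checked against the threshold 195.
latency = [120.0, 200.0, None, 196.0]
print([condition_truth(v, 195) for v in latency])
# prints [False, True, None, True]
```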

An alert fires when a metric stays at a value that indicates a problem for the specified amount of time.

  • A classic alert sends a notification with the specified severity to all specified targets.
  • A multi-threshold alert allows you to specify multiple severities and a different target for each severity. Each target is notified if the condition is met when the alert changes state.
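
As a rough sketch of what "stays at a value that indicates a problem" can mean, the following Python function evaluates one time window of true/false markings. It assumes that a window qualifies when it contains at least one true value and no false values, with missing data counting as neither; that rule and the name `should_fire` are assumptions for illustration only.

```python
from typing import List, Optional

def should_fire(window: List[Optional[bool]]) -> bool:
    """Decide whether one evaluation window would cause the alert to fire.

    Assumption: at least one true value and no false values are required.
    None means no data was reported for that minute.
    """
    has_true = any(v is True for v in window)
    has_false = any(v is False for v in window)
    return has_true and not has_false

print(should_fire([True, True, None, True, True]))    # prints True
print(should_fire([True, False, True, True, True]))   # prints False
print(should_fire([None, None, None, None, None]))    # prints False
```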

Alert Target

Each alert is associated with one or more alert targets. The alert target specifies who to notify when the alert changes state.

  • For classic alerts, you specify a (single) severity and one or more corresponding alert targets. You can set up email, PagerDuty, and custom alert targets.
  • For multi-threshold alerts, you can specify a different alert target for each threshold, for example, an email target when the alert reaches the INFO threshold and a PagerDuty target when the alert reaches the SEVERE threshold. You can specify only custom alert targets, but it’s easy to set up a custom email or PagerDuty alert target.

The maximum number of email alert targets is 10 for classic alerts and 10 per severity for multi-threshold alerts. If you exceed the number, you receive a message like the following:

{"status":{"result":"ERROR","message":"Invalid notification specified: null","code":400}}
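
If you generate target lists in a script, you could check the limit before submitting the alert. The following Python sketch is a local pre-check only; the function name `parse_email_targets` and the example addresses are hypothetical.

```python
MAX_EMAIL_TARGETS = 10  # limit for classic alerts, and per severity for multi-threshold alerts

def parse_email_targets(raw: str) -> list:
    """Split a comma-separated list of addresses and enforce the limit."""
    emails = [part.strip() for part in raw.split(",") if part.strip()]
    if len(emails) > MAX_EMAIL_TARGETS:
        raise ValueError(
            f"{len(emails)} email targets given; at most {MAX_EMAIL_TARGETS} are allowed"
        )
    return emails

print(parse_email_targets("oncall@example.com, sre@example.com"))
# prints ['oncall@example.com', 'sre@example.com']
```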

Create a Classic Alert

Prerequisites

Required fields for a classic alert are:

  • Alert name (default is New Alert)
  • Alert condition
  • Alert severity

To notify alert targets when the alert changes state, you can specify targets during alert creation or later.

Procedure

  1. Do one of the following:
    • Alerts Browser - Click Alerting from the taskbar and click the Create Alert button located above the filter bar.
    • Chart - Click the ellipsis icon on the right of the query and select Create Alert.
  2. Specify the following required alert properties.
    Property - Description
    Name - Name of the alert. 1-255 characters.
    Condition - A conditional expression that defines the threshold for the alert. The condition expression can include any valid Wavefront Query Language construct. The condition expression coupled with the Alert fires setting determines when the alert fires.
    • Alert fires - Length of time (in minutes) during which the Condition expression must be true before the alert fires. Minimum is 1. For example, if you enter 5, the alerting engine reviews the value of the condition during the last 5-minute window to determine whether the alert should fire.
    • Alert resolves - Length of time (in minutes) during which the Condition expression must be not true before the alert switches to resolved. Minimum is 1. Omit this setting or pick a value that is greater than or equal to the Alert fires value to avoid resolve-fire cycles.
    For details and examples, see Alert States and Lifecycle.
    Severity - How important the alert is. In decreasing importance: SEVERE, WARN, SMOKE, and INFO.
  3. (Recommended) Specify a Display Expression. Defaults to the value of the condition expression, either 0 or 1. Specify a display expression to get more details when the alert changes state. The display expression can include any valid Wavefront Query Language construct, and typically captures the underlying time series that the condition expression is testing. The results of the display expression are:
    • Shown in the Events Display preview chart on the page for creating or editing the alert.
    • Shown in any chart image that is included in a notification triggered by the alert.
    • Shown in the interactive chart you can visit from a notification triggered by the alert.
    • Used as the basis for any statistics that you might include in a custom notification triggered by the alert.
  4. (Optional) To help you find the alert and information about it in the Alerts Browser, specify Additional Information and Tags.
    Property - Description
    Additional Information - Any additional information, such as a link to a run book.
    Tags - Tags assigned to the alert. You can enter existing alert tags or create new alert tags. See Organizing Related Alerts.
  5. (Recommended) Specify a list of alert targets to notify when the alert changes state, for example, from CHECKING to FIRING, or when the alert is snoozed. You can specify up to ten different targets across the following types. Use commas to separate targets of the same type.
    Property - Description
    Email - Valid email addresses. Alert notifications are sent to these addresses in response to a default set of triggering events, and contain default HTML-formatted content. You can specify up to 10 valid email addresses.
    PagerDuty Key - PagerDuty keys obtained by following the steps for the PagerDuty integration. Alert notifications that use these keys are sent in response to a default set of triggering events, and contain default content.
    Alert Target - Names of custom alert targets that you have previously created to:
    • Configure webhook notifications for pager services and communication channels. Follow the steps for the VictorOps integration or Slack integration for notifications on these popular messaging platforms.
    • Configure email or PagerDuty notifications with nondefault content or triggers.
  6. (Optional) If you are protecting metrics with metrics security policies in your environment, select the Secure Metric Details check box. A simplified alert notification is sent.
    Property - Description
    Secure Metric Details - If selected, alert notifications do not show metric details and alert images.
  7. (Optional) Click the Advanced link to configure the following alert properties. The defaults for those properties are often appropriate.
    Property - Description
    Checking Frequency - Number of minutes between checking whether Condition is true. Minimum and default is 1. When an alert is in the INVALID state, it is checked approximately every 15 minutes, instead of the specified checking frequency.
    Evaluation Strategy - Allows you to select Real-time Alerting. By default, Wavefront ignores values for the last 1 minute to account for delays. Many data sources are updated only at certain points in time, so the default evaluation strategy prevents spurious firings. If you select this check box, Wavefront includes values for the last 1 minute and evaluates the alert strictly on the ingested data. See Limiting the Effects of Data Delays.
    Resend Notifications - Whether to resend notification of a firing alert. If enabled, you can specify the number of minutes to wait before resending the notification.
    Unique PagerDuty Incidents - Select this option to receive separate PagerDuty notifications for each series that meets the alert conditions.
    For example, you get separate PagerDuty notifications for both the following series because the env tag is different.
    #first series
    app.errors source=machine env=prod

    #second series
    app.errors source=machine env=stage

    Metrics - Whether to include obsolete metrics. By default, alerts don't consider metrics that have not reported data for 4 weeks or more. Include obsolete metrics if your queries aggregate data over longer time frames.
  8. Click Save.
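
Before filling in the form, it can help to check the required properties against the limits listed in step 2. The following Python sketch does that with a purely local, hypothetical record; the class name `ClassicAlertDraft` and its field names are assumptions for illustration and are not the Wavefront API.

```python
from dataclasses import dataclass
from typing import List, Optional

SEVERITIES = ("SEVERE", "WARN", "SMOKE", "INFO")  # decreasing importance

@dataclass
class ClassicAlertDraft:
    """Hypothetical local record of the required classic-alert properties."""
    name: str
    condition: str
    severity: str
    fires_minutes: int
    resolves_minutes: Optional[int] = None  # the Alert resolves setting may be omitted

    def problems(self) -> List[str]:
        """Return a list of rule violations; an empty list means the draft looks fine."""
        found = []
        if not 1 <= len(self.name) <= 255:
            found.append("name must be 1-255 characters")
        if not self.condition.strip():
            found.append("condition is required")
        if self.severity not in SEVERITIES:
            found.append("severity must be one of " + ", ".join(SEVERITIES))
        if self.fires_minutes < 1:
            found.append("Alert fires must be at least 1")
        if self.resolves_minutes is not None:
            if self.resolves_minutes < 1:
                found.append("Alert resolves must be at least 1")
            elif self.resolves_minutes < self.fires_minutes:
                found.append("Alert resolves is less than Alert fires; risk of resolve-fire cycles")
        return found

draft = ClassicAlertDraft(
    name="High request latency",
    condition='ts("requests.latency") > 195',
    severity="WARN",
    fires_minutes=5,
    resolves_minutes=2,
)
print(draft.problems())
# prints ['Alert resolves is less than Alert fires; risk of resolve-fire cycles']
```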

Create a Multi-Threshold Alert

Prerequisites

Ensure that you have the information for the required fields for your multi-threshold alert:

  • Alert name (defaults to New Alert)
  • Alert condition and operator (e.g., greater than (>))
  • At least one severity/threshold value pair.

For each severity, you can specify one or more alert targets to notify when the alert changes state. Each target is notified if the condition is met when the alert changes state.

Only custom alert targets are supported, but you can initially create the alert without specifying a target.

For a multi-threshold alert, Wavefront creates a display expression that shows the alert condition.

Procedure

  1. Do one of the following:
    • Alerts Browser - Click Alerting from the taskbar and click the Create Alert button located above the filter bar.
    • Chart - Click the ellipsis icon on the right of the query and select Create Alert.
  2. Next to Type, select Threshold.
  3. Fill in the following required alert properties.
    Property - Description
    Name - Name of the alert. 1-255 characters.
    Condition - A query language expression that defines the threshold for the alert. The condition expression can include any valid Wavefront Query Language construct. The condition expression coupled with the Alert fires setting determines when the alert fires.
    • Alert fires - Length of time (in minutes) during which the Condition expression must be true before the alert fires. Minimum is 1. For example, if you enter 5, the alerting engine reviews the value of the condition during the last 5-minute window to determine whether the alert should fire.
    • Alert resolves - Length of time (in minutes) during which the Condition expression must be not true before the alert switches to resolved. Minimum is 1. Omit this setting or pick a value that is greater than or equal to the Alert fires value to avoid potential chains of resolve-fire cycles.
    For details and examples, see Alert States and Lifecycle.
    Operator - Select one of the operators, for example, greater than (>). The operator determines which values to use for the different severity thresholds. For example, if the operator is greater than, then:
    • You don't have to specify all 4 severities.
    • SEVERE must be the highest number.
    • INFO must be the lowest number.
    • The numbers must increase from INFO to SEVERE.
    Severity - For multi-threshold alerts, specify more than one severity, or create a classic alert instead. Associate a threshold value with each severity. The order must match the operator.

    For example, with the operator >=, you can specify SEVERE 6000 and WARN 5000, but you can't specify SEVERE 5000 and WARN 6000.
  4. (Recommended) Specify a list of alert targets for each severity. Wavefront notifies the targets when the alert changes state, for example, from CHECKING to FIRING, or when the alert is snoozed. Specify names of custom alert targets that you already created. You can specify up to ten different targets for each severity. You cannot specify an email address or PagerDuty key directly.

  5. (Optional) To help you find the alert and information about it, specify Additional Information and Tags.
    Property - Description
    Additional Information - Any additional information, such as a link to a run book.
    Tags - Tags assigned to the alert. You can enter existing alert tags or create new alert tags. See Organizing Related Alerts with Alert Tags.
  6. (Optional) If you are protecting metrics in your environment with metrics security policies, select the Secure Metric Details check box. A simplified alert notification is sent.
    Property - Description
    Secure Metric Details - If selected, alert notifications do not show metric details and alert images.
  7. (Optional) Click the Advanced link to configure the following alert properties. The defaults for those properties are often appropriate.
    Property - Description
    Checking Frequency - Number of minutes between checking whether Condition is true. Minimum and default is 1. When an alert is in the INVALID state, it is checked approximately every 15 minutes, instead of the specified checking frequency.
    Evaluation Strategy - Allows you to select Real-time Alerting. By default, Wavefront ignores values for the last 1 minute to account for delays. Many data sources are updated only at certain points in time, so the default evaluation strategy prevents spurious firings. If you select this check box, Wavefront includes values for the last 1 minute and evaluates the alert strictly on the ingested data. See Limiting the Effects of Data Delays.
    Resend Notifications - Whether to resend notification of a firing alert. If enabled, you can specify the number of minutes to wait before resending the notification.
    Unique PagerDuty Incidents - Select this option to receive separate PagerDuty notifications for each series that meets the alert conditions.
    For example, you get separate PagerDuty notifications for both the following series because the env tag is different.
    #first series
    app.errors source=machine env=prod

    #second series
    app.errors source=machine env=stage

    Metrics - Whether to include obsolete metrics. By default, alerts don't consider metrics that have not reported data for 4 weeks or more. Include obsolete metrics if your queries aggregate data over longer time frames.
  8. Click Save.
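
The ordering rule for severity thresholds in step 3 can be checked with a short Python sketch. The rule for greater-than operators comes from the text above; treating less-than operators as the mirror image (values decrease from INFO to SEVERE) is an assumption, and the function name `thresholds_match_operator` is hypothetical.

```python
SEVERITY_ORDER = ["INFO", "SMOKE", "WARN", "SEVERE"]  # increasing importance

def thresholds_match_operator(operator: str, thresholds: dict) -> bool:
    """Check that threshold values are ordered consistently with the operator.

    thresholds maps a severity name to its threshold value; not every
    severity has to be present.
    """
    values = [thresholds[s] for s in SEVERITY_ORDER if s in thresholds]
    pairs = list(zip(values, values[1:]))
    if operator in (">", ">="):
        return all(low < high for low, high in pairs)   # must increase toward SEVERE
    if operator in ("<", "<="):
        return all(low > high for low, high in pairs)   # assumed mirror-image rule
    raise ValueError(f"unsupported operator: {operator}")

print(thresholds_match_operator(">=", {"SEVERE": 6000, "WARN": 5000}))  # prints True
print(thresholds_match_operator(">=", {"SEVERE": 5000, "WARN": 6000}))  # prints False
```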

Delete an Alert

Users with Alerts permissions can delete an alert.

  1. Click Alerting in the taskbar to display the Alerts Browser.
  2. Click the ellipsis icon next to the alert.
  3. Select Delete and confirm the deletion.

Edit an Alert

You can change an alert at any time.

  1. Click Alerting in the taskbar to display the Alerts Browser.
  2. Click the name of the alert you want to edit to display the Edit Alert page.
  3. Update the properties you want to change, and click Save.

Use Backtesting to Fine-Tune Conditions

Wavefront can show hypothetical alert-generated events using backtesting. Backtesting enables you to fine-tune new or existing alert conditions before you save them.

When you create a classic alert, the Events Display is set to Backtesting. You can later edit the alert.

To change the events display:

  1. Select the alert and click Edit.
  2. Change the Events Display:
    • Actual Firings - Displays past alert-generated event icons on the chart. You will see how often the alert actually fired within the given chart time window.
    • Backtesting - Displays hypothetical alert-generated event icons on the chart. You can see how often an alert would fire within the chart time window based on the condition and the Alert Fires field.

Backtesting does not always exactly match the actual alert firing. For example:

  • If data comes in late, backtesting won’t match the actual alert firing.
  • If data meets the alert condition for the Alert fires length of time, the actual alert might not fire because the alert check, whose timing is determined by the checking frequency, happens too soon or too late. In both cases, backtesting shows the alert as firing while the actual alert might not fire.
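
To get a feel for how backtesting helps with tuning a threshold, the following Python sketch counts how often a hypothetical alert would start firing over the same data for several candidate thresholds. It is a simplified offline model, not Wavefront's backtesting: it assumes one data point per minute, evaluates every minute, treats a window as firing when it contains at least one true value and no false values, and ignores the Alert resolves setting. The function name `count_firings` and the sample data are made up for the example.

```python
from typing import List, Optional

def count_firings(values: List[Optional[float]],
                  threshold: float,
                  fires_minutes: int) -> int:
    """Count transitions into the firing state for the condition value > threshold."""
    firing = False
    count = 0
    for end in range(fires_minutes, len(values) + 1):
        window = values[end - fires_minutes:end]
        truths = [None if v is None else v > threshold for v in window]
        qualifies = any(t is True for t in truths) and not any(t is False for t in truths)
        if qualifies and not firing:
            count += 1          # the alert would start firing here
        firing = qualifies
    return count

# Hypothetical CPU utilization samples, one per minute.
cpu = [50, 72, 75, 80, 60, 85, 90, 95, 91, 40]
print({t: count_firings(cpu, t, fires_minutes=3) for t in (70, 80, 90)})
# prints {70: 2, 80: 1, 90: 0}
```

In this made-up data set, raising the threshold from 70 to 80 halves the number of firings, and a threshold of 90 never fires, which is the kind of trade-off backtesting lets you inspect before saving the alert.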

Do More!