Monitor CloudWatch, CloudTrail, and Metrics+ with Wavefront

Amazon Web Services (AWS) is a collection of cloud-computing services that provide an on-demand computing platform. The Wavefront Amazon Web Services integration allows you to ingest metrics directly from AWS. The built-in integration is part of the setup, but you need the additional steps in this document to complete and customize the integration setup.

You have to set up your Wavefront account with the correct permissions.

Supported AWS Integrations

The AWS integration ingests data from many products and provides dashboards for each. See any integration page for a list of dashboards. The following products are of special interest to most customers:

  • CloudWatch - retrieves AWS metric and dimension data. Includes some metrics for Amazon Relational Database Service (RDS).
  • CloudTrail - retrieves EC2 event information and creates Wavefront System events that represent the AWS events.
  • AWS Metrics+ - retrieves additional metrics using AWS APIs other than CloudWatch. The data include EBS volume data and EC2 instance metadata such as tags. You can investigate billing data and the number of reserved instances. Be sure to enable AWS Metrics+ because it allows Wavefront to optimize its use of CloudWatch, which saves money on CloudWatch calls.

CloudWatch Integration Details

Wavefront retrieves AWS metric and dimension data from AWS services using the AWS CloudWatch API. The complete list of metrics and dimensions that can be retrieved from AWS CloudWatch is available at Amazon CloudWatch Metrics and Dimensions Reference. In addition, you can publish custom AWS metrics that can also be ingested by the CloudWatch integration.

Configuring CloudWatch Data Ingestion

You can configure which instances and volumes to ingest metrics from, which metrics to ingest, and the rate at which Wavefront fetches metrics. To configure CloudWatch ingestion:

  1. In Wavefront, click Integrations in the task bar.
  2. In the Featured section, click the Amazon Web Services tile.
  3. Click the Setup tab.
  4. In the Types column, click the CloudWatch link in the row of the integration you want to configure.
  5. Configure ingestion properties:
    • Instance and Volume Whitelist fields - Whitelist instances and volumes by specifying EC2 tags (as <key>=<value> pairs) defined on the instances and volumes. For example, organization=<yourcompany>. When specified as a comma-separated list, the tags are OR’d. To use instance and volume whitelisting, you must also add an AWS Metrics+ integration because the AWS tags are imported from the EC2 service. If you don’t specify any tags, Wavefront imports metrics from all instances and volumes.
    • Metric Whitelist field - Whitelist metrics by specifying a regular expression. The regular expression must be a complete match of the entire metric name. For example, if you only want CloudWatch data for ELB and RDS (reported under aws.elb and aws.rds), then use a regular expression such as: ^aws.(elb|rds).*$. If you do not specify a regular expression, all CloudWatch metrics are retrieved.
    • Point Tag Whitelist - Whitelist AWS point tags by specifying a regular expression. If you do not specify a regular expression, no point tags are added to metrics.
    • Service Refresh Rate - Number of minutes between requesting metrics. Default: 5.
  6. Click Save.
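
The whitelist semantics described in step 5 can be sketched in a few lines of Python. This is only an illustration of the documented behavior (tags are OR'd, an empty whitelist allows everything, and the metric expression must match the entire name). The function names are hypothetical, and Python's re module stands in for whatever regular expression engine Wavefront uses, which this document does not specify.

```python
import re


def parse_tag_whitelist(spec):
    """Parse 'key=value,key2=value2' into a list of (key, value) pairs."""
    pairs = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        pairs.append((key.strip(), value.strip()))
    return pairs


def instance_allowed(instance_tags, spec):
    """Return True if ANY whitelisted tag matches (the tags are OR'd).

    An empty whitelist allows every instance or volume.
    """
    pairs = parse_tag_whitelist(spec)
    if not pairs:
        return True
    return any(instance_tags.get(key) == value for key, value in pairs)


def metric_allowed(metric_name, pattern):
    """Return True if the pattern matches the ENTIRE metric name.

    An empty pattern allows every metric.
    """
    if not pattern:
        return True
    return re.fullmatch(pattern, metric_name) is not None


if __name__ == "__main__":
    whitelist = "organization=acme,env=prod"  # hypothetical tag values
    print(instance_allowed({"env": "prod"}, whitelist))
    print(instance_allowed({"env": "dev"}, whitelist))
    print(metric_allowed("aws.elb.requestcount", r"^aws.(elb|rds).*$"))
    # A prefix alone is not a complete match of the metric name.
    print(metric_allowed("aws.elb.requestcount", r"aws.elb"))
```

Because the expression must cover the whole name, a pattern that omits the trailing .* rejects every metric that has a suffix after the service name.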

CloudWatch Sources and Source Tags

Wavefront automatically sets each metric’s source field and adds source tags to each AWS source, as follows:

Metric Source Field

Wavefront sets the value of the AWS metric source field by service:

  • EC2 - the value of the hostname, host, or name EC2 tags, if the tags exist and you have an EC2 integration. Otherwise, the source is set to the Amazon instance ID.
  • EBS - the Amazon instance ID of the EC2 instance the volume is attached to.
  • All other services - the value of the first CloudWatch dimension. The supported dimensions appear at the bottom of the Amazon service metric documentation topic. For example, see Amazon EC2 Dimensions.
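
As a rough sketch, the source-field rules above could be expressed as follows. The function name and argument shapes are hypothetical, and checking the hostname, host, and name tags in that order is an assumption; the text lists the tags but does not state which one wins when several exist.

```python
def metric_source(service, instance_id=None, ec2_tags=None, dimensions=None):
    """Pick the source value for an AWS metric, following the rules above.

    service     -- lowercase service name such as "ec2", "ebs", or "s3"
    instance_id -- EC2 instance ID (for EBS, the instance the volume is attached to)
    ec2_tags    -- dict of EC2 tags, available only with an EC2 integration
    dimensions  -- ordered list of (name, value) CloudWatch dimensions
    """
    ec2_tags = ec2_tags or {}
    dimensions = dimensions or []

    if service == "ec2":
        # Assumed precedence: hostname, then host, then name.
        for key in ("hostname", "host", "name"):
            value = ec2_tags.get(key)
            if value:
                return value
        return instance_id

    if service == "ebs":
        return instance_id

    # All other services: the value of the first CloudWatch dimension.
    if dimensions:
        return dimensions[0][1]
    return None


if __name__ == "__main__":
    print(metric_source("ec2", "i-0abc", {"name": "web-1"}))
    print(metric_source("ec2", "i-0abc"))
    print(metric_source("s3", dimensions=[("BucketName", "my-bucket")]))
```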

Source Tags

AWS sources are assigned source tags that identify their originating service following this pattern: ~integration.aws.<service>, for example, ~integration.aws.ec2.

CloudWatch Point Tags

Wavefront adds the following point tags to CloudWatch metrics:

  • accountId - the Amazon account that reported the metric.
  • Region - the region in which the service is running. Added to EC2 and EBS metrics only.
  • CloudWatch dimensions. The dimensions vary by service. For example, for AWS S3, the BucketName dimension is added as a point tag.

CloudWatch Pricing

Standard AWS CloudWatch pricing applies each time Wavefront requests metrics using the CloudWatch API. For pricing information, see AWS | Amazon CloudWatch | Pricing. After selecting a region, you can find the current expected price under Amazon CloudWatch API Requests. In addition, custom metrics have a premium price; see the Amazon CloudWatch Custom Metrics section of the pricing page. To limit cost, by default Wavefront queries the API every 5 minutes. However, you can change the request rate, which will change the cost.
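
To get a feel for how the request rate affects cost, the following back-of-the-envelope sketch estimates the number of API requests per month. It assumes one API request per metric per poll, which may not reflect how Wavefront actually batches its calls, and the price used in the example is a placeholder; check the AWS pricing page for your region.

```python
MINUTES_PER_DAY = 24 * 60


def requests_per_month(metric_count, refresh_minutes=5, days=30):
    """Estimate API requests per month.

    Assumption: one request per metric per poll.
    """
    polls = (days * MINUTES_PER_DAY) // refresh_minutes
    return metric_count * polls


def estimated_cost(metric_count, price_per_1000_requests,
                   refresh_minutes=5, days=30):
    """Estimate the monthly cost from a per-1,000-requests price."""
    requests = requests_per_month(metric_count, refresh_minutes, days)
    return requests / 1000 * price_per_1000_requests


if __name__ == "__main__":
    placeholder_price = 0.01  # hypothetical US$ per 1,000 requests
    for rate in (5, 10, 15):
        print(rate, requests_per_month(1000, rate),
              round(estimated_cost(1000, placeholder_price, rate), 2))
```

Under this model, doubling the refresh interval halves the request count, so the Service Refresh Rate setting and the Metric Whitelist are the two main levers for controlling cost.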

As an alternative to using the CloudWatch API for EC2 metrics, you can collect these metrics using a Telegraf collector on each AWS instance. In this case, to prevent Wavefront from requesting those metrics from CloudWatch, set the Metric Whitelist property to allow all metrics except EC2. For example:

^aws.(billing|instance|sqs|sns|reservedInstance|ebs|route53.health|ec2.status|elb|s3).*$
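
Before saving an expression like the one above, it can help to check which metric names it keeps. The sketch below partitions a list of names using Python's re module as a stand-in for the engine Wavefront uses (an assumption). The sample metric names are illustrative only.

```python
import re

# The example whitelist expression from the text above.
NON_EC2_WHITELIST = (
    r"^aws.(billing|instance|sqs|sns|reservedInstance|ebs|"
    r"route53.health|ec2.status|elb|s3).*$"
)


def partition_metrics(names, pattern=NON_EC2_WHITELIST):
    """Split metric names into (kept, dropped) using a full-name match."""
    kept, dropped = [], []
    for name in names:
        if re.fullmatch(pattern, name):
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


if __name__ == "__main__":
    sample = [
        "aws.ec2.cpuutilization",     # illustrative EC2 metric
        "aws.ec2.statuscheckfailed",  # kept by the ec2.status alternative
        "aws.ebs.volumesize",
        "aws.rds.cpuutilization",     # rds is not in the list
    ]
    kept, dropped = partition_metrics(sample)
    print("kept:", kept)
    print("dropped:", dropped)
```

Note that the expression is an allow list: any service that is not named in it, such as RDS in this sample, is dropped along with the EC2 metrics, so add further services to the alternation as needed.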

By default, on a new Wavefront trial, Wavefront limits the number of unique metrics that can be retrieved from CloudWatch to 10K to cap the AWS CloudWatch bill.

Configuring CloudWatch Billing Metrics

The AWS Billing and Cost Management service sends billing metrics to CloudWatch. You configure AWS to produce aws.billing.* metrics by checking the Receive Billing Alerts checkbox on the Preferences tab in the AWS Billing and Cost Management console:

[Screenshot: AWS billing]

Wavefront reports the single metric aws.billing.estimatedcharges. The source field and ServiceName point tag identify the AWS services. For the total estimated charge metric, source is set to usd and ServiceName is empty. Wavefront also provides the point tags accountId, Currency, LinkedAccount, and Region. Billing metrics are typically reported every 4 hours.

CloudTrail Events, Metrics, and Point Tags

Wavefront retrieves CloudTrail event information stored in JSON-formatted log files in an S3 bucket. The CloudTrail integration parses the files for all events that result from an operation that is not a describe, get, or list, and creates a Wavefront System event for each.

In the Events browser the events are named AWS Action: <Operation> and have the event tag aws.cloudtrail.ec2. For example:

[Screenshot: AWS start instance event]

Starting with release 2018.22.x, we group AWS CloudTrail events by the minute and report the metrics. We also support several point tags that allow you to filter the events.

CloudTrail Metrics

Each metric starts with aws.cloudtrail.event., followed by one of the EC2 operation names.

The EC2 operations include:

  • [Run|Start|Stop|Terminate|Monitor|Unmonitor]Instances
  • [Attach|Detach]Volume
  • DeleteNetworkInterface
  • AuthorizeSecurityGroupIngress
  • CreateSecurityGroup
  • RequestSpotInstances
  • CancelSpotInstanceRequests
  • ModifyInstanceAttribute
  • CreateTags
  • [Create|Delete]KeyPair
  • DeregisterImage

As a result, the metrics include, for example, aws.cloudtrail.event.StartInstances or aws.cloudtrail.event.CreateTags.

In addition, the metric aws.cloudtrail.event.total-per-minute reports the per-minute count of all AWS API calls recorded by the AWS CloudTrail integration.
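
The per-minute grouping can be pictured with the following minimal sketch. It is not Wavefront's implementation: it assumes records shaped like CloudTrail log entries with eventName and eventTime fields, skips describe, get, and list operations by name prefix, and assumes that total-per-minute counts only the events that are kept.

```python
from collections import Counter
from datetime import datetime

READ_ONLY_PREFIXES = ("Describe", "Get", "List")
METRIC_PREFIX = "aws.cloudtrail.event."
TOTAL_METRIC = METRIC_PREFIX + "total-per-minute"


def per_minute_counts(records):
    """Count CloudTrail events per (minute, metric name).

    Read-only operations (describe, get, list) are skipped.
    """
    counts = Counter()
    for record in records:
        name = record["eventName"]
        if name.startswith(READ_ONLY_PREFIXES):
            continue
        timestamp = datetime.strptime(record["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
        minute = timestamp.replace(second=0)  # truncate to the minute
        counts[(minute, METRIC_PREFIX + name)] += 1
        counts[(minute, TOTAL_METRIC)] += 1  # assumption: kept events only
    return counts


if __name__ == "__main__":
    sample = [
        {"eventName": "StartInstances", "eventTime": "2018-06-01T12:00:05Z"},
        {"eventName": "StartInstances", "eventTime": "2018-06-01T12:00:40Z"},
        {"eventName": "DescribeInstances", "eventTime": "2018-06-01T12:00:41Z"},
        {"eventName": "CreateTags", "eventTime": "2018-06-01T12:01:10Z"},
    ]
    for (minute, metric), value in sorted(per_minute_counts(sample).items()):
        print(minute.isoformat(), metric, value)
```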

Point Tags for Filtering CloudTrail Metrics

You can use the following point tags to filter the metrics.

Point tag | Description | Example
eventType | The type of event that generated the event record. | AwsApiCall, AwsServiceEvent
eventSource | The service that the request was made to. | ec2.amazonaws.com
Region | The AWS region that the request was made to. | us-east-2
accountId | The account ID that you specified when you set up the AWS CloudTrail integration. | User42
bucket | Bucket that you specified when you set up the AWS CloudTrail integration. | A random number

AWS Metrics+ Data

AWS Metrics+ are metrics retrieved using AWS metrics API calls other than CloudWatch. Unless otherwise indicated, Wavefront sets the value of the AWS Metrics+ source field to the AWS instance ID. If an EBS volume is detached, its source field is set to the volume ID. The metrics include:

  • aws.instance.price - EC2 instances and how much they cost per hour. This metric includes the point tags availabilityZone, instanceID, instanceLifecycle, instanceType, and operatingSystem.
  • aws.reservedinstance.count - Number of reserved instances in each availability zone by each instance type. This metric includes the point tags availabilityZone, instanceID, instanceType, and operatingSystem. This metric appears only if your account has reserved instances.
  • EBS metrics - EBS metrics include the point tags instanceID, Region, State, Status, volumeId, and volumeType (see Amazon EBS Volume Types). The Status can be attached, detaching, or attaching. The State can be available (detached) or in-use (attached).
    • aws.ebs.volumesize - The volume size of the elastic block store.
    • aws.ebs.volumeiops - The volume I/O operations of the elastic block store.
  • SQS - AWS SQS metrics retrieved every minute from the SQS service.
    • aws.sqs.approximatenumberofmessagesnotvisible - The number of messages that are “in flight.” Messages are considered in flight if they have been sent to a client but have not yet been deleted or have not yet reached the end of their visibility window.
    • aws.sqs.approximatenumberofmessagesdelayed - The number of messages in the queue that are delayed and not available for reading immediately. This can happen when the queue is configured as a delay queue or when a message has been sent with a delay parameter.
    • aws.sqs.approximatenumberofmessages (aliased to the CloudWatch metric aws.sqs.approximatenumberofmessagesvisible) - The number of messages available for retrieval from the queue.
  • Pricing Metrics - capture the current pricing of EC2 instances. These metrics are available as a preview and subject to change. These metrics have the point tags instanceType, operatingSystem, Region, purchaseOption (All Upfront, Partial Upfront, No Upfront), leaseContractLength (1 or 3 years), and offeringClass (standard or convertible). The source field is set to the display name of the region. For example, if Region=us-west-2, then source=us west (oregon).
    • ~sample.aws.ec2.on-demand.price.hourly - the hourly price (in US$) of an on-demand instance.
    • ~sample.aws.ec2.reserved.price.upfront - the up-front payment (in US$) for a reservation. This metric reports 0 when purchaseOption is No Upfront.
    • ~sample.aws.ec2.reserved.price.hourly - the hourly payment (in US$) for a reservation. This metric reports 0 when the purchaseOption is All Upfront.
  • RDS Metrics - give insight into Amazon Relational Database Service (RDS).
    • aws.rds.allocatedstorage - The amount of storage (in gigabytes) allocated for the database instance.
    • aws.rds.capacity - For Amazon Aurora only, RDS capacity.
    • aws.rds.backtrackconsumedchangerecords - For Amazon Aurora only, the number of change records stored for Backtrack.
  • Service Limit Metrics - capture the current resource limits and usage for your AWS account. These metrics include the point tags Region and category.
    • aws.limits.<resource>.limit - the current limit for an AWS resource in a particular region.
    • aws.limits.<resource>.usage - the current usage of an AWS resource in a particular region. See the following section for details.

AWS Metrics+ Trusted Advisor Service Limits

Each AWS account has limits on the amount of resources that are available to you for each AWS service. You can monitor and manage your resource usage and limits using the AWS service limit metrics in Wavefront.

If you have an account with the required permissions, you can view the available service limits in the AWS Trusted Advisor console.

Example Queries for Service Limits

Here are a few sample queries:

To visualize your limits for EC2 On-Demand Instances per region, you can run the following query:

ts(aws.limits.on_demand_instances_*.limit)

To visualize your usage for EC2 On-Demand Instances per region, you can run the following query:

ts(aws.limits.on_demand_instances_*.usage)

Example Alert for Trusted Advisor Service Limits

Sample alerts from the Wavefront Ops team are on this page.

The following alert is a simple illustration of how alerts like this work.

You can set up an alert to notify you when data reach a certain threshold.

The following chart sets up variables for on-demand instances limit and on-demand instance usage. The visible query shows the percentage.

[Screenshot: chart for service limits query]

We can create a multi-threshold alert for this query that:

  • Fires if the condition has been true for the last 30 minutes.
  • Notifies SEVERE targets if the value is greater than 90.
  • Notifies WARN targets if the value is greater than 80.

[Screenshot: service limits alarm]
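
The logic behind such an alert can be sketched outside Wavefront as follows. This is a simplified model, not the Wavefront alert engine: it assumes the percentage is usage divided by limit, and it treats "true for the last 30 minutes" as every sample in the window exceeding the threshold. The function names and default thresholds mirror the example above but are otherwise hypothetical.

```python
def usage_percent(usage, limit):
    """Return usage as a percentage of the limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * usage / limit


def alert_severity(percent_window, warn=80.0, severe=90.0):
    """Classify a window of percentage samples (for example, the last 30 minutes).

    Returns "SEVERE", "WARN", or None. A level applies only if every
    sample in the window is above its threshold (simplifying assumption).
    """
    if not percent_window:
        return None
    if all(p > severe for p in percent_window):
        return "SEVERE"
    if all(p > warn for p in percent_window):
        return "WARN"
    return None


if __name__ == "__main__":
    window = [usage_percent(u, 50) for u in (46, 47, 48)]
    print(window, alert_severity(window))
```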