Wavefront Usage Integration
The Wavefront Service and Proxy Data dashboard, part of the Wavefront Usage integration, allows you to examine internal metrics. These metrics allow you to check whether your Wavefront instance is behaving as expected.
The Wavefront Ingestion Policy Explorer dashboard provides a granular breakdown of Wavefront ingestion across your organization by ingestion policies, accounts, sources, and types. Use this dashboard to identify who is contributing the most to your Wavefront usage and manage your overall usage.
The Wavefront Namespace Usage Explorer dashboard breaks down metrics usage by integration and lets you drill down further into the metric namespaces.
The Committed Rate and Monthly Usage (PPS P95) dashboard displays a detailed breakdown of your Wavefront monthly usage. This enables you to take appropriate action when your Wavefront usage reaches around 95% of your target/committed usage.
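As an illustration of the idea behind a PPS P95 figure, the following minimal sketch computes a nearest-rank 95th percentile over a list of points-per-second samples and compares it with a committed rate. The function names, the sample data, and the 0.95 threshold are assumptions made for this example; the dashboard's actual calculation may differ.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the nearest-rank percentile (pct in 1..100) of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of the
    # samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def usage_ratio(pps_samples, committed_pps):
    """Ratio of the P95 points-per-second value to the committed rate."""
    return percentile_nearest_rank(pps_samples, 95) / committed_pps


def near_commitment(pps_samples, committed_pps, threshold=0.95):
    """True when P95 usage is at or above the (hypothetical) threshold."""
    return usage_ratio(pps_samples, committed_pps) >= threshold
```

For example, with hypothetical samples `range(1, 101)` and a committed rate of 100 PPS, `near_commitment(list(range(1, 101)), 100)` evaluates the P95 value against the commitment and flags that usage has reached the threshold.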
Wavefront internal metrics have the following prefixes
To modify the Wavefront Usage alerts, install them and then clone them. You must update the required fields in the cloned alerts.
- High rate of host IDs observed: Alerts when a tenant reports a high rate of new host (source) IDs to Wavefront.
- High rate of metric IDs observed: Alerts when a tenant reports a high rate of new metric IDs to Wavefront.
- High rate of string IDs observed: Alerts when a tenant reports a high rate of new string (point tag) IDs to Wavefront.
- Alert Webhooks failed: Alerts when alert webhooks end with either a 4xx or a 5xx error.
- Proxy check-in with an invalid token observed: Alerts when a proxy checks in by using an invalid token.
- Proxy Network Latency (P95): Reports the 95th percentile of the latency. Latency measures the time (the duration) it takes for the proxy to push its metrics. Consistently large values mean that the network suffers from latency.
- Proxy Data Received Lag (P95): Reports the 95th percentile of the time differences (in milliseconds) between the timestamp on a point and the time that the proxy received it. Large values indicate backfilling of old data or clock drift in the sending systems. You can also graph other percentiles.
- Proxy Backlog (spans) has been accumulating: Alerts when there are span backlogs on the proxy. Backlogs mean that the proxy is queuing spans because either the span data transmission between the proxy and the service has been blocked, or the data is being pushed back by the service because the ingestion limit is reached.
- Proxy Backlog (histograms) has been accumulating: Alerts when there are histogram backlogs on the proxy. Backlogs mean that the proxy is queuing histograms because either the data transmission between the proxy and the service has been blocked, or the data is being pushed back by the service because the ingestion limit is reached.
- Proxy rate limiter is activated: Alerts when the 30-minute moving sum of the points per second rate is consistently high. Check to see which proxy is affected by the data being pushed back.
- Proxy Backlog (points) has been accumulating: Alerts when there are metric backlogs on the proxy. Backlogs mean that the proxy is queuing metric points because either the data transmission between the proxy and the service has been blocked, or the data is being pushed back by the service because the ingestion limit is reached.
- High proxy JVM memory heap usage observed: Alerts when the heap memory usage of the proxy is consistently high. Make sure that the memory of the proxy is reasonable.
- Wavefront rate limits exceeded: Alerts when the points per second that are being rate limited reach a certain threshold.
- Invalid Alert Condition Found: Alerts when some alerts are in an invalid state. To find the detailed list of the alerts that are currently in an invalid state, search the alert list for the invalid status.
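The "Proxy rate limiter is activated" alert above is described in terms of a moving sum that stays high. The following minimal sketch shows that general idea on a plain list of evenly spaced samples. The window size, threshold, and required number of consecutive high values are hypothetical parameters chosen for illustration; they are not the conditions the integration's alert actually uses.

```python
from collections import deque


def moving_sum(values, window):
    """Yield the sum of the most recent `window` values at each step."""
    recent = deque()
    total = 0
    for value in values:
        recent.append(value)
        total += value
        if len(recent) > window:
            # Drop the oldest sample once the window is full.
            total -= recent.popleft()
        yield total


def sustained_above(values, window, threshold, min_consecutive):
    """True if the moving sum exceeds `threshold` for at least
    `min_consecutive` steps in a row (all parameters are illustrative)."""
    run = 0
    for current in moving_sum(values, window):
        run = run + 1 if current > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With one sample per minute, a `window` of 30 would correspond to a 30-minute moving sum. Requiring several consecutive high values, rather than a single one, is one way to express "consistently high" and avoid reacting to a brief spike.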