Learn about gauges, counters, delta counters, histograms, and spans.

Wavefront supports monitoring time series, histograms, and traces.

  • Each time series consists of numeric data points for a metric, for example, CPU load or failed network connections. Time series can use one of the supported data formats. The type of data that you’re collecting determines the type of metric. Wavefront supports gauges, counters, delta counters, and more.

  • Wavefront histograms let you compute, store, and use distributions of metrics rather than single metrics. Histograms are useful for high-velocity metrics about your applications and infrastructure, particularly metrics that are gathered across many distributed sources.
  • Distributed tracing enables you to track the flow of work that is performed by an application as it processes a user request. We support the OpenTracing standard. You can either visualize and examine traces coming from a 3rd-party system such as Jaeger or Zipkin, or instrument your application for tracing using one of our SDKs.

Summary of Metric Types

The following table gives an overview of metric types. We introduce each type in more detail below.

Metric | Description | Example
Gauge | Shows current value for each point in time. | CPU load, network connections.
Counter | Shows values as they increase (and decrease). | Number of failed connections, registered users.
Delta counter | Useful for monitoring bursty traffic in a Function-as-a-Service (serverless) environment. | Shows how many times a FaaS function executed (or failed).
Histogram | Supports computing, storing, and using distributions of metrics that use the Wavefront histogram format. Useful for very high frequency data. | See the discussion of histograms.
Trace | A trace shows you how a request propagates from one microservice to the next in a distributed application. The basic building blocks of a trace are its spans. You can think of a trace as a tree of related spans. The trace has a unique trace ID, which is shared by each member span in the tree. | See Sample Application for an example.
Span | Spans are the fundamental units of trace data. Each span corresponds to a distinct invocation of an operation that executes as part of the request. | For example, in our BeachShirts sample application, we have the beachshirts.shopping operation, which includes many invocations of the Printing.getAvailableColors span.

Gauges

A gauge shows the current value for each point in time. Think of a thermometer that shows the current temperature or a gauge that shows how much electricity your Tesla has left.

Many metrics that come into Wavefront are gauges. For example, Wavefront internal metrics include ~alert.checking_frequency.{id} and ~alert.query_time.{alert_id}.

Counters

Counters show information over time. Think of a person with a counter at the entrance to a concert. The counter shows the total number of people that have entered so far.

Counter metrics usually increase over time but might reset back to zero, for example, when a service or system restarts. Users can wrap rate() around a counter if they want to ignore temporary 0 values and see only the positive rate of change.

Wavefront internal metrics that are counters include ~metric.new_host_ids and ~query.requests.

Counter Example (Count Total)

In most cases, you can get the information you need from a counter as follows:

  1. A counter usually represents something like “how many requests have been processed” or “how many errors happened”. You get the metric like this:
    ts(~sample.network.bytes.received)
    
  2. You use the rate() function to get the corresponding per-second rate so you know, for example, “how many requests have been processed per second?” or “how many errors are happening per second?”:
    rate(ts(~sample.network.bytes.received))
    
  3. There are often multiple time series that have the counter (e.g. coming from different sources). Each time series reports the count of the requests received or errors. If you’re interested in the total count across your system, you can use sum() to sum it up into a single time series.
    sum(rate(ts(~sample.network.bytes.received)))
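
The three steps above can be sketched in plain Python to show what each one computes. This is an illustration only, not the Wavefront implementation: the helper names positive_rate and sum_series and the sample data are hypothetical, and the real sum() function may interpolate between points, which this sketch does not.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, counter value)

def positive_rate(points: List[Point]) -> List[Point]:
    """Per-second rate between adjacent points; decreases (resets) are skipped."""
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v1 >= v0 and t1 > t0:
            rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates

def sum_series(series: List[List[Point]]) -> List[Point]:
    """Add up values that share a timestamp across series (no interpolation)."""
    totals = {}
    for s in series:
        for t, v in s:
            totals[t] = totals.get(t, 0.0) + v
    return sorted(totals.items())

# Hypothetical counters from two sources; host_a restarts at t=120.
host_a = [(0, 100), (60, 160), (120, 0), (180, 30)]
host_b = [(0, 10), (60, 70), (120, 190), (180, 250)]

rates_a = positive_rate(host_a)   # the reset at t=120 produces no point
rates_b = positive_rate(host_b)
total = sum_series([rates_a, rates_b])
print(total)
```

The reset on host_a does not show up as a large negative rate; it is simply left out, which is the behavior the text describes for rate().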
    

Counter Example (Count Total Over Time Period)

If you want to count the total number of occurrences over a certain time period, the syntax is slightly more complex. Because counters commonly reset to zero, you need a query that counts the total number of increments over the time period you’re looking at. You want to ignore any counter resets.

Here, we want to get the number of errors for 1 day.

  1. We start by wrapping the counter with ratediff(), which, in contrast to rate(), returns the absolute difference between incrementing data points without dividing by the number of seconds between them.
    ratediff(ts(the.counter))
    
  2. We use align() to group the data values of the time series into 1-minute buckets.
    align(1m, sum, ratediff(ts(the.counter)))
    
  3. We use rawsum() to combine all time series into one series without interpolation.
    rawsum(align(1m, sum, ratediff(ts(the.counter))))
    
  4. Finally, we get the result for 1 day by using the msum() function.
    msum(1d, rawsum(align(1m, sum, ratediff(ts(the.counter)))))
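
To make the four steps concrete, here is a small Python sketch that mimics each function on sample data. The Python helpers share names with the query functions only for readability. They are simplified stand-ins written for this example, not the Wavefront implementations, and the sample series are made up.

```python
from collections import defaultdict

def ratediff(points):
    """Differences between adjacent counter values; decreases (resets) are skipped."""
    return [(t1, v1 - v0) for (t0, v0), (t1, v1) in zip(points, points[1:]) if v1 >= v0]

def align_sum(points, bucket_seconds):
    """Group points into fixed-size time buckets and sum each bucket."""
    buckets = defaultdict(float)
    for t, v in points:
        buckets[t - t % bucket_seconds] += v
    return sorted(buckets.items())

def rawsum(series):
    """Add up values that share a timestamp across series, without interpolation."""
    totals = defaultdict(float)
    for s in series:
        for t, v in s:
            totals[t] += v
    return sorted(totals.items())

def msum(points, window_seconds):
    """Moving sum: for each point, the total of all values in the trailing window."""
    return [(t, sum(v for u, v in points if t - window_seconds < u <= t))
            for t, _ in points]

# Hypothetical error counters from two sources; errors_a resets before t=90.
errors_a = [(0, 5), (30, 7), (60, 9), (90, 1), (120, 4)]
errors_b = [(0, 0), (30, 1), (60, 1), (90, 2), (120, 6)]

per_minute = [align_sum(ratediff(s), 60) for s in (errors_a, errors_b)]
combined = rawsum(per_minute)
daily = msum(combined, 24 * 60 * 60)
print(daily[-1])  # last point carries the total number of increments so far
```

In this data, errors_a increments by 7 in total (the reset is ignored) and errors_b by 6, so the final moving sum is 13.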
    

Gauge into Counter

To turn a gauge into a counter, you can use query language functions such as integral(). For example, you could apply integral() to ~alert.checking_frequency.My_ID to see the trend in checking frequency instead of the raw data.

    integral(ts(~alert.checking_frequency.My_ID))

Delta Counters

Delta counters are well suited for the kind of bursty traffic you typically get in a Function-as-a-Service environment. Many functions execute simultaneously and it’s not possible to monitor bursty traffic like that without losing metric points to collision.

For example, instead of one person with a counter standing at a concert entrance, imagine several people with counters, each standing at a different entrance. No single person can capture the composite count, so you add up the counters. In the same way, the Wavefront service can aggregate delta counter information.

If a metric starts with a delta character, the Wavefront service considers that metric a delta metric. The Wavefront service aggregates delta metric points and stores the aggregated point.

The following illustration compares a counter and a delta counter.

  • The counter mycounter sends 3 data points to the Wavefront service. Wavefront stores each value with its timestamp. When you run a query, such as integral(), the Wavefront service fetches the stored values, aggregates them, and returns the result.
  • In the delta counter use case, a FaaS environment runs the function in multiple function invocation instances and sends the points to the Wavefront service. The Wavefront service aggregates the points and stores the result. When the user runs a query, the Wavefront service fetches the already aggregated value.

[Illustration: counter compared with delta counter]
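
The difference between the two cases can be sketched in Python. Everything here is an assumption made for illustration: the DeltaAggregator class, the 60-second aggregation bucket, the metric name, and the last-write-wins behavior of the plain store are hypothetical and are not taken from the Wavefront service.

```python
from collections import defaultdict

class DeltaAggregator:
    """Hypothetical stand-in for server-side aggregation of delta points."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds  # assumed aggregation interval
        self.stored = defaultdict(float)      # (metric, bucket start) -> aggregated value

    def report(self, metric, timestamp, delta):
        # Points that land in the same bucket are added together before storage.
        bucket = timestamp - timestamp % self.bucket_seconds
        self.stored[(metric, bucket)] += delta

    def query(self, metric, bucket):
        # The query returns the value that was already aggregated.
        return self.stored.get((metric, bucket), 0.0)

# Three concurrent function invocations each report one error.
reports = [(1000, 1), (1000, 1), (1001, 1)]

agg = DeltaAggregator()
for ts, delta in reports:
    agg.report("function.errors", ts, delta)

# A plain store keyed by timestamp: points with the same timestamp collide.
plain = {}
for ts, value in reports:
    plain[("function.errors", ts)] = value

print(agg.query("function.errors", 960), sum(plain.values()))
```

The aggregated store keeps all three increments, while the plain store keeps only two points because two invocations reported with the same timestamp.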

Histograms

Wavefront can receive and store metrics at 1 point per second per unique source. However, some scenarios generate metrics even more frequently. Suppose you are measuring the latency of web requests. If you have a lot of traffic at multiple servers, you may have multiple distinct measurements for a given metric, timestamp, and source. Using “normal” metrics, we can’t measure this.

To address high frequency data, Wavefront supports histograms – a mechanism to compute, store, and use distributions of metrics. A Wavefront histogram is a distribution of metrics collected and computed by the Wavefront proxy. Histograms are supported by Wavefront proxy 4.12 and later. Wavefront Histograms describes the histogram format, histogram ports, and some examples.
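
As a rough illustration of why a distribution helps, the following Python sketch groups several latency measurements that share a timestamp into a per-minute distribution and reads percentiles from it. It keeps exact value counts, which is a simplification chosen for this example. It does not reproduce the Wavefront histogram format or the proxy's algorithm, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def build_histograms(measurements, bucket_seconds=60):
    """Group raw (timestamp, value) measurements into per-bucket value counts."""
    histograms = defaultdict(Counter)
    for ts, value in measurements:
        histograms[ts - ts % bucket_seconds][value] += 1
    return histograms

def percentile(hist, p):
    """Nearest-rank percentile over a Counter that maps value -> count."""
    total = sum(hist.values())
    rank = max(1, math.ceil(p * total / 100))
    seen = 0
    for value in sorted(hist):
        seen += hist[value]
        if seen >= rank:
            return value

# Hypothetical request latencies in ms; several share the same second.
latencies = [(0, 20), (0, 20), (0, 35), (1, 20), (1, 250),
             (2, 35), (2, 40), (3, 20), (59, 35), (59, 40)]

minute = build_histograms(latencies)[0]
print(percentile(minute, 50), percentile(minute, 90), max(minute))
```

All ten measurements survive in the distribution, even though several share a timestamp and a source.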


Traces and Spans

Wavefront follows the OpenTracing standard for representing and manipulating trace data.

  • A trace represents an individual workflow in an application. A trace shows you how a particular request propagates through your application or among a set of services.

  • Spans are the individual segments of work in the trace. A Wavefront trace consists of one or more spans. Each span represents time spent by an operation in a service (often a microservice).

Because requests normally consist of other requests, a trace actually consists of a tree of spans.
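
The tree structure can be sketched with a small Python data model. The Span fields, the span IDs, and the operation names below are hypothetical and chosen only for illustration. They are not the Wavefront span format or the fields defined by OpenTracing.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Span:
    trace_id: str                      # shared by every span in the trace
    span_id: str
    operation: str
    parent_id: Optional[str] = None    # None marks the root span
    children: List["Span"] = field(default_factory=list)

def build_trace(spans):
    """Link spans into a tree through their parent IDs and return the root span."""
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in spans:
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    return root

def render(span, depth=0):
    """Return the tree as indented lines, one per span."""
    lines = ["  " * depth + span.operation]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines

spans = [
    Span("t1", "a", "shopping.order"),
    Span("t1", "b", "styling.makeShirts", parent_id="a"),
    Span("t1", "c", "printing.getAvailableColors", parent_id="b"),
    Span("t1", "d", "delivery.dispatch", parent_id="a"),
]

root = build_trace(spans)
tree = render(root)
print("\n".join(tree))
```

Each nested request becomes a child span, so the indentation of the rendered output mirrors how the request fans out across services.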

Metrics Browser

Use the Metrics Browser to see which metrics are available in your environment and to hide and redisplay metrics.

To view, hide, and redisplay metrics:
  1. Select Browse > Metrics.
  2. Use the options on the left to narrow down your search.

Hiding and Unhiding Metrics

You can manually hide metrics from the Metrics browser and in the autocomplete dropdown associated with queries. Manually hiding metrics does not permanently delete a metric or metric namespace.

To hide one or more metrics:
  1. Select Browse > Metrics.
  2. Click the Manage Hidden Metrics button.
  3. In the dialog, type a complete metric name (e.g. requests.latency) or a metric prefix (e.g. requests., cpu.loadavg.).
    • This field does not support auto-complete, so you have to type the entire metric name or metric prefix.
    • The text is case sensitive.
    • Wildcards are not supported. The star * character is considered part of the text string.
  4. Press Enter to add the metric(s) to the list and click Save.
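
The matching rules in step 3 can be illustrated with a short Python sketch. It assumes that an entry hides every metric whose name starts with that entry, which is an interpretation of "metric prefix" made for this example. The is_hidden function is hypothetical and not part of any Wavefront API.

```python
def is_hidden(metric_name, hidden_entries):
    """True if the name starts with any hidden entry (case sensitive, no wildcards)."""
    return any(metric_name.startswith(entry) for entry in hidden_entries)

# "disk.*" is treated as literal text, so the star does not act as a wildcard.
hidden = ["requests.", "cpu.loadavg.", "disk.*"]

print(is_hidden("requests.latency", hidden))   # matches the prefix "requests."
print(is_hidden("Requests.latency", hidden))   # different case, no match
print(is_hidden("disk.used", hidden))          # the star is not expanded
```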
To view hidden metrics:
  1. Select Browse > Metrics.
  2. Click the Manage Hidden Metrics button.
  3. Click the Unhide button to the right of the metric or metric prefix that you want to unhide, and click Save.
The selected metrics and metric prefixes appear again as long as they have had at least 1 reported data value in the last 4 weeks. Otherwise, these metrics and metric prefixes are considered obsolete and Wavefront hides them. You can show obsolete metrics for individual charts or alerts.

Learn More!

Search this doc set for details on any of the metric types, or read this: