Learn how to use Wavefront histograms.

Wavefront histograms let you compute, store, and use distributions of metrics rather than single metrics. Histograms are useful for high-velocity metrics about your applications and infrastructure – particularly those gathered across many distributed sources. You can send histograms to a Wavefront proxy or use direct ingestion.

Getting Started

Watch this video for an introduction to histograms:

histograms

The following blog posts give some background information:

Why Use Histograms?

Wavefront can receive and store highly granular metrics at 1 point per second per unique source. However, some scenarios generate even higher frequency data. Suppose you are measuring the latency of web requests. If you have a lot of traffic at multiple servers, you may have multiple distinct measurements for a given metric, timestamp, and source. Using “normal” metrics, we can’t measure this because, rather than metric-timestamp-source mapping to a single value, the composite key maps to a multiset (multiple and possibly duplicate values).

One approach to dealing with high frequency data is to calculate an aggregate statistic, such as a percentile, at each source and send only that data. The problem with this approach is that performing an aggregate of a percentile (such as a 95th percentile from a variety of sources) does not yield an accurate and valid percentile with high velocity metrics. That might mean that even though you have an outlier in some of the source data, it becomes obscured by all the other data.
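To make the problem concrete, here is a small Python sketch. The two sources, their latency values, and the nearest-rank `percentile` helper are all made up for illustration; none of this is Wavefront code. It compares the average of per-source 95th percentiles with the 95th percentile of the combined data.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the value at rank ceil(p% of n) in sorted order."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latencies (ms) reported by two sources in the same interval.
source_a = [10] * 100                # 100 fast requests
source_b = [10] * 80 + [500] * 20    # 80 fast requests and 20 slow ones

p95_a = percentile(source_a, 95)     # 10
p95_b = percentile(source_b, 95)     # 500

# Aggregating the per-source percentiles...
average_of_p95s = (p95_a + p95_b) / 2
# ...versus computing the percentile over all points.
true_p95 = percentile(source_a + source_b, 95)

print(average_of_p95s, true_p95)
```

With these values the averaged figure is 255 while the percentile over all points is 500, so the aggregate of percentiles understates the slow requests.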

To address high frequency data, Wavefront supports histograms – a mechanism to compute, store, and use distributions of metrics. A Wavefront histogram is a distribution of metrics collected and computed by the Wavefront proxy (4.12 and later), or sent to the Wavefront service via direct ingestion. To indicate that metrics should be treated as histogram data, send them to one of the histogram proxy ports or, if you use direct ingestion, include f=histogram.

The Wavefront service rewrites the names of histogram metrics, which you can query with a set of functions.

Wavefront Histogram Distributions

Wavefront creates distributions by aggregating metrics into bins. The following figure illustrates a distribution of 205 points that range in value from 0 to 120 at t = 1 minute, into bins of size 10.

histogram

The following table lists the distribution of one metric at successive minutes. The first row of the table contains the distribution illustrated in the figure. The following rows show how the distribution evolves over successive minutes.

Time (minute)   Distribution (number of points)
1               [2, 1, 9, 20, 31, 40, 40, 29, 19, 10, 2, 2]
2               [2, 1, 9, 22, 31, 38, 41, 28, 17, 11, 3, 2]
3               [1, 2, 10, 21, 31, 39, 40, 29, 19, 10, 1, 2]
4               [2, 1, 9, 19, 29, 40, 41, 31, 20, 10, 1, 2]
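As a rough illustration of fixed-width binning, the following Python sketch counts values into 12 bins of size 10. The `bin_counts` helper and the sample values are assumptions made for this example; the sample is not the data behind the figure. The sketch also checks that the first row of the table adds up to 205 points.

```python
def bin_counts(values, bin_size=10, num_bins=12):
    """Count how many values fall into each fixed-width bin."""
    counts = [0] * num_bins
    for value in values:
        # Clamp so that the top edge (120) lands in the last bin.
        index = min(int(value // bin_size), num_bins - 1)
        counts[index] += 1
    return counts

# Made-up sample values in the range 0 to 120.
sample = [3, 7, 15, 22, 25, 28, 64, 64, 119, 120]
sample_counts = bin_counts(sample)

# First row of the table: 12 bins holding 205 points in total.
minute_1 = [2, 1, 9, 20, 31, 40, 40, 29, 19, 10, 2, 2]
total_points = sum(minute_1)

print(sample_counts, total_points)
```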

The Wavefront histogram bin size is computed using a T-digest algorithm, which retains better accuracy at the distribution edges where outliers typically arise. In the algorithm, bin size is not uniform (unlike the histogram illustrated above). However, the bin size that the algorithm selects is irrelevant.

Wavefront histograms do not store each actual data point value that is fed to them. Instead, histograms store the quantiles calculated from histogram points, which are estimates within a certain margin of error.

Histogram Metric Aggregation Intervals

Wavefront supports aggregating metrics by the minute, hour, or day. Intervals start and end on the minute, hour, or day, depending on the granularity that you choose. For example, day-long intervals start at the beginning of each day, UTC time zone.

The aggregation intervals do not overlap. If you are aggregating by the minute, a value reported at 13:58:37 is assigned to the interval that starts at 13:58:00 and ends just before 13:59:00. If no metrics are sent during an interval, no histogram points are recorded.
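The interval assignment can be sketched in Python by rounding an epoch timestamp down to the start of its minute, hour, or day in UTC. The `interval_start` helper and the example date are assumptions for illustration, not part of any Wavefront API.

```python
from datetime import datetime, timezone

INTERVAL_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}

def interval_start(epoch_seconds, granularity):
    """Return the epoch second at which the enclosing UTC interval begins."""
    size = INTERVAL_SECONDS[granularity]
    return epoch_seconds - (epoch_seconds % size)

def as_utc_clock(epoch_seconds):
    """Format an epoch second as HH:MM:SS in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%H:%M:%S")

# A value reported at 13:58:37 UTC (the date is arbitrary).
reported = int(datetime(2017, 5, 3, 13, 58, 37, tzinfo=timezone.utc).timestamp())

minute_start = as_utc_clock(interval_start(reported, "minute"))
hour_start = as_utc_clock(interval_start(reported, "hour"))
day_start = as_utc_clock(interval_start(reported, "day"))

print(minute_start, hour_start, day_start)
```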

Sending Histogram Distributions

A histogram distribution allows you to combine multiple points into a complex value that has a single timestamp.

To send a histogram distribution to the Wavefront proxy:

  • Send to the distribution port listed in the table in Histogram Proxy Ports.

  • Use the following format:

    {!M | !H | !D} [<timestamp>] #<points> <metricValue> [... #<points> <metricValue>]
     <metricName> source=<source>
     [<pointTagKey1>=<value1> ... <pointTagKeyN>=<valueN>]
    

    where

    • {!M | !H | !D} identifies the aggregation interval (minute, hour, or day) used when computing the distribution
    • <points> is the number of points that have the value <metricValue> that follows it.
    • all elements not enclosed in square brackets, including the source, are required.

    For example:

    !M 1493773500 #20 30 #10 5 request.latency source=appServer1 region=us-west
    

    is a distribution that sends 20 points of the metric request.latency with value 30, and 10 points with value 5, that have been aggregated into minute intervals.
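A line in this format can be assembled with a few lines of Python. The `format_distribution` helper below is a hypothetical name used only for this sketch; it builds the string and does not send anything to a proxy.

```python
def format_distribution(interval, centroids, metric, source, timestamp=None, tags=None):
    """Build one histogram distribution line.

    interval:  "M", "H", or "D"
    centroids: iterable of (number_of_points, value) pairs
    timestamp and tags are optional, matching the square brackets in the format.
    """
    if interval not in ("M", "H", "D"):
        raise ValueError("interval must be 'M', 'H', or 'D'")
    parts = ["!" + interval]
    if timestamp is not None:
        parts.append(str(timestamp))
    for count, value in centroids:
        parts.append(f"#{count} {value}")
    parts.append(metric)
    parts.append(f"source={source}")
    for key, val in (tags or {}).items():
        parts.append(f"{key}={val}")
    return " ".join(parts)

line = format_distribution(
    "M",
    [(20, 30), (10, 5)],
    "request.latency",
    "appServer1",
    timestamp=1493773500,
    tags={"region": "us-west"},
)
print(line)
```

Called with the values from the example above, the helper reproduces the same line.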

You can also send a histogram distribution using direct ingestion. In that case, you must include f=histogram or your data are treated as metrics even if you use histogram data format.

Histogram Example

Suppose you want to send the following points to the Wavefront proxy:

10, 20, 20, 30, 40, 100, 100

If you want an hourly aggregation, you can send those points as a distribution to the histogram distribution listener port. By default, that port is 2878 for proxy 4.29 and later, and 40000 for earlier proxy versions.

!H <timestamp> #1 10 #2 20 #1 30 #1 40 #2 100 my.metric source=s1

Here, you specify:

  • the interval, in this case hours (!H)
  • timestamp (optional)
  • a set of sequences. Each sequence starts with #, followed by the number of points and the value of the points. In this example, we have 2 for 20 because we’re sending 2 points with the value 20.
  • metric name
  • source
  • optional point tag keys and values
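The sequences can be derived from the raw points by counting duplicates. The following Python sketch does that with `collections.Counter` from the standard library; the variable names are illustrative, and the optional timestamp is left out.

```python
from collections import Counter

# The raw points from this example.
points = [10, 20, 20, 30, 40, 100, 100]

# Count how often each value occurs, then emit one "#<points> <value>" pair per value.
counts = Counter(points)
sequences = " ".join(f"#{counts[value]} {value}" for value in sorted(counts))

# Hourly interval, no timestamp, metric name and source as in the example.
line = f"!H {sequences} my.metric source=s1"
print(line)
```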

You can also send the histogram data to one of the histogram proxy ports in Wavefront data format. For this example, we use the hour port (40002). You have to send each point separately and include a timestamp, and all points have to arrive within the hour. For example, points sent in the range 3:00-3:59 are aggregated into the same hourly interval, which shows at 3:00 with an hs() query.

my.metric 10 <t1> source=<source>
my.metric 20 <t2> source=<source>
my.metric 20 <t3> source=<source>
my.metric 30 <t4> source=<source>
my.metric 40 <t5> source=<source>
my.metric 100 <t6> source=<source>
my.metric 100 <t7> source=<source>

The proxy aggregates the points and sends only the histogram distribution to Wavefront. The Wavefront service knows only what each bin is and how many points are in each bin. Wavefront does not store the value of each single histogram point. Instead, it computes and stores the distribution.

You can now apply other functions to the histogram. For example, to find the 85th percentile of the histogram, you could write a query like this:

percentile(85, hs(my.metric.h))
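To get a feel for the result, the following Python sketch computes a nearest-rank percentile directly from the (number of points, value) pairs of this example. This is only an approximation for illustration: `weighted_percentile` is a hypothetical helper, and Wavefront estimates percentiles from its stored distribution, so the value that a query returns may differ.

```python
import math

def weighted_percentile(centroids, p):
    """Nearest-rank percentile over (number_of_points, value) pairs."""
    total = sum(count for count, _ in centroids)
    rank = max(1, math.ceil(p * total / 100))
    seen = 0
    # Walk the values in ascending order until the rank is reached.
    for count, value in sorted(centroids, key=lambda pair: pair[1]):
        seen += count
        if seen >= rank:
            return value
    raise ValueError("no points in distribution")

# The points 10, 20, 20, 30, 40, 100, 100 as (count, value) pairs.
distribution = [(1, 10), (2, 20), (1, 30), (1, 40), (2, 100)]

p85 = weighted_percentile(distribution, 85)
p50 = weighted_percentile(distribution, 50)
print(p85, p50)
```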

Histogram Configuration

Histograms are supported by Wavefront proxy 4.12 and later. To use histograms, ensure that your data is in histogram data format, and set the histogram proxy port to send to. Different ports accept different data formats, as shown in the table below. For information on how to configure proxies, see Advanced Proxy Configuration.

Histogram Proxy Ports

To indicate that you are sending histogram data, send the metrics to one of the histogram proxy ports. You can use:

  • Port 2878 for proxy 4.29 and later for histogram distributions.
  • Port 40000 for earlier proxy versions for histogram distributions.
  • The other ports for other formats.

Aggregation Interval or Distribution   Proxy Property                 Default Value                                                 Data Ingestion Format
distribution                           histogramDistListenerPorts     2878 (proxy 4.29 and later); 40000 (earlier proxy versions)   Histogram data format
minute                                 histogramMinuteListenerPorts   40001                                                         Wavefront data format
hour                                   histogramHourListenerPorts     40002                                                         Wavefront data format
day                                    histogramDayListenerPorts      40003                                                         Wavefront data format

You can send Wavefront data format histogram data only to a minute, hour, or day port.

You can send histogram data in histogram data format only to the distribution port. If you send histogram data format to a minute, hour, or day port, the points are rejected as invalid input format and logged.
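The pairing of default ports and formats can be written down as a small lookup. The Python sketch below is an assumption-laden illustration: the dictionary, the helper names, and the prefix check for `!M`, `!H`, and `!D` are made up for this example and only mirror the rules stated above for histogram data; they are not how the proxy itself validates input.

```python
# Default ports from the table above, mapped to the format each one expects.
DEFAULT_PORT_FORMATS = {
    2878: "histogram",   # distribution port, proxy 4.29 and later
    40000: "histogram",  # distribution port, earlier proxy versions
    40001: "wavefront",  # minute port
    40002: "wavefront",  # hour port
    40003: "wavefront",  # day port
}

def line_format(line):
    """Classify a line by whether it starts with an interval marker."""
    return "histogram" if line.startswith(("!M ", "!H ", "!D ")) else "wavefront"

def matches_port(port, line):
    """True if the line's format is the one listed for the given default port."""
    return DEFAULT_PORT_FORMATS[port] == line_format(line)

distribution_line = "!M 1493773500 #20 30 #10 5 request.latency source=appServer1"
point_line = "my.metric 10 1493773500 source=s1"

print(matches_port(2878, distribution_line), matches_port(40001, distribution_line))
```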

Histogram Configuration Properties

Wavefront supports additional histogram configuration properties, shown in the following table. Note the requirements on the state directory and the effect of the two persist properties listed at the bottom of the table.

Property   Description   Format
histogramStateDirectory Directory for persistent proxy state, must be writable. Before being flushed to Wavefront, histogram data is persisted on the filesystem where the Wavefront proxy resides. If the files are corrupted or the files in the directory can't be accessed, the proxy reports the problem in its log and falls back to using in-memory structures. In this mode, samples can be lost if the proxy terminates without draining its queues. Default: /var/spool/wavefront-proxy. A valid path on the local file system.
persistAccumulator Whether to persist accumulation state. We suggest keeping this setting enabled unless you are not using hour and day level aggregation and consider losing up to 1 minute worth of data during proxy restarts acceptable. Default: true. Boolean.
persistMessages Whether to persist received metrics to disk. Default: true. Boolean.
histogramAccumulatorResolveInterval Interval in milliseconds to write back accumulation changes from memory cache to disk. Only applicable when memory cache is enabled. Increasing this setting reduces storage IO pressure but might increase heap memory use. Default: 100. Positive integer.
histogramAccumulatorFlushInterval Interval in milliseconds to check for histograms that need to be sent to Wavefront according to their flush settings (histogramMinuteFlushSecs and so on). Default: 1000. Positive integer.
histogramAccumulatorFlushMaxBatchSize Max number of histograms to move to the outbound queue in one flush. Default: no limit. Positive integer.
histogramMaxReceivedLength Maximum line length for received histogram points. Default: 65536. Positive integer.
histogramReceiveBufferFlushInterval Sets maximum time in milliseconds that incoming points can stay in the receive buffer when incoming traffic volume is very low. Default: 100. Positive integer.
histogramProcessingQueueScanInterval Interval in milliseconds between checks for new entries in the processing queue. Default: 20. Positive integer.
histogramMinuteListenerPorts TCP ports to listen on for histograms to be aggregated by minute. Default: 40001. Comma-separated list of ports. Can be a single port.
histogramMinuteAccumulators Number of accumulators per minute port. In high traffic environments we recommend that the total number of accumulators per proxy across all utilized ports does not exceed the number of available CPU cores. Default: 2. Positive integer.
histogramMinuteFlushSecs Time-to-live, in seconds, for a minute granularity accumulation on the proxy (before the intermediary is sent to Wavefront). Default: 70. Positive integer.
histogramMinuteAccumulatorSize Expected upper bound of concurrent accumulations: ~ #time series * #parallel reporting bins. Default: 100000. Positive integer.
histogramMinuteCompression A bound on the number of centroids per histogram. Default: 100. Positive integer in the interval [20;1000].
histogramMinuteMemoryCache Enabling memory cache reduces I/O load with fewer time series and higher frequency data (more than 1 point per second per time series). Default: false. Boolean.
histogramMinuteAvgDigestBytes Average number of bytes in an encoded distribution/accumulation. Default: 32 + histogramMinuteCompression * 7. Positive integer.
histogramMinuteAvgKeyBytes Average number of bytes in a UTF-8 encoded histogram key. Concatenation of metric, source, and point tags. Default: 150. Positive integer.
histogramHourListenerPorts TCP ports to listen on for histograms to be aggregated by hour. Default: 40002. Comma-separated list of ports. Can be a single port.
histogramHourAccumulators Number of accumulators per hour port. In high traffic environments we recommend that the total number of accumulators per proxy across all utilized ports does not exceed the number of available CPU cores. Default: 2. Positive integer.
histogramHourFlushSecs Time-to-live, in seconds, for an hour granularity accumulation on the proxy (before the intermediary is sent to Wavefront). Default: 4200. Positive integer.
histogramHourAccumulatorSize Expected upper bound of concurrent accumulations: ~ #time series * #parallel reporting bins. Default: 100000. Positive integer.
histogramHourCompression A bound on the number of centroids per histogram. Default: 100. Positive integer in the interval [20;1000].
histogramHourMemoryCache Enabling memory cache reduces I/O load with fewer time series and higher frequency data (more than 1 point per second per time series). Default: false. Boolean.
histogramHourAvgDigestBytes Average number of bytes in an encoded distribution/accumulation. Default: 32 + histogramHourCompression * 7. Positive integer.
histogramHourAvgKeyBytes Average number of bytes in a UTF-8 encoded histogram key. Concatenation of metric, source, and point tags. Default: 150. Positive integer.
histogramDayListenerPorts TCP ports to listen on for histograms to be aggregated by day. Default: 40003. Comma-separated list of ports.
histogramDayAccumulators Number of accumulators per day port. In high traffic environments we recommend that the total number of accumulators per proxy across all utilized ports does not exceed the number of available CPU cores. Default: 2. Positive integer.
histogramDayFlushSecs Time-to-live, in seconds, for a day granularity accumulation on the proxy (before the intermediary is sent to Wavefront). Default: 18000 (5 hours). Positive integer.
histogramDayAccumulatorSize Expected upper bound of concurrent accumulations: ~ #time series * #parallel reporting bins. Default: 100000. Positive integer.
histogramDayCompression A bound on the number of centroids per histogram. Default: 100. Positive integer in the interval [20;1000].
histogramDayMemoryCache Enabling memory cache reduces I/O load with fewer time series and higher frequency data (more than 1 point per second per time series). Default: false. Boolean.
histogramDayAvgDigestBytes Average number of bytes in an encoded distribution/accumulation. Default: 32 + histogramDayCompression * 7. Positive integer.
histogramDayAvgKeyBytes Average number of bytes in a UTF-8 encoded histogram key. Concatenation of metric, source, and point tags. Default: 150. Positive integer.
histogramDistListenerPorts TCP ports to listen on for ingesting histogram distributions. Default: 2878 (proxy 4.29 and later) or 40000 (earlier proxy versions). Comma-separated list of ports. Can be a single port.
histogramDistAccumulators Number of accumulators per distribution port. In high traffic environments we recommend that the total number of accumulators per proxy across all utilized ports does not exceed the number of available CPU cores. Default: number of available CPU cores. Positive integer.
histogramDistFlushSecs Number of seconds to keep a new distribution bin open for new samples, before the intermediary is sent to Wavefront. Default: 70. Positive integer.
histogramDistAccumulatorSize Expected upper bound of concurrent accumulations: ~ #time series * #parallel reporting bins. Default: 100000. Positive integer.
histogramDistCompression A bound on the number of centroids per histogram. Default: 100. Positive integer in the interval [20;1000].
histogramDistMemoryCache Enabling memory cache reduces I/O load with fewer time series and higher frequency data (Aggregating more than 1 distribution per second per time series). Default: false. Boolean.
histogramDistAvgDigestBytes Average number of bytes in an encoded distribution/accumulation. Default: 32 + histogramDistCompression * 7. Positive integer.
histogramDistAvgKeyBytes Average number of bytes in a UTF-8 encoded histogram key. Concatenation of metric, source, and point tags. Default: 150. Positive integer.
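Two of the relationships in the table lend themselves to a quick calculation: the default digest size formula (32 + compression * 7) and the recommendation that the total number of accumulators stays within the number of CPU cores. The Python sketch below uses hypothetical helper names and an assumed 8-core host; it is not proxy code.

```python
def default_avg_digest_bytes(compression=100):
    """Default for the *AvgDigestBytes properties: 32 + compression * 7."""
    if not 20 <= compression <= 1000:
        raise ValueError("compression must be in the interval [20;1000]")
    return 32 + compression * 7

def total_accumulators(settings):
    """settings maps a label to (accumulators_per_port, number_of_ports)."""
    return sum(per_port * ports for per_port, ports in settings.values())

# Default compression of 100 gives the default digest size.
digest_bytes = default_avg_digest_bytes(100)

# Minute, hour, and day ports with the default of 2 accumulators each, one port per interval.
accumulators = total_accumulators({"minute": (2, 1), "hour": (2, 1), "day": (2, 1)})
cpu_cores = 8  # assumed core count for this example
within_recommendation = accumulators <= cpu_cores

print(digest_bytes, accumulators, within_recommendation)
```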

Histogram Metric Naming

You send metrics using the standard Wavefront data format:

<metricName> <metricValue> [<timestamp>] source=<source> <pointTagKey1>=<value1> ... <pointTagKeyN>=<valueN>

For example, request.latency 20 1484877771 source=<source>.

Wavefront adds the suffixes .m, .h, or .d to the metric name according to the aggregation interval. For example, if the metric request.latency is aggregated over an hour, the metric will be named: request.latency.h.
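The renaming rule can be expressed as a one-line lookup. In this Python sketch, `histogram_metric_name` is a hypothetical helper that only mirrors the suffix rule described above.

```python
# Suffix that is appended for each aggregation interval.
SUFFIXES = {"minute": ".m", "hour": ".h", "day": ".d"}

def histogram_metric_name(metric, interval):
    """Return the histogram metric name for a metric and aggregation interval."""
    return metric + SUFFIXES[interval]

hourly_name = histogram_metric_name("request.latency", "hour")
print(hourly_name)
```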

Querying and Viewing Histogram Metrics

You display histogram information by running hs() in conjunction with histogram functions, or by selecting a histogram metric from the Histogram browser.

Histogram Query Basics

You use the hs() function with the name of a histogram metric to access the histogram distributions for that metric. A histogram metric name has an extension .m, .h, or .d:

  • If you sent distributions in histogram data format, the histogram metric extension corresponds to the interval you specified (!M, !H, or !D).
  • If you sent metrics using Wavefront data format, the histogram metric extension corresponds to the histogram port that you used.

To visualize information about histogram distributions, you can run the hs() function under a time-series chart. We implicitly wrap a median() function around the hs() function and display the median value of each distribution as a time series:

default histogram

You can explicitly wrap another histogram function around the result of hs() to see other information. For example, the 2 histogram queries in the following chart display the maximum and minimum values from each histogram distribution:

default histogram

Histogram Summary Information

Sometimes it is useful to see more information about a histogram than just the median or any single percentile. You can use summary() or alignedSummary() to display all of the following statistics from the histogram data: max, 99.9, 99, 95, 90, 75, avg, median (50), 25, and min.

The following diagram shows the information you get for the metric shown above if you wrap it with summary(). The legend lists the series that are extracted from the histogram data by default.

histogram summary

You can extract just the percentiles that you are interested in, by calling the function with an optional list of percentages as the first argument. For example, the following function returns the 10th and 25th percentile from each histogram distribution in the series:

summary(10, 25, hs("alerting.check.latency.m", customer=perftest))
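If you generate such queries from a script, plain string building is enough. The Python helpers below (`hs_query` and `summary_query`) are hypothetical names for this sketch; they only produce query text in the shape shown above and do not call any Wavefront API.

```python
def hs_query(metric, **filters):
    """Build an hs() expression with a quoted metric name and optional key=value filters."""
    args = [f'"{metric}"'] + [f"{key}={value}" for key, value in filters.items()]
    return "hs(" + ", ".join(args) + ")"

def summary_query(expression, *percentiles):
    """Wrap an expression in summary(), with optional percentages as leading arguments."""
    args = [str(p) for p in percentiles] + [expression]
    return "summary(" + ", ".join(args) + ")"

query = summary_query(hs_query("alerting.check.latency.m", customer="perftest"), 10, 25)
print(query)
```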

Viewing Histograms in the Histogram Browser

You can view histograms in the Histogram browser.

To view histograms:

  1. Click Browse > Histograms and start typing the histogram metric name. Each histogram metric has an extension .d, .h, or .m. If you sent a metric in histogram data format, the extension corresponds to the interval you specified (!M, !H, or !D). If you sent a metric using Wavefront data format, the extension depends on the histogram port that you used.
  2. Select the metric you’re interested in.

    select_histogram_chart

  3. Examine the chart.
    • The query is an hs() query, not a ts() query.
    • We display the median for histogram metrics by default. You can use percentile(<value>, hs(<expression>)) to retrieve other percentiles.

    histogram_chart

Monitoring Histogram Points

You can use ~collector metrics to monitor histogram ingestion. See Understanding ~collector Metrics for Histograms.