Learn how to set up sampling for Wavefront trace data.

A cloud-scale web application generates a very large number of traces. You can set up sampling strategies to reduce the volume of ingested trace data.

Well-chosen sampling strategies can give you a good idea of how your application is behaving, while:

  • Limiting the performance impact on network bandwidth and application response times.
  • Reducing the amount of storage required for trace data, and lowering your monthly costs.
  • Filtering out “noise” traces so you can see what’s important.

Wavefront Sampling Strategies

A sampling strategy is a mechanism for selecting which traces to forward to Wavefront. Wavefront lets you set up either or both of the following sampling strategies:

Sampling Strategy | Description
Rate-based sampling | Sends N percent of the generated traces to Wavefront. For example, a sampling rate of 10% causes 1 out of 10 traces to be sent and ingested.
Duration-based sampling | Sends spans to Wavefront only if they are longer than N milliseconds. For example, a sampling duration of 45 sends spans to Wavefront only if they are longer than 45 milliseconds.

A span that contains an error is always sent to Wavefront, regardless of the span’s duration or whether it falls within a specified sampling percentage. (A span contains an error if it is associated with the span tag error=true.)
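The selection rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Wavefront's implementation: the Span class and all function names are hypothetical, and mapping a trace ID to a bucket with a CRC32 checksum is just one possible way to pick a stable percentage of traces.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical span record used only for this sketch."""
    trace_id: str
    duration_ms: float
    tags: dict = field(default_factory=dict)


def has_error(span: Span) -> bool:
    # A span counts as an error span if it carries the tag error=true.
    return str(span.tags.get("error", "")).lower() == "true"


def rate_selects(trace_id: str, rate: float) -> bool:
    # Assumption: map each trace ID to a bucket in [0, 10000) and keep the
    # lowest `rate` fraction of buckets. The same trace ID always gets the
    # same answer.
    bucket = zlib.crc32(trace_id.encode("utf-8")) % 10_000
    return bucket < rate * 10_000


def duration_selects(span: Span, sampling_duration_ms: float) -> bool:
    # Only spans strictly longer than the threshold are selected.
    return span.duration_ms > sampling_duration_ms


def send_with_rate(span: Span, rate: float) -> bool:
    # Error spans bypass the sampler.
    return has_error(span) or rate_selects(span.trace_id, rate)


def send_with_duration(span: Span, sampling_duration_ms: float) -> bool:
    # Error spans bypass the sampler.
    return has_error(span) or duration_selects(span, sampling_duration_ms)
```

With a sampling duration of 45, a 45 ms span is not sent and a 46 ms span is. A 1 ms span tagged error=true is sent under either strategy, even at a sampling rate of 0.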

Note: You can query and visualize only the traces and spans that Wavefront has actually received and ingested. If you set up a sampling strategy that severely reduces the volume of ingested trace data, you could end up with queries that produce no results.

Complete vs. Partial Traces

An ingested trace can be either complete (a trace ingested with all of its member spans) or partial (a trace that is missing one or more spans). The completeness of the traces in a sample depends in part on the sampling strategy:

  • Rate-based sampling attempts to send complete traces. That is, the sampler selects the specified percentage of trace IDs, and then sends all of the spans that belong to each selected trace.

  • Duration-based sampling considers only individual spans. That is, the sampler selects all spans of an appropriate duration, regardless of whether they form complete traces.

Partial traces can also occur in the following situations:

  • If a span contains an error. Each such span is sent individually, without the other spans in the same trace.
  • If a trace has spans from multiple microservices, and you set up different sampling rates for those microservices.
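To make the difference between complete and partial traces concrete, the following sketch classifies the traces in a sample. The span data, the tuple layout, and the completeness function are all made up for illustration. The set {"t1"} stands in for whatever trace IDs a rate-based sampler happens to select.

```python
from collections import Counter

# Hypothetical spans: (trace_id, span_name, duration_ms)
SPANS = [
    ("t1", "frontend", 120), ("t1", "auth", 10), ("t1", "db", 60),
    ("t2", "frontend", 30), ("t2", "db", 5),
    ("t3", "frontend", 80), ("t3", "cache", 2),
]


def completeness(all_spans, sent_spans):
    """Label each trace present in the sample as complete or partial."""
    total = Counter(trace_id for trace_id, _, _ in all_spans)
    sent = Counter(trace_id for trace_id, _, _ in sent_spans)
    return {
        trace_id: "complete" if sent[trace_id] == total[trace_id] else "partial"
        for trace_id in sent
    }


# Rate-based: pick trace IDs, then send every span of each picked trace.
rate_sample = [s for s in SPANS if s[0] in {"t1"}]

# Duration-based: look at each span on its own (threshold of 45 ms).
duration_sample = [s for s in SPANS if s[2] > 45]

print(completeness(SPANS, rate_sample))      # {'t1': 'complete'}
print(completeness(SPANS, duration_sample))  # {'t1': 'partial', 't3': 'partial'}
```

In the duration-based sample, trace t1 loses its 10 ms span and trace t3 loses its 2 ms span, so both are partial. Trace t2 has no span longer than 45 ms and does not appear at all.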

When Sampling Strategies are Combined

You can combine rate-based sampling and duration-based sampling in the same microservice. Doing so causes Wavefront to ingest the union of the spans that are selected by each sampler.

For example, suppose you set the sampling rate to 20% and the sampling duration to 45ms for the same microservice. This causes Wavefront to receive:

  • 20% of the traces generated by that microservice, regardless of the length of their spans.
  • Any additional spans outside of that 20% that are longer than 45ms.

As a result, the ingested sample will contain somewhat more than 20% of the generated traces, with some spans that are shorter than 45ms.
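The union behavior can be checked on a small synthetic data set. The sketch below is an assumption-laden illustration: the trace IDs, the span durations, and the combined_sample function are invented, and the two traces in rate_selected_traces simply stand in for the 20% that a rate-based sampler would choose.

```python
def build_spans():
    """Ten hypothetical traces with three spans each: (trace_id, duration_ms)."""
    spans = []
    for i in range(10):
        durations = (20, 40, 60) if i % 2 == 0 else (5, 15, 25)
        for duration_ms in durations:
            spans.append((f"t{i}", duration_ms))
    return spans


def combined_sample(spans, rate_selected_traces, sampling_duration_ms):
    """Union of both samplers: keep a span if either one selects it."""
    return [
        (trace_id, duration_ms)
        for trace_id, duration_ms in spans
        if trace_id in rate_selected_traces or duration_ms > sampling_duration_ms
    ]


spans = build_spans()
rate_selected_traces = {"t0", "t1"}  # 2 of 10 traces, that is, 20%
sample = combined_sample(spans, rate_selected_traces, sampling_duration_ms=45)

traces_in_sample = {trace_id for trace_id, _ in sample}
short_spans = [s for s in sample if s[1] <= 45]

print(len(sample))            # 10 spans ingested out of 30
print(len(traces_in_sample))  # 6 traces represented, more than the 2 chosen by rate
print(len(short_spans))       # 5 spans at or under 45 ms, all from t0 and t1
```

The four extra traces appear only through their single 60 ms span, so they are partial traces. The contrived durations exaggerate the effect, and real proportions depend on how span durations are distributed.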

Ways to Set Up Sampling

You can set up a sampling strategy through a Wavefront proxy or in your instrumented application code.

Choose the Wavefront proxy for sampling when you want to:

  • Use a single sampling strategy to coordinate the sampling for all applications that use the same proxy.
  • Configure sampling with minimal effort.
  • Improve the likelihood of ingesting complete traces.

Choose sampling in your instrumented code when you want to:

  • Reduce the performance impact of span reporting on your application.
  • Use direct ingestion, which sends data to Wavefront without a proxy.
  • Configure sampling on a per-process basis, for example, when you expect spans from the services in different processes to have different characteristics.

Setting Up Sampling Through the Proxy

You can set up sampling strategies through a Wavefront proxy by adding the sampling properties to the proxy’s configuration file.

  1. On the proxy host, open the proxy configuration file wavefront.conf for editing. The path to the file depends on the host.
  2. In the wavefront.conf file, add one or both of the following properties. For example, the following properties set up a sampling rate of 10% and a sampling duration of 45 milliseconds:
     # Sampling rate: a number from 0.0 to 1.0 (.1 means 10% of traces)
     traceSamplingRate=.1
     ...
     # Sampling duration in milliseconds
     traceSamplingDuration=45
    
  3. Save the wavefront.conf file.
  4. Start (or restart) the proxy so that the new properties take effect.
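Before starting the proxy, it can be useful to sanity-check the values you added. The following helper is a hypothetical sketch, not a Wavefront tool: it assumes the file uses simple key=value lines with # comments, that traceSamplingRate must lie between 0.0 and 1.0 as the comment in the example states, and that traceSamplingDuration should be a non-negative number of milliseconds.

```python
def parse_properties(text):
    """Collect key=value pairs, skipping blank lines, comments, and other lines."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_sampling(props):
    """Return a list of problems found in the sampling properties."""
    problems = []

    rate = props.get("traceSamplingRate")
    if rate is not None:
        try:
            rate_ok = 0.0 <= float(rate) <= 1.0
        except ValueError:
            rate_ok = False
        if not rate_ok:
            problems.append(f"traceSamplingRate must be 0.0 to 1.0, got {rate!r}")

    duration = props.get("traceSamplingDuration")
    if duration is not None:
        try:
            duration_ok = float(duration) >= 0
        except ValueError:
            duration_ok = False
        if not duration_ok:
            problems.append(
                f"traceSamplingDuration must be a non-negative number, got {duration!r}"
            )

    return problems


example = """
# Number from 0.0 to 1.0
traceSamplingRate=.1
traceSamplingDuration=45
"""
print(check_sampling(parse_properties(example)))  # []
```

A common mistake this catches is writing the rate as a percentage, for example traceSamplingRate=10 instead of .1.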

Setting Up Sampling in Your Code

You can set up sampling strategies in application code that is built with one of the following Wavefront observability SDKs:

  • The Wavefront OpenTracing SDK
  • Any Wavefront observability SDK that depends on the Wavefront OpenTracing SDK

You set up a sampling strategy by configuring a Wavefront Tracer with a Sampler object, creating one Sampler for each sampling strategy. For setup details, see the README file for the Wavefront observability SDK you are using.