Learn about the format for Wavefront spans, and naming conventions for the RED metrics derived from them.

A trace shows you how a request propagates from one microservice to the next in a distributed application. The basic building blocks of a trace are its spans, where each span corresponds to a distinct invocation of an operation that executes as part of the request.

Spans are the fundamental units of trace data. This page provides details about the Wavefront format of a span, as well as the RED metrics that Wavefront automatically derives from spans. These details are mainly useful for developers who need to perform advanced customization.

Wavefront Span Format

A well-formed Wavefront span consists of fields and span tags that capture span attributes. These attributes enable Wavefront to identify and describe the span, organize it into a trace, and display the trace according to the service and application that emitted it. Some attributes are required by the OpenTracing specification and others are required by Wavefront.

Most use cases do not require you to know exactly how Wavefront expects a span to be formatted:

  • When you instrument your application with a Wavefront OpenTracing SDK or a framework SDK, your application emits spans that are automatically constructed by the Wavefront Tracer. (You supply some of the attributes when you instantiate the ApplicationTags object required by the SDK.)
  • When you instrument your application with a Wavefront sender SDK, your application emits spans that are automatically constructed from raw data you pass as parameters.
  • When you instrument your application with a third-party distributed tracing system, your application emits spans that are automatically transformed by the integration you set up.

It is possible to manually construct a well-formed span and send it either directly to the Wavefront service or to a TCP port that the Wavefront proxy is listening on for trace data. You might want to do this if you instrumented your application with a proprietary distributed tracing system.
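As a rough illustration of the proxy path, the following Python sketch writes one already-formatted span line to a TCP endpoint. The function name, the `host` and `port` parameters, and the UTF-8 encoding are assumptions made for this example. This page does not state which port the proxy listens on, so use whatever your proxy is configured with.

```python
import socket


def send_span_line(span_line: str, host: str, port: int, timeout: float = 5.0) -> None:
    """Send one span line to a listening TCP endpoint (hypothetical helper)."""
    # Each span line must be terminated with a newline character.
    if not span_line.endswith("\n"):
        span_line += "\n"
    # Open a TCP connection, write the line, and close the connection.
    # Assumption: the receiver accepts UTF-8 encoded text.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(span_line.encode("utf-8"))
```

Opening a connection per span keeps the sketch short. A real sender would more likely keep one connection open and batch lines.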

Span Syntax

<operationName> source=<source> <spanTags> <start_milliseconds> <duration_milliseconds>

Fields must be space separated and each line must be terminated with the newline character (\n or ASCII hex 0A).

For example:

getAllUsers source=localhost traceId=7b3bf470-9456-11e8-9eb6-529269fb1459 spanId=0313bafe-9457-11e8-9eb6-529269fb1459 parent=2f64e538-9457-11e8-9eb6-529269fb1459 application=Wavefront service=auth cluster=us-west-2 shard=secondary http.method=GET 1552949776000 343

Span Fields

| Field | Required | Description | Format |
|---|---|---|---|
| operationName | Yes | The string name that indicates the operation represented by the span. | Valid characters: a-z, A-Z, 0-9, hyphen ("-"), underscore ("_"), dot ("."). Length: less than 1024 characters. |
| source | Yes | The string name of a host or container on which the represented operation executed. | Valid characters: a-z, A-Z, 0-9, hyphen ("-"), underscore ("_"), dot ("."). Length: less than 1024 characters. |
| spanTags | Yes | See Span Tags, below. | |
| start_milliseconds | Yes | Start time of the span, expressed as epoch time elapsed since 00:00:00 Coordinated Universal Time (UTC) on January 1, 1970. | Whole number of epoch milliseconds or other units (see below). |
| duration_milliseconds | Yes | Duration of the span. | Whole number of milliseconds or other units (see below). Must be greater than or equal to 0. |
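The syntax and field rules above can be sketched as a small Python helper that assembles one span line. The function name `format_span` and the choice to pass tags as a list of (key, value) pairs are assumptions for illustration. The sketch also assumes that tag keys and values contain no whitespace, because this page does not describe quoting rules.

```python
import re

# Characters allowed in operationName and source, per the field descriptions.
_VALID_NAME = re.compile(r"[A-Za-z0-9._-]+")


def format_span(operation_name, source, span_tags, start_ms, duration_ms):
    """Return one newline-terminated span line (hypothetical helper)."""
    for field, value in (("operationName", operation_name), ("source", source)):
        if not _VALID_NAME.fullmatch(value) or len(value) >= 1024:
            raise ValueError(f"invalid {field}: {value!r}")
    if duration_ms < 0:
        raise ValueError("duration must be greater than or equal to 0")
    # Assumption: keys and values contain no whitespace, so no quoting is applied.
    tags = " ".join(f"{key}={value}" for key, value in span_tags)
    return f"{operation_name} source={source} {tags} {start_ms} {duration_ms}\n"


example_tags = [
    ("traceId", "7b3bf470-9456-11e8-9eb6-529269fb1459"),
    ("spanId", "0313bafe-9457-11e8-9eb6-529269fb1459"),
    ("parent", "2f64e538-9457-11e8-9eb6-529269fb1459"),
    ("application", "Wavefront"),
    ("service", "auth"),
    ("cluster", "us-west-2"),
    ("shard", "secondary"),
    ("http.method", "GET"),
]
line = format_span("getAllUsers", "localhost", example_tags, 1552949776000, 343)
```

With these inputs, `line` reproduces the example span shown earlier, followed by the terminating newline.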

Span Tags

Span tags are special tags associated with a span. Many of these span tags are required for a span to be valid. An application can be instrumented to include custom span tags as well. Custom tag names must not use the reserved span tag names listed in the following tables.

Note: The maximum allowed length for a combination of a span tag key and value is 254 characters (255 including the “=” separating key and value). If the combined length exceeds this limit, the span is rejected.
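A client-side check for this limit might look like the following Python sketch. The helper name `oversized_tags` is hypothetical, and the sketch assumes the limit counts characters as measured by Python's `len()` rather than encoded bytes.

```python
# Maximum combined length of a span tag key and value, not counting "=".
MAX_TAG_LENGTH = 254


def oversized_tags(span_tags):
    """Return the keys whose key+value length exceeds the limit (hypothetical helper)."""
    return [
        key
        for key, value in span_tags.items()
        if len(key) + len(str(value)) > MAX_TAG_LENGTH
    ]
```

Running a check like this before sending lets you truncate or drop an offending tag instead of losing the whole span.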

The following table lists span tags that contain information about the span’s identity and relationships.

| Span Tags for Identity | Required | Description | Type |
|---|---|---|---|
| traceId | Yes | Unique identifier of the trace the span belongs to. All spans that belong to the same trace share a common trace ID. | UUID |
| spanId | Yes | Unique identifier of the span. | UUID |
| parent | No | Identifier of the span’s dependent parent, if it has one. This tag is populated as the result of an OpenTracing ChildOf relationship. A span without the parent or followsFrom tag is the root (first) span of a trace. | UUID |
| followsFrom | No | Identifier of the span’s non-dependent parent, if it has one. This tag is populated as the result of an OpenTracing FollowsFrom relationship. Wavefront ignores spans with this tag when calculating the critical path through a trace. A span without the parent or followsFrom tag is the root (first) span of a trace. | UUID |
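To make the identity tags concrete, here is a minimal Python sketch that finds the root span of a trace and groups span IDs under the span they reference. It assumes each span is represented as a plain dict of its tags and carries at most one `parent` or `followsFrom` tag. Both function names are hypothetical.

```python
from collections import defaultdict


def find_root_spans(spans):
    """Return spans that have neither a parent nor a followsFrom tag."""
    return [s for s in spans if "parent" not in s and "followsFrom" not in s]


def children_by_reference(spans):
    """Map each referenced span ID to the IDs of the spans that point at it."""
    children = defaultdict(list)
    for span in spans:
        # Assumption: a span carries at most one of these two tags.
        referenced = span.get("parent") or span.get("followsFrom")
        if referenced:
            children[referenced].append(span["spanId"])
    return dict(children)
```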

The following table lists span tags that describe the architecture of the instrumented application that emitted the span. Wavefront uses these tags to aggregate and filter trace data at different levels of granularity. These tags correspond to the application tags you set through a Wavefront observability SDK.

| Span Tags for Filtering | Required | Description | Type |
|---|---|---|---|
| application | Yes | Name of the instrumented application that emitted the span. | String |
| service | Yes | Name of the instrumented microservice that emitted the span. | String |
| cluster | Yes | Name of a group of related hosts that serves as a cluster or region in which the instrumented application runs. Specify cluster=none to indicate a span that does not use this tag. | String |
| shard | Yes | Name of a subgroup of hosts within the cluster, for example, a mirror. Specify shard=none to indicate a span that does not use this tag. | String |

Note: Additional span tags may be present, depending on how you instrumented your application. For example, the framework SDKs automatically use span tags like component, http.method, and so on. You can find out about these tags in the README file for the SDK on GitHub.

Time-Value Precision in Spans

A span has two time-value fields for specifying the start time (start_milliseconds) and duration (duration_milliseconds). Express these values in milliseconds, because Wavefront uses milliseconds for span storage and visualization. For convenience, you can specify time values in other units. Wavefront converts the values to milliseconds.

Wavefront requires that you use the same precision for both time values. Wavefront identifies the precision of the start_milliseconds value, and interprets the duration_milliseconds value using the same unit. The following table shows how to indicate the start-time precision:

| Precision for Start Time Values | Number Format | Sample Start Value | Stored As Milliseconds | Conversion Method |
|---|---|---|---|---|
| Seconds | Fewer than 13 digits | 1533529977 | 1533529977000 | Multiplied by 1000 |
| Milliseconds (Thousandths of a second) | 13 to 15 digits | 1533529977627 | 1533529977627 | |
| Microseconds (Millionths of a second) | 16 to 18 digits | 1533529977627992 | 1533529977627 | Truncated |
| Nanoseconds (Billionths of a second) | 19 or more digits | 1533529977627992726 | 1533529977627 | Truncated |

Note: When specifying a span in Wavefront span format, make sure you adjust values as necessary so that the units match. For example, suppose you know a span started at 1533529977627 epoch milliseconds, and lasted for 3 seconds. In Wavefront span format, you could specify either of the following pairs of time values:

1533529977 3 (both values in seconds)
1533529977627 3000 (both values in milliseconds)
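The digit-count rule can be mirrored in a few lines of Python, which may help when checking what a given pair of values will be stored as. This is only a sketch of the documented behavior, not Wavefront's implementation. The function name is hypothetical, and truncating the duration in the same way as the start time is an assumption.

```python
def to_milliseconds(start, duration):
    """Convert a (start, duration) pair to milliseconds based on the start value's digit count."""
    digits = len(str(start))
    if digits < 13:        # seconds
        return start * 1000, duration * 1000
    if digits <= 15:       # milliseconds, stored as-is
        return start, duration
    if digits <= 18:       # microseconds, truncated
        return start // 1000, duration // 1000
    # nanoseconds, truncated
    return start // 1_000_000, duration // 1_000_000
```

For the 3-second span in the note above, `to_milliseconds(1533529977, 3)` gives a 3000 ms duration with a start of 1533529977000, which shows that specifying the start in seconds drops the original 627 ms of sub-second detail.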

Indexed and Unindexed Span Tags

Wavefront uses indexes to optimize the performance of queries that filter on certain span tags. For example, Wavefront indexes the application tags (application, service, cluster, shard) so you can quickly query for spans that represent operations from a particular application, service, cluster, or shard. In addition to the application tags, Wavefront indexes certain built-in span tags that conform to the OpenTracing standard, such as span.kind, component, and http.method.

For performance reasons, Wavefront automatically indexes built-in span tags with low cardinality. (A tag with low cardinality has comparatively few unique values that can be assigned to it.) So, for example, a tag like spanId is not indexed.

Note: Wavefront does not automatically index any custom span tags that you might have added when you instrumented your application. If you plan to use a low-cardinality custom span tag in queries, contact Wavefront support to request indexing for that span tag.

RED Metrics Derived From Spans

If you instrument your application with a tracing-system integration or with a Wavefront OpenTracing SDK, Wavefront derives RED metrics from the spans that are sent from the instrumented application. These out-of-the-box metrics are derived from your spans automatically, with no additional configuration or instrumentation on your part. These metrics are key indicators of the health of your services, and you can use them to help you discover problem traces.

RED metrics are measures of:

  • Rate of requests – the number of requests being served per minute
  • Errors – the number of failed requests per minute
  • Duration – per-minute histogram distributions of the amount of time that each request takes

The derived RED metrics are operation-level, which means that they measure individual operations, and not whole traces. For example, an operation-level metric might measure the number of calls per minute to the dispatch operation in the delivery service, where each call to dispatch might correspond to one of many spans in a trace.
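To illustrate what operation-level, per-minute aggregation means, the following Python sketch groups spans by operation and minute and tallies invocations, errors, and durations. It is a simplified model, not Wavefront's derivation logic. The dict keys (`operationName`, `start_ms`, `duration_ms`, `error`) and the choice to bucket by span start time are assumptions for this example.

```python
from collections import defaultdict


def derive_red(spans):
    """Aggregate spans into per-operation, per-minute RED buckets (illustrative only)."""
    buckets = defaultdict(lambda: {"invocations": 0, "errors": 0, "durations_ms": []})
    for span in spans:
        # Assumption: a span is counted in the minute in which it started.
        minute = span["start_ms"] // 60_000
        bucket = buckets[(span["operationName"], minute)]
        bucket["invocations"] += 1                          # Rate
        if span.get("error"):
            bucket["errors"] += 1                           # Errors
        bucket["durations_ms"].append(span["duration_ms"])  # Duration
    return dict(buckets)
```

Each key is an (operation, minute) pair, so two calls to dispatch in the same minute land in one bucket regardless of which traces they belong to.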

Predefined Charts

Wavefront automatically generates charts to display the auto-derived RED metrics for a particular service. To view these charts:

  1. Select Applications > Inventory in the Wavefront task bar. If necessary, scroll to find your application and its services.
  2. Click on the service you want to see metrics for.
  3. If you instrumented your application with a Wavefront SDK, look for the charts in the Overview section. (If you used a tracing-system integration, the charts are in the only section on the page.)

The predefined charts let you view:

  • The per-minute Request Rate, per-minute Error Rate, and Duration (P95) for all requests that are processed by the service.
  • The “top” operations in each category: the most frequently invoked operations, the operations with the most errors, and the slowest operations. You can click on an operation in one of these charts to view just the traces that contain spans for that operation.

[Screenshot: tracing overview RED metrics]

Note: A service page also displays RED metrics that are collected and sent by the framework SDKs. These SDKs report the RED metrics directly from the instrumented framework APIs, instead of deriving them from the reported spans. (Other metrics and histograms might be sent as well.)

RED Metric Counters and Histograms

The types of RED metrics that we show in the predefined charts are rates and 95th percentile distributions. These metrics are themselves based on underlying counters and histograms that Wavefront automatically derives from spans. You can use these underlying counters and histograms in RED metrics queries, for example, to create alerts on trace data.

Wavefront constructs the names of the underlying counters and histograms as shown in the table below. The name components <application>, <service>, and <operationName> are string values that Wavefront obtains from the spans from which the metrics are derived. If necessary, Wavefront modifies these strings to comply with the Wavefront metric name format. Wavefront also associates each metric with point tags application, service, and operationName, and assigns the corresponding span tag values to these point tags. The span tag values are used without modification.

| Operation-Level Metric Names | Metric Type | Description |
|---|---|---|
| tracing.derived.<application>.<service>.<operationName>.invocation.count | Counter | The number of times that the specified operation is invoked. Used in the Request Rate chart that is generated for a service. |
| tracing.derived.<application>.<service>.<operationName>.error.count | Counter | The number of invoked operations that have errors (i.e., spans with error=true). Used in the Error Rate chart that is generated for a service. |
| tracing.derived.<application>.<service>.<operationName>.duration.micros.m | Wavefront histogram | The duration of each invoked operation, in microseconds, aggregated in one-minute intervals. Used in the Duration chart that is generated for a service. |
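The naming pattern can be expressed as a short Python helper. The function name and the dict keys are hypothetical. The sketch assumes the three name components already comply with the Wavefront metric name format, because this page does not specify how Wavefront modifies non-compliant strings.

```python
def derived_metric_names(application, service, operation_name):
    """Build the three derived metric names for one operation (hypothetical helper)."""
    # Assumption: the components are already valid in a metric name.
    prefix = f"tracing.derived.{application}.{service}.{operation_name}"
    return {
        "invocation_count": f"{prefix}.invocation.count",
        "error_count": f"{prefix}.error.count",
        "duration_histogram": f"{prefix}.duration.micros.m",
    }
```

For example, `derived_metric_names("beachshirts", "delivery", "dispatch")["error_count"]` yields `tracing.derived.beachshirts.delivery.dispatch.error.count`.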

RED Metrics Queries for Charts and Alerts

You can perform queries over RED metric counters and histograms and visualize the results in your own charts, just as you would do for any other metrics in Wavefront. You can create alerts on trace data by using RED metrics queries in alert conditions.

Examples

Find the per-minute error rate for a specific operation executing on a specific cluster:

rate(ts(tracing.derived.beachshirts.shopping.orderShirts.error.count and cluster=us-east-1)) * 60

Use a histogram query to return durations at the 75th percentile for an operation in a service. (The predefined charts display only the 95th percentile.)

percentile(75, hs(tracing.derived.beachshirts.delivery.dispatch.duration.micros.m))

Syntax Alternatives

Wavefront supports two alternatives for specifying the RED metric counters and histograms in a query:

  • Use the metric name, for example:
    ts(tracing.derived.beachshirts.delivery.dispatch.error.count)
    
  • Use the point tags application, service, and operationName that Wavefront automatically associates with the metric, for example:
    ts(tracing.derived.*.invocation.count, application="beachshirts" and service="delivery" and operationName="dispatch")
    

The point tag technique is useful when the metric name contains string values for <application>, <service>, and <operationName> that have been modified to comply with the Wavefront metric name format. The point tag values always correspond exactly to the span tag values.
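If you generate such queries programmatically, for example when creating alerts for many operations, the point tag form can be assembled from the raw span tag values. The following Python sketch only builds the query string. The function name and the `counter` parameter are hypothetical, and the sketch does not escape double quotes inside tag values.

```python
def point_tag_query(counter, application, service, operation_name):
    """Build a ts() query that selects a derived counter by its point tags."""
    # Assumption: tag values contain no double quotes, so no escaping is applied.
    return (
        f"ts(tracing.derived.*.{counter}, "
        f'application="{application}" and service="{service}" '
        f'and operationName="{operation_name}")'
    )


query = point_tag_query("invocation.count", "beachshirts", "delivery", "dispatch")
```

Here `query` matches the point tag example shown in the list above.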

Trace Sampling and Auto-Derived RED Metrics

If you have instrumented your application with a Wavefront observability SDK, Wavefront always derives the RED metrics before any sampling is performed. This is true whether the sampling is performed by the SDK or by a Wavefront proxy. Consequently, Wavefront derives the RED metrics from a complete set of generated spans, so the metrics provide a highly accurate picture of your application’s behavior. However, if you click through a chart to inspect a particular trace, you might discover that the trace was not actually ingested by Wavefront because it was sampled out. In that case, consider configuring a less restrictive sampling strategy.

If you have instrumented your application using a third-party distributed tracing system, Wavefront derives the RED metrics after sampling has occurred. The Wavefront proxy receives only a subset of the generated spans, and the auto-derived RED metrics reflect just that subset. See Trace Sampling and RED Metrics from an Integration.