Learn how to configure Wavefront proxy preprocessor rules.

Starting with version 4.1, the Wavefront proxy includes a preprocessor that applies user-defined point filtering and altering rules before data is sent to Wavefront. One of the main goals of this functionality is to let you address correctable data quality issues within the existing data flow when fixing the problem at the emitting source is not feasible. An example of such a rule is “before the point line is parsed, replace invalid characters with underscores”, which allows points that would normally be rejected to flow into the system.

Rule Configuration File

You define the proxy preprocessor rules in a separate file, usually <wavefront_config_path>/preprocessor_rules.yaml, using YAML syntax. You can specify the file in your proxy configuration. An example rule file could look like:

# rules for port 2878
'2878':
  # replace bad characters ("&", "$", "!", "@") with underscores in the entire point line string
  ################################################################
  - rule    : replace-badchars
    action  : replaceRegex
    scope   : pointLine
    search  : "[&\\$!@]"
    replace : "_"

  #  remove "az" point tag if its value starts with "dev"
  ################################################################
  - rule    : drop-az-tag
    action  : dropTag
    tag     : az
    match   : dev.*

# rules for port 4242
'4242':
  #  remove "az" point tag if its value starts with "dev"
  ################################################################
  - rule    : drop-az-tag
    action  : dropTag
    tag     : az
    match   : dev.*

  ...

For greater flexibility, you can define rules separately for each listening port. The example above defines 3 rules, 2 for port 2878 and 1 for port 4242.

Every rule must have a rule parameter that contains the rule ID and an action parameter that contains the action to perform.

Rule IDs can contain alphanumeric characters, dashes, and underscores. They should be descriptive and unique within the same port. In the example above, the drop-az-tag identifier is reused for ports 2878 and 4242, which is fine because the two rules belong to different ports.

For every rule, the Wavefront proxy reports a counter metric that tracks how many times the rule has been successfully applied. The rule ID becomes part of the metric name, ~agent.preprocessor.<ruleID>.count. For example, the replace-badchars rule is reported as ~agent.preprocessor.replace-badchars.count. For information on proxy metrics, see Monitoring the Health of a Wavefront Instance.

Regex Notes

  • Backslashes in regex patterns must be double-escaped. For example, to match a literal dot character (“.”), use \\. (two backslashes followed by a dot).
  • Regex patterns in the match parameter must match the entire string. For example, a regex that blocks any point line containing the substring stage is .*stage.* (including the leading and trailing .*).
  • Regex patterns in the search parameter of a replaceRegex rule are substring matches. If search is “A” and replace is “B”, every A is replaced with B.
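
The following sketch illustrates the three notes above in a single rules file. The rule IDs (example-block-stage, example-dots-to-underscores) are hypothetical names chosen for illustration, and port 2878 simply follows the earlier example.

```yaml
# rules for port 2878
'2878':
  # "match" is a full match: the leading and trailing ".*" are needed
  # so that the pattern covers the whole point line, not just "stage"
  - rule    : example-block-stage
    action  : blacklistRegex
    scope   : pointLine
    match   : ".*stage.*"

  # "search" is a substring match: every literal dot in the metric name
  # is replaced; the backslash that escapes the dot is itself doubled
  - rule    : example-dots-to-underscores
    action  : replaceRegex
    scope   : metricName
    search  : "\\."
    replace : "_"
```

Under the full-match rule, writing match as "stage" would only block a point line that consists of exactly the text stage.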

Enabling the Preprocessor

To enable the preprocessor, add (or uncomment) the preprocessorConfigFile property in the Wavefront proxy configuration file and set it to a valid path to the rules configuration file. The rules file is validated when the proxy starts, and the start-up process is aborted if any of the rules is not valid. A detailed error message is provided for every rule that fails validation.

Point Filtering Rules

Point filtering rules provide a more flexible version of the proxy whitelistRegex and blacklistRegex properties and are fully backwards compatible with them.

blacklistRegex

Defines a regex that points must match to be filtered out.

Parameter Value Description
action blacklistRegex
scope Any of the following:
  • pointLine
  • metricName
  • sourceName
  • <point tag>
"scope" parameter allows filtering points with finer granularity:
  • pointLine applies to the whole point line before it's parsed (can be used with Wavefront and Graphite formats only)
  • metricName applies only to the metric name after the point is parsed
  • sourceName applies only to the source name after the point is parsed
  • any other value of the "scope" parameter applies to the value of a point tag with this name, after the point is parsed
match <regex pattern> A pattern that input lines must match to be filtered out.

Examples:

  # block all points with sourceName that starts with qa-statsd
  ###############################################################
  - rule    : example-block-qa-statsd
    action  : blacklistRegex
    scope   : sourceName
    match   : "qa-statsd.*"

  # block all points where "datacenter" point tag value starts with "west"
  ###############################################################
  - rule    : example-block-west
    action  : blacklistRegex
    scope   : datacenter
    match   : "west.*"
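
The examples above use the sourceName and point tag scopes. As a further sketch, the same action can target the metric name. The rule ID and the test. prefix are assumptions made for illustration.

```yaml
'2878':
  # block all points whose metric name starts with "test."
  # ("\\." is a literal dot; the trailing ".*" is needed because match is a full match)
  - rule    : example-block-test-metrics
    action  : blacklistRegex
    scope   : metricName
    match   : "test\\..*"
```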

whitelistRegex

Defines a regex that points must match to be accepted. Multiple whitelistRegex rules are allowed, but a point must match all of them: if it fails to match any one of the patterns, it is blocked.

Parameter Value Description
action whitelistRegex
scope Any of the following:
  • pointLine
  • metricName
  • sourceName
  • <point tag>
Allows filtering points with finer granularity:
  • pointLine applies to the whole point line before it's parsed (can be used with Wavefront and Graphite formats only)
  • metricName applies only to the metric name after the point is parsed
  • sourceName applies only to the source name after the point is parsed
  • any other value of the "scope" parameter applies to the value of a point tag with this name, after the point is parsed
match <regex pattern> A pattern that input lines must match to be accepted.

Examples:

  # only allow points that contain "prod" substring anywhere in the point line
  ###############################################################
  - rule    : example-allow-only-prod
    action  : whitelistRegex
    scope   : pointLine
    match   : ".*prod.*"

  # only allow points that have a "datacenter" point tag and its value starts with "west"
  ###############################################################
  - rule    : example-allow-only-west
    action  : whitelistRegex
    scope   : datacenter
    match   : "west.*"
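
When several whitelistRegex rules are defined for the same port, a point has to pass every one of them. A minimal sketch, assuming the two rules from the examples above are both placed under port 2878 (the port choice and the sample points in the comments are illustrative):

```yaml
'2878':
  - rule    : example-allow-only-prod
    action  : whitelistRegex
    scope   : pointLine
    match   : ".*prod.*"

  - rule    : example-allow-only-west
    action  : whitelistRegex
    scope   : datacenter
    match   : "west.*"

# accepted: prod.cpu.load 1.5 source=web01 datacenter=west01   (matches both rules)
# blocked:  prod.cpu.load 1.5 source=web01 datacenter=east01   (fails example-allow-only-west)
# blocked:  qa.cpu.load 1.5 source=web01 datacenter=west01     (fails example-allow-only-prod)
```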

Point Altering Rules

Point altering rules allow you to replace text in the point line and add, remove, and update point tags.

replaceRegex

Replaces arbitrary text in the point line or any of its components:

Parameter Value Description
action replaceRegex
scope Any of the following:
  • pointLine
  • metricName
  • sourceName
  • <point tag>
Allows finer control over where the replacement is applied:
  • pointLine applies to the whole point line before it's parsed (can be used with Wavefront and Graphite formats only)
  • metricName applies only to the metric name after the point is parsed
  • sourceName applies only to the source name after the point is parsed
  • any other value of the "scope" parameter applies to the value of a point tag with this name, after the point is parsed
Substitutions that fix data quality issues which would otherwise make the data point unparseable must be applied to the "pointLine" scope.
search <regex pattern> Search pattern. All substrings matching this pattern are replaced with the replacement string.
replace <replacement string> Replacement string. The empty string is allowed. To refer to a capturing group by its number, use "\\1".
match (optional) <regex pattern> If specified, perform the replacement only if "scope" (point line, source name, metric name or point tag value) matches this regular expression.

Examples:

  # for "exampleCluster" point tag replace all "-" characters with dots
  ###############################################################
  - rule    : example-cluster-name
    action  : replaceRegex
    scope   : exampleCluster
    search  : "-"
    replace : "."

  # replace bad characters ("&", "$", "!") with underscores in the entire point line string
  ################################################################
  - rule    : example-replace-badchars
    action  : replaceRegex
    scope   : pointLine
    search  : "[&\\$!]"
    replace : "_"
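
As a further sketch, the rule below combines the optional match parameter with a capturing group reference. The rule ID and the legacy. metric prefix are hypothetical.

```yaml
'2878':
  # strip a leading "legacy." from metric names, so that
  # legacy.cpu.load becomes cpu.load
  - rule    : example-strip-legacy-prefix
    action  : replaceRegex
    scope   : metricName
    # full match: only metric names that start with "legacy." are touched
    match   : "legacy\\..*"
    # "\\1" refers to the first capturing group in "search"
    search  : "^legacy\\.(.*)$"
    replace : "\\1"
```

Because search is already anchored to the prefix, match is not strictly needed here; it is included to show how the two parameters fit together.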

addTag

Add a point tag with the specified value to all points. If the point tag already exists, its existing value is replaced with the new value.

Parameter   Value                 Description
action      addTag
tag         <new point tag key>   New point tag name.
value       <new value>           New point tag value.

Example:

  # add "env=prod" point tag to all metrics sent through this port
  ################################################################
  - rule    : tag-all-metrics
    action  : addTag
    tag     : env
    value   : "prod"

addTagIfNotExists

Add a point tag with the specified value to all points. If the point tag already exists, its existing value is preserved.

Parameter   Value                 Description
action      addTagIfNotExists
tag         <new point tag key>   New point tag name.
value       <new value>           New point tag value.

Example:

  # add "env=prod" point tag to all metrics sent through this port unless already tagged with "env"
  ################################################################
  - rule    : tag-all-metrics
    action  : addTagIfNotExists
    tag     : env
    value   : "prod"
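
The difference between addTag and addTagIfNotExists only shows up when the incoming point already carries the tag. The sketch below puts one rule on each port and traces a hypothetical incoming point through both; the rule IDs, the port assignment, and the sample point are assumptions made for illustration.

```yaml
# incoming point on either port: cpu.load 1.5 source=web01 env=dev

'2878':
  # addTag: the existing value is replaced, so the point ends up with env=prod
  - rule    : example-force-env
    action  : addTag
    tag     : env
    value   : "prod"

'4242':
  # addTagIfNotExists: the existing value is preserved, so the point keeps env=dev
  - rule    : example-default-env
    action  : addTagIfNotExists
    tag     : env
    value   : "prod"
```

A point that arrives without an env tag ends up with env=prod on both ports.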

dropTag

Remove a point tag.

Parameter          Value                                  Description
action             dropTag
tag                <point tag name> or <tag name regex>   Point tag key (or a regex matching the tag key).
match (optional)   <regex pattern>                        If specified, remove the tag only if its value matches this regular expression.

Examples:

  #  remove "dc" point tag from all points
  ################################################################
  - rule    : drop-dc-tag
    action  : dropTag
    tag     : dc

  #  remove "az" point tag if its value starts with "dev"
  ################################################################
  - rule    : drop-az-tag
    action  : dropTag
    tag     : az
    match   : dev.*
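
The tag parameter also accepts a regex. A minimal sketch, assuming the pattern is tested against the whole tag key and that a tmp_ prefix is a naming convention in your data (both are assumptions, as is the rule ID):

```yaml
'2878':
  # remove every point tag whose key starts with "tmp_"
  - rule    : example-drop-tmp-tags
    action  : dropTag
    tag     : "tmp_.*"
```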

extractTag

Create a new point tag based on a metric name, source name, or another point tag value.

Parameter Value Description
action extractTag
source Any of the following:
  • metricName
  • sourceName
  • <point tag>
The base for the new point tag value: metric name, source name, or another point tag value.
tag <new point tag name> New name for the point tag.
search <regex pattern> Regex pattern to extract the value.
replace <replacement string> String or pattern (empty string is allowed) that will be used as a value for the new point tag. To refer to a capturing group in "search" regex by its number, use either of the following constructs: "\\1" or "$1".
match (optional) <regex pattern> If specified, extract a tag only if "source" (source name, metric name or point tag value) matches this regular expression.

Example:

  # extract a "datacenter" point tag from the source name based on '.dc-' substring.
  # it will extract datacenter=west01 tag from source host0001.web.dc-west01.corp
  ####################################################################################
  - rule    : extract-datacenter
    action  : extractTag
    source  : sourceName
    tag     : datacenter
    search  : "^.*\\.dc-(.*)\\..*"
    replace : "$1"

renameTag

Rename a point tag, preserving its value.

Parameter          Value             Description
action             renameTag
tag                <tag name>        Point tag to be renamed.
newtag             <new tag name>    New name for the point tag.
match (optional)   <regex pattern>   If specified, rename the tag only if its value matches this regular expression.

Examples:

  # rename a "dc" point tag to "datacenter" (unconditional)
  ###############################################################
  - rule    : rename-dc-to-datacenter
    action  : renameTag
    tag     : dc
    newtag  : datacenter

  # rename a point tag if its value is numeric. so oldTag=123 would be renamed to numericTag=123, but oldTag=text123 would not be changed.
  ###############################################################
  - rule    : rename-numeric-tag
    action  : renameTag
    tag     : oldTag
    match   : "^\\d*$"
    newtag  : numericTag