VMware Aria Operations for Applications (formerly known as Tanzu Observability by Wavefront) supports monitoring of your Wavefront proxies.
- With the Proxies Browser, you can explore a detailed list of all your proxies.
- With the out-of-the-box dashboards that are based on proxy internal metrics, you can examine the health and the usage of your proxies.
Explore Your Proxies with the Proxies Browser
With the Proxies Browser, you can examine the status and the details of each proxy.
A proxy status can be:
|Active||The proxy is running and sending data.|
|Orphaned||The proxy stopped sending data. The reason can be:
|Stopped by Server||The Operations for Applications subscription has ended for the customer.|
|Token Expired||The token has expired. You must install a new proxy.|
Select Browser > Proxies to display the Proxies Browser.
On the Proxies Browser page, you can see the details of each proxy: name, hostname, ID, last check-in date and time, status, ingestion rate by data type, version, and the user who created it.
In addition, you can:
- Sort the proxies by name, last check-in time, status, version, or the user who created the proxy.
- Search and, optionally, save and share your search.
- Filter the proxies by status.
- Hide or show the filters.
- Show the All or the Deleted proxies list. The Deleted proxies list shows the ephemeral proxies that were deleted during the last 24 hours and the non-ephemeral proxies that were deleted during the last month.
- Configure the proxies table columns.
- Open the dashboard of a proxy by clicking the proxy name.
- Go to the Operations for Applications Service and Proxy Data dashboard of the Operations for Applications Usage integration by clicking Usage and Proxies Data Dashboard.
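The retention rule for the Deleted proxies list can be written as a small check. The following Python sketch is only an illustration of that rule: the `DeletedProxy` record and its field names are hypothetical (they are not a product API), and "1 month" is approximated as 30 days, which is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed retention windows; "1 month" is approximated as 30 days here.
EPHEMERAL_RETENTION = timedelta(hours=24)
NON_EPHEMERAL_RETENTION = timedelta(days=30)


@dataclass
class DeletedProxy:
    """Hypothetical record for a deleted proxy (not a product API type)."""
    name: str
    ephemeral: bool
    deleted_at: datetime


def shown_in_deleted_list(proxy: DeletedProxy, now: datetime) -> bool:
    """Return True if the proxy would still appear in the Deleted list."""
    window = EPHEMERAL_RETENTION if proxy.ephemeral else NON_EPHEMERAL_RETENTION
    return now - proxy.deleted_at <= window


# Illustrative data: two ephemeral proxies and one non-ephemeral proxy.
now = datetime(2024, 3, 31, 12, 0)
proxies = [
    DeletedProxy("edge-a", ephemeral=True, deleted_at=now - timedelta(hours=3)),
    DeletedProxy("edge-b", ephemeral=True, deleted_at=now - timedelta(days=2)),
    DeletedProxy("core-a", ephemeral=False, deleted_at=now - timedelta(days=2)),
]
visible = [p.name for p in proxies if shown_in_deleted_list(p, now)]
print(visible)
```

In this sketch, the ephemeral proxy deleted two days ago falls outside the 24-hour window, while the non-ephemeral proxy deleted at the same time is still listed.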
Examine the Health and Usage of a Proxy with the Proxy Dashboard
On the Proxies Browser, click the name of a proxy to open its individual dashboard. The proxy dashboard contains charts based on proxy internal metrics, organized in the following sections:
|Overview||Shows the details of the proxy and charts about:
|Metrics||Shows charts about the metric data points that are received, queued, and blocked by the proxy.|
|Distributions (Histograms)||Shows charts about the metric distributions that are received, queued, and blocked by the proxy.|
|Traces (Spans)||Shows charts about the traces that are received, queued, and blocked by the proxy.|
|Logs||Shows charts about the logs received, queued, blocked, and dropped by the proxy.|
|Advanced||Shows charts for troubleshooting the proxy, such as the proxy heap memory, file descriptor usage, GC events, incoming HTTP requests, data lag, connections, network latency, time spent on the preprocessing rules, queue time, and rate limiter.|
Examine the Proxies Health and Usage with the Operations for Applications Usage Integration
The Operations for Applications Usage integration includes the predefined Operations for Applications Service and Proxy Data dashboard, which contains the Proxies: Overview and the Proxy Troubleshooting sections. These two sections consist of charts based on the proxy internal metrics for examining the health of the proxies in your environment.
You can navigate to this dashboard in two ways:
- Select Dashboards > All Dashboards and search for the Operations for Applications Service and Proxy Data dashboard.
- On the Proxies Browser page, click Usage and Proxies Data Dashboard.
The Proxies: Overview section of the Operations for Applications Service and Proxy Data dashboard includes a number of charts that show general information about the proxies in your environment, such as the rate at which each proxy receives points, the rate at which each proxy sends points to Operations for Applications, any queued or blocked points, and more.
The proxy statistics are shown in a tabular chart at the end of the section.
In the Proxy Troubleshooting section of the Operations for Applications Service and Proxy Data dashboard, you can investigate second-level metrics that give you insight into questions, such as:
- Why are some points blocked?
- What’s the file descriptor usage on the proxy JVM?
- How long does it take for points to be pushed from the proxy to Operations for Applications?
For example, this row from that section shows latency metrics using
In this section of the dashboard, you can also monitor the time a proxy spends on preprocessing rules. The charts show the time the JVM spends on the rules and help you determine the overall effectiveness of the rules. Rules that are not optimized can contribute to data lag. As a result, Operations for Applications does not receive the data in a timely manner.
For best performance, make sure that the rule expressions follow the regex best practices for proxy rules and that your proxy runs the latest version.
The following charts help you understand the time a proxy spends on preprocessing rules:
Preprocessor Rules: CPU Time per Proxy
This chart shows an aggregate view of how long each proxy spends executing all the preprocessing rules.
Preprocessor Rules: CPU Time per Rule
This chart shows an aggregate view, across all proxies, of how much time each rule takes to execute for each message. Use this chart to spot outliers and identify preprocessing rules that should be optimized.
Preprocessor Rules: Hit Ratio
This chart helps you identify preprocessing rules that are no longer in use or that affect a large number of ingested metrics. Use this chart to identify rules that should be deprecated or fine-tuned.
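The kind of review these three charts support can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how the charts are computed: the `RuleStats` record, its field names, and the sample values are hypothetical, the hit ratio is assumed to be matched messages divided by evaluated messages, and the "three times the median" cut-off for slow rules is an arbitrary choice.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class RuleStats:
    """Hypothetical per-rule counters; field names are illustrative only."""
    rule_id: str
    checked: int    # messages the rule was evaluated against
    matched: int    # messages the rule actually applied to
    cpu_nanos: int  # total CPU time spent executing the rule


def hit_ratio(stats: RuleStats) -> float:
    """Assumed definition: share of evaluated messages that matched."""
    return stats.matched / stats.checked if stats.checked else 0.0


def nanos_per_message(stats: RuleStats) -> float:
    """Average CPU time per evaluated message."""
    return stats.cpu_nanos / stats.checked if stats.checked else 0.0


def review_candidates(rules, slow_factor=3.0):
    """Return (unused, slow) rule IDs.

    unused: rules that never matched (candidates for deprecation).
    slow: rules whose per-message cost exceeds slow_factor times the median
          (candidates for optimization).
    """
    typical = median(nanos_per_message(r) for r in rules)
    unused = [r.rule_id for r in rules if hit_ratio(r) == 0.0]
    slow = [r.rule_id for r in rules if nanos_per_message(r) > slow_factor * typical]
    return unused, slow


rules = [
    RuleStats("drop-debug", checked=1000, matched=0, cpu_nanos=50_000),
    RuleStats("rename-env", checked=1000, matched=400, cpu_nanos=60_000),
    RuleStats("greedy-regex", checked=1000, matched=10, cpu_nanos=900_000),
]
unused, slow = review_candidates(rules)
print("unused:", unused)
print("slow:", slow)
```

With this sample data, the rule that never matches is flagged as unused, and the rule that costs far more per message than the others is flagged as slow.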
Proxy Internal Metrics
The Wavefront proxies emit the ~proxy. set of internal metrics, which you can use to check if your Wavefront proxy is behaving as expected.
||Counter showing the total points the proxy receives, as a per-second rate. To look at the overall rate of points received across all the ports, you can sum up these series and look at the aggregate rate for a proxy. You can also look at the overall rate across all proxies by summing this up further.|
||Counter showing the number of points successfully delivered to Operations for Applications, broken down by listening port.|
||Counter showing the number of points being queued to be sent to Operations for Applications from the proxy, as a per-second rate. Queueing usually happens for one of the following reasons:
||Gauge of the amount of data that the proxy currently has queued.|
||Gauge of the number of points currently in the queue.|
||Counter of the points being blocked at the proxy, as a per-second rate. If this rate is above 0, you can look at the charts in the Proxy Troubleshooting section of the Operations for Applications Service and Proxy Data dashboard to determine if the metrics contain invalid characters or bad timestamps, or if they are failing configurable regular expressions. A small sample of blocked points – up to
||Rate at which the proxy buffer is filling up in bytes/min.|
||Rate at which points are being received at the proxy.|
||Available space (in bytes) on the proxy.|
||Current version number of the proxy.|
||Counter that shows how many points have been queued due to local proxy settings in
||Count of points blocked because of an illegal character.|
||Count of points blocked because of the timestamp (e.g. older than 1 year).|
||The points rejected based on the allow list/block list validation (using regex) at the Wavefront proxy.|
||Percentage of file descriptors in use per proxy. If this metric approaches 100% of the allowed usage for the proxy, increase the ulimit on your system.|
||Garbage collection (GC) activity on the proxy JVM. Anything larger than 200 ms is a GC issue, and anything near 1 s indicates continuous full GCs in the proxy.|
||Memory usage by the proxy process.|
||Time that points pushed from the proxy take to reach Operations for Applications. Can help identify network latency issues. You can graph other percentiles.|
||95th percentile of time differences (in milliseconds) between the timestamp on a point and the time that the proxy received it. Large numbers indicate backfilling old data, or clock drift in the sending systems.|
||99th percentile of time differences (in milliseconds) between the timestamp on a point and the time that the proxy received it. Large numbers indicate backfilling old data, or clock drift in the sending systems.|
||Latency introduced by queueing.|
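Two of the checks described in the table, summing per-port receive rates and judging GC activity, can be sketched in Python. This is a minimal illustration that assumes the values have already been exported from the service. The sample tuples, proxy names, port numbers, and function names are hypothetical, and the 900 ms cut-off used for "near 1 s" is an assumption.

```python
from collections import defaultdict

# Hypothetical samples: (proxy, listening port, points received per second).
samples = [
    ("proxy-1", 2878, 120.0),
    ("proxy-1", 4242, 30.0),
    ("proxy-2", 2878, 75.5),
]

GC_ISSUE_MS = 200        # from the table: larger than 200 ms is a GC issue
NEAR_ONE_SECOND_MS = 900  # assumed cut-off for "near 1 s"


def rate_per_proxy(samples):
    """Sum the per-port receive rates into one rate per proxy."""
    totals = defaultdict(float)
    for proxy, _port, rate in samples:
        totals[proxy] += rate
    return dict(totals)


def gc_severity(gc_millis: float) -> str:
    """Classify a GC duration using the thresholds above."""
    if gc_millis >= NEAR_ONE_SECOND_MS:
        return "continuous full GCs suspected"
    if gc_millis > GC_ISSUE_MS:
        return "GC issue"
    return "ok"


per_proxy = rate_per_proxy(samples)
overall = sum(per_proxy.values())  # aggregate rate across all proxies
print(per_proxy)
print(overall)
print(gc_severity(350))
```

Summing first per proxy and then across proxies mirrors the two aggregation levels that the first row of the table describes.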