Learn about the Wavefront Tanzu Application Service Integration.

Tanzu Application Service Integration

Tanzu Application Service, previously known as Pivotal Cloud Foundry, is a popular platform for building cloud-native applications. The Tanzu Application Service (TAS) integration is a full-featured implementation offering predefined dashboards and alert conditions and is fully configurable.

Dashboards

The TAS integration contains a set of predefined dashboards that give you an overview of your TAS deployment and specific TAS components:

  • TAS: Summary - overall health of TAS deployment.
  • TAS: Cloud Controller - detailed Cloud Controller metrics.
  • TAS: GoRouter - detailed Gorouter metrics.
  • TAS: Container - health of containers within TAS.
  • TAS: User Account and Authentication (UAA) - detailed UAA server metrics.
  • TAS: Diego Auctioneer - detailed Diego Auctioneer metrics.
  • TAS: Diego BBS - detailed Diego Bulletin Board System (BBS) metrics.
  • TAS: Diego Cell - health of Diego Cells.
  • TAS: MySQL - real-time visibility into the TAS MySQL status.
  • TAS: Redis - real-time visibility into the TAS Redis status.
  • TAS: RabbitMQ - real-time visibility into the On-Demand TAS RabbitMQ status.
  • TAS: Wavefront Nozzle - health and performance of your Tanzu Platform deployment and apps.

Alerts

Predefined TAS alerts are also available for you to install and use. Descriptions of the alerts are available in the Tanzu Observability by Wavefront documentation.

Here’s a preview of the Cloud Controller dashboard: images/cloud_controller_dashboard.png

Tanzu Application Service Setup

Supported Versions: TAS v2.9 and later.

Install VMware Tanzu Observability by Wavefront Nozzle Tile

This integration uses the VMware Tanzu Observability by Wavefront Nozzle tile distributed through the Tanzu network.

Refer to the documentation to install and configure the tile within your TAS deployment. Use the following Wavefront instance URL and API token when configuring the Wavefront proxy:

  • Wavefront Instance URL: https://YOUR_CLUSTER.wavefront.com/api
  • Wavefront API Token: YOUR_API_TOKEN
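
As a rough illustration of how the two values relate, the sketch below assembles them from a cluster name and a token. The function name `proxy_settings` and the dictionary keys are hypothetical and chosen only for this example; they are not field names from the tile.

```python
def proxy_settings(cluster: str, api_token: str) -> dict:
    """Build the instance URL and token pair (key names are illustrative)."""
    if not cluster or not api_token:
        raise ValueError("cluster and api_token must be non-empty")
    return {
        # URL pattern taken from the text: https://YOUR_CLUSTER.wavefront.com/api
        "wavefront_instance_url": f"https://{cluster}.wavefront.com/api",
        "wavefront_api_token": api_token,
    }
```

Calling `proxy_settings("YOUR_CLUSTER", "YOUR_API_TOKEN")` yields the same URL shown above, which can help when checking a value before pasting it into the tile configuration.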

Send App Metrics

See the documentation for information about installing and configuring the tile within your TAS deployment.

Alerts

  • TAS Active Locks: Total count of how many locks the system components are holding. See here for details.
  • TAS Auctioneer Fetch State Duration Taking Too Long: App stage requests for Diego may be failing. Consult your Tanzu Expert.
  • TAS Auctioneer LRP Auctions Failed: The number of Long Running Process (LRP) instances that the Auctioneer failed to place on Diego Cells. See here for details.
  • TAS Auctioneer Task Auctions Failed: The number of Tasks that the Auctioneer failed to place on Diego Cells. See here for details.
  • TAS Auctioneer Time to Fetch Diego Cell State: Time in ns that the Auctioneer took to fetch state from all the Diego Cells when running its auction. See here for details.
  • TAS BBS Crashed App Instances: Total number of LRP instances that have crashed. See here for details.
  • TAS BBS Fewer App Instances Than Expected: Total number of LRP instances that are desired but have no record in the BBS. See here for details.
  • TAS BBS Master Elected: Indicates when there is a BBS master election. See here for details.
  • TAS BBS More App Instances Than Expected: Total number of LRP instances that are no longer desired but still have a BBS record. See here for details.
  • TAS BBS Running App Instances Rate of Change: DYNAMIC ALERT: NEGATIVE 10 is a placeholder. Rate of change in the average number of app instances being started or stopped on the platform. See here for details.
  • TAS BBS Task Count is Elevated: This elevated BBS task metric is a KPI tracked by the internal Tanzu Web Services team.
  • TAS BBS Time to Handle Requests: The maximum observed latency over the past 60 seconds that the BBS took to handle requests across all its API endpoints. See here for details.
  • TAS BBS Time to Run LRP Convergence: Time that the BBS took to run its LRP convergence pass. See here for details.
  • TAS BOSH VM CPU Used: CPU utilization - The percentage of CPU spent in user processes. Set an alert and investigate further if the CPU utilization is too high for a job.
  • TAS BOSH VM Disk Used: System disk - Percentage of the system disk used on the VM.
  • TAS BOSH VM Ephemeral Disk Used: Ephemeral disk - Percentage of the ephemeral disk used on the VM.
  • TAS BOSH VM Health: 1 means the system is healthy, and 0 means the system is not healthy.
  • TAS BOSH VM Memory Used: System memory - Percentage of memory used on the VM.
  • TAS BOSH VM Persistent Disk Used: Persistent disk - Percentage of the persistent disk used on the VM. Set an alert and investigate if the persistent disk usage is too high for a job over an extended period.
  • TAS Cloud Controller and Diego Not in Sync: Indicates if the cf-apps Domain is up-to-date, meaning that TAS app requests from Cloud Controller are synchronized to the BBS. See here for details.
  • TAS Diego Cell Container Capacity: Percentage of remaining container capacity for a given Diego Cell.
  • TAS Diego Cell Disk Capacity: Percentage of remaining disk capacity for a given Diego Cell.
  • TAS Diego Cell Memory Capacity: Percentage of remaining memory capacity for a given Diego Cell.
  • TAS Diego Cell Replication Bulk Sync Duration: Time that the Diego Cell Rep took to sync the ActualLRPs that it claimed with its actual Garden containers.
  • TAS Diego Cell Route Emitter Sync Duration: Time the active Route Emitter took to perform its synchronization pass.
  • TAS Garden Health Check Failed: The Diego Cell periodically checks its health against the Garden back end. For Diego Cells, 0 means healthy, and 1 means unhealthy.
  • TAS Gorouter 502 Bad Gateway: The number of bad gateways, or 502 responses, from the Gorouter itself, emitted per Gorouter instance. See here for details.
  • TAS Gorouter File Descriptors: The number of file descriptors currently used by the Gorouter job. Indicates an impending issue with the Gorouter. See here for details.
  • TAS Gorouter Handling Latency: The amount of time a Gorouter takes to handle requests to backend endpoints, including apps, Cloud Controller (CC), and UAA. See here for details.
  • TAS Gorouter Server Error: The number of requests completed by the Gorouter VM for HTTP status family 5xx, server errors, emitted per Gorouter instance.
  • TAS Gorouter Throughput: The number of requests completed by the Gorouter VM, emitted per Gorouter instance. See here for details.
  • TAS Gorouter Time Since Last Route Register Received: Time since the last route register was received, emitted per Gorouter instance. Indicates if routes are not being registered to apps correctly.
  • TAS Locks Held by Auctioneer: Whether an Auctioneer instance holds the expected Auctioneer lock (in Locket). See here for details.
  • TAS Locks Held by BBS: Whether a BBS instance holds the expected BBS lock (in Locket). See here for details.
  • TAS UAA Latency is Elevated: A quick way to confirm user-impacting behavior is to try login.run.pivotal.io and see if you receive a delayed response. See here for details.
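
Two of the alerts listed above use opposite conventions for their health values: for TAS BOSH VM Health, 1 means healthy, while for TAS Garden Health Check Failed, 0 means healthy. The sketch below shows one way to normalize the two when reading raw values. The function `is_healthy` and the lookup table are hypothetical helpers for illustration only, and do not reflect how the integration evaluates its alert conditions.

```python
# Value that means "healthy" for each alert, per the descriptions in the list.
HEALTHY_VALUE = {
    "TAS BOSH VM Health": 1,               # 1 = healthy, 0 = not healthy
    "TAS Garden Health Check Failed": 0,   # 0 = healthy, 1 = unhealthy
}


def is_healthy(alert_name: str, value: float) -> bool:
    """Return True when the raw value means 'healthy' for the named alert."""
    return value == HEALTHY_VALUE[alert_name]
```

For example, `is_healthy("TAS Garden Health Check Failed", 1)` evaluates to `False`, whereas the same raw value for TAS BOSH VM Health evaluates to `True`.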