
Gain insights through chart analytics

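Splunk Infrastructure Monitoring analytics can turn a chart that displays raw metric data into a tool that gives you a deeper understanding of patterns and trends, so you can more effectively monitor infrastructure, application, or service health. This section explains how to compare aggregates by metadata, retain peaks and valleys in longer time ranges, correlate multiple metrics, compare time periods, use percentages or ratios, build percentile overviews, show Top or Bottom N lists, see changes in distribution, and smooth out peaks and valleys.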

This section assumes you are familiar with the following topics.

Compare aggregates by service or other metadata

When you are looking at infrastructure metrics for a good-sized fleet of hosts, virtual machines or containers, it is often more instructive to look at them at an aggregate level and compare the aggregates than to look at individual instances. Many of the analytics functions allow you to group the output by metadata, which serves this purpose perfectly.

  1. Select the metric you want to compare at an aggregate level (e.g. across services) and enter its name in the Signal field for plot A. In this example, we are plotting demo.trans.latency.

This screenshot shows how to select the metric you want to compare at an aggregate level and use


  2. In the Analytics field, select the function you want to apply, such as mean:aggregation. The chart now displays a single plot line showing the mean across all time series in each time interval.

This screenshot shows how to select the function you want to apply and use


  3. Click on the selected function for the plot. Click the group‑by dropdown. Select the metadata you want to group by, such as service (if you are sending in a dimension named “service”), aws_availability_zone (if you are using AWS) or other metadata. In this example, we chose demo_datacenter.

This screenshot shows how to select the metadata you want to group by and use


  4. Now you can see the metric aggregated across all resources (hosts, VMs, or containers) in each sub-group. As the data table shows, each plot line represents one of the two demo_datacenter values.

This screenshot shows an example of the analytics aggregated and grouped by the metadata you selected.
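To make the effect of grouping concrete, the following Python sketch reproduces the arithmetic of a mean aggregation with and without a group-by. It is only an illustration of the calculation, not how Infrastructure Monitoring is implemented; the host names, datacenter names, and values are made up, and `mean_by` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical latest values for demo.trans.latency, one per time series.
# The host names, datacenter names, and numbers are invented for illustration.
samples = [
    {"host": "h1", "demo_datacenter": "Paris", "value": 200.0},
    {"host": "h2", "demo_datacenter": "Paris", "value": 240.0},
    {"host": "h3", "demo_datacenter": "Tokyo", "value": 310.0},
    {"host": "h4", "demo_datacenter": "Tokyo", "value": 290.0},
]

def mean_by(samples, dimension):
    """Average the values of all time series that share a dimension value."""
    groups = defaultdict(list)
    for s in samples:
        groups[s[dimension]].append(s["value"])
    return {key: mean(vals) for key, vals in groups.items()}

# Without a group-by: a single plot line across all time series.
overall = mean(s["value"] for s in samples)

# With a group-by: one plot line per demo_datacenter value.
per_datacenter = mean_by(samples, "demo_datacenter")
```

With these sample values, the ungrouped mean hides the fact that one datacenter is consistently slower than the other, which is exactly what the grouped plot lines reveal.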

Retain peaks and valleys in longer time ranges

By default, Splunk Infrastructure Monitoring selects a rollup that is appropriate for the time range and chart resolution you have selected. For example, let’s assume you are sending a metric every 10 seconds to Infrastructure Monitoring, and that its metric type is gauge. If you are looking at a month’s worth of that metric in a chart, there are too many data points to display (6 data points per minute x 60 minutes per hour x 24 hours per day x 30 days per month = 259,200 data points).

In this situation, Infrastructure Monitoring applies the default visualization rollup of Average for a gauge metric. This rollup has the effect of averaging out the data, and makes peaks or valleys that are visible at the higher resolution less apparent.

This screenshot shows an example of the default visualization rollup of Average, the gauge metric


To retain the peaks or valleys, you can change the rollup to max or min, whichever is more relevant to your metric. The Y-axis value range may change from what it was in the original visualization. In this illustration, we clone plot A and change the rollup to max in plot B (and change the color in plot B to make the differences easier to see). To clone a plot line, open the plot’s Actions menu (⋯) at the far right of the plot line, then select Clone. For information on changing plot color, see Set options in the plot configuration panel.

This screenshot shows an example of changing the default visualization rollup of Average, the gauge metric, to rollup to max
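The difference between an average rollup and a max rollup can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the raw values are invented, the bucket size of six points (one minute of 10-second data) is chosen for illustration, and `rollup` is a hypothetical helper rather than a product API.

```python
from statistics import mean

def rollup(values, points_per_bucket, func):
    """Collapse consecutive raw points into one displayed point per bucket."""
    return [
        func(values[i:i + points_per_bucket])
        for i in range(0, len(values), points_per_bucket)
    ]

# Hypothetical gauge reported every 10 seconds: steady at 10 with one spike to 70.
raw = [10, 10, 10, 10, 10, 70, 10, 10, 10, 10, 10, 10]

average_rollup = rollup(raw, 6, mean)  # the spike is flattened to 20
max_rollup = rollup(raw, 6, max)       # the spike survives as 70

# Raw point count for a month of 10-second data, as computed in the text.
points_per_month = 6 * 60 * 24 * 30
```

The average rollup turns the spike into a modest bump, while the max rollup keeps its full height, which is why the Y-axis range can change when you switch rollups.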


To make peaks and valleys even more noticeable, increase the chart display resolution. Here, we change it from the default to Very High. The differences are more visible.

This screenshot shows an example of changing the chart display resolution to very high


Choosing a shorter time frame increases visibility as well. Here, we change the time range from the past 20 days to the past week.

This screenshot shows an example of changing the time range from the past 20 days to the past week


For more information about the interactions between rollups, chart resolution, and analytics, see Data resolution and rollups in charts.

Correlate multiple metrics

It is often useful to visualize multiple metrics on the same chart so as to more easily correlate their behavior. For example, you may want to look at the number of transactions happening per second alongside the latency of the transactions. Splunk Infrastructure Monitoring lets you display as many metrics as you want on a single chart, and gives you two Y-axes in case the ranges of the metrics’ values are significantly different.

  1. Select the metric you want to compare and enter its name in the Signal field for plot A. In this example, we are using demo.trans.latency.

  2. Select the second metric and use it in plot B. We’ve selected demo.trans.count.

This screenshot shows an example of using demo.trans.latency and demo.trans.count for comparing correlations


  3. In plot B, click Y-Axis and select right. To learn more, see Left and right Y-axes.

This screenshot shows how to change the Y-Axis label to right


  4. Using the visualization type option for each plot line, select different types for A and B, such as Line for A and Column for B. To learn more, see Visualization type. In this example, we also used plot configuration options to change the color of plot line B to enhance visibility. To learn more, see Plot color.

This screenshot shows how to change the plot type of B, demo.trans.count, to column to enhance visibility

View weekly, daily or hourly comparisons

If time of day or week matters for understanding whether your apps or infrastructure are performing within normal bounds, or if your business sees cyclical or periodic demand, e.g. weekdays and weekends are very different, then you can create charts that highlight the change from one week, one day, one hour etc. to the next. (Note that Splunk Infrastructure Monitoring allows you to do comparisons using whatever timeframe you want, not just these intervals.)

  1. Use the first plot (plot A) to show the metric you care about, then clone A to create plot B. (To clone a plot line, open the plot’s Actions menu (⋯) at the far right of the plot line, then select Clone.) In this example, we are using memory.usage.total as our signal.

  2. Add a Timeshift function to plot B, entering a time range over which the change matters. For example, use 5m for 5 minutes, 2d for 2 days, and 1w for 1 week.

This screenshot shows how to select timeshift as a function


This screenshot shows a one-week timeshift applied to the signal, which is memory.usage.total in the example


  3. In plot C, click on Enter Formula to enter A-B to see the difference between now and a week ago.

  4. Use the plot configuration panel to specify an area visualization for plot C. To learn more, see Set options in the plot configuration panel.

This screenshot shows how to change the visualization for plot C to compare the differences between A and B
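The steps above amount to shifting a series forward in time and subtracting it from the original. The Python sketch below shows that arithmetic on invented data; `timeshift` and `difference` are hypothetical helpers written for this example, and the timestamps and values are assumptions.

```python
def timeshift(series, offset):
    """Return a series whose value at time t is the original value at t - offset."""
    return {t + offset: v for t, v in series.items()}

def difference(a, b):
    """Compute A - B at every timestamp where both series have a value."""
    return {t: a[t] - b[t] for t in sorted(a.keys() & b.keys())}

WEEK = 7 * 24 * 3600  # one week in seconds

# Hypothetical memory.usage.total samples keyed by epoch seconds.
plot_a = {0: 40, WEEK: 55, 2 * WEEK: 50}

plot_b = timeshift(plot_a, WEEK)     # last week's values, aligned to this week
plot_c = difference(plot_a, plot_b)  # change compared with one week earlier
```

A positive value in `plot_c` means the metric is higher than it was a week earlier, and a negative value means it is lower.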

Use percentages or ratios

In many cases, you may want to see percentages or ratios rather than the raw metric. For example, the ratio of return codes that signify failure to those that signify success, or the percentage of cache hits out of total cache accesses (hits + misses).

  1. Use the first plot (plot A) to show one of the metrics you care about, e.g. zipper.missCount.

  2. Use the second plot (plot B) to show the other metric you want, e.g. zipper.hitCount.

This screenshot shows plot A as zipper.missCount and plot B as zipper.hitCount


  3. In plot C, enter formula A/(A+B) and add a scale:100 function to express the ratio as a percentage.

This screenshot shows how to add a formula and scale to show percentage


  4. Alt-click or option-click on the eye icon next to plot C to hide the other plots. You are left with a chart that shows the percentage of cache misses over time.

This screenshot shows how to only display plot C, which is A/(A+B)


  5. Edit the plot name for plot C, so useful information shows up when you hover over the chart (before and after shown below) or view the data table.

This screenshot shows how to change the name of the plot so that useful information appears when you hover over the chart
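The formula A/(A+B) followed by a scale of 100 can be checked by hand with a short Python sketch. The counts below are invented, `miss_percentage` is a hypothetical helper, and returning `None` when there are no accesses is an assumption made here to avoid dividing by zero, not documented product behavior.

```python
def miss_percentage(miss_count, hit_count):
    """Compute A/(A+B) scaled by 100; return None when there were no accesses."""
    total = miss_count + hit_count
    if total == 0:
        return None  # assumption: treat "no accesses" as "no data"
    return 100 * miss_count / total

# Hypothetical per-interval values for zipper.missCount (A) and zipper.hitCount (B).
plot_a = [5, 0, 20]
plot_b = [95, 50, 60]

# Plot C: percentage of cache misses in each interval.
plot_c = [miss_percentage(a, b) for a, b in zip(plot_a, plot_b)]
```

Because the denominator is the total number of accesses, the result always falls between 0 and 100, which makes it easier to read than the raw counts.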

Use percentiles to see population overviews

When you want to get a quick overview of a population, a distributed percentile chart is a good option. To construct such a chart, use non-stacked area charts. Select Show on-chart legend in the Chart Options tab (see Show on-chart legend), then show the plots like the following.

  • p10. In the first plot (plot A), enter the metric and filters you want, then use the Percentile function and enter 10 as the value.

  • median. Clone plot A and use 50 as the value.

  • p90. Clone plot B and use 90 as the value.

This illustration shows what such a chart might look like:

This screenshot shows the percentiles of three plots, which are demo.trans.latency in the example


To see specific values, hover over different points on the chart or display the data table.
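As a rough illustration of what the three plots represent, the Python sketch below computes p10, median, and p90 for a small invented population at a single point in time. It uses the nearest-rank definition of a percentile, which is an assumption chosen for simplicity; the interpolation method that the Percentile function actually uses may differ.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical demo.trans.latency values from ten sources at one point in time.
latencies = [120, 95, 210, 180, 150, 300, 110, 130, 160, 140]

p10 = percentile(latencies, 10)     # plot A
median = percentile(latencies, 50)  # plot B
p90 = percentile(latencies, 90)     # plot C
```

The band between p10 and p90 covers the bulk of the population, so a widening band over time suggests that the sources are diverging.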

Show Top or Bottom N lists

Top or bottom N charts are great for showing simple outliers, rankings or worst performers.

  1. Enter a metric for plot A. We chose cpu.utilization.

  2. Select List as your chart type.

  3. Apply the analytics function Top or Bottom, then choose either the number of values you want to see in the list or the percentage range you want to see. In this example, we chose Top 5 and specified Count.

This screenshot shows top 5 of cpu.utilization in a list chart


  4. To reduce redundant metadata on the chart, select custom under the Display Fields option in the Chart Options tab to hide the plot name.

  5. Sort Top N charts by Descending value, or Bottom N by Ascending value.

This screenshot shows a descending view of top 5 of cpu.utilization in a list chart


  6. To make the chart even easier to read, use the Display Fields option to hide more fields. You can also hide Entries with missing data under the Visualization Options.

This screenshot shows the view of top 5 of cpu.utilization in a list chart that hides entries with missing data and fields except host.name, host.type and kubernetes_cluster
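The selection and sort order described in the steps above can be sketched in Python as follows. The host names and values are invented, and `top_n` and `bottom_n` are hypothetical helpers that only illustrate selecting by Count; they do not cover the percentage option.

```python
# Hypothetical latest cpu.utilization values per host.
cpu = {
    "host-a": 91.5, "host-b": 12.0, "host-c": 77.3, "host-d": 64.8,
    "host-e": 98.1, "host-f": 45.0, "host-g": 83.9,
}

def top_n(values, count):
    """Return the `count` highest entries, sorted by descending value."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)[:count]

def bottom_n(values, count):
    """Return the `count` lowest entries, sorted by ascending value."""
    return sorted(values.items(), key=lambda item: item[1])[:count]

top5 = top_n(cpu, 5)
bottom2 = bottom_n(cpu, 2)
```

Sorting Top N in descending order and Bottom N in ascending order puts the most extreme entry first in both cases.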

See changes in distribution

A histogram is a good way to look at the distribution of a population at a single point in time. Splunk Infrastructure Monitoring provides histograms so you can look at the change in that distribution over time. This is useful for surfacing unexpected changes, e.g. in the latencies of requests served by a cluster.

  1. Select a metric that is being sent from a relatively large number of sources. In this case, we chose demo.trans.latency.

This screenshot shows demo.trans.latency in a line chart view


  2. Choose the histogram graph type.

This screenshot shows demo.trans.latency in a histogram view
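Conceptually, a histogram over time is a distribution computed at each time step. The Python sketch below counts values per fixed-width bucket for two invented snapshots. The bucket width of 100 and the `bucket_counts` helper are assumptions for illustration; the way the histogram chart chooses and shades its buckets is not described here.

```python
from collections import Counter

def bucket_counts(values, bucket_width):
    """Count how many values fall into each fixed-width bucket, keyed by bucket start."""
    return Counter((v // bucket_width) * bucket_width for v in values)

# Hypothetical demo.trans.latency values across sources at two points in time.
snapshots = {
    "10:00": [110, 120, 130, 140, 210],
    "10:05": [115, 220, 230, 310, 320],
}

# One distribution per time step.
distribution = {t: dict(bucket_counts(vals, 100)) for t, vals in snapshots.items()}
```

In this made-up data, most sources sit in the lowest bucket at 10:00 and have moved to higher buckets by 10:05, the kind of shift that a mean alone can understate.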


Smooth out peaks and valleys

Do you want to smooth out peaks and valleys in your data to see general patterns from one period to the next? If you can’t tell at a glance whether a value is generally steady, rising, or falling, view the data as a moving average. To do this, use the Transformation option instead of Aggregation. The Transformation option is available with the following analytics functions: Mean, Minimum / Maximum, Percentile, Sum, and Variance. For Mean, Minimum, Maximum, and Sum, you can specify either a moving window (the past number of minutes, hours, etc.) or a calendar time window (over the past day, week, month, etc.).

  1. Determine an appropriate interval for applying a moving average.

  2. Use the Mean analytics function, select the Mean:Transformation option, then select the appropriate time window option.

  3. Enter your interval, e.g. 5m.

In the following illustration, values and moving averages are displayed for cpu.utilization as follows:

  • Plot A: Actual values

  • Plot B: 30-minute moving average

  • Plot C: 1-hour moving average

This screenshot shows an example of moving averages displayed for cpu.utilization by all, 30-minute, and 1-hour


You can also hide plot lines to make the chart easier to read:

This screenshot shows an example of moving averages displayed for cpu.utilization by all and 1-hour, with 5-minute and 30-minute hidden
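A moving average replaces each point with the mean of the most recent points in a window. The Python sketch below shows the idea on invented cpu.utilization values spaced 10 minutes apart, so a window of 3 points stands in for 30 minutes and 6 points for 1 hour. `moving_average` is a hypothetical helper, and averaging over fewer points at the start of the series is an assumption of this sketch.

```python
from collections import deque

def moving_average(values, window):
    """Mean of the most recent `window` points at each step (fewer at the start)."""
    recent = deque(maxlen=window)  # old points drop off automatically
    out = []
    for v in values:
        recent.append(v)
        out.append(sum(recent) / len(recent))
    return out

# Hypothetical cpu.utilization values at 10-minute intervals.
plot_a = [20, 80, 20, 80, 20, 80]

plot_b = moving_average(plot_a, 3)  # 30-minute moving average
plot_c = moving_average(plot_a, 6)  # 1-hour moving average
```

The longer the window, the smaller the swings in the output, at the cost of reacting more slowly to real changes.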

Next steps

For details about all available analytics functions, see the Functions reference for Splunk Observability Cloud.

Once you have developed charts to help you proactively monitor your system, the natural next step is to want to view and receive alerts when values reach certain criteria. For information on how to do this, see Introduction to alerts and detectors in Splunk Observability Cloud.