Splunk® Enterprise


Roll up metrics data for faster search performance and increased storage capacity

If you have high-volume metrics that index large numbers of unique metric data points at a fast rate, you are probably concerned about issues like storage capacity for historical metrics data and the slow performance of searches across those large datasets.

A metric rollup policy can help you with these issues. You apply metric rollup policies to metric indexes with high-volume metrics. A metric rollup policy sets rules for the aggregation and summarization of the metrics on those indexes. The resulting metric rollup summaries are created in one or more target metric indexes. The rollup summaries contain metric data points that are aggregations of the raw metric data points in the source index. The summarized metrics take up less disk space and are faster to search than the original metrics.

You can create metric rollup policies through Splunk Web, by adding or updating configurations in metric_rollups.conf, and by using the catalog/metricstore/rollup REST API endpoint.

Certain metrics rollup feature extensions, such as the ability to define multiple default aggregation functions for a rollup policy, can only be managed through manual configuration file edits or REST API operations.

See the following topics:

The Splunk Cloud Platform does not support the metrics rollup feature.

Index prerequisites for metric rollup policies

If you want to define a metric rollup policy, you must identify a source metrics index and one or more target metrics indexes. The source index holds the raw metrics that you want the metric rollup policy to summarize. The target index or indexes are where the rollup summaries are stored.

You can designate a source index as a target index if there is space on it for the summaries. However, colocating your source and target indexes on the same device might reduce your ability to get increased data storage benefits from the feature.

If target indexes for your metric rollup policy do not already exist, you must create them. See Create metrics indexes in Managing Indexers and Clusters of Indexers.

Using metric rollup summaries with distributed search

The background searches that populate the rollup summaries operate on the search head. This means that they require that the source index and the target indexes be discoverable on the search head. If you use distributed search, your indexes are all on the indexer tier and are not discoverable on the search head.

You can work around this by creating stand-in source and target indexes on the search head tier. As long as the stand-in indexes have the same names as the actual indexes on the indexer tier, the Splunk software applies any rollup policies you create for the stand-in indexes to the actual indexes.

If you use distributed search you also need to arrange to have the stand-in index on the search head forward its summary data to the actual index on the indexers. You can do this by setting up a universal forwarder configuration on the search head that uses a whitelist to filter out all other indexes. This enables it to forward the metric rollup summary data to the actual target metrics index on the indexer tier. See the following topics:

About the _metrics_rollup index

The _metrics_rollup index is an internal index that is designed for use by the Monitoring Console. Data flows to it only when the Monitoring Console is enabled and the metrics rollup policy configured in the context of the Monitoring Console app is also enabled.

To learn how to enable data to flow to the _metrics_rollup index, see Resource Usage: CPU Usage in Monitoring Splunk Enterprise and find the subsection on the Median Historical CPU Usage dashboard.

Anatomy of a metric rollup policy

If you have a source metrics index that contains high-volume metrics, you can create a metric rollup policy for it. The source metrics index must be discoverable on a search head. See Index prerequisites for metric rollup policies.

Metric rollup policy requirements

At a minimum, a metric rollup policy determines:

  • How many rollup summaries are created for the raw metrics in its source index.
  • Which target indexes the summaries are stored in.
  • The periods of the scheduled searches that generate the aggregated metric data points for the rollup summaries.
  • The default aggregation function used for the summarization of the raw metrics.

The following table defines the required components of a metric rollup policy.

Item | Description
One or more rollup summary definitions | Rollup summary definitions determine where and how the search heads create rollup summaries.
A default aggregation function | This is the default function the search head uses to aggregate metric data points from the source index when it generates rollup summaries. If you do not define an aggregation function, or if you create your metric rollup policy through Splunk Web, the search head uses the avg function. The other eligible functions are count, max, median, min, perc<int>, and sum.

Each rollup summary definition breaks down further into two parts: a target metric index name and a timespan.

Component | Description
Target metric index name | This is the index that the metric rollup summary will be created on. You must create the target metric index if it does not already exist. It must be discoverable on a search head. See Index prerequisites for metric rollup policies.
Timespan | This sets the period of the scheduled searches that generate the metric rollup summary. It must be indicated with relative time syntax, such as 1h for one hour or 20m for twenty minutes. You might run into search concurrency issues if you set the timespan below 60 seconds.
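
The timespan rule can be sketched in a few lines of Python. This is a minimal illustration, not the parser that the Splunk software uses. The function names and the unit table are hypothetical. The text only shows h and m examples, so the s and d units are assumptions.

```python
import re

# Hypothetical unit table. Only "h" and "m" appear in the text above;
# "s" and "d" are assumed here for illustration.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timespan_to_seconds(timespan: str) -> int:
    """Convert a span such as '1h' or '20m' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", timespan.strip())
    if match is None:
        raise ValueError(f"unsupported timespan: {timespan!r}")
    count, unit = match.groups()
    return int(count) * _UNIT_SECONDS[unit]


def is_risky_timespan(timespan: str) -> bool:
    """Flag spans below 60 seconds, which might cause search concurrency issues."""
    return timespan_to_seconds(timespan) < 60
```

A check like is_risky_timespan("30s") could run before a policy is saved, so that very short spans are caught early.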

Metric rollup policy options

A metric rollup policy can optionally include a dimension filter and one or more exception rules. The following table describes these optional components.

Item | Description
Dimension filter | You can indicate a set of dimensions that must be included or excluded from the rolled-up metrics in the summaries produced by the policy. Included dimensions are the only dimensions in the rolled-up metrics that come from the source metric data points. Excluded dimensions are the only dimensions from the source metric data points that do not appear in the rolled-up metrics.
Aggregation exception rules | Create exception rules for metrics that require different aggregation functions than the majority of the metrics in the rollup policy. For example, when your default aggregation is avg, you might have specific metrics that should instead be aggregated with functions like count or perc<int>. The other eligible functions are max, median, min, and sum.
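
To make the required and optional components concrete, the following Python sketch models a policy as plain data. It is an illustration only. The class and field names are hypothetical and do not correspond to settings in metric_rollups.conf or to fields in the REST API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RollupSummary:
    target_index: str  # index where the summary is stored
    timespan: str      # relative time syntax, for example "1h" or "20m"


@dataclass
class DimensionFilter:
    dimensions: List[str]
    include: bool = True  # True: keep only these; False: drop only these

    def apply(self, dims: Dict[str, str]) -> Dict[str, str]:
        """Return the dimensions that survive the filter."""
        if self.include:
            return {k: v for k, v in dims.items() if k in self.dimensions}
        return {k: v for k, v in dims.items() if k not in self.dimensions}


@dataclass
class RollupPolicy:
    source_index: str
    summaries: List[RollupSummary]
    default_aggregation: str = "avg"
    dimension_filter: Optional[DimensionFilter] = None
    # Exception rules: metric name -> aggregation function
    exceptions: Dict[str, str] = field(default_factory=dict)

    def aggregation_for(self, metric_name: str) -> str:
        """Use the exception rule if one exists, else the default function."""
        return self.exceptions.get(metric_name, self.default_aggregation)
```

In this model, a policy without exception rules or a dimension filter is still complete, which mirrors the split between required and optional components.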

How metric rollup summaries are generated

A metric rollup summary is built from the results of a single saved search that can include multiple subsearches. This search runs on a schedule determined by the timespan component of the rollup summary definition. It aggregates sets of raw metric data points using the default aggregation function, or whatever exception aggregation functions might be defined for certain metrics. The search strips out any dimensions that are not in the dimension filter, if one is defined for the summarization policy.

The summary-creating search spawns a separate subsearch for each group of metrics in the source index that have the same aggregate functions and dimension sets.

The search head gives new metric names to the aggregated metric data points produced by the summary-creating search. The new metric names follow this naming convention: <raw_metric_name>_mrollup_<aggregate_function>_<timespan_in_seconds>.
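
As a rough illustration of the naming convention, the following Python helper builds a rolled-up metric name. The function name is hypothetical. The trailing s on the timespan follows the example names given later in this topic, such as metric_A_mrollup_avg_3600s. The convention string itself does not show that suffix, so treat it as an assumption.

```python
def rollup_metric_name(raw_metric_name: str,
                       aggregate_function: str,
                       timespan_seconds: int) -> str:
    """Build a name of the form <raw>_mrollup_<function>_<seconds>s.

    The trailing "s" is assumed from the example names in this topic.
    """
    return f"{raw_metric_name}_mrollup_{aggregate_function}_{timespan_seconds}s"
```

For a metric named metric_A, the avg function, and a one-hour timespan, this helper returns metric_A_mrollup_avg_3600s.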

The summary-creating search also adds three new fields to each rolled up metric data point.

Field name | Description
rollup_source_index | The name of the source index
rollup_span | The period of the scheduled search that generated the rollup summary that this metric data point belongs to
rollup_aggregate | The function used in the creation of this aggregate data point
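
The following Python sketch shows how these three fields might sit alongside the other fields of a rolled-up data point. It is an illustration only. The function name and the dictionary representation are hypothetical, and the exact value format that the Splunk software writes for rollup_span is not stated here, so the span is passed through unchanged.

```python
from typing import Any, Dict


def tag_rollup_point(point: Dict[str, Any],
                     source_index: str,
                     span: str,
                     aggregate: str) -> Dict[str, Any]:
    """Return a copy of the data point with the three rollup fields added."""
    tagged = dict(point)  # leave the caller's dictionary untouched
    tagged["rollup_source_index"] = source_index
    tagged["rollup_span"] = span  # value format is an assumption
    tagged["rollup_aggregate"] = aggregate
    return tagged
```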

Metric rollup summary generation example

Say you have a metric rollup policy on a source index named HomeIndex. The details of this metric rollup policy are as follows:

  • It has a rollup summary definition that names SumIndex as its target index and provides 1h as the period of its background scheduled searches.
  • It was created through Splunk Web, so it uses avg as its default aggregation function.
  • It has a dimension filter that includes only these three dimensions: ip, app, and region. This means that the policy only rolls up metrics in HomeIndex that include one or more of these dimensions, and that the policy strips out all other dimensions from the aggregated metric data points that it creates for the metric rollup summary.
  • It has an exception rule for a metric named metric_C. This rule says that this metric is to be aggregated with the max function when the search head creates rollup metric data points for it.

After you save this policy, a summary-creating search begins running in the background on an hourly schedule. When the search runs, it spawns a subsearch for each metric on HomeIndex that has the included dimensions among its dimension sets. These subsearches produce a single aggregate metric data point each time they run. This means that if an eligible metric has 75 data points indexed over the past hour on HomeIndex, those 75 data points are aggregated into a single metric data point by the rollup search job.
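
The reduction of many raw data points to one aggregate point can be sketched in Python as follows. This is a simplified illustration of the arithmetic, not the search that the Splunk software runs. The names are hypothetical, and the perc<int> functions are omitted for brevity.

```python
from statistics import mean, median
from typing import Callable, Dict, Sequence

# Eligible functions named in this topic, except perc<int>.
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": mean,
    "count": len,
    "max": max,
    "median": median,
    "min": min,
    "sum": sum,
}


def roll_up(values: Sequence[float], function: str = "avg") -> float:
    """Collapse one timespan's worth of raw values into a single value."""
    if not values:
        raise ValueError("no data points to aggregate")
    return AGGREGATORS[function](values)


# 75 raw values from the past hour become one aggregate value.
hourly_values = [float(i) for i in range(1, 76)]
hourly_avg = roll_up(hourly_values)         # default aggregation
hourly_max = roll_up(hourly_values, "max")  # as an exception rule would request
```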

All of these aggregate metric data points are stored on SumIndex. Each point is an aggregation of the metric data points that came in over the past hour for an eligible HomeIndex metric. The background search gives the SumIndex summary metric data points new metric names that reflect their origins, but which also clearly identify them as rollup metric data points.

To continue the example, let us say that on HomeIndex, you have three metrics: metric_A, metric_B, and metric_C. They have different combinations of dimensions, and metric_C has the exception rule which requires that its metric data points be aggregated differently than the others. The following table describes these metrics in terms of the dimensions they contain, the function used for their aggregation, and the metric_name their rolled up metric data points are given.

metric_name on source index | Includes ip dimension? | Includes app dimension? | Includes region dimension? | Aggregation function | metric_name on target index
metric_A | Yes | Yes | Yes | avg (default) | metric_A_mrollup_avg_3600s
metric_B | No | No | No | n/a | Not summarized because it lacks the required dimensions.
metric_C | Yes | No | Yes | max (exception rule) | metric_C_mrollup_max_3600s

The data points for a metric are rolled up by the rollup summary search as long as they all share the same combination of included dimensions. In the previous example, all of the data points for metric_C get rolled up because they all have ip and region. But if some of the data points belonging to a metric have an included dimension while other data points belonging to that metric lack that included dimension, none of the data points for that metric get rolled up.
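
The eligibility rule described above can be expressed as a small Python check. This is a sketch of the rule as this topic states it, under the assumption that each data point is represented by the set of dimension names it carries. The function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Sequence


def included_combo(point_dims: Iterable[str],
                   included: Sequence[str]) -> FrozenSet[str]:
    """Return the included dimensions that a single data point carries."""
    return frozenset(point_dims) & frozenset(included)


def metric_is_rolled_up(points_dims: Iterable[Iterable[str]],
                        included: Sequence[str]) -> bool:
    """True when every data point shares one non-empty combination of
    included dimensions; False when combinations differ or none apply."""
    combos = {included_combo(dims, included) for dims in points_dims}
    return len(combos) == 1 and bool(next(iter(combos)))
```

With ip, app, and region as the included dimensions, a metric whose data points all carry ip and region passes the check, while a metric where only some data points carry region does not.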

Later, you can search SumIndex in exactly the same way that you currently search HomeIndex. You can run faster searches over longer periods of time because the searches are running across smaller sets of metric data points that only have one to three dimension fields.

You can also arrange to store the metrics in SumIndex for longer periods of time than you might store their corresponding metrics on HomeIndex because they take up less space on disk.

Last modified on 14 July, 2022

This documentation applies to the following versions of Splunk® Enterprise: 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.2.0, 9.2.1