Create a summary index in Splunk Web
If you have a transforming search that runs over a large amount of data and is slow to complete, and you have to run this search on a regular basis, you can create a summary index for it. When that summary index is built, the searches you run against it should complete much faster than they did before.
There are two kinds of summary indexes that you can create:
- summary events indexes
- summary metrics indexes
At a high level, the steps you take to create both types of indexes in Splunk Web are the same.
- Identify an index that can be used for summary indexing. Create one if necessary.
- Design a report that can populate the index with summary data.
- Schedule your index-populating report so it runs on a regular interval without gaps or overlaps.
- Enable the scheduled report for summary indexing. This step ties the report to the summary index and runs the report on its schedule, populating the index with the results of the search.
The details of how you perform these four steps depend on whether you are creating a summary events index or a summary metrics index. See the following subsections for more information.
See Use summary indexing for increased search efficiency to get an overview of summary indexing.
Identify or create an index to use for summary indexing
The process of setting up a summary index starts with an index. If you have an existing index that you can use as a summary index, great. If not, you must create an index.
You need an events index if you are setting up a summary events index. You need a metrics index if you are setting up a summary metrics index.
If you want to use an existing index as a summary index
A best practice for summary indexing is to dedicate different summary indexes to different kinds of data. If the slow-completing search that you want to speed up returns data that is similar to the data stored in an existing summary index, consider using that index for your new summary indexing operation.
Every Splunk platform deployment comes with a default summary events index titled "summary". If you are setting up a summary events index, consider using the "summary" index if it is empty or if it is already being used to summarize searches similar to the one you want to summarize.
Splunk platform deployments do not provide a default metrics summary index.
If you want to create a new index for the purpose of summary indexing
Instructions for creating both types of indexes can be found in different topics depending on whether you use Splunk Cloud Platform or Splunk Enterprise.
- If you use Splunk Cloud Platform, go to Manage Splunk Cloud Platform indexes in the Splunk Cloud Platform Admin Manual.
- If you use Splunk Enterprise, go to Create custom indexes in Managing Indexers and Clusters of Indexers.
Design a report to build and maintain the summary index
The Splunk software builds summary indexes and keeps them up to date by inserting the results of a scheduled search into the index. You need to design that search.
The way you write that search differs slightly depending on whether you intend to summarize events or metrics. There are some common factors, however.
- Whether you are designing a summary events index or a summary metrics index, the search that populates the index with its results will be similar to the slow-performing search that you are trying to speed up.
- In both cases the search must be a transforming search that returns statistical events.
- In both cases you save your finished search as a report.
For more information about saving searches as reports, see Create and edit reports in the Reporting Manual.
Design a report for a summary events index
When you design the transforming search that will populate a summary events index, it will have the same SPL as the slow-to-complete search that inspired you to create a summary index, except that it uses a si* transforming command in place of the transforming command already in the search. For example, if the search uses stats, replace it with sistats.
After you create this search, save it as a report.
For more information, see Design searches that populate summary event indexes.
Design a report for a summary metrics index
The transforming search that you provide for a summary metrics index must be the slow-completing search that inspired you to build the summary metrics index in the first place. There is no need to alter it. Save the search as a report, if you haven't done so already.
Because you are using this search as the basis for a summary metrics index, this is a good time to prepare for the fact that the metrics summary indexing process will transform the events returned by this search into metric data points. This means that the Splunk software will sort all of the fields that have numeric values into metric measurements with this format: metric_name:<fieldname>=<value>. The remaining fields will be dimensions, and their format will not be changed.
When you enable the search for summary indexing, you can optionally identify numeric fields that must be treated as dimensions. Look at the results returned by the search and determine whether any of the numeric fields in the events should be on that dimension list. You can add any field to the list as long as you do not list all of them. The Splunk software cannot index metric data points that do not have at least one measurement.
For more information about metric data points, metric measurements, and dimensions, see Overview of metrics in Metrics.
For an overview of the events-to-metrics conversion process, see Convert event logs to metric data points in Metrics.
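To help you preview which fields end up as measurements and which as dimensions, here is a minimal Python sketch of the split described above. It is a simplified model and not the Splunk implementation. The function name to_metric_data_point, the exclude_from_measures parameter, and the sample row are hypothetical. The sketch also assumes that numeric values arrive as Python numbers, and it ignores internal fields such as timestamps.

```python
from numbers import Number


def to_metric_data_point(event, exclude_from_measures=()):
    """Split one search result row into measurements and dimensions.

    Simplified model: numeric fields become metric_name:<fieldname>
    measurements unless they are listed in exclude_from_measures.
    Every other field is kept unchanged as a dimension.
    """
    measurements = {}
    dimensions = {}
    for field, value in event.items():
        # bool is a subclass of int in Python, so rule it out explicitly.
        is_numeric = isinstance(value, Number) and not isinstance(value, bool)
        if is_numeric and field not in exclude_from_measures:
            measurements[f"metric_name:{field}"] = value
        else:
            dimensions[field] = value
    if not measurements:
        # Mirrors the rule that a data point needs at least one measurement.
        raise ValueError("a metric data point needs at least one measurement")
    return measurements, dimensions


# Hypothetical result row; "status" is numeric but should stay a dimension.
row = {"src_ip": "10.0.0.5", "status": 200, "count": 42, "avg_bytes": 512.5}
measurements, dimensions = to_metric_data_point(row, exclude_from_measures={"status"})
print(measurements)
print(dimensions)
```

In this model, excluding every numeric field raises an error, which corresponds to the rule that you cannot put all numeric fields on the dimension list.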
If you review the events returned by your search and find that potential metric measurement fields have characters that are not alphanumeric or underscores, set HEADER_FIELD_ACCEPTABLE_SPECIAL_CHARACTERS for the mcollect_stash source type so that it accepts those characters. If you do not do this, the Splunk software will convert those characters into underscores. This is especially important for preserving "." characters in metric names.
For more information about updating source type settings in Splunk Web, see Manage source types in Getting Data In.
For more information about the HEADER_FIELD_ACCEPTABLE_SPECIAL_CHARACTERS setting, see Extract fields from files with structured data in Getting Data In.
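The effect of the underscore conversion on metric names can be illustrated with a short Python sketch. This is only a model of the behavior described above, not the Splunk implementation. The function name sanitize_field_name and its acceptable_special_characters parameter are hypothetical stand-ins for the source type setting.

```python
import re


def sanitize_field_name(name, acceptable_special_characters=""):
    """Replace every character that is not alphanumeric, an underscore,
    or explicitly accepted with an underscore (simplified model)."""
    # re.escape keeps characters such as "." or "-" literal inside the class.
    allowed = "A-Za-z0-9_" + re.escape(acceptable_special_characters)
    return re.sub(f"[^{allowed}]", "_", name)


# Without accepted special characters, the dots are lost.
print(sanitize_field_name("cpu.usage.percent"))
# With "." accepted, the dotted metric name is preserved.
print(sanitize_field_name("cpu.usage.percent", acceptable_special_characters="."))
```

Under these assumptions, the first call produces cpu_usage_percent and the second keeps cpu.usage.percent intact.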
Schedule the report for your summary index
After you save your summary-index-populating search as a report, you need to schedule it. This step is the same for both index types.
Give the report an interval that is smaller than the time range of the searches you will run against the summary index. This practice ensures that the searches you run against the summary index are statistically accurate.
For example, if you plan to run searches against a summary events index that return results for the last week, populate that summary index with the results of a report that runs on an hourly interval, returning results for the last hour. If you want to run searches against a summary index over the past year of data, arrange for the summary index to collect data on a daily basis for the past day.
For more information about scheduling reports, see Schedule reports in the Reporting Manual.
Minimize the chance of data gaps and overlaps in the report schedule
It is important that you schedule your summary-index-populating report in a manner that minimizes potential data gaps and overlaps.
Data gaps are periods of time when the summary events index fails to index events. This table lists situations that can cause summary index data gaps.
Issue | Result | Why it happens |
---|---|---|
The summary-index-populating report takes too long to run and runs past the next scheduled run time. | This can lead to skipped report runs. | When a scheduled report job is running, the report scheduler won't launch a subsequent report job until the first job completes. For example, if you know your scheduled report can take as much as 7 minutes to run, do not schedule it to run on a 5 minute interval. |
You force the summary-index-populating report to use real-time scheduling. | This can result in data collection gaps if you are concurrently running several reports. | This happens if the realtime_schedule setting is set to 1 on the Advanced Edit page for the report in Settings > Searches, Reports, and Alerts. When you enable a report for summary indexing, the Splunk software automatically sets its realtime_schedule to 0, to help ensure that the report never skips a scheduled run. See Configure the priority of scheduled reports in the Reporting Manual. |
splunkd goes down. | If the Splunk indexers cannot index events, you will have gaps in your summary indexes. | The splunkd process can go down for a number of reasons, including intentional shutdown. |
For more information about detecting and fixing gaps in data, see Manage summary index gaps.
Overlaps are events in a summary index (from the same report) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a report to be longer than the report interval. For example, don't arrange for a report that runs hourly to gather data for the past 90 minutes.
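The scheduling rules in this section can be expressed as a quick sanity check. The following Python sketch is a hypothetical helper, not a Splunk tool. The function name check_schedule and its parameters are assumptions, and all values are in minutes. The gap rule for a time range that is shorter than the interval is an inference from the overlap rule rather than something stated explicitly above.

```python
def check_schedule(interval_min, time_range_min, max_runtime_min):
    """Return a list of likely problems for a summary-populating report.

    interval_min:    how often the report runs
    time_range_min:  how far back each run searches
    max_runtime_min: the longest time a run is expected to take
    """
    problems = []
    if time_range_min > interval_min:
        # Consecutive runs collect the same slice of time twice.
        extra = time_range_min - interval_min
        problems.append(f"overlap: each run re-collects {extra} minutes of data")
    elif time_range_min < interval_min:
        # Part of each interval is never searched by any run.
        missing = interval_min - time_range_min
        problems.append(f"gap: {missing} minutes per interval are never collected")
    if max_runtime_min > interval_min:
        # The scheduler does not start a new job while the previous one runs.
        problems.append("skipped runs: the report can run past its next scheduled time")
    return problems


# An hourly report that gathers the past 90 minutes overlaps by 30 minutes.
print(check_schedule(interval_min=60, time_range_min=90, max_runtime_min=5))
# A 5-minute interval with a report that can take 7 minutes risks skipped runs.
print(check_schedule(interval_min=5, time_range_min=5, max_runtime_min=7))
```

A schedule whose time range equals its interval and whose runtime stays below the interval returns an empty list in this model.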
Enable your scheduled report for summary indexing
After you schedule the summary-building report, you enable it to build and maintain a summary index. This stage differs slightly depending on the type of summary index you are creating.
Prerequisites
- Identify or create an index that can be used as a summary index.
- Create a report that can be used to populate the summary index.
- Schedule the report so that the summary index is statistically accurate and will not have data gaps and overlaps.
Steps to enable a search for summary events indexing
- Select Settings > Searches, Reports, and Alerts.
- Locate the report that you created and scheduled. Select Edit > Edit Summary Indexing.
- Select Enable Summary Indexing.
- Select Event as the index type.
- Select the events index that you want to use as the summary index for this search. The list displays only indexes to which you have permission to write. The default events summary index is named "summary".
The list does not filter out metrics indexes. Make sure you select an events index.
- (Optional) Use Add Fields to add one or more field/value pairs to the summary events index definition.
The Splunk software annotates events added to the summary index by this search with the field/value pairs that you supply. This enables you to search on these events. For example, you could add the name of the report that populates the summary index (report=summary_firewall_top_src_ip) to the events in your summary. Later, if you want to restrict a search of the summary index to events added by this search, you can add report=summary_firewall_top_src_ip to its SPL.
After you save these settings, the Splunk software starts running the search on its schedule in the background. When it runs the search it automatically collects the results into the designated summary events index.
Steps to enable a search for summary metrics indexing
- Select Settings > Searches, Reports, and Alerts.
- Locate the report that you created and scheduled. Select Edit > Edit Summary Indexing.
- Select Enable Summary Indexing.
- Select Metric as the index type.
- (Optional) In Exclude from measures, provide a comma-separated list of fields that can be excluded from the measures in the summarized metric data points.
The metric summarization process automatically converts all numeric fields in your search results into metric measures, and all other fields become dimensions. Numeric fields listed here are added to that set of dimension fields.
- Select the metrics index that you want to use as the summary index for this search. The list displays only indexes to which you have permission to write.
The list does not filter out events indexes. Make sure you select a metrics index.
- (Optional) Use Add Fields to have the Splunk software add one or more dimensions to the metric data points that it inserts into the summary metrics index. This does not add metric measurements to the summary index.
The Splunk software annotates metric data points with the dimension field/value pairs that you supply. This helps you to search on these metric data points. For example, you could add the name of the report that populates the summary index (report=summary_firewall_top_src_ip) to the metric data points in your summary. Later, if you want to restrict a search of the summary index to metric data points added by this search, you can add report=summary_firewall_top_src_ip to its SPL.
After you save these settings, the Splunk software starts running the search on its schedule as a background search. When it runs the search, it automatically converts the event results into metric data points, turning numeric fields that are not on the Exclude from measures list into metric measurements, and treating all other fields as dimensions. The process then collects these metric data points into the designated summary metrics index.
For an overview of the events-to-metrics conversion process, see Convert event logs to metric data points in Metrics.
This documentation applies to the following versions of Splunk Cloud Platform™: 9.3.2408, 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)