Configure summary indexes
For a general overview of summary indexing and instructions for setting up summary indexing through Splunk Web, see Use summary indexing for increased reporting efficiency.
You can't manually configure a summary index for a saved report in savedsearches.conf until it is set up as a scheduled report that runs on a regular interval, triggers each time it is run, and has the Enable summary indexing alert option selected.
In addition, you need to enter the name of the summary index that the report will populate. You do this through the detail page for the report in Settings > Searches and Reports after selecting Enable summary indexing. The Summary index is the default summary index (the index that Splunk Enterprise uses if you do not indicate another one).
If you plan to run a variety of summary index reports you may need to create additional summary indexes. For information about creating new indexes, see Create custom indexes in the Managing Indexers and Clusters manual. It's a good idea to create indexes that are dedicated to the collection of summary data.
Summary indexing volume is not counted against your license, even if you have several summary indexes. In the event of a license violation, summary indexing will halt like any other non-internal search behavior.
If you enter the name of an index that does not exist, Splunk Enterprise runs the report on the schedule you've defined, but it does not save the report data to a summary index.
For more information about creating and managing reports, see Create and edit reports.
For more information about defining a report that can populate a summary index, see Design searches that populate summary events indexes.
When you define the report that you will use to build your index, in most cases you should use the summary indexing transforming commands in the report's search string. These commands are prefixed with "si": sichart, sitimechart, sistats, sitop, and sirare. The reports you create with them should be versions of the report that you'll eventually use to query the completed summary index.
The summary index transforming commands automatically take into account the issues that are covered in "Considerations for summary index report definition" below, such as scheduling shorter time ranges for the populating report and setting the populating report to take a larger sample. You only have to worry about these issues if the report you are using to build your index does not include summary index transforming commands.
If you do not use the summary index transforming commands, you can use the addinfo and collect search commands to create a report that Splunk Enterprise saves and schedules, and which populates a pre-created summary index. For more information about that method, see "Manually configure a report to populate a summary index" in this topic.
Customize summary indexing for a scheduled report
When you use Splunk Web to enable summary indexing for a scheduled and summary-index-enabled report, Splunk Enterprise automatically generates a stanza in $SPLUNK_HOME/etc/system/local/savedsearches.conf. You can customize summary indexing for the report by editing this stanza.
If you've used Splunk Web to save and schedule a report, but haven't used Splunk Web to enable the summary index for the report, you can easily enable summary indexing for the report through savedsearches.conf as long as you have a new index for it to populate. For more information about manual index configuration, see About managing indexes in Managing Indexers and Clusters.
[<name>]
action.summary_index = 0 | 1
action.summary_index._name = <index>
action.summary_index.<field> = <value>
[<name>]: Splunk Enterprise names the stanza based on the name of the scheduled report that you enabled for summary indexing.
action.summary_index = 0 | 1: Set to 1 to enable summary indexing. Set to 0 to disable summary indexing.
action.summary_index._name = <index>: This specifies the name of the summary index populated by this report. If you've created a specific summary index for this report, enter its name in <index>. Defaults to summary, the summary index that is delivered with Splunk Enterprise.
action.summary_index.<field> = <value>: Specify a field/value pair to add to every event that gets summary indexed by this report. You can define multiple field/value pairs for a single summary index report.
This field/value pair acts as a "tag" of sorts that makes it easier for you to identify the events that go into the summary index when you are running reports against the greater population of event data. This key is optional but we recommend that you never set up a summary index without at least one field/value pair.
For example, add the name of the report that is populating the summary index (action.summary_index.report = summary_firewall_top_src_ip), or the name of the index that the report populates (action.summary_index.index = search).
Search commands useful to summary indexing
Summary indexing uses a set of specialized transforming commands, which you need to use if you are manually creating your summary indexes without the help of the Splunk Web interface or the summary indexing transforming commands.
addinfo: Summary indexing uses addinfo to add fields containing general information about the current report to the report results going into a summary index. Add | addinfo to any report to see what results will look like if they are indexed into a summary index.
collect: Summary indexing uses collect to index report results into the summary index. Use | collect to index any report results into another index.
overlap: Use overlap to identify gaps and overlaps in a summary index. overlap finds events of the same query_id in a summary index with overlapping timestamp values or identifies periods of time where there are missing events.
Manually configure a report to populate a summary index
If you want to configure summary indexing without using the report options dialog in Splunk Web and the summary indexing transforming commands, you must first configure a summary index just like you would any other index via indexes.conf. For more information about manual index configuration, see the topic "About managing indexes" in the Managing Indexers and Clusters manual.
Important: You must restart Splunk Enterprise for changes in indexes.conf to take effect.
1. In Splunk Web, design the search string whose results you want to summarize.
- Be sure to limit the time range of your report. The number of results that the report generates needs to fit within the maximum report result limits you have set for reporting.
- Make sure to choose a time interval that works for your data, such as 10 minutes, 2 hours, or 1 day. (For more information about using Splunk Web to schedule report intervals, see the topic "Schedule reports" in the Reporting Manual.)
2. Use the addinfo search command. Append | addinfo to the end of the report's search string.
- This command adds information about the report to the events. The collect command requires this information in order to place the events into a summary index.
- You can always add | addinfo to any search string to preview what its results will look like in a summary index.
3. Add the collect search command to the report's search string. Append | collect index=<index_name> addtime=t marker="report_name=\"<summary_report_name>\"" to the end of the search string.
- Replace index_name with the name of the summary index.
- Replace summary_report_name with a key to find the results of this report in the index.
- summary_report_name *must* be set if you wish to use the overlap search command on the generated events.
Note: For the general case we recommend that you use the provided summary_index alert action. Configuring via collect requires some redundant steps that are not needed when you generate summary index events from scheduled reports. Manual configuration remains necessary when you backfill a summary index for time ranges that have already passed.
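Steps 2 and 3 amount to appending a fixed suffix to the base search. The following Python sketch shows one way to assemble that string so the escaped quotes in the marker come out right. The function name, the base search, and the report key are hypothetical and used only for illustration; only the addinfo and collect arguments come from the steps above.

```python
def with_summary_collect(search, index_name, summary_report_name):
    """Append the addinfo and collect commands described in steps 2 and 3.

    The marker value must contain escaped double quotes, so the backslashes
    are written into the resulting search string on purpose.
    """
    marker = f'report_name=\\"{summary_report_name}\\"'
    return (
        f"{search} | addinfo "
        f'| collect index={index_name} addtime=t marker="{marker}"'
    )


# Hypothetical base search and report key.
full_search = with_summary_collect(
    "index=main | stats count by host", "summary", "hourly_host_count"
)
print(full_search)
```

Printing the result before you save the report is a quick way to confirm that the marker quoting survived intact.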
Considerations for summary index report definition
If for some reason you're going to set up a summary-index-populating report that does not use the summary indexing transforming commands, you should take a few moments to plan out your approach. With summary indexing, the egg comes before the chicken. Review the results of the report that you actually want to run to help define the report you actually use to populate the summary index.
Many summary-searching reports involve aggregated statistics--for example, a report where you are searching for the top 10 IP addresses associated with firewall offenses over the past day--when the main index accrues millions of events per day.
If you populate the summary index with the results of the same report that you run on the summary index, you'll likely get results that are statistically inaccurate. You should follow these rules when defining the report that populates your summary index to improve the accuracy of aggregated statistics generated from summary index reports.
Schedule a shorter time range for the populating report
The report that populates your summary index should be scheduled on a shorter (and therefore more frequent) interval than that of the report that you eventually run against the index. You should go for the smallest time range possible. For example, if you need to generate a daily "top" report, then the report populating the summary index should take its sample on an hourly basis.
Set the populating report to take a larger sample
The report populating the summary index should seek out a significantly larger sample than the report that you want to run on the summary index. So, for example, if you plan to search the summary index for the daily top 10 offending IP addresses, you would set up a report to populate the summary index with the hourly top 100 offending IP addresses.
This approach has two benefits--it ensures a higher amount of statistical accuracy for the top 10 report (due to the larger and more-frequently-taken overall sample) and it gives you a bit of wiggle room if you decide you'd rather report on the top 20 or 30 offending IPs.
The summary indexing transforming commands automatically take a sample that is larger than the report that you'll run to query the completed summary index, which keeps the event data in the summary index from being skewed. If you do not use those commands, you can use the head command to select a larger sample for the summary-index-populating report than the report that you run over the summary index. In other words, you would have | head 100 for the hourly summary-index-populating report, and | head 10 for the daily report over the completed summary index.
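To see why the hourly sample needs to be larger than the final report, consider the following Python sketch. It is a toy model rather than anything Splunk runs: the IP addresses and counts are made up, and the daily_top function is a hypothetical stand-in for summarizing each hour and then reporting on the summary.

```python
from collections import Counter

# Hypothetical per-hour event counts by source IP. One address is never the
# hourly leader but is the clear leader over the whole period.
hours = [
    {"10.0.0.1": 10, "10.0.0.9": 9},
    {"10.0.0.2": 10, "10.0.0.9": 9},
    {"10.0.0.3": 10, "10.0.0.9": 9},
]


def daily_top(hours, hourly_sample, daily_n):
    """Keep each hour's top `hourly_sample` IPs, then report the overall top `daily_n`."""
    summary = Counter()
    for counts in hours:
        for ip, count in Counter(counts).most_common(hourly_sample):
            summary[ip] += count
    return summary.most_common(daily_n)


too_small = daily_top(hours, hourly_sample=1, daily_n=1)  # misses 10.0.0.9
larger = daily_top(hours, hourly_sample=2, daily_n=1)     # finds 10.0.0.9 with 27 events
```

With an hourly sample equal to the final report size, the address with the most events overall never reaches the summary. Doubling the hourly sample is enough to recover it in this small example.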
Set up your report to get a weighted average
If your summary-index-populating report involves averages, and you are not using the summary indexing transforming commands, you need to set that report up to get a weighted average.
For example, say you want to build hourly, daily, or weekly reports of average response times. To do this, you'd generate the "daily average" by averaging the "hourly averages" together. Unfortunately, the daily average becomes skewed if there aren't the same number of events in each "hourly average". You can get the correct "daily average" by using a weighted average function.
The following expression calculates the daily average response time correctly with a weighted average by using the stats and eval commands in conjunction with the sum statistical aggregator. In this example, the eval command creates a daily_average field, which is the result of dividing the response time sum by the response time count.
| stats sum(hourly_resp_time_sum) as resp_time_sum, sum(hourly_resp_time_count) as resp_time_count | eval daily_average= resp_time_sum/resp_time_count | .....
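The arithmetic behind this search can be checked with a small Python sketch. The two hourly rows are invented for illustration; the weighted version mirrors the sum-of-sums divided by sum-of-counts calculation in the search above.

```python
# Hypothetical hourly summary rows: (sum of response times, number of events).
hourly = [
    (100.0, 100),  # busy hour: 100 events averaging 1.0 second
    (50.0, 10),    # quiet hour: 10 events averaging 5.0 seconds
]


def naive_daily_average(rows):
    """Average of the hourly averages; every hour counts equally."""
    hourly_averages = [total / count for total, count in rows]
    return sum(hourly_averages) / len(hourly_averages)


def weighted_daily_average(rows):
    """Sum of the hourly sums divided by the sum of the hourly counts."""
    resp_time_sum = sum(total for total, _ in rows)
    resp_time_count = sum(count for _, count in rows)
    return resp_time_sum / resp_time_count


naive = naive_daily_average(hourly)        # 3.0
weighted = weighted_daily_average(hourly)  # 150 / 110, roughly 1.36
```

The naive figure lets the quiet hour's 10 events count as much as the busy hour's 100. The weighted figure matches what you would get by averaging all 110 events directly.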
Schedule the summary-index-populating report to avoid data gaps and overlaps
Along with the rules above, to minimize data gaps and overlaps you should also be sure to set appropriate intervals and delays in the schedules of reports you use to populate summary indexes.
Gaps in a summary index are periods of time when a summary index fails to index events. Gaps can occur if:
- The scheduled saved report (the one being summary indexed) takes too long to run and runs past the next scheduled run time. For example, if you were to schedule the report that populates the summary to run every 5 minutes when that report typically takes around 7 minutes to run, you would have problems, because the report won't start a new run while a preceding run is still in progress.
Overlaps are events in a summary index (from the same report) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a saved report to be longer than the frequency of the schedule of the report, or if you manually run summary indexing using the collect command.
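The interaction between schedule interval, run time, and time range can be sketched with a simplified model in Python. This is an illustration only, not how the scheduler is implemented: the function names are hypothetical, all values are in minutes, and the model assumes that a scheduled run is simply skipped while the previous run is still in progress.

```python
def summarized_windows(interval, runtime, window, horizon):
    """List the (earliest, latest) time range covered by each run that starts.

    A run scheduled while the previous run is still in progress is skipped.
    """
    windows = []
    busy_until = 0
    for start in range(0, horizon, interval):
        if start < busy_until:
            continue  # previous run has not finished yet
        windows.append((start - window, start))
        busy_until = start + runtime
    return windows


def gaps_and_overlaps(windows):
    """Compare consecutive windows and report uncovered and doubly covered spans."""
    gaps, overlaps = [], []
    for (_, prev_latest), (next_earliest, _) in zip(windows, windows[1:]):
        if next_earliest > prev_latest:
            gaps.append((prev_latest, next_earliest))
        elif next_earliest < prev_latest:
            overlaps.append((next_earliest, prev_latest))
    return gaps, overlaps


# Every 5 minutes, 7-minute run time, 5-minute time range: every other run is skipped.
slow = gaps_and_overlaps(summarized_windows(5, 7, 5, 20))
# Every 5 minutes, fast run, 10-minute time range: each window overlaps the last.
wide = gaps_and_overlaps(summarized_windows(5, 1, 10, 15))
```

In this model the slow report leaves the span from minute 0 to minute 5 unsummarized, and the wide time range summarizes each 5-minute span twice. Matching the time range to the schedule interval, with a run time shorter than the interval, produces neither.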
Example of a summary index configuration
This example shows a configuration for a summary index of Apache server statistics as it might appear in savedsearches.conf. The keys listed below enable summary indexing for the "Apache Method Summary" report.
Note: If you set action.summary_index=1, you don't need to have the collect command in the report's search string.
# name of the report = Apache Method Summary
[Apache Method Summary]
# sets the report to run at each interval
counttype = always
# enable the report schedule
enableSched = 1
# report interval in cron notation (this means "every 5 minutes")
schedule = */5 * * * *
# id of user for report
userid = jsmith
# search string for summary index
search = index=apache_raw startminutesago=30 endminutesago=25 | extract auto=false | stats count by method
# enable summary indexing
action.summary_index = 1
# name of summary index to which report results are added
action.summary_index._name = summary
# add these keys to each event
action.summary_index.report = "count by method"
Other configuration files affected by summary indexing
Caution: Do not edit settings in alert_actions.conf without explicit instructions from Splunk Technical Support.
This documentation applies to the following versions of Splunk Cloud Platform™: 8.0.2006, 8.0.2007, 8.1.2009, 8.1.2011, 8.1.2012, 8.1.2101, 8.1.2103, 8.2.2104, 8.2.2105, 8.2.2106, 8.2.2107 (latest FedRAMP release), 8.2.2109, 8.2.2111, 8.2.2112