Splunk® Enterprise

Knowledge Manager Manual

Configure summary indexes

For a general overview of summary indexing and instructions for setting up summary indexing through Splunk Web, see the topic "Use summary indexing for increased reporting efficiency" in the Knowledge Manager manual.

You can't manually configure a summary index for a saved report in savedsearches.conf until it is set up as a scheduled report that runs on a regular interval, triggers each time it is run, and has the Enable summary indexing alert option selected.

In addition, you need to enter the name of the summary index that the report will populate. You do this through the detail page for the report in Settings > Searches and Reports after selecting Enable summary indexing. The index named summary is the default summary index (the index that Splunk Enterprise uses if you do not specify another one).

If you plan to run a variety of summary index reports you may need to create additional summary indexes. For information about creating new indexes, see "Create custom indexes" in the Managing Indexers and Clusters manual. It's a good idea to create indexes that are dedicated to the collection of summary data.

Summary indexing volume is not counted against your license, even if you have several summary indexes. In the event of a license violation, summary indexing will halt like any other non-internal search behavior.

Note: If you enter the name of an index that does not exist, Splunk Enterprise runs the report on the schedule you've defined, but it does not save the report data to a summary index.

For more information about creating and managing reports, see "Create and edit reports" in this manual.

For more information about defining a report that can populate a summary index, see the subtopic on setting up summary index reports in Splunk Web in "Use summary indexing for increased reporting efficiency," in this manual.

Note: When you define the report that you'll use to build your index, most of the time you should use the summary indexing transforming commands in the report's search string. These commands are prefixed with "si-": sichart, sitimechart, sistats, sitop, and sirare. The reports you create with them should be versions of the report that you'll eventually use to query the completed summary index.

The summary index transforming commands automatically take into account the issues that are covered in "Considerations for summary index report definition" below, such as scheduling shorter time ranges for the populating report, and setting the populating report to take a larger sample. You only have to worry about these issues if the report you are using to build your index does not include summary index transforming commands.
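As a minimal sketch of this pairing, assume a hypothetical raw index named apache_raw with a method field, and a scheduled report that has summary indexing enabled and adds the field/value pair report=apache_method_summary to each event. All of these names are illustrative. The populating report might use sistats:

```
index=apache_raw | sistats count by method
```

The report that you later run over the summary index would then use the non-"si" version of the same command:

```
index=summary report=apache_method_summary | stats count by method
```

The two search strings differ only in the "si" prefix and in the index they search, which is why the populating report can be treated as a version of the final report.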

If you do not use the summary index transforming commands, you can use the addinfo and collect search commands to create a report that Splunk Enterprise saves and schedules, and which populates a pre-created summary index. For more information about that method, see "Manually populate the summary index" in this topic.

Customize summary indexing for a scheduled report

When you use Splunk Web to enable summary indexing for a scheduled report, Splunk Enterprise automatically generates a stanza in $SPLUNK_HOME/etc/system/local/savedsearches.conf. You can customize summary indexing for the report by editing this stanza.

If you've used Splunk Web to save and schedule a report, but haven't used Splunk Web to enable the summary index for the report, you can easily enable summary indexing for the report through savedsearches.conf as long as you have a new index for it to populate. For more information about manual index configuration, see the topic "About managing indexes" in the Managing Indexers and Clusters manual.

[<name>]
action.summary_index = 0 | 1
action.summary_index._name = <index>
action.summary_index.<field> = <value>
  • [<name>]: Splunk Enterprise names the stanza based on the name of the scheduled report that you enabled for summary indexing.
  • action.summary_index = 0 | 1: Set to 1 to enable summary indexing. Set to 0 to disable summary indexing.
  • action.summary_index._name = <index>: The name of the summary index populated by this report. If you've created a specific summary index for this report, enter its name in <index>. Defaults to summary, the summary index that is delivered with Splunk Enterprise.
  • action.summary_index.<field> = <value>: Specify a field/value pair to add to every event that gets summary indexed by this report. You can define multiple field/value pairs for a single summary index report.

This field/value pair acts as a "tag" of sorts that makes it easier for you to identify the events that go into the summary index when you are running reports against the greater population of event data. This key is optional but we recommend that you never set up a summary index without at least one field/value pair.

For example, add the name of the report that is populating the summary index (action.summary_index.report = summary_firewall_top_src_ip), or the name of the index that the report populates (action.summary_index.index = search).
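To illustrate why the tag is useful, the following sketch assumes that the first field/value pair above is set and that the report populates the default summary index. A search like this would then return only the events that this report contributed:

```
index=summary report=summary_firewall_top_src_ip
```

Any reporting commands that you append to this search operate on that report's summary events alone, even when other reports write to the same summary index.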

Search commands useful to summary indexing

Summary indexing uses a set of specialized search commands. You need to use these commands if you are manually creating your summary indexes without the help of the Splunk Web interface or the summary indexing transforming commands.

  • addinfo: Summary indexing uses addinfo to add fields containing general information about the current report to the report results going into a summary index. Add | addinfo to any report to see what results will look like if they are indexed into a summary index.
  • collect: Summary indexing uses collect to index report results into the summary index. Use | collect to index any report results into another index (using collect command options).
  • overlap: Use overlap to identify gaps and overlaps in a summary index. overlap finds events of the same query_id in a summary index with overlapping timestamp values or identifies periods of time where there are missing events.

Manually configure a report to populate a summary index

If you want to configure summary indexing without using the report options dialog in Splunk Web and the summary indexing transforming commands, you must first configure a summary index just like you would any other index via indexes.conf. For more information about manual index configuration, see the topic "About managing indexes" in the Managing Indexers and Clusters manual.

Important: You must restart Splunk Enterprise for changes in indexes.conf to take effect.

1. In Splunk Web, design the search string whose results you want to summarize.

  • Be sure to limit the time range of your report. The number of results that the report generates needs to fit within the maximum report result limits you have set for reporting.
  • Make sure to choose a time interval that works for your data, such as 10 minutes, 2 hours, or 1 day. (For more information about using Splunk Web to schedule report intervals, see the topic "Schedule reports" in the Reporting Manual.)

2. Use the addinfo search command. Append | addinfo to the end of the report's search string.

  • This command adds information about the report to events that the collect command requires in order to place them into a summary index.
  • You can always add | addinfo to any search string to preview what its results will look like in a summary index.

3. Add the collect search command to the report's search string. Append |collect index=<index_name> addtime=t marker="report_name=\"<summary_report_name>\"" to the end of the search string.

  • Replace index_name with the name of the summary index.
  • Replace summary_report_name with a key to find the results of this report in the index.
  • A summary_report_name *must* be set if you wish to use the overlap search command on the generated events.
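Put together, the three steps might produce a search string like the following. This is a sketch under stated assumptions: the firewall index, the action and src_ip fields, the summary_firewall summary index, and the hourly_top_src_ip report name are all hypothetical, and the time range is set inline with the earliest and latest time modifiers.

```
index=firewall action=blocked earliest=-1h@h latest=@h | top limit=100 src_ip | addinfo | collect index=summary_firewall addtime=t marker="report_name=\"hourly_top_src_ip\""
```

Because the marker adds report_name="hourly_top_src_ip" to each collected event, you could later find these results with a search such as:

```
index=summary_firewall report_name="hourly_top_src_ip"
```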

Note: For the general case we recommend that you use the provided summary_index alert action. Configuring via addinfo and collect requires some redundant steps that are not needed when you generate summary index events from scheduled reports. Manual configuration remains necessary when you backfill a summary index for time ranges that have already passed.

Considerations for summary index report definition

If for some reason you're going to set up a summary-index-populating report that does not use the summary indexing transforming commands, you should take a few moments to plan out your approach. With summary indexing, the egg comes before the chicken. Use the report that you actually want to run and review the results of to help define the report you use to populate the summary index.

Many summary-searching reports involve aggregated statistics--for example, a report where you are searching for the top 10 IP addresses associated with firewall offenses over the past day--when the main index accrues millions of events per day.

If you populate the summary index with the results of the same report that you run on the summary index, you'll likely get results that are statistically inaccurate. You should follow these rules when defining the report that populates your summary index to improve the accuracy of aggregated statistics generated from summary index reports.

Schedule a shorter time range for the populating report

The report that populates your summary index should be scheduled on a shorter (and therefore more frequent) interval than that of the report that you eventually run against the index. You should go for the smallest time range possible. For example, if you need to generate a daily "top" report, then the report populating the summary index should take its sample on an hourly basis.

Set the populating report to take a larger sample

The report populating the summary index should seek out a significantly larger sample than the report that you want to run on the summary index. So, for example, if you plan to search the summary index for the daily top 10 offending IP addresses, you would set up a report to populate the summary index with the hourly top 100 offending IP addresses.

This approach has two benefits--it ensures a higher amount of statistical accuracy for the top 10 report (due to the larger and more-frequently-taken overall sample) and it gives you a bit of wiggle room if you decide you'd rather report on the top 20 or 30 offending IP addresses.

The summary indexing transforming commands automatically take a sample that is larger than the report that you'll run to query the completed summary index, thus creating summary indexes with event data that is not incorrectly skewed. If you do not use those commands, you can use the head command to select a larger sample for the summary-index-populating report than the report that you run over the summary index. In other words, you would have | head 100 for the hourly summary-index-populating report, and | head 10 for the daily report over the completed summary index.
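As a sketch of the head approach, assume a hypothetical firewall index with action and src_ip fields, and an hourly populating report that tags its events with report=hourly_top_src_ip. None of these names come from a real deployment. The hourly populating report keeps the 100 most frequent source addresses:

```
index=firewall action=blocked | stats count by src_ip | sort - count | head 100
```

The daily report over the summary index then adds up the hourly counts and keeps the top 10:

```
index=summary report=hourly_top_src_ip | stats sum(count) as count by src_ip | sort - count | head 10
```

An address that ranks, say, 40th in several individual hours can still reach the daily top 10 once its hourly counts are summed. Keeping only the hourly top 10 would discard those addresses before the daily report ever sees them.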

Set up your report to get a weighted average

If your summary-index-populating report involves averages, and you are not using the summary indexing transforming commands, you need to set that report up to get a weighted average.

For example, say you want to build hourly, daily, or weekly reports of average response times. To do this, you'd generate the "daily average" by averaging the "hourly averages" together. Unfortunately, the daily average becomes skewed if there aren't the same number of events in each "hourly average". You can get the correct "daily average" by using a weighted average function.

The following expression calculates the daily average response time correctly with a weighted average by using the stats and eval commands in conjunction with the sum statistical aggregator. In this example, the eval command creates a daily_average field, which is the result of dividing the sum of the hourly response time sums by the sum of the hourly response time counts.

| stats sum(hourly_resp_time_sum) as resp_time_sum, sum(hourly_resp_time_count) as resp_time_count | eval daily_average= resp_time_sum/resp_time_count | .....
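For that expression to work, the hourly populating report has to store a sum and a count rather than an average. A minimal sketch of such a populating search follows. The web_raw index and the resp_time field are hypothetical names. The output field names match the ones used in the expression above.

```
index=web_raw | stats sum(resp_time) as hourly_resp_time_sum, count(resp_time) as hourly_resp_time_count
```

To see why the weighting matters, suppose one hour has 100 events with a response time sum of 200 seconds (an hourly average of 2.0), and another hour has 900 events with a sum of 3600 seconds (an hourly average of 4.0). Averaging the two hourly averages gives 3.0 seconds. The weighted calculation gives (200 + 3600) / (100 + 900) = 3.8 seconds, which is the true average over all 1000 events.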

Schedule the summary-index-populating report to avoid data gaps and overlaps

Along with the rules above, to minimize data gaps and overlaps you should also be sure to set appropriate intervals and delays in the schedules of reports you use to populate summary indexes.

Gaps in a summary index are periods of time when a summary index fails to index events. Gaps can occur if:

  • splunkd goes down.
  • the scheduled saved report (the one being summary indexed) takes too long to run and runs past the next scheduled run time. For example, if you schedule the report that populates the summary index to run every 5 minutes when that report typically takes around 7 minutes to run, you would have problems, because the report can't start a new run while the preceding run is still in progress.

Overlaps are events in a summary index (from the same report) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a saved report to be longer than the frequency of the schedule of the report, or if you manually run summary indexing using the collect command.
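As a sketch of a schedule that avoids both problems, consider a populating report that runs every 5 minutes and uses a time range like the one below. The apache_raw index and the method field are hypothetical. Each run covers exactly 5 minutes, the same length as the schedule interval, so consecutive runs neither overlap nor leave gaps. The window also ends 5 minutes in the past, which gives late-arriving events time to be indexed.

```
index=apache_raw earliest=-10m@m latest=-5m@m | stats count by method
```

To check an existing summary index for gaps and overlaps, you can pipe its events to the overlap command. This example assumes the populating report tags its events with the hypothetical field/value pair report=apache_method_summary:

```
index=summary report=apache_method_summary | overlap
```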

Example of a summary index configuration

This example shows a configuration for a summary index of Apache server statistics as it might appear in savedsearches.conf. The keys listed below enable summary indexing for the "Apache Method Summary" report.

Note: If you set action.summary_index = 1, you don't need to have the addinfo or collect commands in the report's search string.

# name of the report = Apache Method Summary
[Apache Method Summary]
# sets the report to run at each interval
counttype = always
# enable the report schedule
enableSched = 1
# report interval in cron notation (this means "every 5 minutes")
schedule = */5 * * * *
# id of user for report
userid = jsmith
# search string for summary index
search = index=apache_raw startminutesago=30 endminutesago=25 | extract auto=false | stats count by method
# enable summary indexing
action.summary_index = 1
# name of summary index to which report results are added
action.summary_index._name = summary
# add these keys to each event
action.summary_index.report = "count by method"

Other configuration files affected by summary indexing

In addition to the settings you configure in savedsearches.conf, there are also settings for summary indexing in indexes.conf and alert_actions.conf.

The indexes.conf file specifies index configuration for the summary index. The alert_actions.conf file controls the alert actions (including summary indexing) associated with reports.

Caution: Do not edit settings in alert_actions.conf without explicit instructions from Splunk Technical Support.


This documentation applies to the following versions of Splunk® Enterprise: 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.6.0