Best practices for summary indexing

This documentation does not apply to the most recent version of Splunk.

This topic contains guidelines and best practices for configuring and using summary indexing.


General guidelines for summary indexing

Note: Currently, indexing events in a summary index counts against your license volume. We recommend that you not index more events in your summary indexes than you really need. Consult Splunk support for specific information on license volume impact.


Use summary indexing to:


When using summary indexing:


Aggregated statistics

Be careful when building reports from aggregated statistics. Some aggregating statistical functions (such as distinct count, mode, and median) yield incorrect results when you apply them to statistics that have already been aggregated. Use one of Splunk's reporting commands to access statistical functions.


For example, suppose you want to build hourly, daily, and weekly reports of average response times. If you generate the "daily average" by averaging the "hourly averages" together, the daily average becomes skewed whenever the hours do not all contain the same number of events. Get the correct "daily average" by using a weighted average instead.


Example:


The following expression calculates the daily average response time correctly (as a weighted average) using stats and eval.


| stats sum(hourly_resp_time_sum) as resp_time_sum, sum(hourly_resp_time_count) as resp_time_count | eval daily_average = resp_time_sum/resp_time_count | .....
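
To see why the weighted form matters, here is a minimal Python sketch that runs outside of Splunk. The hourly figures are invented for illustration, and the variable names only mirror the field names used in the search above.

```python
# Hypothetical hourly summaries: (sum of response times, number of events).
# These numbers are made up purely to illustrate the skew.
hourly = [
    (100.0, 10),   # hour 1: hourly average is 10.0
    (900.0, 30),   # hour 2: hourly average is 30.0
]

# Naive approach: average the hourly averages, so every hour counts equally
# no matter how many events it contains.
hourly_averages = [s / c for s, c in hourly]
naive_daily_average = sum(hourly_averages) / len(hourly_averages)

# Weighted approach: total sum divided by total count, which is what the
# stats/eval search computes.
resp_time_sum = sum(s for s, _ in hourly)
resp_time_count = sum(c for _, c in hourly)
daily_average = resp_time_sum / resp_time_count

print(naive_daily_average)  # 20.0
print(daily_average)        # 25.0
```

The busier hour contributes three times as many events, so the true daily average (25.0) sits closer to that hour's average than the naive figure (20.0) suggests.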

Gaps and overlaps

Gaps

Gaps in a summary index are periods of time when a summary index fails to index events. Gaps can occur if:


Overlaps

Overlaps are events in a summary index (from the same search) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a saved search to be longer than the frequency of the schedule of the search, or you run summary indexing manually (using | collect).
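
The time-range case can be illustrated with a small Python sketch. It assumes, purely for illustration, that each scheduled run searches the most recent `time_range_minutes` of data; the function name and parameters are hypothetical and are not part of Splunk.

```python
def overlap_per_run(time_range_minutes, schedule_interval_minutes):
    """Return how many minutes of data each run re-summarizes.

    Assumes each run covers the most recent `time_range_minutes` and that
    runs start every `schedule_interval_minutes` (an illustrative model).
    """
    return max(0, time_range_minutes - schedule_interval_minutes)


# A search that runs every 60 minutes over the last 90 minutes
# summarizes 30 minutes of data twice.
print(overlap_per_run(90, 60))  # 30

# A search that runs every 60 minutes over the last 60 minutes
# does not overlap with the previous run.
print(overlap_per_run(60, 60))  # 0
```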


Identify gaps and overlaps in data

Identify overlaps and gaps in a summary index using the "Summary Index Gaps and Overlaps" form search (a default saved search in the main Splunk dashboard), or by using the overlap command in your search (add | overlap at the end of the search that produces overlaps).


If you run the form search Summary Index Gaps and Overlaps, specify the time range using the form, or switch to a "text" display where you must specify the following parameters in the search bar (following | overlap):


either specify:


or:


If you identify a gap, you can run your scheduled saved search over the period of the gap and summary index the results (using | collect). If you identify overlapping events, you can manually delete the overlaps from the summary index by using the search language.
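
When planning a backfill, it can help to reason about which time windows the summary already covers. The following Python sketch is not the overlap command and does not reproduce its behavior; it is a simplified illustration that assumes you have a list of (start, end) windows in epoch seconds, one per summary run. The function name and the sample windows are hypothetical.

```python
def find_gaps_and_overlaps(windows):
    """Return (gaps, overlaps) for a list of (start, end) time windows.

    `windows` holds one (start, end) pair per summary run, in epoch seconds.
    This is an illustrative model, not Splunk's own algorithm.
    """
    gaps, overlaps = [], []
    if not windows:
        return gaps, overlaps

    ordered = sorted(windows)
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_until:
            # Nothing was summarized between the previous coverage and this run.
            gaps.append((covered_until, start))
        elif start < covered_until:
            # This run re-summarizes time that was already covered.
            overlaps.append((start, min(end, covered_until)))
        covered_until = max(covered_until, end)
    return gaps, overlaps


# Hypothetical hourly runs: one hour is missing, and the last run
# starts 30 minutes before the previous one ended.
windows = [(0, 3600), (3600, 7200), (10800, 14400), (12600, 16200)]
gaps, overlaps = find_gaps_and_overlaps(windows)
print(gaps)      # [(7200, 10800)]
print(overlaps)  # [(12600, 14400)]
```

In this model, each gap is a period over which you would rerun the scheduled saved search, and each overlap is a period whose duplicate events you would remove from the summary index.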

This documentation applies to the following versions of Splunk: 3.3, 3.3.1, 3.3.2.

