By default, a Splunk search retrieves all events. However, in some situations you might want to retrieve a sample set of events instead of the entire event set. There are several reasons why you might want to use event sampling:
- To perform a quick search to ensure the correct events are being returned
- To determine the characteristics of a large data set without processing every event
- To test that the data selection, formatting, calculations, and other components of the search are working correctly
For most searches, event sampling can greatly increase search performance without decreasing functionality.
The event sampling ratio
The sampling ratio is the likelihood of any event being included in the sample result set. The formula for the ratio is 1 / <sample ratio value>.
For example, if the sample ratio value is 100, each event has a 1 in 100 chance of being included in the result set. The selection of each event is independent of the selection of all other events. It is possible that many of the first 100 events are included, or none at all.
If a search matches 1,000,000 events when sampling is not used, using a sample ratio value of 100 would result in returning approximately 10,000 events.
If you rerun a sampling search many times, the number of returned results follows a binomial distribution with n=1000000 and p=0.01. This distribution closely resembles a normal distribution with mean=10000 and standard deviation (stdev)=99.5.
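The arithmetic above can be checked with a short sketch. This is not Splunk code: it is a minimal Python model that assumes sampling behaves as described, with each event kept independently with probability 1 / ratio. The function names `sample_events`, `expected_count`, and `count_stdev` are hypothetical and exist only for illustration.

```python
import math
import random


def sample_events(events, ratio, rng):
    """Keep each event independently with probability 1/ratio (assumed model)."""
    p = 1.0 / ratio
    return [event for event in events if rng.random() < p]


def expected_count(n, ratio):
    """Mean of a binomial distribution with n trials and p = 1/ratio."""
    return n / ratio


def count_stdev(n, ratio):
    """Standard deviation of the same binomial distribution: sqrt(n * p * (1 - p))."""
    p = 1.0 / ratio
    return math.sqrt(n * p * (1.0 - p))


n, ratio = 1_000_000, 100
print(expected_count(n, ratio))          # 10000.0
print(round(count_stdev(n, ratio), 1))   # 99.5

# One simulated run; the count changes from run to run but stays near the mean.
rng = random.Random(0)
sampled = sample_events(range(n), ratio, rng)
print(len(sampled))
```

Because the standard deviation is small relative to the mean, repeated runs of the same sampled search should return counts within a few hundred of 10,000.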
In Splunk Web, the sampling ratio that you specify must be a positive integer that is greater than 1. To disable sampling in Splunk Web, set the ratio to 1.
Set the default sampling ratio
In Splunk Enterprise, set the default sampling ratio by editing the ui-prefs.conf file. The sampling ratio must be a positive integer.
In Splunk Cloud, to change the default sampling ratio, file a Support ticket.
How event sampling works
By default, event sampling is not active. When you run a search, every event that matches your criteria is returned. When you specify a ratio, sampling remains in effect for the active search window. Sampling also remains in effect when you save a search as a report or dashboard panel.
When you specify a ratio value, your value overrides the default value configured for your Splunk deployment and remains in effect until you change it.
If you open a new search window, event sampling is no longer active. However, the last custom ratio that you used appears in the Sampling drop-down.
Commands and functions to avoid with event sampling
Typically, searches that use the streamstats commands are not good candidates for sampling.
When you calculate statistics using a sample set of events, the statistical values will not be accurate. To determine the true statistical value, you must scale the value returned with event sampling, and scaling gives you only an approximation of the true value.
For example, suppose you create a report using this search with event sampling enabled:
... | stats sum(x)
Because you used event sampling, the returned value is not the complete sum of all of the events. It is only the sum of the sample set of events. If the sampling ratio is 100, the true sum is approximately 100 times the value returned by the search.
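The scaling step can be illustrated with a small simulation. This is a sketch under stated assumptions, not Splunk behavior: the field values are made-up random integers, sampling is modeled as independent selection with probability 1 / ratio, and `estimate_true_sum` is a hypothetical helper name.

```python
import random


def estimate_true_sum(sampled_sum, ratio):
    """Scale a sum computed over sampled events up to an estimate of the full sum."""
    return sampled_sum * ratio


rng = random.Random(42)
ratio = 100

# Hypothetical values of field x across all matching events.
values = [rng.randint(1, 10) for _ in range(200_000)]

# Model event sampling: keep each event with probability 1/ratio.
sample = [v for v in values if rng.random() < 1 / ratio]

true_sum = sum(values)
estimate = estimate_true_sum(sum(sample), ratio)
relative_error = abs(estimate - true_sum) / true_sum

print(true_sum, estimate, relative_error)
```

The estimate lands close to the true sum but does not match it exactly, which is why a scaled value should be treated as an approximation.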
Statistical calculations that fall into this situation are
Other statistics that are difficult to interpret when event sampling is used include:
Specify a sampling ratio
You activate event sampling for a search by specifying a sampling ratio.
1. In Splunk Web, below the Search bar, click No Event Sampling.
2. You can use one of the default ratios or specify a custom ratio.
   a. To use one of the default ratios, click the ratio in the Sampling drop-down.
   b. To specify a custom ratio, click Custom and type the ratio value. Then click Apply. The ratio value must be a positive integer greater than 1.
Event sampling indicators
Several indicators in the Search & Reporting app window show that event sampling is active. After you run a search, the Sampling drop-down appears in the event count line. The label for the Sampling drop-down specifies the ratio that is applied to the search. Additionally, if a sampling ratio is being used, the Jobs drop-down specifies the ratio that is applied to the search.
Event sampling with reports and dashboard panels
You can save a search that uses event sampling as a report or dashboard panel. Use the Save As drop-down to save the search.
When the search is saved as a report, the sampling ratio is used when the report is run.
When the search is saved as a dashboard panel, the panel is powered by an inline search. When the dashboard is refreshed, the sampling ratio that was saved with the inline search is used.
If you open a report and add the report to a dashboard panel, you can specify how the panel is powered. You can specify that the panel is powered by the inline search that the report is based on. Or you can specify that the panel is powered by the report itself.
- Panels powered by reports
- When you view the source for the panel in Simple XML, there is no indication if the report uses event sampling.
- Panels powered by inline searches
- When you view the source for the panel in Simple XML, if the underlying search uses event sampling, there is a <sampleRatio> entry. For example:
  <event>
    <title>sample events</title>
    <search>
      <query>buttercupgames</query>
      <earliest>@d</earliest>
      <latest>now</latest>
      <sampleRatio>500</sampleRatio>
    </search>
  </event>
- Accelerated reports
- You cannot accelerate reports that are based on event sampling searches. See "Accelerate reports" in the Reporting Manual.
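To check whether a panel powered by an inline search uses event sampling, you can look for the <sampleRatio> entry in the panel source. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. It assumes the panel source has the same shape as the example above, and `get_sample_ratio` is a hypothetical helper name, not part of any Splunk tool.

```python
import xml.etree.ElementTree as ET

# Panel source copied from the Simple XML example in this topic.
PANEL_XML = """
<event>
  <title>sample events</title>
  <search>
    <query>buttercupgames</query>
    <earliest>@d</earliest>
    <latest>now</latest>
    <sampleRatio>500</sampleRatio>
  </search>
</event>
"""


def get_sample_ratio(panel_xml):
    """Return the inline search's sampling ratio, or None if there is no <sampleRatio> entry."""
    root = ET.fromstring(panel_xml.strip())
    node = root.find("./search/sampleRatio")
    if node is None or node.text is None:
        return None
    return int(node.text.strip())


print(get_sample_ratio(PANEL_XML))  # 500
```

A result of None means only that the panel source has no <sampleRatio> entry. For a panel powered by a report, the source gives no indication either way, so this check cannot tell you whether the underlying report uses sampling.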
This documentation applies to the following versions of Splunk® Enterprise: 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.6.0, 6.6.1, 6.6.2, 6.6.3