Configure batch mode search
A search running in batch mode searches one bucket at a time in batches instead of searching through events over time. Transforming searches that qualify for batch mode processing can complete faster than they would otherwise.
Batch mode search also improves the reliability of long-running distributed searches, which can fail when an indexer goes down while the search is running. In this case, Splunk software attempts to complete the search by reconnecting to the missing peer or redistributing the search across the rest of the peers.
Batch mode search functionality is enabled by default. See "Configure batch mode search in limits.conf" in this topic for information about configuring or disabling batch mode search.
You can make your batch mode searches even faster by enabling batch mode search parallelization. Under batch mode search parallelization, two or more search pipelines are launched for a qualifying search, and they process the search results concurrently. See "Configure batch mode search parallelization" in this topic.
Requirements for batch mode search
Transforming searches that meet the following conditions can run in batch mode.
- The searches need to use generating commands like search, loadjob, datamodel, pivot, or dbinspect.
- The search can include transforming commands, like stats, chart, and so on. However, the search cannot include commands like localize and transaction.
- If the search is not distributed, it cannot use commands that require time-ordered events, like streamstats, head, and tail.
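The conditions above can be read as a checklist. The following Python sketch is only an illustration: the function name batch_mode_blockers and the two command sets are hypothetical, and the sets contain only the example commands named in this topic, not a complete list. An empty result does not prove that Splunk software will run the search in batch mode; it only means that none of the named blocking commands were found.

```python
# Hypothetical checklist based only on the example commands named in this topic.
# These sets are NOT exhaustive lists of Splunk commands.
EXCLUDED = {"localize", "transaction"}          # not allowed in batch mode
TIME_ORDERED = {"streamstats", "head", "tail"}  # need time-ordered events


def batch_mode_blockers(commands, distributed):
    """Return the named commands that would keep a search out of batch mode.

    commands: ordered list of command names used in the search.
    distributed: True if the search runs across search peers.
    """
    blockers = [c for c in commands if c in EXCLUDED]
    if not distributed:
        # Time-ordered commands only block searches that are not distributed.
        blockers += [c for c in commands if c in TIME_ORDERED]
    return blockers


print(batch_mode_blockers(["search", "stats"], distributed=False))
print(batch_mode_blockers(["search", "streamstats", "stats"], distributed=False))
```

The first call finds no blockers. The second reports streamstats, because the search is not distributed and streamstats requires time-ordered events.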
Confirm whether or not a search is running in batch mode by using the Search Job Inspector. Batch mode search is indicated by the boolean parameter isBatchModeSearch. See View search job properties in the Search Manual.
Configure batch mode search in limits.conf
If you have a Splunk Enterprise deployment (as opposed to Splunk Cloud Platform), you can configure batch mode search throughout the implementation by changing settings in the limits.conf configuration file, under the [search] stanza.
When you have several batch mode search threads running concurrently, they can become a memory usage burden. You can deal with this by disabling batch mode search for your entire implementation, or by limiting the number of events that a batch mode search thread can read at once from an index bucket.
[search]
allow_batch_mode = <bool>
batch_search_max_index_values = <int>
- allow_batch_mode defaults to true, meaning that batch mode search is enabled for qualifying transforming searches. Disable batch mode search by setting allow_batch_mode = false.
- When allow_batch_mode = true, use batch_search_max_index_values to limit the number of events read from the index file (bucket). These entries are small, approximately 72 bytes; however, batch mode is more efficient when it can read more entries at once. Defaults to 10000000 (or 10M).
For example, if your batch mode searches are causing you to run low on system memory, you can lower batch_search_max_index_values to 1000000 (1M) to decrease their memory usage. Setting this parameter to a smaller number can lead to slower search performance. You want to find a balance between efficient batch mode searching and system memory conservation.
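To get a feel for the trade-off, the following Python sketch multiplies the setting by the approximate entry size given in this topic. It is a back-of-the-envelope estimate only: the function name is hypothetical, the 72-byte figure is an approximation, and actual memory use of a search thread depends on much more than the index entries it reads.

```python
# Approximate size of one index entry, as stated in this topic.
BYTES_PER_ENTRY = 72


def estimated_batch_read_bytes(max_index_values):
    """Rough size in bytes of the index entries one thread reads at once."""
    return max_index_values * BYTES_PER_ENTRY


for setting in (10_000_000, 1_000_000):
    megabytes = estimated_batch_read_bytes(setting) / 1_000_000
    print(f"batch_search_max_index_values = {setting}: about {megabytes:.0f} MB")
```

Under these assumptions, the default of 10000000 corresponds to about 720 MB of index entries per read, and lowering the setting to 1000000 reduces that to about 72 MB.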
Set search peer retry period
Other limits.conf settings control how often batch mode retries a search peer after a failure, such as a connection error. The retry interval applies between the failure and the first retry, and between successive retries if further failures occur.
[search]
batch_retry_min_interval = <int>
batch_retry_max_interval = <int>
batch_retry_scaling = <double>
batch_wait_after_end = <int>
- Use the batch_retry_min_interval and batch_retry_max_interval parameters to specify the minimum and maximum interval (in seconds) to wait before batch mode attempts to retry the search on a failed peer. The minimum interval defaults to 5 seconds. The maximum interval defaults to 300 seconds.
- After a retry attempt fails, the time to wait before another retry increases by a scaling factor, batch_retry_scaling, which takes a value greater than 1.0. Defaults to 1.5.
- Batch mode considers the search complete when all peers have indicated without failure that they have delivered the full answer. If the search finishes, but one or more of the peers has failed, batch mode retries connection with the failed peer(s) for the number of seconds specified by batch_wait_after_end. If batch mode cannot reconnect within this period of time, it declares the search results to be incomplete. Defaults to 900 seconds.
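The way the interval settings interact can be sketched in Python. This is an illustration under a stated assumption, not a description of the Splunk implementation: it assumes each wait is the previous wait multiplied by batch_retry_scaling, starting at the minimum interval and capped at the maximum interval. The function name retry_intervals and its attempts parameter are hypothetical.

```python
def retry_intervals(min_interval=5.0, max_interval=300.0, scaling=1.5, attempts=6):
    """Return the assumed wait, in seconds, before each successive retry.

    Defaults mirror the documented defaults for batch_retry_min_interval,
    batch_retry_max_interval, and batch_retry_scaling.
    """
    if scaling <= 1.0:
        raise ValueError("scaling must be greater than 1.0")
    waits = []
    wait = min_interval
    for _ in range(attempts):
        # Assumption: the wait never exceeds the maximum interval.
        waits.append(min(wait, max_interval))
        wait *= scaling
    return waits


print(retry_intervals(attempts=4))  # [5.0, 7.5, 11.25, 16.875]
```

With the default values and this assumed schedule, the waits grow slowly at first and reach the 300 second cap on the twelfth retry.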
Search peer restart for batch mode search
Batch mode handles a search peer restart differently depending on whether the peer is clustered or not.
- If the search peer is clustered, batch mode waits for the cluster master to spawn a new generation.
- If the search peer is not clustered and connection to it is lost, batch mode attempts to reconnect to it, following the retry period parameters described above. When batch mode reestablishes connection to the search peer, it resumes the batch mode search until the search completes.
Configure batch mode search parallelization
You can optionally take advantage of batch mode search parallelization to make your batch mode searches even more efficient. When you enable batch mode search parallelization, two or more search pipelines for batch search run concurrently to read from index buckets and process events. This approach improves the speed and efficiency of your batch mode searches, but at the expense of increased system memory consumption.
You can enable and configure batch mode search parallelization with an additional set of limits.conf parameters. This is an indexer-side setting. It needs to be configured on all of your indexers, not your search head(s).
[search]
batch_search_max_pipeline = <int>
batch_search_max_results_aggregator_queue_size = <int>
batch_search_max_serialized_results_queue_size = <int>
- Use batch_search_max_pipeline to set the number of batch mode search pipelines launched when you run a search that qualifies for batch mode. This parameter has a default value of 1. Set it to 2 or higher to parallelize batch mode searches throughout your Splunk deployment. A higher setting improves search performance at the cost of increasing thread usage and memory consumption.
- The batch_search_max_results_aggregator_queue_size parameter controls the size of the results queue. The results queue is where the search pipelines leave processed search results. Its default size is 100MB. Never set it to zero.
- The batch_search_max_serialized_results_queue_size parameter controls the size of the serialized results queue, from which the batch search process transmits serialized search results. Its default size is 100MB. Never set it to zero.
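Before you copy a stanza to every indexer, a quick sanity check of the values can help. The following Python sketch uses the standard configparser module to read a [search] stanza and flag the two conditions described above: a pipeline count below 2, and a queue size of zero. The function name and the warning text are hypothetical, and configparser is only an approximation of how Splunk software parses limits.conf, so treat this as a rough check rather than a validator.

```python
import configparser

QUEUE_KEYS = (
    "batch_search_max_results_aggregator_queue_size",
    "batch_search_max_serialized_results_queue_size",
)


def parallelization_warnings(conf_text):
    """Return warnings for the [search] stanza in the given limits.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("search"):
        return ["no [search] stanza found"]
    search = parser["search"]
    warnings = []
    # A missing setting falls back to the documented default of 1.
    if search.getint("batch_search_max_pipeline", fallback=1) < 2:
        warnings.append("batch_search_max_pipeline is below 2: no parallelization")
    for key in QUEUE_KEYS:
        if key in search and search.getint(key) == 0:
            warnings.append(f"{key} must not be zero")
    return warnings


sample = "[search]\nbatch_search_max_pipeline = 2\n"
print(parallelization_warnings(sample))  # []
```

The sample stanza sets only batch_search_max_pipeline, so the queue sizes keep their defaults and the check returns no warnings.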
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408