Configure parallel reduce search processing
To enable parallel reduce search processing for your deployment, you need to configure your indexers to work as intermediate reducers and determine how your deployment should distribute the parallel reduction workload across your indexers.
See Overview of parallel reduce search processing for an overview of parallel reduce search processing and a list of prerequisites.
Tasks for configuring parallel reduce search processing
The following table describes common tasks for configuring parallel reduce search processing.
Task | Description | For more information |
---|---|---|
Review parallel reduce prerequisites | Make sure that your environment is set up properly to support parallel reduce processing. | See Overview of parallel reduce search processing for a list of prerequisites to follow before you configure parallel reduce processing in your environment. |
Configure your indexers to work as intermediate reducers. | Update your indexer configurations by making configuration file updates. | See Configure your indexers to work as intermediate reducers. |
Determine how your parallel reduction workload is distributed. | Consider whether you need to fine-tune configuration settings for load balancing parallel reduce search processing across intermediate reducers and indexers. | See Determine how your parallel reduction workload is distributed. |
Turn on parallel reduce processing. | Configure settings in the limits.conf file to turn on and start using parallel reduce processing in your environment. | See Turn on parallel reduce processing. |
Configure your indexers to work as intermediate reducers
To update your indexer configurations, you must have access to the server.conf file for your Splunk deployment, located in $SPLUNK_HOME/etc/system/local/. See About configuration files and the topics that follow it in the Admin Manual for more information about making configuration file updates.
Parallel reduce search processing is not site-aware. Do not add this configuration to your indexers if they are in a multisite indexer cluster or if they are non-clustered and spread across several sites.
Determine how your parallel reduction workload is distributed
Settings in the [parallelreduce] stanza of limits.conf determine the number of intermediate reducers that are selected from your indexers for a parallel reduce search process. They also determine how parallel reduce search processing work is distributed across your indexers.
For example, if you keep the default parallel reduce settings in limits.conf, the Splunk platform randomly selects a certain number of intermediate reducers each time you run a parallel reduce search. If all of your indexers are in a single-site indexer cluster, the random selection aids in distributing the parallel reduction workload across the cluster.
However, if your indexers are not clustered, and some of your indexers have large indexing loads on average while others do not, you can use the reducers setting to configure the low-load indexers to be dedicated intermediate reducers. Dedicated intermediate reducers are always used when you run a parallel reduce search process.
These two methods are mutually exclusive. When you set up dedicated intermediate reducers, the Splunk platform cannot randomly select intermediate reducers.
To configure parallel reduce search processing, you must have access to the limits.conf file for your Splunk deployment, located in $SPLUNK_HOME/etc/system/local/. See About configuration files and the topics that follow it in the Admin Manual for more information about making configuration file updates.
Enable random selection of intermediate reducers
Random selection of indexers for intermediate reduction service is ideal if you are running a single-site indexer cluster. If you run several parallel reduce searches concurrently, the random selection ensures that the intermediate reduction work is evenly distributed across the cluster.
The default parallel reduce search processing settings enable the Splunk platform to randomly select intermediate reducers from the larger set of indexers when you run parallel reduce searches. The default number of indexers that the Splunk platform repurposes as intermediate reducers during the intermediate reduce phase of the parallel reduce search process is 50% of the total number of indexers in your indexer pool, up to a maximum of 20 indexers.
Random intermediate reducer selection is determined by the maxReducersPerPhase and winningRate settings. They belong to the [parallelreduce] stanza of limits.conf.
Setting name | Definition | Default value |
---|---|---|
maxReducersPerPhase | The maximum number of indexers that can be used as intermediate reducers in the intermediate reduce phase of a parallel reduce search. | 20 |
winningRate | The percentage of indexers that can be selected from the total pool of indexers and used as intermediate reducers in a parallel reduce search process. This setting applies only when the reducers setting is not configured in limits.conf. See Enable dedicated intermediate reducers. | 50 |
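As a minimal sketch, the defaults in the table above would correspond to the following stanza in a local limits.conf file. You do not need to add these lines to get the default behavior; they are shown only to illustrate where the settings live.

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf
[parallelreduce]
# Upper bound on intermediate reducers per reduce phase (default shown).
maxReducersPerPhase = 20
# Percentage of the indexer pool eligible for random selection (default shown).
winningRate = 50
```

With these values, a pool of 10 indexers would yield about 5 intermediate reducers, while a pool of 100 indexers would be capped at 20.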
Enable dedicated intermediate reducers
To configure a set of non-clustered indexers as dedicated intermediate reducers, add the reducers setting to the [parallelreduce] stanza in limits.conf.
The value of reducers is a comma-separated list of indexers that you have configured as search peers. Identify each indexer by specifying its host and port using the following format: <host>:<port>. For example:
reducers=docteam-unix-4:8089, docteam-unix-5:8089, docteam-unix-6:8089
Do not include clustered indexers on the reducers list.
All indexers in the reducers list are used as intermediate reducers when you run a parallel reduce search. If the number of indexers in the reducers list exceeds the value of the maxReducersPerPhase setting, the Splunk platform randomly selects the intermediate reducers from the reducers list. For example, if the reducers setting lists five reducers and maxReducersPerPhase=4, the Splunk platform randomly selects four intermediate reducers from the list.
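The five-reducer example above can be sketched as a complete stanza. The host names below are hypothetical placeholders that extend the naming in the earlier example; substitute the host and port of your own non-clustered search peers.

```ini
[parallelreduce]
# Five dedicated reducers (hypothetical hosts), each identified as <host>:<port>.
reducers = docteam-unix-4:8089, docteam-unix-5:8089, docteam-unix-6:8089, docteam-unix-7:8089, docteam-unix-8:8089
# The list is longer than this limit, so four of the five are chosen at random per search.
maxReducersPerPhase = 4
```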
If all of the indexers in the reducers list are down or are otherwise invalid, searches are run without parallel reduction. All reduce operations are processed on the search head.
When you configure the reducers setting for your deployment, the Splunk platform ceases to apply the winningRate setting.
Set a timeout to connect indexers and intermediate reducers
Use the rdinPairingTimeout setting in the [parallelreduce] stanza of the limits.conf configuration file to give the indexers and the intermediate reducers adequate time to connect with each other when you run a parallel reduce search.
The default timeout for pairing the indexers with the intermediate reducers is 30 seconds.
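A minimal sketch of raising the timeout, assuming a deployment where pairing needs more than the default 30 seconds. The value 60 is an illustrative assumption, not a recommendation from this topic.

```ini
[parallelreduce]
# Seconds allowed for indexers and intermediate reducers to pair (default: 30).
rdinPairingTimeout = 60
```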
Override the number of intermediate reducers for a specific search
Some complex deployments may need to override the number of reducers that are used to run parallel reduce searches. In these advanced use cases, you can use the redistribute command with the num_of_reducers argument to override the number of reducers. For more information, see redistribute in the Splunk Enterprise Search Reference.
Turn on parallel reduce processing
There are two settings in the limits.conf file that control parallel reduce search processing.
- To enable parallel reduce on saved searches, set autoAppliedPercentage.
- To enable parallel reduce on ad hoc searches, set autoAppliedPercentage and autoAppliedToAdhocSearches.
To turn on parallel reduce search processing in your environment, follow these steps.
Prerequisites
- Only users with file system access, such as system administrators, can edit configuration files.
- Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
Never change or copy the configuration files in the default directory. The files in the default directory must remain intact and in their original location. Make changes to the files in the local directory.
Steps
- Open or create a local limits.conf file at $SPLUNK_HOME/etc/system/local.
- In the [parallelreduce] stanza, add the line autoAppliedPercentage = 100. You can adjust the percentage of saved searches that use parallel reduce processing, but 100% works best for most deployments. All ad hoc searches continue to run without parallel reduce.
- To set all ad hoc searches to run with parallel reduce, in the [parallelreduce] stanza, add the line autoAppliedToAdhocSearches = true.
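Taken together, the steps above would produce a stanza along these lines. This is a sketch that assumes you want parallel reduce applied to all saved searches and all ad hoc searches; omit the second setting to leave ad hoc searches unchanged.

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf
[parallelreduce]
# Apply parallel reduce to 100% of saved searches.
autoAppliedPercentage = 100
# Also apply parallel reduce to ad hoc searches.
autoAppliedToAdhocSearches = true
```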
It is not necessary to restart the Splunk platform. However, it might take some time for parallel reduce processing to take effect, depending on the search_process_mode setting in the limits.conf.in file.
This documentation applies to the following versions of Splunk® Enterprise: 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10