About distributed search
Before reading this manual, see the Distributed Deployment Manual. That manual describes the fundamentals of Splunk Enterprise distributed deployment and shows how distributed search contributes to the overall deployment.
Distributed search provides a way to scale your deployment by separating the search management and presentation layer from the indexing and search retrieval layer.
These are some of the key use cases for distributed search:
- Horizontal scaling for enhanced performance. Distributed search facilitates horizontal scaling by providing a way to distribute the indexing and searching loads across multiple Splunk Enterprise instances, making it possible to index and search large quantities of data.
- Access control. You can use distributed search to control access to indexed data. For example, some users, such as security personnel, might need access to data across the enterprise, while others need access to data only in their functional area.
- Managing geo-dispersed data. Distributed search allows local offices to access their own data, while maintaining centralized access at the corporate level. For example, users in Chicago and San Francisco can look just at their local data, while users at headquarters in New York can search the local data, as well as the data in Chicago and San Francisco.
Distributed search components
With distributed search, a Splunk Enterprise instance called a search head sends search requests to a group of indexers, or search peers, which perform the actual searches on their indexes. The search head then merges the results and presents them to the user. In a basic distributed search scenario, one search head manages searches across several indexers.
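The request flow described above can be sketched in a few lines of Python. This is a conceptual illustration only, not Splunk Enterprise code: the `Event`, `SearchPeer`, and `SearchHead` classes and the substring match are hypothetical stand-ins chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds, for ordering only
    host: str
    message: str


class SearchPeer:
    """Hypothetical stand-in for an indexer that searches its own local index."""

    def __init__(self, name, events):
        self.name = name
        self._events = list(events)

    def search(self, term):
        # Each peer scans only the data that it indexed itself.
        return [e for e in self._events if term in e.message]


class SearchHead:
    """Hypothetical stand-in for a search head. It holds no indexed data."""

    def __init__(self, peers):
        self.peers = list(peers)

    def search(self, term):
        # Send the same request to every peer in parallel.
        with ThreadPoolExecutor(max_workers=len(self.peers)) as pool:
            partial_results = list(pool.map(lambda p: p.search(term), self.peers))
        # Merge the per-peer results into a single newest-first list.
        merged = [e for part in partial_results for e in part]
        return sorted(merged, key=lambda e: e.timestamp, reverse=True)


peers = [
    SearchPeer("idx1", [Event(100, "web1", "error: timeout"), Event(130, "web1", "ok")]),
    SearchPeer("idx2", [Event(120, "web2", "error: disk full")]),
    SearchPeer("idx3", [Event(110, "db1", "ok")]),
]
head = SearchHead(peers)
results = head.search("error")
print([(e.timestamp, e.host) for e in results])  # [(120, 'web2'), (100, 'web1')]
```

The point of the sketch is the division of labor: the peers do the scanning, and the search head only fans out the request and merges what comes back.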
Types of distributed search
There are several basic options for deploying a distributed search environment:
- Use one or more independent search heads to search across the search peers.
- Deploy multiple search heads in a search head cluster. The search heads in the cluster share resources, configurations, and jobs. This offers a way to scale your deployment transparently to your users.
- Deploy search heads as part of an indexer cluster. Among other advantages, an indexer cluster promotes data availability and data recovery. The search heads in an indexer cluster can be either independent search heads or members of a search head cluster.
In each case, the search heads perform only the search management and presentation functions. They connect to search peers that index data and search across the indexed data.
Independent search heads
A small distributed search deployment has one independent search head; that is, a search head that is not part of a cluster.
To scale beyond a single search head, deploy a search head cluster.
Search head clusters
A search head cluster is a group of search heads that work together to provide scalability and high availability. It serves as a central resource for searching across a set of search peers.
The search heads in a cluster are, for most purposes, interchangeable. All search heads have access to the same set of search peers. They can also run or access the same searches, dashboards, knowledge objects, and so on.
A search head cluster is the recommended topology when you need to run multiple search heads across the same set of search peers. The cluster coordinates the activity of the search heads, allocates jobs based on the current loads, and ensures that all the search heads have access to the same set of knowledge objects.
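To make the idea of load-based job allocation concrete, here is a minimal sketch that assigns each new job to the cluster member with the fewest running jobs. The least-loaded rule, the `ClusterMember` class, and the `allocate` function are assumptions made for illustration. The actual scheduling logic of a search head cluster is not described in this topic and is likely more involved.

```python
class ClusterMember:
    """Hypothetical stand-in for one search head in a cluster."""

    def __init__(self, name):
        self.name = name
        self.running_jobs = []


def allocate(members, job):
    """Assign a job to the member with the fewest running jobs.

    Ties are broken by member name so that the result is deterministic.
    """
    target = min(members, key=lambda m: (len(m.running_jobs), m.name))
    target.running_jobs.append(job)
    return target.name


members = [ClusterMember("sh1"), ClusterMember("sh2"), ClusterMember("sh3")]
members[0].running_jobs.extend(["existing_a", "existing_b"])  # sh1 is busiest
members[1].running_jobs.append("existing_c")

assignments = [allocate(members, f"scheduled_search_{i}") for i in range(4)]
print(assignments)  # ['sh3', 'sh2', 'sh3', 'sh1']
```

Because every member can reach the same search peers and knowledge objects, it does not matter to the user which member ends up running a given job.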
Indexer clusters and search heads
Indexer clusters also use search heads to search across the set of indexers, or peer nodes. The search heads in an indexer cluster can be either independent search heads or members of a search head cluster.
You deploy and configure search heads very differently when they are part of an indexer cluster:
- For information on using independent search heads with indexer clusters, see "Configure the search head" in the Managing Indexers and Clusters of Indexers manual.
- For information on using search head clusters with indexer clusters, see "Integrate the search head cluster with an indexer cluster".
Parallel reduce search processing
If you struggle with extremely large high-cardinality searches, you might be able to apply parallel reduce processing to them to help them complete faster. You must have a distributed search environment to use parallel reduce search processing.
High-cardinality searches are searches that must match, filter, and aggregate fields with extremely large numbers of unique values. During a parallel reduce search process, some or all of a high-cardinality search job is processed in parallel by indexers that have been configured to behave as intermediate reducers for the purposes of the search. This parallelization of reduction work that otherwise would be done entirely by the search head can result in faster completion times for high-cardinality searches.
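The following sketch illustrates the general shape of this kind of processing for a count-by-field aggregation. It is a conceptual model under stated assumptions, not Splunk's implementation: the function names, the CRC32-based partitioning, and the sample user IDs are all hypothetical.

```python
import zlib
from collections import Counter


def partition_for(key, num_reducers):
    # Stable hash, so a given field value always lands on the same reducer.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def parallel_reduce(peer_events, num_reducers):
    """Count field values across peers, using intermediate reducers."""
    reducer_inputs = [[] for _ in range(num_reducers)]

    # Step 1: each peer computes partial counts over its own data, then
    # splits those partial counts by key across the intermediate reducers.
    for events in peer_events:
        partial = Counter(events)
        buckets = [Counter() for _ in range(num_reducers)]
        for key, count in partial.items():
            buckets[partition_for(key, num_reducers)][key] = count
        for i, bucket in enumerate(buckets):
            reducer_inputs[i].append(bucket)

    # Step 2: each intermediate reducer merges only its own slice of keys.
    # The slices are independent, so this work could run in parallel.
    reduced = []
    for inputs in reducer_inputs:
        total = Counter()
        for partial in inputs:
            total.update(partial)  # adds counts for matching keys
        reduced.append(total)

    # Step 3: the search head only combines disjoint, already-reduced slices.
    final = {}
    for part in reduced:
        final.update(part)
    return final, reduced


# Hypothetical user IDs seen by three peers.
peer_events = [["u1", "u2", "u1"], ["u2", "u3"], ["u1", "u4"]]
final, reduced = parallel_reduce(peer_events, num_reducers=2)
print(sorted(final.items()))  # [('u1', 3), ('u2', 2), ('u3', 1), ('u4', 1)]
```

Without the intermediate step, the search head would have to merge every partial count from every peer by itself, which becomes the bottleneck when the field has a very large number of unique values.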
If you want to take advantage of parallel reduce search processing, your indexers should be operating with a light to medium load on average. You can use parallel reduce search processing whether or not your indexers are clustered.
This documentation applies to the following versions of Splunk® Enterprise: 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5