How clustered search works
In a single-site cluster, the search head performs searches across the entire set of peers.
In a multisite cluster, you can implement search affinity. With search affinity, searches normally run across only the peers on the same site as the search head. This improves network efficiency without reducing access to the full set of cluster data.
Under rare circumstances, described later, you might want to initiate a search on a single peer.
Search across a single-site cluster
Clustered search works much like distributed search in a non-clustered environment. The main difference is that the search head gets its list of search peers, along with a generation ID, from the master node. After that, it communicates directly with the peers.
Note: In a clustered search, the search peers are the set of cluster peers that are currently registered with the master (in other words, peers that are up and running and participating in the cluster).
When the search head initiates a search:
1. The search head contacts the master node.
2. The master node gives the search head the current generation ID and a list of the peers in that generation (that is, the peers that are currently registered with the master).
3. The search head communicates with the search peers in the same way as in a non-clustered distributed search, providing the peers with exactly the same information (search request and replication bundle), except that it also gives the search peers the generation ID.
4. The search peers use the generation ID to identify which of their bucket copies, if any, are primary for the generation and thus need to participate in the search. As in any other search, the peers also use the search's time range to determine whether to search a particular bucket.
5. The search peers search their primary copies of buckets and send the results back to the search head, which consolidates the results.
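The five steps above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not Splunk code or a Splunk API: the class and function names (`BucketCopy`, `Peer`, `Master`, `run_search`), the integer timestamps, and the way primacy is stored per copy are all hypothetical, chosen to show how a generation ID and a time range together decide which bucket copies take part in a search.

```python
from dataclasses import dataclass


@dataclass
class BucketCopy:
    bucket_id: str
    earliest: int    # earliest event time in the bucket (hypothetical integer clock)
    latest: int      # latest event time in the bucket
    primary_in: set  # generation IDs in which this copy is primary
    events: list     # (timestamp, text) pairs


@dataclass
class Peer:
    name: str
    copies: list

    def search(self, generation_id, start, end):
        """Step 4-5: search only primary copies whose time span overlaps the search."""
        hits = []
        for copy in self.copies:
            if generation_id not in copy.primary_in:
                continue  # not primary for this generation, so it does not participate
            if copy.latest < start or copy.earliest > end:
                continue  # bucket lies entirely outside the search's time range
            hits.extend(text for ts, text in copy.events if start <= ts <= end)
        return hits


@dataclass
class Master:
    generation_id: int
    registered_peers: list

    def current_generation(self):
        """Step 2: hand out the generation ID and the peers in that generation."""
        return self.generation_id, list(self.registered_peers)


def run_search(master, start, end):
    """Steps 1, 3, and 5 from the search head's point of view."""
    generation_id, peers = master.current_generation()
    results = []
    for peer in peers:
        # The generation ID travels with the search request to every peer.
        results.extend(peer.search(generation_id, start, end))
    return results  # consolidated on the search head


# Bucket b1 has two copies; only the copy on peer_a is primary in generation 7.
peer_a = Peer("peer_a", [BucketCopy("b1", 100, 200, {7}, [(150, "event-1")])])
peer_b = Peer("peer_b", [
    BucketCopy("b1", 100, 200, set(), [(150, "event-1")]),
    BucketCopy("b2", 300, 400, {7}, [(350, "event-2")]),
])
master = Master(generation_id=7, registered_peers=[peer_a, peer_b])

print(run_search(master, 0, 250))
```

In this model, the non-primary copy of `b1` on `peer_b` is skipped, so each event is returned only once, and `b2` is skipped for the narrower time range because it falls outside it.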
As with a non-clustered distributed search, you can have multiple search heads, and the search heads can function as a single unit, known as a search head pool. You can also use mounted bundles, just as in non-clustered distributed search, to reduce the amount of data that gets passed from the search head to the peers.
For details on these and other available features of distributed search, read the Distributed Search manual, starting with "About distributed search". Also, read "Configure the search head" in this manual to learn about a few configuration differences between clustered and non-clustered search heads.
Search locally in a multisite cluster
In a multisite cluster, you typically put search heads on each site. This allows you to take advantage of search affinity. With search affinity, searches normally run across only the peers on the same site as the requesting search head.
To implement search affinity, you must configure the site_search_factor so that each site has at least one searchable copy of the data. For more information on how to set up search affinity, see "Implement multisite search affinity".
Once a site has been configured for search affinity, the actual search process works the same as for single-site clusters. The search head distributes the current generation ID, along with the search and replication bundle, to all peers across the entire cluster. The local peers, however, are the only ones to respond. They search their primary buckets and return results to the search head, using the generation ID to determine which of their bucket copies are primary.
Note: Hot bucket data is replicated in blocks, as described in "How clustered indexing works". If a local search involves a replicated hot bucket copy, where the origin copy is on a different site, there might be a time lag while the local peer waits to get the latest block of hot data from the originating peer. During this time, the search does not return the latest data.
If some peers on the local site are down and the site therefore does not have a full complement of primaries, remote peers will participate in the search, providing results from any primaries missing from the site. In that case, the search does not adhere to search affinity, in order to maintain access to the full set of data. Once the site returns to a valid state, subsequent searches again adhere to search affinity.
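The fallback behavior described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not how Splunk implements it: in a real cluster the master's primary assignments and the generation ID drive this decision, whereas the hypothetical `plan_search` function here simply prefers a local searchable copy for each bucket and falls back to a remote one when no local peer holding that bucket is up. The site and peer names are made up for the example.

```python
def plan_search(search_head_site, bucket_copies, peers_up):
    """Choose one peer per bucket, preferring peers on the search head's site.

    bucket_copies: {bucket_id: [(site, peer_name), ...]} searchable copies
    peers_up:      set of peer names that are currently up
    """
    plan = {}
    for bucket_id, copies in bucket_copies.items():
        available = [(site, peer) for site, peer in copies if peer in peers_up]
        local = [peer for site, peer in available if site == search_head_site]
        remote = [peer for site, peer in available if site != search_head_site]
        if local:
            plan[bucket_id] = local[0]   # search affinity holds for this bucket
        elif remote:
            plan[bucket_id] = remote[0]  # fall back to keep the data reachable
    return plan


# Hypothetical two-site layout with one searchable copy of each bucket per site.
bucket_copies = {
    "b1": [("site1", "s1_peer1"), ("site2", "s2_peer1")],
    "b2": [("site1", "s1_peer2"), ("site2", "s2_peer1")],
}

all_up = {"s1_peer1", "s1_peer2", "s2_peer1"}
print(plan_search("site1", bucket_copies, all_up))

# With s1_peer2 down, bucket b2 is served by the remote peer instead.
print(plan_search("site1", bucket_copies, all_up - {"s1_peer2"}))
```

When every local peer is up, the plan contains only site1 peers. When one local peer is down, only the buckets that it held move to a remote peer, and the rest of the search stays local.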
Search a single peer
For debugging purposes, you might occasionally need to search a single peer node. You do this by initiating the search directly on the peer, in the usual manner. The search accesses any searchable data on that peer. It does not have access to unsearchable copies of data on the peer or to searchable copies of data on other peers.
Note: Keep in mind that there is no way to configure exactly what data will be searchable on any individual peer. However, at a minimum, all data that has entered the cluster through the peer should be searchable on that peer.
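As a rough illustration of what a search run directly on one peer can see, the Python sketch below filters a hypothetical list of bucket copies. The tuple layout and the `searchable_on_peer` function are assumptions made for this example only; they do not correspond to any Splunk interface.

```python
def searchable_on_peer(peer_name, copies):
    """Return the bucket IDs that a search run directly on peer_name can access.

    copies: list of (bucket_id, peer_name, is_searchable) tuples
    """
    return sorted(
        bucket_id
        for bucket_id, holder, is_searchable in copies
        if holder == peer_name and is_searchable
    )


copies = [
    ("b1", "peer_a", True),   # searchable copy on peer_a
    ("b1", "peer_b", False),  # unsearchable copy of the same bucket on peer_b
    ("b2", "peer_b", True),   # searchable copy that exists only on peer_b
    ("b3", "peer_a", False),  # unsearchable copy on peer_a
]

print(searchable_on_peer("peer_a", copies))
```

A search initiated on `peer_a` reaches `b1` only: `b3` is present but unsearchable, and the searchable copy of `b2` lives on another peer.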
This documentation applies to the following versions of Splunk® Enterprise: 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14