Splunk® Enterprise

Managing Indexers and Clusters of Indexers

How search works in an indexer cluster

In a single-site indexer cluster, the search head performs searches across the entire set of peers.

With a multisite indexer cluster, you can implement search affinity. With search affinity, searches occur across peers on the same site as the search head. This improves network efficiency without reducing access to the full set of cluster data.

Under rare circumstances, described later, you might want to initiate a search on a single peer.

Search across a single-site cluster

Searching across an indexer cluster works in a way similar to how distributed search works with non-clustered indexers. The main difference is that the search head gets its list of search peers from the manager node. It also gets a generation ID from the manager. After that, it communicates directly with the peers.

Note: In an indexer cluster search, the search peers are the set of cluster peers that are currently registered with the manager (in other words, the peers that are up and running and participating in the cluster).

When the search head initiates a search:

1. The search head contacts the manager node.

2. The manager node gives the search head the current generation ID and a list of the peers in that generation (that is, the peers that are currently registered with the manager).

3. The search head communicates with the search peers in the same way as in a distributed search not involving an indexer cluster. It provides the peers with exactly the same information (search request and knowledge bundle), except that it also gives the search peers the generation ID.

4. The search peers use the generation ID to identify which of their bucket copies, if any, are primary for the generation and thus need to participate in the search. As in any other search, the peers also use the search's time range to determine whether to search a particular bucket.

5. The search peers search their primary copies of buckets and send the results back to the search head, which consolidates the results.
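The steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not Splunk's implementation: the class names (BucketCopy, Peer, Manager), the run_search function, and the representation of events as (timestamp, text) tuples are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BucketCopy:
    bucket_id: str
    earliest: int   # earliest event time held in the bucket
    latest: int     # latest event time held in the bucket
    events: list    # (timestamp, text) tuples; hypothetical event format
    # Generation IDs for which this copy is the primary copy.
    primary_in: set = field(default_factory=set)


@dataclass
class Peer:
    name: str
    copies: list

    def search(self, generation_id, start, end, term):
        hits = []
        for copy in self.copies:
            # Step 4: only copies that are primary for this generation participate.
            if generation_id not in copy.primary_in:
                continue
            # Step 4: skip buckets that fall entirely outside the time range.
            if copy.latest < start or copy.earliest > end:
                continue
            hits.extend(
                (ts, text) for ts, text in copy.events
                if start <= ts <= end and term in text
            )
        return hits


@dataclass
class Manager:
    generation_id: int
    registered_peers: list

    def current_generation(self):
        # Step 2: hand out the generation ID and the peers in that generation.
        return self.generation_id, list(self.registered_peers)


def run_search(manager, start, end, term):
    generation_id, peers = manager.current_generation()  # steps 1 and 2
    merged = []
    for peer in peers:                                    # steps 3 to 5
        merged.extend(peer.search(generation_id, start, end, term))
    return sorted(merged)                                 # consolidate results


events = [(10, "error disk full"), (20, "login ok"), (30, "error timeout")]
# Both peers hold a copy of bucketA, but only peer1's copy is primary in generation 7.
peer1 = Peer("peer1", [BucketCopy("bucketA", 10, 30, events, primary_in={7})])
peer2 = Peer("peer2", [BucketCopy("bucketA", 10, 30, events)])
manager = Manager(generation_id=7, registered_peers=[peer1, peer2])

results = run_search(manager, 0, 100, "error")
print(results)  # [(10, 'error disk full'), (30, 'error timeout')]
```

Because only one copy of each bucket is primary in a given generation, each event appears once in the consolidated results even though two peers hold the data.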

You can integrate the indexer cluster with a search head cluster, for search head scaling and high availability. See "Integrate the search head cluster with an indexer cluster" in the Distributed Search manual.

For details on these and other available features of distributed search, read the Distributed Search manual, starting with "About distributed search". Also, read "Configure the search head" in this manual to learn about a few configuration differences when dealing with a search head in an indexer cluster.

Search and a multisite cluster

The way that searches function in a multisite cluster depends on whether a search head is configured for search affinity.

Search locally in a multisite cluster

In a multisite cluster, you typically put search heads on each site. This allows you to take advantage of search affinity. With search affinity, searches normally return results only from peers on the same site as the requesting search head.

Search affinity is enabled by default for search heads on multisite clusters. However, you must perform a few steps to take advantage of it. Specifically, you must ensure that both the searchable data and the search heads are available locally. For information on how to set up search affinity, see "Implement search affinity in a multisite indexer cluster".

Once a site has been configured for search affinity, the actual search process works the same as for single-site clusters. The search head distributes the current generation ID, along with the search and knowledge bundle, to all peers across the entire cluster. If the cluster is in a valid state, however, only the local peers respond. They use the generation ID to determine which of their bucket copies are primary, search those primary copies, and return results to the search head.

If the cluster is not in a valid state and the local site does not have a full complement of primaries (typically, because some peers on the site are down), remote peers also participate in the search, providing results from any primaries missing from peers local to the site. In that case, the search does not adhere to search affinity, in order to maintain access to the full set of data. Once the site returns to a valid state, subsequent searches again adhere to search affinity.
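The local-first behavior with remote fallback can be illustrated with a short sketch. It is a deliberate simplification: in a real cluster the manager assigns primaries per site, whereas here a hypothetical plan_search function simply picks, for each bucket, a local searchable copy when one is available and a remote one otherwise. The function name, peer names, and data layout are assumptions made for the example.

```python
def plan_search(bucket_copies, search_head_site, up_peers):
    """Choose one peer per bucket, preferring peers on the search head's site.

    bucket_copies: dict mapping bucket ID to a list of (peer, site) pairs
                   that hold a searchable copy of the bucket.
    up_peers:      set of peer names that are currently up.
    """
    plan = {}
    for bucket_id, copies in bucket_copies.items():
        available = [(peer, site) for peer, site in copies if peer in up_peers]
        local = [peer for peer, site in available if site == search_head_site]
        if local:
            plan[bucket_id] = local[0]          # search affinity holds
        elif available:
            plan[bucket_id] = available[0][0]   # fall back to a remote peer
    return plan


copies = {
    "bucketA": [("peer1a", "site1"), ("peer2a", "site2")],
    "bucketB": [("peer1b", "site1"), ("peer2b", "site2")],
}
all_up = {"peer1a", "peer1b", "peer2a", "peer2b"}

# Valid state: every bucket is served from site1.
plan_valid = plan_search(copies, "site1", all_up)

# peer1b is down: bucketB is served from site2 so that no data is lost.
plan_degraded = plan_search(copies, "site1", all_up - {"peer1b"})
```

In the degraded case only the bucket that lacks a local copy is served remotely; the other bucket still comes from the local site.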

Note: Hot bucket data is replicated in blocks, as described in "How clustered indexing works". If a local search involves a replicated hot bucket copy, where the origin copy is on a different site, there might be a time lag while the local peer waits to get the latest block of hot data from the originating peer. During this time, the search does not return the latest data.

Search globally in a multisite cluster

If search affinity is disabled for a search head (by setting its site to "site0"), the search head uses the site0 set of primaries, which ordinarily consists of primaries from all sites across the cluster. The site0 set of primaries is chosen randomly from searchable copies on each site, such that the site0 primary for bucketA might be on site1, the site0 primary for bucketB on site2, and so on.

While the selection result for site0 primaries is functionally random, the site0 primary for any bucket starts out as the bucket's primary copy on its source site. Over time, the site0 primary can change to a primary copy on a different (target) site through actions such as primary rebalance and cluster restarts.
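One way to picture the difference between site-specific primaries and the site0 set is as a table with one primary per bucket per site, plus a site0 entry. The sketch below is an illustrative model only; the table layout, the peer names, and the primaries_for function are hypothetical and do not reflect how the manager stores this information.

```python
# For each bucket: which peer holds the primary copy for each site's searches.
# The "site0" entry is the primary used when search affinity is disabled.
primaries = {
    "bucketA": {"site1": "peer1a", "site2": "peer2a", "site0": "peer1a"},
    "bucketB": {"site1": "peer1b", "site2": "peer2b", "site0": "peer2b"},
}


def primaries_for(search_head_site, table):
    """Return the bucket-to-peer mapping that a search head on this site uses."""
    return {
        bucket: per_site[search_head_site]
        for bucket, per_site in table.items()
        if search_head_site in per_site
    }


local_view = primaries_for("site1", primaries)    # peers on site1 only
global_view = primaries_for("site0", primaries)   # peers drawn from several sites
```

Here a search head on site1 is served entirely by site1 peers, while a search head set to site0 is served by peer1a for bucketA and peer2b for bucketB.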

For information on disabling search affinity, see "Disable search affinity".

Search a single peer

For debugging purposes, you might occasionally need to search a single peer node. You do this by initiating the search directly on the peer, in the usual manner. The search accesses any searchable data on that peer. It does not have access to unsearchable copies of data on the peer or to searchable copies of data on other peers.

Note: Keep in mind that there is no way to configure exactly what data will be searchable on any individual peer. However, at a minimum, all data that has entered the cluster through the peer should be searchable on that peer.
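The scope of a direct peer search can be sketched as follows. This is a hypothetical model, not Splunk code: the LocalCopy class, the searchable flag, and the search_peer_directly function are assumptions used to show that such a search covers every searchable copy on the peer and nothing else.

```python
from dataclasses import dataclass


@dataclass
class LocalCopy:
    bucket_id: str
    searchable: bool   # unsearchable copies hold data that cannot be searched
    events: list       # event text strings; hypothetical format


def search_peer_directly(copies, term):
    """Search every searchable copy on one peer; other peers are never consulted."""
    hits = []
    for copy in copies:
        if not copy.searchable:
            continue   # unsearchable copies on the peer are not accessible
        hits.extend(event for event in copy.events if term in event)
    return hits


peer_copies = [
    LocalCopy("bucketA", True, ["error disk full", "login ok"]),
    LocalCopy("bucketB", False, ["error timeout"]),
]

direct_results = search_peer_directly(peer_copies, "error")
print(direct_results)  # ['error disk full']
```

The matching event in bucketB is missed because that copy is unsearchable on this peer, which is why a direct peer search is suited to debugging rather than to complete results.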

Last modified on 06 October, 2020

This documentation applies to the following versions of Splunk® Enterprise: 8.1.0