Splunk® Enterprise

Managing Indexers and Clusters of Indexers

Acrobat logo Download manual as PDF

Splunk Enterprise version 6.x is no longer supported as of October 23, 2019. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.
This documentation does not apply to the most recent version of Splunk. Click here for the latest version.
Acrobat logo Download topic as PDF

How search works in an indexer cluster

In a single-site indexer cluster, the search head performs searches across the entire set of peers.

With a multisite indexer cluster, you can implement search affinity. With search affinity, searches occur across peers on the same site as the search head. This improves network efficiency without reducing access to the full set of cluster data.

Under rare circumstances, described later, you might want to initiate a search on a single peer.

Search across a single-site cluster

Searching across an indexer cluster works in a way similar to how distributed search works with non-clustered indexers. The main difference is that the search head gets its list of search peers from the master node. It also gets a generation ID from the master. After that, it communicates directly with the peers.

Note: In an indexer cluster search, the search peers are the set of cluster peers that are currently registered with the master (in other words, the peers that are up-and-running and participating in the cluster).

When the search head initiates a search:

1. The search head contacts the master node.

2. The master node gives the search head the current generation ID and a list of the peers in that generation (that is, the peers that are currently registered with the master).

3. The search head communicates with the search peers in the same way as in a distributed search not involving an indexer cluster. It provides the peers with exactly the same information (search request and knowledge bundle), except that it also gives the search peers the generation ID.

4. The search peers use the generation ID to identify which of their bucket copies, if any, are primary for the generation and thus need to participate in the search. As in any other search, the peers also use the search's time range to determine whether to search a particular bucket.

5. The search peers search their primary copies of buckets and send the results back to the search head, which consolidates the results.

You can integrate the indexer cluster with a search head cluster, for search head scaling and high availability. See "Integrate the search head cluster with an indexer cluster" in the Distributed Search manual.

For details on these and other available features of distributed search, read the Distributed Search manual, starting with "About distributed search". Also, read "Configure the search head" in this manual to learn about a few configuration differences when dealing with a search head in an indexer cluster.

Search locally in a multisite cluster

In a multisite cluster, you typically put search heads on each site. This allows you to take advantage of search affinity. In search affinity, searches normally run across only peers on the same site as the requesting search head.

Search affinity is always enabled with multisite clusters. However, you must perform a few steps to take advantage of it. Specifically, you must ensure that both the searchable data and the search heads are available locally. For information on how to set up search affinity, see "Implement search affinity in a multisite indexer cluster".

Once a site has been configured for search affinity, the actual search process works the same as for single-site clusters. The search head distributes the current generation ID, along with the search and knowledge bundle, to all peers across the entire cluster. The local peers, however, are the only ones to respond. They search their primary buckets and return results to the search head, using the generation ID to determine which of their bucket copies are primary.

If some peers on the local site are down and the site therefore does not have a full complement of primaries, remote peers will participate in the search, providing results from any primaries missing from the site. In that case, the search does not adhere to search affinity, in order to maintain access to the full set of data. Once the site returns to a valid state, subsequent searches again adhere to search affinity.

Note: Hot bucket data is replicated in blocks, as described in "How clustered indexing works". If a local search involves a replicated hot bucket copy, where the origin copy is on a different site, there might be a time lag while the local peer waits to get the latest block of hot data from the originating peer. During this time, the search does not return the latest data.

Search a single peer

For debugging purposes, you might occasionally need to search a single peer node. You do this by initiating the search directly on the peer, in the usual manner. The search accesses any searchable data on that peer. It does not have access to unsearchable copies of data on the peer or to searchable copies of data on other peers.

Note: Keep in mind that there is no way to configure exactly what data will be searchable on any individual peer. However, at a minimum, all data that has entered the cluster through the peer should be searchable on that peer.

How clusters handle report and data model acceleration summaries

Indexer clusters do not replicate report acceleration and data model acceleration summaries.

These summaries reside on the peers, in their own directories immediately under each index-level directory. For example, for the "index1" index, they reside under $SPLUNK_HOME/var/lib/splunk/index1. The report acceleration summary directory is named summary and the data model acceleration summary directory is named datamodel_summary.

A summary correlates with one or more buckets, depending on the summary's time span. When a summary is generated, it resides on the peer that holds the primary copy of the bucket for that time span. If the summary spans multiple buckets, and the primary copies of those buckets reside on multiple peers, then each of those peers will hold the corresponding part of the summary.

If primacy gets reassigned from one copy of a bucket to another (for example, because the peer holding the primary copy fails), the summary does not move to the peer with the new primary copy. Therefore, it becomes unavailable. It will not be available again until the next time Splunk Enterprise attempts to update the summary, finds that it is missing, and then regenerates it.

In multisite clusters, like single-site clusters, the summaries reside with the primary bucket copy. Because a multisite cluster has multiple primaries, one for each site that supports search affinity, the summaries reside with the particular primary that the generating search head accessed when running the search. Due to site affinity, that usually means that the summaries reside on primaries on the same site as the generating search head.

Last modified on 04 April, 2016
How clustered indexing works
How indexer cluster nodes start up

This documentation applies to the following versions of Splunk® Enterprise: 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14

Was this documentation topic helpful?

You must be logged into splunk.com in order to post comments. Log in now.

Please try to keep this discussion focused on the content covered in this documentation topic. If you have a more general question about Splunk functionality or are experiencing a difficulty with Splunk, consider posting a question to Splunkbase Answers.

0 out of 1000 Characters