Splunk® Enterprise

Managing Indexers and Clusters of Indexers

Download manual as PDF

Splunk Enterprise version 6.x is no longer supported as of October 23, 2019. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.
This documentation does not apply to the most recent version of Splunk. Click here for the latest version.
Download topic as PDF

How clustered search works

In a cluster, you use the search head to search across the set of peers. Under rare circumstances, described later, you might also want to initiate a search on a single peer.

Search across the cluster

Clustered search is very similar to how distributed search works in a non-clustered environment. The main difference is that the search head gets its list of search peers from the master node. It also gets a generation ID from the master. After that, it communicates directly with the peers.

Note: In a clustered search, the search peers are the set of cluster peers that are currently registered with the master (in other words, peers that are up-and-running and participating in the cluster).

When the search head initiates a search:

1. The search head contacts the master node.

2. The master node gives the search head the current generation ID and a list of the peers in that generation (that is, the peers that are currently registered with the master).

3. The search head communicates with the search peers in the same way as in a non-clustered distributed search, providing the peers with exactly the same information (search request and replication bundle), except that it also gives the search peers the generation ID.

4. The search peers use the generation ID to identify which of their bucket copies, if any, are primary for the generation and thus need to participate in the search. As in any other search, the peers also use the search's time range to determine whether to search a particular bucket.

5. The search peers search their primary copies of buckets and send the results back to the search head, which consolidates the results.

As with a non-clustered distributed search, you can have multiple search heads, and the search heads can function as a single unit, known as a search head pool. And, just like non-clustered distributed searches, you can use mounted bundles to reduce the amount of data that gets passed from the search head to the peers.

For details on these and other available features of distributed search, read the Distributed Search manual, starting with "About distributed search". Also, read "Configure the search head" in this manual to learn about a few configuration differences between clustered and non-clustered search heads.

Search a single peer

For debugging purposes, you might occasionally need to search a single peer node. You do this by initiating the search directly on the peer, in the usual manner. The search accesses any searchable data on that peer. It does not have access to unsearchable copies of data on the peer or to searchable copies of data on other peers.

Note: Keep in mind that there is no way to configure exactly what data will be searchable on any individual peer. However, at a minimum, all data that has entered the cluster through the peer should be searchable on that peer.

How clusters handle report and data model acceleration summaries

Indexer clusters do not replicate report acceleration and data model acceleration summaries.

These summaries reside on the peers, in their own directories immediately under each index-level directory. For example, for the "index1" index, they reside under $SPLUNK_HOME/var/lib/splunk/index1. The report acceleration summary directory is named summary and the data model acceleration summary directory is named datamodel_summary.

A summary correlates with one or more buckets, depending on the summary's time span. When a summary is generated, it resides on the peer that holds the primary copy of the bucket for that time span. If the summary spans multiple buckets, and the primary copies of those buckets reside on multiple peers, then each of those peers will hold the corresponding part of the summary.

If primacy gets reassigned from one copy of a bucket to another (for example, because the peer holding the primary copy fails), the summary does not move to the peer with the new primary copy. Therefore, it becomes unavailable. It will not be available again until the next time Splunk Enterprise attempts to update the summary, finds that it is missing, and then regenerates it.

Last modified on 28 June, 2016
How clustered indexing works
How cluster nodes start up

This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15

Was this documentation topic helpful?

Enter your email address, and someone from the documentation team will respond to you:

Please provide your comments here. Ask a question or make a suggestion.

You must be logged into splunk.com in order to post comments. Log in now.

Please try to keep this discussion focused on the content covered in this documentation topic. If you have a more general question about Splunk functionality or are experiencing a difficulty with Splunk, consider posting a question to Splunkbase Answers.

0 out of 1000 Characters