Splunk® Enterprise

Managing Indexers and Clusters of Indexers

This documentation does not apply to the most recent version of Splunk.

Deployment overview

This topic describes the main steps in deploying a cluster. Subsequent topics describe these steps in detail.

Before you attempt to deploy a cluster, you must be familiar with several areas of Splunk Enterprise administration:

  • How to configure indexers. In particular, see "How the indexer stores indexes", along with the other topics in this manual that describe managing indexes.
  • What a search head does. For an introduction to distributed search, see "About distributed search" in the Distributed Search Manual. Note, however, that you configure cluster search heads somewhat differently from the way that topic describes. Those differences are outlined later in this manual, in the topic "Configure the search head".
  • How to use a forwarder to get data into an indexer. See "Use forwarders" in the Getting Data In Manual.

Migrating from a non-clustered Splunk Enterprise deployment?

Clustered indexers have several requirements that differ from those of non-clustered indexers. It's important that you be aware of these differences before you migrate your indexers. For details, see "Key differences between clustered and non-clustered deployments". Once you've read that material, go to "Migrate non-clustered indexers to a clustered environment" for details on the actual migration process.

Important: Before migrating an indexer from non-clustered to clustered, be certain of your needs. The process goes in one direction only. There is no supported procedure for converting an indexer from clustered to non-clustered.

Deploy a cluster

When you deploy a cluster, you enable and configure the cluster master and the peer nodes that perform the indexing. You also enable a search head to search data in the cluster. In addition, you usually set up forwarders to send data to nodes in the cluster. Here's a diagram of a small cluster, showing the various components that you deploy:

[Diagram: simplified basic cluster]

These are the key steps in deploying clusters:

1. Identify your requirements:

a. Understand your data availability and failover needs. See "About clusters".

b. Decide what replication factor you want to implement. The replication factor is the number of copies of raw data that the cluster maintains. The optimal replication factor depends on conditions specific to your environment, but it essentially involves a trade-off between failure tolerance and storage capacity. A higher replication factor means that more copies of the data reside on more peer nodes, so your cluster can tolerate more node failures without loss of data availability. But it also means that you'll need more nodes and more storage to handle the additional data. For more information, see "Replication factor".

Warning: Make sure you start by choosing the right replication factor for your needs. It is inadvisable to increase the replication factor once the cluster contains a significant amount of data. The cluster would then need to perform a large amount of bucket copying to match the increased replication factor, significantly slowing the overall performance of your cluster while the copying is occurring.

c. Decide what search factor you want to implement. The search factor tells the cluster how many searchable copies of indexed data to maintain. This helps determine how quickly a cluster can recover from a downed node. A higher search factor allows the cluster to recover more quickly, but it also requires more storage space and processing power. For most environments, the default search factor value of 2 represents the right trade-off, usually allowing searches to continue with little interruption when a node goes down. For more information, see "Search factor".

Warning: Make sure you start by choosing the right search factor for your needs. It is inadvisable to increase the search factor once the cluster contains a significant amount of data. The cluster would then need to perform a large amount of processing (transforming non-searchable bucket copies into searchable copies) to match the increased search factor, and this will have an extremely adverse effect on the overall performance of your cluster while the processing is occurring.

d. Identify other factors that also determine the size of your cluster; for example, the quantity of data you'll be indexing. It usually makes sense to keep all your indexers in a single cluster, so for horizontal scaling, you'll need to add peer nodes beyond those required by the replication factor. Similarly, depending on the anticipated search load, you might need to configure more than one search head.

e. Study the topic "System requirements and other deployment considerations" for information on other key issues.
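
The choices in steps 1b and 1c come down to a little arithmetic, which can be sketched in Python. This sketch is illustrative only and is not part of Splunk Enterprise: the ClusterFactors class and its property names are hypothetical. It assumes that a cluster with replication factor N can lose N - 1 peers without losing data availability, and that the search factor cannot exceed the replication factor. Confirm both assumptions in "Replication factor" and "Search factor" before you rely on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterFactors:
    """Hypothetical container for the factors chosen in steps 1b and 1c."""

    replication_factor: int
    search_factor: int = 2  # default search factor named in step 1c

    def __post_init__(self) -> None:
        if self.replication_factor < 1:
            raise ValueError("replication factor must be at least 1")
        # Assumption: searchable copies are a subset of all copies.
        if not 1 <= self.search_factor <= self.replication_factor:
            raise ValueError(
                "search factor must be between 1 and the replication factor"
            )

    @property
    def tolerated_peer_failures(self) -> int:
        # Assumption: with N copies on N different peers, N - 1 peers can fail.
        return self.replication_factor - 1

    @property
    def non_searchable_copies(self) -> int:
        # Copies that would need processing to become searchable.
        return self.replication_factor - self.search_factor


factors = ClusterFactors(replication_factor=3)
print(factors.tolerated_peer_failures, factors.non_searchable_copies)
```

Raising either factor later changes these numbers for all existing data, which is why the warnings above recommend settling on both values before the cluster holds a significant amount of data.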

2. Install the Splunk Enterprise cluster instances on your network. At a minimum, you'll need (replication factor + 2) instances:

  • You need at least the replication factor number of peer nodes, but you might want to add more peers to boost indexing capacity, as mentioned in step 1d.
  • You also need two more instances, one for the master node and the other for the search head.

For information on how to install Splunk Enterprise, read the Installation Manual.
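
As a rough planning aid, the instance count in step 2 can be written as a small Python function. This is a sketch under stated assumptions, not a Splunk tool: the function name and the extra_peers and search_heads parameters are hypothetical, and they stand for the additional peers and search heads discussed in step 1d.

```python
def minimum_instances(
    replication_factor: int,
    extra_peers: int = 0,
    search_heads: int = 1,
) -> int:
    """Count the instances to install: peers, one master, and search heads."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if extra_peers < 0 or search_heads < 1:
        raise ValueError("need zero or more extra peers and at least one search head")

    peers = replication_factor + extra_peers  # at least one peer per copy
    master = 1                                # a single master node
    return peers + master + search_heads


# Smallest cluster for a replication factor of 3: 3 peers + master + search head.
print(minimum_instances(3))
# Same cluster with two extra peers for indexing capacity and a second search head.
print(minimum_instances(3, extra_peers=2, search_heads=2))
```

With the defaults, the result is always the replication factor plus 2, matching the minimum stated above.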

3. Enable clustering on the instances:

a. Enable the master node. See "Enable the master node".

Important: When the master starts up for the first time, it will block indexing on the peers until you have enabled and restarted the replication factor number of peers.

b. Enable the peer nodes. See "Enable the peer nodes".

c. Enable the cluster search head. It's easier to set up a search head for a cluster than for a non-clustered group of indexers. See "Enable the search head".
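
The note in step 3a about blocked indexing can be expressed as a simple condition. The Python sketch below models only the rule as stated in this topic; the function name is hypothetical, and it does not query a real master node.

```python
def indexing_blocked(restarted_peers: int, replication_factor: int) -> bool:
    """Return True while a newly started master would still block indexing.

    Models the rule in step 3a: indexing stays blocked until the number of
    enabled and restarted peers reaches the replication factor.
    """
    if restarted_peers < 0 or replication_factor < 1:
        raise ValueError("counts must be non-negative and the factor at least 1")
    return restarted_peers < replication_factor


# Walk through bringing up peers one at a time for a replication factor of 3.
for peers_up in range(4):
    state = "blocked" if indexing_blocked(peers_up, 3) else "allowed"
    print(f"{peers_up} peer(s) restarted: indexing {state}")
```

In practice, this means that you should plan to enable at least the replication factor number of peers soon after you enable the master.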

4. Complete the peer node configuration:

a. Configure the peers' index settings. This step is necessary only if you need to augment the set of default indexes or apps. In general, all the peers must use the same set of indexes, so if you add indexes (or apps that define indexes) to one peer, you must add them to all peers, using a special cluster-specific distribution method. There might also be other configurations that you need to coordinate across the set of peers. See "Prepare the peers for index replication" for information on how to do this.

b. Configure the peers' data inputs. For most purposes, it's best to use forwarders to send data to the peers, as discussed in "Use forwarders to get your data". As described in that topic, you will usually want to use load-balancing forwarders with indexer acknowledgment enabled.

Once you enable the nodes and set up data inputs for the peers, the cluster automatically begins indexing and replicating the data.

Other deployment scenarios

This chapter also provides guidance on a few other cluster deployment scenarios.

This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15

