Use indexer clusters to scale indexing

The main purpose of indexer clusters is to enable index replication. However, clusters can also be generally useful in scale-out deployment topologies as a way to manage multiple indexers, even when index replication isn't a requirement.

For example, say you want to create a deployment of three indexers and one search head, so that you can index larger quantities of data than a single indexer is capable of. The customary way of doing this, and the only way possible prior to Splunk Enterprise 5.0, is to set up each of the indexers independently, add in a search head, and then use a tool like deployment server to coordinate the indexer configurations.

With clustering, you can instead configure this deployment scenario as a cluster, with three peer nodes replacing the three independent indexers. Even if you don't need index replication and its key advantages like data availability and disaster tolerance, there are several reasons why it might be beneficial to use a cluster to coordinate multiple indexer instances:

Simplified management and coordination of indexer configuration (in place of using deployment server or performing manual updates). See "Update common peer configurations" for details.
Simplified setup and control of distributed search. See "Enable the search head".
Better insight into the state of your indexers through the clustering dashboards. See "View the master dashboard".
Ability to take advantage of additional cluster management capabilities as they're developed.

The main downsides of employing a cluster for scaling indexing capacity are these:

You must install an additional Splunk Enterprise instance to function as the cluster master node.
The cluster does not support heterogeneous indexers. All cluster nodes must be at the same version level. In addition, all peer nodes in a cluster must use the same indexes.conf configuration. For further details, see the next section, "Cluster peer management compared to deployment server".
You cannot use the deployment server to distribute configurations or apps across the cluster peers. For further details, see the next section, "Cluster peer management compared to deployment server".

Cluster peer management compared to deployment server

One useful cluster feature is the ability to manage and update the configuration for all indexers (peer nodes) from a central location, the master node. In that respect, it's similar in function to the deployment server. Unlike the deployment server, however, cluster peer management does not have any concept of server classes. Because of this, and because of the way clusters coordinate their activities, you cannot specify different app or indexes.conf configurations for different groups of indexers. (All peer nodes in a cluster must use the same indexes.conf configurations, as well as some other configurations, as described in "Peer node configuration overview".) If you need to maintain a heterogeneous set of indexers, you cannot employ clusters for scaling purposes.

On the other hand, the configuration bundle method used to download updates to peer nodes has certain advantages over the deployment server. Specifically, it not only distributes updates, it also validates them on the peers, and then (when necessary) initiates a rolling restart of the peers. See "Update common peer configurations" for details.
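As a rough illustration of the configuration bundle method, the fragment below shows an indexes.conf that could be staged on the master node for distribution to every peer. It is a sketch only: it assumes the bundle staging location is the master-apps directory under $SPLUNK_HOME/etc, and the index name and paths are hypothetical. Consult "Update common peer configurations" for the exact procedure and for the command that pushes the bundle.

```ini
# Staged on the master node, assumed location:
#   $SPLUNK_HOME/etc/master-apps/_cluster/local/indexes.conf
# Every peer receives this same file; there are no server classes,
# so per-group variations are not possible.

# Hypothetical index used for illustration only.
[web_logs]
homePath   = $SPLUNK_DB/web_logs/db
coldPath   = $SPLUNK_DB/web_logs/colddb
thawedPath = $SPLUNK_DB/web_logs/thaweddb
```

Because the master validates the bundle on the peers before applying it, an error in a stanza like this one should surface before the peers are restarted.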

Important: Do not use deployment server or third-party distributed configuration management software, such as Puppet or Chef, to deploy updates directly to peer nodes. You can use such tools to deploy updates to the master, which then deploys those updates to the peers. See "Use deployment server to distribute the apps to the master".

Configure a cluster for scale-out deployment

To set up a cluster for scale-out deployment without index replication, set both the replication factor and the search factor to 1. This causes the cluster to function purely as a coordinated set of Splunk Enterprise instances, without data replication. Because the cluster does not make any duplicate copies of the data, storage size and processing overhead stay at a minimum.
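The following server.conf fragments sketch what such a configuration might look like. They assume the [clustering] stanza settings used by clusters of this era (mode, replication_factor, search_factor, master_uri, pass4SymmKey); the host name, ports, and key value are placeholders, not values from this topic. Check the server.conf specification for your version before using them.

```ini
# server.conf on the master node
[clustering]
mode = master
# A factor of 1 means each bucket exists in exactly one copy: no replication.
replication_factor = 1
search_factor = 1
# Placeholder shared secret; must match on all cluster nodes.
pass4SymmKey = <your_cluster_key>

# server.conf on each of the three peer nodes
# Hypothetical replication port; still declared even though nothing is replicated.
[replication_port://9887]

[clustering]
mode = slave
# Hypothetical master host and management port.
master_uri = https://master.example.com:8089
pass4SymmKey = <your_cluster_key>
```

With both factors at 1, the three peers behave like independent indexers for storage purposes, while the master still provides centralized configuration and the clustering dashboards.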
