Restart the entire indexer cluster or a single peer node
This topic describes how to restart the entire indexer cluster (unusual) or a single peer node.
When you restart a master or peer node, the master rebalances the primary bucket copies across the set of peers, as described in "Rebalance the indexer cluster primary buckets".
For information on configuration changes that require a restart, see "Restart after modifying server.conf?" and "Restart or reload after configuration bundle changes?".
Restart the entire cluster
You ordinarily do not need to restart the entire cluster. If you change a master's configuration, you restart just the master. If you update a set of common peer configurations, the master restarts just the set of peers, and only when necessary, as described in "Update common peer configurations".
If, for any reason, you do need to restart both the master and the peer nodes:
1. Restart the master node, as you would any instance. For example, run this CLI command on the master:
splunk restart
2. Once the master restarts, wait until all the peers re-register with the master, and the master dashboard indicates that all peers and indexes are searchable. See "View the master dashboard".
3. Restart the peers as a group, by running this CLI command on the master:
splunk rolling-restart cluster-peers
If you need to restart the search head, you can do so at any time, as long as the rest of the cluster is running.
The rolling-restart command
The splunk rolling-restart command performs a phased restart of all the peer nodes, so that the cluster as a whole can continue to perform its functions during the restart process.
Caution: Do not invoke the splunk rolling-restart command unless absolutely necessary. Restarting the set of peers can result in prolonged periods of bucket fixing.
You invoke the splunk rolling-restart command from the master:
splunk rolling-restart cluster-peers
The master also automatically initiates a rolling restart, when necessary, after distributing a configuration bundle to the peer nodes. For details on this process, see "Distribute the configuration bundle".
The rolling restart works like this: The master issues a restart message to approximately 10% (by default) of the peer nodes at a time. (If there are fewer than 10 peers in the cluster, it issues the restart to one peer at a time.) Once those peers restart and contact the master, the master then issues a restart message to another 10% of the peers, and so on, until all the peers have restarted. This method helps ensure that load-balanced forwarders sending data to the cluster always have a peer available to receive the data.
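The batching rule described above can be sketched as a small shell function. This is purely illustrative (the function and its name are hypothetical, not part of the splunk CLI): given the peer count and the configured percentage, it computes how many peers the master restarts per round.

```shell
#!/bin/sh
# Illustrative sketch of the master's batching rule during a rolling restart.
# batch_size TOTAL_PEERS PERCENT -> number of peers restarted per round.
batch_size() {
    total=$1
    percent=$2
    if [ "$total" -lt 10 ]; then
        # Fewer than 10 peers: the master restarts one peer at a time.
        echo 1
        return
    fi
    n=$(( total * percent / 100 ))
    # Always restart at least one peer per round.
    [ "$n" -lt 1 ] && n=1
    echo "$n"
}

batch_size 50 10   # 10% of 50 peers: 5 per round
batch_size 8 10    # fewer than 10 peers: 1 at a time
batch_size 30 20   # 20% of 30 peers: 6 per round
```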
At the end of the rolling restart, the master rebalances the cluster primary buckets. See "Rebalance the indexer cluster primary buckets".
Note: The master restarts the peers in random order. In the case of multisite clusters, this operation is not site-aware.
By default, the master issues the restart command to 10% of the peers at a time. However, the percentage is configurable through the percent_peers_to_restart attribute in the [clustering] stanza of server.conf. For convenience, you can configure this attribute with the CLI splunk edit cluster-config command. For example, to change the restart behavior so that the master restarts 20% of the peers at a time, run this command:
splunk edit cluster-config -percent_peers_to_restart 20
To cause the master to restart all peers at once, run the command with a value of 100:
splunk edit cluster-config -percent_peers_to_restart 100
This can be useful under certain circumstances, such as when no users are actively searching and no forwarders are actively sending data to the cluster. It minimizes the time required to complete the restart.
After changing the percent_peers_to_restart attribute, you still need to run the splunk rolling-restart command to initiate the actual restart.
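Alternatively, you can set the attribute directly in the [clustering] stanza of the master's server.conf. The fragment below is a sketch; the 20% value is illustrative, and you should verify the stanza against the server.conf specification for your version:

```ini
# server.conf on the master node (illustrative values)
[clustering]
mode = master
percent_peers_to_restart = 20
```

Editing server.conf directly requires a restart of the master for the change to take effect, whereas splunk edit cluster-config applies it through the CLI.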
Note: During a rolling restart, there is no guarantee that the cluster will be fully searchable.
Restart a single peer
You might occasionally need to restart a single peer; for example, if you change certain configurations on only that peer.
There are two ways that you can safely restart a single peer:
- Through Splunk Web (Settings > Server Controls).
- With the CLI command offline, followed by start.
When you use Splunk Web or the offline/start commands to restart a peer, the master waits 60 seconds (by default) before assuming that the peer has gone down for good. This allows sufficient time for the peer to come back online and prevents the cluster from performing unnecessary remedial activities.
Note: The actual time that the master waits is determined by the value of the master's restart_timeout attribute in server.conf. The default for this attribute is 60 seconds. If you need the master to wait for a longer period, you can change the restart_timeout value, as described in "Extend the restart period".
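As a sketch, extending the wait period would look like the following fragment in the master's server.conf; the 180-second value is illustrative, and you should confirm attribute placement against the server.conf specification for your version:

```ini
# server.conf on the master node (illustrative value)
[clustering]
mode = master
restart_timeout = 180
```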
The offline/start restart method has an advantage over the Splunk Web method in that it waits for in-progress searches to complete before stopping the peer. In addition, since it involves a two-step process, you can use it if you need the peer to remain down briefly while you perform some maintenance.
For information on the offline command, read "Take a peer offline".
Warning: Do not use the CLI restart command to restart the peer. If you use the restart command, the master will not be aware that the peer is restarting. Instead, after waiting a default 60 seconds for the peer to send a heartbeat, the master will initiate the usual remedial actions that occur when a peer goes down, such as adding its bucket copies to other peers. The actual time the master waits is determined by the master's heartbeat_timeout attribute. It is inadvisable to change its default value of 60 seconds without consultation.
This documentation applies to the following versions of Splunk® Enterprise: 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15