Restart the entire cluster or a single cluster node
You ordinarily do not need to restart the entire cluster. If you change a master's configuration, you restart just the master, as described in "Configure the master". If you update a set of common peer configurations, you restart just the set of peers, as described in "Update common peer configurations".
Restart the entire cluster
If, for any reason, you do need to restart both the master and the peer nodes:
1. Restart the master node, as you would any Splunk instance. For example, run this CLI command on the master:
splunk restart
2. Once the master restarts, wait until all the peers re-register with the master, and the master dashboard indicates that all peers and indexes are searchable. See "View the master dashboard".
3. Restart the peers as a group, by running this CLI command on the master:
splunk rolling-restart cluster-peers
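The three steps above can be sketched as a small shell function, intended to be run on the master. This is only an illustration: the function name restart_cluster and the SPLUNK_BIN variable are hypothetical, and the pause in the middle assumes an operator checks the master dashboard by hand, since the procedure relies on that manual confirmation.

```shell
#!/bin/sh
# Sketch of the full-cluster restart order; run on the master node.
# SPLUNK_BIN and restart_cluster are hypothetical names for this example.
SPLUNK_BIN="${SPLUNK_BIN:-splunk}"

restart_cluster() {
    # Step 1: restart the master like any other Splunk instance.
    "$SPLUNK_BIN" restart || return 1

    # Step 2: wait for the operator to confirm, from the master dashboard,
    # that all peers have re-registered and all indexes are searchable.
    printf 'Press Enter once all peers and indexes are searchable: ' >&2
    read -r confirm || return 1

    # Step 3: restart the peers as a group.
    "$SPLUNK_BIN" rolling-restart cluster-peers
}
```

Calling restart_cluster runs the master restart, pauses for confirmation, and then issues the rolling restart. If the master restart fails, the function stops before touching the peers.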
If you need to restart the search head, you can do so at any time, as long as the rest of the cluster is running.
Restart a single peer
You might occasionally need to restart a single peer; for example, if you change a configuration file on only that peer.
Important: Do not use the CLI splunk restart command to restart the peer. If you use the restart command, the master will not be aware that the peer is restarting. Instead, after waiting a default 60 seconds for the peer to send a heartbeat, the master will initiate the usual remedial actions that occur when a peer goes down, such as adding its bucket copies to other peers. (The actual time the master waits is determined by the master's heartbeat_timeout attribute. It's inadvisable to change its default value of 60 seconds without consultation.)
There are two ways that you can safely restart a single peer:
- Through Manager (Manager>Server Controls).
- With the CLI command splunk offline, followed by splunk start.
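As a rough sketch, the CLI method is a two-step sequence run on the peer itself: splunk offline, then splunk start. The wrapper name restart_peer and the SPLUNK_BIN variable are hypothetical and are used only to keep the example self-contained.

```shell
#!/bin/sh
# Sketch of a safe single-peer restart; run on the peer itself.
# SPLUNK_BIN and restart_peer are hypothetical names for this example.
SPLUNK_BIN="${SPLUNK_BIN:-splunk}"

restart_peer() {
    # Take the peer offline first; do not use "splunk restart" here.
    "$SPLUNK_BIN" offline || return 1

    # Optional: perform brief maintenance here while the peer is down.

    # Bring the peer back up so it can rejoin the cluster.
    "$SPLUNK_BIN" start
}
```

Because the two commands are separate, the start step is skipped if the offline step fails, which leaves the peer in a known state for troubleshooting.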
When you use Manager or the offline and start commands to restart a peer, the master waits 10 minutes (by default) before assuming that the peer has gone down for good. This allows sufficient time for the peer to come back online and prevents the cluster from performing unnecessary remedial activities. The 10-minute wait occurs only when you restart the peer through Manager or the offline and start commands.
Note: The actual time that the master waits is determined by the value of the master's restart_timeout attribute in server.conf. The default for this attribute is 600 seconds (10 minutes). If you need the master to wait for a longer period, you can change the restart_timeout value, as described in "Extend the restart period".
The offline and start restart method has an advantage over the Manager method in that it waits for in-progress searches to complete before stopping the peer. In addition, since it involves a two-step process, you can use it if you need the peer to remain down briefly while you perform some maintenance. The disadvantage of this method is that it causes the master to reassign primacy from any bucket copies on the peer to searchable copies on other peers. This can result in search load skew, which can be a particular problem if the peer was holding a large number of primary bucket copies.
For information on the offline command, read "Take a peer offline".
This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4