Handle master site failure
If the site holding the master node fails, you lose the master's functionality. You must immediately start a new master on one of the remaining sites.
Until the new master starts up, the cluster continues to function as best it can. Each peer continues to stream data to other peers, using the list of target peers that it held at the time the master went down. If some of those target peers go down (as is likely in a site failure), the peer removes them from its list of streaming targets and continues to stream data to the peers that remain on the list.
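The pruning behavior described above can be pictured with a small model. This is only an illustrative sketch, not Splunk code: the function name `prune_failed_targets` and the peer names are hypothetical, and the real peers detect failed targets internally.

```python
def prune_failed_targets(targets, failed_peers):
    """Return the streaming targets that are still up, preserving their order.

    Hypothetical model: 'targets' is the list a peer held when the master
    went down; 'failed_peers' are the targets that have since gone down.
    """
    failed = set(failed_peers)
    return [peer for peer in targets if peer not in failed]


# Hypothetical peer names. In this example, site2 is the failed site.
targets = ["site1-peer2", "site2-peer1", "site2-peer2", "site3-peer1"]
failed = ["site2-peer1", "site2-peer2"]

remaining = prune_failed_targets(targets, failed)
print(remaining)
```

In this model the peer keeps streaming to the two surviving targets. It gains no new targets, because in this sketch nothing adds to the list while the master is down.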
To deal with master site failure, do the following:
1. Configure a stand-by master on at least one of the sites not hosting the current master. This is a preparatory step that you must complete before the need arises. See Replace the master node on the indexer cluster.
2. When the master site goes down, bring up a stand-by master on one of the remaining sites. See Replace the master node on the indexer cluster.
3. Restart indexing on the cluster, following the instructions in Restart indexing in multisite cluster after master restart or site failure.
The new master now fully replaces the old master.
Note: If the failed site later comes back up, you need to point the peers on that site to the new master. See Ensure that the peer and search head nodes can find the new master.
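As a rough illustration of what repointing a peer involves, the sketch below rewrites the `master_uri` setting in the `[clustering]` stanza of a peer's server.conf text. It is a sketch under stated assumptions: the host names and the helper `point_to_new_master` are hypothetical, the sample configuration is made up, and Python's `configparser` does not handle every construct that Splunk configuration files allow. The topic referenced in the note remains the authoritative procedure.

```python
import configparser
import io


def point_to_new_master(conf_text, new_master_uri):
    """Return server.conf text with [clustering] master_uri set to the new master.

    Hypothetical helper for illustration only; it works on a string and
    does not touch any file on disk.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of setting names
    parser.read_string(conf_text)
    if not parser.has_section("clustering"):
        parser.add_section("clustering")
    parser.set("clustering", "master_uri", new_master_uri)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Made-up sample of a peer's configuration on the recovered site.
old_conf = """[general]
site = site1

[clustering]
mode = slave
master_uri = https://old-master.example.com:8089
"""

new_conf = point_to_new_master(old_conf, "https://new-master.example.com:8089")
print(new_conf)
```

Only `master_uri` changes; the other settings in the sample are carried over unchanged.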
This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10