Restart indexing in multisite cluster after master restart or site failure
When a master restarts, it blocks indexing until enough peers exist across the indexer cluster to fulfill the replication factor. In a basic, single-site cluster, this is usually desired behavior. However, in the case of a multisite cluster, you might want to restart indexing even though you do not have enough available peers to fulfill all aspects of the site replication factor (for example, in the case of site failure).
The two cases where this need typically arises are:
- A site goes down and you later need to restart the master for any reason.
- The site with the master goes down and you bring up a stand-by master on another site.
If a site goes down but the master, running on another site, remains up, indexing continues as usual, because the master only runs the check at start-up.
Run the splunk set indexing-ready command on the master to unblock indexing when fewer peers are available than the replication factor requires:
splunk set indexing-ready -auth admin:your_password
For example, assume you have a three-site cluster configured with "site_replication_factor = origin:1, site1:2, site2:2, site3:2, total:7", with the master located on site1. If site2 goes down and you subsequently restart the master, the master blocks indexing after it restarts, because it is waiting to hear from a minimum of two peers on site2 ("site2:2"). In this situation, you can use the command to restart indexing on the remaining sites.
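The arithmetic behind this example can be sketched in a few lines of shell. This is only an illustration of the reasoning, not the check the master actually performs: the function name sites_short_of_peers and the per-site peer counts passed to it are hypothetical, and the sketch looks only at the explicit per-site terms, ignoring the origin and total terms.

```shell
# Illustrative sketch only; the master's real start-up check is internal to Splunk.
# sites_short_of_peers is a hypothetical helper, not a Splunk command.
#
# Usage: sites_short_of_peers "<site_replication_factor>" "<site:available ...>"
# Prints each site whose available peer count is below its explicit
# per-site requirement in the site replication factor.
sites_short_of_peers() {
  factor=$1
  available=$2
  # Turn "origin:1, site1:2, ..." into whitespace-separated terms.
  for term in $(printf '%s' "$factor" | tr ',' ' '); do
    site=${term%%:*}
    need=${term##*:}
    case $site in
      site*) ;;          # explicit per-site requirement, check it
      *) continue ;;     # skip the origin and total terms
    esac
    have=0
    for pair in $available; do
      if [ "${pair%%:*}" = "$site" ]; then
        have=${pair##*:}
      fi
    done
    if [ "$have" -lt "$need" ]; then
      echo "$site needs $need, has $have"
    fi
  done
}

# Assumed peer counts: site2 is down, site1 and site3 each have three peers up.
sites_short_of_peers "origin:1, site1:2, site2:2, site3:2, total:7" \
  "site1:3 site2:0 site3:3"
```

With the assumed counts, only site2 is reported, which matches the situation described above: the restarted master waits on site2 until you run the command.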
Similarly, if site1, which has the master, goes down and you bring up a stand-by master on site2, the new master initially blocks indexing because site1 is not available. You can then use the command to tell the new master to restart indexing.
Important: You must run the splunk set indexing-ready command every time you restart the master under the listed circumstances. The command unblocks indexing only for the current restart.
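Because the command must be repeated after each restart, one option during a site outage is to pair the two steps in a small wrapper. The following is a minimal sketch under stated assumptions: the function name restart_and_unblock and the SPLUNK_BIN variable are hypothetical, and it assumes that splunk restart returns only once the master accepts CLI commands. If that does not hold in your environment, add a wait between the two steps.

```shell
# Hypothetical wrapper; intended only for the site-failure cases listed above.
# SPLUNK_BIN is an assumed override so the sequence can be dry-run with a stub.
SPLUNK_BIN=${SPLUNK_BIN:-splunk}

# Usage: restart_and_unblock <user:password>
restart_and_unblock() {
  auth=$1   # for example admin:your_password
  # Restart the master; stop here if the restart fails.
  "$SPLUNK_BIN" restart || return 1
  # Unblock indexing for this restart only.
  "$SPLUNK_BIN" set indexing-ready -auth "$auth"
}
```

To preview the commands without touching a running instance, set SPLUNK_BIN=echo before calling the function. Passing credentials on the command line exposes them to the process list and shell history, so treat this as a convenience for interactive use rather than a hardened script.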
Note: Although this command is designed with site failure in mind, you can also use it to restart indexing on a single-site cluster before the replication factor number of peers is available. In that circumstance, however, it is usually better to wait until the replication factor number of peers has rejoined the cluster.
This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10