Restart indexing in multisite cluster after manager restart or site failure

When a manager restarts, it blocks indexing until enough peers exist across the indexer cluster to fulfill the replication factor. In a basic, single-site cluster, this is usually desired behavior. However, in the case of a multisite cluster, you might want to restart indexing even though you do not have enough available peers to fulfill all aspects of the site replication factor (for example, in the case of site failure).

The two cases where this need typically arises are:

A site goes down and you later need to restart the manager for any reason.
The site with the manager goes down and you bring up a stand-by manager on another site.

If a site goes down but the manager, running on another site, remains up, indexing continues as usual, because the manager only runs the check at start-up.

Run the splunk set indexing-ready command on the manager to unblock indexing when fewer peers are available than the replication factor requires:

splunk set indexing-ready -auth admin:your_password

For example, assume you have a three-site cluster configured with "site_replication_factor = origin:1, site1:2, site2:2, site3:2, total:7", with the manager located on site1. If site2 goes down and you subsequently restart the manager, the manager blocks indexing after it restarts, because it is waiting to hear from a minimum of two peers on site2 ("site2:2"). In this situation, you can use the command to restart indexing on the remaining sites.

Similarly, if site1, which has the manager, goes down and you bring up a stand-by manager on site2, the new manager initially blocks indexing because site1 is not available. You can then use the command to tell the new manager to restart indexing.
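The reasoning in the first example can be sketched as a small shell function. This is a simplified model for illustration only, not the manager's actual check: it assumes a site counts as unmet when its explicit value in site_replication_factor exceeds the number of peers currently available on that site, and it ignores the origin and total terms. The function name unmet_sites and the "site=peers" input format are hypothetical.

```shell
#!/bin/sh
# Simplified model (an assumption, not the manager's real logic): a site is
# "unmet" when its explicit count in site_replication_factor is greater than
# the number of peers currently available on that site.
#
# Usage: unmet_sites "<site_replication_factor value>" "<site=peers ...>"
# Prints one line per unmet site in the form "site:required:available".
# Note: POSIX sh has no local variables, so the names below are global.
unmet_sites() {
  srf=$1
  avail=$2
  # Remove spaces, then turn commas into spaces so the entries can be looped over.
  entries=$(printf '%s' "$srf" | tr -d ' ' | tr ',' ' ')
  for entry in $entries; do
    site=${entry%%:*}   # text before the colon, for example "site2"
    need=${entry##*:}   # text after the colon, for example "2"
    # origin and total are not explicit site names; this sketch skips them.
    case "$site" in
      origin|total) continue ;;
    esac
    # Look up how many peers are available on this site (0 if not listed).
    have=0
    for pair in $avail; do
      if [ "${pair%%=*}" = "$site" ]; then
        have=${pair##*=}
      fi
    done
    if [ "$have" -lt "$need" ]; then
      echo "$site:$need:$have"
    fi
  done
}
```

For the scenario where site2 is down, calling unmet_sites "origin:1, site1:2, site2:2, site3:2, total:7" "site1=3 site2=0 site3=2" reports site2 as the site the manager would still be waiting on. The peer counts passed in are made-up inputs; the sketch does not query a running cluster.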

Important: You must run the splunk set indexing-ready command every time you restart the manager under the listed circumstances. The command unblocks indexing only for the current restart.

Note: Although this command is designed with site failure in mind, you can also use it to restart indexing on a single-site cluster before the number of peers required by the replication factor is available. In that circumstance, however, it is usually better to wait until that number of peers has rejoined the cluster.
