Replace the manager node on the indexer cluster

You might need to replace the manager node for either of these reasons:

The node fails.
You must move the manager to a different machine or site.

Although there is currently no manager failover capability, you can prepare the indexer cluster for manager failure by configuring a stand-by manager that you can immediately bring up if the primary manager goes down. You can use the same method to replace the manager intentionally.

This topic describes the key steps in replacing the manager:

Back up the files that the replacement manager needs. This is a preparatory step that you must complete before the manager fails or otherwise leaves the system.
Ensure that the peer and search head nodes can find the new manager.
Replace the manager.

In the case of a multisite cluster, you must also prepare for the possible failure of the site that houses the manager. See Handle manager site failure.

Back up the files that the replacement manager needs

You must back up several files and directories so that you can later copy them to the replacement manager:

The manager's server.conf file, which is where the manager cluster settings are stored. You must back up this file whenever you change the manager's cluster configuration.

The manager's $SPLUNK_HOME/etc/master-apps directory, which is where common peer configurations are stored, as described in Update cluster peer configurations. You must back up this directory whenever you update the set of content that you push to the peer nodes.

The manager's $SPLUNK_HOME/var/run/splunk/cluster/remote-bundle/ directory, which contains the actual configuration bundles pushed to the peer nodes. You must back up this directory whenever you push new content to the peer nodes.

If the $SPLUNK_HOME/var/run/splunk/cluster/remote-bundle/ directory contains a large number of old bundles, you can optionally back up only the files associated with the active and previously active bundles. Look for the two files ending with .bundle_active and .bundle_previousActive. Each of those files has an associated directory and an associated file, both identified by the bundle id. In total, you must back up six items: four files and two directories.

For example, if the directory contains the file 42af6d880c6a1d43e935e8d8a0062089-1571637961.bundle_active, it also contains the file 42af6d880c6a1d43e935e8d8a0062089-1571637961.bundle and the directory 42af6d880c6a1d43e935e8d8a0062089-1571637961. To back up the active bundle, you must back up the two files and the directory. Similarly, to back up the previously active bundle, you must back up the file that ends with .bundle_previousActive, as well as the directory and the .bundle file that share its bundle id.
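The selection rule above can be sketched in Python. This is a minimal illustration, not a Splunk-provided tool: the function name bundle_entries_to_back_up is hypothetical, and the sketch assumes only the naming pattern described above (a marker file, a .bundle file, and a directory that share one bundle id).

```python
from pathlib import Path

# Marker suffixes described above for the active and previously active bundles.
MARKER_SUFFIXES = (".bundle_active", ".bundle_previousActive")


def bundle_entries_to_back_up(remote_bundle_dir):
    """Return the entry names tied to the active and previously active bundles.

    Hypothetical helper: for each marker file, it collects the marker itself,
    the matching <bundle id>.bundle file, and the <bundle id> directory.
    """
    names = {p.name for p in Path(remote_bundle_dir).iterdir()}
    selected = []
    for suffix in MARKER_SUFFIXES:
        for name in sorted(names):
            if not name.endswith(suffix):
                continue
            bundle_id = name[: -len(suffix)]
            # Keep only the entries that actually exist in the directory.
            for candidate in (name, bundle_id + ".bundle", bundle_id):
                if candidate in names:
                    selected.append(candidate)
    return selected
```

If the function returns fewer than six names, the directory does not contain the full set described above, and it is safer to back up the whole directory.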

In addition to the above files and directories, back up any other configuration files that you have customized on the manager, such as inputs.conf, web.conf, and so on.
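As a rough sketch, the backup of the three required items could be scripted as follows. The function name back_up_manager_files and the backup layout are hypothetical. The location of server.conf is passed in explicitly because it depends on where you made your cluster settings. Any other customized configuration files still need to be copied separately.

```python
import shutil
from pathlib import Path


def back_up_manager_files(splunk_home, server_conf, backup_root):
    """Copy server.conf, master-apps, and remote-bundle into backup_root.

    Hypothetical helper. The two directory paths follow the locations named
    above; server_conf is whichever server.conf file holds the manager's
    cluster settings on your system.
    """
    home = Path(splunk_home)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    # copy2 preserves file timestamps along with the content.
    shutil.copy2(server_conf, backup_root / "server.conf")

    directories = (
        home / "etc" / "master-apps",
        home / "var" / "run" / "splunk" / "cluster" / "remote-bundle",
    )
    for src in directories:
        # dirs_exist_ok lets a repeated backup refresh an earlier copy
        # (requires Python 3.8 or later).
        shutil.copytree(src, backup_root / src.name, dirs_exist_ok=True)
    return backup_root
```

A call might look like back_up_manager_files("/opt/splunk", "/opt/splunk/etc/system/local/server.conf", "/backups/manager"), where all three paths are illustrative assumptions. Note that a repeated backup into the same location adds and overwrites files but does not delete files that were removed from the source.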

In preparing a replacement manager, you must copy over only these files and directories. You do not copy or otherwise deal with the dynamic state of the cluster. The cluster peers as a group hold all information about the dynamic state of a cluster, such as the status of all bucket copies. They communicate this information to the manager node as necessary, for example, when a downed manager returns to the cluster or when a stand-by manager replaces a downed manager. The manager then uses that information to rebuild its map of the cluster's dynamic state.

Ensure that the peer and search head nodes can find the new manager

You can choose between two approaches for ensuring that the peer nodes and search heads can locate the replacement instance and recognize it as the manager:

The replacement uses the same IP address and management port as the primary manager. To ensure that the replacement uses the same IP address, you must employ DNS-based failover, a load balancer, or some other technique. The management port is set during installation, but you can change it by editing web.conf.

The replacement does not use the same IP address or management port as the primary manager. In this case, after you bring up the new manager, you must update the manager_uri setting on all the peers and search heads to point to the new manager's IP address and management port.
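For the second approach, the edit on each peer and search head amounts to changing one setting. The following Python sketch shows that change on the text of a server.conf file. It assumes that manager_uri lives in the [clustering] stanza. The function name set_manager_uri and the addresses in the usage note are hypothetical. The sketch only rewrites text; it does not distribute the file or restart anything.

```python
def set_manager_uri(conf_text, new_uri):
    """Return conf_text with manager_uri in the [clustering] stanza replaced.

    Hypothetical helper: a line-based rewrite that leaves every other line,
    including commented-out settings, untouched.
    """
    out = []
    in_clustering = False
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new stanza header starts; track whether it is [clustering].
            in_clustering = stripped == "[clustering]"
        elif in_clustering and stripped.split("=", 1)[0].strip() == "manager_uri":
            line = f"manager_uri = {new_uri}"
            replaced = True
        out.append(line)
    if not replaced:
        raise ValueError("no manager_uri setting found in a [clustering] stanza")
    return "\n".join(out) + "\n"
```

For example, set_manager_uri(text, "https://10.0.0.2:8089") returns the same configuration text with the manager address changed to the illustrative value 10.0.0.2 on port 8089. Raising an error when the setting is missing avoids silently leaving a node pointed at the old manager.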