Use maintenance mode
Maintenance mode halts most bucket fixup activity and prevents frequent rolling of hot buckets. It is useful when performing peer upgrades and other maintenance activities on an indexer cluster. Because it halts critical bucket fixup activity, use maintenance mode only when necessary.
Why use maintenance mode
Certain conditions can generate errors during hot bucket replication and cause the source peer to roll the bucket. While this behavior is generally beneficial to the health of the indexer cluster, it can result in many small buckets across the cluster if errors occur frequently. Situations that can generate an unacceptable number of small buckets include persistent network problems and repeated offlining of peers.
To stop this behavior, you can temporarily put the cluster into maintenance mode. This can be useful for system maintenance work that generates repeated network errors, such as network reconfiguration. Similarly, if you need to upgrade your peers or otherwise temporarily offline several peers, you can invoke maintenance mode to forestall bucket rolling during that time.
Note: The CLI commands splunk apply cluster-bundle and splunk rolling-restart incorporate maintenance mode functionality into their behavior by default, so you do not need to invoke maintenance mode explicitly when you run those commands. A message stating that maintenance mode is running appears on the manager node dashboard.
The effect of maintenance mode on cluster operation
To prevent buckets from rolling unnecessarily, maintenance mode halts most bucket fixup activity. The only bucket fixup that occurs during maintenance mode is primary fixup: the manager node attempts, when necessary, to reassign primaries to available searchable bucket copies.
In particular, the cluster does not perform fixup that entails replicating buckets or converting buckets from non-searchable to searchable. This means that the manager node does not enforce replication factor or search factor policy during maintenance mode. Therefore, if the cluster loses a peer node during maintenance mode, it can end up operating in a valid but incomplete state. See Indexer cluster states to understand the implications of this.
Similarly, if the cluster loses peer nodes in numbers equal to or greater than the replication factor, it also loses its valid state for the duration of maintenance mode.
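The two cases described above can be summarized in a small helper. This is only an illustrative sketch: state_after_peer_loss is a hypothetical function, not a Splunk command, and it encodes nothing beyond the rule stated in this topic (losing at least one peer can leave the cluster valid but incomplete; losing a number of peers equal to or greater than the replication factor costs the cluster its valid state).

```shell
# Hypothetical helper, not part of the Splunk CLI.
# $1 = number of peers lost during maintenance mode
# $2 = replication factor configured for the cluster
state_after_peer_loss() {
    lost=$1
    replication_factor=$2
    if [ "$lost" -le 0 ]; then
        # No peers lost: maintenance mode alone does not change the state.
        echo "unchanged"
    elif [ "$lost" -ge "$replication_factor" ]; then
        # Peers lost >= replication factor: the cluster loses its valid state.
        echo "not valid"
    else
        # Fewer peers lost than the replication factor.
        echo "possibly valid but incomplete"
    fi
}
```

For example, state_after_peer_loss 1 3 prints "possibly valid but incomplete", while state_after_peer_loss 3 3 prints "not valid".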
In addition, if the cluster loses even a single peer node while in maintenance mode, it can potentially return incomplete results for searches running during the subsequent period of primary fixup. This period is usually short, often just a few seconds, but even a short period of primary fixup can affect in-progress searches.
Maintenance mode works the same for single-site and multisite clusters. It has no notion of sites.
Enable maintenance mode
Put the cluster into maintenance mode before starting maintenance activity. Once you have finished with maintenance, you should disable maintenance mode.
To invoke maintenance mode, run this CLI command on the manager node:
splunk enable maintenance-mode
When you run the enable command, a message warning of the effects of maintenance mode appears and requires confirmation that you want to continue.
Effective with version 6.6, maintenance mode persists across manager node restarts.
Disable maintenance mode
To return to the standard bucket-rolling behavior, run:
splunk disable maintenance-mode
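Because maintenance mode should stay on only as long as necessary, it can help to wrap the enable and disable commands around the maintenance task so that the disable step is not forgotten. The following is a minimal sketch, not an official procedure. The function name with_maintenance_mode and the SPLUNK_BIN variable are hypothetical, and the sketch assumes that you run it interactively on the manager node, since the enable command asks for confirmation.

```shell
# Hypothetical wrapper: run a maintenance task with maintenance mode enabled.
# SPLUNK_BIN is an assumed variable naming the splunk executable;
# it defaults to "splunk" on the PATH.
with_maintenance_mode() {
    # Enable maintenance mode; stop if this fails or is declined.
    "${SPLUNK_BIN:-splunk}" enable maintenance-mode || return 1

    # Run the maintenance task passed as arguments and remember its exit status.
    status=0
    "$@" || status=$?

    # Disable maintenance mode even if the task failed.
    "${SPLUNK_BIN:-splunk}" disable maintenance-mode || return 1

    return "$status"
}
```

A call such as with_maintenance_mode ./reconfigure_network.sh (a placeholder script name) runs the task between the two commands and returns the exit status of the task.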
Determine maintenance mode status
To determine whether maintenance mode is on, run:
splunk show maintenance-mode
A returned value of 1 indicates that maintenance mode is on. A returned value of 0 indicates that maintenance mode is off.
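If you check the status from a script, you might translate the returned value into a readable state. The sketch below is one possible approach. maintenance_mode_state is a hypothetical helper, and because this topic does not show the exact layout of the command output, the helper assumes that the first 0 or 1 character in the output is the maintenance mode value. Verify that assumption against the output on your own manager node before relying on it.

```shell
# Hypothetical helper: interpret the output of "splunk show maintenance-mode".
# $1 = the text returned by the command
# Assumption: the first 0 or 1 character in the text is the status value.
maintenance_mode_state() {
    # Keep only 0 and 1 characters, then take the first one.
    flag=$(printf '%s' "$1" | tr -cd '01' | cut -c1)
    case "$flag" in
        1) echo "on" ;;
        0) echo "off" ;;
        *) echo "unknown" ;;
    esac
}
```

You might call it as maintenance_mode_state "$(splunk show maintenance-mode)" on the manager node.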
This documentation applies to the following versions of Splunk® Enterprise: 8.1.0, 8.1.1, 8.1.2