Remove excess bucket copies from the indexer cluster

Excess bucket copies are copies that exceed the cluster's replication factor or search factor. For example, if the cluster has a replication factor of 3, each bucket should optimally have exactly three copies residing across the set of peer nodes. If one bucket has four copies, that bucket has one excess copy.
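
As a quick worked example of this definition, the excess for a bucket is the number of copies beyond the relevant factor. The following is a minimal sketch; the helper name excess_copies is hypothetical and is not part of any Splunk API.

```python
def excess_copies(copies: int, factor: int) -> int:
    """Return how many copies exceed the given factor (never negative)."""
    return max(0, copies - factor)


# Replication factor of 3 and four copies of the bucket: one excess copy.
print(excess_copies(4, 3))
# Exactly three copies: nothing is in excess.
print(excess_copies(3, 3))
```

The same calculation applies to searchable copies when you pass the search factor instead of the replication factor.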

Excess copies do not interfere with the operation of the cluster, but they are unnecessary and require extra disk space.

You can view and remove excess bucket copies from the manager node dashboard or from the CLI.

Caution: Before removing excess bucket copies, ensure that the cluster is in a complete state, with few, if any, pending bucket fixup jobs. If the cluster has a very large number of excess buckets, in the range of several million, the best practice is to put the cluster in maintenance mode and remove the excess buckets one index at a time.

How excess copies originate

Excess copies can result from peers leaving the cluster and then returning to it. When a peer goes down, the cluster initiates bucket fixing activities to compensate for any copies on that peer, because those copies are no longer available to the cluster. The goal of bucket fixing is to return the cluster to the complete state, where each bucket has a replication factor number of copies and a search factor number of searchable copies.

If the peer later returns to the indexer cluster, any bucket copies that the peer retained while down are once again available to the cluster. This can result in the cluster maintaining excess copies of some buckets, as described in the topic "What happens when a peer node comes back up."

In effect, a returning peer can cause the cluster to store more copies of some buckets than are needed to fulfill the replication factor and, possibly, the search factor as well. It can sometimes be useful to keep the extra copies around, as that topic explains, but you can save disk space by instead removing them.

Use the manager node dashboard

To view or remove excess bucket copies:

1. On the manager node, click Settings on the upper right side of Splunk Web.

2. In the Distributed Environment group, click Indexer clustering.

This takes you to the manager node dashboard.

3. Select the Indexes tab.

4. Click the Bucket Status button.

This takes you to the Bucket Status dashboard.

5. Select the Indexes with Excess Buckets tab.

This tab provides a list of indexes with excess bucket copies. It enumerates both buckets with excess copies and buckets with excess searchable copies. It also enumerates the total excess copies in each category. For example, if your index "new" has one bucket with three excess copies, one of which is searchable, and a second bucket with one excess copy, which is non-searchable, the row for "new" will report:

2 buckets with excess copies
1 bucket with excess searchable copies
4 total excess copies
1 total excess searchable copies
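
The arithmetic behind those four numbers can be sketched as follows. This is an illustration only: the function name summarize_excess and the tuple layout are assumptions made for the example, not how the dashboard is implemented.

```python
from typing import List, Tuple


def summarize_excess(buckets: List[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Summarize per-bucket excess counts for one index.

    Each tuple is (excess copies, excess searchable copies) for one bucket.
    Returns (buckets with excess copies, buckets with excess searchable
    copies, total excess copies, total excess searchable copies).
    """
    with_excess = sum(1 for total, _ in buckets if total > 0)
    with_searchable = sum(1 for _, searchable in buckets if searchable > 0)
    total_excess = sum(total for total, _ in buckets)
    total_searchable = sum(searchable for _, searchable in buckets)
    return with_excess, with_searchable, total_excess, total_searchable


# Index "new": one bucket with three excess copies (one of them searchable),
# and a second bucket with one non-searchable excess copy.
new_index = [(3, 1), (1, 0)]
print(summarize_excess(new_index))  # (2, 1, 4, 1)
```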

If you want to remove the excess copies for a single index, click the Remove button on the right side of the row for that index.

If you want to remove the excess copies for all indexes, click the Remove All Excess Buckets button.

Use the CLI

The Splunk CLI has two commands that help manage and remove excess bucket copies. You can run these commands either across the entire set of indexes or on just a single index.

Determine whether the cluster has extra copies

To find out how many buckets have extra copies, including extra searchable copies, run this command on the manager node:

splunk list excess-buckets [index-name]

The output from splunk list excess-buckets looks like this:

index=_audit
       Total number of buckets=4
       Number of buckets with excess replication copies=0
       Number of buckets with excess searchable copies=0
       Total number of excess replication copies across all buckets=0
       Total number of excess searchable copies across all buckets=0
index=_internal
       Total number of buckets=4
       Number of buckets with excess replication copies=0
       Number of buckets with excess searchable copies=0
       Total number of excess replication copies across all buckets=0
       Total number of excess searchable copies across all buckets=0
index=main
       Total number of buckets=5
       Number of buckets with excess replication copies=5
       Number of buckets with excess searchable copies=5
       Total number of excess replication copies across all buckets=10
       Total number of excess searchable copies across all buckets=5
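
If you want to process this output in a script, for example to decide which indexes need cleanup, a small parser is enough. The following is a minimal sketch that assumes the output keeps the label=value layout shown above; the function names are hypothetical, and the sample text is abridged from the example output.

```python
from typing import Dict, List

SAMPLE = """\
index=_audit
       Total number of buckets=4
       Number of buckets with excess replication copies=0
       Number of buckets with excess searchable copies=0
       Total number of excess replication copies across all buckets=0
       Total number of excess searchable copies across all buckets=0
index=main
       Total number of buckets=5
       Number of buckets with excess replication copies=5
       Number of buckets with excess searchable copies=5
       Total number of excess replication copies across all buckets=10
       Total number of excess searchable copies across all buckets=5
"""

REPLICATION = "Total number of excess replication copies across all buckets"
SEARCHABLE = "Total number of excess searchable copies across all buckets"


def parse_excess_buckets(text: str) -> Dict[str, Dict[str, int]]:
    """Parse the listing into {index name: {label: count}}."""
    report: Dict[str, Dict[str, int]] = {}
    current = None
    for raw in text.splitlines():
        # Split on the last "=" so the label text stays intact.
        key, sep, value = raw.strip().rpartition("=")
        if not sep:
            continue  # blank or unrecognized line
        if key == "index":
            current = value
            report[current] = {}
        elif current is not None and value.isdigit():
            report[current][key] = int(value)
    return report


def indexes_with_excess(report: Dict[str, Dict[str, int]]) -> List[str]:
    """Return the indexes that report any excess copies."""
    return [
        name
        for name, stats in report.items()
        if stats.get(REPLICATION, 0) > 0 or stats.get(SEARCHABLE, 0) > 0
    ]


print(indexes_with_excess(parse_excess_buckets(SAMPLE)))  # ['main']
```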

Remove extra bucket copies

To remove all extra bucket copies from the cluster (or from one index on the cluster), run this command on the manager node:

splunk remove excess-buckets [index-name]

The manager determines which peers to remove the extra copies from. This is not configurable, and the extras will not necessarily be removed from the peer that has most recently returned to the cluster.
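
For the large-cluster case described in the caution earlier in this topic, removing excess buckets one index at a time under maintenance mode can be sketched as a command plan. The example below only builds and prints the commands; it does not run them. It assumes the standard splunk enable maintenance-mode and splunk disable maintenance-mode CLI commands, which are not covered in this topic, so verify them against the documentation for your version. The function name and the index names are hypothetical.

```python
from typing import List


def removal_plan(indexes: List[str], maintenance_mode: bool = True) -> List[List[str]]:
    """Build the CLI commands to run on the manager node, in order."""
    plan: List[List[str]] = []
    if maintenance_mode:
        # Assumed command; the CLI might prompt for confirmation.
        plan.append(["splunk", "enable", "maintenance-mode"])
    for index in indexes:
        # One removal per index, matching: splunk remove excess-buckets [index-name]
        plan.append(["splunk", "remove", "excess-buckets", index])
    if maintenance_mode:
        plan.append(["splunk", "disable", "maintenance-mode"])
    return plan


# "main" and "web" are placeholder index names.
for command in removal_plan(["main", "web"]):
    print(" ".join(command))
```

Printing the plan first lets you review the order of operations before running anything against the cluster.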

Conflicting operations

You cannot run certain operations simultaneously:

Data rebalance
Excess bucket removal
Rolling restart
Rolling upgrade

If you trigger one of these operations while another one is already running, splunkd.log, the CLI, and Splunk Web all surface an error to the effect that a conflicting operation is in progress.
