Anomalous bucket issues
Anomalous buckets are buckets that remain in the fixup state indefinitely, without making any progress. Such buckets can indicate or cause a larger problem with your system. An anomalous bucket, for example, can prevent the cluster from meeting its replication and search factors.
The Bucket Status dashboard lets you identify anomalous buckets. It also lets you take actions that can often fix those buckets. Specifically, you can:
- Get details on a bucket.
- Roll the bucket from hot to warm.
- Resync the state of a bucket copy between a peer and the manager.
- Delete a copy of the bucket on a single peer, or delete all copies of the bucket across all peers.
Consult with Splunk Support before performing these actions on a bucket. Some of the actions, performed without full understanding, can lead to further problems with your system or even to irreversible data loss.
Identify anomalous buckets
To identify anomalous buckets and to take action on them, use the Bucket Status dashboard.
- From the manager node dashboard, go to the Bucket Status dashboard. See View the bucket status dashboard.
- Click the Fixup Tasks - Pending tab.
You can filter the list of pending buckets by fixup type and by the amount of time that they have been waiting for fixup. If a bucket has been waiting an unusually long time for fixup, it might be the cause of problems.
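To illustrate the kind of filtering the dashboard performs, here is a minimal sketch in Python. The records, field names, and threshold are all hypothetical, invented for this example; they are not a Splunk API or data format.

```python
from datetime import timedelta

# Hypothetical records standing in for rows of the Fixup Tasks - Pending tab.
pending_fixups = [
    {"bucket": "main~42~GUID-A", "fixup_type": "replication", "waiting_sec": 90},
    {"bucket": "main~43~GUID-B", "fixup_type": "search", "waiting_sec": 7200},
    {"bucket": "main~44~GUID-C", "fixup_type": "replication", "waiting_sec": 86400},
]

def anomalous(tasks, fixup_type=None, min_wait=timedelta(hours=1)):
    """Return tasks of the given fixup type that have waited at least min_wait."""
    return [
        t for t in tasks
        if (fixup_type is None or t["fixup_type"] == fixup_type)
        and t["waiting_sec"] >= min_wait.total_seconds()
    ]

# Buckets stuck in replication fixup for more than an hour:
stuck = anomalous(pending_fixups, fixup_type="replication")
print([t["bucket"] for t in stuck])  # → ['main~44~GUID-C']
```

What counts as "an unusually long time" depends on your cluster; the one-hour threshold here is only a placeholder.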
Take action on an anomalous bucket
For buckets that have been stuck in fixup for long periods of time, you can take remedial action.
- Click Action for the bucket that you want to manage.
- Select one of the available actions:
- View bucket details
- Roll
- Resync
- Delete Copy
Use the following sequence when performing actions on an anomalous bucket:
- View bucket details
- Roll
- Resync
- Delete Copy
Only perform the next action if the previous one does not resolve the issue.
View bucket details
This action opens a pop-up window that provides details on the bucket, such as:
- The bucket size
- Whether it is frozen
- Whether it has been force-rolled
- Whether it is a standalone bucket
- The peers on which it resides
These details can help you narrow down the cause of the bucket problem and determine which action to take to remediate it.
Roll
This action rolls the bucket from the hot state to the warm state. It has an effect only on hot buckets.
Resync
The manager holds information about each copy of a bucket. However, in some cases, the manager can have incorrect information about the copy on a particular peer. This condition can occur when communication problems arise between the manager and a peer.
Here are some examples of bucket copy state information that can be out of sync between peer and manager:
- Whether the copy is searchable
- Whether the copy is hot or warm
- Whether the copy is primary
- Whether the copy exists on that peer
The peer knows the state of its bucket copies, so if the peer and the manager have different state information for a bucket copy, the information on the manager is incorrect.
To resolve this problem, resync the bucket copy's state on the manager. When you resync a bucket, you specify the peer with the copy that you need to resync. The resync process causes the peer to send the manager its current information about the bucket copy.
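Because the peer's state is authoritative, the resync can be thought of as overwriting the manager's record for that copy with the peer's. The following sketch is purely illustrative (the dictionaries and field names are invented for this example and are not Splunk internals):

```python
# Manager's possibly stale view of one bucket copy on a peer (hypothetical).
manager_view = {"searchable": True, "state": "hot", "primary": True, "exists": True}

# The peer's authoritative view of the same copy (hypothetical).
peer_view = {"searchable": False, "state": "warm", "primary": False, "exists": True}

def resync(manager_record, peer_record):
    """Replace the manager's record with the peer's authoritative state.

    Returns the fields that were out of sync, which could be logged.
    """
    stale = {k: v for k, v in manager_record.items() if peer_record.get(k) != v}
    manager_record.update(peer_record)
    return stale

out_of_sync = resync(manager_view, peer_view)
print(sorted(out_of_sync))  # → ['primary', 'searchable', 'state']
```

In this example, three of the four fields were out of sync; after the resync, the manager's record matches the peer's.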
Delete copy
You can delete either a single copy of a bucket on a specific peer, or all copies of the bucket across the entire cluster.
If deleting a single copy causes the bucket to fall below the replication factor or the search factor, the cluster engages in fixup activities so that the bucket again meets both factors. This situation might result in another copy of the bucket appearing on the same peer. If, however, the specified bucket is frozen, the cluster does not attempt any fixup activities.
Performing the delete action on all copies of a bucket across the cluster results in irreversible data loss.
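Conceptually, the fixup decision after a delete compares the surviving copy counts against the replication and search factors. This is a hedged sketch of that comparison only, with invented counts and assumed factor values (RF=3, SF=2); it is not the cluster's actual fixup algorithm:

```python
def fixup_needed(total_copies, searchable_copies,
                 replication_factor=3, search_factor=2, frozen=False):
    """Decide whether a bucket has fallen below the replication or search factor.

    Frozen buckets never trigger fixup. Factor values are assumed defaults
    for this example.
    """
    if frozen:
        return False
    return (total_copies < replication_factor
            or searchable_copies < search_factor)

# Deleting one of three copies (two searchable) with RF=3, SF=2:
print(fixup_needed(total_copies=2, searchable_copies=2))  # → True (below RF)
# The same deletion on a frozen bucket triggers no fixup:
print(fixup_needed(total_copies=2, searchable_copies=2, frozen=True))  # → False
```

This also shows why deleting all copies is irreversible: with no surviving copy anywhere in the cluster, there is nothing left for fixup to replicate from.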
This documentation applies to the following versions of Splunk® Enterprise: 8.1.0, 8.1.1, 8.1.2