Rebalance the indexer cluster
By rebalancing the indexer cluster, you balance the distribution of bucket copies across the set of peer nodes. A balanced set of bucket copies optimizes each peer's search load and, in the case of data rebalancing, each peer's disk storage.
Types of indexer cluster rebalancing
There are three types of indexer cluster rebalancing:
- Primary rebalancing
- Data rebalancing
- Usage-based data rebalancing
Primary rebalancing
The goal of primary rebalancing is to balance the search load across the peer nodes.
Primary rebalancing redistributes the primary bucket copies across the set of peer nodes. It attempts, to the degree possible, to ensure that each peer has approximately the same number of primary copies.
Primary rebalancing simply reassigns primary markers across the set of existing searchable copies. It does not move searchable copies to different peer nodes. Because of this limitation, primary rebalancing is unlikely to achieve a perfect balance of primaries.
Because primary rebalancing only reassigns markers and does not cause any bucket copies to move between peers, it completes quickly.
See Rebalance indexer cluster primary bucket copies.
Data rebalancing
The goal of data rebalancing is to balance the storage distribution across the peer nodes.
Data rebalancing redistributes bucket copies so that each peer has approximately the same number of copies. It balances searchable, non-searchable, and primary copies.
During data rebalancing, the cluster moves bucket copies from peers with more copies to peers with fewer copies. Since this type of rebalancing includes searchable copies, it overcomes the limitation inherent in primary rebalancing and achieves a significantly better balance of primaries.
Because data rebalancing involves significant fixup activity, such as moving bucket copies between peers, it can be a slow and lengthy process.
See Rebalance indexer cluster data.
Usage-based data rebalancing
The goal of usage-based data rebalancing is to balance the search load across the peer nodes. Unlike primary rebalancing, usage-based data rebalancing uses actual bucket-level search usage statistics that the peer nodes report to the cluster manager. This approach typically improves search performance more than primary rebalancing, which merely balances the primary buckets without attempting to determine which buckets are most active.
Usage-based data rebalancing also results in a more even storage distribution than primary rebalancing, because it moves the buckets themselves across the peer nodes within each site (as non-usage-based data rebalancing does), rather than just redistributing the primary markers. Like non-usage-based data rebalancing, usage-based data rebalancing can be a slow process.
Primary rebalancing is a quick and easy way to achieve a better search load balance and might be sufficient for your needs. If, however, primary rebalancing does not provide sufficient search performance improvement and you need a deeper, more comprehensive approach to search-aware rebalancing, try usage-based rebalancing.
See Rebalance indexer cluster data based on search usage.
Rebalance indexer cluster primary bucket copies
When you start or restart a manager or peer node, the manager rebalances the set of primary bucket copies across the peers in an attempt to spread the primary copies as equitably as possible. Ideally, if you have four peers and 300 buckets, each peer would hold 75 primary copies. The purpose of primary rebalancing is to even the search load across the set of peers.
How primary rebalancing works
To achieve primary rebalancing, the manager reassigns the primary state from existing bucket copies to searchable copies of the same buckets on other peers, as necessary. This rebalancing is a best-effort attempt; there is no guarantee that full, perfect rebalance will result.
Primary rebalancing occurs automatically whenever a peer or manager node joins or rejoins the cluster. In the case of a rolling restart, rebalancing occurs once, at the end of the process.
Even though primary rebalancing occurs when a new peer joins the cluster, that peer won't participate in the rebalancing, because it does not yet have any bucket copies. The rebalancing takes place among any existing peers that have searchable bucket copies.
When performing primary rebalancing on a bucket, the manager simply reassigns the primary state from one searchable copy to another searchable copy of the same bucket, if there is one, and if, by doing so, the balance of primaries across peers will improve. It does not cause peers to stream bucket copies, and it does not cause peers to make unsearchable copies searchable. If an existing peer does not have any searchable copies, it will not gain any primaries during rebalancing.
Initiate primary rebalancing manually
If you want to initiate the primary rebalancing process manually, you can either restart a peer or hit the /services/cluster/manager/control/control/rebalance_primaries REST endpoint on the manager node. For example, run this command on the manager node:
curl -k -u admin:pass --request POST https://localhost:8089/services/cluster/manager/control/control/rebalance_primaries
For more information, see cluster/manager/control/control/rebalance_primaries in the REST API Reference Manual.
Rebalance primaries on a multisite cluster
There are a few differences in how primary rebalancing works in a multisite cluster. In a multisite cluster, multiple sites typically have full sets of primary copies. When you rebalance the cluster, the rebalancing occurs independently for each site. For example, in a two-site cluster, the cluster separately rebalances the primaries in site1 and site2. It does not shift primaries between the two sites.
The start or restart of a peer on any site triggers primary rebalancing on all sites. For example, if you restart a peer on site1 in a two-site cluster, rebalancing occurs on both site1 and site2.
View the number of primaries on a peer
To gain insight into the primary load for any peer, you can use the cluster/manager/peers endpoint to view the number of primaries that the peer currently holds. The primary_count field shows the number of primaries that the peer holds for its local site. The primary_count_remote field shows the number of primaries that the peer holds for non-local sites, including site0.
By using this endpoint on all your peers, you can determine whether the cluster could benefit from primary rebalancing.
See cluster/manager/peers in the REST API Reference Manual.
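For example, the following query is a minimal sketch of how you might retrieve these counts from the manager node, assuming admin credentials and the default management port:
curl -k -u admin:pass "https://localhost:8089/services/cluster/manager/peers?output_mode=json"
In the response, look for the primary_count and primary_count_remote fields in each peer's entry.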
Summary of indexer cluster primary rebalancing
Primary rebalancing is the rebalancing of the primary assignments across existing searchable copies in the cluster.
Primary rebalancing occurs under these circumstances:
- A peer joins or rejoins a cluster
- At the end of a rolling restart
- A manager rejoins the cluster
- You manually invoke the rebalance_primaries REST endpoint on the manager node
Rebalance indexer cluster data
To rebalance indexer cluster data, you rebalance the set of bucket copies so that each peer holds approximately the same number of copies. This helps ensure that each peer has a similar storage distribution.
The problem of imbalanced data
Imbalanced data increases the likelihood that one or more peers will run out of disk space and thus transition to the detention state. In the detention state, the peer no longer indexes new data, and so forwarders can no longer send data to that peer. When that happens, a load-balanced forwarder shifts its incoming data to other peers, increasing the indexing burden on those peers. In the worst-case scenario, if the forwarder is not configured for load-balancing, its data gets lost.
In addition, as its existing data ages, the peer in detention will contain relatively older data compared to other peers. Since most searches focus on newer data, this means that the peer's data will typically get searched less frequently, shifting the burden of the search load onto the peers that are not in detention.
Aside from detention considerations, imbalanced data can affect peer utilization during searches. If some peer nodes hold more bucket copies for a particular index compared to other peer nodes, they will have a greater part of the search workload for searches on that index. For this reason, data rebalancing is index-aware. When rebalancing completes, each peer will have approximately the same number of bucket copies for each index.
Conditions that cause imbalanced data
A number of factors can cause an imbalance of bucket copies. These include the following:
- Newly added peer nodes. When you add a new peer, it initially has no bucket copies. Through data rebalancing, you can move copies onto that peer from other peers.
- Uneven forwarding of data. If the forwarders are sending more data to some peer nodes, it is likely that those peers will hold more bucket copies. Rebalancing provides a way to move copies from those peers to peers with fewer copies.
What data rebalancing accomplishes
Data rebalancing attempts to achieve an equitable distribution of bucket copies across the set of peer nodes. It factors in several bucket characteristics:
- Data rebalancing balances both non-searchable and searchable bucket copies.
- Data rebalancing balances for each index, so that, in addition to each peer holding approximately the same number of bucket copies in total, each peer will hold approximately the same number of bucket copies for each index. See Data rebalancing and indexes.
- Data rebalancing operates on warm and cold buckets only. It does not rebalance hot buckets.
- Data rebalancing operates only on buckets that meet their replication and search factors.
Data rebalancing attempts to achieve a best-effort balance, not a perfect balance. See Configure the data rebalancing threshold.
Data rebalancing and storage utilization
Data rebalancing has limitations regarding its effect on storage utilization.
Data rebalancing balances the number of bucket copies, not the actual data storage. In addition, data rebalancing attempts to achieve a practical balance, not a perfect balance. In most cases, the process achieves an optimal approximation of balanced storage, as determined by several criteria:
- The process assumes that all peers have the same amount of disk storage available for indexes. It is a best practice to use homogeneous instances for your peer nodes.
- The process assumes that all buckets are the same size, although bucket size does sometimes vary.
- The process stops when the number of copies on each peer is within a small range of a perfect balance. It does not ordinarily attempt to put precisely the same number of copies on each peer. See Configure the data rebalancing threshold.
Make data rebalance search-safe
You can run data rebalance in searchable mode to avoid search downtime. In searchable mode, the rebalance operation is search-safe, depending on limits controllable through timeout settings.
Do not use searchable data rebalance with SmartStore indexes. Searchable mode is not optimized for SmartStore and can cause data rebalance to proceed slowly. Use non-searchable data rebalance instead.
In any case, non-searchable data rebalance of SmartStore indexes usually causes only minimal search disruption. The data rebalance process runs quickly on SmartStore indexes, because it moves only bucket metadata, not the bucket data itself.
The default mode for data rebalance is non-searchable, which means that the operation is not search-safe. Because, as part of its operation, data rebalancing removes bucket copies from their old peers after streaming copies to new peers, the data rebalancing operation might remove bucket copies needed by a search while the search is in progress. Therefore, search results are not guaranteed to be complete during non-searchable data rebalancing.
With searchable data rebalance, the operation waits until in-progress searches are complete, up to a configurable time limit, before removing the old bucket copies that pertain to the search's generation. New searches with a newer generation can simultaneously perform searches against new bucket copies.
Data rebalance can take longer to complete in searchable mode. Searchable data rebalance also requires more storage space because excess buckets are removed in batches, rather than immediately, as is the case with non-searchable data rebalance.
As a best practice, perform an excess bucket removal as a separate operation before running a searchable data rebalance. See Remove excess bucket copies from the indexer cluster. By performing the excess bucket removal separately, the removal process runs as a non-searchable operation, which is significantly faster than the searchable removal operation that otherwise occurs at the start of a searchable data rebalance. The performance improvement is particularly noticeable for larger numbers of excess buckets, in the thousand bucket range and beyond.
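As a sketch, assuming the CLI command documented in the excess bucket removal topic, you run the removal on the manager node, optionally scoped to a single index:
splunk remove excess-buckets [index_name]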
You can specify the searchable mode on a per-operation basis when you initiate a data rebalance operation, either through the CLI, as described in Initiate data rebalancing, or through the manager node dashboard, as described in Use the manager dashboard to initiate and configure rebalancing.
You can also specify the searchable mode for all data rebalancing operations with the searchable_rebalance setting in server.conf on the manager node. If you set searchable_rebalance = true, you do not need to specify the searchable mode each time that you initiate a data rebalance.
If you find that some of your searches are not completing during data rebalance, you can change the rebalance_search_completion_timeout setting in server.conf to increase the maximum time that the data rebalance operation waits for in-progress searches to complete. Its default value is 180 seconds. Searches that take longer to complete than the rebalance_search_completion_timeout value are not retried.
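As an illustrative sketch only, assuming both settings belong in the [clustering] stanza of the manager node's server.conf (verify the stanza placement against the server.conf specification for your version), the configuration might look like this:
[clustering]
searchable_rebalance = true
rebalance_search_completion_timeout = 600
Changes to server.conf on the manager node typically require a restart of the manager to take effect.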
Other settings in server.conf allow you to control additional aspects of the searchable data rebalance operation when necessary. Modify those settings only in consultation with Splunk Support.
Searchable data rebalance requires that the manager node and all peer nodes run Splunk Enterprise version 7.3.0 or higher.
How data rebalancing works
The manager node controls the data rebalancing process. To achieve the goal of equitable distribution of bucket copies across all peer nodes, the manager moves bucket copies from peers with an above-average number of copies to peers with a below-average number of copies. It continues this process until the cluster is balanced, with each peer holding approximately the same number of bucket copies.
Note: When you initiate a data rebalance operation, the cluster first runs an excess buckets removal operation to remove any bucket copies that exceed the cluster's replication or search factors. See Remove excess bucket copies from the indexer cluster. As described below, the cluster also removes excess buckets during the data rebalance operation, to clean up the extra buckets created during the operation itself.
To achieve rebalancing, the cluster uses the fundamental processes of bucket fixup. It streams bucket copies from one peer to another. The process of "moving" a bucket copy to another peer thus involves streaming the copy from a peer with an above-average number of bucket copies to a peer with a below-average number of bucket copies.
The cluster initiates rebalancing on buckets sequentially, one after another, until rebalancing is complete. It does not wait for one bucket to finish before starting rebalancing on the next, so there will typically be a number of buckets simultaneously undergoing rebalancing. See Control rebalancing load.
In the case of non-searchable data rebalance, once the streaming of a bucket to another peer is complete, the cluster handles the excess bucket copy that now exists by immediately removing a copy of the bucket from a peer with an above-average number of bucket copies overall. The cluster performs primary rebalancing at the end of the data rebalancing process.
In the case of searchable data rebalance, the cluster instead removes excess copies in batches, after waiting for any active searches to complete. By batching excess copy removal and waiting until active searches are complete, the cluster is able to minimize the possibility that a bucket for an in-progress search is removed while the search is still running. In addition, to accommodate both in-progress searches and new searches, the cluster also swaps primaries before removing the batch of excess copies. This action results in a new generation. In-progress searches continue to use the old generation, while new searches use the new generation. In this way, the rebalance operation is search-safe throughout the process no matter when a search begins.
The rebalancing process can terminate prematurely, either due to manual intervention or because it times out. Termination conditions are discussed elsewhere in this topic.
Data rebalancing and indexes
The rebalancing process balances the bucket copies by index. When rebalancing completes, each peer holds approximately the same number of bucket copies in total, as well as the same number of copies split by index.
For example, say you have a cluster with four peers and two indexes, index1 and index2. Index1 has 100 bucket copies distributed across all peers. Index2 has 300 copies distributed across all peers, for a total of 400 copies of all buckets across all peers.
Before rebalancing, the bucket distribution across the set of peers might look like this:
- Peer1: 110 total
- Index1: 10
- Index2: 100
- Peer2: 100 total
- Index1: 50
- Index2: 50
- Peer3: 50 total
- Index1: 20
- Index2: 30
- Peer4: 140 total
- Index1: 20
- Index2: 120
After rebalancing, the bucket distribution will look approximately like this:
- Peer1: 100 total
- Index1: 25
- Index2: 75
- Peer2: 100 total
- Index1: 25
- Index2: 75
- Peer3: 100 total
- Index1: 25
- Index2: 75
- Peer4: 100 total
- Index1: 25
- Index2: 75
Data rebalancing in multisite indexer clusters
Multisite data rebalancing operates in essentially the same way as single-site rebalancing. However, in multisite clusters, the manager first balances each bucket across sites to the degree that the site configuration allows. It then balances the bucket within each site.
The degree to which you can balance a multisite cluster across sites depends on the site replication and search factors. For example, if you have a site replication factor of origin:2,total:3, the cluster maintains two-thirds of the copies on their origin site. If one site originates more buckets than another, rebalancing cannot address the resulting imbalance without violating the site replication factor, and so the imbalance will remain. Similarly, the cluster will not rebalance in a way that violates explicit site requirements. Intersite balancing does, however, balance non-explicit copies across sites.
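For reference, a site replication factor like the one in this example is defined in server.conf on the manager node. The following is an illustrative sketch only, not a recommended configuration:
[clustering]
multisite = true
site_replication_factor = origin:2,total:3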
Initiate data rebalancing
You can rebalance the data for all indexes or for just a single index. In addition, you can set a timeout for the rebalancing.
To rebalance the data, run this CLI command on the manager node:
splunk rebalance cluster-data -action start [-searchable true] [-index index_name] [-max_runtime interval_in_minutes]
Note the following:
- Set the optional -searchable parameter to true to enable search-safe data rebalance. Data rebalance can take longer to complete in searchable mode. The default for this parameter is false (non-searchable data rebalance).
- Use the optional -index parameter to rebalance just a single index. Otherwise, the command rebalances all indexes.
- Use the optional -max_runtime parameter to limit the rebalance activity to a specified number of minutes. The rebalancing stops automatically if the timeout limit is reached, even if there are still more buckets to process. For details on what happens when data rebalancing stops prematurely, see Stop data rebalancing.
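For example, this command is a sketch of a searchable rebalance limited to a single, hypothetical index named main and capped at 60 minutes:
splunk rebalance cluster-data -action start -searchable true -index main -max_runtime 60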
You can also initiate rebalancing from the manager node dashboard. See Use the manager dashboard to initiate and configure rebalancing.
It is a best practice to perform data rebalancing during a maintenance window for these reasons:
- Data rebalancing can cause primary bucket copies to move to new peers, so search results are not guaranteed to be complete while data rebalancing continues.
- The fixup activities associated with data rebalancing have a low priority compared to other bucket fixup activities, such as maintaining the replication and search factors, so rebalancing will wait while other fixup activity completes.
Stop data rebalancing
To stop data rebalancing prematurely, run this CLI command on the manager node:
splunk rebalance cluster-data -action stop
For any bucket in the midst of rebalancing when you run this command, the cluster finishes the current process. It does not initiate any additional processing on the bucket, however. For example, if the cluster is in the midst of copying a non-searchable copy of a bucket, it finishes making the copy. It does not check whether the bucket balance can improve by also processing a searchable copy. It also does not remove any excess copies of the bucket.
View status of data rebalancing
To see whether data rebalancing is running, run this CLI command on the manager node:
splunk rebalance cluster-data -action status
You can also use the manager dashboard to view rebalancing status. See Use the manager dashboard to initiate and configure rebalancing.
Configure the data rebalancing threshold
The manager attempts to achieve a reasonable, but not perfect, balance, in which the resulting number of copies on each peer lies within a narrow range to either side of the average number of copies for all peers.
This balance is configurable through the rebalance_threshold attribute in the manager's server.conf. You can adjust the setting either directly in server.conf or by means of the CLI. For example:
splunk edit cluster-config -mode manager -rebalance_threshold 0.95 -auth admin:your_password
You can also configure the rebalancing threshold through the manager node dashboard. See Use the manager dashboard to initiate and configure rebalancing.
A rebalance_threshold value of 1.00 means that rebalancing will continue until the cluster is fully balanced, with each peer having the same number of copies. The default value is 0.90, which means that rebalancing will continue until each peer is within 90% of a perfect balance.
With the default setting of 0.90, rebalancing continues until the number of copies on each peer lies within a range of .90 to 1.10 of the average. For example, if you have three peers holding between them a total of 300 copies, meaning that there is an average of 100 copies per peer, the rebalancing process stops when every peer holds between 90 and 110 copies.
If you decide instead that a 95% balance is preferable, you can set rebalance_threshold to 0.95. The manager will then perform any necessary rebalancing until the number of copies on each peer lies within a range of .95 to 1.05 of the average.
The cluster considers each index individually against the threshold. That is, the goal of the rebalancing is to ensure that each index is balanced to the tolerance set by the rebalance_threshold attribute.
Use the manager dashboard to initiate and configure rebalancing
You can initiate and configure rebalancing through the manager node dashboard. See Configure the manager node with the dashboard.
- Click the Edit button on the upper right side of the dashboard.
- Select the Data Rebalance option.
A pop-up window appears with several fields. - Fill out the fields as necessary:
- Threshold. Change the rebalancing threshold.
- Max Runtime. Stop the rebalancing process after a set period of time. If you leave this field empty, rebalancing continues until all peers are within the threshold limit.
- Index. Run rebalancing on a single index or across all indexes.
- Searchable. Enable search-safe data rebalance. Data rebalance can take longer to complete in searchable mode.
- To start rebalancing, click the Start button.
The window also provides rebalance status information.
Control rebalancing load
You can configure the rebalancing load to minimize the effect that the consequent fixup activity has on peer indexing and search performance.
The maximum number of buckets that can be rebalancing at a time is subject to the same attributes that determine peer load for any fixup activity: max_peer_rep_load and max_peer_build_load in the [clustering] stanza of server.conf on the manager node.
Data rebalancing uses the value of these attributes reduced by 1, if the attribute value is greater than 1. For example, if you set max_peer_rep_load to 4, then the peer can participate in a maximum of three (not four) concurrent rebalancing replications as a target.
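As a sketch only, with illustrative values rather than recommendations, these attributes appear in server.conf on the manager node like this:
[clustering]
max_peer_rep_load = 4
max_peer_build_load = 2
With max_peer_rep_load set to 4, data rebalancing limits each peer to three concurrent rebalancing replications as a target, per the reduced-by-1 behavior described above.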
Conflicting operations
You cannot run certain operations simultaneously:
- Data rebalance
- Excess bucket removal
- Rolling restart
- Rolling upgrade
If you trigger one of these operations while another one is already running, splunkd.log, the CLI, and Splunk Web each surface an error that indicates that a conflicting operation is in progress.
Rebalance indexer cluster data based on search usage
By rebalancing buckets based on search usage data, you can balance the search load more effectively. The rebalancing process moves heavily searched buckets across a site's peer nodes where necessary to achieve a more balanced distribution of those buckets.
How usage-based data rebalancing works
Usage-based data rebalancing is initiated manually. You can run the rebalancing periodically to improve distribution of heavily searched buckets. To determine whether sites in your indexer cluster might benefit from rebalancing, you can first run a command that reports standard deviation for search usage among peers on each site.
Turn the usage-based rebalancing capability on and off
By default, each peer node sends search usage information to the cluster manager once every minute. You can change the frequency of notification, or even disable usage notifications altogether, with the notify_buckets_usage_period setting in indexes.conf on the peer node. However, you should change this setting only with guidance from Splunk personnel.
The process of collecting notifications has minimal performance impact on either the peer nodes or the cluster manager, so the best practice is to leave the setting alone, in its default state.
Determine whether a site has a usage-based bucket imbalance
To determine whether a site has a usage-based bucket imbalance, run the following command on the cluster manager:
splunk rebalance cluster-usage -action status
The information returned will look similar to this:
Current Usage Rebalance:
    Phase: NotStarted
Realtime STDDEV of bucket usage per site:
    site1 : 210297.74
    site2 : 57067.12
The STDDEV (standard deviation) value indicates the level of imbalance on a site. This is a relative number that depends on the scale of the cluster and on its bucket usage patterns. A seemingly high value does not necessarily indicate an issue on its own, but a value that is high relative to comparable sites does. In this example, site1 has worse usage balancing than site2, assuming the two sites are of similar scale.
As a rule of thumb, if the STDDEV value increases significantly compared to its previous status, it's time for a bucket usage rebalance.
In the case of a single site cluster, the site id will be "default".
Initiate usage-based rebalancing
To initiate usage-based rebalancing, run the following command on the cluster manager:
splunk rebalance cluster-usage -action start
Stop usage-based rebalancing
To stop usage-based rebalancing before the process completes, run the following command on the cluster manager:
splunk rebalance cluster-usage -action stop
Check status during and after usage-based rebalancing
The status command shown earlier also allows you to check on the progress of rebalancing, or its final effect once the rebalancing completes. For example, this usage shows a status check during the process of rebalancing:
splunk rebalance cluster-usage -action status

Current Usage Rebalance:
    Phase: BucketMoving
    Progress (%): 70.53
    Successful Bucket Moves: 58
    Failed Bucket Moves: 0   <====== A non-zero value can be normal, but excessive "Failed Bucket Moves" can reduce the effect of bucket usage rebalance. You can investigate via splunkd.log.
STDDEV before rebalance per site:
    site1 : 286851.09
    site2 : 38475.83
Realtime STDDEV of bucket usage per site:
    site1 : 47613.26
    site2 : 38475.78