Indexer cluster operations and SmartStore

Indexer clusters treat SmartStore indexes differently from non-SmartStore indexes in some fundamental ways:

The responsibility for high availability and disaster recovery of SmartStore warm buckets shifts from the cluster to the remote storage service. This shift offers the important advantage that warm bucket data is fully recoverable even if the cluster loses a set of peer nodes that equals or exceeds the replication factor in number.

The effect of the replication factor on SmartStore warm buckets differs from its effect on non-SmartStore warm buckets. In particular, the cluster uses the replication factor to determine how many copies of SmartStore warm bucket metadata it maintains. The cluster does not attempt to maintain multiple copies of the warm buckets themselves. In cases of warm bucket fixup, the cluster only needs to replicate the bucket metadata, not the entire contents of the bucket directories.

Replication factor and replicated bucket copies

A cluster treats hot buckets in SmartStore indexes the same way that it treats hot buckets in non-SmartStore indexes. It replicates the hot buckets in local storage across the replication factor number of peer nodes.

When a bucket in a SmartStore index rolls to warm and moves to remote storage, the remote storage service takes over responsibility for maintaining high availability of that bucket. The replication factor has no effect on how the remote storage service achieves that goal.

At the point that the bucket rolls to warm and gets uploaded to remote storage, the peer nodes no longer attempt to maintain replication factor number of local copies of the bucket. The peer node that was the source for that bucket continues to maintain a copy of the bucket in local cache for some period of time, as determined by the cache eviction policy. The peer nodes that were targets for that bucket, however, immediately evict their copies from their local caches.

Even when copies of a bucket are no longer stored locally, the replication factor still controls the metadata that the cluster stores for that bucket. Peer nodes equal in number to the replication factor maintain metadata information about the bucket in their .bucketManifest files. Those peer nodes also maintain empty directories for the bucket, if a copy of the bucket does not currently reside in their local cache.

For example, if the cluster has a replication factor of 3, three peer nodes continue to maintain metadata information, along with populated or empty directories, for each bucket.
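
As a rough illustration of the state described above, the following Python sketch models a bucket rolling to warm in a cluster with a replication factor of 3. The `Peer` class, the `roll_to_warm` function, and the peer and bucket names are hypothetical, invented for this example. They are not part of any Splunk API, and the model ignores the timing of cache eviction on the source peer.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical model of a peer node's view of SmartStore buckets."""
    name: str
    manifest: set = field(default_factory=set)  # bucket IDs with metadata in .bucketManifest
    cache: set = field(default_factory=set)     # bucket IDs whose contents are in local cache

    def has_empty_directory(self, bucket_id):
        # Metadata is present but the bucket contents are not cached locally.
        return bucket_id in self.manifest and bucket_id not in self.cache


def roll_to_warm(bucket_id, source, targets, remote_store):
    """Model the roll: upload to remote storage, then evict the target copies."""
    remote_store.add(bucket_id)
    for peer in targets:
        peer.cache.discard(bucket_id)  # targets evict immediately; metadata stays
    # The source keeps its cached copy until the cache eviction policy removes it.


# While the bucket is hot, all three peers hold a full local copy.
peers = [Peer(n, manifest={"bucket_1"}, cache={"bucket_1"}) for n in ("peer1", "peer2", "peer3")]
remote_store = set()
roll_to_warm("bucket_1", source=peers[0], targets=peers[1:], remote_store=remote_store)

metadata_holders = [p.name for p in peers if "bucket_1" in p.manifest]
empty_directories = [p.name for p in peers if p.has_empty_directory("bucket_1")]
```

In this model, all three peers still hold metadata for `bucket_1` after the roll, but only the source peer holds the bucket contents. The two target peers are left with empty directories.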

By maintaining metadata for each bucket on the replication factor number of peer nodes, the cluster simplifies the process of fetching the bucket when it is needed, in the case of any interim peer node failure.

Unlike with non-SmartStore indexes, a cluster can recover most of the data in its SmartStore indexes if it loses peer nodes that equal or exceed the replication factor in number. In such a circumstance, the cluster can recover all of the SmartStore warm buckets, because those buckets are stored remotely and so are unaffected by peer node failure. The cluster will likely lose some of its SmartStore hot buckets, because those buckets are stored locally.

For example, in a cluster with a replication factor of 3, the cluster loses both hot and warm data from its non-SmartStore indexes if three or more peer nodes are simultaneously offline. However, the same cluster can lose any number of peer nodes, even all of its peer nodes temporarily, and still not lose any SmartStore warm data, because that data resides on remote storage.
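
The example can be checked with a small Python sketch. The `lost_buckets` function and the bucket and peer names are hypothetical. The model is deliberately simplified: it treats a bucket as lost only when every local copy sat on a failed peer and no copy exists on remote storage.

```python
def lost_buckets(local_copies, failed_peers, remote_buckets=frozenset()):
    """Return bucket IDs with no surviving copy.

    local_copies maps bucket ID -> set of peers holding a local copy.
    A bucket is lost when every local copy sat on a failed peer and
    no copy exists on remote storage.
    """
    return {
        bucket
        for bucket, holders in local_copies.items()
        if holders <= failed_peers and bucket not in remote_buckets
    }


# Replication factor 3: each bucket has local copies (or metadata) on three peers.
copies = {"hot_1": {"peer1", "peer2", "peer3"}, "warm_1": {"peer1", "peer2", "peer3"}}
failed = {"peer1", "peer2", "peer3"}

# Non-SmartStore index: nothing is stored remotely.
non_smartstore_loss = lost_buckets(copies, failed)

# SmartStore index: the warm bucket also resides on remote storage.
smartstore_loss = lost_buckets(copies, failed, remote_buckets={"warm_1"})
```

With all three holders offline, the non-SmartStore case loses both buckets, while the SmartStore case loses only the hot bucket.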

Search factor, searchable copies, and primary copies

The search factor has the same effect on hot buckets in SmartStore indexes as it does on hot buckets in non-SmartStore indexes. That is, the search factor determines the number of copies of each replicated bucket that include the tsidx files and are thus searchable.

For SmartStore warm buckets, the search factor has no practical meaning. The remote storage holds the master copy of each bucket, and that copy always includes the set of tsidx files, so it is, by definition, searchable. When, in response to a search request, a peer node's cache manager fetches a copy of a bucket to the node's local cache, that copy is searchable to the degree that it needs to be for the specific search. As described in How the cache manager fetches buckets, the cache manager attempts to download only the bucket files needed for a particular search. Therefore, in some cases, the cache manager might not download the tsidx files.

Primary tags work the same with SmartStore indexes as with non-SmartStore indexes. Each bucket has exactly one peer node with a primary tag for that bucket. For each search, the peer node with the primary tag for a particular warm bucket is the one that searches that bucket, first fetching a copy of that bucket from remote storage when necessary.

How an indexer cluster handles SmartStore bucket fixup stemming from peer node failure

When a peer node fails, indexer clusters handle bucket fixup for SmartStore indexes in basically the same way as for non-SmartStore indexes, with a few differences. For a general discussion of peer node failure and the bucket-fixing activities that ensue, see What happens when a peer node goes down.

This section covers the differences in bucket fixup that occur with SmartStore. One advantage of SmartStore is that loss of peer nodes equal to, or in excess of, the replication factor in number does not result in loss of warm bucket data. In addition, warm bucket fixup proceeds much faster for SmartStore indexes.

Bucket fixup with SmartStore

In the case of hot buckets, bucket fixup for SmartStore indexes proceeds the same way as for non-SmartStore indexes.

In the case of warm buckets, bucket fixup for SmartStore indexes proceeds much more quickly than it does for non-SmartStore indexes. This advantage occurs because SmartStore bucket fixup requires updates only to the .bucketManifest files on each peer node, without the need to stream the buckets themselves. In other words, bucket fixup replicates only bucket metadata.

Each peer node maintains a .bucketManifest file for each of its indexes. The file contains metadata for each bucket copy that the peer node maintains. When the cluster replicates a bucket from a source peer to a target peer, the target peer adds metadata to its .bucketManifest file for that bucket copy.

The buckets themselves are not replicated during SmartStore fixup because the master copy of each bucket remains present on remote storage and is downloaded to the peer nodes' local storage only when needed for a search.

During SmartStore replication, the target peer nodes also create an empty directory for each warm bucket that has metadata in their .bucketManifest files.

As with non-SmartStore indexes, the SmartStore bucket fixup process also ensures that exactly one peer node has a primary tag for each bucket.
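
The "exactly one primary per bucket" invariant can be sketched in Python as follows. The `assign_primaries` function and its data structures are hypothetical, and the rule for choosing a replacement primary (the first live holder in alphabetical order) is an assumption made so the example is deterministic. The source does not describe how the cluster actually makes that choice.

```python
def assign_primaries(holders, primaries, live_peers):
    """Return a bucket -> peer mapping with exactly one primary per bucket.

    holders maps bucket ID -> set of peers with metadata for the bucket.
    primaries maps bucket ID -> peer currently holding the primary tag.
    """
    result = {}
    for bucket, peers in holders.items():
        candidates = sorted(peers & live_peers)
        if not candidates:
            continue  # no live holder; metadata must be recovered first
        current = primaries.get(bucket)
        # Keep a surviving primary; otherwise pick the first live holder.
        # (Alphabetical choice is an assumption made for determinism.)
        result[bucket] = current if current in candidates else candidates[0]
    return result


holders = {"bucket_1": {"peer1", "peer2", "peer3"}, "bucket_2": {"peer2", "peer3", "peer4"}}
primaries = {"bucket_1": "peer1", "bucket_2": "peer3"}

# peer1 has failed, so bucket_1 needs a new primary; bucket_2 keeps peer3.
new_primaries = assign_primaries(holders, primaries, live_peers={"peer2", "peer3", "peer4"})
```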

Fixup when the number of nodes that fail is less than the replication factor

For both SmartStore and non-SmartStore indexes, the cluster can fully recover its valid and complete states through the fixup process. The fixup process for SmartStore warm buckets requires only the replication of metadata internal to the cluster, so it proceeds much more quickly than fixup for non-SmartStore warm buckets, which must stream the bucket contents between peer nodes.
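
As a minimal sketch of metadata-only fixup, the Python below plans which live peers should receive a copy of warm bucket metadata so that each bucket is again tracked on the replication factor number of peers. The `plan_metadata_fixup` function is hypothetical, and the choice of target peers (alphabetical order) is an assumption for the sake of a deterministic example. No bucket contents move in this model, only metadata entries.

```python
def plan_metadata_fixup(holders, live_peers, replication_factor):
    """Return bucket ID -> list of peers that should receive a metadata copy."""
    plan = {}
    for bucket, peers in holders.items():
        surviving = peers & live_peers
        if not surviving:
            continue  # handled by recovery from remote storage instead
        shortfall = replication_factor - len(surviving)
        # Choose new targets among live peers lacking the metadata.
        # (Alphabetical order is an assumption made for determinism.)
        targets = sorted(live_peers - surviving)[:max(shortfall, 0)]
        if targets:
            plan[bucket] = targets
    return plan


holders = {"bucket_1": {"peer1", "peer2", "peer3"}, "bucket_2": {"peer2", "peer3", "peer4"}}

# peer1 has failed; bucket_1 is down to two metadata copies, bucket_2 still has three.
plan = plan_metadata_fixup(
    holders, live_peers={"peer2", "peer3", "peer4", "peer5"}, replication_factor=3
)
```

Here only `bucket_1` needs fixup, and a single additional peer receives its metadata.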

Fixup when the number of nodes that fail equals or exceeds the replication factor

In contrast to non-SmartStore indexes, a cluster can recover all the warm bucket data for its SmartStore indexes even when the number of failed nodes equals or exceeds the replication factor. As with non-SmartStore indexes, though, there will likely be some data loss associated with hot buckets.

The cluster needs to recover only missing warm bucket metadata because the warm buckets themselves remain present on remote storage.

To recover any lost warm bucket metadata, the master node uses its comprehensive list of all bucket IDs. It compares those IDs with the IDs in the set of .bucketManifest files on the peer nodes, looking for IDs that are present in its list but are not present in any .bucketManifest file. For all such bucket IDs, the master assigns peer nodes to query the remote storage for the information necessary to populate metadata for those buckets in their .bucketManifest files. Additional fixup occurs to meet the replication factor requirements for the metadata and to assign primary tags.
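
The comparison step amounts to a set difference, which the following Python sketch illustrates. The function names and data structures are hypothetical, and the round-robin assignment of recovery work to peers is an assumption. The source says only that the master assigns peer nodes to query remote storage, not how it chooses them.

```python
def find_missing_metadata(master_bucket_ids, manifests):
    """Return bucket IDs the master knows about that no peer manifest lists.

    manifests maps peer name -> set of bucket IDs in that peer's
    .bucketManifest files.
    """
    known_to_peers = set().union(*manifests.values())
    return set(master_bucket_ids) - known_to_peers


def assign_recovery(missing_ids, live_peers):
    """Spread recovery work over the live peers (round-robin is an assumption).

    Assumes at least one live peer whenever there are missing IDs.
    """
    peers = sorted(live_peers)
    return {bucket: peers[i % len(peers)] for i, bucket in enumerate(sorted(missing_ids))}


master_ids = {"bucket_1", "bucket_2", "bucket_3", "bucket_4"}
manifests = {"peer4": {"bucket_1"}, "peer5": {"bucket_1", "bucket_3"}}

missing = find_missing_metadata(master_ids, manifests)
assignments = assign_recovery(missing, live_peers=manifests.keys())
```

In this example, `bucket_2` and `bucket_4` appear in the master's list but in no peer's manifest, so each is assigned to a surviving peer, which would then query remote storage to rebuild the metadata.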

If the master node goes down during this recovery process, the warm bucket metadata is still recoverable. When the cluster regains a master node, the master initiates bootstrapping to recover all warm bucket metadata. For information on bootstrapping, see Bootstrap SmartStore indexes.
