Search factor

When you configure the master node, you designate a search factor. The search factor determines the number of searchable copies of data the indexer cluster maintains. In other words, the search factor determines the number of searchable copies of each bucket. The default value for the search factor is 2, meaning that the cluster maintains two searchable copies of all data. The search factor must be less than or equal to the replication factor.

Searchable and non-searchable bucket copies

The difference between a searchable and a non-searchable copy of a bucket is this: The searchable copy contains both the data itself and some very extensive index files that the peer node uses to search the data. The non-searchable copy contains just the data. Even the data stored in the non-searchable copy, however, has undergone initial processing and is stored in a form that makes it possible to create the index files later, if necessary. For more information on the files that constitute Splunk Enterprise indexes, read the subtopic Data files.

Search recovery from peer node failure

With a search factor of at least 2, the cluster is able to continue searching with little interruption if a peer node goes down. For example, say you specify a replication factor of 3 and a search factor of 2. The cluster will maintain three copies of all buckets on separate peers across the cluster, and two copies of each bucket will be searchable. Then, if a peer goes down and it contains a bucket copy that has been participating in searches, a searchable copy of that bucket on another peer can immediately step in and start participating in searches.

On the other hand, if the cluster's search factor is only 1 and a peer goes down, there will be a significant lag before searching can resume across the full set of cluster data. Although non-searchable copies of the buckets can be made searchable, doing so takes time, because the index files must first be built from the raw data file. The processing time can be significant if the peer that went down was storing a large quantity of searchable data. For help estimating the time needed to make non-searchable copies searchable, look here.

The reason you might want to limit the number of searchable copies on your cluster is because searchable data occupies a lot more storage than non-searchable data. The trade-off, therefore, is between quick access to all your data in case of peer node failure versus increased storage requirements. For help estimating the relative storage sizes of searchable and non-searchable data, read Storage considerations. For most needs, the default search factor of 2 represents the right trade-off.

Search factor in multisite clusters

A multisite cluster uses a special version of the search factor, the site search factor. This determines not only the number of searchable copies that the entire cluster maintains but also the number of searchable copies that each site maintains. For information on the site search factor, see Configure the site search factor.

Related answers from Splunk Community

Search factor

Searchable and non-searchable bucket copies

Search recovery from peer node failure

Search factor in multisite clusters

Comments

Search factor

Was this topic useful?