When you configure the master node, you designate a search factor. The search factor determines the number of searchable copies of data the cluster maintains. The default value for the search factor is 2, meaning that the cluster maintains two searchable copies of all data. The search factor must be less than or equal to the replication factor.
The difference between a searchable and a non-searchable copy of data is this: The searchable copy contains both the data itself and some very extensive index files that Splunk uses to search the data. The non-searchable copy contains just the data. Even the data stored in the non-searchable copy, however, has undergone initial processing and is stored in a form that makes it possible to create the index files later, if necessary. For more information on the files that constitute Splunk indexes, read the subtopic "Data files".
With a search factor of at least 2, the cluster is able to continue searching with little interruption if a peer node goes down. For example, say you specify a replication factor of 3 and a search factor of 2. The cluster will maintain three copies of all buckets on separate peers across the cluster, and two copies of each bucket will be searchable. Then, if a peer goes down and it contains a bucket copy that has been participating in searches, a searchable copy of that bucket on another peer can immediately step in and start participating in searches.
On the other hand, if the cluster's search factor is only 1 and a peer goes down, there will be a significant lag before searching can resume across all the cluster data. Although non-searchable copies of buckets can be made searchable, doing so takes time, because the index files must first be created from the raw data file. The processing time can be significant if the peer that went down was storing a large quantity of searchable data. For help estimating the time needed to make non-searchable copies searchable, look here.
The reason you might want to limit the number of searchable copies on your cluster is because searchable data occupies a lot more storage than non-searchable data. The trade-off, therefore, is between quick access to all your data in case of peer node failure versus increased storage requirements. For help estimating the relative storage sizes of searchable and non-searchable data, read "Storage considerations".
Buckets and clusters
This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18