Splunk® Enterprise

Managing Indexers and Clusters of Indexers

About SmartStore

SmartStore is an indexer capability that provides a way to use remote object stores, such as Amazon S3, to store indexed data.

As a deployment's data volume increases, demand for storage typically outpaces demand for compute resources. SmartStore allows you to manage your indexer storage and compute resources in a cost-effective manner by scaling those resources separately.

SmartStore introduces a remote storage tier and a cache manager. These features allow data to reside either locally on indexers or on the remote storage tier. Data movement between the indexer and the remote storage tier is managed by the cache manager, which resides on the indexer.

With SmartStore, you can reduce the indexer storage footprint to a minimum and choose I/O optimized compute resources. Most data resides on remote storage, while the indexer maintains a local cache that contains a minimal amount of data: hot buckets, copies of warm buckets participating in active or recent searches, and bucket metadata.
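
The caching behavior described above can be sketched with a toy model. This is only an illustration, not Splunk's cache manager: the class name, the count-based capacity, and the least-recently-used eviction order are all assumptions made for the example. It shows hot buckets staying local, copies of recently searched warm buckets being kept, and older copies being evicted so that a later search must fetch them from remote storage.

```python
from collections import OrderedDict


class ToyBucketCache:
    """Toy model of an indexer's local cache (hypothetical, not Splunk's logic)."""

    def __init__(self, capacity):
        self.capacity = capacity    # assumed limit: number of warm bucket copies kept locally
        self.hot = set()            # hot buckets always reside locally
        self.warm = OrderedDict()   # local warm bucket copies, least recently used first
        self.remote_fetches = 0     # searches that had to copy a bucket from remote storage

    def roll_to_warm(self, bucket_id):
        # The bucket now lives on remote storage; its local copy becomes evictable.
        self.hot.discard(bucket_id)
        self._touch(bucket_id)

    def search(self, bucket_id):
        # Return where the bucket was found when the search needed it.
        if bucket_id in self.hot:
            return "local"
        if bucket_id in self.warm:
            self._touch(bucket_id)
            return "local"
        self.remote_fetches += 1
        self._touch(bucket_id)
        return "remote"

    def _touch(self, bucket_id):
        # Mark the bucket as most recently used, then evict the oldest copies if over capacity.
        self.warm[bucket_id] = True
        self.warm.move_to_end(bucket_id)
        while len(self.warm) > self.capacity:
            self.warm.popitem(last=False)


cache = ToyBucketCache(capacity=2)
cache.hot.add("b4")
for bucket in ("b1", "b2", "b3"):
    cache.roll_to_warm(bucket)          # b1 is evicted when b3 arrives

results = [cache.search(b) for b in ("b3", "b1", "b4")]
```

In this sketch, a search over a recently used bucket or a hot bucket is served locally, while a search over an evicted bucket triggers a fetch from the remote tier.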

You can enable SmartStore for all indexes or for a subset of indexes.

SmartStore advantages

SmartStore offers several advantages to the deployment's indexing tier:

  • Reduced storage cost. Your deployment can take advantage of the economy of remote object stores, instead of relying on costly local storage.
  • Access to high availability and data resiliency features available through remote object stores.
  • The ability to scale compute and storage resources separately, thus ensuring that you use resources efficiently.
  • Simple and flexible configuration with per-index settings.

SmartStore offers additional advantages specific to deployments of indexer clusters:

  • Fast recovery from peer failure and fast data rebalancing, requiring only metadata fixups for warm data.
  • Lower overall storage requirements, as the system maintains only a single permanent copy of each warm bucket.
  • Full recovery of warm buckets even when the number of peer nodes that go down is greater than or equal to the replication factor.
  • A bootstrapping capability that allows a new cluster to inherit the data from an old cluster.
  • Global size-based data retention.
  • Simplified upgrades.

An intelligent cache manager ensures that, for most search use cases, SmartStore provides similar performance to local storage configurations.

Choosing SmartStore

While SmartStore-enabled indexes can significantly decrease storage and management costs under the right circumstances, there are also times when you might find it preferable to continue to rely on local storage.

When to consider moving to SmartStore

SmartStore can help you achieve significant cost savings for medium- to large-scale deployments. In particular, consider enabling SmartStore under these circumstances:

  • As the amount of data in local storage continues to grow. While local storage costs might not be a significant issue for a small deployment, you should reconsider your use of local storage as your deployment scales over time.
  • If you are using indexer clusters to take advantage of features such as data recovery and disaster recovery. With SmartStore, you can achieve these aims through the native capabilities of the remote store, without the need to store large amounts of redundant data on local storage.
  • If you are using indexer clusters and you find that considerable amounts of your time and your compute resources are devoted to managing the cluster. Through SmartStore, you can eliminate much of the cluster management overhead. In particular, you can greatly reduce the scale of time-consuming activities such as offlining peer nodes, data rebalancing, and bucket fixup, because most of the data no longer resides on the peer nodes.
  • When most searches are over recent data.

When not to move to SmartStore

There are a few situations where local storage might be a better choice:

  • If you have a small deployment, with limited amounts of stored data, the advantages of SmartStore might not compensate for the costs of setting up and maintaining a remote store.
  • If you frequently need to run rare searches, SmartStore might not be appropriate for your purposes, as rare searches can require the indexer to copy large amounts of data from remote to local storage, causing a performance impact. This is particularly the case with searches that cover long timespans. If, however, the searches are across recent data and thus the necessary buckets are already in the cache, then there is no performance impact.
  • If you run frequent long lookback searches, you might need to increase your cache size or continue to rely on local storage.

Features not supported by SmartStore

The following capabilities are not available for SmartStore-enabled indexes. Their corresponding settings must use their default values.

  • Tsidx reduction. Do not set enableTsidxReduction to "true". Tsidx reduction modifies bucket contents and is not supported by SmartStore. Note: You can still search any existing buckets that were tsidx-reduced before migration to SmartStore. As with non-SmartStore deployments, such searches will likely run slowly. See Reduce tsidx disk usage.
  • Data integrity control feature. SmartStore-enabled indexes are not compatible with the data integrity control feature, described in Manage data integrity in the Securing Splunk Enterprise manual.
  • Disabling bloom filters. Do not set createBloomfilter to "false". Bloom filters play an important role in SmartStore by helping to reduce downloads of tsidx files from remote storage.
  • Changing the location of bloom filters. Do not change bloomHomePath. Bloom filters must remain in their default locations inside their bucket directories.
  • Summary replication. Summary replication is unnecessary with SmartStore, because summaries are uploaded to remote storage after creation and are accessible to all peers in the cluster.
  • Hadoop data roll.
  • Certain other indexes.conf settings are incompatible with SmartStore. See Settings in indexes.conf that are incompatible with SmartStore or otherwise restricted.
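
As a rough illustration of the setting constraints listed above, the following sketch flags the three settings that the list names explicitly. It assumes the index stanza has already been parsed into a plain dictionary of setting names and string values; the function name and the dictionary shape are hypothetical and not part of any Splunk tool. It does not cover the other restricted settings that the list refers to.

```python
def find_unsupported_settings(index_settings):
    """Return a list of problems found in one index stanza (hypothetical helper)."""
    problems = []

    # Tsidx reduction must stay off for SmartStore-enabled indexes.
    if str(index_settings.get("enableTsidxReduction", "false")).lower() == "true":
        problems.append("enableTsidxReduction must not be set to true")

    # Bloom filters must stay enabled.
    if str(index_settings.get("createBloomfilter", "true")).lower() == "false":
        problems.append("createBloomfilter must not be set to false")

    # Bloom filters must remain in their default location, so any override is flagged.
    if "bloomHomePath" in index_settings:
        problems.append("bloomHomePath must not be changed")

    return problems


# Example stanza (made-up values) with one offending setting.
stanza = {"enableTsidxReduction": "true", "createBloomfilter": "true"}
problems = find_unsupported_settings(stanza)
```

An empty result means only that none of these three settings were flagged, not that the index is fully compatible with SmartStore.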

Current restrictions on SmartStore use

At this time, SmartStore support requires that your indexing tier conform to certain restrictions:

  • No use of report acceleration or data model acceleration summaries. Because of this restriction, SmartStore is currently not compatible with any app, such as Splunk Enterprise Security, that uses these summaries.
  • Available for indexer clusters only. SmartStore is not available for standalone indexers except for limited testing purposes.
  • Replication factor and search factor must be equal (for example, 3/3 or 2/2).
  • The home path and cold path of each index must point to the same partition.
  • Certain other indexes.conf settings are restricted with SmartStore. See Settings in indexes.conf that are incompatible with SmartStore or otherwise restricted.
  • A SmartStore-enabled index cannot be converted to non-SmartStore.
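
Two of the restrictions above lend themselves to a simple pre-flight check: equal replication and search factors, and home and cold paths on the same partition. The sketch below is an assumption-laden illustration, not a Splunk utility. The function names are hypothetical, the paths are assumed to be already resolved to real directories that exist on the indexer, and comparing filesystem device IDs is used here as a stand-in for "same partition".

```python
import os


def on_same_partition(home_path, cold_path):
    # Compare the device IDs reported by the filesystem; both paths must exist.
    return os.stat(home_path).st_dev == os.stat(cold_path).st_dev


def check_restrictions(replication_factor, search_factor, home_path, cold_path):
    """Return a list of restriction violations (hypothetical helper)."""
    problems = []
    if replication_factor != search_factor:
        problems.append("replication factor and search factor must be equal")
    if not on_same_partition(home_path, cold_path):
        problems.append("home path and cold path must point to the same partition")
    return problems


# Example: a 3/2 cluster whose home and cold paths are the same directory.
problems_found = check_restrictions(3, 2, ".", ".")
```

In this example, only the factor mismatch is reported, because both paths resolve to the same directory.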

This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8

