Configure the SmartStore cache manager
The cache manager maximizes search efficiency through intelligent management of the local cache. It favors retaining in the cache copies of buckets and files that have a high likelihood of participating in future searches. When the cache fills up, the cache manager removes, or "evicts", copies of buckets that are least likely to participate in future searches.
Since the cache manager removes only the cached copies of buckets, the eviction process does not result in loss of data. The master copies continue to reside in remote storage.
For details on how the cache manager operates, see The SmartStore cache manager.
Cache manager settings reside in the [cachemanager] stanza in server.conf. In the case of an indexer cluster, you configure the cache manager on each peer node.
The cache manager operates at the global level, across all indexes on an indexer. Aside from the recency settings, you cannot configure the cache manager on a per-index basis.
Set the cache eviction policy
The eviction_policy setting in server.conf determines the cache eviction policy.
Eviction policy | Description |
---|---|
lru (default) | Evict the least recently used bucket. |
lruk | Evict the least recently used bucket, keeping track of the last K references to popular buckets, where K=3. |
clock | Evict the bucket with the oldest events first, unless it has been accessed recently. |
lrlt | Evict the bucket with the oldest events first. |
random | Randomly evict a bucket. |
noevict | Don't evict. |
If you want to use an eviction_policy other than "lru" or "lruk", consult with Splunk Support first.
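As an illustration, a minimal server.conf fragment that sets the policy explicitly might look like the following. The value shown is simply the default from the table above, so omitting the setting should have the same effect.

```ini
# server.conf
[cachemanager]
# lru is the default policy; shown here only to make the choice explicit.
eviction_policy = lru
```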
Initiate eviction based on occupancy of the cache's disk partition
These settings in server.conf initiate eviction based on occupancy of the cache's disk partition:
- The max_cache_size setting specifies the maximum occupied space, in megabytes, for the disk partition that contains the cache.
- The minFreeSpace setting specifies the minimum free space, in megabytes, for a partition.
- The eviction_padding setting controls the amount of additional space, in megabytes, that the cache manager protects, beyond the minFreeSpace value.
The minFreeSpace setting is not strictly a cache-specific setting, and therefore it does not reside in the [cachemanager] stanza, but it nevertheless helps determine cache size limits.
When the occupied space on the cache's partition exceeds max_cache_size, or the partition's free space falls below (minFreeSpace + eviction_padding), the cache manager begins to evict data.
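The interaction of the three settings can be sketched with a sample configuration. The values below are illustrative assumptions, not recommendations. minFreeSpace is shown in the [diskUsage] stanza, which is where it normally resides in server.conf; confirm the stanza and the defaults against the server.conf specification for your version.

```ini
# server.conf
[cachemanager]
# Begin evicting when occupied space on the cache partition exceeds 500000 MB.
max_cache_size = 500000
# Protect an additional 5120 MB beyond minFreeSpace.
eviction_padding = 5120

[diskUsage]
# Minimum free space, in MB, for a partition.
minFreeSpace = 5000
```

With these example values, eviction begins when the partition's occupied space exceeds 500000 MB or its free space falls below 5000 + 5120 = 10120 MB, whichever happens first.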
Set cache retention periods based on data recency
You can protect recently indexed data from eviction. You can use this capability in two ways:
- On a global level (across all indexes), to favor recently indexed data over recently used data.
- On a per-index level, to favor data in critical indexes over data in non-critical indexes.
To set cache retention periods based on data recency, use the hotlist_recency_secs and hotlist_bloom_filter_recency_hours settings. These settings serve to override the eviction policy. You can scope these settings globally or on a per-index level.
The hotlist_recency_secs setting
The hotlist_recency_secs setting causes the cache manager to protect buckets that contain recent data over other buckets. The setting determines the cache retention period for warm buckets based on age. When eviction is necessary, the cache manager will not evict buckets until they reach the configured age, unless all other buckets have already been evicted.
The cache manager attempts to defer bucket eviction until all data in the bucket is older than the value of the setting. The setting defaults to 86400 seconds, or 24 hours.
A bucket's age, or "recency", is calculated by subtracting the time of the bucket's most recent event from the current time. For example, suppose that the current time, expressed in UTC epoch time, is 1567891234 (Sep 7 23:20:34 CEST 2019) and the bucket is named db_1567809123_1557891234_10_8A21BEE9-60D4-436B-AA6D-21B68F631A8B, which means that it spans May 15 05:33:54 CEST 2019 to Sep 7 00:32:03 CEST 2019. The bucket name indicates that the time of the most recent event in the bucket is 1567809123 (Sep 7 00:32:03 CEST 2019), so the bucket's age is 1567891234 - 1567809123 = 82111 seconds (~23 hours).
Ensure that the cache is large enough to accommodate the value of this setting. Otherwise, cache eviction cannot function optimally. In other words, do not set the retention period so long that the buckets it protects approach or exceed the size of the cache on the basis of this setting alone. Also, consider the rate of data ingestion and the typical time spans of your searches to determine how long your recent buckets should remain in the cache.
As a best practice, start with a fairly low value for this setting and adjust over time. For example, if the cache size is 100 GB and you typically add 10 GB of new buckets to the indexer in a 24-hour period, configuring this setting to 172800 (48 hours) means that the cache manager will try to keep 20 GB of recent buckets in the cache at all times.
The hotlist_bloom_filter_recency_hours setting
The hotlist_bloom_filter_recency_hours setting protects certain small metadata files, such as the bloomfilter file, from eviction. By inspecting such metadata files, the cache manager can sometimes eliminate the need to fetch larger bucket files, such as the rawdata journal and the tsidx files, from remote storage when handling search requests. See The SmartStore cache manager.
The hotlist_bloom_filter_recency_hours setting affects the cache retention period for small warm bucket files. The cache manager attempts to defer eviction of the non-journal and non-tsidx bucket files, such as the bloomfilter file, until the interval between the bucket's latest time and the current time exceeds this setting. This setting defaults to 360 hours, or 15 days.
The recency of a bloomfilter file is based on its bucket's recency and is calculated in the same manner described for hotlist_recency_secs.
This setting works in concert with hotlist_recency_secs, which is designed to be configured for a shorter period. If hotlist_recency_secs leads to the eviction of a bucket, the bucket's bloomfilter and associated files remain in the cache until they reach the age configured with hotlist_bloom_filter_recency_hours. Thus, the bucket remains in the cache, but without its journal and tsidx files.
Configure recency globally or for individual indexes
When you configure these settings globally, they override the eviction policy, which, by default, favors buckets that have been recently searched. For example, if hotlist_recency_secs is set globally to 604800 (7 days), the cache manager will attempt to retain buckets with data that is less than seven days old. It will instead evict older buckets, even if those older buckets were searched more recently. The cache manager will only evict buckets containing data less than seven days old if there are no older buckets to evict.
By configuring the recency settings on a per-index level, you can favor data in critical indexes over data in less critical indexes. Since all SmartStore indexes share the cache and otherwise follow the global cache eviction policy, the per-index recency settings provide the only means to retain data from critical indexes for a longer period than data from less critical indexes.
For example, if you have an index with critical data, such as the ES threat_activity index, and another index whose data is less critical, such as the default _internal index, you can set hotlist_recency_secs to 5184000 (60 days) for threat_activity, while keeping the default setting of 86400 (1 day) for _internal. By doing so, you cause the cache manager to favor threat_activity buckets over _internal buckets, thus reducing the likelihood that the cache manager will need to fetch data from the remote store to handle threat_activity searches.
Configure globally for all indexes
To configure the hotlist_recency_secs and hotlist_bloom_filter_recency_hours settings globally, for all SmartStore indexes, you must set them in the [cachemanager] stanza in server.conf.
You can override the global settings on a per-index basis.
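A sketch of a global configuration, reusing the seven-day example given earlier for hotlist_recency_secs and leaving hotlist_bloom_filter_recency_hours at its stated default, might look like this. The values are illustrative only.

```ini
# server.conf
[cachemanager]
# Try to keep buckets with data newer than 7 days (604800 seconds) in the cache.
hotlist_recency_secs = 604800
# Try to keep bloomfilter and other small files for 15 days (360 hours).
hotlist_bloom_filter_recency_hours = 360
```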
Configure for individual indexes
To configure the hotlist_recency_secs and hotlist_bloom_filter_recency_hours settings on a per-index basis, you must set them in each index's stanza in indexes.conf.
If you do not configure the settings for a particular SmartStore index, that index inherits the global value from server.conf.
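For instance, the threat_activity example discussed earlier could be expressed roughly as follows. This is a sketch: the stanza shows only the recency setting, and a real index stanza would also contain the index's other settings, which are omitted here.

```ini
# indexes.conf
[threat_activity]
# Other settings for this index are omitted.
# Favor this index: try to keep buckets with data newer than 60 days.
hotlist_recency_secs = 5184000

# No recency settings are placed under [_internal], so that index inherits
# the global value from the [cachemanager] stanza in server.conf.
```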
Set the maximum download and upload rates
The max_concurrent_downloads setting in server.conf specifies the maximum number of buckets that can be downloaded simultaneously from remote storage. Its default is 8.
The max_concurrent_uploads setting in server.conf specifies the maximum number of buckets that can be uploaded simultaneously to remote storage. Its default is 8.
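As a sketch, the following fragment sets both limits explicitly to the default values stated above. Whether to raise or lower them depends on your network and remote storage, which this example does not attempt to model.

```ini
# server.conf
[cachemanager]
max_concurrent_downloads = 8
max_concurrent_uploads = 8
```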
This documentation applies to the following versions of Splunk® Enterprise: 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2