Set a retirement and archiving policy

Note: Most of this topic is not relevant to SmartStore indexes. See "Configure data retention for SmartStore indexes".

Configure data retirement and archiving policy by controlling the size of indexes or the age of data in indexes.

The indexer stores indexed data in directories called buckets. Buckets go through four stages of retirement. When indexed data reaches the final, frozen state, the indexer removes it from the index. You can configure the indexer to archive the data when it freezes, instead of deleting it entirely. See "Archive indexed data" for details.

Bucket stage	Description	Searchable?
Hot	Contains newly indexed data. Open for writing. One or more hot buckets for each index.	Yes
Warm	Data rolled from hot. There are many warm buckets.	Yes
Cold	Data rolled from warm. There are many cold buckets.	Yes
Frozen	Data rolled from cold. The indexer deletes frozen data by default, but you can also archive it. Archived data can later be thawed.	No

You configure the sizes, locations, and ages of indexes and their buckets by editing indexes.conf, as described in "Configure index storage".

Caution: When you change your data retirement and archiving policy settings, the indexer can delete old data without prompting you.

Set attributes for cold to frozen rolling behavior

The maxTotalDataSizeMB and frozenTimePeriodInSecs attributes in indexes.conf help determine when buckets roll from cold to frozen. These attributes are described in detail below.

Freeze data when an index grows too large

You can use the size of an index to determine when data gets frozen and removed from the index. If an index grows larger than its maximum specified size, the oldest data is rolled to the frozen state.

The default maximum size for an index is 500,000MB. To change the maximum size, edit the maxTotalDataSizeMB attribute in indexes.conf. For example, to specify the maximum size as 250,000MB:

[main]
maxTotalDataSizeMB = 250000

Specify the size in megabytes.

Restart the indexer for the new setting to take effect. Depending on how much data there is to process, it can take some time for the indexer to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

This setting works together with frozenTimePeriodInSecs to determine when data gets frozen. Data rolls to frozen as soon as either limit is reached, whichever comes first.

If the index reaches maxTotalDataSizeMB before the data reaches the age set by frozenTimePeriodInSecs, the data rolls to frozen before the configured time period has elapsed. If you have not configured an archiving policy, the indexer deletes the frozen data, and unintended data loss can occur.
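The interaction between the two limits can be sketched in a few lines of Python. This is a simplified model for reasoning about the settings, not the indexer's actual implementation: the Bucket class, the buckets_to_freeze function, and the sample sizes and ages are all hypothetical, and the model assumes that "oldest" means the bucket whose newest event is oldest.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical stand-in for a bucket; not a Splunk data structure."""
    name: str
    size_mb: int
    latest_event_time: int  # epoch seconds of the newest event in the bucket


def buckets_to_freeze(buckets, now, max_total_data_size_mb, frozen_time_period_in_secs):
    """Return names of buckets that would freeze under this simplified model."""
    frozen = []
    remaining = []
    # Age rule: a bucket freezes once its newest event is older than the limit.
    for bucket in buckets:
        if now - bucket.latest_event_time > frozen_time_period_in_secs:
            frozen.append(bucket.name)
        else:
            remaining.append(bucket)
    # Size rule: while the index is over its cap, freeze the oldest bucket.
    remaining.sort(key=lambda b: b.latest_event_time)
    total_mb = sum(b.size_mb for b in remaining)
    while remaining and total_mb > max_total_data_size_mb:
        oldest = remaining.pop(0)
        total_mb -= oldest.size_mb
        frozen.append(oldest.name)
    return frozen


DAY = 86400
now = 1_000 * DAY
index = [
    Bucket("a", 100_000, now - 200 * DAY),
    Bucket("b", 100_000, now - 90 * DAY),
    Bucket("c", 100_000, now - 30 * DAY),
    Bucket("d", 100_000, now - 1 * DAY),
]
# A 180-day age limit combined with a 250,000MB size cap.
print(buckets_to_freeze(index, now, 250_000, 180 * DAY))  # ['a', 'b']
```

In this sketch, bucket "a" freezes because of its age, and bucket "b" freezes only because the index is still over the size cap, even though its data is just 90 days old. That second case is the one that can cause unexpected data loss when no archiving is configured.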

Freeze data when it grows too old

You can use the age of data to determine when a bucket gets rolled to frozen. When the most recent data in a particular bucket reaches the configured age, the entire bucket is rolled.

To specify the age at which data freezes, edit the frozenTimePeriodInSecs attribute in indexes.conf. This attribute specifies the number of seconds that must elapse before data gets frozen. The default value is 188697600 seconds, or approximately 6 years. This example configures the indexer to freeze data once it is more than 180 days (15552000 seconds) old:

[main]
frozenTimePeriodInSecs = 15552000

Specify the time in seconds.
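Because the attribute takes seconds, it can help to compute the value rather than type it by hand. The following is a minimal sketch; the helper name days_to_frozen_secs is hypothetical and the function is ordinary arithmetic, not part of any Splunk tooling.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_frozen_secs(days):
    """Convert a retention period in whole days to seconds."""
    return days * SECONDS_PER_DAY


print(days_to_frozen_secs(180))     # 15552000, the value used in the example above
print(188697600 / SECONDS_PER_DAY)  # 2184.0 days, roughly 6 years (the default)
```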

Depending on how much data there is to process, it can take some time for the indexer to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

Archive data

To archive frozen data instead of deleting it, you must explicitly configure the indexer to do so, as described in "Archive indexed data". You can supply your own archiving script or let the indexer handle the archiving for you. You can later restore ("thaw") the archived data, as described in "Restore archived data".

Other ways that buckets age

There are a number of other conditions that can cause buckets to roll from one stage to another, some of which can also trigger deletion or archiving. These are all configurable, as described in "Configure index storage". For a full understanding of all your options for controlling retirement policy, read that topic and look at the indexes.conf spec file.

For example, the indexer rolls buckets when they reach their maximum size. You can make buckets roll sooner by setting a smaller maxDataSize in indexes.conf. Note, however, that searching many small buckets takes longer than searching fewer large buckets, so you might need to experiment to find the bucket size that balances retention granularity against search performance.
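To get a feel for the trade-off, you can estimate how many buckets a given bucket size implies. This is a rough sketch only: it assumes sizes in megabytes and that buckets roll purely on size, which ignores the other roll conditions. The function name and the sample numbers are hypothetical.

```python
import math


def estimated_bucket_count(index_size_mb, max_data_size_mb):
    """Rough number of full-size buckets needed to hold the index."""
    return math.ceil(index_size_mb / max_data_size_mb)


# Hypothetical numbers, chosen only to illustrate the trade-off.
for max_data_size_mb in (10_000, 1_000, 100):
    print(max_data_size_mb, estimated_bucket_count(250_000, max_data_size_mb))
```

Under these assumptions, a tenfold reduction in bucket size means roughly ten times as many buckets for a search to open.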

Troubleshoot the archive policy

I ran out of disk space so I changed the archive policy, but it's still not working

If you made your archive policy more restrictive because you ran out of disk space, you might notice that events are not yet being archived according to the new policy. The most likely cause is that the archiving process itself needs free space to run. Stop the indexer, free up about 5GB of disk space, and then start the indexer again. After a while (how long depends on how much data there is to process), you should see INFO entries about BucketMover in splunkd.log showing that buckets are being archived.
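The two checks described above, free space and BucketMover activity, can be scripted. The following is a minimal sketch with hypothetical helper names. It makes no assumption about the log line layout beyond the presence of the strings INFO and BucketMover, and the path "." is a placeholder for the volume that holds your indexes.

```python
import shutil

REQUIRED_FREE_BYTES = 5 * 1024**3  # about 5GB, the figure suggested above


def has_room_to_run(free_bytes, required_bytes=REQUIRED_FREE_BYTES):
    """True when the volume has at least the suggested free space."""
    return free_bytes >= required_bytes


def bucket_mover_info_lines(lines):
    """Keep only log lines that mention both INFO and BucketMover."""
    return [line for line in lines if "INFO" in line and "BucketMover" in line]


if __name__ == "__main__":
    # "." is a placeholder; point it at the volume that holds your indexes.
    free = shutil.disk_usage(".").free
    print("enough free space:", has_room_to_run(free))
```

To check for archiving activity, open splunkd.log from your own installation and pass its lines to bucket_mover_info_lines. The location of the log file depends on your deployment.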
