Splunk® Enterprise

Managing Indexers and Clusters of Indexers

Set a retirement and archiving policy

Configure data retirement and archiving policy by controlling the size of indexes or the age of data in indexes.

The indexer stores indexed data in directories called buckets. Buckets go through four stages of retirement. When indexed data reaches the final, frozen state, the indexer removes it from the index. You can configure the indexer to archive the data when it freezes, instead of deleting it entirely. See "Archive indexed data" for details.

Bucket stage | Description | Searchable?
Hot | Contains newly indexed data. Open for writing. One or more hot buckets for each index. | Yes
Warm | Data rolled from hot. There are many warm buckets. | Yes
Cold | Data rolled from warm. There are many cold buckets. | Yes
Frozen | Data rolled from cold. The indexer deletes frozen data by default, but you can also archive it. Archived data can later be thawed. | No

You configure the sizes, locations, and ages of indexes and their buckets by editing indexes.conf, as described in "Configure index storage".

Caution: When you change your data retirement and archiving policy settings, the indexer can delete old data without prompting you.

Set attributes for cold to frozen rolling behavior

The maxTotalDataSizeMB and frozenTimePeriodInSecs attributes in indexes.conf help determine when buckets roll from cold to frozen. These attributes are described in detail below.

Freeze data when an index grows too large

You can use the size of an index to determine when data gets frozen and removed from the index. If an index grows larger than its maximum specified size, the oldest data is rolled to the frozen state.

The default maximum size for an index is 500,000MB. To change the maximum size, edit the maxTotalDataSizeMB attribute in indexes.conf. For example, to specify the maximum size as 250,000MB:

[main]
maxTotalDataSizeMB = 250000

Specify the size in megabytes.

Restart the indexer for the new setting to take effect. Depending on how much data there is to process, it can take some time for the indexer to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

This setting takes precedence over frozenTimePeriodInSecs. If the index reaches maxTotalDataSizeMB before its data reaches the age set by frozenTimePeriodInSecs, the indexer rolls the oldest data to frozen before the configured time period has elapsed. If archiving has not been configured, unintended data loss can occur.
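The effect of a size cap can be pictured with a small simulation. The following Python sketch is a simplified model under stated assumptions, not the indexer's actual implementation: the Bucket tuple, its field names, and the function buckets_frozen_by_size are hypothetical, and the rank_time field stands in for whatever timestamp the indexer uses to order buckets from oldest to newest.

```python
from typing import List, NamedTuple


class Bucket(NamedTuple):
    name: str       # hypothetical label for the bucket
    rank_time: int  # epoch seconds used to rank buckets oldest-first (assumed)
    size_mb: int    # size of the bucket on disk, in MB


def buckets_frozen_by_size(buckets: List[Bucket], max_total_mb: int) -> List[str]:
    """Return the names of buckets a size cap would freeze, oldest first.

    Buckets are removed in rank_time order until the remaining total
    no longer exceeds max_total_mb. Bucket age is never consulted.
    """
    total = sum(b.size_mb for b in buckets)
    frozen = []
    for bucket in sorted(buckets, key=lambda b: b.rank_time):
        if total <= max_total_mb:
            break
        frozen.append(bucket.name)
        total -= bucket.size_mb
    return frozen


# Three illustrative 100,000 MB buckets against a 250,000 MB cap.
example = [
    Bucket("b1", 1_000, 100_000),
    Bucket("b2", 2_000, 100_000),
    Bucket("b3", 3_000, 100_000),
]
print(buckets_frozen_by_size(example, max_total_mb=250_000))
```

In this model the oldest bucket is frozen as soon as the total exceeds the cap, no matter how recently it was written, which is why an unarchived index can lose data earlier than the age setting suggests.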

Freeze data when it grows too old

You can use the age of data to determine when a bucket gets rolled to frozen. When the most recent data in a particular bucket reaches the configured age, the entire bucket is rolled.
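Because the rule applies to the newest event in a bucket, a bucket that also holds very old events is not frozen early. The Python sketch below illustrates the rule as stated above; the function name, its parameters, and the strict "greater than" comparison at the boundary are assumptions made for illustration, not the indexer's code.

```python
def bucket_is_past_retention(newest_event_time: int,
                             now: int,
                             frozen_time_period_secs: int) -> bool:
    """Return True when the newest event in a bucket is older than the limit.

    Only the newest event matters: the age of the oldest event in the
    bucket does not appear in the comparison at all.
    """
    return now - newest_event_time > frozen_time_period_secs


LIMIT = 15_552_000  # 180 days, expressed in seconds
NOW = 20_000_000    # illustrative "current time" in epoch seconds

# The newest event is 1,000,000 seconds old: the bucket stays,
# even if it also contains events that are far older than the limit.
print(bucket_is_past_retention(19_000_000, NOW, LIMIT))

# The newest event is 19,000,000 seconds old: the whole bucket freezes.
print(bucket_is_past_retention(1_000_000, NOW, LIMIT))
```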

To specify the age at which data freezes, edit the frozenTimePeriodInSecs attribute in indexes.conf. This attribute specifies the number of seconds to elapse before data gets frozen. The default value is 188697600 seconds, or approximately 6 years. This example configures the indexer to freeze buckets in the main index once their most recent events are more than 180 days (15552000 seconds) old:

[main]
frozenTimePeriodInSecs = 15552000

Specify the time in seconds.
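Since the attribute takes seconds, it can help to compute the value rather than type it by hand. The following Python sketch shows the arithmetic behind the two figures quoted in this section; the helper names days_to_seconds and seconds_to_days are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_seconds(days: int) -> int:
    """Convert a retention period in days to a frozenTimePeriodInSecs value."""
    return days * SECONDS_PER_DAY


def seconds_to_days(seconds: int) -> float:
    """Convert a frozenTimePeriodInSecs value back to days."""
    return seconds / SECONDS_PER_DAY


print(days_to_seconds(180))        # 15552000, the value used in the example
print(seconds_to_days(188697600))  # 2184.0 days, the default (about 6 years)
```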

Restart the indexer for the new setting to take effect. Depending on how much data there is to process, it can take some time for the indexer to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

Archive data

If you want to archive frozen data instead of deleting it entirely, you must tell the indexer to do so, as described in "Archive indexed data". You can create your own archiving script or you can just let the indexer handle the archiving for you. You can later restore ("thaw") the archived data, as described in "Restore archived data".

Other ways that buckets age

There are a number of other conditions that can cause buckets to roll from one stage to another, some of which can also trigger deletion or archiving. These are all configurable, as described in "Configure index storage". For a full understanding of all your options for controlling retirement policy, read that topic and look at the indexes.conf spec file.

For example, the indexer rolls buckets when they reach their maximum size. You can reduce bucket size by setting a smaller maxDataSize in indexes.conf so that buckets roll sooner. Note, however, that searching many small buckets takes longer than searching fewer large buckets. You might need to experiment to determine the right bucket size for your deployment.

Troubleshoot the archive policy

I ran out of disk space so I changed the archive policy, but it's still not working

If you changed your archive policy to be more restrictive because you've run out of disk space, you may notice that events haven't started being archived according to your new policy. This is most likely because you must first free up some space so the process has room to run. Stop the indexer, clear out ~5GB of disk space, and then start the indexer again. After a while (exactly how long depends on how much data there is to process) you should see INFO entries about BucketMover in splunkd.log showing that buckets are being archived.

This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18, 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 7.0.0, 7.0.1, 7.0.2, 7.0.3


Comments

Garethatiag - Thank you for the heads-up on that issue. I'll look into it in more detail and then update this page to provide better guidance.

Sgoodman, Splunker
August 31, 2016

One item not clearly discussed here is that when using size-based retention, the bucket with the oldest data is the one to be rolled to frozen if the size limits are hit.
This can become an issue if data going into your index has incorrectly parsed timestamps that point to a past time. For example, the data is parsed as events from 2014, while at the same time the bucket may receive correctly timestamped events.

Due to the size limit (this would not happen with frozenTimePeriodInSecs as it rolls based on the newest data in the bucket), the bucket with the oldest data in it (which might also have current data) will be rolled to frozen.
I found this out through experience so be careful with timestamp parsing of data!

Garethatiag
August 31, 2016

Ibondarets - It sounds like you just want to base your retention policy on maximum index size and not age of data. One way to do this, on a per index basis, would be to set maxTotalDataSizeMB to a maximum size of, say, 500,000 MB (500 GB) - going with your example - and then set the frozenTimePeriodInSecs attribute to some value large enough that it will never be the cause of the data freezing.

If you want to set disk size across the set of indexes, rather than on a per index basis, you can use volumes. See http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Configureindexstoragesize

Sgoodman, Splunker
May 17, 2016

Is it possible to set a priority of disk space over age? For example, I want to limit an index to 50 GB max and 90 days old. But if 90 days have passed and there is still enough space, then do not delete that data. So in this case I can have 100 days of data, for example, but the index never grows larger than 50 GB.

Ibondarets
May 17, 2016

Lvidali, I might be mistaken, but Splunk appears to use the date of indexing, not the inner timestamp of data, for retirement and archiving policy. A few days ago I indexed 10 years of postfix logs and the data is fully searchable, meaning it's not frozen even though the oldest log entry is 4 years beyond frozenTimePeriodInSecs.

Patpro
September 29, 2014

Hi,
It sometimes happens that when we add a new source, the logs collected from this source start from a period older than the timestamps of the events stored in a bucket.
For example: the bucket contains logs from 04/07/2014 to "now". I add a new source, and this source has logs from 01/01/2013 to "now". In this case the oldest log in the bucket is not from 04/07/2014 but from 01/01/2013. If I have a retirement policy that freezes logs older than 6 months, what happens to this bucket? Does the retirement policy look at the original event timestamp or the received timestamp to decide whether to delete the bucket?
Many thanks

Lvidali
July 4, 2014
