Admin Manual

 


Set a retirement and archiving policy

This documentation does not apply to the most recent version of Splunk.

Configure data retirement and archiving policy by controlling the size of indexes or the age of data in indexes.

Splunk stores indexed data in buckets. For a discussion of buckets and how Splunk uses them, see "How Splunk stores indexes".

Splunk index buckets go through four stages of retirement. When indexed data reaches the frozen stage, Splunk deletes it by default. To avoid losing frozen data, you must specify an archiving script.

Retirement stage   Description                                                 Searchable?
Hot                Open for writing. One or more hot buckets for each index.   Yes
Warm               Data rolled from hot. There are many warm buckets.          Yes
Cold               Data rolled from warm. There are many cold buckets.         Yes
Frozen             Data rolled from cold. Eligible for deletion.               N/A: Splunk deletes frozen data by default.

Splunk defines the sizes, locations, and ages of indexes and their buckets in indexes.conf.

Caution: When you change your data retirement and archiving policy settings, Splunk deletes old data that falls outside the new policy without prompting you.

Edit a copy of indexes.conf in $SPLUNK_HOME/etc/system/local/, or in your own custom application directory in $SPLUNK_HOME/etc/apps/. Do not edit the copy in $SPLUNK_HOME/etc/system/default. For information on configuration files and directory locations, see "About configuration files".

Note: To configure data retirement, all index locations must be writable.

Remove files beyond a certain size

If an index grows larger than a specified maximum size, Splunk rolls the oldest data to frozen. Frozen data is deleted immediately unless you have created a script to archive it, as described in "Archive indexed data". The default maximum size for an index is 500000 MB. To change the maximum size, edit this line in indexes.conf:

maxTotalDataSizeMB = <non-negative number> 

For example:

[main]
maxTotalDataSizeMB = 2500000

Note: Make sure that the data size you specify for maxTotalDataSizeMB is expressed in megabytes.
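
Because the setting takes megabytes, it can help to convert from the unit you plan capacity in. The following is a minimal Python sketch, not part of Splunk; the helper names gb_to_mb and size_stanza are hypothetical, and it assumes binary gigabytes (1 GB = 1024 MB).

```python
MB_PER_GB = 1024  # assumption: binary gigabytes; use 1000 if you size disks in decimal units


def gb_to_mb(gigabytes: float) -> int:
    """Convert a size in gigabytes to whole megabytes for maxTotalDataSizeMB."""
    if gigabytes < 0:
        raise ValueError("maxTotalDataSizeMB must be non-negative")
    return int(gigabytes * MB_PER_GB)


def size_stanza(index_name: str, gigabytes: float) -> str:
    """Render an indexes.conf stanza that caps the named index at the given size."""
    return f"[{index_name}]\nmaxTotalDataSizeMB = {gb_to_mb(gigabytes)}\n"


if __name__ == "__main__":
    # Prints a [main] stanza with maxTotalDataSizeMB = 102400
    print(size_stanza("main", 100))
```

Paste the printed stanza into your local copy of indexes.conf rather than writing to the file from a script, so that you can review the value first.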

Restart Splunk for the new setting to take effect. Depending on how much data there is to process, it can take some time for Splunk to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.
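
The size rule in this section can be pictured with a short simulation. This is only an illustrative Python sketch under stated assumptions: it treats each bucket as a (name, latest event time, size in MB) tuple, orders buckets by their newest event, and freezes whole buckets oldest-first until the total fits. The function name freeze_for_size and the sample buckets are hypothetical, and Splunk's actual selection logic may differ in detail.

```python
def freeze_for_size(buckets, max_total_mb):
    """Split buckets into (kept, frozen) so that the kept total fits the cap.

    buckets: iterable of (name, latest_event_epoch, size_mb) tuples.
    Assumption: the bucket with the oldest "latest event" is frozen first.
    """
    kept = sorted(buckets, key=lambda b: b[1])  # oldest first
    frozen = []
    total = sum(b[2] for b in kept)
    while kept and total > max_total_mb:
        oldest = kept.pop(0)
        frozen.append(oldest)
        total -= oldest[2]
    return kept, frozen


if __name__ == "__main__":
    sample = [("b1", 100, 400), ("b2", 200, 300), ("b3", 300, 500)]  # made-up data
    kept, frozen = freeze_for_size(sample, max_total_mb=900)
    print("frozen:", [b[0] for b in frozen])
    print("kept:", [b[0] for b in kept])
```

In this made-up example the index holds 1200 MB against a 900 MB cap, so only the oldest bucket needs to be frozen to bring the total down to 800 MB.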

Remove data beyond a certain age

Splunk ages out data by buckets. Specifically, when the most recent data in a particular bucket reaches the configured age, the entire bucket is rolled.
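
The rule above means a bucket can hold events older than the configured age and still be kept, as long as its newest event is young enough. The following minimal Python sketch illustrates this; the Bucket class, its field names, and the sample values are hypothetical and are not Splunk data structures.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds in one day


@dataclass
class Bucket:
    name: str      # hypothetical label, not a real bucket directory name
    earliest: int  # epoch seconds of the oldest event in the bucket
    latest: int    # epoch seconds of the newest event in the bucket


def buckets_to_freeze(buckets, now, frozen_time_period_in_secs):
    """Return the buckets whose newest event has reached the configured age."""
    cutoff = now - frozen_time_period_in_secs
    return [b for b in buckets if b.latest <= cutoff]


if __name__ == "__main__":
    now = 1000 * DAY
    sample = [
        Bucket("a", earliest=700 * DAY, latest=810 * DAY),  # newest event is 190 days old
        Bucket("b", earliest=790 * DAY, latest=830 * DAY),  # newest event is 170 days old
        Bucket("c", earliest=900 * DAY, latest=990 * DAY),  # newest event is 10 days old
    ]
    for bucket in buckets_to_freeze(sample, now, 180 * DAY):
        print(bucket.name)
```

With a 180-day policy, only bucket "a" qualifies. Bucket "b" is kept even though its oldest event is 210 days old, because its newest event is only 170 days old.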

Splunk also rolls buckets when they reach a maximum size. If you are indexing a large volume of events, bucket size is less of a concern for retirement policy because the buckets fill quickly. You can reduce bucket size by setting a smaller maxDataSize in indexes.conf so that buckets roll faster. Note, however, that searching many small buckets takes longer than searching fewer large buckets. Because of the structure of the index, there is no direct relationship between time and data size, so expect to experiment to find the right bucket size.

To remove data beyond a specified age, set frozenTimePeriodInSecs in indexes.conf to the number of seconds that must elapse before the data is erased. The default value is 188697600 seconds, or approximately 6 years. This example configures Splunk to cull old events from its index when they become more than 180 days (15552000 seconds) old:

[main]
frozenTimePeriodInSecs = 15552000

Note: Make sure that the time you specify for frozenTimePeriodInSecs is expressed in seconds.
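
Retention periods are usually planned in days, so a quick conversion helps avoid arithmetic slips. This is a minimal Python sketch with hypothetical helper names (days_to_secs, secs_to_days); it reproduces the two figures quoted in this section.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_secs(days: int) -> int:
    """Convert a retention period in days to a frozenTimePeriodInSecs value."""
    if days < 0:
        raise ValueError("retention period must be non-negative")
    return days * SECONDS_PER_DAY


def secs_to_days(seconds: int) -> float:
    """Convert a frozenTimePeriodInSecs value back to days."""
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(days_to_secs(180))        # 15552000, the value used in the example above
    print(secs_to_days(188697600))  # 2184.0 days, just under 6 years
```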

Restart Splunk for the new setting to take effect. Depending on how much data there is to process, it can take some time for Splunk to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

I changed the archive policy and restarted but it's not working

If you made your archive policy more restrictive because you ran out of disk space, you might notice that events are not being archived according to the new policy. The most likely cause is that the process needs some free space to run. Stop Splunk, clear out about 5 GB of disk space, and then start Splunk again (refer to "Start Splunk" in this manual for details on stopping and starting Splunk). After a while (exactly how long depends on how much data there is to process), INFO entries about BucketMover should appear in splunkd.log, showing that buckets are being archived.
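
One way to look for those entries is to filter the log for lines that mention both INFO and BucketMover. The following is a minimal Python sketch, not a Splunk tool. It assumes that splunkd.log is at $SPLUNK_HOME/var/log/splunk/splunkd.log and falls back to /opt/splunk when SPLUNK_HOME is unset; adjust both for your installation. The function names are hypothetical.

```python
import os


def bucket_mover_lines(lines):
    """Yield the lines that mention both INFO and BucketMover."""
    for line in lines:
        if "INFO" in line and "BucketMover" in line:
            yield line.rstrip("\n")


def default_log_path():
    # Assumption: the log location and the /opt/splunk fallback may not match
    # your installation.
    home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    return os.path.join(home, "var", "log", "splunk", "splunkd.log")


if __name__ == "__main__":
    path = default_log_path()
    if os.path.exists(path):
        with open(path, errors="replace") as log:
            for entry in bucket_mover_lines(log):
                print(entry)
    else:
        print(f"No log found at {path}")
```

If the script prints nothing some time after the restart, check the free disk space again before changing the policy further.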

Archive data

If you want to archive your frozen data instead of deleting it, you must create an archiving script, as described in "Archive indexed data". You can later restore the archived data, as described in "Restore archived data".

This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8