Admin Manual

Back up indexed data

This documentation does not apply to the most recent version of Splunk.

This topic discusses backing up Splunk indexed data. It first gives an overview of how your indexed data moves through Splunk, then describes a basic backup strategy based on common or default Splunk index configurations. Finally, it provides options for setting or changing the retirement policy for your Splunk index data.

The default values and policies described in this topic are set in indexes.conf. If you have a more complex index configuration or unusual data volumes, refer to that file for detailed information and options. Before modifying any configuration file, read "About configuration files".

For more information on backing up indexed data, see "Best practices for backing up" on the Community Wiki.

For information on setting a data retirement and archiving policy, see "Set a retirement and archiving policy".

How data ages

When Splunk is indexing, the data moves through a series of stages based on policies that you define. At a high level, the default behavior is as follows:

When data is first indexed, it is put into a "hot" database, or bucket.

The data remains in the hot bucket until the policy conditions are met for it to be reclassified as "warm" data. This is called "rolling" the data into the warm bucket. By default, this happens when a hot bucket reaches a specified size or age. When a hot bucket is rolled, its directory is renamed, and it becomes a warm bucket. It is safe to back up the warm buckets.

Next, when you reach a specified number of warm buckets, the oldest warm bucket becomes a cold bucket, thus maintaining a constant number of warm buckets. (If your colddb directory is located on another file share, the buckets are moved there and deleted from the warm db directory.) The default number of warm buckets is 300.
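The warm-to-cold rule described above can be illustrated with a small Python model. This is only a sketch of the policy, not Splunk's implementation: the function name and the bucket labels are made up for the example, and the only value taken from this topic is the default limit of 300 warm buckets.

```python
from collections import deque


def roll_warm_to_cold(warm, cold, max_warm=300):
    """Move the oldest warm buckets to cold until at most max_warm remain.

    `warm` is a deque ordered oldest-first. Both arguments are modified
    in place and also returned for convenience.
    """
    while len(warm) > max_warm:
        # The oldest warm bucket is the one that rolls to cold.
        cold.append(warm.popleft())
    return warm, cold


# Hypothetical example: 302 warm buckets with the default limit of 300.
warm = deque(f"bucket_{i}" for i in range(302))
cold = []
roll_warm_to_cold(warm, cold)
print(len(warm), cold)  # 300 ['bucket_0', 'bucket_1']
```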

Finally, at a time based on your defined policy requirements, the bucket will roll from cold to "frozen". By default, Splunk deletes frozen buckets. If you need to archive or otherwise preserve the data, you can provide a script that performs actions on the bucket prior to deletion.
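As a rough illustration of the kind of archiving script mentioned above, the following Python sketch copies a bucket directory to an archive location so that the data survives deletion. It assumes that the bucket's path is passed to the script as its first argument. The archive path, the file name archive_bucket.py, and the function names are hypothetical and not part of Splunk.

```python
import shutil
import sys
from pathlib import Path

# Hypothetical archive location; adjust to your environment.
ARCHIVE_DIR = Path("/opt/splunk_frozen_archive")


def archive_bucket(bucket_path, archive_dir=ARCHIVE_DIR):
    """Copy one bucket directory into archive_dir and return the copy's path."""
    bucket = Path(bucket_path)
    if not bucket.is_dir():
        raise ValueError(f"not a bucket directory: {bucket}")
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / bucket.name
    # copytree raises FileExistsError if the bucket was already archived,
    # so an existing archive copy is never silently overwritten.
    shutil.copytree(bucket, target)
    return target


def main(argv):
    """Entry point. Assumption: argv[1] is the path of the bucket to archive."""
    if len(argv) != 2:
        print("usage: archive_bucket.py <bucket_path>", file=sys.stderr)
        return 2
    archive_bucket(argv[1])
    return 0
```

To use this as a standalone script, you would end the file with a call such as sys.exit(main(sys.argv)).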

Summary:

  • hot bucket - Currently being written to; contents change non-incrementally; do not back this up.
  • warm bucket - Rolled from hot; new warm buckets are added incrementally; can be safely backed up; an index typically has multiple warm buckets.
  • cold bucket - Rolled from warm; buckets are moved to another location.
  • frozen bucket - Default policy is to delete.

For detailed information on how buckets work and where they are stored, see "How Splunk stores indexes".

Choose your backup strategy

The general recommendation is to schedule backups of your warm buckets regularly, using the incremental backup utility of your choice.

Hot buckets can only be backed up by taking a snapshot of the files, using a tool like Volume Shadow Copy Services (on Windows/NTFS), ZFS snapshots (on ZFS), or a snapshot facility provided by the storage subsystem. If you do not have such a facility available, the data within the hot bucket can only be backed up after it has rolled to a warm bucket.
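The recommendation to back up warm buckets incrementally and leave hot buckets alone can be sketched in Python as follows. This is a minimal example, not a replacement for a real backup utility. It assumes that warm bucket directories can be recognized by a "db_" name prefix and that a warm bucket's contents do not change after it has rolled. Verify both against your own index directories. The function name and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def backup_warm_buckets(db_dir, backup_dir, warm_prefix="db_"):
    """Copy warm bucket directories that are not yet present in backup_dir.

    Returns the names of the buckets copied during this run.
    """
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for bucket in sorted(db_dir.iterdir()):
        # Assumption: warm buckets start with warm_prefix. Everything else,
        # including hot buckets, is skipped.
        if not (bucket.is_dir() and bucket.name.startswith(warm_prefix)):
            continue
        target = backup_dir / bucket.name
        if target.exists():
            continue  # already backed up by an earlier run
        # Copy under a temporary name first so that an interrupted run
        # never leaves a half-copied bucket under its final name.
        partial = backup_dir / (bucket.name + ".partial")
        if partial.exists():
            shutil.rmtree(partial)
        shutil.copytree(bucket, partial)
        partial.rename(target)
        copied.append(bucket.name)
    return copied
```

Running the function repeatedly on a schedule copies only the buckets that have rolled to warm since the previous run.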

Splunk rolls a hot bucket to a warm bucket based on the policy defined in indexes.conf. By default, the main index rolls a hot bucket when it reaches a certain size. (While it is possible to force a roll of a hot bucket to a warm bucket, this is not recommended, as each forced roll permanently decreases search performance over the data. In cases where hot data needs to be backed up, a snapshot backup is the preferred method.)

You can set retirement and archiving policy by controlling the size of indexes or buckets or the age of the data.

The sizes, locations, and ages of index files are set in indexes.conf. See "How Splunk stores indexes" for detailed information on buckets and indexes.conf.

Caution: All index locations must be writable.

Recommendations for recovery

If you experience a non-catastrophic disk failure (for example, you still have some of your data, but Splunk won't run), Splunk recommends that you move the index directory aside and restore from a backup rather than restoring on top of a partially corrupted datastore. Splunk automatically creates hot directories on startup as necessary and resumes indexing. Monitored files and directories pick up where they were at the time of the backup.
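The "move aside, then restore" approach can be sketched in Python as shown below. The sketch assumes that Splunk is stopped while it runs and that the backup directory holds a complete copy of the index directory. The function name and the ".damaged-" suffix are hypothetical choices for this example.

```python
import shutil
import time
from pathlib import Path


def restore_index(index_dir, backup_dir):
    """Move a damaged index directory aside and restore it from a backup.

    Returns the path the damaged directory was moved to, or None if no
    index directory existed.
    """
    index_dir, backup_dir = Path(index_dir), Path(backup_dir)
    # Check the backup first so the index is never moved without a replacement.
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"backup not found: {backup_dir}")
    aside = None
    if index_dir.exists():
        # Keep the damaged data under a new name instead of overwriting it.
        aside = index_dir.with_name(f"{index_dir.name}.damaged-{int(time.time())}")
        index_dir.rename(aside)
    shutil.copytree(backup_dir, index_dir)
    return aside
```

Keeping the damaged directory under a separate name means nothing from the partially corrupted datastore is mixed into the restored copy, and the old data remains available for inspection.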

Rolling buckets manually from hot to warm

To roll the buckets of an index manually from hot to warm, use the following command, replacing <index_name> with the name of the index you want to roll:

From the CLI

./splunk _internal call /data/indexes/<index_name>/roll-hot-buckets -auth <admin_username>:<admin_password>
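If you need to issue this CLI call from a script, a small Python wrapper might look like the following sketch. It only assembles the command shown above, with the -auth flag written with a plain ASCII hyphen. The function names and the default ./splunk path are assumptions for the example. Keep in mind that forced rolls are not recommended, as noted earlier in this topic, and that a password passed on the command line may be visible to other users on the host.

```python
import subprocess


def build_roll_command(index_name, username, password, splunk_bin="./splunk"):
    """Return the argument list for the roll-hot-buckets CLI call."""
    if not index_name or "/" in index_name:
        raise ValueError("index_name must be a plain index name")
    return [
        splunk_bin,
        "_internal",
        "call",
        f"/data/indexes/{index_name}/roll-hot-buckets",
        "-auth",
        f"{username}:{password}",
    ]


def roll_hot_buckets(index_name, username, password, splunk_bin="./splunk"):
    """Run the CLI call; raises CalledProcessError on a non-zero exit code."""
    cmd = build_roll_command(index_name, username, password, splunk_bin)
    return subprocess.run(cmd, check=True)
```

For example, build_roll_command("main", "admin", "changeme") produces the same arguments as typing the command by hand for the main index.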

From the search bar

Rolling buckets from the search bar has been deprecated and is no longer available.

This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8

