Admin Manual

 



This documentation does not apply to the most recent version of Splunk.

Back up your data

Back up your configurations or all your indexed data.


Back up your Splunk configurations only

To back up your configurations, make an archive or copy of $SPLUNK_HOME/etc/ (where $SPLUNK_HOME is the directory into which you installed Splunk, /opt/splunk by default). This directory contains all the default and custom settings for your Splunk install, including your saved searches, user accounts, tags, custom source type names and configuration files.

Copy this directory to a new Splunk instance to restore. You don't have to stop Splunk to do this.
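One way to make the archive is sketched below. The function name `backup_splunk_etc`, the timestamped archive name, and the `/var/backups/splunk` destination are illustrative assumptions, not part of Splunk; only `$SPLUNK_HOME/etc` and the `/opt/splunk` default come from this topic.

```shell
#!/bin/sh
# Archive the Splunk configuration directory ($SPLUNK_HOME/etc).
# backup_splunk_etc and the archive naming are hypothetical helpers.
backup_splunk_etc() {
    splunk_home="${1:-${SPLUNK_HOME:-/opt/splunk}}"  # install dir; /opt/splunk is the default
    dest_dir="${2:-/var/backups/splunk}"             # assumed backup location
    archive="$dest_dir/splunk-etc-$(date +%Y%m%d-%H%M%S).tar.gz"

    mkdir -p "$dest_dir" || return 1
    # -C stores paths relative to $SPLUNK_HOME ("etc/..."), so the archive
    # can be unpacked under the install directory of another instance.
    tar -czf "$archive" -C "$splunk_home" etc || return 1
    echo "$archive"
}
```

Calling `backup_splunk_etc /opt/splunk /mnt/backup` prints the path of the new archive. To restore on another instance, unpack it with `tar -xzf <archive> -C "$SPLUNK_HOME"`.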


Back up your indexed data

This topic discusses some considerations for planning your Splunk index backup strategy. It first gives an overview of how your indexed data moves through Splunk, then makes recommendations for planning your backup strategy based on this information.

For specific details on changing the default values mentioned in this topic, refer to the topic about setting up data retirement policies. For a discussion of the best practices for backing up your Splunk data, see "Best practices for backing up" on the Deployment Wiki. For a related discussion of "buckets" and how Splunk uses them, see "Understanding buckets" on the Deployment Wiki.

How data moves through Splunk

When Splunk is indexing, the data moves through a series of stages based on policies that you define. At a high level, the default behavior is as follows:

When data is first indexed, it is put into the hot database, also known as the hot db.

The data remains in the hot db until the policy conditions are met for it to be reclassified as warm data. This is called rolling the data into the warm db. By default, this happens when the hot db reaches a specified size, but you can set up a saved search to force it to happen on a schedule, or better still, write a script to force it to happen on a schedule from the Splunk CLI. Some details on doing this are given a little later in this topic.

When the hot db is rolled, its directory is renamed to be a bucket in the warm db, and a new hot db is created immediately to receive the new data being indexed. At this point, it is safe to back up the warm db buckets.

Next, when the number of warm buckets reaches a specified limit (300 by default), the oldest warm buckets are renamed to be cold buckets so that no more than 300 warm buckets remain. (If your cold db is located on another fileshare, the buckets are moved to it and then deleted from the warm db directory.) Be aware that the more warm buckets you have, the more places Splunk has to look to execute searches, so adjust this setting accordingly.

Finally, when your data meets the policy requirements you have defined, it is frozen. By default, frozen data is deleted. If you need to save data indefinitely, you must change this setting.

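Summary: data is indexed into the hot db, rolls into warm buckets, is renamed (or moved) to cold buckets once the warm bucket limit is reached, and is finally frozen, which deletes it by default.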

Choose your backup strategy

The general recommendation is to schedule backups of your warm db buckets regularly using the incremental backup utility of your choice.
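If you do not have an incremental backup utility in place, the idea can be sketched with standard shell tools. This sketch assumes that bucket directories sit directly under one index's db directory, that hot directories have names starting with `db-hot`, and that a bucket no longer changes once it has rolled to warm. The function name and both paths are hypothetical; check the layout of your own install before relying on it.

```shell
#!/bin/sh
# Copy bucket directories that are not yet present in the backup directory.
# Assumptions: buckets are subdirectories of $index_db, and hot directories
# are named db-hot*. backup_warm_buckets is a hypothetical helper.
backup_warm_buckets() {
    index_db="$1"     # db directory of one index (site-specific path)
    backup_dir="$2"   # destination for the bucket copies
    copied=0

    mkdir -p "$backup_dir" || return 1
    for bucket in "$index_db"/*; do
        [ -d "$bucket" ] || continue
        name=$(basename "$bucket")
        case "$name" in
            db-hot*) continue ;;                  # hot data is still being written
        esac
        [ -e "$backup_dir/$name" ] && continue    # already backed up
        # Copy under a temporary name and rename afterwards, so an
        # interrupted copy is never mistaken for a complete bucket.
        rm -rf "$backup_dir/$name.partial"
        cp -Rp "$bucket" "$backup_dir/$name.partial" || return 1
        mv "$backup_dir/$name.partial" "$backup_dir/$name" || return 1
        copied=$((copied + 1))
    done
    echo "$copied"    # number of buckets copied in this run
}
```

Because buckets that already exist in the backup are skipped, running the function repeatedly copies only buckets that have rolled since the previous run.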

Splunk rolls the hot db to the warm db based on the policy you define; by default, this happens when the hot db reaches a certain size. If your indexing volume is fairly low, the default policy means that your hot db is rolled to warm very infrequently. If a disk failure then destroys your hot db, you could lose a lot of data that has not yet been backed up.

If you're concerned about losing data in this fashion, you can configure Splunk to force a roll from hot to warm on whatever schedule you're comfortable with, and then schedule your backup utility to back up the warm db immediately after that.

You should note, however, that if you roll too frequently, you might experience a degradation in search speed, as well as use more disk space than you otherwise would. Every time data is rolled from hot to warm, a new 'bucket' is created, which means that searches have to look in more buckets to see all the data. As a result, Splunk recommends that you roll no more frequently than once a day. Tune this to suit your particular data retention, search performance, and backup needs.

If your environment requires that you back up more than once a day, you can deploy Splunk in an HA configuration where forwarders are configured to send all your data to two different Splunk indexers, and use the second one as your hot backup.

Rolling from the CLI

You can use the following syntax to force a roll of the hot db to warm:

./splunk search '| oldsearch !++cmd++::roll' -auth admin:changeme

This will roll the default index, which is typically main.

You can specify an index to be rolled like this:

./splunk search ' | oldsearch index=_internal !++cmd++::roll' -auth admin:changeme

Right after the roll, you'll always see the error "Search Execute failed because Hot db rolled out to warm"; you can safely ignore it. You'll also need to provide the admin password to execute this CLI command.

If you want to roll more than one index, you have to do them each separately. To list out your indexes, use ./splunk list index
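Since each index has to be rolled separately, a small wrapper around the roll command above can loop over the index names. In this sketch, `roll_indexes`, `SPLUNK_BIN`, and `SPLUNK_AUTH` are hypothetical names; the index names are passed in by hand because the output format of `./splunk list index` is not described here.

```shell
#!/bin/sh
# Force a hot-to-warm roll for each index named on the command line.
# roll_indexes, SPLUNK_BIN and SPLUNK_AUTH are illustrative names.
roll_indexes() {
    splunk_bin="${SPLUNK_BIN:-./splunk}"
    auth="${SPLUNK_AUTH:-admin:changeme}"
    for index in "$@"; do
        echo "Rolling index: $index"
        # The roll always reports "Search Execute failed because Hot db
        # rolled out to warm", so a failing exit status is ignored here
        # (an assumption about how the CLI signals that error).
        "$splunk_bin" search '| oldsearch index='"$index"' !++cmd++::roll' -auth "$auth" || true
    done
}
```

For example, `roll_indexes main _internal` rolls both indexes in turn. Scheduling a call like this shortly before your backup job keeps the hot db small at backup time.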

Recommendations for recovery

If you experience a non-catastrophic disk failure (for example, you still have some of your data, but Splunk won't run), Splunk recommends that you move the index directory aside and restore from the backup rather than restoring on top of a partially corrupted datastore. Splunk will automatically create the db-hot directories on startup and resume indexing. Monitored files and directories will pick up where they were at the time of the backup.
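The move-aside-then-restore approach can be sketched as follows. It assumes Splunk is not running and that the backup is a plain copy of the index directory; `restore_index` and both paths are hypothetical.

```shell
#!/bin/sh
# Move a damaged index directory aside, then restore a copy from backup.
# restore_index is a hypothetical helper; run it only while Splunk is stopped.
restore_index() {
    index_dir="$1"    # damaged index directory (site-specific path)
    backup_dir="$2"   # backup copy of that directory
    aside="$index_dir.damaged-$(date +%Y%m%d-%H%M%S)"

    [ -d "$backup_dir" ] || { echo "no backup at $backup_dir" >&2; return 1; }
    if [ -e "$index_dir" ]; then
        # Keep the damaged data instead of overwriting it, so it can
        # still be inspected or salvaged later.
        mv "$index_dir" "$aside" || return 1
        echo "$aside"
    fi
    cp -Rp "$backup_dir" "$index_dir" || return 1
}
```

The function prints the location of the set-aside directory. Once you have confirmed that the restored index works, you can delete that directory to reclaim the disk space.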

Before you restore

If you restore a full /opt/splunk backup, check these two items before starting the new instance.

License key (Splunk Professional)

Your backup may include an expired license key in $SPLUNK_HOME/etc/splunk.license. Install a current one or get a temporary evaluation key from splunk.com if you don't have one.

Active input configurations

If you don't want your restored Splunk Server to instantly begin adding new data to its index, move any active inputs.conf files out of the way before starting the server. This is useful if you want to revisit an old index without having new events added to it.

# mv $SPLUNK_HOME/etc/system/local/inputs.conf $SPLUNK_HOME/etc/system/local/inputs.conf.disabled
# mv $SPLUNK_HOME/etc/system/default/inputs.conf $SPLUNK_HOME/etc/system/default/inputs.conf.disabled
# splunk start

This documentation applies to the following versions of Splunk: 3.3, 3.3.1, 3.3.2, 3.3.3, 3.3.4, 3.4, 3.4.1, 3.4.2, 3.4.3, 3.4.5, 3.4.6, 3.4.8, 3.4.9, 3.4.10, 3.4.11, 3.4.12, 3.4.13, 3.4.14

