Restore archived indexed data

You restore archived data by moving the archived bucket into your thawed directory (for example, $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb) and then processing it, as described later in this topic. Data in thaweddb is not subject to the server's index aging scheme (hot > warm > cold > frozen). You can keep archived data in the thawed directory for as long as you need it. When you no longer need the data, delete it or move it out of the thawed directory.

Important: The procedure for restoring archived data depends on whether the data was originally indexed in Splunk Enterprise version 4.2 or later, or in a pre-4.2 version. This is because Splunk Enterprise changed its rawdata format in 4.2.

See "Archive indexed data" for information on how to archive data in the first place. You can also use that page as guidance if you want to re-archive data after you've thawed it.

Restored data does not count against your license.

Restrictions when restoring an archive to a different instance of the indexer

For the most part, you can restore an archive to any instance of the indexer, not just the one that originally indexed it. This, however, depends on a couple of factors:

Splunk Enterprise version. You cannot restore a bucket created by Splunk Enterprise 4.2 or later to a pre-4.2 indexer. The bucket data format changed between 4.1 and 4.2, and pre-4.2 indexers do not understand the new format. This means:
- 4.2+ buckets: You can restore a 4.2+ bucket to any 4.2+ instance.
- Pre-4.2 buckets: You can restore a pre-4.2 bucket to any indexer, pre-4.2 or post-4.2, aside from a few OS-related issues, described in the next bullet.

OS version. You can usually restore buckets to an indexer running on a different OS. Specifically:
- 4.2+ buckets: You can restore a 4.2+ bucket to an indexer running any operating system.
- Pre-4.2 buckets: You can restore a pre-4.2 bucket to an indexer running any operating system, with the restriction that you cannot restore pre-4.2 data to a system of different endianness. For example, data cannot be moved from PowerPC or SPARC systems to x86 or x86-64 systems, or vice versa. In addition, data generated on 64-bit systems is not likely to work well on 32-bit systems.

In addition, make sure that you do not introduce bucket ID conflicts to your index when restoring the archived bucket. This issue is discussed later.
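
As a rough illustration of that check, the following sketch looks for an existing bucket with the same ID before you copy an archive in. The helper names bucket_id and bucket_id_in_use are hypothetical and not part of Splunk. The sketch assumes that the bucket ID is the last underscore-separated field of the bucket directory name (as in the db_1181756465_1162600547_1001 example used in this topic) and that all of the index's buckets sit one level below the index directory, in subdirectories such as thaweddb. It does not account for other bucket naming schemes.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Splunk.

# Print the bucket ID, assumed to be the last "_"-separated field
# of the bucket directory name (db_<time>_<time>_<id>).
bucket_id() {
    name=$(basename "$1")
    printf '%s\n' "${name##*_}"
}

# Return 0 if a bucket directory with the given ID already exists
# in any subdirectory of the index directory, 1 otherwise.
#   $1: index directory, for example $SPLUNK_HOME/var/lib/splunk/defaultdb
#   $2: bucket ID to look for
bucket_id_in_use() {
    for dir in "$1"/*/db_*_*_"$2"; do
        # An unmatched glob stays literal, so test that the path exists.
        if [ -d "$dir" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `bucket_id_in_use "$SPLUNK_HOME/var/lib/splunk/defaultdb" "$(bucket_id db_1181756465_1162600547_1001)"` succeeds only if some bucket in that index already uses ID 1001, in which case you would pick a different ID.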

How to tell whether your archive bucket contains 4.2+ data

Before thawing the archive bucket, you need to identify whether the archive bucket is pre- or post-4.2. Here's how to tell the difference, assuming you archived the buckets using coldToFrozenDir or the provided example script:

- 4.2+ bucket: The bucket directory contains only the rawdata directory, which contains journal.gz.
- Pre-4.2 bucket: The bucket directory contains gzipped versions of .tsidx and .data files, along with a rawdata directory containing files named <int>.gz.
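
The check described above can be sketched as a small shell function. The name bucket_format is hypothetical. The function only tests for the files named above: rawdata/journal.gz for a 4.2+ bucket, or a rawdata file whose name starts with a digit and ends in .gz for a pre-4.2 bucket. Treat the result as a hint, not a guarantee, especially for buckets archived by a custom script.

```shell
#!/bin/sh
# Hypothetical helper, not part of Splunk.
# Prints "4.2+", "pre-4.2", or "unknown" for an archived bucket directory.
bucket_format() {
    bucket=$1
    # 4.2+ archives keep only rawdata/journal.gz.
    if [ -f "$bucket/rawdata/journal.gz" ]; then
        echo "4.2+"
        return 0
    fi
    # Pre-4.2 archives keep rawdata files named <int>.gz.
    for f in "$bucket"/rawdata/[0-9]*.gz; do
        if [ -f "$f" ]; then
            echo "pre-4.2"
            return 0
        fi
    done
    echo "unknown"
}
```

For example, `bucket_format /path/to/db_1181756465_1162600547_1001` prints one of the three labels, which you can use to decide which of the procedures in this topic applies.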

Important: If you archived the data with a script of your own, the contents of the resulting bucket depend entirely on that script and might not match either of these layouts.

If you archived the buckets using coldToFrozenDir or the provided example script, you can use the following procedures to thaw them.

Thaw a 4.2+ archive

*nix users

Here is an example of safely restoring a 4.2+ archive bucket to thawed:

1. Copy your archive bucket into the thawed directory:

cp -r db_1181756465_1162600547_1001 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb

Note: The bucket ID cannot conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, choose some other, non-conflicting bucket ID.

2. Execute the splunk rebuild command on the archive bucket to rebuild the indexes and associated files:

splunk rebuild $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/db_1181756465_1162600547_1001

3. Restart the indexer:

splunk restart
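
The three steps above can be combined into one function, shown here as a minimal sketch and not an official tool. The function name thaw_42_bucket and the SPLUNK_CMD variable are hypothetical. SPLUNK_CMD only lets you point at the splunk binary, or at a stub for a dry run. The sketch refuses to proceed if the archive does not contain rawdata/journal.gz or if a directory of the same name already exists in the thawed directory. It does not check for bucket ID conflicts elsewhere in the index, which remains a manual check.

```shell
#!/bin/sh
# Hypothetical wrapper around the copy, rebuild, and restart steps.
#   $1: archived 4.2+ bucket directory
#   $2: thawed directory of the target index (must already exist)
thaw_42_bucket() {
    archive=$1
    thawed_dir=$2
    splunk_cmd=${SPLUNK_CMD:-splunk}   # assumption: splunk is on PATH
    target="$thawed_dir/$(basename "$archive")"

    if [ ! -f "$archive/rawdata/journal.gz" ]; then
        echo "not a 4.2+ archive bucket: $archive" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "already present in thawed directory: $target" >&2
        return 1
    fi

    # Step 1: copy the archive bucket into the thawed directory.
    cp -r "$archive" "$target" || return 1
    # Step 2: rebuild the indexes and associated files.
    "$splunk_cmd" rebuild "$target" || return 1
    # Step 3: restart the indexer.
    "$splunk_cmd" restart
}
```

Setting `SPLUNK_CMD=echo` before calling the function performs the copy but only prints the rebuild and restart commands, which can be useful for checking paths first.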

Windows users

Here is an example of safely restoring a 4.2+ archive bucket to thawed:

1. Copy your archive bucket into the thawed directory:

xcopy D:\MyArchive\db_1181756465_1162600547_1001 %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\db_1181756465_1162600547_1001 /s /e /v

Note: The bucket ID cannot conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, choose some other, non-conflicting bucket ID.

2. Execute the splunk rebuild command on the archive bucket to rebuild the indexes and associated files:

splunk rebuild %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\db_1181756465_1162600547_1001

3. Restart the indexer:

splunk restart

Thaw a pre-4.2 archive

*nix users

Here is an example of safely restoring a pre-4.2 archive bucket to thawed:

1. Copy your archive bucket to a temporary location in the thawed directory:

# cp -r db_1181756465_1162600547_0 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756465_1162600547_0

2. If the bucket was compressed when originally archived, uncompress the contents in the thawed directory.

3. Rename the temporary bucket to something that the indexer will recognize:

# cd $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/
# mv temp_db_1181756465_1162600547_0 db_1181756465_1162600547_1001

Note: You must choose a bucket ID that does not conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, choose some other, non-conflicting bucket ID.

4. Refresh the manifests:

# cd $SPLUNK_HOME/bin
# ./splunk login
# ./splunk _internal call /data/indexes/main/rebuild-metadata-and-manifests

After a few moments, the contents of your newly thawed bucket should be searchable again.
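
Step 2 does not give a command, so here is one possible sketch of it. The function name uncompress_top_level is hypothetical. It assumes that only the files at the top level of the bucket (the gzipped .tsidx and .data files) were compressed at archive time and that the <int>.gz files under rawdata should stay as they are. If your archive script compressed the bucket differently, for example as a single tarball, this sketch does not apply.

```shell
#!/bin/sh
# Hypothetical helper, not part of Splunk.
# Gunzips every *.gz file directly inside the bucket directory.
# Files in subdirectories, including rawdata, are left untouched.
#   $1: temporary bucket directory inside the thawed directory
uncompress_top_level() {
    bucket=$1
    for f in "$bucket"/*.gz; do
        # Skip the literal pattern when nothing matches.
        [ -f "$f" ] || continue
        gunzip "$f" || return 1
    done
    return 0
}
```

In the example above, you would run it against the temporary copy, `uncompress_top_level $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756465_1162600547_0`, before renaming the bucket in step 3.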

Windows users

Here is an example of safely restoring a pre-4.2 archive bucket to thawed:

1. Copy your archive bucket to a temporary location in the thawed directory:

> xcopy D:\MyArchive\db_1181756465_1162600547_0 %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\temp_db_1181756465_1162600547_0 /s /e /v

2. If the bucket was compressed when originally archived, uncompress the contents in the thawed directory.

3. Rename the temporary bucket to something that the indexer will recognize:

> cd %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb
> move temp_db_1181756465_1162600547_0 db_1181756465_1162600547_1001

Note: You must choose a bucket ID that does not conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, choose some other, non-conflicting bucket ID.

4. Refresh the manifests:

> cd %SPLUNK_HOME%\bin
> splunk login
> splunk _internal call /data/indexes/main/rebuild-metadata-and-manifests

After a few moments, the contents of your newly thawed bucket should be searchable again.

Clustered data thawing

You can thaw archived clustered data onto individual peer nodes the same way that you thaw data onto any individual indexer. However, as described in "Archive indexed data", it is difficult to archive just a single copy of clustered data in the first place. If, instead, you archive data across all peer nodes in a cluster, you can later thaw the data by placing it into the thawed directories of the peer nodes from which it was originally archived. Because you are thawing all of the original data, including the copies, you end up with as many copies of the thawed data on your cluster as your replication factor specifies.

Note: Data is not replicated from the thawed directory. If you thaw just a single copy of a bucket, instead of all the copies, only that single copy resides in the cluster, in the thawed directory of the peer node where you placed it.
