
Restore archived indexed data
You restore archived data by moving the archived bucket into your thawed directory (for example, $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb) and then processing it, as described later in this topic. Data in thaweddb is not subject to the server's index aging scheme (hot > warm > cold > frozen). You can keep archived data in the thawed directory for as long as you need it. When you no longer need the data, delete it or move it out of thawed.
Important: The restore procedure depends on whether the data was originally indexed by a pre-4.2 version of Splunk Enterprise or by version 4.2 or later, because Splunk Enterprise changed its rawdata format in 4.2.
See "Archive indexed data" for information on how to archive data in the first place. You can also use that page as guidance if you want to re-archive data after you've thawed it.
Restored data does not count against your license.
Restrictions when restoring an archive to a different instance of the indexer
For the most part, you can restore an archive to any instance of the indexer, not just the one that originally indexed it. This, however, depends on a couple of factors:
- Splunk Enterprise version. You cannot restore a bucket created by Splunk Enterprise 4.2 or later to a pre-4.2 indexer. The bucket data format changed between 4.1 and 4.2, and pre-4.2 indexers do not understand the new format. This means:
  - 4.2+ buckets: You can restore a 4.2+ bucket to any 4.2+ instance.
  - Pre-4.2 buckets: You can restore a pre-4.2 bucket to any indexer, pre-4.2 or post-4.2, aside from a few OS-related issues, described in the next bullet.
- OS version. You can usually restore buckets to an indexer running on a different OS. Specifically:
  - 4.2+ buckets: You can restore a 4.2+ bucket to an indexer running any operating system.
  - Pre-4.2 buckets: You can restore a pre-4.2 bucket to an indexer running any operating system, provided that the target system has the same endianness as the source system. For example, you cannot move data from PowerPC or SPARC systems to x86 or x86-64 systems, or vice versa. Separately, data generated on 64-bit systems is not likely to work well on 32-bit systems.
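Before moving a pre-4.2 bucket between machines, it can help to compare the byte order and word size of the source and target hosts. The following is a minimal sketch using standard POSIX tools, not a Splunk-provided utility; the function names host_byte_order and host_word_size are hypothetical.

```sh
# Print "little" or "big" depending on the byte order of the current host.
# od -d reads the two bytes 0x01 0x00 as one unsigned 16-bit integer in
# host byte order: 1 on a little-endian host, 256 on a big-endian host.
host_byte_order() {
    value=$(printf '\001\000' | od -An -d | tr -d ' \n')
    if [ "$value" = "1" ]; then
        echo little
    else
        echo big
    fi
}

# Print the word size of the host in bits (typically 32 or 64).
host_word_size() {
    getconf LONG_BIT
}
```

Run both functions on the machine that created the archive and on the target indexer. If either result differs, treat a pre-4.2 restore as risky.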
In addition, make sure that you do not introduce bucket ID conflicts to your index when restoring the archived bucket. This issue is discussed later.
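One way to check for a bucket ID conflict before restoring is to look for existing bucket directories that end in the same ID. The sketch below makes several assumptions: the helper name bucket_id_in_use is hypothetical, buckets are assumed to live in the db, colddb, and thaweddb subdirectories of the index, and directory names are assumed to follow the db_<newest>_<oldest>_<id>, rb_<newest>_<oldest>_<id>, or hot_v1_<id> patterns. Clustered bucket names that carry an extra suffix after the ID are not covered.

```sh
# Hypothetical helper: exit status 0 if bucket ID $2 is already used by a
# bucket directory under the index directory $1, and 1 otherwise.
# Assumed layout: <index_dir>/{db,colddb,thaweddb}/<bucket directories>.
bucket_id_in_use() {
    index_dir=$1
    id=$2
    for sub in db colddb thaweddb; do
        for bucket in \
            "$index_dir/$sub"/db_*_*_"$id" \
            "$index_dir/$sub"/rb_*_*_"$id" \
            "$index_dir/$sub"/hot_v1_"$id"; do
            # An unmatched glob stays literal, so test for a real directory.
            if [ -d "$bucket" ]; then
                return 0
            fi
        done
    done
    return 1
}
```

For example, `bucket_id_in_use "$SPLUNK_HOME/var/lib/splunk/defaultdb" 1001 && echo "ID 1001 is taken"` reports a conflict for the ID used in the examples in this topic.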
How to tell whether your archive bucket contains 4.2+ data
Before thawing the archive bucket, you need to identify whether the archive bucket is pre- or post-4.2. Here's how to tell the difference, assuming you archived the buckets using coldToFrozenDir or the provided example script:
- 4.2+ bucket: The bucket directory contains only the rawdata directory, which contains journal.gz.
- Pre-4.2 bucket: The bucket directory contains gzipped versions of .tsidx and .data files, along with a rawdata directory containing files named <int>.gz.
Important: If you archived the data through some script of your own, the resulting bucket could contain just about anything.
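The two layouts described above can be told apart with a small shell check. This is a sketch only: the function name bucket_format is hypothetical, and it assumes the gzipped index files in a pre-4.2 archive carry the suffixes .tsidx.gz and .data.gz. Buckets archived by a custom script may match neither case.

```sh
# Hypothetical helper: print "4.2+", "pre-4.2", or "unknown" for the
# archived bucket directory given as $1.
bucket_format() {
    bucket=$1
    # 4.2+ archives keep only rawdata/journal.gz.
    if [ -f "$bucket/rawdata/journal.gz" ]; then
        echo "4.2+"
        return 0
    fi
    # Pre-4.2 archives keep gzipped .tsidx and .data files at the top level
    # (the .gz suffixes are an assumption about how they were compressed).
    for f in "$bucket"/*.tsidx.gz "$bucket"/*.data.gz; do
        if [ -f "$f" ]; then
            echo "pre-4.2"
            return 0
        fi
    done
    echo "unknown"
    return 1
}
```

For example, `bucket_format db_1181756465_1162600547_1001` prints the detected format, and a result of "unknown" suggests inspecting the bucket by hand before thawing it.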
If you archived the buckets using coldToFrozenDir or the provided example script, you can use the following procedures to thaw them.
Thaw a 4.2+ archive
*nix users
Here is an example of safely restoring a 4.2+ archive bucket to thawed:
1. Copy your archive bucket into the thawed directory:
cp -r db_1181756465_1162600547_1001 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb
Note: The bucket ID cannot conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, rename the copied bucket directory so that it ends in some other, non-conflicting bucket ID.
2. Execute the splunk rebuild command on the archive bucket to rebuild the indexes and associated files:
splunk rebuild $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/db_1181756465_1162600547_1001
3. Restart the indexer:
splunk restart
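When several 4.2+ buckets need thawing, steps 1 and 2 can be wrapped in a loop. The following is a rough sketch, not a supported tool: the function name thaw_buckets, the SPLUNK_BIN variable, and the /opt/splunk fallback path are all assumptions, and only the cp and splunk rebuild commands come from the procedure above. It skips a bucket if a directory with the same name already exists in thawed, but it does not otherwise check for bucket ID conflicts.

```sh
# Hypothetical helper: copy every db_* bucket from archive directory $1 into
# thawed directory $2 and run "splunk rebuild" on each copy.
# SPLUNK_BIN (assumed variable) overrides the path to the splunk executable;
# the /opt/splunk fallback is an assumption about the install location.
thaw_buckets() {
    archive_dir=$1
    thawed_dir=$2
    splunk_bin=${SPLUNK_BIN:-${SPLUNK_HOME:-/opt/splunk}/bin/splunk}
    for bucket in "$archive_dir"/db_*; do
        [ -d "$bucket" ] || continue
        name=$(basename "$bucket")
        # Do not overwrite a bucket that is already in the thawed directory.
        if [ -e "$thawed_dir/$name" ]; then
            echo "skipping $name: already present in $thawed_dir" >&2
            continue
        fi
        cp -r "$bucket" "$thawed_dir/" || return 1
        "$splunk_bin" rebuild "$thawed_dir/$name" || return 1
    done
}
```

A call such as `thaw_buckets /path/to/archive "$SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb"` processes the buckets one at a time. The restart in step 3 is still needed afterward and is deliberately left out of the function.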
Windows users
Here is an example of safely restoring a 4.2+ archive bucket to thawed:
1. Copy your archive bucket into the thawed directory:
xcopy D:\MyArchive\db_1181756465_1162600547_1001 %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\db_1181756465_1162600547_1001 /s /e /v
Note: The bucket ID cannot conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, rename the copied bucket directory so that it ends in some other, non-conflicting bucket ID.
2. Execute the splunk rebuild command on the archive bucket to rebuild the indexes and associated files:
splunk rebuild %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\db_1181756465_1162600547_1001
3. Restart the indexer:
splunk restart
Thaw a pre-4.2 archive
*nix users
Here is an example of safely restoring a pre-4.2 archive bucket to thawed:
1. Copy your archive bucket to a temporary location in the thawed directory:
cp -r db_1181756465_1162600547_0 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756465_1162600547_0
2. If the bucket was compressed when originally archived, uncompress the contents in the thawed directory.
3. Rename the temporary bucket to something that the indexer will recognize:
cd $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/
mv temp_db_1181756465_1162600547_0 db_1181756465_1162600547_1001
Note: You must choose a bucket ID that does not conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, choose some other, non-conflicting bucket ID.
4. Refresh the manifests:
cd $SPLUNK_HOME/bin
./splunk login
./splunk _internal call /data/indexes/main/rebuild-metadata-and-manifests
After a few moments, the contents of your newly thawed bucket should be searchable again.
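Step 2 of the procedure above (uncompressing the bucket contents) can be sketched as a small shell function. The function name uncompress_bucket_indexes is hypothetical, and the sketch rests on two assumptions: that the archived index files end in .tsidx.gz and .data.gz, and that the <int>.gz files under rawdata are meant to stay compressed. Check both against your own archive before relying on it.

```sh
# Hypothetical helper: gunzip the index files at the top level of the
# pre-4.2 bucket directory given as $1.
# Files under rawdata/ are not touched (assumption: the <int>.gz rawdata
# files are stored compressed even in a live bucket).
uncompress_bucket_indexes() {
    bucket=$1
    for f in "$bucket"/*.tsidx.gz "$bucket"/*.data.gz; do
        # Skip patterns that matched nothing.
        [ -f "$f" ] || continue
        gunzip "$f" || return 1
    done
}
```

In the example above, this would be run as `uncompress_bucket_indexes $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756465_1162600547_0`, before the rename in step 3.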
Windows users
Here is an example of safely restoring a pre-4.2 archive bucket to thawed:
1. Copy your archive bucket to a temporary location in the thawed directory:
xcopy D:\MyArchive\db_1181756465_1162600547_0 %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\temp_db_1181756465_1162600547_0 /s /e /v
2. If the bucket was compressed when originally archived, uncompress the contents in the thawed directory.
3. Rename the temporary bucket to something that the indexer will recognize:
cd %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb
move temp_db_1181756465_1162600547_0 db_1181756465_1162600547_1001
Note: You must choose a bucket ID that does not conflict with any other bucket in the index. This example assumes that the bucket ID '1001' is unique for the index. If it isn't, choose some other, non-conflicting bucket ID.
4. Refresh the manifests:
cd %SPLUNK_HOME%\bin
splunk login
splunk _internal call /data/indexes/main/rebuild-metadata-and-manifests
After a few moments, the contents of your newly thawed bucket should be searchable again.
Clustered data thawing
You can thaw archived clustered data onto individual peer nodes the same way that you thaw data onto any individual indexer. However, as described in "Archive indexed data", it is difficult to archive just a single copy of clustered data in the first place. If, instead, you archive data across all peer nodes in a cluster, you can later thaw the data by placing it into the thawed directories of the peer nodes from which it was originally archived. Because you are thawing all of the original data, including the replicated copies, you end up with as many copies of the thawed data on your cluster as your replication factor specifies.
Note: Data does not get replicated from the thawed directory. So, if you thaw just a single copy of some bucket, instead of all the copies, only that single copy will reside in the cluster, in the thawed directory of the peer node where you placed it.
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9
Comments
Instead of restarting an indexer(s) after a restore, is there an API / REST call that can be made to avoid disrupting a cluster?
Thawing one or two buckets at a time is very slow. Splunk doesn't allow you to selectively rebuild buckets (based on index or time range), nor does it rebuild indexes for multiple buckets concurrently, which really slows things down a lot. See this answer for a script that solves these problems: http://answers.splunk.com/answers/120007/thawing-out-multiple-buckets-at-once.html#answer-246439
The 4.2+ procedures have been updated to remove the "temp_" prefix from the bucket name.
This does not appear to work on buckets that exceed frozenTimePeriodInSecs if they have the tmp_ prefix (verified on windows 7). In my experience you will either need to increase this to match the bucket's contents or change the name to a standard name (db_([0-9]{10}_?){2}_[0-9]+) to make splunk ignore it as thawed data.
Tysonstewart - When you initially copied the archive bucket to the thaweddb directory, did you rename the copied bucket to use the "temp_" prefix? Because of some formatting issues, the wiki cuts off the right side of the command (although you can see it if you scroll horizontally). The full command should be:
# cp -r db_1181756465_1162600547_0 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756465_1162600547_0
Having just done this for a 64-bit Linux host running Splunk 6.0, there are two things I learned:
1. The command "splunk rebuild" did not work if the bucket had a "temp_" prefix as in the instructions. It did work if the directory was the normal bucket name.
2. Don't try to thaw buckets generated on a 64-bit host on a 32-bit host. It might appear to work at first, but the lexicon files won't get generated properly, so searching won't actually work.
Michaeljlancaster - Unfortunately, the restart is necessary.