Archive cold buckets to frozen in Hadoop

Data ages locally on every indexer. The way you configure an index determines the data size or age at which its data moves to the next state (hot, warm, cold, frozen) and is ultimately deleted.

Once you configure an index to archive data, the archiving of indexes runs on a schedule that is determined globally on the Splunk search head.

Because these two processes run independently, the indexer's local aging process and the archiving process can fall out of step. As a result, an indexer can delete a bucket before it has been archived.

To prevent buckets from being deleted before they are archived, you can use the splunk_archiver app's coldToFrozen.sh script in the local indexer process. This script shifts the responsibility for deleting buckets from the indexer to Hadoop Data Roll, so use it only for indexes that are being archived.

Treat the coldToFrozen.sh script as a fallback, not as your primary hook for archiving. The script buys you more time to archive a given bucket when your system is receiving data faster than normal or when the archiving storage layer is down. To reduce the need for the fallback, for each archive index set vix.output.buckets.older.than = <seconds> as low as possible, so that buckets are archived as quickly as possible.
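
As a minimal sketch of that setting, assuming the archive index is defined in its own stanza in indexes.conf: the stanza name is a placeholder and the value 300 (five minutes) is only illustrative, not a recommended default. Choose a value that suits your ingestion rate and archive schedule.

[<archive index name>]
vix.output.buckets.older.than = 300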

Configure the cold bucket to roll to frozen

Note the following if you are using the coldToFrozen.sh script:

The script must be configured in each stanza that defines an index that is being archived.
All search peers of the search head must have the script installed. You can install it on each peer manually or use the deployer for search head clusters.
The script must be removed from any index for which you disable archiving. Otherwise, the script continues to run and the data fills your available disk space, because there is no archive to receive the data and so it is never deleted.
Do not add this script to any indexers that are not configured to archive data.

For each Splunk index that is being archived, use the provided script named coldToFrozen.sh, located in $SPLUNK_HOME/etc/apps/splunk_archiver/bin/, to archive your cold data to frozen. This path may vary depending on your configuration. For example:

[<index name>]
coldToFrozenScript = "$SPLUNK_HOME/etc/apps/splunk_archiver/bin/coldToFrozen.sh"
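
As a concrete illustration of the template above, the following stanza applies the script to a single index. The index name web_logs is hypothetical and stands in for one of your own indexes that is configured for archiving. The script path assumes the default splunk_archiver app location.

[web_logs]
coldToFrozenScript = "$SPLUNK_HOME/etc/apps/splunk_archiver/bin/coldToFrozen.sh"

If you later disable archiving for this index, remove the coldToFrozenScript line from its stanza, as noted above.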
