Splunk Cloud

Splunk Cloud Admin Manual

Store expired Splunk Cloud data to a Splunk-managed archive

You might need to retain older data for later access or for compliance purposes. Dynamic Data Active Archive allows you to move data from your Splunk Cloud indexes to a Splunk-maintained archive. You configure archiving at the index level by creating an archiving rule for a specific index, which gives you the flexibility to archive only the data that you need. You can configure Splunk Cloud to automatically archive the data from an index when that data reaches the end of the index's searchable retention period.

You can also restore archived data to your Splunk Cloud environment for searching at any time within the configured archive retention period. Restored data can be cleared manually; otherwise, it expires from searchable storage after 30 days. You can also track archived and restored storage consumption relative to your entitlement, as well as the growth and expiration of your archived data.

Dynamic Data Active Archive moves data from a Splunk index to the Splunk-maintained archive, and back from the archive to the Splunk index, in a secure and tamper-resistant manner.

Dynamic Data Active Archive is not available in the following GCP regions: Iowa, London, and Singapore.

How Dynamic Data Active Archive works

Data is moved to the archive when the data in the index reaches a configured time threshold. When that threshold is met, Splunk Cloud attempts to move the data to the archive location. If an error occurs, for example because of connection issues, Splunk Cloud retries the move every 15 minutes until it succeeds.
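
The retry behavior can be pictured with a small simulation. This is only an illustrative sketch: the helper name archive_attempt_times and the idea of passing in a number of failed attempts are assumptions made for the example and are not part of Splunk Cloud. Only the 15-minute interval comes from this topic.

```python
from datetime import datetime, timedelta, timezone

RETRY_INTERVAL = timedelta(minutes=15)  # retry interval stated in this topic


def archive_attempt_times(first_attempt, failed_attempts):
    """Return the time of every attempt, given how many attempts fail
    before one succeeds (hypothetical helper, for illustration only)."""
    return [first_attempt + i * RETRY_INTERVAL for i in range(failed_attempts + 1)]


start = datetime(2018, 7, 10, 0, 0, tzinfo=timezone.utc)
attempts = archive_attempt_times(start, failed_attempts=3)
for t in attempts:
    print(t.isoformat())
```

With three failed attempts, the fourth (successful) attempt in this model happens 45 minutes after the first one.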

It can take up to 48 hours from the archive initiation for the archiving process to complete.

If an error occurs, the error is logged to the splunkd.log. Splunk Cloud does not delete data from the Splunk Cloud environment until it has successfully moved the data to the archive. If you need to restore the data so that it is searchable, you can restore the data to your Splunk Cloud environment. You can then search the data and delete it when you have finished.

When you restore archived data to Splunk Cloud, it does not count against the indexing license volume for the Splunk Cloud deployment.

Dynamic Data Active Archive performance

Restoring a large amount of data can impact performance. Splunk Cloud checks the size of the data you want to restore and displays a warning if the size might impact performance. If the amount of data is likely to severely impact performance, Splunk Cloud blocks the restore. If this occurs, select a smaller time range.

Configure archive settings for an index

Configure archive settings for a specific index.

Managing archive settings requires the indexes_edit capability. All archiving changes are logged in the audit.log file.

Configure archiving for an index

  1. In Splunk Cloud, go to Settings > Indexes.
  2. Click New Index to create a new index or click Edit in the Actions column for an existing index.
  3. In the Dynamic Data Storage field, select Splunk Archive.
  4. Set the archive retention period. You can specify this value in years, months, or days. Note that the maximum archive retention period is displayed. This value is based on your licensed archive retention period. Specify a value within this range.
  5. Click Save.

Disable archiving for an index

  1. Go to Settings > Indexes.
  2. Click Edit in the Actions column for the index you want to manage.
  3. In the Dynamic Data Storage field, select Self Storage to move data to a self-storage location when it expires, or select No Additional Storage to delete data as it expires.
  4. Click Save. When data in this index expires, it is deleted.

Disabling archiving for an index marks the existing archived data with a status of delete. Deleted archive data will be permanently erased 30 days after the deletion date. If you disable archiving for an index in error, contact Splunk Support as soon as possible. If you have a support contract, file a new case using the Splunk Support Portal. Otherwise, contact Splunk Customer Support.
Be aware that disabling archiving for an index does not affect the time or size of the data retention policy for the index.
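
To see how much time remains to contact Splunk Support after archiving is disabled in error, you can work out the permanent-erase date from the 30-day window described above. This is a minimal sketch: the function names are hypothetical, and it assumes that the "deletion date" is the day on which archiving was disabled for the index.

```python
from datetime import date, timedelta

ERASE_DELAY = timedelta(days=30)  # erase delay stated in this topic


def permanent_erase_date(archiving_disabled_on):
    """Date on which archived data marked for deletion is permanently erased.
    Assumes the deletion date is the day archiving was disabled."""
    return archiving_disabled_on + ERASE_DELAY


def days_left_to_contact_support(archiving_disabled_on, today):
    """Whole days remaining before the data is permanently erased (never negative)."""
    return max((permanent_erase_date(archiving_disabled_on) - today).days, 0)


disabled = date(2020, 11, 1)
print(permanent_erase_date(disabled))                              # 2020-12-01
print(days_left_to_contact_support(disabled, date(2020, 11, 18)))  # 13
```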

Restore archived data to Splunk Cloud

You might need to restore indexed data from the Splunk archive. After the data is restored, you can search it like any other data. You restore data based on the time period of the data you want to search. For example, you might want to restore one day of data. When you pick a date from the date picker, it is treated as 12 AM UTC on the selected date. So, to restore one day's worth of archived data (for example, 07/10/2018), specify 07/10/2018 in the 'from' field and 07/11/2018 in the 'to' field. By default, restored data is searchable for 30 days and is then removed from Splunk Cloud. It is not removed from the archive.
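
Because each picked date is treated as 12 AM UTC, the 'to' date works as an exclusive end of the range. The following sketch computes the two picker values for a given number of whole days. The function name restore_window is hypothetical and exists only for this example.

```python
from datetime import date, datetime, time, timedelta, timezone


def restore_window(first_day, days=1):
    """Return the 'from' and 'to' picker dates and the UTC instants they
    represent. The 'to' date is exclusive because each picked date is
    treated as 12 AM UTC."""
    last_exclusive = first_day + timedelta(days=days)
    start = datetime.combine(first_day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(last_exclusive, time.min, tzinfo=timezone.utc)
    return first_day, last_exclusive, start, end


frm, to, start, end = restore_window(date(2018, 7, 10))
print(frm.strftime("%m/%d/%Y"), to.strftime("%m/%d/%Y"))  # 07/10/2018 07/11/2018
print(end - start)                                        # 1 day, 0:00:00
```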

The archival process can take up to 48 hours to complete and the restoration process can take up to 24 hours to complete. Because the complete archival and restoration cycle can take up to 72 hours to complete, be sure to plan any data restoration processes accordingly.
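
For planning purposes, the worst-case timeline can be worked out from the two limits above. This is a sketch under the stated upper bounds only: actual archival and restoration can finish sooner, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

ARCHIVE_MAX = timedelta(hours=48)  # archival can take up to 48 hours
RESTORE_MAX = timedelta(hours=24)  # restoration can take up to 24 hours


def worst_case_timeline(archive_started):
    """Return (latest time the data becomes restorable,
    latest time the restored data becomes searchable)."""
    restorable_from = archive_started + ARCHIVE_MAX
    searchable_by = restorable_from + RESTORE_MAX
    return restorable_from, searchable_by


started = datetime(2018, 7, 10, 0, 0, tzinfo=timezone.utc)
restorable_from, searchable_by = worst_case_timeline(started)
print(restorable_from.isoformat())  # 2018-07-12T00:00:00+00:00
print(searchable_by.isoformat())    # 2018-07-13T00:00:00+00:00
```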

Before starting the restoration process, ensure that the data is fully archived and the timestamps are correct, or you will receive the following error message: "You cannot restore data that was archived less than 48 hours ago. Please try again later". For more information, see Troubleshoot Dynamic Data Active Archive.

How restoring data works

When you restore data to Splunk Cloud from the archive, a copy of the archived data is moved back to the Splunk Cloud environment. To ensure your data is safe, the original archived data is never moved or deleted. This method of temporary data restoration ensures that you can never mistakenly delete your archived data.

When you restore archived data to an index in your Splunk Cloud instance, it does not count against the retention periods configured for data in your index. Restored data exists outside of the constraints of retention periods and size limits and does not affect the retention of your existing index data.

When you restore data, Splunk Cloud checks several conditions to ensure that you do not experience performance issues and that you do not duplicate data and cause your queries to return incorrect results:

  • Check for overlapping data. Splunk Cloud does not restore data if you have already restored data in that same time range. This is to ensure you do not restore duplicate data, which would cause inaccurate search results. For example, if you specify that you want to restore data from 07/01/2018-07/03/2018 but you have already restored data from 07/01/2018-07/02/2018, Splunk Cloud will prevent your data restore. In this case, it is recommended that you restore only the data that falls outside the range you have already restored. In this example, you would restore data from 07/03/2018-07/04/2018.
  • Check to ensure data is not likely to cause performance issues. Splunk Cloud checks the size of the data you want to restore and presents you with a warning if the size of the data may cause performance issues. If the size of that data is very likely to cause performance issues, Splunk Cloud will prevent you from restoring the data.
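
The overlap check can be illustrated with a simple interval comparison. This sketch shows the idea only and is not how Splunk Cloud implements the check. It assumes that each range runs from 12 AM UTC on the 'from' date up to, but not including, 12 AM UTC on the 'to' date, and the function names are hypothetical.

```python
from datetime import date


def overlaps(a_from, a_to, b_from, b_to):
    """True if the half-open ranges [a_from, a_to) and [b_from, b_to) share any time."""
    return a_from < b_to and b_from < a_to


def can_restore(req_from, req_to, restored_ranges):
    """True if the requested range overlaps none of the already restored ranges."""
    return not any(overlaps(req_from, req_to, f, t) for f, t in restored_ranges)


already_restored = [(date(2018, 7, 1), date(2018, 7, 2))]

# Overlaps the range that is already restored, so the restore is prevented.
print(can_restore(date(2018, 7, 1), date(2018, 7, 3), already_restored))  # False
# Falls entirely outside the restored range, so the restore is allowed.
print(can_restore(date(2018, 7, 3), date(2018, 7, 4), already_restored))  # True
```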

After you have restored data, you may notice that events appear in your index that are older than your configured retention period specifies. This restored data will remain in your index for 30 days or until you clear it.

If your attempt to restore archived data fails, verify that the data was not recently archived. Because there is a time period during which data is being transitioned from Splunk Cloud to the archive, you will not be able to restore that data during the processing period. Generally, the data moved to the archive is available in approximately 48 hours.


What happens when you are finished searching the restored data

After the data is temporarily restored to your Splunk Cloud environment it is available for searching for 30 days. Restored data is a copy of the archived data so you never need to move the data back to the archive, but for best performance, you should remove the temporarily restored data when you have finished searching it.

Steps to restore data to Splunk Cloud

  1. In Splunk Cloud, go to Settings > Indexes.
  2. For the index where you want to restore data, click Restore. The menu displays the restore history for the specified index. You can see the history of data restoration and file size for the data restored.
  3. Use the date picker to select a time range to retrieve.
  4. Click Check size. Splunk Cloud checks to see if the size of the file might impact performance. If the file size is too large, Splunk Cloud blocks you from restoring data. If there is a potential performance impact, Splunk Cloud displays a warning. Splunk Cloud also prevents you from restoring data that overlaps with existing restored data.
  5. Enter an email address to send job status notifications. Splunk Cloud will notify you when the restoration is complete.
  6. Click Restore when you have refined the file size or date range to acceptable limits.

    After you initiate data restoration, it can take up to 24 hours before data is restored. If it takes longer than 24 hours, contact Splunk Technical Support.

  7. To check the status of your data restoration, click Splunk Archive in the Storage Type field to open the Archive page. To view the restore status, click the Restore tab. In the JobStatus field, you can see the status of your job:
     • Pending: The job has been submitted, but has not begun processing.
     • In progress: The job has been started, and is progressing.
     • Success: The data has been successfully restored to your index.
     • Cleared: You've successfully deleted the temporarily restored data from your index.
     • Expired: The restored data has passed the 30-day retention period and has been deleted from the index.
     • Failed: If you receive a Failed status, click the > button for the archive to display more details about why the restoration failed.

Steps to remove restored data from Splunk Cloud

Splunk recommends that you manually remove restored data when you have finished searching it. Because restored data is a copy of the archived data, you never need to move it back to the archive, but removing it gives you the best performance.

To remove restored data:

  1. In Splunk Cloud, go to Settings > Indexes.
  2. Select the index with data you want to remove and click Restore to open the Restore Archive page.
  3. For the range of data you want to remove, select Clear in the Actions column.

When the data is successfully removed, the Jobstatus column displays a Cleared status.

Monitor logs during archiving

Splunk generates logs when you archive data and when you restore archived data. You may want to monitor these logs to check for errors during these processes.

Archiving logs

To check for error messages that occur when you are archiving data, you can view the coldstoragearchiver entries in the splunkd.log. You can find these entries by running the following search:

index=_internal source=*/splunkd.log component=coldstoragearchiver

Data restoration logs

To check for error messages that occur when you restore archived data, you can view entries in the splunk_archiver_restoration.log, restoration.log, and python.log. You can find these entries by running the following searches:

index=_internal source=*/splunk_archiver_restoration.log

index=_internal source=*/restoration.log

index=_internal source=*/python.log
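
If you prefer to review all three logs at once, the searches can be combined with OR. The following sketch builds that combined search string. The function name is hypothetical, and the combined search is an illustration derived from the three searches above rather than a search documented for this feature.

```python
RESTORE_LOGS = ["splunk_archiver_restoration.log", "restoration.log", "python.log"]


def restoration_log_search(logs=RESTORE_LOGS):
    """Build one SPL search that covers every restoration-related log source."""
    sources = " OR ".join(f"source=*/{name}" for name in logs)
    return f"index=_internal ({sources})"


print(restoration_log_search())
# index=_internal (source=*/splunk_archiver_restoration.log OR source=*/restoration.log OR source=*/python.log)
```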

Manage your archives

You might want to review the status of your archived indexes or understand how much of your entitlement has been used. You can review the status of your archived indexes on the Archived Indexes page.

Steps to review the overall status of your restore requests for the last 90 days

  1. From Splunk Web, go to Settings > Indexes.
  2. From the Indexes page, click on a value in the Archive Retention column.
  3. Click the Restore tab to open the Restore page.
  4. Review the Restore Summary (90 days) table to see the overall status of your restored data.

Field                     Description
Total Restored Data (GB)  The total amount of raw data (uncompressed) that has been restored. This value is updated nightly.
Total Cleared Data (GB)   The total amount of raw data (uncompressed) that has been deleted from the restored archive. This value is updated nightly.
Total Expired Data (GB)   The total amount of raw data (uncompressed) that has expired from the restored archive. This value is updated nightly.

You can view the details for restored archived data from the last 90 days in the table below. For each index, you can see the following details:

Field               Description
Index Name          The name of the restored index.
Restored Count      The total number of restoration requests, including both successful and failed restore requests. This value also includes cleared and expired restore requests.
Restored Size (GB)  The total amount of raw data (uncompressed) that has been restored.
Cleared Count       The total number of restored index requests that have been manually deleted.
Cleared Size (GB)   The total amount of raw data (uncompressed) that has been manually deleted.
Expired Count       The total number of restored index requests that have aged out.
Expired Size        The total amount of restored raw data (uncompressed) that has aged out.

Steps to review the status of individual restore requests

  1. From Splunk Web, go to Settings > Indexes.
  2. From the Indexes page, click on a value in the Archive Retention column.
  3. Click the Restore tab to open the Restore page.
  4. Go to the Restore Request History (Last 50 requests) table.

From here, you can see the start time, end time, time of the request, data volume in GB, and the expiration date. To understand the status for each job, check the Job Status field for each index. The following table shows the possible values.

Status       Description
Pending      The request for restoration has been initiated, but has not yet begun.
In progress  The restoration process has started, but it has not been completed.
Success      The data has been successfully restored to your index.
Failure      The restoration failed. Click the > button next to the archive to display more details about the failure.
Cleared      You have successfully cleared the temporarily restored data.
Expired      The restored data has passed the 30-day retention threshold.
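
If you track restore requests outside of Splunk Web, the status values in the table can be modeled as an enumeration. This is a sketch only: the class and helper names are hypothetical, and the grouping into "finished" and "searchable" states is an interpretation of the descriptions in the table.

```python
from enum import Enum


class JobStatus(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "In progress"
    SUCCESS = "Success"
    FAILURE = "Failure"
    CLEARED = "Cleared"
    EXPIRED = "Expired"


# Requests that have been submitted but have not yet reached a final outcome.
IN_FLIGHT = {JobStatus.PENDING, JobStatus.IN_PROGRESS}


def is_finished(status):
    """True once the request is no longer pending or in progress."""
    return status not in IN_FLIGHT


def restored_data_in_index(status):
    """True only while the restored copy is present in the index."""
    return status is JobStatus.SUCCESS


status = JobStatus("In progress")
print(is_finished(status))                        # False
print(restored_data_in_index(JobStatus.SUCCESS))  # True
```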

After you have reviewed the archived indexes, you can determine what actions you want to take for each archived or restored index. You may want to clear restored data or stop archiving an index. Or you may see that a restoration or archive operation failed and choose to troubleshoot the issue.

Steps to review the overall size and growth of your archived indexes

You might want to review the size and growth of your archived indexes to better understand how much of your entitlement you are consuming. This can help you predict usage and expenses for your archived data.

  1. From Splunk Web, go to Settings > Indexes.
  2. From the Indexes page, click on a value in the Archive Retention column.

The Archive Summary page displays the following information:

Field                                    Description
Total Archive Usage                      The total amount of raw data (uncompressed) that is stored in the archive. This number turns red when total archive usage exceeds the total entitlement. This value is updated nightly.
Total Entitlement                        Your total entitlement as determined in your service agreement.
Total Archive Data Growth (90 Days)      The total amount of raw data (uncompressed) that has been added to the archive in the past 90 days. This value is updated nightly.
Total Archive Data Expiration (90 Days)  The total amount of raw data (uncompressed) that has aged out of the archive within the past 90-day window. This value is updated nightly. Note that each index has an archive retention setting and the data ages out over time. For example, index A has 2-year archive retention. Every night for that index, Splunk ages out the data that is older than 2 years.
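
The relationship between Total Archive Usage and Total Entitlement can be expressed as a simple calculation. The following sketch assumes both values are in GB and uses made-up numbers. The function name and the returned keys are hypothetical.

```python
def archive_usage_report(total_usage_gb, entitlement_gb):
    """Compare archive usage with entitlement (both in GB)."""
    return {
        "percent_used": round(100.0 * total_usage_gb / entitlement_gb, 1),
        "headroom_gb": entitlement_gb - total_usage_gb,
        # Corresponds to the condition under which the usage value turns red.
        "over_entitlement": total_usage_gb > entitlement_gb,
    }


print(archive_usage_report(750, 1000))
# {'percent_used': 75.0, 'headroom_gb': 250, 'over_entitlement': False}
print(archive_usage_report(1200, 1000))
# {'percent_used': 120.0, 'headroom_gb': -200, 'over_entitlement': True}
```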

Steps to review the size and growth of each archived index

You might want to review the size and growth of each index to understand how much it grows over time.

  1. From Splunk Web, go to Settings > Indexes.
  2. From the Indexes page, click on a value in the Archive Retention column.

The Archive Summary page displays the following information:

Field                        Description
Index Name                   Name of the index.
Current Size (GB)            The current amount of raw data (uncompressed) that is stored in the archive for each index.
Earliest Event               The earliest event in the archived index.
Latest Event                 The latest event in the archived index.
90-Day Data Growth (GB)      The amount of raw data (uncompressed) that has been added to the archive in the past 90 days for each index.
90-Day Data Expiration (GB)  The amount of raw data (uncompressed) that has aged out of the archive in the past 90 days for each index.

Troubleshoot the Dynamic Data Active Archive

I received an error when attempting to restore data

If an error occurs, the error is logged to the splunkd.log. If you experience errors when you review the Archive page, review the splunkd.log entries for the coldstoragearchiver component by running the following search:

index=_internal source=*/splunkd.log component=coldstoragearchiver

I clicked the Check Size button and nothing happened

When restoring data, I clicked the Check Size button multiple times and nothing happened.

Diagnosis

When restoring a large amount of data, it may take some time for Splunk to verify that the size of the data can be restored without causing performance issues. If you click the Check Size button multiple times, it may trigger AWS to block the check process.

Solution

Do not click the Check Size button multiple times if you don't immediately receive feedback.

I archived some of my data. When I attempted to restore it a few hours later, an error message appeared.

When I archived data and attempted to restore it soon after, I received an error message.

Diagnosis

Data can take up to 48 hours to archive. If you attempt to restore the data before this time period completes, the restoration will fail and the following error message appears: You cannot restore data that was archived less than 48 hours ago. Please try again later.

Solution

Wait until the 48-hour threshold has been met, and then attempt to restore the data. You can check the status of the archival process as follows:

  • Run the following query against the internal log to determine when Splunk Cloud started the archival process:

index=_internal component=ColdStorageArchiver "Successfully executed archiving script"

I'm trying to restore fully archived data, but I'm still receiving an error message about data archival.

I'm trying to restore data and the archival process completed more than 48 hours ago, but I'm receiving the following error message: You cannot restore data that was archived less than 48 hours ago. Please try again later.

Diagnosis

Receiving this error message after the archival process is complete indicates that there are incorrect timestamps in the data.

Solution

Contact Splunk Technical Support for help with correcting the timestamp data.

Last modified on 18 November, 2020

This documentation applies to the following versions of Splunk Cloud: 8.0.2006, 8.0.2007, 8.1.2008, 8.1.2009, 8.1.2011

