Store expired Splunk Cloud data to your private archive

You might need to retain older data so that you can access it later or meet compliance requirements. Dynamic Data Self Storage allows you to move your data from your Splunk Cloud indexes to an Amazon S3 bucket in your own AWS environment. You can configure Splunk Cloud to automatically move the data in an index when the data reaches the end of the Splunk Cloud retention period that you configure. In addition, you can restore your data to a Splunk Enterprise instance. To ensure there is no data loss, Dynamic Data Self Storage maintains your data in the Splunk Cloud environment until it is safely moved to your configured self storage location.

Requirements for Dynamic Data Self Storage

Dynamic Data Self Storage is available for Managed Splunk Cloud instances only.

To configure a self storage location in an Amazon S3 bucket in your own AWS environment, you need sufficient AWS permissions to create S3 buckets and assign bucket policies to them. If you are not the AWS administrator for your organization, ensure that you are able to work with the AWS administrator during this process to create a new S3 bucket and apply the bucket policy generated by Splunk.

After you move the data to your self storage location, Splunk Cloud does not maintain a copy of this data and does not have access to maintain the data in your self storage environment, so ensure that you understand how to maintain and monitor your data before moving it to your S3 bucket. The S3 buckets that you use to store your expired Splunk Cloud data must be in the same region as your Splunk Cloud environment. See Create an S3 bucket in your AWS environment for details about creating your S3 bucket.

Dynamic Data Self Storage is not available in the following GCP regions: Iowa, London, and Singapore.

If you intend to restore your data, you will also need access to a Splunk Enterprise instance.

Performance

Dynamic Data Self Storage is designed to retain your expired data with minimal performance impact, and it ensures that the export rate does not spike when a large volume of data expires at once. For example, if you reduce the retention period from one year to ninety days, the volume of expiring data increases, but the export rate does not spike. This ensures that changes in data volume do not impact performance. For more details about Dynamic Data Self Storage performance and limits, see the Splunk Cloud service limits and constraints.

How Dynamic Data Self Storage works

Data is moved to your self storage location when the index meets a configured size or time threshold. When that threshold is met, Splunk Cloud attempts to move the data to your configured Amazon S3 bucket. Note the following:

  • If an error occurs, if there are connection issues, or if the Amazon S3 bucket is unavailable or full, Splunk Cloud attempts to move the data every 15 minutes until it can successfully move it.
  • Splunk Cloud does not delete data from the Splunk Cloud environment until it has successfully moved the data to your self storage location.
  • Data is moved within the AWS environment from your indexer to your self storage location using SSL, so the data is encrypted in transit to your Amazon S3 bucket. Because the Splunk Cloud encryption functionality applies only to data handled within Splunk buckets, you might want to encrypt the data in the target bucket after transit. To ensure your data is protected, enable AES256 SSE-S3 on your target bucket so that the data is encrypted at rest as soon as it arrives. Enabling AES256 SSE-S3 provides server-side encryption with Amazon S3 managed keys (SSE-S3). A command-line sketch for enabling this follows the list.
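
For example, if you manage your buckets with the AWS CLI, a minimal sketch for enabling default AES256 (SSE-S3) encryption might look like the following. The bucket name is a placeholder based on the Buttercup Cloudworks example used later in this topic.

    # Turn on default server-side encryption (SSE-S3) for the target bucket.
    aws s3api put-bucket-encryption \
        --bucket buttercupcloudworks-rs73hfjie674-dynamic-data-archive \
        --server-side-encryption-configuration \
        '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'

    # Confirm that default encryption is now active on the bucket.
    aws s3api get-bucket-encryption \
        --bucket buttercupcloudworks-rs73hfjie674-dynamic-data-archive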

After the data is stored to your Amazon S3 bucket, you maintain the data using your Amazon S3 tools according to the needs of your organization. If you need to restore the data so that it is searchable, you can restore the data to a Splunk Enterprise instance. The data is restored to a thawed directory, which exists outside of the thresholds for deletion you have configured on your Splunk Enterprise instance. You can then search the data and delete it when you have finished.

When you restore data to a thawed directory on Splunk Enterprise, it does not count against the indexing license volume for the Splunk Enterprise or Splunk Cloud deployment.

Diagram: expired data moves from the Splunk Cloud instance to your Amazon S3 bucket, and restored data moves from the S3 bucket to a Splunk Enterprise instance.

Configure self storage locations on Amazon S3

Set up one or more Amazon S3 buckets to store your expired Splunk Cloud data. Managing self storage locations requires the Splunk indexes_edit capability. All self storage configuration changes are logged in the audit.log file.

Create an Amazon S3 bucket in your AWS environment

In your AWS Management Console, create a new S3 bucket. For information on how to create and manage S3 buckets, search for S3 in the AWS documentation.

Important:

  • Region: Ensure that the Amazon S3 bucket is in the same region as your Splunk Cloud environment.
  • Naming: When you name the S3 bucket, it must include the Splunk prefix provided to you and displayed in the UI under the AWS S3 bucket name field. Enter the prefix before the rest of the bucket name. This prefix contains your organization's Splunk Cloud ID, which is the first part of your organization's Splunk Cloud URL, and a 12-character string. The complete S3 bucket name is in the following format:
{Splunk Cloud ID}-{12-character string}-{your bucket name}
For example, suppose you administer Splunk Cloud for Buttercup Cloudworks. Your organization's Splunk Cloud URL is buttercupcloudworks.splunkcloud.com, so your Splunk Cloud ID is buttercupcloudworks. The New Self Storage Location dialog box displays a prefix like the following:
buttercupcloudworks-rs73hfjie674-{your bucket name}
If you do not use this prefix, Splunk cannot write to your bucket. By default, your Splunk Cloud instance has a security policy applied which disallows write operations to S3 buckets that do not include your Splunk Cloud ID. This security policy helps ensure that the write operation is allowed only for S3 buckets that you have created for the purpose of storing your expired Splunk Cloud data.


If you have enabled AES256 SSE-S3 on your target bucket, the data is encrypted at rest upon arrival at the bucket.
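
If you prefer to create the bucket with the AWS CLI instead of the AWS Management Console, the following is a minimal sketch. The bucket name and the us-west-2 region are placeholders; use the prefix displayed in the UI and the region of your Splunk Cloud environment.

    # Create the bucket in the same region as your Splunk Cloud environment.
    # The name must start with the Splunk prefix shown in the UI.
    # (For us-east-1, omit --create-bucket-configuration.)
    aws s3api create-bucket \
        --bucket buttercupcloudworks-rs73hfjie674-dynamic-data-archive \
        --region us-west-2 \
        --create-bucket-configuration LocationConstraint=us-west-2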

Access the Self Storage Locations page

  1. In Splunk Web, click Settings > Indexes.
  2. Click New Index to create a new index, or click Edit in the Actions column for an existing index.
  3. In the Dynamic Data Storage field, click the radio button for Self Storage.
  4. Under the message "No self storage locations have been created yet", click Create a self storage location. Splunk Web opens a new tab to allow you to manage your self storage locations.

Create a self storage location

  1. On the Self Storage Locations page, click New Self Storage Location.
  2. Give your location a Title and an optional Description.
  3. Enter the Amazon S3 bucket name of the S3 bucket that you created.
  4. (Optional) Enter the bucket folder name.
  5. Click Generate. Splunk Cloud generates a bucket policy.
  6. Copy the bucket policy to your clipboard. Do not modify the permissions on the bucket policy.
  7. In a separate window, navigate to your AWS Management Console and apply this policy to the S3 bucket you created earlier. (A command-line sketch follows this procedure.)
  8. On the Self Storage Locations page in Splunk Web, click Test bucket policy. Splunk Cloud writes a 0 KB test file to the root of your S3 bucket to verify that Splunk Cloud has permissions to write to the bucket. A success message displays, and the Submit button is enabled.
  9. Click Submit.
  10. In the AWS Management Console, verify that the 0 KB test file appears in the root of your bucket.
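
If you apply the bucket policy from the AWS CLI rather than the AWS Management Console, a sketch of step 7 might look like the following, assuming you saved the generated policy, unmodified, to a file named splunk-policy.json. The bucket name is a placeholder.

    # Apply the Splunk-generated policy (copied verbatim) to the bucket.
    aws s3api put-bucket-policy \
        --bucket buttercupcloudworks-rs73hfjie674-dynamic-data-archive \
        --policy file://splunk-policy.json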

You cannot edit or delete a self storage location after it is defined, so verify the name and description before you save it.

Manage self storage settings on an index

Enable Dynamic Data Self Storage on any Splunk Cloud index to allow expired data to be stored to an Amazon S3 bucket.

Managing self storage settings requires the Splunk indexes_edit capability. All self storage configuration changes are logged in the audit.log file.

Enable self storage for an index

Prerequisite
You must have configured a self storage location. See Configure self storage locations on Amazon S3 for details.

  1. Go to Settings > Indexes.
  2. Click New Index to create a new index, or click Edit in the Actions column for an existing index.
  3. In the Dynamic Data Storage field, click the radio button for Self Storage.
  4. Select a storage location from the drop-down list.
  5. Click Save.

Disable self storage for an index

If you disable self storage for an index, expired data is deleted.

  1. Go to Settings > Indexes.
  2. Click Edit in the Actions column for the index you want to manage.
  3. In the Dynamic Data Storage field, click the radio button for No Additional Storage.
  4. Click Save. Self storage is disabled for this index. When data in this index expires, it is deleted.


Disabling self storage for an index does not change the configuration of the external location, nor does it delete the external location or the data stored there. Disabling self storage also does not affect the time or size of the data retention policy for the index.

Verify Splunk Cloud successfully moved your data

To verify that your data was successfully moved to your self storage location, you can search the splunkd.log files.

You must have an sc_admin role to search the splunkd.log files.

  1. Search your splunkd.log files to view the self storage logs. Run the following search in Splunk Web.

    index="_internal" component=SelfStorageArchiver

  2. Next, search to see which buckets were successfully moved to the self storage location:

    index="_internal" component=SelfStorageArchiver "Successfully transferred"

  3. Verify that all the buckets you expected to move were successfully transferred.
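
If you want to script this verification rather than run it in Splunk Web, the same search can be submitted through the Splunk REST API. This is a hedged sketch, not a documented Splunk Cloud procedure: it assumes REST access over port 8089 is enabled for your deployment, and the hostname and credentials are placeholders.

    # Placeholder hostname and credentials; REST API access must be
    # enabled for your Splunk Cloud deployment.
    curl -u sc_admin_user:password \
        'https://buttercupcloudworks.splunkcloud.com:8089/services/search/jobs/export' \
        --data-urlencode 'search=search index=_internal component=SelfStorageArchiver "Successfully transferred"' \
        -d output_mode=csv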

Monitor changes to self storage settings

You might want to monitor changes to self storage settings to ensure that the self storage locations and settings meet your company's requirements over time. When you make changes to self storage settings, Splunk Cloud logs the activity to the audit.log. You can search these log entries in Splunk Web by running the following search.

index="_audit"


Note that Splunk Cloud cannot monitor the settings for the self storage S3 bucket on AWS. For information about monitoring your Amazon S3 buckets, see the Amazon S3 documentation and search for "Monitoring Tools".

The following examples show the log entries available for monitoring your self storage settings.

Log entry for a new self storage location

Splunk Cloud logs the activity when you create a new self storage location. For example:

10-01-2017 11:28:26.180 -0700 INFO  AuditLogger - Audit:[timestamp=10-01-2017 11:28:26.180, user=splunk-system-user, action=self_storage_enabled, info="Self storage enabled for this index.", index="dynamic_data_sample" ][n/a]

You can search these log entries in Splunk Web by running the following search.

index="_audit" action=self_storage_create

Log entry when you remove a self storage location

Splunk Cloud logs the activity when you remove a self storage location. For example:

10-01-2017 11:33:46.180 -0700 INFO  AuditLogger - Audit:[timestamp=10-01-2017 11:33:46.180, user=splunk-system-user, action=self_storage_disabled, info="Self storage disabled for this index.", index="dynamic_data_sample" ][n/a]

You can search these log entries in Splunk Web by running the following search.

index="_audit" action=self_storage_disabled

Log entry when you change settings for a self storage location

Splunk Cloud logs the activity when you change the settings for a self storage location. For example:

 09-25-2017 21:14:21.190 -0700 INFO  AuditLogger - Audit:[timestamp=09-25-2017 21:14:21.190, user=splunk-system-user, action=self_storage_edit, info="A setting that affects data retention was changed.", index="dynamic_data_sample", setting="frozenTimePeriodInSecs", old_value="440", new_value="5000" ][n/a] 

The following list describes the fields in the example entry.

  • info="A setting that affects data retention was changed.": Notification that you successfully changed self storage settings for the specified index.
  • index="dynamic_data_sample": Name of the index for which self storage settings were modified.
  • setting="frozenTimePeriodInSecs": The setting that changed. frozenTimePeriodInSecs is the number of seconds before an event is removed from an index. This value is specified in days when you configure index settings.
  • old_value="440": Value before the setting was updated.
  • new_value="5000": Value after the setting was updated.

You can search these log entries in Splunk Web by running the following search.

index="_audit" action=self_storage_edit

Restore indexed data from a self storage location

You might need to restore indexed data from a self storage location. You restore this data by moving the exported data into a thawed directory on a Splunk Enterprise instance, such as $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb. After the data is restored, you can search it. You can restore one bucket at a time. Data in the thaweddb directory is not subject to the server's index aging scheme, which prevents it from immediately expiring upon being restored. You can keep archived data in the thawed directory for as long as you need it. When the data is no longer needed, delete it or move it out of the thawed directory.

As a best practice, restore your data using a *nix machine. Using a Windows machine to restore indexed data to a Splunk Enterprise instance might result in a benign error message. See Troubleshoot Dynamic Data Self Storage.

  1. Set up a Splunk Enterprise instance. The Splunk Enterprise instance can be either local or remote. If you have an existing Splunk Enterprise instance, you can use it.

    You can restore self storage data only to a Splunk Enterprise instance. You can't restore self storage data to a Splunk Cloud instance.

  2. Install the AWS Command Line Interface tool. The AWS CLI tool must be installed on the same machine as the Splunk Enterprise instance responsible for rebuilding the data.
  3. Configure the AWS CLI tool with the credentials of your AWS self storage location. For instructions on configuring the AWS CLI tool, see the Amazon Command Line Interface Documentation.
  4. Use the recursive copy command to download data from the self storage location to the thaweddb directory for your index. You can restore only one bucket at a time. If you have a large number of buckets to restore, consider using a script to do so (a sketch follows this procedure). Use syntax similar to the following:
    aws s3 cp s3://<self storage bucket>/<self_storage_folder(s)>/<index_name> $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/ --recursive
    

    Make sure you copy all the contents of the archived Splunk bucket because they are needed to restore the data. For example, copy starting at the following level: db_timestamp_timestamp_bucketID. Do not copy the data at the level of raw data (.gz files). The buckets display in the thaweddb directory of your Splunk Enterprise instance.

  5. Restore the indexes by running the following command:
    ./splunk rebuild $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/<bucket_folder> <index_name>
    
    When the index is successfully restored, a success message displays and additional bucket files are added to the thawed directory, including .tsidx files.
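
If you have many buckets to restore, the following is a hypothetical sketch of the script suggested in step 4, not a Splunk-provided tool. It assumes the AWS CLI is configured with credentials for your self storage location; the bucket name, folder path, and index name (myindex) are placeholders.

    #!/bin/bash
    # Restore every archived bucket for one index from self storage,
    # then rebuild each one so that it becomes searchable.
    SELF_STORAGE="s3://buttercupcloudworks-rs73hfjie674-dynamic-data-archive/archive/myindex"
    THAWED="$SPLUNK_HOME/var/lib/splunk/myindex/thaweddb"

    # Archived Splunk buckets are folders named db_<timestamp>_<timestamp>_<bucketID>.
    for bucket in $(aws s3 ls "$SELF_STORAGE/" | awk '/PRE db_/ {print $2}'); do
        bucket="${bucket%/}"   # strip the trailing slash from the listing
        # Copy the entire bucket folder; all of its contents are needed.
        aws s3 cp "$SELF_STORAGE/$bucket" "$THAWED/$bucket" --recursive
        # Rebuild the bucket (one at a time) so that it is searchable.
        "$SPLUNK_HOME/bin/splunk" rebuild "$THAWED/$bucket" myindex
    done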

After the data is restored, go to the Search & Reporting app, and search on the restored index as you would any other Splunk index.

When you restore data to the thawed directory on Splunk Enterprise, it does not count against the indexing license volume for the Splunk Enterprise or Splunk Cloud deployment.

Troubleshoot Dynamic Data Self Storage

When you configure self storage, you might encounter the following issues.

I don't know the region of my Splunk Cloud environment

I received the following error when testing my self storage location: Your S3 bucket must be in the same region as your Splunk Cloud environment <AWS Region>.

Diagnosis

Splunk Cloud detected that you created your S3 bucket in a different region than your Splunk Cloud environment.

Solution

If you are unsure of the region of your Splunk Cloud environment, review the error message. The <AWS Region> portion of the error message displays the correct region to create your S3 bucket. After you determine the region, repeat the steps to create the self storage location.

I received an error when testing the self storage location

When I attempted to create a new self storage location, the following error occurred when I clicked Test bucket policy: Unable to verify the region of your S3 bucket, unable to get bucket_region to verify. An error occurred (403) when calling the HeadBucket operation: Forbidden. Contact Splunk Support.

Diagnosis

You might get an error for the following reasons:

  • You modified the permissions on the bucket policy.
  • You pasted the bucket policy into the incorrect Amazon S3 bucket.
  • You did not paste the bucket policy to the Amazon S3 bucket, or you did not save the changes.
  • An error occurred during provisioning.

Solution

  1. Ensure that you did not modify the S3 bucket permissions. The following actions must be allowed: s3:PutObject, s3:GetObject, s3:ListBucket, s3:ListBucketVersions, s3:GetBucketLocation.
  2. Verify that you applied the bucket policy to the correct S3 bucket, and that you saved your changes. (You can check the bucket region and policy from the command line, as shown after this list.)
  3. If you created the S3 bucket in the correct region, the permissions are correct and you applied and saved the bucket policy to the correct S3 bucket, contact Splunk Support to further troubleshoot the issue.
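
The following hedged sketch shows one way to check the bucket's region and applied policy with the AWS CLI; the bucket name is a placeholder.

    # Check the bucket's region. A null LocationConstraint means us-east-1.
    aws s3api get-bucket-location \
        --bucket buttercupcloudworks-rs73hfjie674-dynamic-data-archive

    # Confirm the Splunk-generated policy is attached and that its Action
    # list includes the permissions named in step 1.
    aws s3api get-bucket-policy \
        --bucket buttercupcloudworks-rs73hfjie674-dynamic-data-archive \
        --query Policy --output text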

To review the steps to create the S3 bucket, see "Create an Amazon S3 bucket in your AWS environment" in this topic.

To review how to apply a bucket policy, see the Amazon AWS S3 documentation and search for "how do I add an S3 bucket policy?".

I received an error when using Windows to restore data.

I attempted to restore data using a Windows machine, but the following error occurred: "Reason='ERROR_ACCESS_DENIED'. Will try to copy contents"

Diagnosis

This error occurs only on Windows builds and is benign. Splunk Cloud bypasses this error by copying the content. You can safely ignore the error and continue with the restore process.

Solution

This error is benign. You can ignore it and continue with the restore process. See Restore indexed data from a self storage location.

I'm using Splunk Cloud for a US government entity, and received an error message that the bucket couldn't be found.

I received the following error message: "Cannot find the bucket '{bucket_name}', ensure that the bucket is created in the '{region_name}' region."

Diagnosis

S3 bucket names aren't globally unique for US government entities using Splunk Cloud because Splunk can verify only the region of the stack. Buckets with the same name can exist in both East and West regions.

Solution

If buckets that share the same name must exist in both regions, add the missing bucket to the appropriate region.

I don't see or I can't download the test file in my self storage location.

I followed the instructions in Create a self storage location, but I have one of the following issues:

  • I don't see the test file in my target folder.
  • I see the test file in the root directory, but I can't download it.

Diagnosis

Splunk Cloud places the test file in the root directory of your S3 bucket. You can view the test file, but you can't download it.

Solution

If you see the test file in the root directory, you've successfully created a self storage location.
