
Store expired Splunk Cloud data

You might need to maintain older data to access it or for compliance purposes. Dynamic Data Self Storage allows you to move your data from your Splunk Cloud indexes to an Amazon S3 bucket in your own AWS environment. You can configure Splunk Cloud to automatically move the data in an index when the data reaches the end of the Splunk Cloud retention period you configure. In addition, you can restore your data to a Splunk Enterprise instance. To ensure there is no data loss, Dynamic Data Self Storage maintains your data in the Splunk Cloud environment until it is safely moved to your configured self storage location.

Requirements for Dynamic Data Self Storage

Dynamic Data Self Storage is available for Managed Splunk Cloud instances only.

To configure a self storage location in an Amazon S3 bucket in your own AWS environment, you need sufficient AWS permissions to create S3 buckets and assign bucket policies to them. If you are not the AWS administrator for your organization, ensure that you are able to work with the AWS administrator during this process to create a new S3 bucket and apply the bucket policy generated by Splunk.

After you move the data to your self storage location, Splunk Cloud does not maintain a copy of this data and does not have access to maintain the data in your self storage environment, so ensure that you understand how to maintain and monitor your data before moving it to your S3 bucket.

The S3 buckets that you use to store your expired Splunk Cloud data must be in the same region as your Splunk Cloud environment. See Create an S3 bucket in your AWS environment for details about creating your S3 bucket.

If you intend to restore your data, you will also need access to a Splunk Enterprise instance.

Performance

Dynamic Data Self Storage is designed to retain your expired data with minimal performance impact. Dynamic Data Self Storage also ensures that the export rate does not spike when a large volume of data expires at once. For example, if you reduce the retention period from one year to ninety days, the volume of expiring data increases, but the export rate does not spike. This ensures that changes in data volume do not impact performance. For more details about Dynamic Data Self Storage performance and limits, see the Splunk Cloud service limits and constraints.

How Dynamic Data Self Storage works

Data is moved to your self storage location when the index meets a configured size or time threshold. You may have configured the data to be stored for a certain number of days or until the index reaches a certain size. When that threshold is met, Splunk Cloud attempts to move the data to your configured Amazon S3 bucket. If you have enabled AES256 SSE-S3 encryption on your target bucket, the data remains encrypted at rest after it arrives in the bucket. If an error occurs, if there are connection issues, or if the Amazon S3 bucket is unavailable or full, Splunk Cloud attempts to move the data every 15 minutes until it can successfully move it. Splunk Cloud does not delete data from the Splunk Cloud environment until it has successfully moved the data to your self storage location.

After the data is stored in your Amazon S3 bucket, you maintain the data using your Amazon S3 tools according to the needs of your organization. If you need to restore the data so that it is searchable, you can restore the data to a Splunk Enterprise instance. The data is restored to a thawed directory, which exists outside of the thresholds for deletion you have configured on your Splunk Enterprise instance. You can then search the data and delete it when you have finished.

When you restore data to a thawed directory on Splunk Enterprise, it does not count against the indexing license volume for the Splunk Enterprise or Splunk Cloud deployment.

Diagram: expired data moves from the Splunk Cloud instance to an Amazon S3 bucket in your AWS environment when the data expires; to restore the data, you move it from the S3 bucket to a Splunk Enterprise instance.

Configure self storage locations on Amazon S3

Set up one or more Amazon S3 buckets to store your expired Splunk Cloud data.

Managing self storage locations requires the Splunk indexes_edit capability. All self storage configuration changes are logged in the audit.log file.

Create an Amazon S3 bucket in your AWS environment

In your AWS Management Console, create a new S3 bucket. For information on how to create and manage S3 buckets, search for S3 in the AWS documentation.

Important: The Amazon S3 bucket must be in the same region as your Splunk Cloud environment. The bucket name must begin with the prefix that is provided to you and displayed in the UI. If you do not use this prefix, Splunk cannot write to your bucket. By default, your Splunk Cloud instance has a security policy applied that disallows write operations to S3 buckets whose names do not include your Splunk Cloud ID. This security policy helps ensure that the write operation is allowed only for S3 buckets that you have created for the purpose of storing your expired Splunk Cloud data. If your organization's policies require you to name your S3 buckets without this prefix, open a case with Splunk Support to modify the string limitation.

If you have enabled AES256 SSE-S3 encryption on your target bucket, the data remains encrypted at rest after it arrives in the bucket.
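
The following AWS CLI commands are a minimal sketch of these two steps. The bucket name, prefix, and region are placeholders for your own values; use the prefix displayed in Splunk Web and the region of your Splunk Cloud environment.

    # Create the bucket in the same region as your Splunk Cloud environment.
    # <splunk-cloud-prefix> and us-west-2 are placeholder values.
    aws s3api create-bucket \
        --bucket <splunk-cloud-prefix>-expired-data \
        --region us-west-2 \
        --create-bucket-configuration LocationConstraint=us-west-2

    # Optionally enable default SSE-S3 (AES256) encryption on the bucket.
    aws s3api put-bucket-encryption \
        --bucket <splunk-cloud-prefix>-expired-data \
        --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'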

Access the Self Storage Locations page

  1. In Splunk Web, click Settings > Indexes.
  2. Click New Index to create a new index, or click Edit in the Actions column for an existing index.
  3. In the Dynamic Data Storage field, click the radio button for Self Storage.
  4. Under the message "No self storage locations have been created yet," click Create a self storage location. Splunk Web opens a new tab to allow you to manage your self storage locations.

Create a self storage location

  1. On the Self Storage Locations page, click New Self Storage Location.
  2. Give your location a Title and an optional Description.
  3. Enter the Amazon S3 bucket name of the S3 bucket that you created.
  4. (Optional) Enter the bucket folder name.
  5. Click Generate. Splunk Cloud generates a bucket policy.
  6. Copy the bucket policy to your clipboard. Do not modify the permissions on the bucket policy.
  7. In a separate window, navigate to your AWS Management Console and apply this policy to the S3 bucket you created earlier. (A command-line alternative is sketched after this procedure.)
  8. On the Self Storage Locations page in Splunk Web, click Test bucket policy. Splunk Cloud writes a 0KB test file to your S3 bucket to verify that Splunk Cloud has permissions to write to the bucket. A success message displays, and the Submit button is enabled.
  9. Click Submit.
  10. In the AWS Management Console, verify that the 0KB test file appears in your bucket.

You cannot edit or delete a self storage location after it is defined, so verify the name and description before you save it.
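
If you prefer to work from the command line, the following sketch shows one way to apply the generated policy and confirm the test file. The bucket name and policy file name are placeholders; the policy itself must be the one Splunk Cloud generated, applied unmodified.

    # Save the policy generated by Splunk Cloud to a local file,
    # for example bucket-policy.json, then apply it unmodified.
    aws s3api put-bucket-policy \
        --bucket <splunk-cloud-prefix>-expired-data \
        --policy file://bucket-policy.json

    # After you click Test bucket policy, list the bucket contents to
    # confirm that the 0KB test file arrived.
    aws s3 ls s3://<splunk-cloud-prefix>-expired-data/ --recursive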

Manage self storage settings on an index

Enable Dynamic Data Self Storage on any managed Splunk Cloud index to allow expired data to be stored to an Amazon S3 bucket.

Managing self storage settings requires the Splunk indexes_edit capability. All self storage configuration changes are logged in the audit.log file.

Enable self storage for an index

Prerequisite
You must have configured a self storage location. See Configure self storage locations on Amazon S3 in this topic for details.

  1. Go to Settings > Indexes.
  2. Click New Index to create a new index, or click Edit in the Actions column for an existing index.
  3. In the Dynamic Data Storage field, click the radio button for Self Storage.
  4. Select a storage location from the drop-down list.
  5. Click Save.

Disable self storage for an index

If you disable self storage for an index, expired data is deleted.

  1. Go to Settings > Indexes.
  2. Click Edit in the Actions column for the index you want to manage.
  3. In the Dynamic Data Storage field, click the radio button for No Additional Storage.
  4. Click Save. Self storage is disabled for this index. When data in this index expires, it is deleted.


Disabling self storage for an index does not change the configuration of the external location, nor does it delete the external location or the data stored there. Disabling self storage also does not affect the time or size of the data retention policy for the index.

Verify Splunk Cloud successfully moved your data

To verify that your data was successfully moved to your self storage location, you can search the splunkd.log files.

You must have the sc_admin role to search the splunkd.log files.

  1. Search your splunkd.log files to view the self storage logs. In Splunk Web, run the following search:

    index="_internal" component=SelfStorageArchiver

  2. Next, search to see which buckets were successfully moved to the self storage location:

    index="_internal" component=SelfStorageArchiver "Successfully transferred"

  3. Verify that all the buckets you expected to move were successfully transferred.

Monitor changes to self storage settings

You might want to monitor changes to self storage settings to ensure that the self storage locations and settings meet your company's requirements over time. When you make changes to self storage settings, Splunk Cloud logs the activity to the audit.log. You can search these log entries in Splunk Web by running the following search.

index="_audit"


Note that Splunk Cloud cannot monitor the settings for the self storage S3 bucket on AWS. For information about monitoring your Amazon S3 buckets, see the Amazon S3 documentation and search for "Monitoring Tools".

The following examples show the log entries available for monitoring your self storage settings.

Log entry when you enable self storage for an index

Splunk Cloud logs the activity when you enable self storage for an index. For example:

10-01-2017 11:28:26.180 -0700 INFO  AuditLogger - Audit:[timestamp=10-01-2017 11:28:26.180, user=splunk-system-user, action=self_storage_enabled, info="Self storage enabled for this index.", index="dynamic_data_sample" ][n/a]

You can search these log entries in Splunk Web by running the following search.

index="_audit" action=self_storage_create

Log entry when you disable self storage for an index

Splunk Cloud logs the activity when you disable self storage for an index. For example:

10-01-2017 11:33:46.180 -0700 INFO  AuditLogger - Audit:[timestamp=10-01-2017 11:33:46.180, user=splunk-system-user, action=self_storage_disabled, info="Self storage disabled for this index.", index="dynamic_data_sample" ][n/a]

You can search these log entries in Splunk Web by running the following search.

index="_audit" action=self_storage_disabled

Log entry when you change self storage settings for an index

Splunk Cloud logs the activity when you change a self storage setting that affects data retention for an index. For example:

 09-25-2017 21:14:21.190 -0700 INFO  AuditLogger - Audit:[timestamp=09-25-2017 21:14:21.190, user=splunk-system-user, action=self_storage_edit, info="A setting that affects data retention was changed.", index="dynamic_data_sample", setting="frozenTimePeriodInSecs", old_value="440", new_value="5000" ][n/a] 

The following table describes the fields in the log entry.

  info="A setting that affects data retention was changed."
      Notification that you successfully changed self storage settings for the specified index.
  index="dynamic_data_sample"
      Name of the index for which self storage settings were modified.
  setting="frozenTimePeriodInSecs"
      The setting that changed. frozenTimePeriodInSecs is the number of seconds before an event is removed from an index. This value is specified in days when you configure index settings in Splunk Web.
  old_value="440"
      Value before the setting was updated.
  new_value="5000"
      Value after the setting was updated.

You can search these log entries in Splunk Web by running the following search.

index="_audit" action=self_storage_edit

Restore indexed data from a self storage location

You might need to restore indexed data from a self storage location. You restore this data by moving the exported data into a thawed directory on a Splunk Enterprise instance, such as $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb. After the data is restored, you can search it. You can restore one bucket at a time. Data in the thaweddb directory is not subject to the server's index aging scheme, which prevents it from immediately expiring upon being restored. You can keep archived data in the thawed directory for as long as you need it. When the data is no longer needed, delete it or move it out of the thawed directory.

You cannot restore self storage data to a Splunk Cloud instance. You can restore self storage data only to a Splunk Enterprise instance.

  1. Set up a Splunk Enterprise instance. If you have an existing Splunk Enterprise instance, you can use it.
  2. Install the AWS CLI tool on the machine that hosts the Splunk Enterprise instance responsible for rebuilding the data, whether that instance is local or remote. To find the CLI tool, see AWS Command Line Interface.
  3. Configure the AWS CLI tool with the credentials of your AWS self storage location. For instructions on configuring the AWS CLI tool, see the Amazon Command Line Interface Documentation.
  4. Use the recursive copy command to download data from the self storage location to the thaweddb directory for your index. You can restore only one bucket at a time. If you have a large number of buckets to restore, consider using a script to do so (see the sketch after this procedure). Use syntax similar to the following:
    aws s3 cp s3://<self storage bucket>/<self_storage_folder(s)>/<index_name> $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/ --recursive

    Make sure you copy all the contents of the archived Splunk bucket because they are needed to restore the data. For example, copy starting at the following level: db_timestamp_timestamp_bucketID. Do not copy the data at the level of raw data (.gz files). The buckets display in the thaweddb directory of your Splunk Enterprise instance.

  5. Restore the indexes by running the following command:
    $SPLUNK_HOME/bin/splunk rebuild $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/<bucket_folder> <index_name>

    When the index is successfully restored, a success message displays and additional bucket files are added to the thawed directory, including tsidx files.
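
If you have many buckets to restore, a small shell script can loop over them. The following is a minimal sketch under stated assumptions: the bucket and folder names are placeholders, the index name reuses this topic's dynamic_data_sample example, and $SPLUNK_HOME is set on the machine running the commands.

    #!/bin/bash
    # Minimal sketch: download the archived buckets for one index,
    # then rebuild each bucket in the thawed directory one at a time.
    INDEX_NAME=dynamic_data_sample   # placeholder index name
    THAWED="$SPLUNK_HOME/var/lib/splunk/$INDEX_NAME/thaweddb"

    # Download the archived buckets from the self storage location.
    aws s3 cp "s3://<self storage bucket>/<folder>/$INDEX_NAME" "$THAWED" --recursive

    # Rebuild each downloaded bucket (directories named db_timestamp_timestamp_bucketID).
    for bucket_dir in "$THAWED"/db_*; do
        "$SPLUNK_HOME/bin/splunk" rebuild "$bucket_dir" "$INDEX_NAME"
    done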

After the data is restored, go to the Search & Reporting app, and search on the restored index as you would any other Splunk index.

When you restore data to the thawed directory on Splunk Enterprise, it does not count against the indexing license volume for the Splunk Enterprise or Splunk Cloud deployment.

Troubleshoot Dynamic Data Self Storage

When you configure self storage, you might encounter the following issues.

I don't know the region of my Splunk Cloud environment

I received the following error when testing my self storage location: Your S3 bucket must be in the same region as your Splunk Cloud environment <AWS Region>.

Diagnosis

Splunk Cloud detected that you created your S3 bucket in a different region than your Splunk Cloud environment.

Solution

If you are unsure of the region of your Splunk Cloud environment, review the error message. The <AWS Region> portion of the error message displays the correct region to create your S3 bucket. After you determine the region, repeat the steps to create the self storage location.
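
You can also confirm the region of an existing bucket from the AWS CLI. The bucket name below is a placeholder:

    # Returns the bucket's region; a null LocationConstraint means us-east-1.
    aws s3api get-bucket-location --bucket <splunk-cloud-prefix>-expired-data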

I received an error when testing the self storage location

When I attempted to create a new self storage location, the following error occurred when I clicked Test bucket policy: Unable to verify the region of your S3 bucket, unable to get bucket_region to verify. An error occurred (403) when calling the HeadBucket operation: Forbidden. Contact Splunk Support.

Diagnosis

You might get an error for the following reasons:

  • You modified the permissions on the bucket policy.
  • You pasted the bucket policy into the incorrect Amazon S3 bucket.
  • You did not paste the bucket policy to the Amazon S3 bucket, or you did not save the changes.
  • An error occurred during provisioning.

Solution

  1. Ensure that you did not modify the S3 bucket permissions. The following actions must be allowed: s3:PutObject, s3:GetObject, s3:ListBucket, s3:ListBucketVersions, s3:GetBucketLocation.
  2. Verify that you applied the bucket policy to the correct S3 bucket, and that you saved your changes. (The sketch after these steps shows one way to review the applied policy.)
  3. If you created the S3 bucket in the correct region, the permissions are correct and you applied and saved the bucket policy to the correct S3 bucket, contact Splunk Support to further troubleshoot the issue.
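
To review the policy currently attached to the bucket, you can retrieve it with the AWS CLI and compare it to the policy that Splunk Cloud generated. The bucket name is a placeholder:

    # Print the bucket policy as raw JSON and check that the required
    # actions (s3:PutObject, s3:GetObject, s3:ListBucket,
    # s3:ListBucketVersions, s3:GetBucketLocation) are still allowed.
    aws s3api get-bucket-policy \
        --bucket <splunk-cloud-prefix>-expired-data \
        --query Policy --output text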

To review the steps to create the S3 bucket, see "Create an Amazon S3 bucket in your AWS environment" in this topic.

To review how to apply a bucket policy, see the Amazon AWS S3 documentation and search for "how do I add an S3 bucket policy?".


This documentation applies to the following versions of Splunk Cloud: 7.2.3, 7.2.4, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 8.0.0

