Create a DSP connection to send data to Amazon S3

To send data from a data pipeline in Splunk Data Stream Processor (DSP) to an Amazon S3 bucket, you must first create a connection using the Write Connector for Amazon S3. You can then use the connection in the Send to Amazon S3 sink function to send data from your DSP pipeline to your Amazon S3 bucket.

The Write Connector for Amazon S3 can't be used to get data from Amazon S3 into a pipeline. If you want to collect data from Amazon S3, you must use the Amazon S3 connector. See Create a DSP connection to get data from Amazon S3 for more information.

Prerequisites

Before you can create the Amazon S3 connection, you must have the following:

An Identity and Access Management (IAM) user with at least read and write permissions for the destination bucket. Permissions for decrypting KMS-encrypted files might also be required. See the IAM user permissions section on this page for more information.
The access key ID and secret access key for that IAM user.

If you don't have an IAM user with the necessary permissions, ask your Amazon Web Services (AWS) administrator for assistance.

IAM user permissions

Make sure your IAM user has at least read and write permissions for the destination bucket. The following example IAM policy lists the required permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "*"
    }
  ]
}

If you plan to encrypt your files using the SSE-KMS algorithm with a custom Customer Master Key (CMK), then your IAM user must also have the following permissions:

kms:Decrypt
kms:GenerateDataKey, if the IAM user is not in the same AWS account as the AWS KMS key.

Additionally, the key policy must include the kms:Decrypt permission.
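
As a rough illustration of the KMS permissions listed above, the following Python sketch builds an IAM policy statement for them. The function name kms_statement, the cross_account flag, and the key ARN are hypothetical placeholders for this example, and the sketch covers only the IAM user side, not the key policy.

```python
import json


def kms_statement(key_arn: str, cross_account: bool) -> dict:
    """Build an IAM statement for the KMS permissions listed above."""
    actions = ["kms:Decrypt"]
    if cross_account:
        # Per the list above, kms:GenerateDataKey applies when the IAM user
        # is not in the same AWS account as the KMS key.
        actions.append("kms:GenerateDataKey")
    return {"Effect": "Allow", "Action": actions, "Resource": key_arn}


# Placeholder key ARN (assumption); replace it with the ARN of your own key.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"

print(json.dumps(kms_statement(EXAMPLE_KEY_ARN, cross_account=True), indent=2))
```

The resulting statement would be added to the Statement array of the IAM user's policy alongside the Amazon S3 permissions.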

As a best practice, specify the names of your destination bucket and folder in the Resource element so that these permissions are not applied unnecessarily to other Amazon S3 buckets or to other folders inside your bucket. For example, the following Resource definition ensures that your specified permissions are applied only to the bucket named MyBucket and the folder inside it named MyFolder:

      "Resource": [
        "arn:aws:s3:::MyBucket",
        "arn:aws:s3:::MyBucket/MyFolder/*"
      ]
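
To show how the scoped Resource definition fits into the complete policy, here is a minimal Python sketch that assembles the policy document for a given bucket and folder. The function name scoped_policy is a hypothetical helper for this example; the action list and the Resource format are taken from the policy and Resource definition on this page.

```python
import json

# Actions copied from the example policy earlier on this page.
S3_ACTIONS = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject",
    "s3:AbortMultipartUpload",
    "s3:ListMultipartUploadParts",
    "s3:ListBucketMultipartUploads",
    "s3:ListBucket",
    "s3:GetBucketLocation",
]


def scoped_policy(bucket: str, folder: str) -> dict:
    """Return the example policy limited to one bucket and one folder in it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": S3_ACTIONS,
                "Resource": [
                    f"arn:aws:s3:::{bucket}",             # the bucket itself
                    f"arn:aws:s3:::{bucket}/{folder}/*",  # objects under the folder
                ],
            }
        ],
    }


print(json.dumps(scoped_policy("MyBucket", "MyFolder"), indent=2))
```

The printed JSON can be reviewed with your AWS administrator before it is attached to the IAM user.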

To route data to specific subfolders in MyFolder, set the prefix parameter in the Send to Amazon S3 sink function to your desired subfolder path starting from MyFolder. For example, to route data to subfolders that are named with the year and month, set prefix to MyFolder/#{datetime:yyyy-MM}. See Send data to Amazon S3 in the Function Reference manual for more information about configuring the sink function.
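
The following Python sketch illustrates what a prefix such as MyFolder/#{datetime:yyyy-MM} might resolve to, and checks that an object written under it still falls inside the scoped Resource definition. It is not DSP's implementation: the sink function performs the substitution itself, and the token mapping, the function names, the time zone handling, and the object name events.json are all assumptions made for illustration.

```python
import re
from datetime import datetime
from fnmatch import fnmatchcase

# Hypothetical mapping from date pattern tokens to strftime directives.
# It covers only the tokens needed for this sketch.
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def expand_prefix(prefix: str, when: datetime) -> str:
    """Replace each #{datetime:...} placeholder with a formatted date."""

    def render(match: re.Match) -> str:
        pattern = match.group(1)
        for token, directive in _TOKENS.items():
            pattern = pattern.replace(token, directive)
        return when.strftime(pattern)

    return re.sub(r"#\{datetime:([^}]*)\}", render, prefix)


def within_resource(bucket: str, key: str, resource_arn: str) -> bool:
    """Approximate the wildcard match of an object ARN against a Resource entry."""
    return fnmatchcase(f"arn:aws:s3:::{bucket}/{key}", resource_arn)


prefix = expand_prefix("MyFolder/#{datetime:yyyy-MM}", datetime(2021, 3, 15))
print(prefix)  # MyFolder/2021-03

# events.json is a made-up object name for this example.
print(within_resource("MyBucket", f"{prefix}/events.json",
                      "arn:aws:s3:::MyBucket/MyFolder/*"))  # True
```

Because the prefix starts with MyFolder, objects written to the dated subfolders remain covered by the arn:aws:s3:::MyBucket/MyFolder/* entry in the Resource element.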

Steps

From the Data Stream Processor home page, click Data Management and then select the Connections tab.
Click Create new connection.
Select Write Connector for Amazon S3 and then click Next.

Complete the following fields:

Field	Description
Connection Name	A unique name for your connection.
Description (Optional)	A description of your connection.
AWS Access Key ID	The access key ID for your IAM user.
AWS Secret Access Key	The secret access key for your IAM user.

Any credentials that you upload are transmitted securely over HTTPS, encrypted, and stored in a secrets manager.

Click Save.
If you're editing a connection that's being used by an active pipeline, you must reactivate that pipeline after making your changes. When you reactivate a pipeline, you must select where you want to resume data ingestion. See Using activation checkpoints to activate your pipeline in the Use the Data Stream Processor manual for more information.

You can now use your connection in a Send to Amazon S3 sink function at the end of your data pipeline to send data to Amazon S3. For instructions on how to build a data pipeline, see the Building a pipeline chapter in the Use the Data Stream Processor manual. For information about the sink function, see Send data to Amazon S3 in the Function Reference manual.

If you're planning to send data to Amazon S3 in Parquet format, make sure to include pipeline functions that extract relevant data from union-typed fields into explicitly typed top-level fields. See Formatting DSP data for Parquet files in Amazon S3.
