Configure Incremental S3 inputs for the Splunk Add-on for AWS

Complete the following steps to configure Incremental S3 inputs for the Splunk Add-on for Amazon Web Services (AWS):

You must manage accounts for the add-on as a prerequisite. See Manage accounts for the Splunk Add-on for AWS.
Configure AWS services for the Incremental S3 input.
Configure AWS permissions for the Incremental S3 input.
(Optional) Configure VPC Interface Endpoints for STS and S3 services from your AWS Console if you want to use private endpoints for data collection and authentication. For more information, see the Interface VPC endpoints (AWS PrivateLink) topic in the Amazon Virtual Private Cloud documentation.
Configure Incremental S3 inputs either through Splunk Web or configuration files.

In version 4.3.0 and higher, the Splunk Add-on for AWS provides the Simple Queue Service (SQS)-based S3 input, a more scalable and higher-performing alternative to the generic S3 and incremental S3 input types for collecting log files from S3 buckets. For new inputs that collect predefined or custom data types, consider using the SQS-based S3 input instead.

The incremental S3 input lists and retrieves only the objects in a bucket that have not yet been ingested. It does this by comparing the datetime information included in file names against the checkpoint record, which significantly improves ingestion performance.
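
The comparison can be pictured with a short Python sketch. This is an illustration only, not the add-on's implementation: the function names, the regular expression, and the sample keys are assumptions, and the timestamp layout follows the S3 access log naming described later in this topic.

```python
import re
from datetime import datetime

# Assumed pattern for the YYYY-mm-DD-HH-MM-SS timestamp embedded in a key.
_TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})")


def key_datetime(key):
    """Return the datetime embedded in an object key, or None if absent."""
    match = _TIMESTAMP.search(key)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d-%H-%M-%S")


def keys_to_ingest(keys, checkpoint):
    """Keep only keys whose embedded datetime is later than the checkpoint."""
    selected = []
    for key in keys:
        stamp = key_datetime(key)
        if stamp is not None and stamp > checkpoint:
            selected.append(key)
    # Oldest first, so a checkpoint could be advanced in order.
    return sorted(selected, key=key_datetime)


# Hypothetical checkpoint and object keys.
checkpoint = datetime(2024, 1, 15, 10, 0, 0)
keys = [
    "logs/2024-01-15-09-45-10-AAAA1111",
    "logs/2024-01-15-10-30-45-BBBB2222",
    "logs/2024-01-15-11-05-00-CCCC3333",
]
new_keys = keys_to_ingest(keys, checkpoint)
print(new_keys)
```

Because the date is read from the key name, objects older than the checkpoint are skipped without being downloaded.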

Configure AWS services for the Incremental S3 input

To collect access logs, configure logging in the AWS console to collect the logs in a dedicated S3 bucket. See the AWS documentation for more information on how to configure access logs:

For S3 access logs, see http://docs.aws.amazon.com/AmazonS3/latest/dev/ServerLogs.html.
For ELB access logs, see http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/enable-access-logs.html.
For CloudFront access logs, see http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/AccessLogs.html.

See http://docs.aws.amazon.com/gettingstarted/latest/swh/getting-started-create-bucket.html for more information about how to configure S3 buckets and objects.

Configure S3 permissions

Required permissions for S3 buckets and objects:

ListBucket
GetObject
ListAllMyBuckets
GetBucketLocation

Required permissions for KMS:

Decrypt

In the Resource section of the policy, specify the Amazon Resource Names (ARNs) of the S3 buckets from which you want to collect S3 Access Logs, CloudFront Access Logs, ELB Access Logs, or generic S3 log data.

See the following sample inline policy to configure S3 input permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation",
        "kms:Decrypt"
      ],
      "Resource": "*"
    }
  ]
}

For more information and sample policies, see http://docs.aws.amazon.com/AmazonS3/latest/dev/using-iam-policies.html.
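
The sample policy grants access to every resource. If you prefer to list specific bucket ARNs in the Resource section, a policy along the following lines could be generated with a small Python helper. This is a sketch under assumptions: the function name, bucket name, and optional KMS key ARN are hypothetical, and `s3:ListAllMyBuckets` is left on `*` because it is not scoped to a single bucket. Review the result against your own security requirements before you use it.

```python
import json


def build_policy(bucket_names, kms_key_arn=None):
    """Build an IAM policy document limited to the given buckets."""
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"{arn}/*" for arn in bucket_arns]
    statements = [
        # Bucket-level actions apply to the bucket ARN itself.
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": bucket_arns,
        },
        # Object-level actions apply to the keys inside the bucket.
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": object_arns,
        },
        # Listing all buckets is an account-wide action.
        {
            "Effect": "Allow",
            "Action": ["s3:ListAllMyBuckets"],
            "Resource": "*",
        },
    ]
    if kms_key_arn:
        # Only needed when the objects are encrypted with a KMS key.
        statements.append(
            {"Effect": "Allow", "Action": ["kms:Decrypt"], "Resource": kms_key_arn}
        )
    return {"Version": "2012-10-17", "Statement": statements}


# Hypothetical bucket name.
policy = build_policy(["example-access-logs"])
print(json.dumps(policy, indent=2))
```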

Configure an Incremental S3 input using Splunk Web

To configure inputs in Splunk Web, click Splunk Add-on for AWS in the navigation bar on Splunk Web home, then choose one of the following menu paths depending on which data type you want to collect:

Create New Input > CloudTrail > Incremental S3
Create New Input > CloudFront Access Log > Incremental S3
Create New Input > ELB Access Logs > Incremental S3
Create New Input > S3 Access Logs > Incremental S3

Make sure you choose the right menu path corresponding to the data type you want to collect. The system automatically sets the appropriate source type and may display slightly different field settings in the subsequent configuration page based on the menu path.

Use the following table to complete the fields for the new input in the .conf file or in Splunk Web:

Argument in configuration file	Field in Splunk Web	Description
`aws_account`	AWS Account	The AWS account or EC2 IAM role the Splunk platform uses to access the keys in your S3 buckets. In Splunk Web, select an account from the drop-down list. In inputs.conf, enter the friendly name of one of the AWS accounts that you configured on the Configuration page or the name of the automatically discovered EC2 IAM role. If the region of the AWS account you select is GovCloud, you may encounter errors, such as "Failed to load options for S3 Bucket". In that case, manually add the AWS GovCloud endpoint in the S3 Host Name field. See http://docs.aws.amazon.com/govcloud-us/latest/UserGuide/using-govcloud-endpoints.html for more information.
`aws_iam_role`	Assume Role	The IAM role to assume. See Manage accounts for the Splunk Add-on for AWS.
`aws_s3_region`	AWS Region (Optional)	The AWS region that contains your bucket. In inputs.conf, enter the region ID. Provide an AWS Region only if you want to use specific regional endpoints instead of public endpoints for data collection. See the AWS service endpoints topic in the AWS General Reference manual for more information.
`bucket_name`	S3 Bucket	The AWS bucket name.
`log_file_prefix`	Log File Prefix	The prefix of the log file, which, along with other path elements, forms the URL under which the Splunk Add-on for AWS searches for log files. The location of the log files differs for each incremental S3 log type. cloudtrail: The add-on searches for CloudTrail logs under `<bucket_name>/<log_file_prefix>/AWSLogs/<Account ID>/CloudTrail/<Region ID>/<YYYY/MM/DD>/<file_name>.json.gz`. elb: The add-on searches for ELB access logs under `<bucket_name>/<log_file_prefix>/AWSLogs/<Account ID>/elasticloadbalancing/<Region ID>/<YYYY/MM/DD>/<file_name>.log.gz`. s3: The add-on searches for S3 access logs under `<bucket_name>/<log_file_prefix><YYYY-mm-DD-HH-MM-SS><UniqueString>`. cloudfront: The add-on searches for CloudFront access logs under `<bucket_name>/<log_file_prefix><distributionID><YYYY/MM/DD>.<UniqueID>.gz`. To ingest logs from different prefixed locations in the bucket under one AWS account, configure multiple AWS data inputs, one for each prefix name. Alternatively, you can configure one data input but use different AWS accounts to ingest logs from different prefixed locations in the bucket.
`log_type`	Log Type	The type of logs to ingest. Available log types are `cloudtrail`, `elb:accesslogs`, `cloudfront:accesslogs`, and `s3:accesslogs`. This value is automatically set based on the menu path you chose to access this configuration page.
`log_start_date`	Log Start Date	The start date of the log.
`distribution_id`	Distribution ID	CloudFront distribution ID. This field is displayed only when you access the input configuration page through the Create New Input > CloudFront Access Log > Incremental S3 menu path.
`log_path_format`	Log Path Format	The CloudTrail log path format. This field is displayed when you access the input configuration page through the Create New Input > CloudTrail > Incremental S3 menu path. The default is Account Level. Account Level: `prefix/AWSLogs/<account_id>/CloudTrail/`. Organization Level: `prefix/AWSLogs/<org_id>/<account_id>/CloudTrail/`.
`sourcetype`	Source Type	Source type for the events. This value is automatically set for the type of logs you want to collect based on the menu path you chose to access this configuration page.
`index`	Index	The index name where the Splunk platform puts the S3 data. The default is main.
`interval`	Interval	The number of seconds to wait before splunkd checks the health of the modular input so that it can trigger a restart if the input crashes. The default is 30 seconds.
`private_endpoint_enabled`	Use Private Endpoints	Select the checkbox to use private endpoints of the AWS Security Token Service (STS) and Amazon Simple Storage Service (S3) for authentication and data collection. In inputs.conf, enter `0` to disable or `1` to enable the use of private endpoints.
`s3_private_endpoint_url`	Private Endpoint (S3)	The private endpoint (Interface VPC Endpoint) of your S3 service, which you can configure from your AWS console. Supported formats: `<http/https>://bucket.vpce-<endpoint_id>-<unique_id>.s3.<region_id>.vpce.amazonaws.com` and `<http/https>://bucket.vpce-<endpoint_id>-<unique_id>-<availability_zone>.s3.<region_id>.vpce.amazonaws.com`.
`sts_private_endpoint_url`	Private Endpoint (STS)	The private endpoint (Interface VPC Endpoint) of your STS service, which you can configure from your AWS console. Supported formats: `<http/https>://vpce-<endpoint_id>-<unique_id>.sts.<region_id>.vpce.amazonaws.com` and `<http/https>://vpce-<endpoint_id>-<unique_id>-<availability_zone>.sts.<region_id>.vpce.amazonaws.com`.
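
To see how `log_file_prefix` and `log_path_format` combine with the other path elements, the following Python sketch composes the date-based key prefix described in the table for CloudTrail and ELB logs. The function name, account ID, organization ID, and regions are hypothetical values chosen for illustration. The add-on builds these paths internally, so treat this as a way to check where your log files should be located rather than as the add-on's own code.

```python
from datetime import date


def dated_log_prefix(log_file_prefix, account_id, service, region, day, org_id=None):
    """Compose the key prefix for one day of CloudTrail or ELB logs."""
    parts = [log_file_prefix.strip("/"), "AWSLogs"]
    if org_id:
        # Organization Level path format (CloudTrail only, per the table).
        parts.append(org_id)
    parts += [account_id, service, region, day.strftime("%Y/%m/%d")]
    # Skip the empty first element when no log_file_prefix is configured.
    return "/".join(part for part in parts if part) + "/"


day = date(2024, 1, 15)
account_level = dated_log_prefix("audit", "123456789012", "CloudTrail", "us-east-1", day)
org_level = dated_log_prefix(
    "audit", "123456789012", "CloudTrail", "us-east-1", day, org_id="o-exampleorgid"
)
elb = dated_log_prefix("", "123456789012", "elasticloadbalancing", "us-west-2", day)

print(account_level)
print(org_level)
print(elb)
```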

Configure an Incremental S3 input using a configuration file

When you configure inputs manually in inputs.conf, create a stanza using the following template and add it to $SPLUNK_HOME/etc/apps/Splunk_TA_aws/local/inputs.conf. If the file or path does not exist, create it.

[splunk_ta_aws_logs://<name>]
log_type =
aws_account =
aws_s3_region = <value>
host_name =
bucket_name =
bucket_region =
log_file_prefix =
log_start_date =
log_name_format =
log_path_format =
aws_iam_role = AWS IAM role to be assumed.
max_retries = @integer:[-1, 1000]. default is -1. -1 means retry until success.
max_fails = @integer:[0, 10000]. default is 10000. Stop discovering new keys if the number of failed files exceeds max_fails.
max_number_of_process = @integer:[1, 64]. default is 2.
max_number_of_thread = @integer:[1, 64]. default is 4.
private_endpoint_enabled = <value>
s3_private_endpoint_url = <value>
sts_private_endpoint_url = <value>
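
As an illustration of a filled-in stanza, the following Python sketch renders one with the standard `configparser` module. The input name, account name, bucket name, and prefix are hypothetical. The `index` and `interval` values are the defaults listed in the table, and settings left out of the dictionary are simply omitted from the output. Treat it as a starting point and adjust the values to your environment.

```python
import configparser
import io


def render_stanza(name, settings):
    """Render an inputs.conf stanza for an Incremental S3 input as text."""
    # Use "=" as the only delimiter so values such as s3:accesslogs stay intact.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser[f"splunk_ta_aws_logs://{name}"] = settings
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical values for an S3 access log input.
stanza = render_stanza(
    "example_s3_access_logs",
    {
        "log_type": "s3:accesslogs",
        "aws_account": "my-aws-account",
        "bucket_name": "example-access-logs",
        "log_file_prefix": "logs/",
        "index": "main",
        "interval": "30",
        "private_endpoint_enabled": "0",
    },
)
print(stanza)
```

You can paste the printed text into $SPLUNK_HOME/etc/apps/Splunk_TA_aws/local/inputs.conf.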
