Configure inputs for the Splunk Add-on for AWS
Configure SQS-based S3 inputs to collect the following events:
- CloudFront Access Logs
- Config
- ELB Access Logs
- CloudTrail
- S3 Access Logs
- Custom data types
Before you configure SQS-based S3 inputs, perform the following tasks:
- Configure S3 to send notifications to SQS. This lets S3 notify the add-on that new events were written to the S3 bucket.
- Subscribe to the corresponding SNS Topic.
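As a rough illustration of the first prerequisite, the Python sketch below builds the notification document that S3 accepts when a bucket publishes object-created events directly to an SQS queue. The helper name `build_s3_to_sqs_notification`, the queue ARN, the bucket name, and the key prefix are all hypothetical. The sketch only constructs the document; applying it requires an AWS SDK call such as the boto3 call shown in the trailing comment.

```python
import json


def build_s3_to_sqs_notification(queue_arn, prefix=""):
    """Return an S3 notification configuration that targets one SQS queue."""
    queue_config = {
        "QueueArn": queue_arn,
        # Only newly created objects are relevant to the SQS-based S3 input.
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optional: restrict notifications to keys under a given prefix.
        queue_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"QueueConfigurations": [queue_config]}


# Hypothetical ARN and prefix, for illustration only.
notification = build_s3_to_sqs_notification(
    "arn:aws:sqs:us-east-1:123456789012:example-queue", prefix="AWSLogs/"
)
print(json.dumps(notification, indent=2))

# With boto3 installed and credentials configured, the document could be applied with:
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="example-bucket", NotificationConfiguration=notification)
```

The queue's access policy must also allow S3 to send messages to it, which this sketch does not cover.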
Keep the following in mind as you configure your inputs:
- The SQS-based S3 input only collects AWS service logs that meet the following criteria:
  - Near-real-time
  - Newly created
  - Stored in S3 buckets
  - Have event notifications sent to SQS
Events that occurred in the past, or events with no notifications sent through SNS to SQS, end up in the Dead Letter Queue (DLQ), and the Splunk Add-on for AWS creates no corresponding event. To collect historical logs stored in S3 buckets, use the generic S3 input instead. The S3 input lets you set the initial scan time parameter to collect data generated after a specified time in the past.
- To collect the same types of logs from multiple S3 buckets, even across regions, set up one input to collect data from all the buckets. To do this, configure these buckets to send notifications to the same SQS queue that the SQS-based S3 input polls for messages.
- To achieve high throughput data ingestion from an S3 bucket, configure multiple SQS-based S3 inputs for the S3 bucket to scale out data collection.
- After configuring an SQS-based S3 input, you might need to wait a few minutes before new events are ingested and can be searched. Also, a more verbose logging level increases data ingestion time. Debug mode is extremely verbose and is not recommended on production systems.
- The SQS-based input allows you to ingest data from S3 buckets by optimizing the API calls made by the add-on and relying on SQS/SNS to collect events upon receipt of notification.
- The SQS-based S3 input is stateless, which means that when multiple inputs are collecting data from the same bucket, if one input goes down, the other inputs continue to collect data and take over the load from the failed input. This lets you enhance fault tolerance by configuring multiple inputs to collect data from the same bucket.
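The scale-out and fault-tolerance notes above amount to defining several inputs that poll the same SQS queue. The Python sketch below renders such stanzas for inputs.conf, using parameter names from the table later in this topic. The stanza prefix `aws_sqs_based_s3://`, the input names, the account name, and the queue URL are assumptions for illustration; verify them against the add-on's own inputs.conf specification before use.

```python
import configparser
import io


def render_inputs(queue_url, region, account, count):
    """Render `count` SQS-based S3 input stanzas that share one SQS queue."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    for n in range(1, count + 1):
        # Assumed stanza prefix and hypothetical input name.
        stanza = f"aws_sqs_based_s3://example_input_{n}"
        parser[stanza] = {
            "aws_account": account,
            "sqs_queue_region": region,
            "sqs_queue_url": queue_url,
            "sqs_batch_size": "10",
            "index": "main",
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical values, for illustration only.
rendered = render_inputs(
    "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",
    "us-east-1",
    "example_account",
    3,
)
print(rendered)
```

Because every stanza points at the same queue, any one input can stop without blocking the others, which is the behavior the stateless design is described as providing.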
Configure an SQS-based S3 input using Splunk Web
To configure inputs in Splunk Web, click Splunk Add-on for AWS in the navigation bar on Splunk Web home, then choose one of the following menu paths depending on which data type you want to collect:
- Create New Input > CloudTrail > SQS-based S3
- Create New Input > CloudFront Access Log > SQS-based S3
- Create New Input > Config > SQS-based S3
- Create New Input > ELB Access Logs > SQS-based S3
- Create New Input > S3 Access Logs > SQS-based S3
- Create New Input > VPC Flow Logs > SQS-based S3
- Create New Input > Custom Data Type > SQS-based S3
- Create New Input > Custom Data Type > SQS-based S3 > Delimited Files S3 File Decoder
You must have the admin_all_objects role enabled in order to add new inputs.
Choose the menu path that corresponds to the data type you want to collect. The system automatically sets the source type and displays the relevant field settings on the subsequent configuration page.
Use the following table to complete the fields for the new input in the .conf file or in Splunk Web:
Argument in configuration file | Field in Splunk Web | Description |
---|---|---|
aws_account | AWS Account | The AWS account or EC2 IAM role the Splunk platform uses to access the keys in your S3 buckets. In Splunk Web, select an account from the drop-down list. In inputs.conf, enter the friendly name of one of the AWS accounts that you configured on the Configuration page or the name of the automatically discovered EC2 IAM role. If the region of the AWS account you select is GovCloud, you might encounter errors such as "Failed to load options for S3 Bucket". In that case, manually add the AWS GovCloud endpoint in the S3 Host Name field. See http://docs.aws.amazon.com/govcloud-us/latest/UserGuide/using-govcloud-endpoints.html for more information. |
aws_iam_role | Assume Role | The IAM role to assume. See Manage accounts for the Splunk Add-on for AWS. |
using_dlq | Force using DLQ (Recommended) | Clear the checkbox to remove the check for a Dead Letter Queue (DLQ) for ingestion of specific data. In inputs.conf, enter 0 or 1 to respectively disable or enable the check. The default is 1. |
sqs_queue_region | AWS Region | The AWS region that the SQS queue is in. |
private_endpoint_enabled | Use Private Endpoints | Select the checkbox to use private endpoints of AWS Security Token Service (STS) and Amazon Simple Storage Service (S3) for authentication and data collection. In inputs.conf, enter 0 or 1 to respectively disable or enable the use of private endpoints. |
sqs_private_endpoint_url | Private Endpoint URL (SQS) | The private endpoint (interface VPC endpoint) of your SQS service, which can be configured from your AWS console. |
sqs_sns_validation | SNS Signature Validation | SNS validation of your SQS messages, which can be configured from your AWS console. If selected, all messages are validated. If unselected, messages are not validated until a signed message is received. Thereafter, all messages are validated for an SNS signature. For new SQS-based S3 inputs, this feature is enabled by default. |
s3_private_endpoint_url | Private Endpoint URL (S3) | The private endpoint (interface VPC endpoint) of your S3 service, which can be configured from your AWS console. |
sts_private_endpoint_url | Private Endpoint URL (STS) | The private endpoint (interface VPC endpoint) of your STS service, which can be configured from your AWS console. |
sqs_queue_url | SQS Queue Name | The SQS queue URL. |
sqs_batch_size | SQS Batch Size | The maximum number of messages to pull from the SQS queue in one batch. Enter an integer between 1 and 10 inclusive. Set a larger value for small files and a smaller value for large files. The default SQS batch size is 10. If you are dealing with large files and your system memory is limited, set this to a smaller value. |
s3_file_decoder | S3 File Decoder | The decoder to use to parse the corresponding log files. The decoder is set according to the Data Type you select. If you select a Custom Data Type, choose one from CloudTrail, Config, ELB Access Logs, S3 Access Logs, or CloudFront Access Logs. |
sourcetype | Source Type | The source type for the events to collect, automatically filled in based on the decoder chosen for the input. This add-on does not support custom sourcetypes for |
interval | Interval | The length of time in seconds between two data collection runs. The default is 300 seconds. |
index | Index | The index name where the Splunk platform puts the SQS-based S3 data. The default is main. |
polling_interval | Polling Interval | The number of seconds to wait before the Splunk platform runs the command again. The default is 1,800 seconds. |
parse_csv_with_header | Parse all files as CSV | If selected, all files are parsed as delimited files, with the first line of each file treated as the header. Clear this checkbox for delimited files without a header. For new SQS-based S3 inputs, this feature is disabled by default. Supported Formats: |
parse_csv_with_delimiter | CSV field delimiter | The delimiter must be one character. The character cannot be alphanumeric, a single quote, or a double quote. For tab-delimited files, use \t. The default delimiter is a comma. |
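Two rows in the table state rules that are easy to check before writing inputs.conf: the allowed range for `sqs_batch_size` and the constraints on `parse_csv_with_delimiter`. The Python sketch below encodes those rules as stated in the table. The function names are hypothetical, and this is not the add-on's own validation logic, which may differ in detail.

```python
def validate_sqs_batch_size(value):
    """Return the batch size as an int if it is between 1 and 10 inclusive."""
    size = int(value)
    if not 1 <= size <= 10:
        raise ValueError("sqs_batch_size must be between 1 and 10 inclusive")
    return size


def validate_delimiter(value):
    """Return the delimiter if it is one non-alphanumeric, non-quote character."""
    if value == r"\t":
        # The table writes the tab delimiter as the two-character escape \t.
        return "\t"
    if len(value) != 1:
        raise ValueError("delimiter must be exactly one character")
    if value.isalnum() or value in ("'", '"'):
        raise ValueError("delimiter cannot be alphanumeric or a quote character")
    return value


# Example checks with hypothetical settings.
print(validate_sqs_batch_size("5"))
print(repr(validate_delimiter(";")))
```

Running such checks in a deployment script catches an out-of-range batch size or an invalid delimiter before the input is created.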
This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10