Add an S3 input for the Splunk App for AWS
Create an S3 input to gather generic log data from any S3 bucket in your environment. Because the nature of the data you can collect varies widely, this input is not tagged for CIM compliance, and its data does not appear in any of the app dashboards. You can configure custom dashboards to fit the nature of the data that you collect.
Before you begin configuring your S3 inputs, note the following expected behaviors.
1. You cannot edit an S3 input after you create it. If you need to adjust an S3 input, delete it and recreate it.
2. The S3 data input is not intended to read frequently modified files. If a file is modified after it has been indexed, the Splunk platform will index the file again, resulting in duplicated data.
3. The S3 data input processes compressed files according to their suffixes. Use these suffixes only if the file is in the corresponding format, or data processing errors will occur. The data input supports the following compression types:
- single file in zip, gzip, tar, or tar.gz formats
- multiple files with or without folders in zip, tar, or tar.gz format
Note: Expanding compressed files requires significant operating system resources.
4. The Splunk platform auto-detects the character set used in your files among these options: UTF-8 with/without BOM, UTF-16LE/BE with BOM, UTF-32BE/LE with BOM. Other character sets are not supported.
5. You can configure multiple S3 inputs for a single S3 bucket to improve performance. The Splunk platform dedicates one process for each data input, so provided that your system has sufficient processing power, performance will improve with multiple inputs.
Note: Be sure that multiple inputs do not collect the same S3 folder and file data, to prevent indexing duplicate data.
6. As a best practice, archive your S3 bucket contents when you no longer need to actively collect them. AWS charges for list key API calls that the input uses to scan your buckets for new and changed files, so you can reduce costs and improve performance by archiving older S3 keys to another bucket or storage type.
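Two of the behaviors above can be checked on sample data before you create any inputs. The following Python sketch is illustrative only and is not part of the app. The function names guess_charset and overlapping_prefixes are hypothetical. The first function approximates the character set rule in item 4 by looking for a byte order mark. The second flags pairs of folder or file names that could make multiple inputs collect the same data, as item 5 warns against. It assumes that names behave like plain S3 key prefixes, so it can report pairs that the app would treat as distinct.

```python
import codecs
from itertools import combinations

# BOMs for the character sets listed in item 4. UTF-32 is checked first
# because the UTF-32LE BOM (FF FE 00 00) starts with the UTF-16LE BOM (FF FE).
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def guess_charset(data: bytes):
    """Return a codec name for supported content, or None if unsupported.

    Hypothetical helper: pass the full file content, because a truncated
    sample can split a multi-byte UTF-8 character and fail to decode.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No BOM: the only supported option left is UTF-8 without a BOM.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "utf-8"


def overlapping_prefixes(prefixes):
    """Return pairs of names where one is a key prefix of the other.

    Assumption: "/" means the whole bucket, and every other name is
    compared as a plain S3 key prefix after removing leading slashes.
    """
    pairs = []
    for a, b in combinations(prefixes, 2):
        na, nb = a.lstrip("/"), b.lstrip("/")
        if na.startswith(nb) or nb.startswith(na):
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    print(guess_charset("example".encode("utf-8")))
    print(overlapping_prefixes(["logs/", "logs/2016/", "billing/"]))
```

If overlapping_prefixes returns any pairs, review those inputs before you create them, because you cannot edit an input afterward.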
Prerequisites
Before you can successfully configure an S3 input, you need to:
1. Create S3 buckets, folders, and files containing data that you want to collect with the Splunk App for AWS. If you have not already done this, see Configure your AWS services for the Splunk App for AWS in this manual.
2. Make sure that the account friendly name you use to configure this input corresponds to an AWS Account Access Key ID or EC2 IAM role that has the necessary permissions to gather this data. If you have not already done this, see Configure your AWS permissions for the Splunk App for AWS in this manual.
Add a new S3 input
1. In the app, click Configure in the app navigation bar.
2. Under Data Sources, in the S3 box, click New Input.
3. Select the friendly name of the AWS Account that you want to use to collect S3 data. If you have not yet configured the account you need, click Add New Account to configure one now.
4. Under S3 Bucket, select an S3 bucket from which you want to collect data.
5. Under Folder/File name, select either /, which collects all folders and files in the bucket, or a specific folder or file to index. If your S3 bucket has too many folders and files to list in the drop-down, the screen prompts you to filter using a partial name to help you find the folder or file you want.
6. Enter a Source type for the input. Select from the predefined source types or define a custom source type. One common use case is to configure the source type aws:cloudtrail to collect your CloudTrail logs directly from an S3 bucket, rather than through the CloudTrail input.
7. (Recommended) Configure a custom Index for this data.
8. (Optional) Open the Advanced Settings section to configure a collection interval. This interval specifies how often the app should run the collection job. The default is 1800 seconds, or 30 minutes.
9. Click Add to save and enable this data input.
After you save the input, it immediately begins collecting all historical data, and then checks for updates every 30 minutes unless you changed the default interval.
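The collection interval also affects how many list key API calls the input makes, which AWS charges for. The following Python sketch gives a rough estimate of daily list requests for one input. It assumes that each collection run lists every key under the selected folder and that each S3 list request returns at most 1,000 keys. The actual number of calls that the app makes might differ. The function name list_requests_per_day is hypothetical, and the key counts are made-up examples.

```python
import math

S3_LIST_PAGE_SIZE = 1000          # S3 list requests return at most 1,000 keys
DEFAULT_INTERVAL_SECONDS = 1800   # default collection interval in this topic
SECONDS_PER_DAY = 86400


def list_requests_per_day(key_count, interval_seconds=DEFAULT_INTERVAL_SECONDS):
    """Estimate daily list requests, assuming a full listing on every run."""
    if key_count < 0 or interval_seconds <= 0:
        raise ValueError("key_count must be >= 0 and interval_seconds > 0")
    # Ceiling division; even an empty folder costs one request per run.
    requests_per_run = max(1, -(-key_count // S3_LIST_PAGE_SIZE))
    runs_per_day = SECONDS_PER_DAY / interval_seconds
    return math.ceil(requests_per_run * runs_per_day)


if __name__ == "__main__":
    # Example: 250,000 keys at the default interval, then after archiving
    # older keys so that 20,000 remain.
    print(list_requests_per_day(250_000))  # 12000
    print(list_requests_per_day(20_000))   # 960
```

Under these assumptions, either archiving older keys or lengthening the interval reduces the estimate proportionally.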
Delete an S3 input
You can view or delete your existing S3 inputs from the S3 Inputs screen. You cannot edit existing S3 inputs.
1. In the app, click Configure in the app navigation bar.
2. Under Data Sources, in the S3 box, click the link that tells you how many inputs you currently have configured for S3.
3. The S3 Inputs screen displays a list of S3 inputs, organized by the name auto-assigned to the input.
4. From here, you can click the names to open the individual inputs to view them, or you can delete an input by clicking the trash can icon.
This documentation applies to the following versions of Splunk® App for AWS (Legacy): 4.2.0, 4.2.1, 5.0.0, 5.0.1, 5.0.2