Splunk® Enterprise

Add AWS CloudFront access log data: Distributed deployment with indexer clustering

Splunk Enterprise version 7.2 is no longer supported as of April 30, 2021. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.
This documentation does not apply to the most recent version of Splunk® Enterprise. For documentation on the most recent version, go to the latest release.

Configure inputs for the Splunk Add-on for AWS

SQS-based S3 is the recommended input type for collecting CloudFront access logs. Before you configure your inputs, configure S3 to send event notifications to SQS through SNS, so that the add-on is notified when new events are written to the S3 bucket.
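The S3-to-SNS leg of that notification path can be sketched in Python as a small helper that builds the bucket notification payload. This is a minimal sketch, not the add-on's own tooling: the helper name, the topic ARN, and the key prefix are hypothetical, and it assumes the SNS topic already has the SQS queue subscribed and a policy that lets S3 publish to it.

```python
def build_s3_to_sns_notification(topic_arn, prefix=""):
    """Build an S3 notification configuration that publishes
    object-created events to an SNS topic (hypothetical helper)."""
    if not topic_arn.startswith("arn:"):
        raise ValueError("topic_arn must be an ARN")
    topic_config = {
        "TopicArn": topic_arn,
        # Only newly created objects matter for the SQS-based S3 input.
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optionally restrict notifications to keys under one prefix.
        topic_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"TopicConfigurations": [topic_config]}


# Example values are placeholders, not real resources.
config = build_s3_to_sns_notification(
    "arn:aws:sns:us-east-1:123456789012:cloudfront-logs",
    prefix="cloudfront/",
)
```

If you use boto3, the resulting dictionary has the shape expected by the `NotificationConfiguration` argument of the S3 client's `put_bucket_notification_configuration` call.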

Keep the following in mind as you configure your inputs:

  • The SQS-based S3 input collects, in near-real time, only newly created AWS service logs that are stored in S3 buckets with event notifications sent to SQS. It does not collect events that occurred in the past or events for which no notification was sent through SNS to SQS. To collect historical logs already stored in S3 buckets, use the S3 input instead. The S3 input lets you set the initial scan time parameter (log start date) to collect data generated after a specified time in the past.
  • To collect logs of the same type from multiple S3 buckets, even across regions, you can set up one input to collect data from all the buckets by configuring these buckets to send notifications to the same SQS queue the SQS-based S3 input polls messages from.
  • To achieve high throughput in ingesting data from an S3 bucket, you can configure multiple SQS-based S3 inputs for the S3 bucket to scale out data collection.
  • After configuring an SQS-based S3 input, you may need to wait a few minutes before new events are ingested and can be searched. A more verbose logging level also lengthens data ingestion time. Debug mode is extremely verbose and is not recommended on production systems.
  • The SQS-based S3 input ingests data from S3 buckets efficiently: it optimizes the API calls made by the add-on and relies on SQS/SNS to collect events when a notification is received.
  • The SQS-based S3 input is stateless, which means that when multiple inputs are collecting data from the same bucket, if one input goes down, the other inputs will continue to collect data and take over the load from the failed input. This way, you can enhance fault tolerance by configuring multiple inputs to collect data from the same bucket.
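The scale-out and fault-tolerance points above can be sketched in Python as a helper that produces several input definitions polling one shared SQS queue. This is an illustrative sketch only: the function name, the `_1`, `_2` naming scheme, the account name, and the queue URL are hypothetical, while the setting names come from the inputs.conf template in this topic.

```python
def scale_out_inputs(base_name, count, aws_account, queue_region, queue_url):
    """Return `count` SQS-based S3 input definitions that all poll the
    same queue, keyed by stanza header (hypothetical helper)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    shared = {
        "aws_account": aws_account,
        "sqs_queue_region": queue_region,
        "sqs_queue_url": queue_url,
    }
    # Each input gets its own copy of the shared settings.
    return {
        f"aws_sqs_based_s3://{base_name}_{n}": dict(shared)
        for n in range(1, count + 1)
    }


# Placeholder values for illustration.
stanzas = scale_out_inputs(
    "cloudfront", 3, "my_aws_account", "us-east-1",
    "https://sqs.us-east-1.amazonaws.com/123456789012/cloudfront-logs",
)
for header in stanzas:
    print(f"[{header}]")
```

Because every definition points at the same queue, any one of the inputs can pick up a given notification, which is the behavior the stateless design relies on.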

Input configuration overview

You can use the Splunk Add-on for AWS to collect data from AWS. For each supported data type, one or more input types are provided for data collection.

Users adding new inputs must have the admin_all_objects role enabled.

Follow these steps to plan and perform your AWS input configuration:

  1. Click the input type to go to the input configuration details.
  2. Follow the steps described in the input configuration details to complete the configuration.

Configure a CloudFront input using Splunk Web

To configure inputs in Splunk Web, click Splunk Add-on for AWS in the left navigation bar on the Splunk Web home page, then choose the following menu path:

  • Create New Input > CloudFront Access Log > SQS-based S3

You must have the admin_all_objects role enabled in order to add new inputs.

Choose the menu path that corresponds to the data type you want to collect. The system will automatically set the source type and display relevant field settings in the subsequent configuration page.

| Argument | Corresponding Field in Splunk Web | Description |
|---|---|---|
| aws_account | AWS Account | The AWS account or EC2 IAM role the Splunk platform uses to access the keys in your S3 buckets. In Splunk Web, select an account from the drop-down list. In inputs.conf, enter the friendly name of one of the AWS accounts that you configured on the Configuration page or the name of the autodiscovered EC2 IAM role. Note: If the region of the AWS account you select is GovCloud, you may encounter errors such as "Failed to load options for S3 Bucket". In that case, manually add the AWS GovCloud endpoint in the S3 Host Name field. See http://docs.aws.amazon.com/govcloud-us/latest/UserGuide/using-govcloud-endpoints.html for more information. |
| aws_iam_role | Assume Role | The IAM role to assume. See Manage IAM roles. |
| sqs_queue_region | AWS Region | The AWS region that the SQS queue is in, for example us-east-1. |
| sqs_queue_url | SQS Queue | The SQS queue URL. |
| sqs_batch_size | SQS Batch Size | The maximum number of messages to pull from the SQS queue in one batch. Enter an integer between 1 and 10 inclusive. The default is 10. Splunk recommends a larger value for small files and a smaller value for large files. If you are dealing with large files and your system memory is limited, set this to a smaller value. |
| s3_file_decoder | S3 File Decoder | The decoder to use to parse the corresponding log files. The decoder is set according to the Data Type you select. If you select a Custom Data Type, choose one of CloudTrail, Config, ELB Access Logs, S3 Access Logs, or CloudFront Access Logs. |
| sourcetype | Source Type | The source type for the events to collect, automatically filled in based on the decoder chosen for the input. |
| interval | Interval | The length of time in seconds between two data collection runs. The default is 300 seconds. |
| index | Index | The name of the index where the Splunk platform puts the SQS-based S3 data. The default is main. |
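The constraints in the table above can be checked before you save an input. The following Python sketch is one way to do that, under stated assumptions: the function name is hypothetical, treating aws_account, sqs_queue_region, and sqs_queue_url as required is an assumption rather than something this topic states, and the region check only understands queue URLs of the common form `https://sqs.<region>.amazonaws.com/...` and skips any other form.

```python
import re
from urllib.parse import urlparse

# Defaults as documented in the table above.
DEFAULTS = {"sqs_batch_size": 10, "interval": 300, "index": "main"}


def check_input_settings(settings):
    """Return a list of human-readable problems (empty if none found)."""
    merged = {**DEFAULTS, **settings}
    problems = []
    # Assumption: these three settings are treated as required.
    for required in ("aws_account", "sqs_queue_region", "sqs_queue_url"):
        if not merged.get(required):
            problems.append(f"missing {required}")
    if not 1 <= int(merged["sqs_batch_size"]) <= 10:
        problems.append("sqs_batch_size must be between 1 and 10 inclusive")
    if int(merged["interval"]) <= 0:
        problems.append("interval must be a positive number of seconds")
    # Compare the region embedded in the queue URL with sqs_queue_region.
    host = urlparse(merged.get("sqs_queue_url") or "").hostname or ""
    match = re.match(r"sqs\.([a-z0-9-]+)\.amazonaws\.com", host)
    region = merged.get("sqs_queue_region")
    if match and region and match.group(1) != region:
        problems.append("sqs_queue_region does not match the region in sqs_queue_url")
    return problems


# Placeholder values for illustration.
print(check_input_settings({
    "aws_account": "my_aws_account",
    "sqs_queue_region": "us-east-1",
    "sqs_queue_url": "https://sqs.us-east-1.amazonaws.com/123456789012/cloudfront-logs",
    "sqs_batch_size": 25,
}))
```

Returning a list of problems instead of raising on the first one lets you report every issue with an input in a single pass.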

Configure a CloudFront input using a configuration file

When you configure inputs manually in inputs.conf, create a stanza using the following template and add it to $SPLUNK_HOME/etc/apps/Splunk_TA_aws/local/inputs.conf. If the file or path does not exist, create it.

You can configure the parameters below.

[aws_sqs_based_s3://<stanza_name>]
aws_account = <value>
interval = <value>
s3_file_decoder = <value>
sourcetype = <value>
sqs_batch_size = <value>
sqs_queue_region = <value>
sqs_queue_url = <value>

Valid values for s3_file_decoder are: CloudTrail, Config, S3 Access Logs, ELB Access Logs, CloudFront Access Logs, and CustomLogs.

If you want to ingest custom logs other than the natively supported AWS log types, you must set s3_file_decoder = CustomLogs. This setting lets you ingest custom logs into Splunk but does not parse the data. To process custom logs into meaningful events, perform additional configuration in props.conf and transforms.conf to parse the collected data to meet your specific requirements.
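As an illustration of filling in the template, the following Python sketch renders a stanza with the standard library `configparser` module and rejects decoder values that are not in the list given above. It is a sketch under assumptions: the function name, stanza name, account name, and queue URL are hypothetical, and the decoder strings are copied exactly as this topic spells them, so confirm the accepted spelling in inputs.conf.spec before relying on them.

```python
import configparser
import io

# Decoder values exactly as listed in this topic.
VALID_DECODERS = {
    "CloudTrail", "Config", "S3 Access Logs", "ELB Access Logs",
    "CloudFront Access Logs", "CustomLogs",
}


def render_stanza(name, settings):
    """Render one aws_sqs_based_s3 stanza as inputs.conf text."""
    decoder = settings.get("s3_file_decoder")
    if decoder not in VALID_DECODERS:
        raise ValueError(f"unsupported s3_file_decoder: {decoder!r}")
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names exactly as given
    parser[f"aws_sqs_based_s3://{name}"] = {k: str(v) for k, v in settings.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder values for illustration.
print(render_stanza("cloudfront_access_logs", {
    "aws_account": "my_aws_account",
    "interval": 300,
    "s3_file_decoder": "CloudFront Access Logs",
    "sqs_batch_size": 10,
    "sqs_queue_region": "us-east-1",
    "sqs_queue_url": "https://sqs.us-east-1.amazonaws.com/123456789012/cloudfront-logs",
}))
```

The example leaves out sourcetype because, as described above, the source type is filled in based on the decoder when you configure the input in Splunk Web; set it explicitly if your deployment requires it.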

For more information on these settings, see /README/inputs.conf.spec under your add-on directory.

Last modified on 10 July, 2019

This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10

