Configure the Splunk Add-on for Amazon Kinesis Firehose
Event formatting requirements
The Splunk Add-on for Amazon Kinesis Firehose supports data collection through both the raw and event HTTP Event Collector (HEC) endpoint types. When you collect data using the raw endpoint, data is sent directly to that endpoint without any preprocessing.
When you collect data using an event endpoint, format your events into the JSON format expected by HEC before sending them from Amazon Kinesis Firehose to the Splunk platform. You can apply an AWS Lambda blueprint to preprocess your events into the JSON structure and set event-specific fields. This allows you greater control over how your events are handled by the Splunk platform. For example, you can create and apply a Lambda blueprint that sends data from the same Firehose stream to different indexes depending on event type. For information about using an AWS Lambda function to preprocess events into this format, see Use AWS Lambda with HTTP Event Collector on the Splunk Developer Portal.
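As a rough illustration of the index routing described above, the following Python sketch wraps a payload in the HEC event envelope and picks an index from the event type. The `type` field, the index names, and the default sourcetype are hypothetical placeholders, not values defined by the add-on. The envelope keys (`time`, `sourcetype`, `index`, `event`) are the standard HEC event fields.

```python
import json
import time

# Hypothetical mapping from an event-type field to a Splunk index.
INDEX_BY_TYPE = {"error": "firehose_errors", "audit": "firehose_audit"}
DEFAULT_INDEX = "main"  # assumed fallback index


def to_hec_event(payload, sourcetype="aws:firehose:json", now=None):
    """Wrap a payload dict in the JSON envelope expected by the HEC event endpoint."""
    event_type = payload.get("type")  # "type" is an assumed field name
    return {
        "time": now if now is not None else time.time(),  # epoch seconds
        "sourcetype": sourcetype,
        "index": INDEX_BY_TYPE.get(event_type, DEFAULT_INDEX),
        "event": payload,
    }


def to_hec_body(payloads, **kwargs):
    """Serialize several payloads as newline-delimited HEC events."""
    return "\n".join(json.dumps(to_hec_event(p, **kwargs)) for p in payloads)
```

Any index named in the envelope must exist and be allowed for the HEC token, otherwise the events are not indexed as intended.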
Note: If you work with VPC Flow Log data, data of the aws:cloudwatchlogs:vpcflow source type contains a nested events JSON array that cannot be parsed by the HTTP Event Collector. In this case, you can prepare this data for the Splunk platform using an AWS Lambda function that extracts the nested JSON events into a newline-delimited set of events. For information about the required JSON structure, see Format events for HTTP Event Collector on the Splunk Developer Portal.
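The extraction described in this note could look roughly like the Python sketch below. It assumes the records arrive in the usual CloudWatch Logs subscription shape (base64-encoded, gzip-compressed JSON with a `logEvents` array) and that the function runs as a Firehose data-transformation Lambda. The function names are illustrative and this is not the blueprint shipped by AWS or Splunk.

```python
import base64
import gzip
import json


def extract_log_events(record_data_b64, sourcetype="aws:cloudwatchlogs:vpcflow"):
    """Decode one Firehose record and return newline-delimited HEC events."""
    raw = gzip.decompress(base64.b64decode(record_data_b64))
    payload = json.loads(raw)
    lines = []
    # Assumed payload shape: {"logGroup": ..., "logEvents": [{"timestamp": ms, "message": ...}]}
    for log_event in payload.get("logEvents", []):
        lines.append(json.dumps({
            "time": log_event["timestamp"] / 1000.0,  # milliseconds to seconds
            "source": payload.get("logGroup"),
            "sourcetype": sourcetype,
            "event": log_event["message"],
        }))
    return "\n".join(lines)


def handler(event, context):
    """Firehose transformation entry point: one output record per input record."""
    output = []
    for record in event["records"]:
        body = extract_log_events(record["data"])
        output.append({
            "recordId": record["recordId"],
            # Records without log events (for example control messages) are dropped.
            "result": "Ok" if body else "Dropped",
            "data": base64.b64encode(body.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

Each nested log event becomes its own HEC event, so the Splunk platform receives one event per flow log line instead of one event per CloudWatch batch.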
Indexers in AWS VPC
If your indexers are in an AWS Virtual Private Cloud (VPC), use an Elastic Load Balancer to send data to your indexers.
Configure HTTP event collection
Configure the HTTP event collector (HEC) on a single-instance Splunk Enterprise deployment to ingest data using the Splunk Add-on for Amazon Kinesis Firehose.
Prerequisites
- Install the Splunk Add-on for Amazon Kinesis Firehose on a single-instance Splunk Enterprise deployment
- For optimal performance, set ackIdleCleanup to true in inputs.conf, located in $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf for *nix users and %SPLUNK_HOME%\etc\apps\splunk_httpinput\local\inputs.conf for Windows users.
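As a sketch of what the resulting file might contain, the fragment below assumes the setting belongs in the global [http] stanza of the splunk_httpinput app's local inputs.conf. Check the inputs.conf specification for your Splunk Enterprise version before applying it.

```ini
# $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf
# Assumption: ackIdleCleanup is a global HEC setting under [http].
[http]
ackIdleCleanup = true
```

Changes to inputs.conf generally take effect after a restart of the Splunk platform.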
Steps
- Decide what index you want to use to collect your Amazon Kinesis Firehose data. Ensure that this index is enabled and active. Sending data to a disabled or deleted index results in dropped events.
- Go to Settings > Data inputs > HTTP Event Collector and click Global Settings.
- Check the box next to Enable SSL, then click Save.
- Create an HTTP event collector token with indexer acknowledgments enabled. During the configuration:
  - Specify a Source type for your incoming data.
  - Select an Index to which Amazon Kinesis Firehose will send data.
  - Check the box next to Enable indexer acknowledgement.
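To check a new token by hand, a request like the one Amazon Kinesis Firehose sends can be built as in the Python sketch below. The host name and token are placeholders. The sketch assumes the default HEC port 8088 and uses the `X-Splunk-Request-Channel` header that HEC requires when indexer acknowledgment is enabled. The functions only build the requests and do not send them.

```python
import json
import urllib.request


def build_event_request(base_url, token, channel, event):
    """Build a POST to the HEC event endpoint for a token with acknowledgment enabled."""
    body = json.dumps({"event": event}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/services/collector/event",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "X-Splunk-Request-Channel": channel,  # client-chosen GUID
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_ack_request(base_url, token, channel, ack_ids):
    """Build a POST that asks HEC whether the given ackId values have been indexed."""
    body = json.dumps({"acks": list(ack_ids)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/services/collector/ack",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "X-Splunk-Request-Channel": channel,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A typical manual check would send the first request with `urllib.request.urlopen`, read the `ackId` from the JSON response, and then poll the ack endpoint on the same channel until that ID is reported as true.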
Configure timestamp extraction
You can configure your add-on to send timestamped events to HTTP Event Collector when auto_extract_timestamp is set to "true" in the /event URL.
To configure this, enable one of the following endpoints:
- services/collector/event/1.0: Provides timestamps for event data events when auto_extract_timestamp is set to "true" in the /event URL
- services/collector/raw/1.0: Provides timestamps for raw data events when auto_extract_timestamp is set to "true" in the /event URL
When one or both of these endpoints are enabled, the add-on extracts timestamps as follows:
- If there is no timestamp in the event's JSON envelope, extraction is performed by leverage pipeline.
- If there is a timestamp, Splunk honors it.
- If "time=xxx" is used in the /event URL then auto_extract_timestamp is disabled.
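The three rules above can be summarized as a small decision function. The Python sketch below is only a model of the documented behavior, not Splunk code. It assumes that auto_extract_timestamp and time are passed as query-string parameters and that the envelope timestamp is stored under the `time` key. The function name is illustrative.

```python
from urllib.parse import parse_qs, urlsplit


def pipeline_extracts_timestamp(url, envelope):
    """Return True when the rules above imply timestamp extraction from the event data."""
    params = parse_qs(urlsplit(url).query)
    enabled = params.get("auto_extract_timestamp", ["false"])[0].lower() == "true"
    if "time" in params:
        # "time=xxx" in the URL disables auto_extract_timestamp.
        return False
    if "time" in envelope:
        # A timestamp in the JSON envelope is honored as-is.
        return False
    return enabled
```

For example, a URL ending in `services/collector/event/1.0?auto_extract_timestamp=true` combined with an envelope that has no `time` key yields True, while adding `&time=1500000000` to the same URL yields False.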
https://docs.splunk.com/Documentation/Splunk/1/SimplerGDI/HECEndpoints#HEC_Endpoints
This documentation applies to the following versions of Splunk® Enterprise: 7.3.0, 7.3.1