The Guided Data Onboarding documentation assumes you are familiar with Splunk software. If you are new to Splunk Enterprise or Splunk Cloud, see the Additional Resources topic in this manual.
To complete the tasks described in this manual, you must have the following:
- A distributed Splunk Enterprise deployment with indexer clustering enabled that is installed, running, and meets your hardware capacity requirements.
- Access to Splunk Web.
- A user role that permits app installation.
About the Splunk Add-on for Amazon Kinesis Firehose
The Splunk Add-on for Amazon Kinesis Firehose provides CIM-compatible knowledge for data collected through the HTTP Event Collector (HEC). After the Splunk platform indexes the events, you can analyze the data directly or use other Splunk apps, such as the Splunk App for AWS and Splunk Enterprise Security.
Download the Splunk Add-on for Amazon Kinesis Firehose from Splunkbase.
The Splunk Add-on for Amazon Kinesis Firehose requires Splunk platform version 6.6.X or later.
Amazon Kinesis Firehose requires that the HTTP Event Collector (HEC) endpoint be terminated with a valid CA-signed certificate that matches the DNS hostname used to connect to the HEC endpoint. You must use a trusted CA-signed certificate in your configuration. Self-signed certificates are not supported.
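As a rough pre-flight check of the certificate requirement above, the following Python sketch attempts a TLS handshake that fails unless the certificate chains to a trusted CA and matches the hostname. The function names are hypothetical, the default port of 8088 is an assumption about your HEC configuration, and the check uses the trust store of the machine that runs it, which may differ from the one Amazon Kinesis Firehose uses.

```python
import socket
import ssl


def build_strict_context() -> ssl.SSLContext:
    """Return a TLS context that requires a trusted CA chain and a hostname match."""
    context = ssl.create_default_context()  # loads the local system's trusted CAs
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def hec_certificate_is_trusted(hostname: str, port: int = 8088, timeout: float = 5.0) -> bool:
    """Return True if the endpoint presents a CA-signed certificate matching hostname.

    The port default is an assumption; use the port your HEC endpoint listens on.
    Network errors other than certificate failures are raised to the caller.
    """
    context = build_strict_context()
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            # The handshake raises if the chain is untrusted (for example,
            # self-signed) or if the certificate does not match the hostname.
            with context.wrap_socket(sock, server_hostname=hostname):
                return True
    except ssl.SSLCertVerificationError:
        return False
```

Pass the same DNS hostname that you plan to enter as the HEC endpoint, because the certificate is validated against that name.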
To send data directly into Splunk indexers in your own internal network or AWS VPC, install a CA-signed certificate on each indexer. To use an Elastic Load Balancer (ELB) to send data in distributed deployments, also install a CA-signed certificate on the load balancer.
Managed Splunk Cloud users are provided an ELB with a proper CA-signed certificate and a hostname for each stack.
Event formatting requirements
The Splunk Add-on for Amazon Kinesis Firehose supports data collection using the raw and event HEC endpoint types. When you collect data using the raw endpoint, Amazon Kinesis Firehose sends the data to the endpoint without any preprocessing.
When you collect data using an event endpoint, format your events into the JSON format expected by HEC before sending them from Amazon Kinesis Firehose to the Splunk platform. You can apply an AWS Lambda blueprint to preprocess your events into the JSON structure and set event-specific fields. This allows you greater control over how your events are handled by the Splunk platform. For example, you can create and apply a Lambda blueprint that sends data from the same Firehose stream to different indexes depending on event type. For information about using an AWS Lambda function to preprocess events into this format, see Use AWS Lambda with HTTP Event Collector on the Splunk Developer Portal.
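The following Python sketch illustrates the kind of preprocessing described above. It is not the AWS Lambda blueprint itself. It assumes the standard Firehose data transformation contract (base64-encoded `data` per record, and a `result` of `Ok` or `ProcessingFailed` in the response) and assumes each incoming record is a JSON object with a `type` field. The routing table, the index names, and the `type` field are hypothetical.

```python
import base64
import json

# Hypothetical routing table: event type -> Splunk index.
INDEX_BY_TYPE = {"audit": "aws_audit", "app": "aws_app"}
DEFAULT_INDEX = "main"  # assumed fallback index


def to_hec_event(payload: dict) -> dict:
    """Wrap a decoded record in the JSON envelope used by the HEC event endpoint."""
    event_type = str(payload.get("type", ""))
    return {
        "event": payload,
        "index": INDEX_BY_TYPE.get(event_type, DEFAULT_INDEX),
    }


def lambda_handler(event, context):
    """Transform each Firehose record into a HEC event, routed by event type."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("record is not a JSON object")
            body = json.dumps(to_hec_event(payload)).encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(body).decode("ascii"),
            })
        except ValueError:
            # Return the record unchanged and mark it as failed so that
            # Firehose can handle it according to the stream's settings.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Setting `index` per event is what lets a single Firehose stream feed several indexes. Other HEC metadata keys, such as `sourcetype` or `time`, can be set in the same envelope.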
Note: If you work with VPC Flow Log data, the aws:cloudwatchlogs:vpcflow source type contains a nested events JSON array that cannot be parsed by the HTTP Event Collector. In this case, you can prepare the data for the Splunk platform using an AWS Lambda function that extracts the nested JSON events into a newline-delimited set of events. For information about the required JSON structure, see Format events for HTTP Event Collector on the Splunk Developer Portal.
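The extraction step described in this note might look like the following Python sketch. It assumes the records arrive in the CloudWatch Logs subscription format, that is, a gzip-compressed JSON envelope whose nested array is named `logEvents`, with a millisecond `timestamp` and a `message` on each entry. The function name is hypothetical, and the sketch shows only the per-record extraction, not a complete Lambda handler.

```python
import base64
import gzip
import json


def split_log_events(record_data: str) -> str:
    """Decode one Firehose record and return its nested log events as
    newline-delimited HEC event JSON.

    Assumes record_data is base64-encoded, gzip-compressed JSON in the
    CloudWatch Logs subscription format.
    """
    envelope = json.loads(gzip.decompress(base64.b64decode(record_data)))
    lines = []
    for log_event in envelope.get("logEvents", []):
        lines.append(json.dumps({
            "time": log_event["timestamp"] / 1000.0,  # milliseconds to seconds
            "sourcetype": "aws:cloudwatchlogs:vpcflow",
            "event": log_event["message"],
        }))
    return "\n".join(lines)
```

Each output line is a standalone HEC event, so the nested array never reaches the HTTP Event Collector. An envelope with no log events yields an empty string, which a full handler would likely treat as a record to drop.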
Configure Amazon Web Services to collect data
This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8