All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator, which has been announced end-of-life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, we will no longer provide support for versions of DSP prior to DSP 1.4.0 after July 1, 2023. We advise all of our customers to upgrade to DSP 1.4.0 in order to continue to receive full product support from Splunk.
Send data to Splunk HTTP Event Collector
Use the Send data to Splunk HTTP Event Collector sink function to send data to an external Splunk Enterprise system.
This function combines the actions of three underlying DSP functions into one for convenience.
This function adds out-of-the-box support for index-based routing with batched data. If you want to send data from DSP to multiple Splunk Enterprise indexes, you can use this function to specify the target index on a per-record basis. Additionally, you can control how often batches are emitted with one of two optional arguments: batch_size, which specifies a maximum payload size in bytes, or batch_millis, which specifies the maximum time to wait before emitting a batch.
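The size-or-time emit condition described above can be sketched as follows. This is an illustrative model only, not DSP's actual implementation; the Batcher class and its method names are hypothetical.

```python
import time

class Batcher:
    """Illustrative model: emit a batch when its byte size reaches
    batch_size, or when batch_millis has elapsed, whichever comes first."""

    def __init__(self, batch_size=10 * 1024 * 1024, batch_millis=2000):
        self.batch_size = batch_size      # max payload size in bytes
        self.batch_millis = batch_millis  # max wait before emitting
        self.buffer = bytearray()
        self.started = None               # time the current batch began

    def add(self, payload: bytes):
        """Append a record payload; return the emitted batch, or None."""
        if self.started is None:
            self.started = time.monotonic()
        self.buffer.extend(payload)
        if len(self.buffer) >= self.batch_size:
            return self.flush()
        return None

    def poll(self):
        """Called periodically; emit if the time limit has passed."""
        if self.started is not None and \
                (time.monotonic() - self.started) * 1000 >= self.batch_millis:
            return self.flush()
        return None

    def flush(self):
        batch = bytes(self.buffer)
        self.buffer.clear()
        self.started = None
        return batch
```

Whichever threshold is crossed first triggers the emit, so a small batch_millis bounds latency while batch_size bounds payload size.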
Prerequisites
Before you can use this function, you must create a connection. See Create a DSP connection to a Splunk index in the Connect to Data Sources and Destinations with the Data Stream Processor manual. When configuring this sink function, set the connection_id argument to the ID of that connection.
Function input schema
See Connecting Splunk indexes to your DSP pipeline.
Required arguments
- connection_id
- Syntax: string
- Description: The ID of the Splunk Enterprise Connection.
- Example: "576205b3-f6f5-4ab7-8ffc-a4089a95d0c4"
- index
- Syntax: expression<string>
- Description: An expression that gets the Splunk index, if it exists, from your record. If your data does not contain an index, set this field to empty string "".
- Example: cast(map_get(attributes, "index"), "string")
- default_index
- Syntax: expression<string>
- Description: If your record doesn't contain a Splunk index field, then this function sends your data to the index specified in this argument. If you do not want to specify a default index, set this field to empty string "".
- Example: "main"
Optional arguments
- parameters
- Syntax: map<string, string>
- Description: The optional parameters you can enter in this function. See the following table for a description of each parameter. Defaults to empty { }.
| Parameter | Syntax | Description | Example |
| --- | --- | --- | --- |
| hec-token-validation | boolean | Set to true to enable HEC token validation. Defaults to true. | hec-token-validation: true |
| hec-enable-ack | boolean | Set to true for the function to wait for an acknowledgement for every single event. Set to false if acknowledgements in your Splunk platform are disabled or to increase throughput. Defaults to true. | hec-enable-ack: true |
| hec-gzip-compression | boolean | Set to true to compress HEC JSON data and increase throughput at the expense of increased pipeline CPU utilization. Defaults to false. | hec-gzip-compression: false |
| async | boolean | Set to true to send data asynchronously. In async mode, send operations from DSP do not wait for a response to return, increasing performance. See Performance expectations for sending data from DSP pipelines to Splunk Enterprise. Defaults to false. Best practice is to enable this for performance optimization. When async is enabled, the DSP HEC client attempts to write a HEC JSON payload to the Splunk HEC endpoint a maximum of three times. Each attempt has a 10 second timeout, and a maximum of 100 async I/O operations can happen concurrently across all indexers. If you require additional optimizations and you have a support contract, contact Splunk Customer Support. | async: true |
- batch_size
- Syntax: string
- Description: The maximum size, in bytes, of an emitted batch. The size of an emitted batch cannot exceed 100 MB.
- Default: 10MB
- Example: "2MB"
- batch_millis
- Syntax: long
- Description: The interval, in milliseconds, at which to send batched data to Splunk Enterprise. Cannot exceed 8000 milliseconds (8 seconds).
- Default: 2000
- Example: 2000
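To illustrate what the hec-gzip-compression parameter trades off, the following sketch builds a batched HEC request body and optionally gzips it. The build_hec_request helper is hypothetical; the Authorization: Splunk &lt;token&gt; header format, newline-delimited JSON event batching, and the Content-Encoding: gzip header are standard HEC conventions.

```python
import gzip
import json

def build_hec_request(events, token, gzip_compression=False):
    """Return (headers, body) for a batched HEC payload.
    HEC accepts multiple JSON event objects concatenated in one request."""
    body = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    headers = {
        "Authorization": f"Splunk {token}",  # standard HEC auth header
        "Content-Type": "application/json",
    }
    if gzip_compression:
        # Smaller payload on the wire, at the cost of CPU spent compressing.
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    return headers, body
```

With compression enabled, the same event batch produces a gzipped body that the HEC endpoint transparently decompresses, which is where the throughput-versus-CPU trade-off described in the table comes from.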
SPL2 example
In this example, records are sent to the index specified in the index key of the attributes field. If the index key does not exist in attributes, then that record is sent to the events_idx_2 index specified by the default_index argument. Additionally, the hec-enable-ack, hec-token-validation, and hec-gzip-compression parameters are configured for optimal throughput. Finally, the Splunk Enterprise Indexes function sends your data to the HEC endpoint when your payload reaches 100B in size.
When working in the SPL View, you can write the function by providing arguments in this exact order.
| from splunk_firehose()
| into splunk_enterprise_indexes(
    "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    cast(map_get(attributes, "index"), "string"),
    "events_idx_2",
    {"hec-enable-ack": "false", "hec-token-validation": "true", "hec-gzip-compression": "true"},
    "100B"
);
Alternatively, you can use named arguments to declare the arguments in any order without having to list all arguments. All unprovided arguments use their default values. The following example skips the parameters
argument but still provides other optional arguments.
| from splunk_firehose()
| into splunk_enterprise_indexes(
    connection_id: "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    index: cast(map_get(attributes, "index"), "string"),
    default_index: "events_idx_2",
    batch_millis: 2000,
    batch_size: "100B"
);
If you want to use a mix of unnamed and named arguments in your functions, you need to list all unnamed arguments in the correct order before providing the named arguments.
Assume that you have the following three records in your data:
Record{ body="my data 1", source_type="mysourcetype1", id="id1", source="mysource", timestamp=1234567890011, host="myhost1", attributes={"attr1":"val1", "index":"index1"} }
Record{ body="my data 2", source_type="mysourcetype2", id="id2", source="mysource", timestamp=1234567890012, host="myhost2", attributes={"index":"index2"} }
Record{ body="my data 3", source_type="mysourcetype3", id="id3", source="mysource", timestamp=1234567890013, host="myhost3" }
Sending these records to the Splunk_Enterprise_Indexes function with the arguments specified in the earlier SPL2 example results in the following HEC JSON payload:
{"event":"my data 1", "sourcetype":"mysourcetype1", "source":"mysource", "host":"myhost1", "index": "index1", "time":"1234567890.011"} {"event":"my data 2", "sourcetype":"mysourcetype2", "source":"mysource", "host":"myhost2", "index": "index2", "time":"1234567890.012"} {"event":"my data 3", "sourcetype":"mysourcetype3", "source":"mysource", "host":"myhost3", time":"1234567890.013"}
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6