Send data to a Splunk index with batching

Use the Send to a Splunk Index with Batching sink function to send data to an external Splunk Enterprise system.

The Splunk Enterprise Indexes function combines the actions of three underlying functions into one for convenience.

This function adds out-of-the-box support for index-based routing with batched data. If you want to send data to multiple Splunk Enterprise indexes, you can use this function to specify the target index on a per-record basis. You can also control how often batches are emitted with two optional arguments: batch_size, which specifies the maximum payload size in bytes, and batch_millis, which specifies the maximum time to wait before emitting a batch.

Prerequisites

Before you can use this function, you must create a connection. See Create a DSP connection to a Splunk index in the Connect to Data Sources and Destinations manual. When configuring this sink function, set the connection_id argument to the ID of that connection.

Function input schema

See Connecting Splunk indexes to your pipeline.

Required arguments

connection_id: Syntax: string; Description: The ID of the Splunk Enterprise Connection.; Example in Canvas View: "576205b3-f6f5-4ab7-8ffc-a4089a95d0c4"
index: Syntax: expression<string>; Description: An expression to get the Splunk index, if it exists, from your record. If your data does not contain an index, set this field to an empty string "".; Example in Canvas View: cast(map_get(attributes, "index"), "string")
default_index: Syntax: expression<string>; Description: If your record doesn't contain a Splunk index field, then this function sends your data to the index specified in this argument. If you do not want to specify a default index, set this field to an empty string "".; Example in Canvas View: "main"

The following argument is required only if you want to send data to a Splunk Cloud index. Otherwise, it is optional.

parameters: Syntax: map<string, string>; Description: The additional parameters you can enter in this function. If you want to send data to Splunk Cloud, you must set hec-token-validation to false. For other parameters you can specify, see "Parameters" in the "Optional arguments" section.; Example in Canvas View: hec-token-validation: false
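
The fallback between the index and default_index arguments can be sketched in a few lines of Python. This is only an illustration of the rule described in this section, not DSP code: resolve_index is a hypothetical helper, and because this topic does not say what happens when both values are empty, the sketch simply returns None in that case.

```python
from typing import Optional


def resolve_index(record_index: Optional[str], default_index: str) -> Optional[str]:
    """Pick the target index for one record (hypothetical helper).

    record_index  -- result of evaluating the `index` expression on the record;
                     None or "" when the record carries no index.
    default_index -- value of the `default_index` argument; "" means no default.
    """
    if record_index:
        # The record names its own index, so that index wins.
        return record_index
    if default_index:
        # No index on the record: fall back to the configured default.
        return default_index
    # Neither is set. The resulting behavior is not described in this topic.
    return None


print(resolve_index("index1", "main"))
print(resolve_index(None, "main"))
```

With these inputs, the first call resolves to the record's own index and the second falls back to the default.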

Optional arguments

parameters: Syntax: map<string, string>; Description: The optional parameters you can enter in this function. See the following table for a description of each parameter. Defaults to empty { }.

Parameter	Syntax	Description	Example in Canvas View
hec-token-validation	boolean	Set to true to enable HEC token validation. Defaults to true.	hec-token-validation: true
hec-enable-ack	boolean	Set to true for the function to wait for an acknowledgement for every single event. Set to false if acknowledgments in your Splunk platform are disabled or to increase throughput. Defaults to true.	hec-enable-ack: true
hec-gzip-compression	boolean	Set to true to compress HEC JSON data and increase throughput at the expense of increasing pipeline CPU utilization. Defaults to false.	hec-gzip-compression: false

batch_size: Syntax: string; Description: The maximum size, in bytes, of the emitted batched byte[]. The size of your emitted batched bytes cannot exceed 100 MB. Defaults to 10MB.; Example in Canvas View: "2MB"
batch_millis: Syntax: long; Description: The interval, in milliseconds, at which to send batched data to Splunk Enterprise. Defaults to 10000.; Example in Canvas View: 2000
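
To make the relationship between batch_size and batch_millis concrete, the following Python sketch shows one way a size-or-time batching policy can behave. It is an assumption-laden illustration, not the function's actual implementation: the SizeOrTimeBatcher class, its method names, and the choice to join payloads with newlines are all hypothetical.

```python
from typing import List, Optional


class SizeOrTimeBatcher:
    """Illustrative batcher: emits a batch when adding a payload would exceed
    max_bytes, or when the oldest buffered payload has waited max_millis."""

    def __init__(self, max_bytes: int, max_millis: int) -> None:
        self.max_bytes = max_bytes      # plays the role of batch_size
        self.max_millis = max_millis    # plays the role of batch_millis
        self._buffer: List[bytes] = []
        self._size = 0
        self._started_ms: Optional[int] = None

    def add(self, payload: bytes, now_ms: int) -> Optional[bytes]:
        """Buffer one payload; return a finished batch if the size limit was hit."""
        flushed = None
        # Emit the current batch first if this payload would push it past the limit.
        if self._buffer and self._size + len(payload) > self.max_bytes:
            flushed = self._flush()
        if self._started_ms is None:
            self._started_ms = now_ms
        self._buffer.append(payload)
        self._size += len(payload)
        return flushed

    def tick(self, now_ms: int) -> Optional[bytes]:
        """Return the current batch if it has waited at least max_millis."""
        if self._buffer and now_ms - self._started_ms >= self.max_millis:
            return self._flush()
        return None

    def _flush(self) -> bytes:
        batch = b"\n".join(self._buffer)
        self._buffer, self._size, self._started_ms = [], 0, None
        return batch


batcher = SizeOrTimeBatcher(max_bytes=100, max_millis=2000)
batcher.add(b'{"event":"my data 1"}', now_ms=0)
print(batcher.tick(now_ms=2000))  # the time limit is reached, so the batch is emitted
```

Whichever limit is reached first triggers the batch, which is why a small batch_size or a short batch_millis both lead to more frequent, smaller requests.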

SPL2 example

In this example, records are sent to the index specified in the index key of the attributes field. If the index key does not exist in attributes, then that record is sent to the events_idx_2 index. Additionally, the hec-enable-ack and hec-gzip-compression parameters are configured for higher throughput. Finally, the Splunk Enterprise Indexes function sends your data to the HEC endpoint when your payload reaches 100 bytes in size.

When working in the SPL View, you can write the function by providing arguments in this exact order.

| from splunk_firehose()
| into splunk_enterprise_indexes(
    "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    cast(map_get(attributes, "index"), "string"),
    "events_idx_2",
    {"hec-enable-ack": "false", "hec-token-validation": "true", "hec-gzip-compression": "true"},
    "100B"
  );

Alternatively, you can use named arguments to declare the arguments in any order and without having to list all arguments. All unprovided arguments use their default values. The following example skips the parameters argument but still provides other optional arguments.

| from splunk_firehose()
| into splunk_enterprise_indexes(
    connection_id: "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    index: cast(map_get(attributes, "index"), "string"),
    default_index: "events_idx_2",
    batch_millis: 2000,
    batch_size: "100B"
  );

If you want to use a mix of unnamed and named arguments in your functions, you need to list all unnamed arguments in the correct order before providing the named arguments.

Assume that you have the following three records in your data:

Record{ 
  body="my data 1", source_type="mysourcetype1", id="id1", source="mysource", timestamp=1234567890011, host="myhost1", attributes={"attr1":"val1", "index":"index1"}
}

Record{ 
  body="my data 2", source_type="mysourcetype2", id="id2", source="mysource", timestamp=1234567890012, host="myhost2", attributes={"index":"index2"}
}

Record{ 
  body="my data 3", source_type="mysourcetype3", id="id3", source="mysource", timestamp=1234567890013, host="myhost3"
}

Sending these records to the splunk_enterprise_indexes function with the arguments specified in the earlier SPL2 example results in the following HEC JSON payload:

{"event":"my data 1", "sourcetype":"mysourcetype1", "source":"mysource", "host":"myhost1", "index": "index1", "time":"1234567890.011"}
{"event":"my data 2", "sourcetype":"mysourcetype2", "source":"mysource", "host":"myhost2", "index": "index2", "time":"1234567890.012"}
{"event":"my data 3", "sourcetype":"mysourcetype3", "source":"mysource", "host":"myhost3", "time":"1234567890.013"}
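
The field mapping visible in this payload can be sketched in Python. The mapping below is inferred only from the sample records and output shown in this topic (body becomes event, source_type becomes sourcetype, and the millisecond timestamp becomes a seconds string with three decimal places). The to_hec_event helper is hypothetical and is not part of DSP.

```python
import json
from typing import Any, Dict, Optional


def to_hec_event(record: Dict[str, Any], index: Optional[str] = None) -> Dict[str, str]:
    """Map one record to an HEC-style JSON object (hypothetical helper)."""
    ts_ms = record["timestamp"]
    event = {
        "event": record["body"],
        "sourcetype": record["source_type"],
        "source": record["source"],
        "host": record["host"],
    }
    if index:
        # Only add the index key when an index was resolved for this record.
        event["index"] = index
    # Epoch milliseconds -> "seconds.milliseconds", keeping three decimal places.
    event["time"] = f"{ts_ms // 1000}.{ts_ms % 1000:03d}"
    return event


record = {
    "body": "my data 1",
    "source_type": "mysourcetype1",
    "id": "id1",
    "source": "mysource",
    "timestamp": 1234567890011,
    "host": "myhost1",
    "attributes": {"attr1": "val1", "index": "index1"},
}
index = record.get("attributes", {}).get("index")
print(json.dumps(to_hec_event(record, index)))
```

Integer division and the zero-padded remainder avoid the rounding issues that floating-point division of the timestamp could introduce.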
