Batch Records
This topic describes how to use the Batch Records function in the Splunk Data Stream Processor.
Description
Batches records by record count or by a time interval in milliseconds. Batching records, as opposed to sending each record individually, can increase throughput by reducing the quantity of data sent. However, batching can also increase latency, because records are held until the batch is ready to be sent.
Because Batch Records sends records in batches, and the Write Splunk Index and Write to Splunk Enterprise sink functions set index per record, the index that you specify in your sink function is applied to the entire batch of records. To route your data to different indexes while batching records, create one branch for each index that you want to send data to. See Optimize performance for more information.
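The branch-per-index advice can be pictured with a small Python model. This is only an illustrative sketch, not Data Stream Processor code: the `index` field on each record, the index names, and the helper functions `branch_by_index` and `chunk` are all hypothetical. It shows that records are first split by target index, and only then batched, so that every batch maps to exactly one index.

```python
from collections import defaultdict


def branch_by_index(records, default_index="main"):
    """Split records into one list per target index (one 'branch' each).

    The "index" field and the default index name are assumptions made
    for this illustration.
    """
    branches = defaultdict(list)
    for record in records:
        branches[record.get("index", default_index)].append(record)
    return dict(branches)


def chunk(records, num_events):
    """Group a list of records into batches of at most num_events records."""
    return [records[i:i + num_events] for i in range(0, len(records), num_events)]


records = [
    {"name": "record1", "index": "web"},
    {"name": "record2", "index": "security"},
    {"name": "record3", "index": "web"},
]

# Batch each branch separately so that no batch mixes indexes.
batches_per_index = {
    index: chunk(branch, 2) for index, branch in branch_by_index(records).items()
}
```

In this model, each key of `batches_per_index` corresponds to one branch of the pipeline, with its own batching step and its own sink.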
Function Input/Output Schema
- Function Input
- collection<record<R>>
- This function takes in collections of records with schema R.
- Function Output
- collection<record<schema<batch: collection<map<string,any>>>>>
Syntax
- batch_records
- num_events=<int>
- millis=<int>
Required arguments
- num_events
- Syntax: int
- Description: The maximum number of records to send per batch.
- millis
- Syntax: int
- Description: The interval, in milliseconds, at which to send batched records.
Usage
The following is an example of batched data. Assume that your data looks something like the following snippet, and you've configured your function with the arguments as shown in the SPL2 example.
[ {"name": "record1", "timestamp": "1s"}, {"name": "record2", "timestamp": "2s"}, {"name": "record3", "timestamp": "2s"}, {"name": "record4", "timestamp": "5s"}, {"name": "record5", "timestamp": "5s"}, ... ]
The batch_records function sends your records as follows.
[ [ {"name": "record1", "timestamp": "1s"}, {"name": "record2", "timestamp": "2s"} ], [ {"name": "record3", "timestamp": "2s"} ], [ {"name": "record4", "timestamp": "5s"}, {"name": "record5", "timestamp": "5s"} ], ... ]
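The grouping above can be reproduced with a simplified Python model of count-or-time batching. This is a sketch under stated assumptions, not the Data Stream Processor implementation: it uses a hypothetical `timestamp_ms` field on each record as a stand-in for the function's timer, and it closes a batch either when the batch reaches `num_events` records or when a record arrives `millis` or more after the batch was opened.

```python
def batch_records(records, num_events, millis):
    """Simplified model of batching by count or elapsed time.

    Assumption: each record carries a "timestamp_ms" field that stands in
    for the arrival time. The real function's timer behavior may differ.
    """
    batches = []
    current = []
    batch_start = None
    for record in records:
        ts = record["timestamp_ms"]
        # Time limit reached: send the open batch before adding this record.
        if current and ts - batch_start >= millis:
            batches.append(current)
            current = []
        if not current:
            batch_start = ts
        current.append(record)
        # Count limit reached: send the batch immediately.
        if len(current) >= num_events:
            batches.append(current)
            current = []
    if current:
        # Send whatever remains at the end of the input.
        batches.append(current)
    return batches


records = [
    {"name": "record1", "timestamp_ms": 1000},
    {"name": "record2", "timestamp_ms": 2000},
    {"name": "record3", "timestamp_ms": 2000},
    {"name": "record4", "timestamp_ms": 5000},
    {"name": "record5", "timestamp_ms": 5000},
]

batches = batch_records(records, num_events=2, millis=2000)
names = [[r["name"] for r in batch] for batch in batches]
```

With `num_events=2` and `millis=2000`, record1 and record2 fill a batch by count, record3 is sent alone once the time limit passes, and record4 and record5 fill the next batch by count.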
SPL2 examples
Group records into batches that are sent when a batch contains 2 events or when 2 seconds have passed
... | batch_records num_events=2L millis=2000L |...;
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.1.0