Splunk® Data Stream Processor

Function Reference

Batch Records

This topic describes how to use the Batch Records function in the Splunk Data Stream Processor.

Description

Batches records by count or by a time interval in milliseconds. Batching records, as opposed to sending each record individually, can increase throughput by reducing the number of individual transmissions. However, batching can also increase latency, because records are held until the batch is ready to be sent.

There are two functions for batching records: Batch Records and Batch Bytes. Use Batch Records when you do not want to serialize your data or you want to perform serialization after batching. Use Batch Bytes when you want to serialize your data before batching.
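The throughput side of this trade-off can be illustrated with simple arithmetic. The following is a minimal Python sketch, not part of the Data Stream Processor; the helper name sends_required is hypothetical, and it assumes every batch fills up by count before the time interval expires.

```python
import math


def sends_required(total_records: int, num_events: int) -> int:
    """Return how many batches are sent for total_records records.

    Hypothetical helper for illustration only. Assumes each batch is
    flushed by count (num_events), never by the time interval.
    """
    if num_events <= 0:
        raise ValueError("num_events must be positive")
    return math.ceil(total_records / num_events)


# 10,000 records sent individually means 10,000 sends.
# Batched with num_events=2000, the same records need far fewer sends.
print(sends_required(10_000, 2000))
```

Under the same assumption, the cost is latency: a record placed in a batch is not sent until the batch fills or the configured interval elapses.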

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<schema<batch: collection<map<string,any>>>>>

Syntax

batch_records
num_events=<long>
millis=<long>

Required arguments

num_events
Syntax: expression<long>
Description: The maximum number of records to send per batch.
Default: 100,000,000
Example: 2000
millis
Syntax: expression<long>
Description: The interval, in milliseconds, at which to send batched records.
Default: 10000 milliseconds (10 seconds).
Example: 2000

Usage

The following is an example of batched data. Assume that your data looks something like the following snippet, and you've configured your function with the arguments as shown in the SPL2 example.

[
    {"name": "record1", "timestamp": "1s"},
    {"name": "record2", "timestamp": "2s"},
    {"name": "record3", "timestamp": "2s"},
    {"name": "record4", "timestamp": "5s"},
    {"name": "record5", "timestamp": "5s"},
    ...
]

The batch_records function sends your records in the following batches.

[
    [
        {"name": "record1", "timestamp": "1s"},
        {"name": "record2", "timestamp": "2s"}
    ],
    [
        {"name": "record3", "timestamp": "2s"}
    ],
    [
        {"name": "record4", "timestamp": "5s"},
        {"name": "record5", "timestamp": "5s"}
    ],
    ...
]
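The grouping above can be reproduced with a small simulation. The following Python sketch is an illustration only and is not how the Data Stream Processor implements batch_records. It assumes that the time interval is measured from the arrival of the first record in a batch, that a batch is flushed as soon as either limit is reached, and that arrival times are given as integer milliseconds. The function name simulate_batch_records and the arrival_ms values are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def simulate_batch_records(
    arrivals: List[Tuple[int, Record]], num_events: int, millis: int
) -> List[List[Record]]:
    """Group (arrival_ms, record) pairs into batches by count or elapsed time.

    Assumption: the timer starts when the first record enters a batch.
    """
    batches: List[List[Record]] = []
    current: List[Record] = []
    opened_at = 0  # arrival time of the first record in the open batch

    for arrival_ms, record in arrivals:
        # Time-based flush: the open batch has waited at least `millis`.
        if current and arrival_ms - opened_at >= millis:
            batches.append(current)
            current = []
        if not current:
            opened_at = arrival_ms
        current.append(record)
        # Count-based flush: the open batch reached `num_events` records.
        if len(current) >= num_events:
            batches.append(current)
            current = []

    # Send whatever is left once the input ends.
    if current:
        batches.append(current)
    return batches


arrivals = [
    (1000, {"name": "record1", "timestamp": "1s"}),
    (2000, {"name": "record2", "timestamp": "2s"}),
    (2000, {"name": "record3", "timestamp": "2s"}),
    (5000, {"name": "record4", "timestamp": "5s"}),
    (5000, {"name": "record5", "timestamp": "5s"}),
]

for batch in simulate_batch_records(arrivals, num_events=2, millis=2000):
    print([r["name"] for r in batch])
```

With num_events=2 and millis=2000, record1 and record2 fill a batch by count, record3 is sent alone because no second record arrives within 2 seconds, and record4 and record5 fill the last batch by count.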

SPL2 examples

Group records into batches that are sent when either 2 records have accumulated or 2 seconds have passed, whichever comes first

... | batch_records num_events=2L millis=2000L |...;
Last modified on 30 November, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.1.0, 1.2.0

