Splunk® Data Stream Processor

Function Reference

On April 3, 2023, Splunk Data Stream Processor will reach its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

Bin

This topic describes how to use the bin function in the Splunk Data Stream Processor.

Description

Puts continuous numerical values into discrete sets, or bins, by adjusting the value of <field> so that all of the items in a particular set have the same value.

Function Input/Output schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<S>>
This function outputs the same collection of records but with a different schema S.

Syntax

bin <span-options> <field> [AS <result>]

How the bin function works

Use the bin function to group records by the numerical values in a field. Suppose your incoming data looks like the following:

Body  Timestamp                Hour and minute  Minutes from first timestamp
1     2019-08-22 01:56:37.000  01:56
2     2019-08-22 01:58:21.000  01:58            2 minutes
3     2019-08-22 01:59:59.000  01:59            3 minutes
4     2019-08-22 02:03:16.000  02:03            7 minutes
5     2019-08-22 02:05:43.000  02:05            9 minutes
6     2019-08-22 02:09:38.000  02:09            13 minutes
7     2019-08-22 02:12:31.000  02:12            16 minutes

You decide to add a bin function to your pipeline that bins the streaming data using a 5-minute time span on the timestamp field.

...| bin span=5m timestamp;

The bin function groups the timestamps in the timestamp field into 5-minute intervals. The groups are:

Group  Timestamp values         Timestamp span range for each bin
1      2019-08-22 01:56:37.000  2019-08-22 01:56:37.000 --- 2019-08-22 02:01:36.000
       2019-08-22 01:58:21.000
       2019-08-22 01:59:59.000
2      2019-08-22 02:03:16.000  2019-08-22 02:01:37.000 --- 2019-08-22 02:06:36.000
       2019-08-22 02:05:43.000
3      2019-08-22 02:09:38.000  2019-08-22 02:06:37.000 --- 2019-08-22 02:11:36.000
4      2019-08-22 02:12:31.000  2019-08-22 02:11:37.000 --- 2019-08-22 02:16:36.000
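
The grouping in the table above can be reproduced with a short Python sketch. It only illustrates the table and is not DSP's implementation: it assumes the records arrive in timestamp order and that spans are counted from the first timestamp, as the table shows. The names group_by_span and span_range are hypothetical.

```python
from datetime import datetime, timedelta


def group_by_span(timestamps, span):
    """Return a 1-based group number per timestamp, counting spans from the first one."""
    first = timestamps[0]  # assumption: input is sorted, so this is the earliest value
    groups = []
    for ts in timestamps:
        index = (ts - first) // span  # whole spans elapsed since the first timestamp
        groups.append(index + 1)
    return groups


def span_range(first, span, group):
    """Return the inclusive (start, end) of a group at one-second resolution."""
    start = first + (group - 1) * span
    return start, start + span - timedelta(seconds=1)


FMT = "%Y-%m-%d %H:%M:%S"
timestamps = [
    datetime.strptime(text, FMT)
    for text in [
        "2019-08-22 01:56:37",
        "2019-08-22 01:58:21",
        "2019-08-22 01:59:59",
        "2019-08-22 02:03:16",
        "2019-08-22 02:05:43",
        "2019-08-22 02:09:38",
        "2019-08-22 02:12:31",
    ]
]

span = timedelta(minutes=5)
groups = group_by_span(timestamps, span)
print(groups)  # [1, 1, 1, 2, 2, 3, 4]

start, end = span_range(timestamps[0], span, 2)
print(start, "---", end)  # 2019-08-22 02:01:37 --- 2019-08-22 02:06:36
```

Each record keeps its own timestamp in this sketch; only the group number is derived, which makes it easy to compare against the Group column.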

Required arguments

field
Syntax: string
Description: The name of the field to bin. The value of the field must be a numerical data type.
Example: timestamp
span
Syntax: <time-specifier>
Description: Sets the size of each bin, using either a time-based span length or a logarithm-based span.
Example: 5m
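
To make a span value such as 5m concrete, the following Python sketch converts a span string into a fixed duration. It is not part of DSP: parse_span is a hypothetical helper, the unit spellings are copied from the timescale table later in this topic, and a bare integer is assumed to mean seconds, following the timescale default. Month and year units are left out because they do not have a fixed length.

```python
import re
from datetime import timedelta

# Unit spellings taken from the timescale table in this topic.
_UNIT_GROUPS = [
    (("us",), timedelta(microseconds=1)),
    (("ms",), timedelta(milliseconds=1)),
    (("cs",), timedelta(milliseconds=10)),
    (("ds",), timedelta(milliseconds=100)),
    (("s", "sec", "secs", "second", "seconds"), timedelta(seconds=1)),
    (("m", "min", "mins", "minute", "minutes"), timedelta(minutes=1)),
    (("h", "hr", "hrs", "hour"), timedelta(hours=1)),
    (("d", "day", "days"), timedelta(days=1)),
]
_UNITS = {name: length for names, length in _UNIT_GROUPS for name in names}


def parse_span(text):
    """Convert a span string such as '5m' into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([a-z]*)", text)
    if match is None:
        raise ValueError(f"not a span length: {text!r}")
    count = int(match.group(1))
    unit = match.group(2) or "sec"  # assumption: a bare integer means seconds
    if unit not in _UNITS:
        raise ValueError(f"unsupported time scale in this sketch: {unit!r}")
    return count * _UNITS[unit]


print(parse_span("5m"))  # 0:05:00
```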

Span options

log-span
Syntax: [<num>]log[<num>]
Description: Sets a logarithm-based span. The first number is a coefficient and the second number is the base. If the coefficient is supplied, it must be a real number that is greater than or equal to 1.0 and less than the base. If the base is supplied, it must be a real number strictly greater than 1.0.
Example: span=2log10
span-length
Syntax: <int>[<timescale>]
Description: The span of each bin. If you are discretizing based on the timestamp field, or if a timescale is supplied, the value is treated as a time range. Otherwise, the value is an absolute bin length.
timescale
Syntax: <subseconds> | <sec> | <min> | <hr> | <day> | <month> | <year>
Description: Time scale units, used when discretizing based on the timestamp field.
Default: sec
Time scale    Syntax                             Description
<subseconds>  us | ms | cs | ds                  Time scale in microseconds (us), milliseconds (ms), centiseconds (cs), or deciseconds (ds).
<sec>         s | sec | secs | second | seconds  Time scale in seconds.
<min>         m | min | mins | minute | minutes  Time scale in minutes.
<hr>          h | hr | hrs | hour                Time scale in hours.
<day>         d | day | days                     Time scale in days.
<month>       mon | month | months               Time scale in months.
<year>        y | year | years                   Time scale in years.
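
The log-span option described above can be illustrated with a short Python sketch. It assumes one plausible reading of the option, in which bin boundaries fall at coefficient * base^k for integer k, so that span=2log10 would give boundaries at 2, 20, 200, and so on. This topic does not spell out that formula, so treat it as an assumption. The function name log_bin is hypothetical, and both numbers are passed explicitly because this topic does not state their defaults.

```python
import math


def log_bin(value, coefficient, base):
    """Return the (lower, upper) boundaries of the log-based bin containing value.

    Assumption: boundaries sit at coefficient * base**k for integer k.
    """
    if not base > 1.0:
        raise ValueError("base must be strictly greater than 1.0")
    if not 1.0 <= coefficient < base:
        raise ValueError("coefficient must be >= 1.0 and < base")
    if value <= 0:
        raise ValueError("log-based bins are only defined for positive values")

    k = math.floor(math.log(value / coefficient, base))
    # Correct for floating-point error in the logarithm near a boundary.
    while coefficient * base ** (k + 1) <= value:
        k += 1
    while coefficient * base ** k > value:
        k -= 1
    return coefficient * base ** k, coefficient * base ** (k + 1)


print(log_bin(5, 2, 10))    # (2, 20)
print(log_bin(150, 2, 10))  # (20, 200)
```

The two correction loops matter for values that sit exactly on a boundary, where the computed logarithm can land slightly below the true integer.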

Optional arguments

aligntime
Syntax: aligntime=<time-specifier>
Description: Aligns the bin times to something other than base UTC time (epoch 0). The aligntime option is valid only for time-based discretization and is ignored if the span is in days, months, or years. The aligntime values earliest and latest are not supported.
Example: 4h
result
Syntax: AS <string>
Description: A new name for the field.
Example: time
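
The effect of aligntime can be sketched in Python as a shift of the bin boundaries. This is a minimal illustration, not DSP's implementation: it assumes that, without aligntime, boundaries fall on multiples of the span counted from epoch 0, and that aligntime moves every boundary by the given offset. The function bin_start and its parameter names are hypothetical, and the rule that aligntime is ignored for day, month, and year spans is not modeled.

```python
def bin_start(epoch_seconds, span_seconds, align_seconds=0):
    """Return the start of the bin containing epoch_seconds.

    Assumption: boundaries are at align_seconds + n * span_seconds for integer n.
    """
    spans_since_alignment = (epoch_seconds - align_seconds) // span_seconds
    return spans_since_alignment * span_seconds + align_seconds


HOUR = 3600

# Without alignment, 1-hour bins start on the hour.
print(bin_start(7000, HOUR))  # 3600

# Aligning to 20 minutes past the hour shifts every boundary by 1200 seconds.
print(bin_start(7000, HOUR, align_seconds=1200))  # 4800
```

Under this assumption, an alignment that is a whole multiple of the span, such as 3 hours with a 1-hour span, produces the same boundaries as no alignment at all.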

SPL2 example

Align the bins to 3 hours and set the span to 1-hour intervals from that time.

... | bin aligntime=3h span=1h timestamp | ...;
Last modified on 10 September, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.1.0

