All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator, which has been announced end-of-life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, we will no longer provide support for versions of DSP prior to DSP 1.4.0 after July 1, 2023. We advise all of our customers to upgrade to DSP 1.4.0 in order to continue to receive full product support from Splunk.
Bin
This topic describes how to use the Bin function in the Splunk Data Stream Processor.
Description
The Bin function puts continuous numerical values into discrete sets, or bins, by adjusting the value of <field> so that all of the items in a particular set have the same value.
Function Input/Output schema
- Function Input
collection<record<R>>
- This function takes in collections of records with schema R.
- Function Output
collection<record<S>>
- This function outputs the same collection of records but with a different schema S.
Syntax
- bin
- <span-options>
- <field> [AS <result>]
How the Bin function works
Use the Bin function to group records by the numerical values in a field. Suppose your incoming data looks like the following:
Body | Timestamp | Hour and minute | Minutes from first timestamp |
---|---|---|---|
1 | 2019-08-22 01:56:37.000 | 01:56 | |
2 | 2019-08-22 01:58:21.000 | 01:58 | 2 minutes |
3 | 2019-08-22 01:59:59.000 | 01:59 | 3 minutes |
4 | 2019-08-22 02:03:16.000 | 02:03 | 7 minutes |
5 | 2019-08-22 02:05:43.000 | 02:05 | 9 minutes |
6 | 2019-08-22 02:09:38.000 | 02:09 | 13 minutes |
7 | 2019-08-22 02:12:31.000 | 02:12 | 16 minutes |
You decide to add a Bin function to your pipeline that bins the streaming data using a 5-minute time span on the timestamp field.
... | bin span=5m timestamp;
The Bin function groups the timestamps in the timestamp field into 5-minute intervals. The groups are:
Group | Timestamp values | Timestamp span range for each bin |
---|---|---|
1 | 2019-08-22 01:56:37.000, 2019-08-22 01:58:21.000, 2019-08-22 01:59:59.000 | 2019-08-22 01:56:37.000 to 2019-08-22 02:01:36.000 |
2 | 2019-08-22 02:03:16.000, 2019-08-22 02:05:43.000 | 2019-08-22 02:01:37.000 to 2019-08-22 02:06:36.000 |
3 | 2019-08-22 02:09:38.000 | 2019-08-22 02:06:37.000 to 2019-08-22 02:11:36.000 |
4 | 2019-08-22 02:12:31.000 | 2019-08-22 02:11:37.000 to 2019-08-22 02:16:36.000 |
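The grouping arithmetic behind this walkthrough can be sketched in a few lines of Python. This is only an illustration, not how the Data Stream Processor implements Bin: the helper name `assign_bins` and its `anchor` parameter are hypothetical, and the sketch assumes that the 5-minute intervals are counted from the first timestamp in the sample data, as the span ranges in the walkthrough suggest.

```python
from datetime import datetime, timedelta


def assign_bins(timestamps, span, anchor=None):
    """Return (group_number, bin_start, bin_end) for each timestamp.

    Assumption for illustration: bins are `span` wide and counted from
    `anchor`, which defaults to the earliest timestamp in the input.
    """
    anchor = anchor if anchor is not None else min(timestamps)
    result = []
    for ts in timestamps:
        index = (ts - anchor) // span        # whole spans elapsed since the anchor
        start = anchor + index * span        # first second covered by this bin
        end = start + span - timedelta(seconds=1)  # last second covered by this bin
        result.append((index + 1, start, end))
    return result


FMT = "%Y-%m-%d %H:%M:%S.%f"
raw = [
    "2019-08-22 01:56:37.000",
    "2019-08-22 01:58:21.000",
    "2019-08-22 01:59:59.000",
    "2019-08-22 02:03:16.000",
    "2019-08-22 02:05:43.000",
    "2019-08-22 02:09:38.000",
    "2019-08-22 02:12:31.000",
]
timestamps = [datetime.strptime(s, FMT) for s in raw]
bins = assign_bins(timestamps, timedelta(minutes=5))
groups = [group for group, _, _ in bins]

for ts, (group, start, end) in zip(timestamps, bins):
    print(f"{ts}  group {group}  {start} to {end}")
```

With the sample data, the seven records land in groups 1, 1, 1, 2, 2, 3, and 4. Passing a different `anchor` shifts every bin boundary by the same offset, which is the general idea behind aligning bins to a chosen time.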
Required arguments
- field
- Syntax: string
- Description: The name of the field to bin. The value of the field must be a numerical data type.
- Example: timestamp
- span
- Syntax: <time-specifier>
- Description: Sets the size of each bin, using a span length based on time or log-based span.
- Example: 5m
Span options
- log-span
- Syntax: <num>log<num>
- Description: Sets a logarithm-based span. The first number is a coefficient. The second number is the base. The coefficient must be a real number greater than or equal to 1.0 and less than the base.
- Example: span=2log10
- span-length
- Syntax: <int><timescale>
- Description: The span of each bin. If discretizing based on the timestamp field or used with a timescale, this is treated as a time range. If not, this is an absolute bin length.
- timescale
- Syntax: <subseconds> | <sec> | <min> | <hr> | <day> | <month> | <year>
- Description: Time scale units. Applies when discretizing based on the timestamp field.
- Default: sec
Time scale | Syntax | Description |
---|---|---|
<subseconds> | us \| ms \| cs \| ds | Time scale in microseconds (us), milliseconds (ms), centiseconds (cs), or deciseconds (ds). |
<sec> | s \| sec \| secs \| second \| seconds | Time scale in seconds. |
<min> | m \| min \| mins \| minute \| minutes | Time scale in minutes. |
<hr> | h \| hr \| hrs \| hour | Time scale in hours. |
<day> | d \| day \| days | Time scale in days. |
<month> | mon \| month \| months | Time scale in months. |
<year> | y \| year \| years | Time scale in years. |
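To make the log-span option more concrete, here is a minimal Python sketch of one plausible reading of `<num>log<num>`. It assumes that bin boundaries fall at the coefficient multiplied by integer powers of the base, so `span=2log10` would give boundaries at 0.2, 2, 20, 200, and so on. That boundary placement is an assumption made for illustration and is not confirmed by this topic. The function name `log_bin_floor` is hypothetical.

```python
import math


def log_bin_floor(value, coefficient=2.0, base=10.0):
    """Return the lower boundary of the log-based bin that contains `value`.

    Assumption for illustration: boundaries sit at coefficient * base**k
    for integer k. The coefficient constraint mirrors the documented rule
    (1.0 <= coefficient < base).
    """
    if not (1.0 <= coefficient < base):
        raise ValueError("coefficient must be >= 1.0 and < base")
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    # Largest integer k such that coefficient * base**k <= value
    exponent = math.floor(math.log(value / coefficient, base))
    return coefficient * base ** exponent


samples = (0.5, 5, 150, 1999)
lower_bounds = [log_bin_floor(v) for v in samples]
for value, bound in zip(samples, lower_bounds):
    print(f"{value} falls in the bin starting at {bound}")
```

Values that lie exactly on a boundary can be sensitive to floating-point rounding in the logarithm, so a production implementation would need to handle those cases deliberately.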
Optional arguments
- aligntime
- Syntax: aligntime=<time-specifier>
- Description: Aligns the bin times to something other than base UTC time (epoch 0). The aligntime option is valid only when doing a time-based discretization. It is ignored if the span is in days, months, or years. The aligntime values earliest and latest are not supported.
- Example: 4h
- result
- Syntax: AS <string>
- Description: A new name for the field.
- Example: time
Example
An example of a common use case follows. This example assumes that you have added the function to your pipeline.
SPL2 Example: Align the bins to 3 hours and set the span to 1 hour intervals from that time
This example assumes that you are in the SPL View.
... | bin aligntime=3h span=1h timestamp | ...;
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6