Stream aggregation methods
Splunk Stream lets you apply aggregation to network data at capture time on the collection endpoint, before the data is sent to indexers. You can use aggregation to enrich your data with a variety of statistics that provide additional insight into activity on your network.
When you apply aggregation to a Stream, only the aggregated data is sent to indexers. Using aggregation can help you decrease both storage requirements and license usage.
Stream aggregate functions
Splunk Stream supports a subset of the aggregate functions provided by the SPL (Splunk Processing Language) stats command to calculate statistics based on fields in your network event data. You can apply aggregate functions to your data when you configure a stream in the Configure Streams UI.
Splunk Stream supports these aggregate functions:
- sum
- sum squared
- max
- min
- mean
- median
- mode
- sample standard deviation
- population standard deviation
- sample variance
- population variance
- distinct count
- distinct values
For more information on aggregate functions, see Statistical and charting functions in the Splunk Enterprise Search Reference.
How aggregates work
You apply aggregate functions to stream events over a user-defined time interval. When Stream calculates the selected aggregates, it groups events into aggregation buckets, with one bucket allocated for each unique value of the "Key" field (or unique combination of values if there are multiple "Key" fields). At the end of the time interval, the app emits an object that represents each bucket.
For example, to gain more insight into the amount of inbound HTTP traffic, you might select src_ip as a Key field and apply aggregate functions such as max, mean, and std dev (standard deviation) to the bytes_in field of an http stream, over a 60-second time interval.
Stream calculates these aggregates for the bytes_in field for each unique value of src_ip that appears in the http stream, over the specified time interval. Search results for these aggregates might appear as follows:
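The bucketing described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Stream's implementation and not actual search output. The event layout (dicts with a `timestamp` in seconds), the `aggregate` helper, the `window_start` key, and the exact function tokens in the output field names (`max`, `mean`, `stdev`) are assumptions made for the example.

```python
import statistics
from collections import defaultdict

def aggregate(events, key_field, target_field, interval=60):
    """Group events into (time window, key value) buckets; emit one dict per bucket."""
    buckets = defaultdict(list)
    for event in events:
        # Align each event to the start of its time interval.
        window_start = int(event["timestamp"] // interval) * interval
        buckets[(window_start, event[key_field])].append(event[target_field])

    results = []
    for (window_start, key), values in sorted(buckets.items()):
        results.append({
            "window_start": window_start,
            key_field: key,
            "count": len(values),  # number of raw events in the bucket
            f"max({target_field})": max(values),
            f"mean({target_field})": statistics.mean(values),
            # Sample standard deviation needs two or more values;
            # this sketch reports 0.0 for a single-event bucket (an assumption).
            f"stdev({target_field})": statistics.stdev(values) if len(values) > 1 else 0.0,
        })
    return results

# Hypothetical http events: two source IPs, two 60-second intervals.
events = [
    {"timestamp": 0, "src_ip": "10.0.0.1", "bytes_in": 100},
    {"timestamp": 10, "src_ip": "10.0.0.1", "bytes_in": 300},
    {"timestamp": 20, "src_ip": "10.0.0.2", "bytes_in": 50},
    {"timestamp": 65, "src_ip": "10.0.0.1", "bytes_in": 200},
]
results = aggregate(events, "src_ip", "bytes_in")
for row in results:
    print(row)
```

The four sample events produce three buckets: one per unique src_ip value within each 60-second interval.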
Aggregated field syntax
Aggregated fields in Splunk Stream version 6.6.0 and later have the following syntax:
function(field_name)
This is a change from version 6.5.x and earlier, where the aggregated field name matched the original field name (such as bytes_in) while actually containing the sum aggregate. To access the latest field aggregation capabilities in Splunk Stream, upgrade to Splunk Stream version 7.0.0. See Upgrade to Splunk Stream 7.0.0 in the Splunk Stream Installation and Configuration Manual.
To upgrade aggregated streams from earlier versions of the app to the new syntax in 6.6.0, Splunk Stream provides a migration script that runs automatically when you upgrade to version 6.6.0. For more information,
About the count field
Each aggregated event has a single count field that reflects the total number of raw events aggregated. For example, a search result that displays count: 73 represents 73 raw events aggregated into that result, as shown:
About the values aggregate
The values aggregate function produces a list (JSON array) of distinct values of the target field, even if the list contains a single entry. The values in the array are sorted in alphabetical order for text fields and in ascending order for numeric fields.
For example, you can apply the values aggregate to the time_taken field in an http stream to get a list of values for the number of microseconds it took to complete each flow event over the selected time interval. Search results for the values(time_taken) aggregate appear as follows:
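The distinct-and-sorted behavior can be illustrated with a short Python sketch. This is not actual search output and not Stream's code. The `values_aggregate` helper and the sample time_taken numbers are hypothetical, and Python's default string ordering (case-sensitive, by code point) only approximates alphabetical order.

```python
import json

def values_aggregate(field_values):
    """Return the distinct values as a sorted list, even when only one value exists."""
    return sorted(set(field_values))

# Hypothetical time_taken values (microseconds) seen during one interval.
time_taken = [5120, 873, 5120, 20441, 873]
numeric_result = values_aggregate(time_taken)
text_result = values_aggregate(["POST", "GET", "GET"])
single_result = values_aggregate([42])

print(json.dumps({"values(time_taken)": numeric_result}))
# prints {"values(time_taken)": [873, 5120, 20441]}
```

A single-entry input still yields a list (`[42]`), which matches the description of the aggregate always producing a JSON array.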
About the sum of squares aggregate
In version 6.5.x and earlier, any field X which was being aggregated had a corresponding field psrsvd_ss_X which contained the sum of squares of X. This field did not appear in the stream configuration, but was automatically generated. As of version 6.6.0, the corresponding field is called sumsq(X), and can be selected for generation in the same way as any other aggregation method. (See Configure Streams UI.)
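One reason a sum-of-squares field is useful: together with count and sum, it is enough to compute variance and standard deviation across several aggregated events without access to the raw events. The sketch below uses standard statistics, not a documented Stream procedure. It assumes each aggregated event carries `count`, `sum(X)`, and `sumsq(X)` fields; the `combine` helper and the sample numbers are hypothetical.

```python
import math

def combine(aggregated_events, field):
    """Derive overall mean, variance, and stdev from per-bucket count, sum, and sumsq."""
    n = sum(e["count"] for e in aggregated_events)
    total = sum(e[f"sum({field})"] for e in aggregated_events)
    total_sq = sum(e[f"sumsq({field})"] for e in aggregated_events)

    mean = total / n
    # Population variance: E[X^2] - (E[X])^2
    pop_var = total_sq / n - mean ** 2
    # Sample variance divides by n - 1, so it is undefined for a single event.
    sample_var = (total_sq - n * mean ** 2) / (n - 1) if n > 1 else None
    return {
        "count": n,
        "mean": mean,
        "pop_var": pop_var,
        "sample_var": sample_var,
        "sample_stdev": math.sqrt(sample_var) if sample_var is not None else None,
    }

# Two aggregated events for the same key: raw bytes_in values [100, 300] and [200].
aggregated = [
    {"count": 2, "sum(bytes_in)": 400, "sumsq(bytes_in)": 100000},
    {"count": 1, "sum(bytes_in)": 200, "sumsq(bytes_in)": 40000},
]
combined = combine(aggregated, "bytes_in")
print(combined)
```

For the sample data, the combined result matches what you would get from the three raw values 100, 300, and 200 directly: a mean of 200 and a sample standard deviation of 100.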
This documentation applies to the following versions of Splunk Stream™: 7.1.2, 7.1.3, 7.2.0