Stream aggregation methods
Splunk Stream lets you apply aggregation to network data at capture time on the collection endpoint, before the data is sent to indexers. You can use aggregation to enhance your data with a variety of statistics that provide additional insight into activities on your network.
When you apply aggregation to a stream, only the aggregated data is sent to indexers, so aggregation can help you decrease both storage requirements and license usage.
Stream aggregate functions
Splunk Stream supports a subset of the aggregate functions provided by the SPL (Splunk Processing Language) stats command to calculate statistics based on fields in your network event data. You can apply aggregate functions to your data when you configure a stream in the Configure Streams UI.
Splunk Stream supports these aggregate functions:
- sum squared
- sample standard deviation
- population standard deviation
- sample variance
- population variance
- distinct count
- distinct values
For more information on aggregate functions, see Statistical and charting functions in the Splunk Enterprise Search Reference.
How aggregates work
You apply aggregate functions to stream events over a user-defined time interval. When Stream calculates the selected aggregates, it groups events into aggregation buckets, with one bucket allocated for each unique value of the "Key" field (or unique combination of values if there are multiple "Key" fields). At the end of the time interval, the app emits an object that represents each bucket.
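The bucketing described above can be sketched in a few lines of Python. This is only an illustration of the grouping logic, not Splunk Stream's implementation: the aggregate_events function, the timestamp field, and the window_start output field are hypothetical names chosen for the example, and the output field name sum(bytes_in) is an assumed spelling.

```python
from collections import defaultdict

def aggregate_events(events, key_fields, target, interval=60):
    """Group events into one bucket per (time window, key value combination).

    Hypothetical helper: 'timestamp' is assumed to be seconds since some epoch.
    """
    buckets = defaultdict(list)
    for event in events:
        window = int(event["timestamp"] // interval)
        key = tuple(event[field] for field in key_fields)
        buckets[(window, key)].append(event[target])

    # Emit one object per bucket, as happens at the end of each interval.
    emitted = []
    for (window, key), values in sorted(buckets.items()):
        obj = dict(zip(key_fields, key))
        obj["window_start"] = window * interval
        obj["count"] = len(values)
        obj[f"sum({target})"] = sum(values)
        emitted.append(obj)
    return emitted

events = [
    {"timestamp": 5, "src_ip": "10.0.0.1", "bytes_in": 100},
    {"timestamp": 20, "src_ip": "10.0.0.2", "bytes_in": 50},
    {"timestamp": 42, "src_ip": "10.0.0.1", "bytes_in": 300},
    {"timestamp": 61, "src_ip": "10.0.0.1", "bytes_in": 70},
]
results = aggregate_events(events, ["src_ip"], "bytes_in")
```

With a 60 second interval, the four sample events fall into three buckets: two for the first interval (one per src_ip value) and one for the second.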
For example, to gain more insight into the amount of inbound HTTP traffic, you might select src_ip as a Key field and apply aggregate functions, such as std dev (standard deviation), to the bytes_in field of an http stream over a 60-second time interval.
Stream calculates these aggregates for the bytes_in field for each unique value of src_ip that appears in the http stream, over the specified time interval. Search results for these aggregates might appear as follows:
Aggregated field syntax
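Aggregated fields in Splunk Stream version 6.6.0 and later are named with the aggregate function followed by the source field in parentheses, for example values(time_taken) or sumsq(X).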
This is a change from version 6.5.x and earlier, where the aggregated field name matched the original field name (such as bytes_in) while actually containing the sum aggregate. To access the latest field aggregation capabilities in Splunk Stream, upgrade to Splunk Stream version 7.0.0. See Upgrade to Splunk Stream 7.0.0 in the Splunk Stream Installation and Configuration Manual.
To upgrade aggregated streams from earlier versions of the app to the new syntax in 6.6.0, Splunk Stream provides a migration script that runs automatically when you upgrade to version 6.6.0.
About the count field
Each aggregated event has a single count field that reflects the total number of raw events aggregated. For example, a search result that displays count: 73 summarizes 73 raw events.
About the values aggregate
The values aggregate function produces a list (JSON array) of distinct values of the target field, even if the list contains a single entry. The values in the array are sorted in alphabetical order for text fields and in ascending order for numeric fields.
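As a rough illustration of this behavior, the following Python sketch deduplicates a list of field values, sorts them, and serializes the result as a JSON array. The values_aggregate function is a hypothetical name, and Python's default string ordering (by code point, so uppercase sorts before lowercase) is only an approximation of whatever alphabetical ordering Stream applies.

```python
import json

def values_aggregate(raw_values):
    """Return the distinct values of a field as a sorted list.

    Hypothetical helper: numbers sort ascending, strings sort by code point.
    """
    return sorted(set(raw_values))

# Four raw time_taken values, one of them repeated.
time_taken = [1500, 320, 1500, 87]
distinct = values_aggregate(time_taken)
as_json = json.dumps(distinct)

# A single distinct value still yields a one-element array.
single = json.dumps(values_aggregate(["GET", "GET"]))
```

Here distinct holds three entries in ascending order, and single is still a JSON array even though it contains only one value.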
For example, you might apply the values aggregate to the time_taken field in an http stream to get a list of values for the number of microseconds it took to complete each flow event over the selected time interval. Search results for the values(time_taken) aggregate appear as follows:
About the sum of squares aggregate
In version 6.5.x and earlier, any field X which was being aggregated had a corresponding field psrsvd_ss_X which contained the sum of squares of X. This field did not appear in the stream configuration, but was automatically generated. As of version 6.6.0, the corresponding field is called sumsq(X), and can be selected for generation in the same way as any other aggregation method. (See Configure Streams UI.)
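One reason a sum of squares is useful is that, together with the count and the sum, it is enough to recover the variance and standard deviation of a field without the raw events. The Python sketch below shows the standard identity. It is not taken from Splunk Stream, and the variance_from_sums function and its parameter names are hypothetical.

```python
import math

def variance_from_sums(count, total, sumsq, sample=True):
    """Compute variance from a count, a sum, and a sum of squares.

    Hypothetical helper using the identity
    sum((x - mean)^2) = sumsq - total^2 / count.
    """
    if count == 0 or (sample and count < 2):
        return None
    squared_deviations = sumsq - (total * total) / count
    # Guard against tiny negative results caused by floating-point rounding.
    squared_deviations = max(squared_deviations, 0.0)
    divisor = count - 1 if sample else count
    return squared_deviations / divisor

xs = [2, 4, 4, 4, 5, 5, 7, 9]
count = len(xs)                    # 8
total = sum(xs)                    # 40
sumsq = sum(x * x for x in xs)     # 232

population_variance = variance_from_sums(count, total, sumsq, sample=False)
population_stdev = math.sqrt(population_variance)
sample_variance = variance_from_sums(count, total, sumsq, sample=True)
```

For this sample data the population variance is 4.0 and the population standard deviation is 2.0. Because counts, sums, and sums of squares all add across buckets, the same calculation also works after combining several aggregated events.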
This documentation applies to the following versions of Splunk Stream™: 7.0.0, 7.0.1