Search Reference

 


streamstats

NOTE - Splunk version 4.x reached its End of Life on October 1, 2013. Please see the migration information.

The streamstats command, like the stats command, calculates summary statistics on search results. Unlike stats, which works on the results as a whole, streamstats calculates statistics for each event at the time the event is seen.
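The difference can be illustrated outside of Splunk. The following is a minimal Python sketch, not SPL; the list of values is hypothetical and stands in for one numeric field across five events, in the order the events are seen.

```python
from itertools import accumulate

# Hypothetical field values, in the order the events are seen.
values = [3, 1, 4, 1, 5]

# stats-style: a single summary for the result set as a whole.
total = sum(values)

# streamstats-style: one running summary per event, updated as each event is seen.
running = list(accumulate(values))

print(total)    # one number for all events
print(running)  # one number per event
```

The first result corresponds to a single row of output, while the second corresponds to a new field added to every event.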

Synopsis

Adds summary statistics to all search results in a streaming manner.

Syntax

streamstats [current=<bool>] [window=<int>] [global=<bool>] [allnum=<bool>] <stats-agg-term>* [<by clause>]

Required arguments

stats-agg-term
Syntax: <stats-func>( <evaled-field> | <wc-field> ) [AS <wc-field>]
Description: A statistical specifier, optionally renamed to a new field name. The specifier can be an aggregation function applied to a field or set of fields, or an aggregation function applied to an arbitrary eval expression.

Optional arguments

current
Syntax: current=<bool>
Description: If true, tells Splunk to include the given, or current, event in the summary calculations. If false, tells Splunk to use only the events that precede the current event. Defaults to true.
window
Syntax: window=<int>
Description: The 'window' option specifies the number of events to use when computing the statistics. Defaults to 0, which means that all previous (plus current) events are used.
global
Syntax: global=<bool>
Description: If the 'global' option is set to false and 'window' is set to a non-zero value, a separate window is used for each group of values of the group by fields. Defaults to true.
allnum
Syntax: allnum=<bool>
Description: If true, computes numerical statistics on each field if and only if all of the values of that field are numerical. Defaults to false.
by clause
Syntax: by <field-list>
Description: The name of one or more fields to group by.
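How current, window, and a by clause with global=false interact can be sketched in Python. This is an illustrative model under stated assumptions, not Splunk's implementation: the function name stream_avg and the sample events are hypothetical, only the avg function is modeled, and an event with no values to aggregate is represented as None.

```python
from collections import defaultdict, deque

def stream_avg(events, field, window=0, current=True, by=None):
    """Return a running average of `field`, one result per event.

    window=0 means unbounded. When `by` is given, each group keeps its
    own window, which models the global=false behaviour only.
    """
    history = defaultdict(deque)  # group value -> values kept for that group
    out = []
    for event in events:
        past = history[event.get(by) if by else None]
        if not current:
            sample = list(past)          # only events before this one
        past.append(event[field])
        if window and len(past) > window:
            past.popleft()               # keep at most `window` values
        if current:
            sample = list(past)          # window includes this event
        out.append(sum(sample) / len(sample) if sample else None)
    return out

events = [{"foo": 2}, {"foo": 4}, {"foo": 6}, {"foo": 8}]
all_events = stream_avg(events, "foo")
last_two = stream_avg(events, "foo", window=2)
previous_only = stream_avg(events, "foo", window=1, current=False)

grouped = stream_avg(
    [{"foo": 2, "bar": "a"}, {"foo": 10, "bar": "b"},
     {"foo": 4, "bar": "a"}, {"foo": 20, "bar": "b"}],
    "foo", by="bar",
)
```

In this model, window=1 combined with current=False yields the value from the immediately preceding event, which is a common way to compare an event with its predecessor.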

Stats functions options

stats-function
Syntax: avg() | c() | count() | dc() | distinct_count() | first() | last() | list() | max() | median() | min() | mode() | p<int>() | perc<int>() | per_day() | per_hour() | per_minute() | per_second() | range() | stdev() | stdevp() | sum() | sumsq() | values() | var() | varp()
Description: Functions used with the stats command. Each time you invoke the stats command, you can use more than one function; however, you can only use one by clause. For a list of stats functions with descriptions and examples, see "Functions for stats, chart, and timechart".

Description

The streamstats command is similar to the eventstats command except that it uses events before a given event to compute the aggregate statistics applied to each event. If you want to include the given event in the stats calculations, use current=true (which is the default).

Examples

Example 1

Each day you track unique users, and you'd like to track the cumulative count of distinct users. This example calculates the running total of distinct users over time.

eventtype="download" | bin _time span=1d as day | stats values(clientip) as ips dc(clientip) by day | streamstats dc(ips) as "Cumulative total"

The bin command breaks the time into days. The stats command calculates the distinct users (clientip) and user count per day. The streamstats command finds the running distinct count of users.

This search returns a table that includes: day, ips, dc(clientip), and Cumulative total.
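The running distinct count in this search can be modeled in Python as a growing set. This is a sketch with hypothetical data: the day labels and IP addresses below are invented, and each tuple stands in for one row produced by the stats command.

```python
# Hypothetical per-day rows: (day, set of client IPs seen that day).
daily_ips = [
    ("day1", {"10.0.0.1", "10.0.0.2"}),
    ("day2", {"10.0.0.2", "10.0.0.3"}),
    ("day3", {"10.0.0.1"}),
]

seen = set()  # every IP seen so far, across all days processed
rows = []
for day, ips in daily_ips:
    seen |= ips                              # fold this day's IPs into the running set
    rows.append((day, len(ips), len(seen)))  # day, dc(clientip), Cumulative total

for row in rows:
    print(row)
```

The cumulative column never decreases, and it grows only on days that introduce an IP address not seen before.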

Example 2

This example uses streamstats to produce hourly cumulative totals for category values.

... | timechart span=1h sum(value) as total by category | streamstats global=f sum(total) as accu_total

The timechart command buckets the events into spans of 1 hour and sums the value field for each category. The timechart command also fills NULL values, so that there are no missing values. The streamstats command then calculates the accumulated total.

Example 3

This example uses streamstats to figure out when a DHCP IP lease address changed for a MAC address, 54:00:00:00:00:00.

source=dhcp MAC=54:00:00:00:00:00 | head 10 | streamstats current=f last(DHCP_IP) as new_dhcp_ip last(_time) as time_of_change by MAC

You can also clean up the presentation to display a table of the DHCP IP address changes and the times they occurred.

source=dhcp MAC=54:00:00:00:00:00 | head 10 | streamstats current=f last(DHCP_IP) as new_dhcp_ip last(_time) as time_of_change by MAC | where DHCP_IP!=new_dhcp_ip | convert ctime(time_of_change) as time_of_change | rename DHCP_IP as old_dhcp_ip | table time_of_change, MAC, old_dhcp_ip, new_dhcp_ip
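The logic of this search can be sketched in Python. The events below are hypothetical and are listed newest first, on the assumption that search results arrive in reverse time order; that is why the previously seen event carries the newer address. The first event for a MAC has no predecessor and is skipped, which models the where clause dropping events that have no new_dhcp_ip value.

```python
MAC = "54:00:00:00:00:00"

# Hypothetical events, newest first.
events = [
    {"_time": 400, "MAC": MAC, "DHCP_IP": "10.0.0.9"},
    {"_time": 300, "MAC": MAC, "DHCP_IP": "10.0.0.9"},
    {"_time": 200, "MAC": MAC, "DHCP_IP": "10.0.0.7"},
    {"_time": 100, "MAC": MAC, "DHCP_IP": "10.0.0.5"},
]

previous = {}  # MAC -> the event seen just before the current one for that MAC
changes = []
for event in events:
    prev = previous.get(event["MAC"])  # models current=f ... by MAC
    if prev is not None and prev["DHCP_IP"] != event["DHCP_IP"]:
        changes.append({
            "time_of_change": prev["_time"],
            "MAC": event["MAC"],
            "old_dhcp_ip": event["DHCP_IP"],
            "new_dhcp_ip": prev["DHCP_IP"],
        })
    previous[event["MAC"]] = event

for change in changes:
    print(change)
```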

For more details, refer to the Splunk Blogs post for this example.

More examples

Example 1: Compute the average value of foo for each value of bar including only 5 events (specified by the window size) with that value of bar.

... | streamstats avg(foo) by bar window=5 global=f

Example 2: For each event, compute the average of field foo over the last 5 events (including the current event). This is similar to using trendline sma5(foo).

... | streamstats avg(foo) window=5

Example 3: This example adds to each event a count field that represents the number of events seen so far (including that event). For example, it adds 1 for the first event, 2 for the second event, etc.

... | streamstats count

If you didn't want to include the current event, you would specify:

... | streamstats count current=f
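The two variants differ only in whether the current event is counted. A small Python sketch of the resulting count field, using three hypothetical events:

```python
events = ["first", "second", "third"]  # hypothetical events, in the order seen

# streamstats count: the current event is included in the count.
count_with_current = [i + 1 for i, _ in enumerate(events)]

# streamstats count current=f: only the events before the current one are counted.
count_without_current = [i for i, _ in enumerate(events)]

print(count_with_current)
print(count_without_current)
```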

See also

accum, autoregress, delta, fillnull, eventstats, stats, trendline

Answers

Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has using the streamstats command.

This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 6.0, 6.0.1, 6.0.2, 6.0.3

