streamstats
The streamstats command, similar to the stats command, calculates summary statistics on search results. Unlike stats, which works on the results as a whole, streamstats calculates statistics for each event at the time the event is seen.
Synopsis
Adds summary statistics to all search results in a streaming manner.
Syntax
streamstats [current=<bool>] [window=<int>] [global=<bool>] [allnum=<bool>] <stats-agg-term>* [<by clause>]
Required arguments
- stats-agg-term
- Syntax: <stats-func>( <evaled-field> | <wc-field> ) [AS <wc-field>]
- Description: A statistical specifier, optionally renamed to a new field name. The specifier can be an aggregation function applied to a field or set of fields, or an aggregation function applied to an arbitrary eval expression.
Optional arguments
- current
- Syntax: current=<bool>
- Description: If true, tells Splunk to include the given, or current, event in the summary calculations. Defaults to true.
- window
- Syntax: window=<int>
- Description: Specifies the window size, in events, to use when computing the statistics. Defaults to 0, which means that all previous events (plus the current event, if current=true) are used.
- global
- Syntax: global=<bool>
- Description: If the 'global' option is set to false and 'window' is set to a non-zero value, a separate window is used for each group of values of the group by fields. Defaults to true.
- allnum
- Syntax: allnum=<bool>
- Description: If true, computes numerical statistics on each field if and only if all of the values of that field are numerical. Defaults to false.
- by clause
- Syntax: by <field-list>
- Description: The name of one or more fields to group by.
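The way current, window, and global interact can be easier to see in a small model. The following Python sketch is not Splunk code: streamstats_sum and its parameter global_window are hypothetical names, and the behavior is an interpretation of the option descriptions above. In particular, it assumes that with current=false the window covers only previous events.

```python
from collections import defaultdict


def streamstats_sum(events, field, by=None, window=0, current=True, global_window=True):
    """Toy model of `streamstats sum(<field>) [by <by>]`: one result per event."""
    results = []
    # One shared history when the window is global (or there is no by field),
    # otherwise one history per group value.
    histories = defaultdict(list)
    for event in events:
        group = event.get(by) if by else None
        key = group if (by and not global_window) else None
        history = histories[key]
        # current=True adds the event being processed to the candidates.
        candidates = history + [event] if current else list(history)
        # window=0 means "no limit"; otherwise keep only the trailing events.
        if window > 0:
            candidates = candidates[-window:]
        # Only events in the same group as the current event contribute.
        values = [e[field] for e in candidates if by is None or e.get(by) == group]
        results.append(sum(values) if values else None)
        history.append(event)
    return results


# Sample events (made up for illustration): group key "k", numeric value "v".
events = [
    {"k": "a", "v": 1},
    {"k": "b", "v": 2},
    {"k": "a", "v": 3},
    {"k": "b", "v": 4},
    {"k": "a", "v": 5},
]
```

In this model, with by="k" and window=2, global_window=True uses one window over the last two events of any group, so an event's group-mate has usually already dropped out of the window. With global_window=False, each group keeps its own two-event window.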
Stats functions options
- stats-function
- Syntax: avg() | c() | count() | dc() | distinct_count() | first() | last() | list() | max() | median() | min() | mode() | p<int>() | perc<int>() | per_day() | per_hour() | per_minute() | per_second() | range() | stdev() | stdevp() | sum() | sumsq() | values() | var() | varp()
- Description: Functions used with the stats command. Each time you invoke the stats command, you can use more than one function; however, you can only use one by clause. For a list of stats functions with descriptions and examples, see "Functions for stats, chart, and timechart".
Description
The streamstats command is similar to the eventstats command except that it uses events before a given event to compute the aggregate statistics applied to each event. If you want to include the given event in the stats calculations, use current=true (which is the default).
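As a rough illustration of that difference, the Python sketch below contrasts the two behaviors for an average. It is a simplified model under stated assumptions, not Splunk's implementation, and the function names eventstats_avg and streamstats_avg are made up for this example.

```python
def eventstats_avg(values):
    """Every event gets the same average, computed over the whole result set."""
    overall = sum(values) / len(values)
    return [overall] * len(values)


def streamstats_avg(values, current=True):
    """Each event gets the average over the events seen up to that point."""
    results, total, count = [], 0.0, 0
    for value in values:
        if current:
            # Include the event being processed before computing its result.
            total, count = total + value, count + 1
        results.append(total / count if count else None)
        if not current:
            # Otherwise the event only affects the results of later events.
            total, count = total + value, count + 1
    return results
```

For the values 2, 4, 6, the eventstats-style model attaches 4.0 to every event, while the streamstats-style model attaches 2.0, 3.0, 4.0 in turn. With current=False, the first event has no prior events to average, which the model represents as None.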
Example 1
Each day you track unique users, and you'd like to track the cumulative count of distinct users. This example calculates the running total of distinct users over time.
eventtype="download" | bin _time span=1d as day | stats values(clientip) as ips dc(clientip) by day | streamstats dc(ips) as "Cumulative total"
The bin command breaks the time into days. The stats command calculates the distinct users (clientip) and user count per day. The streamstats command finds the running distinct count of users.
This search returns a table that includes: day, ips, dc(clientip), and Cumulative total.
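The logic of this search can be sketched in Python as a running set union. This is only a model of the idea: the function name cumulative_distinct and the (day, ips) input shape are assumptions made for illustration, standing in for the per-day rows that stats produces.

```python
def cumulative_distinct(daily_ips):
    """For each day, report that day's distinct count and the running distinct count."""
    seen = set()  # every clientip seen on this or any earlier day
    rows = []
    for day, ips in daily_ips:
        seen.update(ips)
        rows.append({
            "day": day,
            "dc(clientip)": len(set(ips)),
            "Cumulative total": len(seen),
        })
    return rows
```

A user who returns on a later day raises that day's dc(clientip) but not the cumulative total, which is why the running figure cannot be obtained by simply summing the daily counts.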
Example 2
This example uses streamstats to produce hourly cumulative totals for category values.
... | timechart span=1h sum(value) as total by category | streamstats global=f sum(total) as accu_total by category
The timechart command buckets the events into spans of 1 hour and sums the values for each category. The timechart command also fills NULL values, so that there are no missing values. Then, the streamstats command calculates the accumulated total for each category value.
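The accumulation step can be modeled in Python as one running total per category. This is a sketch under assumptions: accumulate_by_category is a hypothetical helper, and the (hour, category, total) tuples stand in for the hourly totals produced by the earlier part of the search.

```python
from collections import defaultdict


def accumulate_by_category(rows):
    """Attach a per-category running total to each (hour, category, total) row."""
    running = defaultdict(int)  # category -> accumulated total so far
    out = []
    for hour, category, total in rows:
        running[category] += total
        out.append((hour, category, running[category]))
    return out
```

An hour with a filled-in total of 0 still produces a row, so the accumulated total for that category carries forward unchanged instead of leaving a gap.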
More examples
Example 1: Compute the average value of foo for each value of bar, including only the last 5 events with that value of bar.
... | streamstats avg(foo) by bar window=5 global=f
Example 2: For each event, compute the average of field foo over the last 5 events (including the current event). Similar to doing trendline sma5(foo).
... | streamstats avg(foo) window=5
Example 3: This example adds to each event a count field that represents the number of events seen so far (including that event). For example, it adds 1 for the first event, 2 for the second event, etc.
... | streamstats count
If you didn't want to include the current event, you would specify:
... | streamstats count current=f
See also
accum, autoregress, delta, fillnull, eventstats, stats, streamstats, trendline
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2
When would you want to group "by" a field and yet use "global=t" (which is the default)? I'm not sure what happens in this case. Or should you always use "global=f" whenever you have a "by"?