eventstats command overview
The SPL2 eventstats command generates summary statistics from fields in your events and places those statistics in a new field that is added to the original raw events.
Syntax
The command name and <stats-agg-term> are required. Arguments in square brackets are optional.
- eventstats
- [allnum=<bool>]
- <stats-agg-term>...
- [<by-clause>]
How the SPL2 eventstats command works
It's much easier to see what the SPL2 eventstats command does by working through examples that use a small set of simple events.
These examples use the from command to create a set of events. The streamstats and eval commands are used to create additional fields in the events.
Creating a set of events
Let's start by creating a set of four events by using a dataset literal.
| from [{"age":25, "city": "San Francisco"}, {"age": 39, "city": "Seattle"}, {"age":31, "city": "San Francisco"}, {"city": "Seattle"}]
| eval _time = now()
| streamstats count()
- The from command is used to create four results from a dataset literal. The dataset literal specifies fields and values for four events. The fields are "age" and "city".
- The last event does not contain the age field.
- The eval command is used to create the _time field, which contains the timestamp when the results were created.
- The streamstats command is used to create the count field. The streamstats command calculates a cumulative count for each event, at the time the event is processed.
The results of the search look like this:
_time | age | city | count |
---|---|---|---|
02 May 2022 18:32:07 | 25 | San Francisco | 1 |
02 May 2022 18:32:07 | 39 | Seattle | 2 |
02 May 2022 18:32:07 | 31 | San Francisco | 3 |
02 May 2022 18:32:07 | | Seattle | 4 |
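The cumulative count described above can be modeled in a few lines of Python. This is only an illustrative sketch, not SPL2: the running_count helper and the list-of-dictionaries representation of events are assumptions made for this example.

```python
# Illustrative model only: plain dictionaries stand in for events.
events = [
    {"age": 25, "city": "San Francisco"},
    {"age": 39, "city": "Seattle"},
    {"age": 31, "city": "San Francisco"},
    {"city": "Seattle"},  # the last event has no age field
]


def running_count(events):
    """Hypothetical helper: add a cumulative 'count' field, one event at a time."""
    results = []
    for position, event in enumerate(events, start=1):
        # Each event receives the number of events processed so far.
        results.append({**event, "count": position})
    return results


counted = running_count(events)
for row in counted:
    print(row.get("age"), row["city"], row["count"])
```

Each event gets its count as soon as it is processed, so later events never change the count of earlier ones.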
Using eventstats with a BY clause
The BY clause in the eventstats command is optional, but is used frequently with this command. The BY clause groups the generated statistics by the values in a field. You can use any of the statistical functions with the eventstats command to generate the statistics. See the Quick Reference for SPL2 Stats and Charting Functions.
In this example, the eventstats command generates the average age for each city. The generated averages are placed into a new field called avg(age).
The following search is the same as the previous search, with the eventstats command added at the end:
| from [{"age":25, "city": "San Francisco"}, {"age": 39, "city": "Seattle"}, {"age":31, "city": "San Francisco"}, {"city": "Seattle"}]
| eval _time = now()
| streamstats count()
| eventstats avg(age) BY city
- For San Francisco, the average age is 28 = (25 + 31) / 2.
- For Seattle, there is only one event with a value. The average is 39 = 39 / 1. The eventstats command places that average in every event for Seattle, including events that did not contain a value for age.
The results of the search look like this:
_time | age | avg(age) | city | count |
---|---|---|---|---|
02 May 2022 18:32:07 | 25 | 28 | San Francisco | 1 |
02 May 2022 18:32:07 | 39 | 39 | Seattle | 2 |
02 May 2022 18:32:07 | 31 | 28 | San Francisco | 3 |
02 May 2022 18:32:07 | | 39 | Seattle | 4 |
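To make the grouping behavior concrete, the following Python sketch models what the search does: it computes one average per city and then copies that average onto every event in the group. It is not SPL2 and does not reflect how the command is implemented. The eventstats_avg helper and the dictionary representation of events are assumptions made for this example.

```python
from collections import defaultdict

# Illustrative model only: plain dictionaries stand in for events.
events = [
    {"age": 25, "city": "San Francisco", "count": 1},
    {"age": 39, "city": "Seattle", "count": 2},
    {"age": 31, "city": "San Francisco", "count": 3},
    {"city": "Seattle", "count": 4},  # no age field
]


def eventstats_avg(events, field, by):
    """Hypothetical helper: average `field` per `by` group, added to every event."""
    # Pass 1: collect the values that are present in each group.
    values = defaultdict(list)
    for event in events:
        if field in event:
            values[event[by]].append(event[field])

    # Pass 2: copy the group average onto every event in that group,
    # including events that have no value for `field`.
    out_field = f"avg({field})"
    results = []
    for event in events:
        row = dict(event)
        group = values.get(event[by])
        if group:
            row[out_field] = sum(group) / len(group)
        results.append(row)
    return results


enriched = eventstats_avg(events, "age", "city")
for row in enriched:
    print(row["city"], row.get("age"), row["avg(age)"])
```

The number of events does not change: four events go in and four come out, each with one extra field.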
Renaming the new field
By default, the name of the new field that is generated is the name of the statistical calculation. In these examples, that name is avg(age). You can rename the new field using the AS keyword.
In the following search, the eventstats command has been adjusted to rename the new field to average age by city. Because the new name contains spaces, it is enclosed in single quotation marks.
| from [{"age":25, "city": "San Francisco"}, {"age": 39, "city": "Seattle"}, {"age":31, "city": "San Francisco"}, {"city": "Seattle"}]
| eval _time = now()
| streamstats count()
| eventstats avg(age) AS 'average age by city' BY city
The results of the search look like this:
_time | age | average age by city | city | count |
---|---|---|---|---|
02 May 2022 18:32:07 | 25 | 28 | San Francisco | 1 |
02 May 2022 18:32:07 | 39 | 39 | Seattle | 2 |
02 May 2022 18:32:07 | 31 | 28 | San Francisco | 3 |
02 May 2022 18:32:07 | | 39 | Seattle | 4 |
Events with text values
The previous examples show how the eventstats command processes an event that does not contain a value in the age field. Let's see how events are processed when the field that you want to use to generate statistics contains an alphabetic character value.
The following search includes the word test as a value in the age field.
| from [{"age":25, "city": "San Francisco"}, {"age": 39, "city": "Seattle"}, {"age":31, "city": "San Francisco"}, {"age":"test", "city": "Seattle"}]
| eval _time = now()
| streamstats count()
The results of the search look like this:
_time | age | city | count |
---|---|---|---|
02 May 2022 18:32:07 | 25 | San Francisco | 1 |
02 May 2022 18:32:07 | 39 | Seattle | 2 |
02 May 2022 18:32:07 | 31 | San Francisco | 3 |
02 May 2022 18:32:07 | test | Seattle | 4 |
Let's add the eventstats command to the search.
| from [{"age":25, "city": "San Francisco"}, {"age": 39, "city": "Seattle"}, {"age":31, "city": "San Francisco"}, {"age":"test", "city": "Seattle"}]
| eval _time = now()
| streamstats count()
| eventstats avg(age) BY city
The alphabetic values are treated like null values. The results of the search look like this:
_time | age | avg(age) | city | count |
---|---|---|---|---|
02 May 2022 18:32:07 | 25 | 28 | San Francisco | 1 |
02 May 2022 18:32:07 | 39 | 39 | Seattle | 2 |
02 May 2022 18:32:07 | 31 | 28 | San Francisco | 3 |
02 May 2022 18:32:07 | test | 39 | Seattle | 4 |
Using the allnum argument
But suppose you don't want statistics generated when there are alphabetic characters in the field or the field is empty?
The allnum argument controls how the eventstats command processes field values. The default setting for the allnum argument is FALSE, which means that the field used to generate the statistics does not need to contain all numeric values. Fields with empty values or alphabetic character values are ignored. You've seen this in the earlier examples.
You can force the eventstats command to generate statistics only when the fields contain all numeric values. To accomplish this, you can set the allnum argument to TRUE.
| from [{"age":25, "city": "San Francisco"}, {"age": 39, "city": "Seattle"}, {"age":31, "city": "San Francisco"}, {"age":"test", "city": "Seattle"}]
| eval _time = now()
| streamstats count()
| eventstats allnum=true avg(age) BY city
The results of the search look like this:
_time | age | avg(age) | city | count |
---|---|---|---|---|
02 May 2022 18:32:07 | 25 | 28 | San Francisco | 1 |
02 May 2022 18:32:07 | 39 | | Seattle | 2 |
02 May 2022 18:32:07 | 31 | 28 | San Francisco | 3 |
02 May 2022 18:32:07 | test | | Seattle | 4 |
Because the age field contains values for Seattle that are not all numbers, the entire set of values for Seattle is ignored. No average is calculated.
The allnum=true argument applies to empty values as well as alphabetic character values.
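The difference between the two allnum settings can be sketched in Python. This models only the behavior described in this topic and is not SPL2 or the command's implementation. The eventstats_avg and is_numeric helpers are assumptions made for this example, and the sketch treats only Python numbers as numeric.

```python
from collections import defaultdict
from numbers import Number

# Illustrative model only: plain dictionaries stand in for events.
events = [
    {"age": 25, "city": "San Francisco"},
    {"age": 39, "city": "Seattle"},
    {"age": 31, "city": "San Francisco"},
    {"age": "test", "city": "Seattle"},  # alphabetic value
]


def is_numeric(value):
    # Assumption for this sketch: only Python numbers (not bool) are numeric.
    return isinstance(value, Number) and not isinstance(value, bool)


def eventstats_avg(events, field, by, allnum=False):
    """Hypothetical helper modeling avg(field) BY by, with an allnum switch."""
    groups = defaultdict(list)
    for event in events:
        groups[event[by]].append(event.get(field))  # None when the field is missing

    averages = {}
    for key, values in groups.items():
        numeric = [v for v in values if is_numeric(v)]
        if allnum and len(numeric) != len(values):
            continue  # allnum=true: any empty or non-numeric value skips the group
        if numeric:
            averages[key] = sum(numeric) / len(numeric)

    results = []
    for event in events:
        row = dict(event)
        if event[by] in averages:
            row[f"avg({field})"] = averages[event[by]]
        results.append(row)
    return results


default_rows = eventstats_avg(events, "age", "city")              # allnum=false
strict_rows = eventstats_avg(events, "age", "city", allnum=True)  # allnum=true
print([row.get("avg(age)") for row in default_rows])
print([row.get("avg(age)") for row in strict_rows])
```

With the default setting, the Seattle events receive an average of 39 because the value test is ignored. With allnum set to true, the Seattle events receive no avg(age) field, while the San Francisco events still receive 28.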
See also
- eventstats command
- eventstats command syntax details
- eventstats command usage
- eventstats command examples
- Other commands
- stats command overview
- streamstats command overview
This documentation applies to the following versions of Splunk® Cloud Services: current