 Download topic as PDF

# Aggregation

The following are scalar functions that you can use in the `stats` and `aggregate with trigger` streaming functions to perform calculations over data in a given time-window.

## average(value)

Calculates the average (mean) of values in a time window.

Function Input
value: T
Function Output
double

### SPL2 example

The following example returns the average (mean) "size" for each distinct "host".

When working in the SPL View, you can write the function by using the following syntax.

`...| stats average(size) BY host, span(timestamp, 50s, 10s) |...; `

Alternatively, you can use named arguments.

`...| stats average(value: size) BY host, span(timestamp, 50s, 10s) |...; `

## count(value)

Returns the number of non-null values in a time window.

Function Input
value: any
Function Output
long

### SPL2 example

Returns the count of the "status_code" field.

When working in the SPL View, you can write the function by using the following syntax.

```...| stats count(status_code) by status_code, span(window_start, 5000ms, 1000ms, 1000ms) |...;
```

Alternatively, you can use named arguments.

```...| stats count(value: status_code) by status_code, span(window_start, 5000ms, 1000ms, 1000ms) |...;
```

## estdc(value)

Estimated Distinct Count (estdc) is a stats function that calculates an approximated distinct count value for any field. This function works with ~1.5% error bound.

Function Input
value: string
Function Output
long

### SPL2 example

Returns an estimated count of the distinct values in the `input` field.

When working in the SPL View, you can write the function by using the following syntax.

```... | stats estdc(input) by span(timestamp, 10ms);
```

Alternatively, you can use named arguments.

```... | stats estdc(value: input) by span(timestamp, 10ms);
```

## max(value)

Returns the maximum value in a time window.

Function Input
value: number
Function Output
number

### SPL2 example

Returns the maximum value of the "time_taken" field.

When working in the SPL View, you can write the function by using the following syntax.

```...| stats max(time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

Alternatively, you can use named arguments.

```...| stats max(value: time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

## mean(value)

Calculates the average (mean) of values in a time window.

Function Input
value: number
Function Output
double

### SPL2 example

Returns the average value of the "time_taken" field.

When working in the SPL View, you can write the function by using the following syntax.

```...| stats mean(time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

Alternatively, you can use named arguments.

```...| stats mean(value: time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

## min(value)

Returns the minimum value in a time window.

Function Input
value: number
Function Output
number

### SPL2 example

Returns the minimum value of the "time_taken" field.

When working in the SPL View, you can write the function by using the following syntax.

```...| stats min(time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

Alternatively, you can use named arguments.

```...| stats min(value: time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

## perc(value, q1, q2)

Percentiles (perc) is a stats function that computes the approximate q-th percentile value of a numeric field input field with ~1.5% error bound. The perc(input, 0.25, 0.5, 0.75) takes multiple quantile queries as inputs, and then outputs fields as a list with appropriate percentile values.

Function Input
value: number
q1: double
q2: double
Function Output
collection: double

### SPL2 examples

When working in the SPL View, you can write the function by using the following syntax.

```...| stats perc(input, 0.75) by span(timestamp, 10ms);
```
```...| stats perc(input, 0.5, 0.99) by span(timestamp, 10ms);
```

## sum(value)

Returns the sum of values in a time window.

Function Input
value: number
Function Output
number

### SPL2 example

Returns the sum of the "time_taken" field.

When working in the SPL View, you can write the function by using the following syntax.

```...| stats sum(time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```

Alternatively, you can use named arguments.

```...| stats sum(value: time_taken) by time_taken, span(timestamp, 50s, 10s) |...;
```