Splunk® Data Stream Processor

Function Reference

Adaptive Thresholding

Adaptive Thresholding detects anomalies (outliers) in streaming data. Two approaches are provided:

  • Distribution free (quantile estimation)
  • Gaussian (or other distribution based) estimation

Adaptive Thresholding dynamically generates threshold values based on observed data values. The default implementation of Adaptive Thresholding uses the Gaussian approach. The only difference between the distribution-free and Gaussian approaches is the assumptions made about the underlying data distribution. Among other optional parameters, you can specify the rolling window over which to compute the adaptive threshold values.

The output of the Adaptive Thresholding function is (1) the estimated quantile or Gaussian mean and standard deviation, and (2) the predicted label to classify outliers.

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<S>>
This function outputs collections of records with schema S.

Syntax

The required fields are in bold.

| adaptive_threshold algorithm="quantile" entity="key" value="input" window=-1L;

Required arguments

timestamp
Syntax: long
Description: The timestamp associated with the value.
Example: cast(time, "long")
value
Syntax: double
Description: The value to check for anomalies.
Example: "input"

Optional arguments

algorithm
Syntax: string
Description: The anomaly detection algorithm to use: "gaussian" or "quantile". Defaults to "gaussian".
Example: "quantile"
entity
Syntax: string
Description: The entity column for per-entity Adaptive Thresholding. If unset, all data is treated as belonging to a single entity.
Example: "key"
eps
Syntax: double
Description: The error parameter for maintaining the sliding window. Defaults to 0.01.
Example: 0.01
threshold
Syntax: double
Description: The threshold for flagging an outlier. Defaults to NaN, which lets the algorithm choose an appropriate default value.
Example: Double.NaN
window
Syntax: long
Description: The time window (in milliseconds from epoch) to train on. Defaults to -1.
Example: 1L
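
If you want to set the optional arguments explicitly, a stage such as the following is one possible sketch. It uses only the parameter names documented above; the field names "key" and "input" are the same placeholders used in the Syntax section, and the values shown for eps, threshold, and window are illustrative rather than recommended defaults.

| adaptive_threshold algorithm="gaussian" entity="key" value="input" eps=0.01 threshold=2.0 window=3600000L;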

Usage

For each observed data point, Adaptive Thresholding outputs a predicted label (a binary outlier classification) and the estimated quantile or Gaussian statistics. The distribution-free (quantile) approach produces the q-th quantile of the current data points. The distribution-based (Gaussian) approach produces the mean and variance of the current data points. Both approaches generate predicted labels that classify outliers.
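
Conceptually, and using notation that is not part of the function's output schema, the two labeling rules can be summarized as follows, where x is an observed value, \mu and \sigma are the rolling mean and standard deviation, n is a multiplier on the standard deviation, and Q_q is the estimated q-th quantile:

$$\text{Gaussian: } |x - \mu| > n\,\sigma \qquad\qquad \text{Quantile: } x > Q_q$$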

Adaptive Thresholding is frequently used to identify outliers in real time on numeric time series, such as metrics and KPIs. Adaptive Thresholding is useful for monitoring and evaluating the performance of a metric whose baseline values are subject to change.

For example, when monitoring the %CPU consumption of a server, you expect the base load to vary dynamically. Applying the Adaptive Thresholding function enables outlier detection over a rolling window (for example, one hour). With the Gaussian approach, the function estimates where in the distribution each observed data point lies, and predicted outliers correspond to observations that are more than n times (for example, more than 2 times) the standard deviation from the mean. With the distribution-free approach, the function computes the quantile of each observed data point, and predicted outliers correspond to observations that fall outside the n-th percentile (for example, above the 99th percentile).
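
As a sketch of the %CPU scenario, a stage like the following applies the default Gaussian approach per host over a one-hour rolling window (3,600,000 milliseconds). The field names "cpu_pct" and "host" are hypothetical placeholders and are not created elsewhere in this topic:

| adaptive_threshold algorithm="gaussian" entity="host" value="cpu_pct" window=3600000L;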

SPL2 example

The following example uses Adaptive Thresholding to detect anomalies in battery voltage:

| from splunk_firehose()
| eval json=cast(body, "string"),
    'json-map'=from_json_object(json),
    input=parse_double(ucast(map_get('json-map', "voltage"), "string", null)),
    time=ucast(map_get('json-map', "timestamp"), "string", null),
    timestamp=cast(time, "long"),
    key=""
| adaptive_threshold algorithm="quantile" entity="key" value="input" window=-1L;

Last modified on 30 October, 2020
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0

