Time Series Decomposition (STL)
The Time Series Decomposition (STL) algorithm automatically decomposes a stream of time series data into trend, seasonal, and remainder components in real time, supporting use cases such as demand forecasting and anomaly detection.
Time Series Decomposition (STL) implements a streaming version of the proven STL (seasonal and trend decomposition using Loess) approach. The model takes a single stream of numeric time series values as input. For each raw data point ingested, the model outputs three values: the trend, seasonal, and remainder components.
This version of the Time Series Decomposition (STL) algorithm separates only a single seasonality from the input time series. You must specify the estimated periodicity of the observed seasonality (for example, daily, weekly, or monthly).
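As a rough illustration of what the three outputs represent, the following Python sketch performs a classical additive decomposition: a centered moving average for the trend and per-phase averages for the seasonal component. This is not the streaming, Loess-based method the function actually uses; the function name and logic here are illustrative only.

```python
# Illustrative sketch only: a classical additive decomposition, not the
# streaming Loess-based STL used by the product. It shows what
# "trend + seasonal + remainder" means for a fixed seasonality.

def decompose(values, seasonality):
    n = len(values)
    half = seasonality // 2
    # Trend: centered moving average spanning one seasonal cycle.
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        trend[i] = sum(window) / len(window)
    # Seasonal: average detrended value at each phase of the cycle.
    phase_sums = [0.0] * seasonality
    phase_counts = [0] * seasonality
    for i in range(n):
        if trend[i] is not None:
            phase_sums[i % seasonality] += values[i] - trend[i]
            phase_counts[i % seasonality] += 1
    cycle = [phase_sums[p] / phase_counts[p] if phase_counts[p] else 0.0
             for p in range(seasonality)]
    seasonal = [cycle[i % seasonality] for i in range(n)]
    # Remainder: what is left after removing trend and seasonality.
    remainder = [values[i] - trend[i] - seasonal[i]
                 if trend[i] is not None else None
                 for i in range(n)]
    return trend, seasonal, remainder
```

By construction, trend + seasonal + remainder reconstructs the original value wherever the trend is defined; the edges of the series have no centered window, so their trend and remainder are undefined.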
Function Input/Output Schema
- Function Input
collection<record<R>>
- This function takes in collections of records with schema R.
- Function Output
collection<record<S>>
- This function outputs collections of records with schema S.
Syntax
| stl value="input" seasonality=100;
Required arguments
- seasonality
- Syntax: integer
- Description: Sets the periodicity of the seasonal pattern in the data.
- Example:
stl value="input" seasonality=1440
- timestamp
- Syntax: long
- Description: The timestamp associated with each value.
- Example:
| eval time=ucast(map_get('json-map', "Start Time"), "string", null),
       timestamp=cast(div(cast(get("time"), "long"), 1000000), "long");
- value
- Syntax: double
- Description: The numeric value that Time Series Decomposition (STL) is applied to.
- Example:
| eval value=parse_double(ucast(map_get('json-map', "Bytes Sent"), "string", null));
Optional arguments
- samplingRate
- Syntax: integer
- Description: Sets the sampling rate to use when timestamps arrive at irregular intervals.
- Example:
| stl value="input" seasonality=100 samplingRate=10;
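The documentation does not specify how samplingRate resamples irregular data, but the underlying problem is that an STL-style decomposition assumes evenly spaced observations. As a hypothetical illustration, the Python sketch below maps irregular (timestamp, value) pairs onto a regular grid by averaging all values that fall into the same fixed-width bucket; the function name and bucketing logic are assumptions, not the product's implementation.

```python
# Hypothetical illustration: irregular (timestamp, value) pairs must be
# mapped onto a regular grid before seasonal decomposition. Here, values
# landing in the same fixed-width bucket are averaged; empty buckets
# become None (a gap). This is not the product's actual resampling logic.

def resample(points, rate):
    """points: list of (timestamp, value) pairs; rate: bucket width in
    the same units as the timestamps."""
    buckets = {}
    for ts, v in points:
        buckets.setdefault(ts // rate, []).append(v)
    lo, hi = min(buckets), max(buckets)
    return [sum(buckets[k]) / len(buckets[k]) if k in buckets else None
            for k in range(lo, hi + 1)]
```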
Usage
For each observed data point, Time Series Decomposition (STL) computes a trend, seasonality, and residual value. This function can be applied to numeric time-series data, such as metrics or KPIs, to monitor for sudden changes or outliers in any of the three time-series components.
It can be challenging to identify anomalies in time-series metrics with high seasonality. For example, users monitoring web traffic may want to flag abnormally high activity that indicates an unexpected surge, or low activity that indicates a server is down. Traditional anomaly detection approaches may erroneously flag seasonal effects, such as quiet hours over the weekend or high-volume days mid-week, as anomalous.
To avoid these false alarms, use the Time Series Decomposition (STL) function to first separate the seasonality and trend from the numeric time series, leaving the residual. Then apply an anomaly detection model such as Adaptive Thresholding to the residual. This approach improves anomaly detection accuracy by identifying genuine outliers rather than flagging expected seasonal variation.
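The decompose-then-threshold pattern can be sketched as follows. This Python example is a simplified stand-in for the detection step, flagging residuals that fall outside assumed lower and upper empirical quantiles; the function name and quantile logic are illustrative assumptions, not the Adaptive Thresholding algorithm itself.

```python
# Simplified stand-in for quantile-based thresholding on STL residuals:
# after trend and seasonality are removed, flag residual values outside
# the chosen empirical quantiles. Assumed logic for illustration only.

def quantile_threshold(residuals, lower=0.05, upper=0.95):
    ordered = sorted(residuals)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]
    hi = ordered[int(upper * (n - 1))]
    return [r < lo or r > hi for r in residuals]
```

Because seasonal swings have already been absorbed by the trend and seasonal components, a quiet weekend or a busy mid-week day should produce a small residual and not be flagged.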
SPL2 examples
The following example uses Time Series Decomposition (STL) with Adaptive Thresholding on a test set:
| from read_csv("dataset") | extract_timestamp field="time" rules=[iso8601_timestamp()] | eval timestamp=parse_long(time), input=parse_double(value), key="" | stl value="input" seasonality=1440 | eval input=parse_double(residual) | adaptive_threshold algorithm="quantile" entity="key"
The following example uses Time Series Decomposition (STL) on a test set:
| from splunk_firehose() | eval json=cast(body, "string"), 'json-map'=from_json_object(json), input=parse_double(ucast(map_get('json-map', "Bytes Sent"), "string", null)), key=ucast(map_get('json-map', "Source Address"), "string", null), time=ucast(map_get('json-map', "Start Time"), "string", null), timestamp=cast(div(cast(get("time"), "long"), 1000000), "long") | stl value="input" seasonality=100;
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.0.1