Splunk® Enterprise

Search Reference

This documentation does not apply to the most recent version of Splunk.

predict

Description

The predict command forecasts future values of time-series data.

The command can fill in missing data in a time series and predict values for the next several time steps. It provides confidence intervals for all of its estimates. By default, the command adds a predicted value and the upper and lower bounds of a 95% confidence interval to each event in the time series.

Syntax

predict <variable_to_predict> [AS <newfield>] [<predict_options>]

Required arguments

<variable_to_predict>
Syntax: <field>
Description: The field name for the variable that you want to predict.

Optional arguments

<newfield>
Syntax: <string>
Description: A new name for the field that contains the predicted values of <variable_to_predict>.
<predict_options>
Syntax: algorithm=<algorithm_name> | correlate=<field> | future_timespan=<number> | holdback=<number> | period=<number> | lowerXX=<field> | upperYY=<field>
Description: Forecasting options. You can specify the options in any order.

Predict options

algorithm
Syntax: algorithm= LL | LLP | LLT | LLB | LLP5
Description: Specify the name of the forecasting algorithm to apply: LL (local level), LLP (seasonal local level), LLT (local level trend), LLB (bivariate local level), or LLP5 (which combines LLP and LLT). Each algorithm expects a minimum number of data points; for more information, see "Algorithm options" below.
Default: LLP5
correlate
Syntax: correlate=<field>
Description: For the bivariate model (algorithm=LLB), specifies the field to correlate against.
future_timespan
Syntax: future_timespan=<number>
Description: The number of time steps to predict into the future. Must be a non-negative number. Do not use this option when algorithm=LLB.
holdback
Syntax: holdback=<number>
Description: Specifies the number of data points at the end of the time series that are not used to build the model. For example, holdback=10 computes the prediction for the last 10 values. Typically, you use this option to compare the predicted values to the actual data. Required when algorithm=LLB.
lowerXX
Syntax: lower<int>=<field>
Description: Specifies a field name for the lower <int> percentage confidence interval. <int> is greater than or equal to 0 and less than 100.
Default: lower95. Together with the upper bound, this defines the interval in which 95% of predictions are expected to fall.
period
Syntax: period=<number>
Description: If algorithm is LLP or LLP5, specify the seasonal period of the time series data. If not specified, the period is estimated using the data's auto-correlation. If algorithm is not LLP or LLP5, this is ignored.
upperYY
Syntax: upper<int>=<field>
Description: Specifies a field name for the upper <int> percentage confidence interval. <int> is greater than or equal to 0 and less than 100.
Default: upper95. Together with the lower bound, this defines the interval in which 95% of predictions are expected to fall.
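
The period option notes that, when no period is given, the seasonal period is estimated from the data's auto-correlation. This topic does not describe how Splunk performs that estimate, so the following Python sketch only illustrates the general idea: it picks the lag at which the series correlates most strongly with a shifted copy of itself. The function name estimate_period and the sample counts are hypothetical.

```python
def estimate_period(values):
    """Return the lag (>= 2) with the highest autocorrelation, or None.

    Illustrative only: this is a generic autocorrelation estimate,
    not the estimator that the predict command uses.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    total = sum(c * c for c in centered)
    if total == 0:
        return None  # a constant series has no seasonality to detect

    best_lag, best_score = None, 0.0
    # Only lags up to half the series length leave enough overlap to compare.
    for lag in range(2, n // 2 + 1):
        overlap = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        score = overlap / total
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical daily counts with a weekly pattern (two busy days out of seven).
daily_counts = [10, 12, 11, 13, 12, 30, 28] * 8
print(estimate_period(daily_counts))  # 7
```

If you already know the seasonality of your data, specifying period directly avoids depending on any estimate.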

Algorithm options

All of the algorithms are variations of the Kalman filter. The algorithm names are: LL, LLP, LLT, LLB, and LLP5. Each algorithm expects a minimum number of data points. If too few effective data points are supplied, an error message is displayed. For instance, the field itself might have more than enough data points, but the number of effective data points might be small if the holdback is large.
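
The interaction between holdback and the minimum data point requirement can be sketched in Python. This is an illustration under a stated assumption, not Splunk's actual check: it assumes that the number of effective data points is the total number of points minus the holdback. The minimums come from the table in this section; LLP5 is left out because this topic does not state a minimum for it. The function names are hypothetical.

```python
# Minimum data points stated in this topic for the fixed-minimum algorithms.
MINIMUM_POINTS = {"LL": 2, "LLT": 3, "LLB": 2}


def minimum_points(algorithm, period=None):
    """Return the documented minimum number of data points for an algorithm."""
    if algorithm == "LLP":
        if period is None:
            raise ValueError("LLP needs a period to compute its minimum")
        return 2 * period  # LLP requires twice the period
    if algorithm in MINIMUM_POINTS:
        return MINIMUM_POINTS[algorithm]
    raise ValueError("no minimum is stated for " + algorithm)


def has_enough_points(total_points, holdback, algorithm, period=None):
    """Assumption: effective points = total points - holdback."""
    effective = total_points - holdback
    return effective >= minimum_points(algorithm, period)


print(has_enough_points(100, 10, "LLT"))           # True: 90 effective points
print(has_enough_points(100, 99, "LLT"))           # False: 1 effective point
print(has_enough_points(30, 20, "LLP", period=7))  # False: 10 is less than 14
```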

Algorithm option | Algorithm name | Description
LL | Local level | This is a univariate model with no trends and no seasonality. Requires a minimum of 2 data points.
LLP | Seasonal local level | This is a univariate model with seasonality. The periodicity of the time series is automatically computed. Requires the minimum number of data points to be twice the period.
LLT | Local level trend | This is a univariate model with trend but no seasonality. Requires a minimum of 3 data points.
LLB | Bivariate local level | This is a bivariate model with no trends and no seasonality. Requires a minimum of 2 data points. LLB uses one set of data to make predictions for another. For example, assume it uses dataset Y to make predictions for dataset X. If the holdback=10, this means LLB takes the last 10 data points of Y to make predictions for the last 10 data points of X.
LLP5 | | Combines LLT and LLP models for its prediction.
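
To give a feel for what a Kalman-filter-based local level model does, here is a minimal Python sketch of the textbook local level filter. It is not Splunk's implementation. The function name, the fixed noise variances obs_var and level_var, and the initialization from the first observed value are all assumptions made for illustration; a real implementation would typically estimate the variances from the data. The sketch skips the update step for missing values (None), which is how a filter of this kind can carry an estimate across gaps.

```python
import math
from statistics import NormalDist


def local_level_forecast(values, steps, obs_var=1.0, level_var=0.1,
                         confidence=0.95):
    """Return a list of (prediction, lower, upper) tuples for future steps.

    Textbook local level model: observation = level + noise, and the level
    follows a random walk. obs_var and level_var are assumed, fixed variances.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    level = None
    variance = None
    for value in values:
        if level is None:
            # Initialize from the first observed value.
            if value is not None:
                level, variance = float(value), obs_var + level_var
            continue
        if value is not None:
            # Update step: blend the prior level with the new observation.
            gain = variance / (variance + obs_var)
            level += gain * (value - level)
            variance *= 1.0 - gain
        # Prediction step: uncertainty grows by the level noise each step.
        variance += level_var
    if level is None:
        raise ValueError("need at least one observed value")

    forecasts = []
    for step in range(steps):
        spread = z * math.sqrt(variance + step * level_var + obs_var)
        forecasts.append((level, level - spread, level + spread))
    return forecasts


# A gap (None) in the series is skipped rather than treated as zero.
for prediction, lower, upper in local_level_forecast([5, 6, None, 5, 7], 3):
    print(round(prediction, 2), round(lower, 2), round(upper, 2))
```

With no trend or seasonal component, the predicted value stays flat while the interval widens for each additional step into the future.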

Confidence intervals

The lower and upper confidence interval parameters default to lower95 and upper95. This specifies a confidence interval where 95% of the predictions are expected to fall.

It is typical for some of the predictions to fall outside the confidence interval because:

  • The confidence interval does not cover 100% of the predictions.
  • The confidence interval is a probabilistic expectation, so actual results do not match the expectation exactly.
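
The two points above can be illustrated with a small simulation. This Python sketch assumes prediction errors that follow a standard normal distribution, which is a simplification chosen for illustration and not a statement about how the predict command models errors. It counts how often a simulated error lands inside the 95% interval.

```python
import random
from statistics import NormalDist


def coverage(confidence=0.95, trials=10000, seed=0):
    """Fraction of simulated errors that fall inside the confidence interval.

    Assumption: errors are standard normal. The seed makes the run repeatable.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    inside = sum(1 for _ in range(trials) if abs(rng.gauss(0.0, 1.0)) <= z)
    return inside / trials


print(coverage())
```

The printed fraction is close to 0.95 but not exactly 0.95, and roughly 5% of the simulated values fall outside the interval, which matches both points in the list.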

Examples

Example 1:

Predict future downloads based on the previous download numbers.

index=download | timechart span=1d count(file) as count | predict count


Example 2:

Predict the values of foo. Because no algorithm is specified, the search uses the default algorithm, which the algorithm option above lists as LLP5.

... | timechart span="1m" count AS foo | predict foo

Example 3:

Upper and lower confidence intervals do not have to match.

... | timechart span="1m" count AS foo | predict foo as fubar algorithm=LL upper90=high lower97=low future_timespan=10 holdback=20

Example 4:

Illustrates the LLB algorithm. The foo2 field is predicted by correlating it with the foo1 field.

... | timechart span="1m" count(x) AS foo1 count(y) AS foo2 | predict foo2 as fubar algorithm=LLB correlate=foo1 holdback=100

See also

trendline, x11

Answers

Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has about using the predict command.


This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18, 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13

