Splunk® Data Stream Processor

Function Reference

Sequential Outlier Detection

Sequential Outlier Detection identifies anomalous events in sequential time-series data. You can apply it to online streams for real-time monitoring to determine whether sequences or patterns of events in a time series are anomalous.

Sequential Outlier Detection predicts an output that indicates whether an observed sequence is anomalous. Lower predicted values indicate that an observed sequence is more likely to be anomalous. Higher predicted values indicate that an observed sequence is expected (not anomalous) based on previously observed data.

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<S>>
This function outputs collections of records with schema S.

Syntax

The required fields are in bold.

| detect_sequential_outliers value="input";

Required arguments

eventID
Syntax: string
Description: An ordered string where each subsequent letter represents the following event. For example, if your categorical time series has 10 different types of events, you can encode the first event as "1", second event as "2", third event as "3", and so forth.
Example: ucast(map_get('json-map', "eventCode"), "string", null),
timestamp=cast(time, "long")
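
The encoding described above can be sketched outside of the Data Stream Processor as follows. This is a minimal Python illustration, not part of the product: the helper name `encode_events` and the sample event names are hypothetical, and the only assumption is that each distinct event type is assigned the next string code ("1", "2", "3", ...) in order of first appearance.

```python
def encode_events(events):
    """Map each distinct event type to a string code, in order of first appearance.

    Hypothetical helper for illustration only; not a Data Stream Processor function.
    """
    codes = {}
    encoded = []
    for event in events:
        if event not in codes:
            # The first new event type becomes "1", the second "2", and so on.
            codes[event] = str(len(codes) + 1)
        encoded.append(codes[event])
    return encoded, codes


# Hypothetical raw event names.
raw_events = ["login", "list_files", "login", "download", "list_files"]
encoded, codes = encode_events(raw_events)
print(encoded)
print(codes)
```

The resulting list of string codes is the kind of categorical series that the function expects as its string input.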

Optional arguments

No optional arguments.

This algorithm contains three pre-tuned parameters: Markov Order, Prune Threshold, and Prune Trigger Count. These parameters do not require manual input and can be ignored by most users.

For users interested in understanding more about these pre-tuned parameters, they are as follows:

val markovOrder: Int = DEFAULT_MARKOV_ORDER
val pruneThreshold: Int = DEFAULT_PRUNE_THRESHOLD
val pruneTriggerCount: Int = DEFAULT_TRIGGER_COUNT

Usage

The input to Sequential Outlier Detection is a time series of events, such as logs of commands executed over time. The model predicts whether the observed sequence in a stream is anomalous in real time.

For each data point observed, Sequential Outlier Detection outputs a score between 0 and 1 that represents the probability that the preceding sequence is normal. The lower the score, the less often that sequence has been observed before, and the more likely it is to be anomalous. Scores closer to 1 correspond to a lower probability of anomaly (that is, the sequence is more likely to be normal).

The maximum length of the past sequence for which the algorithm computes a probability is set by the algorithm's default Markov order, which is 4 observations.
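
To make the scoring idea concrete, the following Python sketch shows a toy order-k Markov scorer. It is not the algorithm that the Data Stream Processor uses, and it omits pruning entirely. The class name `MarkovSequenceScorer`, the choice to score an unseen context as 0.0, and the sample stream are all assumptions made for illustration. The sketch uses an order of 2 to keep the example short, whereas the function's default Markov order is 4.

```python
from collections import defaultdict


class MarkovSequenceScorer:
    """Toy online scorer: estimates P(event | up to `order` previous events).

    Illustrative only; this is not the product's implementation.
    """

    def __init__(self, order=4):
        self.order = order
        # Maps a context tuple to the counts of events that followed it.
        self.counts = defaultdict(lambda: defaultdict(int))
        self.history = []

    def score_and_update(self, event):
        # Use at most `order` of the most recent events as the context.
        context = tuple(self.history[-self.order:])
        seen = self.counts[context]
        total = sum(seen.values())
        # Empirical conditional probability; a context never seen before
        # scores 0.0 (an assumption of this sketch).
        score = seen[event] / total if total else 0.0
        # Learn from the event only after it has been scored.
        seen[event] += 1
        self.history = (self.history + [event])[-self.order:]
        return score


scorer = MarkovSequenceScorer(order=2)
# Twelve events of a repeating pattern, then one event that breaks it.
stream = ["1", "2", "3"] * 4 + ["2"]
scores = [scorer.score_and_update(event) for event in stream]
print(scores[11])  # score of the last event that follows the pattern
print(scores[12])  # score of the event that breaks the pattern
```

In this sketch, events that continue a previously observed pattern receive scores near 1, and an event that has never followed its context receives a score near 0, which mirrors the interpretation of the predicted output described above.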

For example, a security user may want to identify suspicious activity from shell commands. To do so, they can use Sequential Outlier Detection to identify anomalous sequences in logs of commands executed over time. Each command may seem normal in isolation, but the sequence of commands can be used to identify suspicious activity. This approach to anomaly detection provides more context about the events and time series that are being monitored, improving the ability to detect abnormal sequences among event data.
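
The point that a command can look normal in isolation but unusual in sequence can be illustrated with a small Python sketch. The command log below is fabricated for the example, and the helper `transition_probability` is hypothetical. The sketch compares only how often a command appears overall against how often it follows one specific preceding command, a much simpler view than the multi-event context that the function considers.

```python
from collections import Counter

# Hypothetical command log: "cat" and "curl" are each common,
# but "curl" immediately after "cat" happens only once.
commands = ["ls", "cat"] * 20 + ["ls", "curl"] * 20 + ["ls", "cat", "curl"]

command_counts = Counter(commands)
# Count each (previous command, next command) pair in the log.
transition_counts = Counter(zip(commands, commands[1:]))


def transition_probability(prev, nxt):
    """Empirical P(nxt | prev) computed from the log above."""
    prev_total = sum(
        count for (first, _), count in transition_counts.items() if first == prev
    )
    return transition_counts[(prev, nxt)] / prev_total if prev_total else 0.0


overall_curl_share = command_counts["curl"] / len(commands)
print(round(overall_curl_share, 3))                      # share of all commands that are "curl"
print(round(transition_probability("cat", "curl"), 3))   # how often "curl" follows "cat"
print(round(transition_probability("cat", "ls"), 3))     # how often "ls" follows "cat"
```

In this fabricated log, "curl" accounts for roughly a quarter of all commands, yet it follows "cat" in fewer than one in twenty cases, so the pair stands out even though neither command does on its own.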

SPL2 example

The following example uses Sequential Outlier Detection to identify anomalous sequences of event codes:

| from splunk_firehose() 
| eval json=cast(body, "string"), 
	'json-map'=from_json_object(json), 
	input=ucast(map_get('json-map', "eventCode"), "string", null), 
	time=ucast(map_get('json-map', "timestamp"), "string", null), 
	key=ucast(map_get('json-map', "user"), "string", null), 
	timestamp=cast(time, "long") 
| detect_sequential_outliers value="input";
Last modified on 30 October, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0