anomalies
Use the anomalies command to look for events that you don't expect to find, based on the values of a field in a sliding set of events. The anomalies command assigns an unexpectedness score to each event in a new field named unexpectedness. Whether an event is considered anomalous depends on a threshold value that is compared against the calculated unexpectedness score. An event is considered unexpected, or anomalous, if its unexpectedness is greater than the threshold.
Note: After you run anomalies in the timeline Search view, add the unexpectedness field to your events list using the Pick fields menu.
Synopsis
Computes an unexpectedness score for an event.
Syntax
anomalies [threshold=num] [labelonly=bool] [normalize=bool] [maxvalues=int] [field=field] [blacklist=filename] [blacklistthreshold=num] [by-clause]
Optional arguments
- threshold
- Datatype: threshold=<num>
- Description: A number to represent the unexpectedness limit. If an event's calculated unexpectedness is greater than this limit, the event is considered unexpected or anomalous. Defaults to 0.01.
- labelonly
- Datatype: labelonly=<bool>
- Description: Specify how you want the output to be returned. The unexpectedness field is appended to all events. If set to true, no events are removed. If set to false, events that have an unexpectedness score less than the threshold (boring events) are removed. Defaults to false.
- normalize
- Datatype: normalize=<bool>
- Description: Specify whether or not to normalize numeric values. For cases where field contains numeric data that should not be normalized, but treated as categories, set normalize=false. Defaults to true.
- maxvalues
- Datatype: maxvalues=<int>
- Description: Specify the size of the sliding window of previous events to include when determining the unexpectedness of an event's field value. This number is between 10 and 10000. Defaults to 100.
- field
- Datatype: field=<field>
- Description: The field to analyze when determining the unexpectedness of an event. Defaults to _raw.
- blacklist
- Datatype: blacklist=<filename>
- Description: The name of a CSV file of events that is located in $SPLUNK_HOME/var/run/splunk/BLACKLIST.csv. Any incoming event that is similar to an event in the blacklist is treated as not anomalous (that is, uninteresting) and is given an unexpectedness score of 0.0.
- blacklistthreshold
- Datatype: blacklistthreshold=<num>
- Description: Specify a similarity score threshold for matching incoming events to blacklisted events. If the incoming event has a similarity score above the blacklistthreshold, it is marked as unexpected. Defaults to 0.05.
- by clause
- Syntax: by <fieldlist>
- Description: Used to specify a list of fields to segregate results for anomaly detection. For each combination of values for the specified field(s), events with those values are treated entirely separately.
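The way threshold, labelonly, maxvalues, and the by clause interact can be sketched in Python. This is only an illustration of the windowing and filtering described in the argument list, not Splunk's implementation: the function name anomalies, its parameters, and the toy_score scorer are hypothetical, and the real scoring algorithm is proprietary. The sketch also does not enforce the 10 to 10000 range for maxvalues.

```python
from collections import defaultdict, deque


def toy_score(window, value):
    """Hypothetical stand-in scorer: 1.0 if the value is not in the window, else 0.0."""
    return 0.0 if value in window else 1.0


def anomalies(events, field="_raw", by=None, threshold=0.01,
              labelonly=False, maxvalues=100, score=toy_score):
    """Score each event against a sliding window of previous field values."""
    # One independent window per combination of by-field values.
    windows = defaultdict(lambda: deque(maxlen=maxvalues))
    results = []
    for event in events:
        key = tuple(event.get(f) for f in by) if by else ()
        window = windows[key]
        value = event.get(field)
        scored = dict(event, unexpectedness=score(window, value))
        window.append(value)  # the deque drops the oldest value once it is full
        # labelonly=True keeps every event; otherwise keep only scores above the threshold.
        if labelonly or scored["unexpectedness"] > threshold:
            results.append(scored)
    return results


events = [
    {"group": "a", "_raw": "x"},
    {"group": "a", "_raw": "x"},
    {"group": "b", "_raw": "x"},
    {"group": "a", "_raw": "y"},
]
kept = anomalies(events, by=["group"])
labeled = anomalies(events, by=["group"], labelonly=True)
```

With the toy scorer, the second event is dropped from kept because its value was already seen in the window for group a, while the third event is retained because group b has its own, initially empty, window.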
Description
The algorithm that calculates the unexpectedness score of an event is proprietary. Roughly speaking, the score is based on the similarity of that event (X) to a set of previous events (P):
unexpectedness = [s(P and X) - s(P)] / [s(P) + s(X)]
Here, s() is a metric of how similar or uniform the data is. This formula provides a measure of how much adding X affects the similarity of the set of events and also normalizes for the differing event sizes.
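To make the shape of the formula concrete, here is a minimal Python sketch. The source does not define s(), so this sketch substitutes the zlib-compressed size of the text as an assumed stand-in. The function names s and unexpectedness, and the choice to join events with newlines, are likewise assumptions made only for illustration. Scores produced this way will not match the scores that Splunk reports.

```python
import zlib
from typing import Sequence


def s(text: str) -> int:
    """Assumed stand-in for s(): size in bytes of the zlib-compressed text."""
    return len(zlib.compress(text.encode("utf-8")))


def unexpectedness(previous: Sequence[str], event: str) -> float:
    """Evaluate [s(P and X) - s(P)] / [s(P) + s(X)] with the stand-in s()."""
    p = "\n".join(previous)
    p_and_x = p + "\n" + event
    return (s(p_and_x) - s(p)) / (s(p) + s(event))


previous = ["user=alice action=login status=ok"] * 20
repeat_score = unexpectedness(previous, "user=alice action=login status=ok")
novel_score = unexpectedness(
    previous, "kernel panic: unable to mount root fs on unknown-block(0,0)"
)
```

Under this stand-in, an event that repeats the previous events adds almost nothing to the compressed size of the set, so repeat_score is lower than novel_score. Dividing by s(P) + s(X) plays the normalizing role that the description attributes to the denominator.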
You can run the anomalies command again on the results of a previous anomalies search to further narrow down the results. Because each run operates over a window of 100 events by default, the second call to anomalies is approximately running over a window of 10,000 previous events.
Examples
Example 1: This example shows how you can tune the search for anomalies using the threshold value.
index=_internal | anomalies by group | search group=*
This search looks at events in the _internal index and calculates the unexpectedness score for sets of events that have the same group value. This means that the sliding set of events used to calculate the unexpectedness for each unique group value only includes events that have the same group value. The search command is then used to show only events that include the group field.
With the default threshold=0.01, some of the returned events may be very similar to one another. This next search increases the threshold a little:
index=_internal | anomalies threshold=0.03 by group | search group=*
With the higher threshold value, you can see at a glance that there is more distinction between each of the events (the timestamps and key/value pairs).
Also, you might not want to hide the events that are not anomalous. Instead, you can add another field to your events that tells you whether or not the event is interesting to you. One way to do this is with the eval command:
index=_internal | anomalies threshold=0.03 labelonly=true by group | search group=* | eval threshold=0.03 | eval score=if(unexpectedness>=threshold, "anomalous", "boring")
This search uses labelonly=true so that the boring events are still retained in the results list. The eval command is used to define a field named threshold and set it to the same value that was passed to anomalies (0.03). This has to be done explicitly because the threshold attribute of the anomalies command is not a field. The eval command is then used to define another new field, score, that is either "anomalous" or "boring" based on how the unexpectedness compares to the threshold value.
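The labeling step performed by the two eval commands can be mirrored in Python as a rough sketch. The function name label and the sample scores are hypothetical. Note that the eval expression compares with >=, so an event whose score is exactly equal to the threshold is labeled "anomalous", whereas the anomalies command itself is described as treating an event as anomalous only when unexpectedness > threshold.

```python
def label(events, threshold=0.03):
    """Add threshold and score fields, mirroring the two eval commands."""
    return [
        dict(
            event,
            threshold=threshold,
            score="anomalous" if event["unexpectedness"] >= threshold else "boring",
        )
        for event in events
    ]


# Hypothetical unexpectedness scores, as if returned with labelonly=true.
sample = [{"unexpectedness": 0.05}, {"unexpectedness": 0.01}, {"unexpectedness": 0.03}]
labeled_sample = label(sample)
```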
More examples
Example 1: Show most interesting events first, ignoring any in the blacklist 'boringevents'.
... | anomalies blacklist=boringevents | sort -unexpectedness
Example 2: Use with transactions to find regions of time that look unusual.
... | transaction maxpause=2s | anomalies
Example 3: Look for anomalies in each source separately. A pattern in one source does not affect whether an event is considered anomalous in another source.
... | anomalies by source
See also
anomalousvalue, cluster, kmeans, outlier
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2


