Detect Categorical Outliers
The Detect Categorical Outliers assistant finds events that may indicate interesting, unusual, or possibly dangerous activity. This assistant accepts non-numeric and multidimensional data, such as string identifiers and IP addresses.
Algorithm
- Probabilistic measures
Workflow
To detect categorical outliers, you input data and select the fields in which to look for unusual combinations or a coincidence of rare values. When many fields in an event have rare values, the event is flagged as an outlier. The basic steps are as follows:
- Enter a search to retrieve your data, then click the search button to run it.
- Select the fields you want to analyze. This list of fields is populated by the search you just ran.
- Click Detect Outliers.
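The idea behind these steps can be illustrated with a minimal Python sketch. It is not the toolkit's implementation: it assumes the selected fields are independent, scores each event by the sum of the log relative frequencies of its field values, and flags events whose score falls below a threshold. The function names, the sample events, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def score_events(events, fields):
    """Return one log-probability score per event; lower means rarer."""
    n = len(events)
    # Count how often each value occurs in each selected field.
    counts = {f: Counter(e.get(f) for e in events) for f in fields}
    scores = []
    for e in events:
        # Sum of log relative frequencies, treating fields as independent
        # (an assumption of this sketch).
        scores.append(sum(math.log(counts[f][e.get(f)] / n) for f in fields))
    return scores


def detect_outliers(events, fields, threshold):
    """Flag events whose score is below the (hypothetical) threshold."""
    return [s < threshold for s in score_events(events, fields)]


# Illustrative data: nine routine events and one with two rare values.
events = [{"user": "alice", "src_ip": "10.0.0.1"}] * 9
events = events + [{"user": "mallory", "src_ip": "203.0.113.7"}]

flags = detect_outliers(events, ["user", "src_ip"], threshold=-3.0)
```

In this sample, only the last event combines two rare values, so it is the only one flagged. An event with a single rare value among otherwise common ones scores closer to the rest, which matches the description of outliers as a coincidence of rare values.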
Interpret and validate
After you detect the outliers, review the results to see how many outliers are identified. Expect only a few outliers relative to the total number of events.
- Outliers: Shows the number of events flagged as outliers.
- Total Events: Shows the total number of events that were evaluated.
- Data and Outliers: Lists the events that are marked as outliers, along with the reason each event was marked.
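As a rough illustration of how these three views relate, the following Python sketch builds them from a list of events and a list of outlier flags. The function name, the output keys, and the "reason" rule (naming the field whose value is least frequent) are assumptions made for this example; the format of the reason shown by the assistant is not specified here.

```python
from collections import Counter


def summarize(events, fields, is_outlier):
    """Build an outlier count, a total count, and the flagged rows."""
    counts = {f: Counter(e.get(f) for e in events) for f in fields}
    rows = []
    for event, flagged in zip(events, is_outlier):
        if not flagged:
            continue
        # Hypothetical "reason": the field whose value occurs least often.
        rarest = min(fields, key=lambda f: counts[f][event.get(f)])
        rows.append({**event, "reason": rarest})
    return {"outliers": len(rows), "total_events": len(events), "rows": rows}


# Illustrative data: the last event has an unusual source address.
events = [{"user": "alice", "src_ip": "10.0.0.1"}] * 9
events = events + [{"user": "alice", "src_ip": "203.0.113.7"}]
flags = [False] * 9 + [True]

summary = summarize(events, ["user", "src_ip"], flags)
```

Comparing `summary["outliers"]` with `summary["total_events"]` gives a quick check that only a small share of events was flagged.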
Deploy outlier detection
Once you have detected outliers, you can take the following actions:
- Click the Open in Search button next to the Detect Outliers button to open a new Search tab, filled out with a search query that uses all data (not just the training set).
- Click the Show SPL button next to the Open in Search button to see the search query that was used to detect outliers. For example, you could use this same query on a different data set.
- Click the Schedule Alert button in a panel to set up an alert to detect when the number of outliers exceeds a certain value. After you save the alert, you can access it from the Scheduled Jobs > Alerts menu.
- Click any title to go to a new Search tab, filled out with a search query to replicate the outlier detection calculations.
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 1.3.0, 2.0.0