Splunk® Machine Learning Toolkit

User Guide

This documentation does not apply to the most recent version of Splunk® Machine Learning Toolkit. For documentation on the most recent version, go to the latest release.

Cluster Numeric Events Classic Assistant

Classic Assistants enable the creation of a machine learning model through a guided user interface. The Cluster Numeric Events Classic Assistant partitions events with multiple numeric fields into groups of events based on the values of those fields. The groupings aren't known in advance, so the learning is unsupervised.

The following visualization illustrates a clustering of humidity data results. This visualization is from the showcase example for Power Plant Operating Regimes.
This visualization shows four clusters, differentiated by color.
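
To make the idea of unsupervised grouping concrete, the following is a minimal sketch of k-means (Lloyd's iteration) in plain Python. It is not the toolkit's implementation. The function name, the `iterations` and `seed` parameters, and the sample events are all hypothetical and chosen only for illustration.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster numeric tuples into k groups with plain Lloyd's iteration."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centers picked from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each event joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical events with two numeric fields (for example, temperature and humidity).
events = [(20.1, 30.0), (20.5, 31.2), (19.8, 29.5),
          (35.0, 80.1), (34.2, 79.0), (35.9, 81.3)]
labels, centroids = kmeans(events, k=2)
```

No labels are supplied to the function. The groups emerge only from the distances between the field values, which is what makes the approach unsupervised.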

Algorithms

The Cluster Numeric Events Classic Assistant uses the following algorithms:

  • K-means
  • Birch
  • Spectral Clustering
  • DBSCAN

Cluster numeric events

To cluster numeric events, input data, optionally perform preprocessing, then select the algorithm to use for clustering and other parameters as necessary.

Before you begin

  • The Cluster Numeric Events Assistant offers the option to preprocess your data. Read up on the preprocessing algorithms available here: Preprocessing machine data.
  • The toolkit selects the K-means algorithm by default. Use this default if you aren't sure which algorithm is best for you. Read up on the other algorithm options here: Algorithms.
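
One common reason to preprocess before clustering is that distance-based algorithms are dominated by whichever field has the largest numeric range. The following sketch rescales each field to mean 0 and standard deviation 1 in plain Python. Treat it as an illustration of the idea, not as the behavior of any particular preprocessing step in the toolkit. The function name and sample values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each numeric column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant fields to avoid dividing by zero.
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]

# Hypothetical events: (pressure in Pa, humidity as a fraction).
rows = [(101000.0, 0.30), (101500.0, 0.80), (100500.0, 0.55)]
scaled = standardize(rows)
```

After rescaling, a difference in humidity counts as much as a difference in pressure when distances between events are computed.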

Workflow

Follow these steps for the Cluster Numeric Events Classic Assistant.

  1. From the MLTK navigation bar, select Classic > Assistants > Cluster Numeric Events.
  2. Run a search, including the selection of a date range.
  3. (Optional) Click + Add a step to add preprocessing steps.
  4. Select an algorithm from the Algorithm drop-down menu.
  5. Specify the Fields to use for clustering.
    If your data has been preprocessed, choose from the preprocessed fields.
  6. For K-means, Birch, and Spectral Clustering, specify the number of clusters to use.
    For DBSCAN, specify a value between 0 and 1 for eps (the size of the neighborhood).
    Smaller numbers result in more clusters.
  7. Type a name for the model in the Save the model as field.
    You must specify a name for the model in order to fit a model on a schedule or schedule an alert.

    You cannot save a model if you use the DBSCAN or Spectral Clustering algorithm.

  8. Click Cluster.
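
The effect of eps described in step 6 can be sketched with a minimal DBSCAN written in plain Python. This is an illustration only, not the toolkit's implementation. It assumes Euclidean distance and a hypothetical `min_samples` parameter, and the sample points are made up.

```python
def dbscan(points, eps, min_samples=2):
    """Minimal DBSCAN: returns one label per point, with -1 meaning noise."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point joins as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_samples:  # j is a core point, so keep expanding
                queue.extend(reach)
    return labels

def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {-1})

# Hypothetical events with two numeric fields scaled to the 0-1 range.
points = [(0.10, 0.10), (0.15, 0.10), (0.30, 0.10), (0.35, 0.10),
          (0.80, 0.90), (0.85, 0.90)]
wide = count_clusters(dbscan(points, eps=0.2))
narrow = count_clusters(dbscan(points, eps=0.1))
```

On this toy data, the smaller eps no longer bridges the gap between the second and third points, so `narrow` is larger than `wide`. On other data, a very small eps can instead leave many points as noise.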

Interpret and validate

After the numeric events are clustered, review the cluster visualization. The fields included in the visualization are listed on screen. You can add and remove fields, and click Visualize to change the visualization.

You can drag a selection rectangle around some of the points in a plot to see the corresponding points on the other plots.

The visualization displays a maximum of 1000 points, 20 series and 6 fields (1 label and 5 variables).

Deploy clustering

After you interpret and validate the clustering, deploy it.

Within the Classic Assistant framework

  1. Click the Schedule Training button to the right of Cluster to run the clustering on a schedule.
    You can set up a regular interval to deploy clustering, for example, once a week.

You cannot schedule clustering if you use the DBSCAN or Spectral Clustering algorithms.

Outside the Classic Assistant framework

  1. Click Open in Search to generate a New Search tab filled out with the search query that was used for the clustering. This new search opens in a new browser tab, away from the Classic Assistant. You can adjust the SPL directly and see results immediately. You can also save the query as a Report, Dashboard Panel, or Alert.
  2. Click Show SPL to generate a new window showing the search query that was used for the clustering. Copy the SPL here to use this same query on a different data set.
  3. Click Schedule Alert to set up an alert that is triggered when the number of events in a cluster exceeds a threshold you specify.

Alerts cannot be scheduled if you use the DBSCAN or Spectral Clustering algorithms.
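
The alert condition described in step 3 can be sketched as a count per cluster compared against a threshold. The following plain Python is a hypothetical illustration of that logic, not the search that the Assistant generates. The function name and the label values are assumptions.

```python
from collections import Counter

def clusters_over_threshold(labels, threshold):
    """Return {cluster: event count} for clusters with more than `threshold` events."""
    counts = Counter(labels)
    return {cluster: n for cluster, n in counts.items() if n > threshold}

# Hypothetical cluster labels, one per event, as a clustering run might produce.
labels = [0, 0, 1, 0, 2, 0, 1, 0]
triggered = clusters_over_threshold(labels, threshold=3)
if triggered:
    print("alert:", triggered)
```

An empty result means that no cluster exceeded the threshold, so no alert would fire.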

Once you navigate away from the Classic Assistant page, you cannot return to it through the Classic or Models tabs. Classic Assistants are great for generating SPL, but may not be ideal for longer-term projects.

For more information about alerts, see Getting started with alerts in the Splunk Enterprise Alerting Manual.

Last modified on 04 October, 2018

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 3.4.0, 4.0.0, 4.1.0, 4.2.0, 4.3.0

