Cluster Numeric Events

The Cluster Numeric Events assistant partitions events that have multiple numeric fields into groups, based on the values of those fields. Because the groupings aren't known in advance, the learning is unsupervised.
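To make the idea of grouping events by their numeric field values concrete, here is a minimal K-means-style sketch in plain Python. It is not the assistant's implementation. The function name `kmeans`, the sample events, and the choice of the first k points as initial centers are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points (tuples of numbers) into k clusters; returns one label per point."""
    # Assumption: the first k points serve as initial centers (deterministic, illustration only).
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical events, each with two numeric fields.
events = [(1.0, 1.1), (8.0, 8.2), (1.2, 0.9), (7.9, 8.1), (0.9, 1.0), (8.1, 7.9)]
labels = kmeans(events, k=2)
print(labels)
```

No labels are supplied up front: the groups emerge only from the field values, which is what makes the learning unsupervised.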

Algorithms

The Cluster Numeric Events assistant uses the following algorithms: K-means, DBSCAN, Birch, and Spectral Clustering.

Workflow

To cluster numeric events, enter a search to input data, optionally add preprocessing steps, then select the clustering algorithm and set any other parameters as needed.

Enter a search.

A preview of the data returned by the search is displayed.

Add preprocessing steps if desired.

See Preprocessing for information.

Select the algorithm to use for clustering.
Specify the fields to use.

If your data has been preprocessed, you should choose from the preprocessed fields.

For K-means, Birch, and Spectral Clustering, specify the number of clusters to use. For DBSCAN, specify a value between 0 and 1 for eps (the size of the neighborhood).

Smaller numbers result in more clusters.

Name the model if you want to save it.

You must specify a name for the model in order to schedule clustering or schedule an alert. This name and the settings you select are saved in the history in the Load Existing Settings tab. You cannot save a model if you use the DBSCAN or Spectral Clustering algorithm.

Click Cluster.
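The effect of the DBSCAN eps parameter described in the steps above can be sketched in plain Python. This is a simplified illustration, not the assistant's code. The `dbscan` and `count_clusters` functions, the `min_samples=2` setting, and the sample events (already scaled to the 0 to 1 range) are assumptions.

```python
import math

def dbscan(points, eps, min_samples=2):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels

def count_clusters(labels):
    return len(set(labels) - {-1})

# Hypothetical events with two numeric fields, scaled to the 0-1 range.
events = [(0.10, 0.10), (0.15, 0.12), (0.40, 0.40),
          (0.45, 0.42), (0.80, 0.80), (0.85, 0.82)]
counts = {eps: count_clusters(dbscan(events, eps)) for eps in (0.1, 0.4, 0.6)}
print(counts)
```

On this sample, the smallest eps keeps the three pairs of events apart, while larger values merge neighboring groups into fewer clusters.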

Interpret and validate

After the numeric events are clustered, review the cluster visualization. The fields included in the visualization are listed. You can add and remove fields, and then click Visualize to change the visualization.

You can drag a selection rectangle around some of the points in a plot to see the corresponding points on the other plots.

The visualization displays a maximum of 1000 points, 20 series and 6 fields (1 label and 5 variables).

Deploy clustering

Click the icon in the right part of the Cluster button to run the clustering on a schedule.

You can access the scheduled clustering from the Scheduled Jobs > Scheduled Training menu.

Next to the Cluster button, click Open in Search to open a new Search tab populated with the search query that was used to fit the model.
Click Show SPL next to the Cluster button to see the search query that was used for the clustering, annotated with explanatory comments.

You can use this same query on a different data set.

Click the Schedule Alert button beneath the cluster visualization to set up an alert that triggers when the number of events in a cluster exceeds a threshold you specify.

After you save the alert, you can access it from the Scheduled Jobs > Alerts menu. For more information about alerts, see Getting started with alerts in the Splunk Enterprise Alerting Manual. Alerts cannot be scheduled if you use the DBSCAN or Spectral Clustering algorithms or if you do not specify a name for the model.
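The alert condition described above, where the number of events in a cluster exceeds a threshold, can be sketched in plain Python. The alert itself is configured in the assistant and evaluated by Splunk. The function name `clusters_over_threshold` and the sample labels are hypothetical and only illustrate the comparison.

```python
from collections import Counter

def clusters_over_threshold(labels, threshold):
    """Return {cluster: event_count} for clusters whose size exceeds threshold."""
    counts = Counter(labels)
    return {c: n for c, n in counts.items() if n > threshold}

# Hypothetical cluster labels assigned to eight events.
labels = [0, 0, 1, 0, 2, 0, 1, 0]
triggered = clusters_over_threshold(labels, threshold=3)
print(triggered)
```

Here only cluster 0, with five events, exceeds the threshold of 3, so it is the only one that would trigger.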
