Splunk® Enterprise

Search Reference

Splunk version 4.x reached its End of Life on October 1, 2013.
This documentation does not apply to the most recent version of Splunk.

kmeans

Synopsis

Performs k-means clustering on selected fields.

Syntax

kmeans [kmeans-options]* <field-list>

Required arguments

field-list
Syntax: <field>, ...
Description: Specify the exact fields to use for the clustering. If none are specified, uses all numerical fields. Skips events with non-numerical fields.

Optional arguments

kmeans-options
Syntax: <reps>|<iters>|<t>|<k>|<cnumfield>|<distype>
Description: Options for the kmeans command.

kmeans options

reps
Syntax: reps=<int>
Description: Specify the number of times to repeat kmeans using random starting clusters. Defaults to 10.
iters
Syntax: maxiters=<int>
Description: Specify the maximum number of iterations allowed before failing to converge. Defaults to 10000.
t
Syntax: t=<num>
Description: Specify the algorithm convergence tolerance. Defaults to 0.
k
Syntax: k=<int> | <int>-<int>
Description: Specify as a scalar integer value or a range of integers. When provided as a single number, selects the number of clusters to use and produces events annotated with the cluster label. When expressed as a range, clustering is done for each of the cluster counts in the range and a summary of the results is produced. The summary reports the size of the clusters and a 'distortion' field that indicates how well the data fits the selected clusters. Values must be greater than 1 and less than maxkvalue (see Limits section). Defaults to 2.
cnumfield
Syntax: cfield=<field>
Description: Names the field to annotate the results with the cluster number for each event. Defaults to CLUSTERNUM.
distype
Syntax: dt=l1 | l1norm | cityblock | cb | l2 | l2norm | sq | sqeuclidean | cos | cosine
Description: Specify the distance metric to use. l1, l1norm, and cb are synonyms for cityblock. l2, l2norm, and sq are synonyms for sqeuclidean. cos is a synonym for cosine. Defaults to sqeuclidean.
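
To make the dt values concrete, the following Python sketch maps each synonym to its metric and computes the distance between two numeric vectors. It uses the textbook definitions of the three metrics. It is not Splunk's implementation, and the names DT_SYNONYMS and distance are hypothetical, chosen only for this illustration.

```python
import math

# Synonyms as listed for the dt option; formulas are the textbook definitions,
# assumed here for illustration only.
DT_SYNONYMS = {
    "l1": "cityblock", "l1norm": "cityblock", "cb": "cityblock",
    "cityblock": "cityblock",
    "l2": "sqeuclidean", "l2norm": "sqeuclidean", "sq": "sqeuclidean",
    "sqeuclidean": "sqeuclidean",
    "cos": "cosine", "cosine": "cosine",
}


def distance(a, b, dt="sqeuclidean"):
    """Distance between two equal-length numeric vectors under the named metric."""
    metric = DT_SYNONYMS[dt]
    if metric == "cityblock":
        # Sum of absolute coordinate differences.
        return sum(abs(x - y) for x, y in zip(a, b))
    if metric == "sqeuclidean":
        # Sum of squared coordinate differences (no square root).
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Cosine distance: 1 minus the cosine similarity (undefined for zero vectors).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm
```

Note that cosine distance compares direction only, so two vectors that differ just in magnitude are treated as identical, whereas cityblock and sqeuclidean both grow with the size of the coordinate differences.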

Description

Performs k-means clustering on the selected fields (or on all numerical fields if none are specified). Events in the same cluster are moved next to each other. Optionally, the cluster number for each event is displayed.
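
For readers unfamiliar with the algorithm, the following Python sketch shows one common way k-means is carried out (Lloyd's iteration with random restarts). The parameter names mirror the reps, maxiters, t, and k options, but the code is an illustration under stated assumptions, not Splunk's implementation. It assumes squared Euclidean distance and defines distortion as the sum of squared distances from each point to its assigned cluster center. The seed parameter is hypothetical and exists only to make the sketch repeatable.

```python
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, reps=10, maxiters=10000, t=0.0, seed=0):
    """Return (labels, distortion) for the best of `reps` random restarts."""
    rng = random.Random(seed)
    best = None
    for _ in range(reps):
        centers = rng.sample(points, k)  # random starting clusters
        for _ in range(maxiters):
            # Assignment step: label each point with its nearest center.
            labels = [min(range(k), key=lambda c: sqdist(p, centers[c]))
                      for p in points]
            # Update step: move each center to the mean of its members.
            new_centers = []
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:
                    new_centers.append(
                        tuple(sum(col) / len(members) for col in zip(*members)))
                else:
                    new_centers.append(centers[c])  # empty cluster: keep center
            shift = max(sqdist(a, b) for a, b in zip(centers, new_centers))
            centers = new_centers
            if shift <= t:  # centers moved no more than the tolerance
                break
        # Assumed definition: total squared distance to the assigned centers.
        distortion = sum(sqdist(p, centers[lab]) for p, lab in zip(points, labels))
        if best is None or distortion < best[1]:
            best = (labels, distortion)
    return best
```

Calling kmeans(points, k) for each k in a range and comparing the returned distortion values gives a rough analogue of the summary produced when k is given as a range. The restarts matter because a single random start can settle on a poor grouping; keeping the lowest-distortion run reduces that risk.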

Limits

The number of clusters to collect the values into, k, cannot exceed maxkvalue, which is specified in the [kmeans] stanza of limits.conf. The default is 1000.

When a range is given for the k option, the difference between the first and last cluster counts cannot exceed maxkrange, which is specified in the [kmeans] stanza of limits.conf. The default is 100.

These limits keep the computation from becoming unreasonably expensive.

The total number of values which are clustered by the algorithm (typically the number of input results) is limited by the maxdatapoints parameter in the [kmeans] stanza of limits.conf. If this limit is exceeded at runtime, a warning message displays in Splunk Web. This defaults to 100000000 or 100 million. This maxdatapoints limit is designed to avoid exhausting memory.
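
The rules for the k option can be summarized in a short Python sketch. The function name parse_k is hypothetical, and the checks only restate this topic's limits; how Splunk actually validates the option is not documented here. The text does not say whether k may equal maxkvalue, so the sketch assumes that it may.

```python
def parse_k(spec, maxkvalue=1000, maxkrange=100):
    """Return the cluster counts named by a k value such as "4" or "2-6".

    The defaults mirror the documented limits.conf defaults; the validation
    order and error messages are assumptions made for this illustration.
    """
    lo, sep, hi = spec.partition("-")
    first = int(lo)
    last = int(hi) if sep else first
    if first > last:
        raise ValueError("range start must not exceed range end")
    if first < 2:
        raise ValueError("k must be greater than 1")
    if last > maxkvalue:
        raise ValueError("k must not exceed maxkvalue")
    if last - first > maxkrange:
        raise ValueError("k range must not exceed maxkrange")
    return list(range(first, last + 1))
```

For example, parse_k("2-6") yields the five cluster counts 2 through 6, while a value such as "2-200" is rejected because the span of the range is larger than the default maxkrange.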

Examples

Example 1: Group search results into 4 clusters based on the values of the "date_hour" and "date_minute" fields.

... | kmeans k=4 date_hour date_minute

Example 2: Group results into 2 clusters based on the values of all numerical fields.

... | kmeans

See also

anomalies, anomalousvalue, cluster, outlier

Answers

Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has using the kmeans command.

This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18, 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15