Splunk® Enterprise

Search Manual

Download manual as PDF

Download topic as PDF

Detecting patterns

This section describes detecting patterns in your data. For a complete list of topics on detecting anomalies, finding and removing outliers, and time series forecasting see About advanced statistics, in this manual.

Detecting patterns in events

The cluster command is a powerful command for detecting patterns in your events. The command groups events based on how similar they are to each other. The cluster command groups events based on the contents of the _raw field, unless you specify another field.

When you use the cluster command, two new fields are appended to each event.

  • The cluster_count is the number of events that are part of the cluster. This is the cluster size.
  • The cluster_label specifies which cluster the event belongs to. For example, if the search returns 10 clusters, then the clusters are labeled from 1 to 10.

Anomalies come in small or large groups (or clusters) of events. A small group might consist of 1 or 2 login events from a user. An example of a large group of events might be a DDoS attack of thousands of similar events.

Use the cluster command parameters wisely

  • Use the labelonly=true parameter to return all of the events. If you use labelonly=false, which is the default, then only one event from each cluster is returned.
  • Use the showcount=true parameter so that a cluster_count field is added to all of the events. If showcount=false, which is the default, the event count is not added to the event.
  • The threshold parameter t adjusts the cluster sensitivity. The smaller the threshold value, the fewer the number of clusters.

Other commands to use with the cluster command

  • Use the dedup command on the cluster_label column to see the most recent grouped events within each cluster.
  • To group the events and make the results more readable, use the sort command with the cluster columns. Sort the cluster_count column based on the number of clusters.
    • For small groups of events, sort the cluster_count column in ascending order.
    • For large groups of events sort the cluster_count column in descending order.
    • Sort the cluster_label column in ascending order. Cluster labels are numeric. Sorting in ascending order organizes the events by label, in numerical order.

Return the 3 most recent events in each cluster

The following search uses the CustomerID in the sales_entries.log file. Setting showcount=true ensures that all events get a cluster_count. The cluster threshold is set to 0.7. Setting labelonly=true returns the incoming events. The dedup command is used to see the 3 most recent events within each cluster. The results are sorted in descending order to group the events.

source="/opt/log/ecommsv1/sales_entries.log" CustomerID | cluster showcount=true t=0.7 labelonly=true | table _time, cluster_count, cluster_label, _raw | dedup 3 cluster_label | sort -cluster_count, cluster_label, - _time

If you do not set labelonly=true, then only one event from each cluster is returned.

See also

About advanced statistics, dedup, cluster, sort

PREVIOUS
Detecting anomalies
  NEXT
About time series forecasting

This documentation applies to the following versions of Splunk® Enterprise: 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 7.0.0


Comments

Thanks for catching that Plaxos! I've fixed it.

Lstewart splunk, Splunker
December 15, 2015

Formatting problem with "code" tag in the "Use the cluster command parameters wisely" section.

Plaxos
December 13, 2015

Was this documentation topic helpful?

Enter your email address, and someone from the documentation team will respond to you:

Please provide your comments here. Ask a question or make a suggestion.

You must be logged into splunk.com in order to post comments. Log in now.

Please try to keep this discussion focused on the content covered in this documentation topic. If you have a more general question about Splunk functionality or are experiencing a difficulty with Splunk, consider posting a question to Splunkbase Answers.

0 out of 1000 Characters