This page is currently a work in progress; expect frequent near-term updates.
You can use the correlate command to see an overview of the co-occurrence between fields in your data. The results are presented as a matrix, where the cell at the intersection of two fields represents the percentage of times, expressed as a value up to 1.0, that the two fields exist in the same events.
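To make the matrix concrete, here is a minimal Python sketch of one way such a co-occurrence value could be computed. It is not Splunk's implementation. The function name `cooccurrence`, the sample events, and in particular the denominator (events that contain either field) are assumptions made for illustration; the text above does not specify how the percentage is normalized.

```python
from itertools import combinations


def cooccurrence(events):
    """Return {(field_a, field_b): value} for every pair of field names.

    ASSUMPTION (not confirmed by the documentation):
    value = (events containing both fields) / (events containing either field)
    """
    # Collect every field name that appears in at least one event.
    fields = sorted({name for event in events for name in event})
    matrix = {}
    for a, b in combinations(fields, 2):
        both = sum(1 for e in events if a in e and b in e)
        either = sum(1 for e in events if a in e or b in e)
        matrix[(a, b)] = both / either
    return matrix


# Hypothetical events: each dict maps field names to values.
events = [
    {"clientip": "10.0.0.1", "method": "GET", "referer": "-"},
    {"clientip": "10.0.0.2", "method": "POST", "referer": "-"},
    {"clientip": "10.0.0.3", "method": "GET"},
    {"clientip": "10.0.0.4", "method": "GET"},
]

matrix = cooccurrence(events)
print(matrix[("clientip", "method")])   # 1.0: the two fields always appear together
print(matrix[("clientip", "referer")])  # 0.5: referer exists in only half the events
```

Only the presence of a field matters in this sketch; the field values are never compared.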
Note: This command looks at the relationship among all the fields in a set of search results. If you want to analyze the relationship between the values of fields, refer to the contingency command, which counts the co-occurrence of pairs of field values in events.
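The distinction in the note can be illustrated with a short Python sketch that counts pairs of field values rather than checking for field presence. This only illustrates the idea of counting value pairs; it does not reproduce the contingency command's options or output format. The helper `value_pair_counts` and the sample events are hypothetical.

```python
from collections import Counter


def value_pair_counts(events, field_a, field_b):
    """Count how often each (value_a, value_b) pair occurs in the same event.

    Events that lack either field are skipped.
    """
    return Counter(
        (e[field_a], e[field_b])
        for e in events
        if field_a in e and field_b in e
    )


# Hypothetical events for illustration only.
events = [
    {"method": "GET", "status": "200"},
    {"method": "GET", "status": "404"},
    {"method": "POST", "status": "200"},
    {"method": "GET", "status": "200"},
    {"method": "GET"},  # no status field, so this event is not counted
]

counts = value_pair_counts(events, "method", "status")
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Here the result depends on what the values are, whereas a field-level co-occurrence matrix would only record that method and status appear together in four of the five events.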
Calculates the correlation between different fields.
correlate [type=cocur] [_metainclude=<bool>]
- Syntax: type=cocur
- Description: Type of correlation to calculate. Currently the only available option is the co-occurrence matrix, which contains the percentage of times that two fields exist in the same events. A cell value of 1.0 indicates that the two fields always exist together in the data.
- Syntax: _metainclude=<bool>
- Description: This is an internal option. Specifies whether to include the internal metadata fields (that start with '_') in the analysis. Defaults to
Example 1: Look at the co-occurrence between all fields in the
index=_internal | correlate
Here is a snapshot of the results:
Because there are different types of logs in the _internal index, you can expect to see that many of the fields do not co-occur.
Example 2: Calculate the co-occurrences between all fields in Web access events.
sourcetype=access_* | correlate
You expect all Web access events to share the same fields: clientip, referer, method, etc. But because sourcetype=access_* includes both the access_common and access_combined Apache log formats, you should see that the co-occurrence values for some pairs of fields are less than 1.0.
Example 3: Calculate the co-occurrences between all the fields in download events.
eventtype=download | correlate
The narrower your search is before you pass the results into correlate, the more likely it is that all the field pairs will have a correlation of 1.0 (co-occur in 100% of the search results). For these download events, you might be able to spot an issue depending on which pairs have less than 1.0 co-occurrence.
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 6.0