Transforming and reporting commands
This documentation does not apply to the most recent version of Splunk. Click here for the latest version.
Contents
- associate
- Syntax
- Arguments
- Examples
- chart
- Syntax
- Arguments
- Examples
- contingency
- Syntax
- Arguments
- Examples
- correlate
- Syntax
- Arguments
- Examples
- diff
- Syntax
- Arguments
- Examples
- format
- Syntax
- Arguments
- Examples
- rare
- Syntax
- Arguments
- Examples
- select
- Syntax
- Arguments
- Examples
- stats
- Syntax
- Arguments
- Examples
- timechart
- Syntax
- Arguments
- Examples
- top
- Syntax
- Arguments
- Examples
- xmlunescape
- Syntax
- Arguments
- Examples
Transforming and reporting commands
associate
This data-processing command identifies relationships between pairs of fields. It compares a field=value pair with a specified reference field=value pair.
Syntax
associate [associate-option]...
Arguments
associate-option
| associate-option= | action-option | supcnt-option | subfreq-option | improv-option | Associate command options. |
| action-option= | action= display | Specifies an action to perform on the associates. Currently, the only option for associate is to display the associates. |
| supcnt-option= | supcnt=integer(100) | Specifies the minimum number of times a reference field value pair must appear to be considered an associate. |
| subfreq-option= | supfreq=number(0.1) | Specifies the minimum frequency of reference key=value combinations, expressed as a fraction of the number of total number of results. |
| improv-option= | improv=number(0.5) | Sets the value that a target key must be greater than (in reference to the reference key=value pair) in order to be associated. |
Examples
chart
This data-processing command returns events in a tabular output suitable for charting (it does not have the x-axis designated as "time"). Chart creates a table with an arbitrary field as the x-axis (this is different from timechart, which generates a chart with _time as the x-axis). Chart fields are automatically converted to numerical values if necessary. Chart is automatically called during report on specific stat specifiers.
Syntax
chart [stat-specifiers] by x-axis-field [bucketing options]
Arguments
| x-axis-field | Specified field for the x-axis. |
stat-specifiers
| stat-specifiers= | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c= | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc= | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first= | first | Show the first "seen" value of a field. |
| last= | last | Show the last "seen" value of a field. |
| sum= | sum | Produce the sum of the values of the field. |
| min= | min (field) | FInd the minimum value of values in the specified field(s). |
| max= | max (field) | Find the maximum value of values in the specified field(s). |
| avg= | avg (field) | Find the average value of values in the specified field(s). |
| mean= | mean (field) | Find the mean value of values in the specified field(s). |
| mode= | mode (field) | Find the mode value of values in the specified field(s). |
| median= | median (field) | Find the median value of values in the specified field(s). |
| stdev= | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var= | var (field) | Find the variance of values in the specified field(s). |
| percXX= | percXX | Percentile, integer between 1 and 99 |
bucketing-option
| bucketing-option= | bins | span | type | fixedrange | cont | start | end | length | Discretization options. | |
| bins= | bins=integer(20) | Sets the maximum number of discrete bins to build. If using the _time field, the default=300. | |
| span= | span=integer span-length | Sets the size of each bucket. Example =span=10 or span=2d or span=5m
| |
| type= | type=(TIME | INT | NUM | CAT | AUTO) (AUTO) | Specifies the type of value in the field that is being discretized. Manually specify how sets are discretized.TIME = Time-based discretization. INT = Integer number discretization. NUM = Arbitrary number discretization. CAT = Categorical discretization. AUTO = Automatically diagnosed discretization. | |
| fixedrange= | fixedrange=T | F(T) | Applicable if bucketing by time. Setting to T causes the search-time boundaries to be used. | |
| cont= | cont=T | F (T) | When set, causes empty continuity bins to be added to the x-axis to make it uniform. | |
| start= | start=integer | Sets the minimum for numerical buckets. | |
| end= | end=integer | Sets the maximum for numerical buckets. | |
| length= | length=integer span-length | If using a timescale, specifies the time range. If not, specifies the absolute bucket length. |
span-length
| span-length= | ts-sec | ts-min | ts-hr | ts-day | ts-month | Time scale units |
| ts-sec= | s | sec | secs | second | seconds | Time scale in seconds. |
| ts-min= | m | min | mins | minute | minutes | Time scale in minutes. |
| ts-hr= | h | hr | hrs | hour | hours | Time scale in hours. |
| ts-day= | d | day | days | Time scale in days. |
| ts-month= | mon | month | months | Time scale in months. |
Examples
SplunkWeb:
- | chart avg(size) by host
- Searches all events, then returns a chart that is the average of all the sizes plotted against the name of the host.
CLI:
./splunk search "* | chart avg(size) by host"
- Gets the average (mean) size for each distinct host.
./splunk search "* | chart max(delay) by size bins=10"
- Gets the max delay by size, where size is broken down into up to 10 equal sized buckets.
contingency
This data-processing command builds a contingency table for two fields. Contingency tables are useful to record and analyze the relationship between two or more variables (in Splunk's case - fields). Useful statistical analysis such as calculation of the phi coefficient or Cramer's V is possible from a contingency table.
Syntax
contingency [contingency-options]... field field
Arguments
contingency-options
| contingency-options= | maxopts | mincover | usetotal | totalstr | Options for specifying a contingency table. |
| maxopts= | (maxrows= | maxcols=)integer(0) | Specifies the maximum number of rows or columns. If the number of distinct values exceeds the specified maximum, then the least common values are ignored. Specifying a value of 0 sets the maximum to unlimited. |
| mincover= | (mincolcover= | minrowcover=)number(1.0) | Specifies the precentage of values for a row or column to cover. |
| usetotal= | usetotal=(T | F)(T) | If set, adds the row and column totals together. |
| totalstr= | totalstr=field("Total") | Specify the field to place the row and column totals. |
Examples
- | contingency datafield1 datafield2 maxrows=5 maxcols=5 usetotal=F
- Searches all events and builds a contingency table for datafield1 & 2. Sets the maximum rows and columns to 5, and does not allow the rows and columns to add together.
correlate
This data-processing command calculates the correlation between different fields.
Syntax
correlate [correlate-type]...
Arguments
| correlate-type= | type= cocur | Specifies the type of correlation to calculate. Currently only the co-currence calculation is supported. Co-currence is the percentage of times that two fields exist in the same results. |
Examples
- | correlate cocur
- Search all events, and calculate the co-currence correlation between all fields.
diff
This data-processing command compares the raw data of two search results and returns a single search result that is the difference between the two compared results. Use the attribute argument to specify a particular field to compare values from (by default _raw is compared).
Syntax
diff search result 1 search result 2 [attribute]
Arguments
| search result 1= | integer(1) | Number of the first search result to compare. |
| search result 2= | integer(2) | Number of the second search result to compare. |
| attribute= | field name(none=_raw) | Attribute to compare. If left blank, diff will compare _raw. |
Examples
SplunkWeb:
- | diff
- Compares the raw text of the top two results (1 and 2).
CLI:
./splunk search "* | diff 1 3 attribute=host"
- Compares the top and 3rd results' hosts.
format
This data-processing command takes results of a subsearch and formats them into a single result (single result with an attribute value of: _query) that is a query built from the inputted search results. This is so they can be applied to another search (useful for subsearches). Six strings are needed to define row prefix, column prefix, column separator, column end, row separator, and row end. If no argument is specified, the default values are used.
Syntax
format row-prefix column-prefix column-separator column-end row-separator row-end
Arguments
| row-prefix= | char( ( ) | Specifies the character used for the row prefix. |
| column-prefix= | char( ( ) | Specifies the character used for the column prefix. |
| column-separator= | char( AND ) | Specifies the character used for the column separator. |
| column-end= | char( ) ) | Specifies the character used for the column end. |
| row-separator= | char( OR ) | Specifies the character used for the row separator. |
| row-end= | char( ) ) | Specifies the character used for the row end. |
Examples
SplunkWeb:
maxresults::2 | fields + source, sourcetype, host | format | outputraw- Gets 2 results and creates a search from their host, source and sourcetype. This will return:
_query=( ( "host::willlaptop" AND "source::/home/david/logs/syslog.log" AND "sourcetype::syslog" ) OR ( "host::willlaptop" AND "source::/home/david/logs/syslog.log" AND "sourcetype::syslog" ) )
which can be used in a subsearch as follows:
- [search maxresults::2 | fields + source, sourcetype, host | format | outputraw]
- The above example finds all events from the sourcetype and host of each of the most recent 2 events.
CLI:
./splunk search "maxresults::2 | fields + source, sourcetype, host | format | outputraw"
rare
This data-processing command displays the least common values of a field, along with a count and percentage.
Syntax
rare[option]... field list
Arguments
option
| option= | showcount | showperc | limit | Options for rare. |
| showcount= | T | F (T) | If set, creates a field called "count" that holds the count. |
| showperc= | T | F (T) | If set, creates a field called "percent" that holds the percentage of prevalence of values. |
| limit= | number(10) | Specifies how many values appear. Setting to "0" causes all values to be returned. |
| field list= | field1, ..., fieldx | Comma-separated list of fields to include. |
Examples
SplunkWeb:
- | rare url
- Displays the least common values of the url field.
CLI:
./splunk search "* | rare limit=20 url"
- Displays the 20 least common values for the url field.
select
Runs an SQLite statement against the search results using the SQLite syntax. Before the select command is executed, the previous search results are put into a temporary database table called "results". If a row has no values, "select" ignores it to prevent blank search results. All fields referenced in the select command must be prefixed with an underscore.
note: Be aware that the select command may cause searches to run slow. If you have an operation that you are using select for (such as statistical operations), try using another Splunk command (in this case - [select stats]).
See the SQLite online documentation for details on SQLite syntax, arguments, and expressions.
Syntax
select select-arg
Arguments
| select-arg | A valid SQLite statement using the SQLite syntax. |
Note: Select does not support the SQLite operators | and | | (double-bar and single-bar). This is because "|" is used in Splunk's search language to separate commands in a search string.
Note: When using field names in the select command, you must prepend them with an underscore "_". For example: the field "dest" or "port", they must be expressed as: "_dest" and "_port" in your select command expression.
netscreen | select distinct _dest _port from resultsExamples
SplunkWeb:
404 | select _url FROM results WHERE host = localhost- Returns a table with one column. URLs not from localhost.
CLI:
./splunk search "* | select _ip FROM results WHERE _ip regexp "^12.*255$""
- Returns ip's that start with "12" and end with "255".
stats
This data-processing command provides summary statistics, grouped optionally by field. Returns one result for each aggregated group. If there is no "by" argument, there will be only one returned result. If there is a "by" argument with a single field, there will be a returned result for every distinct value of the field. If there is a "by" argument with several fields, there will be a returned result for every distinct tuple of values for the fields. Each result contains all the "by" fields, as well as a field for each aggregator argument.
Syntax
stats [stat-specifiers]... by groupby-field(s)
Arguments
| groupby-fields= | field:field:... | Specifies the fields to group events by. One result is returned per distinct combination of values of the fields. |
stat-specifiers
| stat-specifiers= | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c= | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc= | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first= | first | Show the first "seen" value of a field. |
| last= | last | Show the last "seen" value of a field. |
| sum= | sum | Produce the sum of the values of the field. |
| min= | min (field) | Find the minimum value of values in the specified field(s). |
| max= | max (field) | Find the maximum value of values in the specified field(s). |
| avg= | avg (field) | Find the average value of values in the specified field(s). |
| mean= | mean (field) | Find the mean value of values in the specified field(s). |
| mode= | mode (field) | Find the mode value of values in the specified field(s). |
| median= | median (field) | Find the median value of values in the specified field(s). |
| stdev= | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var= | var (field) | Find the variance of values in the specified field(s). |
| percXX= | percXX | Percentile, integer between 1 and 99. |
Examples
SplunkWeb:
- | stats by _time
- Search for stats for every field, grouped by each unique timestamp.
CLI:
./splunk search "* | stats avg(*lay) BY _time"
- For each unique time, gives you the average for any field that ends with the the string 'lay' (e.g. delay, xdelay, relay, etc).
timechart
This data-processing command is used to create a chart for a statistical aggregation applied to a specified field (using time as the x-axis). Optionally split data by a field so that each distinct value of a split-by field is a series.
When called without specifying a bucketing-option, timechart assumes that bins=300.
Syntax
timechart [bucketing-option]... stat-specifier [ field timechart-option ( where-clause )]
Arguments
Note: The discretization options (bins, span) only apply when the axis field is numerical. (Counts, sizes and durations are common examples of numerical fields).
bucketing-option
| bucketing-option= | bins | span | type | fixedrange | cont | start | end | length | Discretization options. | |
| bins= | bins=integer(20) | Sets the maximum number of discrete bins to build. If using the _time field, the default=300. | |
| span= | span=integer span-length | Sets the size of each bucket. Example =span=10 or span=2d or span=5m
| |
| type= | type=(TIME | INT | NUM | CAT | AUTO) (AUTO) | Specifies the type of value in the field that is being discretized. Manually specify how sets are discretized.TIME = Time-based discretization. INT = Integer number discretization. NUM = Arbitrary number discretization. CAT = Categorical discretization. AUTO = Automatically diagnosed discretization. | |
| fixedrange= | fixedrange=T | F(T) | Applicable if bucketing by time. Setting to T causes the search-time boundaries to be used. | |
| cont= | cont=T | F (T) | When set, causes empty continuity bins to be added to the x-axis to make it uniform. | |
| start= | start=integer | Sets the minimum for numerical buckets. | |
| end= | end=integer | Sets the maximum for numerical buckets. | |
| length= | length=integer span-length | If using a timescale, specifies the time range. If not, specifies the absolute bucket length. |
stat-specifiers
| stat-specifiers= | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c= | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc= | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first= | first | Show the first "seen" value of a field. |
| last= | last | Show the last "seen" value of a field. |
| sum= | sum | Produce the sum of the values of the field. |
| min= | min (field) | FInd the minimum value of values in the specified field(s). |
| max= | max (field) | Find the maximum value of values in the specified field(s). |
| avg= | avg (field) | Find the average value of values in the specified field(s). |
| mean= | mean (field) | Find the mean value of values in the specified field(s). |
| mode= | mode (field) | Find the mode value of values in the specified field(s). |
| median= | median (field) | Find the median value of values in the specified field(s). |
| stdev= | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var= | var (field) | Find the variance of values in the specified field(s). |
| percXX= | percXX | Percentile, integer between 1 and 99 |
timechart-option
| timechart-option= | bucketing-option | usenull | useother | nullstr | otherstr | These options change the behavior of timechart when splitting by a field. |
| usenull= | usenull=T | F(T) | If set, usenull will create a series for events that do not contain the specified split-by field. The series created is labeled by the value of the nullstr option (the default label is "NULL"). |
| useother= | useother=T | F(F) | If set, useother causes a series to be added for data not included in the timechart. |
| nullstr= | nullstr=string | Specifies the value of the label of the null string. |
| otherstr= | otherstr=string | Specifies the value of the label of the other string. |
where-clause
| where-clause= | where stat-specifier where-comparison | |
| where-comparison= | (in | notin) (top | bottom) integer | Specifies the criteria for including a data series when a field is given in the timechart-option clause |
| Examples of where-comparison usage: | in top5 | in bottom10 | notin top2
|
span-length
| span-length= | ts-sec | ts-min | ts-hr | ts-day | ts-month | Time scale units |
| ts-sec= | s | sec | secs | second | seconds | Time scale in seconds. |
| ts-min= | m | min | mins | minute | minutes | Time scale in minutes. |
| ts-hr= | h | hr | hrs | hour | hours | Time scale in hours. |
| ts-day= | d | day | days | Time scale in days. |
| ts-month= | mon | month | months | Time scale in months. |
Examples
SplunkWeb:
sourcetype::ps | multikv | timechart span=1m avg(CPU) by host- Get all output from the ps command, convert the tabular output of ps into one event per line and extract fields based on the headers using the multikv command, and calculate the average of CPU for each 1 minute time span for each host.
CLI:
./splunk search "* | timechart span=5m avg(thruput) by host"
- Graph the average thruput over time, with time sets differentiated by 5 minute spans.
top
This data-processing command displays the most common values of a field, along with a count and percentage.
Syntax
top[option]...[option] field list
Arguments
option
| showcount= | showcount=T | F (T) | If set, creates a field called "count" that holds the count. |
| showperc= | showperc=T | F (T) | If set, creates a field called "percent" that holds the percentage of prevalence of values. |
| limit= | limit=number(10) | Specifies how many values appear. Setting to "0" causes all values to be returned. |
| field list | field1,...,fieldx | Comma-separated list of fields to include. |
Examples
SplunkWeb:
- | top url
- Displays the most common 10 values of the url field.
CLI:
./splunk search "* | top limit=20 url"
- Displays the most common 20 values of the url field.
xmlunescape
Syntax
xmlunescape [max-inputs]
Arguments
| max-inputs= | maxinputs=integer |
Examples
This documentation applies to the following versions of Splunk: 3.1.4 View the Article History for its revisions.