Transform and report
This documentation does not apply to the most recent version of Splunk.
Use transforming commands to mine your data by transforming values, manipulating fields, or by creating new results from existing data in your results.
Use reporting commands to produce reports and summarize your search results.
associate
The associate command searches for relationships between pairs of fields by comparing the values of one field to a reference field and value pair. If a list of fields is provided, the associate command restricts analysis to only those fields. By default, all fields are used; this takes longer to evaluate.
Note: To see the results of the associate command, you need to be in Splunk Web's report builder view.
Syntax
associate [associate-option]... [field list]
Arguments
| field list | field,field,... | Specify a comma-delimited list of fields to restrict the associate evaluation to. |
associate-option
| associate-option | supcnt-option | supfreq-option | improv-option | Associate command options. |
| supcnt-option | supcnt=integer(100) | Specifies the minimum number of times a reference field and value pair must appear to be considered an associate. |
| supfreq-option | supfreq=number(0.1) | Specifies the minimum frequency of reference key/value combinations, expressed as a fraction of the total number of results. |
| improv-option | improv=number(0.5) | Sets the value of the reference key/value pair that other pairs must be greater than to be associated. |
Examples
Example 1: This example searches all source types that begin with access and displays the events that are associated with each other that have at least 3 references to each other.
sourcetype=access* | associate supcnt=3
Example 2:
... | associate
Example 3:
host="reports" | associate supcnt=50 supfreq=0.2 improv=0.5
chart
Create a chart and a corresponding table of statistics from data obtained from search results. Add chart to any search in Splunk Web, and use its options to define any series of data you want to plot. Add chart to CLI searches to generate a table to export to an external charting tool (you can't generate a graphical chart in the CLI).
Use chart to produce a tabular output of events suitable for charting. When you use it in Splunk Web, chart generates a table with an arbitrary field that you choose as the x-axis (this is different from timechart, which generates a chart with _time as the x-axis). If you choose to chart fields that don't have numerical values, chart automatically converts those values to numerical values if necessary.
Note: chart supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
chart [stat-operator]... by x-axis-field [bucketing option] [split-by-clause]
Arguments
Specify which fields to display for values of the x-axis.
| x-axis-field | field | Specify a field for the x-axis. |
stat-operator
Specify a statistic to display as a series. You can specify more than one statistic in a chart.
| count | c | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min (field) | Find the minimum value of values in the specified field(s). |
| max | max (field) | Find the maximum value of values in the specified field(s). |
| avg | avg (field) | Find the average value of values in the specified field(s). |
| mean | mean (field) | Find the mean value of values in the specified field(s). |
| mode | mode (field) | Find the mode value of values in the specified field(s). |
| median | median (field) | Find the median value of values in the specified field(s). |
| stdev | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var | var (field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
bucketing-option
Change the bucket size or time span to control resolution of the x-axis of a time series chart.
| bins | bins=integer (default=300) | Specify the maximum possible number of bins to display in a chart. Default is bins=300 for a time series chart. |
| span | span=integer span-length | Specify the span for buckets in the x-axis. Specify what units of time to use with the span-length option (for example: span=10mon, span=2d, span=5m). |
| fixedrange | fixedrange=T | F (default=T) | Set to true (T) to use the search time boundaries (start and finish of a search) as boundaries for time buckets. |
| cont | cont=T | F (default=T) | Set to true (T) to add empty buckets to the x-axis to make data in a chart appear more uniform. |
| start | start=integer | Set the minimum number of numerical buckets. |
| end | end=integer | Set the maximum number of numerical buckets. |
| length | length=integer span-length | Specify the span of time buckets (x-axis). |
split-by-clause
Split a series into more than one series by splitting it based on values of another field.
| split-by-clause | by field... [split-by-option] [where-clause] | Specify a field to split a series by. Optionally specify additional split-by-options, or filter with a where-clause. |
split-by-option
Change the behavior of a time series when splitting a series by values of another field.
| bucketing-option | See bucketing-option table for syntax. | Define bucket size or time span to control x-axis resolution. |
| usenull | usenull=T | F (default=T) | If set to true (T), create a series for events that don't contain the field of the split-by-clause. The series it creates is labeled by the value of the nullstr option (the default series label is "NULL"). |
| nullstr | nullstr=string (default="NULL") | Specify a value to label a series created by usenull. |
| useother | useother=T | F (default=T) | If set to true (T), add a series in the table for any series that doesn't match the where-clause. |
| otherstr | otherstr=string (default="OTHER") | Specify a value to label the series that aren't added to the time series chart because they don't match the where-clause. |
where-clause
Filter which series to display in a chart by filtering how values match the field in a split-by-clause.
| where-clause | where stat-operator (where-comparison | where-threshold) (Default= where sum in top 10) | |
| where-comparison | (in | notin) (top | bottom) integer | Specify a condition to match the values of series that is split by the field in the split-by-clause. |
| where-threshold | (< | >) number | Specify a numeric threshold to match the values of series that is split by the field in the split-by-clause. |
span-length
Specify the time units to use when choosing a time span.
| ts-sec | s, sec, secs, second, seconds | Time scale in seconds. |
| ts-min | m, min, mins, minute, minutes | Time scale in minutes. |
| ts-hr | h, hr, hrs, hour, hours | Time scale in hours. |
| ts-day | d, day, days | Time scale in days. |
| ts-month | mon, month, months | Time scale in months. |
Examples
Splunk Web:
Example 1: This example searches all events, then returns a chart that is the average of all the sizes plotted against the name of the host.
| chart avg(size) by host
Example 2: This example searches for hits referred by searchengine, then charts the count by hour of the day on the x-axis and day of the week as series.
sourcetype=access_combined referer_domain=http://www.searchengine.com/ | chart count by date_hour, date_wday
Example 3: Return the maximum "delay" by "size", where "size" is broken down into a maximum of 10 equal-sized buckets.
... | chart max(delay) by size bins=10
Example 4: Return the ratio of the average (mean) "size" to the maximum "delay" for each distinct "host" and "user" pair.
... | chart eval(avg(size)/max(delay)) by host user
Example 5: Return max(delay) for each value of foo, split by the value of bar.
... | chart max(delay) over foo by bar
Example 6: Return a table of hostnames versus timespans. By default, if the number of hosts exceeds the default of 10, all hosts not in the top 10 are consolidated into the OTHER category. To show the top 20 hosts, use:
... | timechart count by host where sum in top20
CLI:
Example 7: This example gets the average (mean) size for each distinct host.
./splunk search "* | chart avg(size) by host"
Example 8: This example gets the max delay by size, where size is broken down into up to 10 equal-sized buckets.
./splunk search "* | chart max(delay) by size bins=10"
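The split-by behavior described in this section (one series per value of the split-by field, with series outside the where-clause folded into OTHER) can be pictured with a short Python sketch. It is only an illustration under assumptions: the function name chart_count, the top parameter, and the sample events are hypothetical, and the sketch makes no claim about how Splunk implements chart internally.

```python
from collections import defaultdict

def chart_count(events, x_field, split_field, top=10, otherstr="OTHER"):
    """Count events by x_field, split into series by split_field.

    Series outside the `top` largest (by total count) are folded into
    `otherstr`, loosely mimicking "where sum in top10" with useother=T.
    """
    totals = defaultdict(int)
    for ev in events:
        totals[ev[split_field]] += 1
    # Rank series by total count; ties are broken by name for determinism.
    ranked = sorted(totals, key=lambda s: (-totals[s], s))
    keep = set(ranked[:top])

    table = defaultdict(lambda: defaultdict(int))
    for ev in events:
        series = ev[split_field] if ev[split_field] in keep else otherstr
        table[ev[x_field]][series] += 1
    return {x: dict(row) for x, row in table.items()}

# Hypothetical sample data: x-axis is date_hour, series are hosts.
events = [
    {"date_hour": 1, "host": "a"},
    {"date_hour": 1, "host": "a"},
    {"date_hour": 1, "host": "b"},
    {"date_hour": 2, "host": "c"},
]
table = chart_count(events, "date_hour", "host", top=1)
```

With top=1, only the busiest host keeps its own series and every other host is counted under OTHER.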
cluster
This data-processing command clusters events together based on their similarity to each other and represents each cluster with a single event. Use cluster to reduce a search with a large number of similar events to fewer clusters that are much more manageable to view. This is useful if you want to find the most common or rarest events in your data.
How cluster works:
Splunk creates clusters by comparing events using the data in a field that you specify. Specify a field to compare using the field option (default field = _raw). Data in the field is broken into chunks for comparison. You can change how data is broken up by specifying delimiters (By default, every character except: 0-9, A-Z, a-z, and '_' are delimiters). Splunk evaluates the chunks of data in each event, and then compares them with a representative event from each cluster. If an event matches an existing cluster, it becomes a part of that cluster. If an event doesn't match a cluster, it starts a new cluster and becomes the representative event for that cluster.
You can change the threshold Splunk uses to determine how similar events must be to match in a cluster (from 0.0 to 1.0, default = 0.8). Set a higher threshold to create more clusters, and a lower one to create fewer. The higher the threshold, the more closely events must match to be part of the same cluster.
When you apply cluster, your search results are reduced to display a single representative event for each cluster of events. If you want to retain your original event data and only label what cluster events belong to, set the labelonly option to TRUE (T).
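The procedure described above (tokenize each event, compare it with one representative per cluster, join the first cluster that meets the threshold, otherwise start a new one) can be sketched in Python. This is a rough illustration, not Splunk's implementation: the similarity measure is not specified in this document, so the sketch assumes Jaccard similarity over token sets, and all function names and sample events are hypothetical.

```python
import re

def tokenize(text):
    # Every character except 0-9, A-Z, a-z, and '_' acts as a delimiter.
    return {t for t in re.split(r"[^0-9A-Za-z_]+", text) if t}

def similarity(a, b):
    # Assumption: Jaccard similarity of the two token sets.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster(events, threshold=0.8):
    clusters = []  # each entry: representative event, its tokens, members
    for ev in events:
        tokens = tokenize(ev)
        for c in clusters:
            if similarity(tokens, c["tokens"]) >= threshold:
                c["members"].append(ev)
                break
        else:
            # No cluster matched: this event becomes a new representative.
            clusters.append({"rep": ev, "tokens": tokens, "members": [ev]})
    return clusters

# Hypothetical sample events.
events = [
    "login failed for user alice from 10.0.0.1",
    "login failed for user alice from 10.0.0.2",
    "disk full on /var",
]
loose = cluster(events, threshold=0.6)
strict = cluster(events, threshold=0.9)
```

In this sample, the lower threshold groups the two login events into one cluster, while the higher threshold leaves every event in its own cluster.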
Syntax
cluster [cluster-options]...
Arguments
| cluster-options | threshold | delimiters | showcount | countfield | labelfield | field | labelonly | Options to configure clustering. |
| threshold | T=number 0.0-1.0 (0.8) | Set the threshold to specify how closely events must match in order to be clustered. Setting closer to 1 means that events have to be more similar to be in the same cluster. |
| delimiters | delims=character list | Specify the delimiters to separate tokens in clusters with. By default, every character except: 0-9, A-Z, a-z, and '_' are delimiters. Specify a space-delimited list of delimiters to override the default setting. |
| showcount | showcount=(T | F) (T) | Specify whether to show the size of each cluster. Default is TRUE (T). If labelonly is set to TRUE, then the size will not be shown. |
| countfield | countfield=field name (cluster_count) | Specify the name of a field to write the cluster size to. |
| labelfield | labelfield=field name (cluster_label) | Specify the name of a field to write the cluster number to. |
| field | field=field name (_raw) | Specify the name of a field to analyze for clustering. The default is _raw. |
| labelonly | labelonly=(T | F) (F) | If set to true, cluster does not reduce each cluster to a single event. Instead, it keeps the original event data and labels each event with its cluster number. |
Examples
Splunk Web:
This example returns the 20 most common clusters of events. First, it searches for syslog events that don't have the term "juniper". Then it clusters the events and sorts the clusters by cluster_count. The results returned are the first 20 events, which represent the 20 largest clusters.
sourcetype="syslog" (NOT juniper) | cluster t=0.9 showcount=true | sort - cluster_count | head 20
collect
Summary indexing uses the collect command to place the results of a saved search into a summary index so you can search them later. You can also use | collect in any search to place search results in any index. For example, if you create reports from a search, use | collect to index them so you can search across all of the reports uniformly, or create a larger aggregate report from multiple reports.
Before collect indexes search results, it saves them as events in a file ($SPLUNK_HOME/var/spool/splunk/events_random-number.stash by default). You can override the default file name and location using the file and path options. Use other collect options to override other default settings.
Syntax
collect collect-index [collect option],...
Arguments
| collect-index | index=string | Specify the name of the index to add search results to. Note: The specified index must already exist. Configure indexes in indexes.conf. |
collect option
| collect option | addtime | file | path | marker | testmode | Specify options to override default settings of collect. |
| addtime | addtime=(T | F) (default=T) | Set to true (T) to tell Splunk to prepend a timestamp to events that have no extractable timestamp in their _raw field. |
| file | file=string (default=events_random-number.stash) | Specify the file to write events to. |
| marker | marker=string (default=" ") | Specify a string of field/value pairs (comma-delimited list) to append to each event that's indexed. |
| path | path=string (default=$SPLUNK_HOME/var/spool/splunk/) | Specify the path to store the file that events are written to. Note: Splunk must have this path set as a data input for events in the file to be indexed. |
| testmode | testmode=(T | F) (default=F) | Set to true (T) to put collect in test mode. In test mode, search results aren't written into the new index, but they are still rendered in Splunk Web as they'd appear if they were indexed. |
Examples
Splunk Web:
This example searches Web server data and builds a report based on client IPs. The report is then indexed into the index WebReports.
host=webserver1 eventtype=banner_access NOT eventtypetag=bot NOT eventtypetag=images NOT eventtype=splunk_IPs NOT eventtype=10dot_IP_range NOT eventtypetag=invalid_page | stats distinct_count(clientip) as uniqueIPs, max(_time), min(_time) | eval site="update_banners" | addinfo | collect index=WebReports
This example searches Web server data for raw downloads and indexes the results in the index downloadcount.
"eventtypetag=download" NOT eventtypetag=bot NOT eventtypetag=internal | addinfo | collect index=downloadcount
contingency
This data-processing command builds a contingency table for two fields. Contingency tables are useful to record and analyze the relationship between two or more variables (in Splunk's case - fields). Useful statistical analysis such as calculation of the phi coefficient or Cramer's V is possible from a contingency table.
Syntax
contingency [contingency-options]... field field
Arguments
contingency-options
| contingency-options | maxopts | mincover | usetotal | totalstr | Options for specifying a contingency table. |
| maxopts | (maxrows= | maxcols=)integer(0) | Specifies the maximum number of rows or columns. If the number of distinct values exceeds the specified maximum, then the least common values are ignored. Specifying a value of 0 sets the maximum to unlimited. |
| mincover | (mincolcover= | minrowcover=)number(1.0) | Specifies the percentage of values for a row or column to cover. |
| usetotal | usetotal=(T | F)(T) | If set, adds the row and column totals together. |
| totalstr | totalstr=field("Total") | Specify the field to place the row and column totals. |
Examples
Splunk Web:
This example searches all events and builds a contingency table for datafield1 & 2. Sets the maximum rows and columns to 5, and does not allow the rows and columns to add together.
| contingency datafield1 datafield2 maxrows=5 maxcols=5 usetotal=F
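As a rough picture of what a contingency table holds, the following Python sketch counts how often each pair of values occurs across two fields and optionally appends row and column totals. The function name, the sample events, and the field names method and status are hypothetical; the sketch ignores maxrows, maxcols, and the cover options.

```python
from collections import Counter

def contingency(events, row_field, col_field, usetotal=True, totalstr="Total"):
    # Count each (row value, column value) pair.
    counts = Counter((ev[row_field], ev[col_field]) for ev in events)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    if usetotal:
        for r in rows:
            table[r][totalstr] = sum(table[r][c] for c in cols)
        table[totalstr] = {c: sum(table[r][c] for r in rows) for c in cols}
        table[totalstr][totalstr] = sum(counts.values())
    return table

# Hypothetical sample events.
events = [
    {"method": "GET", "status": 200},
    {"method": "GET", "status": 404},
    {"method": "POST", "status": 200},
    {"method": "GET", "status": 200},
]
table = contingency(events, "method", "status")
```

Each inner dictionary is one row of the table, keyed by the values of the second field.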
correlate
Calculate the statistical correlation between fields.
Syntax
correlate [correlate-type]...
Arguments
| correlate-type | type= cocur | Specifies the type of correlation to calculate. Currently only the co-occurrence calculation is supported. Co-occurrence is the percentage of times that two fields exist in the same results. |
Examples
Splunk Web:
This example searches all events, and calculates the co-occurrence correlation between all fields.
| correlate type=cocur
diff
This data-processing command compares the data of two search results and returns a single result that is the difference between the values compared. You can compare values of specific fields of results by using the attribute argument (by default the value of the _raw field is compared).
Syntax
diff result1 result2 [attribute] [header] [context]
Arguments
| result1 | pos1=integer(default = 1st result) | Number of the first search result to compare. |
| result2 | pos2=integer(default = 2nd result) | Number of the second search result to compare. |
| attribute | attribute=field name(none=_raw) | Specify a specific field value to compare (if left blank, compares the _raw field). |
| header | header=(T | F)(default=F) | If set, displays a legend for the output of diff. |
| context | context=(T | F)(default=F) | If set, displays context lines around the diff result. |
Examples
Splunk Web:
This example compares the raw text of result 45 and result 2 (because pos2 defaults to the second result).
* | diff pos1=45
CLI:
This example compares the top and 3rd results' hosts.
./splunk search "* | diff pos1=1 pos2=3 attribute=host"
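To illustrate the idea of comparing one field of two results by position, here is a minimal Python sketch built on the standard difflib module. It is an analogy rather than Splunk's algorithm: the function name diff_results and the sample results are hypothetical, and mapping the context option to three lines of unified-diff context is an assumption.

```python
import difflib

def diff_results(results, pos1=1, pos2=2, attribute="_raw", context=False):
    """Compare one field of two results, addressed by 1-based position."""
    a = str(results[pos1 - 1][attribute]).splitlines()
    b = str(results[pos2 - 1][attribute]).splitlines()
    # Assumption: context=True means three surrounding lines, else none.
    n = 3 if context else 0
    return list(difflib.unified_diff(a, b, lineterm="", n=n))

# Hypothetical search results.
results = [
    {"_raw": "GET /index.html 200", "host": "web01"},
    {"_raw": "GET /index.html 404", "host": "web02"},
    {"_raw": "GET /index.html 200", "host": "web01"},
]
host_diff = diff_results(results, pos1=1, pos2=2, attribute="host")
same = diff_results(results, pos1=1, pos2=3)
```

When the two values are identical, the returned list is empty.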
eventstats
Use the eventstats command to generate summary statistics of specified fields and save them into new fields. Specify a new field name for the statistics results with the as argument. If you don't specify a new field name, the default field name is the statistical operator and the field it operated on (for example: stat-operator(field)). You can group summary statistics by field (or more than one field) with the by argument. Each distinct set of by fields counts as a distinct grouping.
Syntax
eventstats stat-operator [as new-field-name]... [by groupby-field(s)]
Arguments
| groupby-fields | field,field,... | Specifies the fields by which to group events. One result is returned per distinct combination of values of the fields. |
| new-field-name | name of new field | Specifies a new field name for the appended statistical result field. |
stat-operator
| stat-operator | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min (field) | Find the minimum value of values in the specified field(s). |
| max | max (field) | Find the maximum value of values in the specified field(s). |
| avg | avg (field) | Find the average value of values in the specified field(s). |
| mean | mean (field) | Find the mean value of values in the specified field(s). |
| mode | mode (field) | Find the mode value of values in the specified field(s). |
| median | median (field) | Find the median value of values in the specified field(s). |
| stdev | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var | var (field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
Examples
Splunk Web:
This example searches the data in the sampledata index, and creates a field of the average value of bytes for each event with different values of date_minute.
index=sampledata | eventstats avg(bytes) as BYTEs by date_minute
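The key point of eventstats is that every event is kept and annotated with the statistic for its group. A minimal Python sketch of that behavior for the avg operator follows; the function name eventstats_avg and the sample events are hypothetical, and only a single by field is handled.

```python
from collections import defaultdict

def eventstats_avg(events, field, by, as_name=None):
    # Default field name follows the stat-operator(field) convention.
    as_name = as_name or f"avg({field})"
    groups = defaultdict(list)
    for ev in events:
        if field in ev:
            groups[ev[by]].append(ev[field])
    out = []
    for ev in events:
        new = dict(ev)  # keep the original event data
        values = groups.get(ev[by])
        if values:
            new[as_name] = sum(values) / len(values)
        out.append(new)
    return out

# Hypothetical sample events.
events = [
    {"date_minute": 1, "bytes": 100},
    {"date_minute": 1, "bytes": 300},
    {"date_minute": 2, "bytes": 50},
]
annotated = eventstats_avg(events, "bytes", by="date_minute", as_name="BYTEs")
```

The number of events is unchanged; each one gains a BYTEs field holding the average for its date_minute.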
format
This data-processing command takes results of a subsearch and formats them into a single result (single result with an attribute value of: _query) that is a query built from the input search results. This is so they can be applied to another search (useful for subsearches). Six strings are needed to define row prefix, column prefix, column separator, column end, row separator, and row end. If no argument is specified, the default values are used.
Syntax
format row-prefix column-prefix column-separator column-end row-separator row-end
Arguments
| row-prefix | character( ( ) | Specifies the character used for the row prefix. |
| column-prefix | character( ( ) | Specifies the character used for the column prefix. |
| column-separator | character( AND ) | Specifies the character used for the column separator. |
| column-end | character( ) ) | Specifies the character used for the column end. |
| row-separator | character( OR ) | Specifies the character used for the row separator. |
| row-end | character( ) ) | Specifies the character used for the row end. |
Examples
Splunk Web:
Example 1: Get top 2 results and create a search from their host, source and sourcetype, resulting in a single search result with a _query field:
_query=( ( "host::mylaptop" AND "source::syslog.log" AND "sourcetype::syslog" ) OR ( "host::bobslaptop" AND "source::bob-syslog.log" AND "sourcetype::syslog" ) )
... | head 2 | fields + source, sourcetype, host | format
Example 2: This example gets results that contain "/doc" and creates a search from their host, source, and source type. Using a hypothetical set of data, this returns:
_query=( ( "host=willlaptop" AND "source=/home/david/logs/syslog.log" AND "sourcetype=syslog" ) OR ( "host=willlaptop" AND "source=/home/david/logs/syslog.log" AND "sourcetype=syslog" ) )
/doc | fields + source, sourcetype, host | format | outputraw
This can also be used in a subsearch as follows:
This subsearch finds all events that contain "will" from the source type and host of each.
will [search /doc | fields + source, sourcetype, host | format | outputraw]
CLI:
Example 3: This is the CLI version of the second example.
./splunk search "/doc | fields + source, sourcetype, host | format | outputraw"
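The way the six strings combine into a query can be sketched in Python. This is an illustration under assumptions: the function name format_query is hypothetical, the key::value term style is copied from the Example 1 output, and the spacing is chosen to reproduce that output rather than taken from any specification.

```python
def format_query(rows, row_prefix="(", col_prefix="(", col_sep="AND",
                 col_end=")", row_sep="OR", row_end=")"):
    parts = []
    for row in rows:
        # One quoted term per field, joined by the column separator.
        terms = f" {col_sep} ".join(f'"{k}::{v}"' for k, v in row.items())
        parts.append(f"{col_prefix} {terms} {col_end}")
    return f"{row_prefix} " + f" {row_sep} ".join(parts) + f" {row_end}"

# The two results from Example 1.
rows = [
    {"host": "mylaptop", "source": "syslog.log", "sourcetype": "syslog"},
    {"host": "bobslaptop", "source": "bob-syslog.log", "sourcetype": "syslog"},
]
query = format_query(rows)
```

With the default arguments, query matches the value of the _query field shown in Example 1.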
highlight
This data-processing command allows you to highlight one or more strings of text in your search results by specifying those strings in a list.
Syntax
highlight string,[string],...
Arguments
| string | string | string,...,string | Specify a comma or space-delimited list of strings you want to highlight. |
Examples
Splunk Web:
This example searches all sources that are a webserver sourcetype, and highlights the terms "login" and "logout".
sourcetype=webserver | highlight login,logout
join
The join command provides traditional SQL-like joining of results from the main search with the results of a subsearch. Optionally, you can specify the exact fields to join. If no fields are specified, the command uses all fields that are common to both result sets.
Syntax
join <join-options> <field-list> [<subsearch>]
Arguments
| join-options | Options to the join command. |
| field-list | List of fields to join. |
| subsearch | A secondary search in the pipeline. |
join-options
| usetime=<bool> | usetime = T | F (F) | Indicates whether to limit matches to subsearch results that are earlier or later (depending on the 'earlier' option which is only valid when usetime=true) than the main result to join with. |
| earlier=<bool> | earlier = T | F (T) | When usetime=T, indicates whether to use an earlier or later result. |
| overwrite=<bool> | T | F (T) | Indicates if fields from the sub results should overwrite those from the main result if they have the same field name. |
| max=<int> | max = 1 | Indicates the maximum number of sub results each main result can join with. If max=0, no limit is set. |
Examples
To join results of search for "maybe" with results of search for "foo" on the id field:
maybe | join id [search foo]
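The effect of the max and overwrite options can be pictured with a small Python sketch. It assumes inner-join behavior (main results without a match are dropped), which this document does not state explicitly; the function name join, the parameter max_matches, and the sample data are hypothetical, and usetime and earlier are not modeled.

```python
def join(main, sub, fields, max_matches=1, overwrite=True):
    out = []
    for m in main:
        # A sub result matches when all join fields exist and agree.
        matches = [s for s in sub
                   if all(f in m and f in s and m[f] == s[f] for f in fields)]
        if max_matches:  # 0 means no limit
            matches = matches[:max_matches]
        for s in matches:
            # overwrite=True lets sub-result fields replace main fields.
            out.append({**m, **s} if overwrite else {**s, **m})
    return out

# Hypothetical main and subsearch results.
main = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
sub = [{"id": 1, "a": "z", "b": "p"}, {"id": 1, "b": "q"}]

first_only = join(main, sub, ["id"])
unlimited = join(main, sub, ["id"], max_matches=0)
keep_main = join(main, sub, ["id"], overwrite=False)
```

With the defaults, each main result joins with at most one sub result, and the sub result's value of a wins.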
makemv
Use makemv to change any field into a multi-value field during search time. Configure multi-value fields if a field's value string contains more than one useful value and you want to use them separately. For example, use multi-value fields to separate out multiple email addresses from a field so that you can get the distinct count of the number of people to whom an email was sent.
Specify a delim argument to parse a field value using a simple string delimiter. Specify a tokenizer argument to parse a field value like a regular expression. Add the allowempty argument to parse consecutive delim or tokenizer arguments separately.
makemv supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
makemv [tokenizer | delim] [allowempty] field
Arguments
| tokenizer | tokenizer= "string" | Use tokenizer to parse a field value as a regular expression. This is exactly like using the TOKENIZER=<regular expression> key when configuring multi-value fields in fields.conf. |
| delim | delim= "string" | Use delim to parse a field value using a simple string delimiter (can be multiple characters). |
| allowempty | allowempty=(T | F)(default= F) | Set allowempty to T to accept empty values when parsing an entire field value. Empty values occur when makemv parses two consecutive delim arguments. |
| field | field name (string) | Specify a field to change into a multi-value field. |
Note: If you don't specify a tokenizer or delim argument, makemv uses a single space as a delimiter (delim=" ") by default.
Examples
Splunk Web:
This example searches for sendmail events and uses makemv to parse the individual senders delimited by a comma (,). Splunk then reports the top senders.
eventtype="sendmail" | makemv delim="," senders | top senders
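The difference between delim, tokenizer, and allowempty can be sketched in Python. This is a simplified model, not Splunk's parser: the function name makemv and the sample values are hypothetical, and treating each regular-expression match (or its first capture group) as one value is an assumption.

```python
import re

def makemv(value, delim=" ", tokenizer=None, allowempty=False):
    if tokenizer is not None:
        # Assumption: each match, or its first group, becomes one value.
        values = [m.group(1) if m.groups() else m.group(0)
                  for m in re.finditer(tokenizer, value)]
    else:
        values = value.split(delim)
    # Consecutive delimiters produce empty values; drop them by default.
    return values if allowempty else [v for v in values if v != ""]

senders = "a@x.com,b@x.com,,c@x.com"
without_empty = makemv(senders, delim=",")
with_empty = makemv(senders, delim=",", allowempty=True)
by_regex = makemv("a@x.com;b@x.com", tokenizer=r"([^;]+)")
```

The only difference between without_empty and with_empty is the empty string produced by the two consecutive commas.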
mvcombine
Use mvcombine to combine otherwise identical events in your search results that have a single differing field value into one result with a multi-value field of the differing field. Each result's differing field value then becomes a value in the multi-value field. Use mvcombine if your data has identical events coming from different sources, hosts, or client IP addresses. Use the field argument to specify the field to make into a multi-value field.
For example, if you have two search results:
- event #1:
field1=foo field2=bar field3=baz
- event #2:
field1=foo field2=bar field3=new
Add | mvcombine field3 to your search. Splunk combines the two results into:
- combined single event:
field1=foo field2=bar field3=baz,new
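The same combination can be modeled in Python by grouping events on every field except the one being combined. The function name mvcombine and the use of a list for the multi-value field are assumptions made for this sketch.

```python
def mvcombine(events, field):
    combined = {}  # key: all other field/value pairs -> merged event
    for ev in events:
        key = tuple(sorted((k, v) for k, v in ev.items() if k != field))
        if key not in combined:
            combined[key] = {**ev, field: []}
        if field in ev:
            combined[key][field].append(ev[field])
    return list(combined.values())

# The two events from the example above.
events = [
    {"field1": "foo", "field2": "bar", "field3": "baz"},
    {"field1": "foo", "field2": "bar", "field3": "new"},
]
result = mvcombine(events, "field3")
```

Events that differ in any other field end up in separate results.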
mvcombine supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
mvcombine [delim] field
Arguments
| field | field name(string) | Specify a field to change into a multi-value field. |
| delim | delim= "string" | Specify a delimiter to use in the new multi-value field. |
Examples
Splunk Web:
This example combines identical search results with differing values in the field foo, and returns a single search result for all identical events with differing values of foo. Splunk lists each result's value of foo in the single result's multi-value field foo separated by colons (:).
... | mvcombine delim=":" foo
mvexpand
Use mvexpand to expand the values of a multi-value field into separate events for each value of the multi-value field. Specify a field to expand using the field argument. mvexpand copies the original event for each value of field. For example:
If you have:
- event #1:
field1=foo field2=bar,baz
and you add | mvexpand field2 to your search, you get:
- event #1a:
field1=foo field2=bar
- event #1b:
field1=foo field2=baz
Note: If the field you specify isn't multi-value or there isn't a value of field for an event, then nothing happens to the event.
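A minimal Python sketch of this expansion, including the pass-through case from the note, might look as follows. The function name mvexpand and the use of a list to stand for a multi-value field are assumptions for illustration.

```python
def mvexpand(events, field):
    out = []
    for ev in events:
        value = ev.get(field)
        if isinstance(value, list) and value:
            # Copy the original event once per value of the field.
            for v in value:
                out.append({**ev, field: v})
        else:
            # Not multi-value, or field missing: event is left unchanged.
            out.append(ev)
    return out

events = [
    {"field1": "foo", "field2": ["bar", "baz"]},
    {"field1": "solo"},
]
expanded = mvexpand(events, "field2")
```

The first event becomes two events, and the second passes through untouched.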
mvexpand supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
mvexpand field
Arguments
| field | field name (string) | Specify a multi-value field to expand. |
Examples
Splunk Web:
This example expands the values of the field foo into separate events for each value of foo.
... | mvexpand foo
nomv
Use nomv to change a multi-value field into a single-value field at search time. This is useful if you want to override multi-value field configurations in fields.conf. nomv causes multi-value field values to be considered as one single-value string (ignoring delimiters and tokenizers set in fields.conf).
Note: Learn how to configure multi-value fields via fields.conf.
Syntax
nomv field
Arguments
| field | field name (string) | Specify a multi-value field to change to a single-value field. |
Examples
Splunk Web:
This example searches sendmail events and returns the top lists of senders (a complete matching list of email addresses). If nomv isn't added to this search, then this example returns the top individual senders based on the multi-value field configuration in fields.conf.
eventtype="sendmail" | nomv senders | top senders
overlap
Use | overlap in a search to find events in a summary index that overlap in time, or find gaps in time that a scheduled saved search may have missed events. Overlaps can occur when you schedule a saved search to run with a time range that's shorter than the time range set in the search. Gaps can occur when you schedule a saved search to run with a longer time range than the time range set in the search.
For example, if you schedule the following search to run every minute, Splunk generates overlaps. If you schedule the same search to run every 5 minutes, Splunk returns gaps.
* minutesago=2 | stats count
Note: Learn how to remove overlaps and gaps by referring to the SummaryIndexingBestPractices page.
Syntax
overlap
Arguments
None.
Examples
Splunk Web:
This example finds and returns overlapping events in the entire summary index.
index=summary | overlap
rare
This data-processing command displays the least common values of a field, along with a count and percentage.
rare supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
rare [option]... field list
Arguments
option
| option | showcount | showperc | rare | limit | Options for rare. |
| showcount | showcount=(T | F) (default=T) | If set, creates a field called "count" that holds the count. |
| showperc | showperc=(T | F) (default=T) | If set, creates a field called "percent" that holds the percentage of prevalence of values. |
| limit | limit=integer (default=10) | Specifies how many values appear. Setting to "0" causes all values to be returned. |
| field list | field,field,... | Comma-separated list of fields to include. |
Examples
Splunk Web:
This example displays the least common values of the url field.
| rare url
CLI:
This example displays the 20 least common values for the url field.
./splunk search "* | rare limit=20 url"
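The output described for rare (least common values with a count and a percentage) can be approximated in a few lines of Python. The function name rare, the tie-breaking rule, and the sample events are assumptions for this sketch; showcount and showperc are not modeled.

```python
from collections import Counter

def rare(events, field, limit=10):
    counts = Counter(ev[field] for ev in events if field in ev)
    total = sum(counts.values())
    # Least common first; ties broken by value for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (kv[1], str(kv[0])))
    if limit:  # 0 returns all values
        ranked = ranked[:limit]
    return [{field: v, "count": c, "percent": 100.0 * c / total}
            for v, c in ranked]

# Hypothetical sample events.
events = [{"url": "/a"}, {"url": "/a"}, {"url": "/a"}, {"url": "/b"}]
least = rare(events, "url", limit=1)
everything = rare(events, "url", limit=0)
```

Setting limit to 0 returns every value, ordered from least to most common.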
stats
This data-processing command provides summary statistics, grouped optionally by field. Returns one result for each aggregated group. If there is no "by" argument, there will be only one returned result. If there is a "by" argument with a single field, there will be a returned result for every distinct value of the field. If there is a "by" argument with several fields, there will be a returned result for every distinct tuple of values for the fields. Each result contains all the "by" fields, as well as a field for each aggregator argument.
stats supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
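The grouping rule described above (one result with no by argument, one per distinct value with a single by field, one per distinct tuple with several) can be sketched in Python. The function name stats_count_avg and the sample events are hypothetical, and only count and avg are computed.

```python
from collections import defaultdict

def stats_count_avg(events, field, by=()):
    groups = defaultdict(list)
    for ev in events:
        # The group key is the tuple of by-field values; () means one group.
        groups[tuple(ev[b] for b in by)].append(ev)
    results = []
    for key, members in groups.items():
        values = [m[field] for m in members if field in m]
        row = dict(zip(by, key))  # each result carries its by fields
        row["count"] = len(values)
        row[f"avg({field})"] = sum(values) / len(values) if values else None
        results.append(row)
    return results

# Hypothetical sample events.
events = [
    {"host": "a", "user": "u1", "size": 10},
    {"host": "a", "user": "u2", "size": 30},
    {"host": "b", "user": "u1", "size": 20},
]
overall = stats_count_avg(events, "size")
per_host = stats_count_avg(events, "size", by=("host",))
per_pair = stats_count_avg(events, "size", by=("host", "user"))
```

Unlike eventstats, the original events are not returned; only one row per group remains.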
Syntax
stats [stat-operator [as new-field-name] ]... [by groupby-field(s)]
Arguments
| groupby-fields | field,field,... | Specifies the fields to group events by. One result is returned per distinct combination of values of the fields. |
stat-operator
| stat-operator | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | list | values | range | Specifies the statistical operation to perform. |
| count | c | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min (field) | Find the minimum value of values in the specified field(s). |
| max | max (field) | Find the maximum value of values in the specified field(s). |
| avg | avg (field) | Find the average value of values in the specified field(s). |
| mean | mean (field) | Find the mean value of values in the specified field(s). |
| mode | mode (field) | Find the mode value of values in the specified field(s). |
| median | median (field) | Find the median value of values in the specified field(s). |
| stdev | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var | var (field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
| list | list (field) | List the values of a field in a multi-value field (called list(fieldname)). |
| values | values (field) | List all distinct values of a field as a multi-value field value (in the field values(fieldname)). |
| range | Coming Soon |
Examples
Splunk Web:
Search the access logs, and report the count of the number of hits from the top 100 referer domains.
sourcetype="access_combined" | top limit=100 referer_domain | stats sum(count)
CLI:
For each unique time, give the average of any unique field that ends with the string 'lay' (e.g. delay, xdelay, relay, etc.).
./splunk search "* | stats avg(*lay) BY _time"
Note: The stats command replaces the deprecated select functionality to handle groupby calculations.
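The grouping rule described for stats (one result per distinct tuple of "by" values, each carrying the "by" fields plus one field per aggregator) can be pictured with a small Python model. This is a sketch only, not how Splunk implements the command. The function signature, the (name, function, field) triples, and the decision to skip events that lack a "by" field are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

def stats(events, aggregations, by=()):
    """Group events by the `by` fields and apply each aggregator per group.

    aggregations: list of (output_name, function, field) triples (hypothetical shape).
    """
    groups = defaultdict(list)
    for event in events:
        # Assumption: events missing any "by" field are left out of the grouping.
        if all(name in event for name in by):
            groups[tuple(event[name] for name in by)].append(event)
    results = []
    for key, members in groups.items():
        row = dict(zip(by, key))  # each result contains all the "by" fields
        for output_name, func, field in aggregations:
            values = [e[field] for e in members if field in e]
            row[output_name] = func(values) if values else None
        results.append(row)
    return results

events = [
    {"host": "a", "delay": 2},
    {"host": "a", "delay": 4},
    {"host": "b", "delay": 10},
]
per_host = stats(events, [("count", len, "delay"), ("avg(delay)", mean, "delay")], by=("host",))
overall = stats(events, [("count", len, "delay")])
```

Here per_host holds one row per distinct host, while overall holds a single row because no "by" field was given.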
strcat
Combine any number of field values and strings of text together into a single field to create more meaningful data from your search results. For example, you can use strcat to combine the source and destination IP address fields in your search results to create a chart of IP address pairings.
host="mailserver" | strcat sourceIP "/" destIP comboIP | chart count by comboIP
Syntax
strcat [required] sources destination
Arguments
| required | allrequired=(T | F) (F) | If set to true (T), requires that all of the source fields exist for a given event to write out the destination field. By default it's set to false (F). |
| sources | ("string" | field) ... ("string" | field name) | A space-delimited list of strings or fields to combine together. Strings are combined in the same order they are listed. |
| destination | field | Field to store the combined value in. This is always the last field listed. |
Examples
Splunk Web:
This example searches for all data from "access" sourcetype, then combines the host field with "::" and the port field. The combined strings are stored in the last field listed: address (values will be: host::port).
sourcetype=access | strcat host "::" port address
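To make the roles of sources, destination, and allrequired concrete, here is a minimal Python sketch of the concatenation for a single event. It is an illustrative model, not Splunk's implementation. The tagged-tuple representation of sources is invented for the example, and the behavior when a source field is missing and allrequired is false (the field contributes nothing) is an assumption.

```python
def strcat(event, sources, destination, allrequired=False):
    """Concatenate literal strings and field values into `destination`.

    sources: list of ("field", name) or ("string", literal) pairs (hypothetical shape).
    """
    parts = []
    for kind, value in sources:
        if kind == "string":
            parts.append(value)
        elif value in event:
            parts.append(str(event[value]))
        elif allrequired:
            return event  # a source field is missing: destination is not written
        # Assumption: with allrequired=False a missing field adds nothing.
    result = dict(event)
    result[destination] = "".join(parts)  # parts are combined in listed order
    return result

sources = [("field", "host"), ("string", "::"), ("field", "port")]
combined = strcat({"host": "web1", "port": 8080}, sources, "address")
```

In this sketch combined["address"] is the host value, then "::", then the port value, matching the host::port form described in the example above.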
timechart
Create a time series chart and corresponding table of statistics generated from data obtained from search results. Add timechart to any search in Splunk Web, and use its options to define any series of data you want to plot. Add timechart to CLI searches to generate a table to export to an external charting tool (you can't generate a graphical chart in the CLI).
Use a time series chart to efficiently display large amounts of data for meaningful analysis. There are many useful ways to apply a time series chart. Use timechart to see any of your data in a time series. For example, use timechart to see a sum or statistic of a numerical field over time, show a simple count of events over time, or see the number of distinct values of a given field over time.
Define a series
A series in a time series chart is a collection of events collected over equal intervals of time. More specifically, a timechart series is a statistic applied to a field (using the stat-operator argument) shown over time. Each series shown in a chart is a value of whatever field that's in the stat-operator argument. You can have more than one series in a chart by splitting the field values based on another field value (using the split-by-clause). For example, if you want to see the number of events that occur on each of your hosts over time, create a time series chart that counts the number of occurrences of _raw (raw events) per host. The following example counts the number of events that occur, and splits the series by the host field value.
* | timechart count(_raw) by host
Control the consistency of a series
When you create more than one series of data (using the split-by-clause), some events from the original series may not have a value in the split-by field. The resulting series that's shown on your chart may look inconsistent and reflect its trend inaccurately over time. Use a split-by-option to define bucketing-options to correctly scale the series, or add null values to make your time series more consistent. For example, use this if you want to see the number of products purchased over time, and show how many of each product are being purchased. To see this accurately, you don't want your resulting time series to include null values (usenull is set to true by default).
sourcetype=access_combined | timechart span=1m count(_raw) by product_id usenull=f
Control chart resolution
Control the resolution of a chart by controlling the number or size of bins (a bin is a grouping of rows from the table that timechart generates from its stat-operator). Use a bucketing-option to set the maximum number of bins to display or set the span length of the bins in a chart. If you specify both the number of bins, and the span length, then span length takes precedence.
For example, if a time series chart of the number of events occurring by host value does not show a chart with enough resolution to see all of the events from each host, then set a relatively small span value (using span-length time scale units).
* | timechart span=1m count(_raw) by host
Note: timechart supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
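As a rough picture of how a span and a split-by field combine, the Python sketch below counts events per time bin and per series. It is a simplified model rather than Splunk's implementation. It assumes each event carries an integer _time in epoch seconds, and the function and parameter names are hypothetical apart from usenull and nullstr, which mirror the options described in this section.

```python
from collections import defaultdict

def timechart_count(events, span_seconds, by, usenull=True, nullstr="NULL"):
    """Count events per time bin and per value of the split-by field `by`."""
    table = defaultdict(lambda: defaultdict(int))
    for event in events:
        series = event.get(by)
        if series is None:
            if not usenull:
                continue  # usenull=f: drop events without the split-by field
            series = nullstr  # otherwise they form their own labeled series
        # Align the event time to the start of its bin.
        bucket = event["_time"] - event["_time"] % span_seconds
        table[bucket][series] += 1
    return {bucket: dict(series) for bucket, series in sorted(table.items())}

events = [
    {"_time": 0, "host": "a"},
    {"_time": 59, "host": "a"},
    {"_time": 60, "host": "b"},
    {"_time": 61},
]
by_minute = timechart_count(events, 60, "host")
```

A smaller span_seconds produces more, narrower bins, which is the sense in which a small span raises the resolution of the chart.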
Syntax
timechart [bucketing-option]... stat-operator [split-by-clause]
Arguments
bucketing-option
Change the bucket size or time span to control resolution of the x-axis of a time series chart.
| bins | bins=integer (default=300) | Specify the maximum possible number of bins to display in a chart. Default is bins=300 for a time series chart. |
| span | span=integer span-length | Specify the span for buckets in the x-axis. Specify what units of time to use with the span-length option (For example: span=10mon, span=2d, span=5m). |
| fixedrange | fixedrange=T | F (default=T) | Set to true (T) to use the search time boundaries (start and finish of a search) as boundaries for time buckets. |
| cont | cont=T | F (default=T) | Set to true (T) to add empty buckets to the x-axis to make data in a chart appear more uniform. |
| start | start=integer | Set the minimum number of numerical buckets. |
| end | end=integer | Set the maximum number of numerical buckets. |
| length | length=integer span-length | Specify the span of time buckets (x-axis). |
stat-operator
Specify a statistic to display as a series.
| count | c | count|c (field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc (field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min (field) | Find the minimum value of values in the specified field(s). |
| max | max (field) | Find the maximum value of values in the specified field(s). |
| avg | avg (field) | Find the average value of values in the specified field(s). |
| mean | mean (field) | Find the mean value of values in the specified field(s). |
| mode | mode (field) | Find the mode value of values in the specified field(s). |
| median | median (field) | Find the median value of values in the specified field(s). |
| stdev | stdev (field) | Find the standard deviation of values in the specified field(s). |
| var | var (field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
split-by-clause
Split a series into more than one series by splitting it based on values of another field.
| split-by-clause | by field [split-by-option]... [where-clause] | Specify a field to split a series by. Optionally specify additional split-by-options, or filter with a where-clause. |
split-by-option
Change the behavior of a time series when splitting a series by values of another field.
| bucketing-option | See bucketing-option table for syntax. | Define bucket size or time span to control x-axis resolution. |
| usenull | usenull=T | F (default=T) | If set to true (T), create a series for events that don't contain the field of the split-by-clause. The series it creates is labeled by the value of the nullstr option (the default series label is "NULL"). |
| nullstr | nullstr=string (default="NULL") | Specify a value to label a series created by usenull. |
| useother | useother=T | F (default=F) | If set to true (T), add a series in the table for any series that doesn't match the where-clause. |
| otherstr | otherstr=string (default="OTHER") | Specify a value to label the series that aren't added to the time series chart because they don't match the where-clause. |
where-clause
Filter which series to display in a chart by filtering how values match the field in a split-by-clause.
| where-clause | where stat-operator (where-comparison | where-threshold) (Default= where sum in top 10) | |
| where-comparison | (in | notin) (top | bottom) integer | Specify a condition to match the values of a series that is split by the field in the split-by-clause. |
| where-threshold | (< | >) number | Specify a numeric threshold to match the values of a series that is split by the field in the split-by-clause. |
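The default where-clause (where sum in top 10) together with useother and otherstr can be modeled in a few lines of Python. This is a sketch under assumptions, not Splunk's implementation: series are held as a dictionary of equal-length value lists, and the series that do not match are assumed to be added together bin by bin into the single series labeled by otherstr.

```python
def filter_series(series, top=10, useother=False, otherstr="OTHER"):
    """Keep the `top` series ranked by their sum; optionally lump the rest.

    series: dict mapping series name -> list of per-bin values (hypothetical shape).
    """
    ranked = sorted(series, key=lambda name: sum(series[name]), reverse=True)
    kept, rest = ranked[:top], ranked[top:]
    result = {name: series[name] for name in kept}
    if useother and rest:
        # Assumption: non-matching series are summed bin by bin.
        result[otherstr] = [sum(vals) for vals in zip(*(series[name] for name in rest))]
    return result

series = {"a": [5, 5], "b": [1, 0], "c": [3, 3], "d": [0, 2]}
top_two = filter_series(series, top=2, useother=True)
```

With top=2, series a and c are kept and b and d are combined under the "OTHER" label.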
span-length
Specify the time units to use when choosing a time span.
| ts-sec | s, sec, secs, second, seconds | Time scale in seconds. |
| ts-min | m, min, mins, minute, minutes | Time scale in minutes. |
| ts-hr | h, hr, hrs, hour, hours | Time scale in hours. |
| ts-day | d, day, days | Time scale in days. |
| ts-month | mon, month, months | Time scale in months. |
Examples
Example 1: Change management
This example shows how to see the number of change tickets created in the last 24 hours split by the priority of the tickets.
tag=ticket | dedup key | rename file as path | rename host_accepted as host | convert timeformat="%Y/%m/%d %T" mktime(created) as _time | timechart count(_raw) by priority
Example 2: Security
This example shows the number of SSH login failures each month split by the source IP address.
sshd NOT success failed OR failure | timechart span=1mon count(eventtype) by source_ip usenull=f
Example 3: Graph the average "thruput" of hosts over time.
... | timechart span=5m avg(thruput) by host
Example 4: Create a timechart of average "cpu_seconds" by "host", and remove data (outlying values) that may distort the timechart's axis.
... | timechart avg(cpu_seconds) by host | outlier action=tf
Example 5: Calculate the average value of "CPU" each minute for each "host".
... | timechart span=1m avg(CPU) by host
Example 6: Create a timechart of the count of events from "web" sources by "host".
... | timechart count by host
Example 7: Compute the product of the average "CPU" and average "MEM" each minute for each "host".
... | timechart span=1m eval(avg(CPU) * avg(MEM)) by host
top
This data-processing command displays the most common values of a field, along with a count and percentage.
top supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
top [option]... field list
Arguments
option
| showcount | showcount=T | F (T) | If set, creates a field called "count" that holds the count. |
| showperc | showperc=T | F (T) | If set, creates a field called "percent" that holds the percentage of prevalence of values. |
| limit | limit=number(10) | Specifies how many values appear. Setting to "0" causes all values to be returned. |
| field list | field1,field2,... | Comma-separated list of fields to include. |
Examples
Splunk Web:
This example displays the most common 10 values of the url field.
| top url
CLI:
This example displays the most common 20 values of the url field.
./splunk search "* | top limit=20 url"
transaction
This data-processing command takes the results of a search and groups related events into transactions. This allows you to apply a pre-defined transaction to your search, or define specifications to create transactions during your search. You can use transaction with any search.
Transactions that are returned consist of: the raw text of each event, the shared event types, and the field values.
Use macro search with transactions to create transactions that work with macro substitution.
transaction supports multi-value fields. Configure multi-value fields at index time by editing fields.conf (Learn how to configure multi-value fields via fields.conf).
Syntax
transaction [name] [transaction-options]...
Arguments
| name | string | Name of the transaction definition as defined in transactiontypes.conf. If you specify a transaction name, you can override any attribute/value pairs you have set for that transaction by explicitly listing them in your search. |
transaction-options
| transaction-options | maxspan | maxpause | fields | aliases | pattern | match | start | end | Optional constraints to specify for transaction processing. |
| aliases | aliases=(A | B | C) | Specify a list of aliases to use with the pattern option. Defaults: A=login, B=purchase, C=logout. You can't use start and end options when using aliases. |
| fields | fields="[field], [field],..."(" ") | Specifies a list of fields by which the events will be grouped. For each field specified, events will be grouped if they contain common fields with identical values. Further, events with common field names and different values will not be grouped. For example: if "host" is a constraint, then a search result that has "host=mylaptop" can never be in the same transaction as a search result with "host=myserver". A search result that has no "host" value can be in a transaction with a result that has "host=mylaptop". Specify multiple fields in quotes, e.g. fields="field1, field2". |
| match | match= closest(closest) | Specifies the matching type to use with a transaction definition. The only value supported currently is: closest. |
| maxspan | maxspan=integer[s | m | h | d] | Specifies the constraint for the maximum span that a transaction can be. |
| maxpause | maxpause=integer[s | m | h | d] | Specifies the maximum pause between transactions. Requires there be no pause between a transaction's events greater than maxpause. If the value is negative, the maxpause constraint is disabled. The default maxpause is 2 seconds. If a pattern constraint is specified, the default maxpause is -1 (disabled). |
| pattern | pattern=regular expression | Defines a pattern of event types to be included in a transaction. |
| start | startswith= "string" | Specify a SQLite expression that must be true to begin a transaction. Strings must be quoted with " ". You can use SQLite wildcards (%) and use single quotes(' ') to specify a literal term. This syntax refers to an event type name, not an event string itself. |
| end | endswith= "string" | Specify a SQLite expression that must be true to end a transaction. Strings must be quoted with " ". You can use SQLite wildcards (%) and use single quotes(' ') to specify a literal term. This syntax refers to an event type name, not an event string itself. |
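The interaction of fields, maxspan, and maxpause can be pictured with a deliberately simplified Python model. It is not Splunk's algorithm. It handles a single grouping field, assumes integer _time values in seconds, ignores events that lack the field (the real command can still place such events in a transaction, as the fields row explains), and leaves out pattern, aliases, startswith, and endswith entirely. All function and variable names are hypothetical.

```python
def transactions(events, field, maxspan=None, maxpause=2):
    """Group events with the same `field` value into transactions (simplified model)."""
    open_tx = {}  # field value -> transaction still being extended
    done = []     # transactions closed by a constraint
    usable = [e for e in events if field in e]  # simplification: skip events without the field
    for event in sorted(usable, key=lambda e: e["_time"]):
        key = event[field]
        current = open_tx.get(key)
        if current:
            pause = event["_time"] - current[-1]["_time"]
            span = event["_time"] - current[0]["_time"]
            # A negative maxpause disables the pause constraint.
            pause_exceeded = maxpause is not None and maxpause >= 0 and pause > maxpause
            span_exceeded = maxspan is not None and span > maxspan
            if pause_exceeded or span_exceeded:
                done.append(current)
                current = None
        if current is None:
            current = []
            open_tx[key] = current
        current.append(event)
    done.extend(tx for tx in open_tx.values() if tx)
    return done

events = [
    {"_time": 0, "from": "a"},
    {"_time": 3, "from": "a"},
    {"_time": 4, "from": "b"},
    {"_time": 20, "from": "a"},
    {"_time": 40, "from": "a"},
]
grouped = transactions(events, "from", maxspan=30, maxpause=5)
```

In this model the first two events from sender a form one transaction, and each later event from a starts a new one because the pause since the previous event exceeds 5 seconds.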
Note: Use escaped quotes (\") when you specify values that contain quotes for the start and end options.
For a start value of attr="value":
startswith=attr=\"value\"
Note: The transaction command should not be used when you want to compute aggregate statistics over transactions defined by a unique identifier. For example, if you want to find the "longest" transactions, where the field "trade_id" defines the transactions, the following search is far more efficient:
| stats min(_time) as earliest max(_time) as latest by trade_id | eval duration = latest - earliest | sort -duration
Examples
Splunk Web:
This example searches for transactions that have a maximum span of 30 seconds, have a pause between transactions no greater than 5 seconds, and have matching from fields. For example, this search will return all events from the same sender occurring within 30 seconds of each other.
| transaction fields=from maxspan=30s maxpause=5s
transpose
The transpose command returns the specified number of rows as columns, where each row becomes a column. Takes a single optional integer argument that limits the number of rows we transpose (default = 5).
Syntax
transpose [<int>]
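The row-to-column swap can be sketched in Python as follows. This is an illustrative model only. The output field names ("column", "row 1", "row 2", ...) are assumptions made for the example, as is the use of None for values a row does not have.

```python
def transpose(rows, limit=5):
    """Turn the first `limit` result rows into columns (illustrative model)."""
    rows = rows[:limit]
    # Collect field names in first-seen order; each becomes one output row.
    fields = []
    for row in rows:
        for name in row:
            if name not in fields:
                fields.append(name)
    out = []
    for name in fields:
        new_row = {"column": name}
        for index, row in enumerate(rows, start=1):
            new_row[f"row {index}"] = row.get(name)  # each input row becomes a column
        out.append(new_row)
    return out

rows = [{"host": "a", "count": 3}, {"host": "b", "count": 7}]
flipped = transpose(rows)
```

Each original field (host, count) becomes one output row, and each original row becomes one column.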
Examples
Turn the first five rows of your search results into columns.
... | transpose
typelearner
This data-processing command generates a list of queries based on search results, to use as event types. It will create a "search=..." field in your results that contains a search for keywords and a punctuation pattern associated with that event.
Syntax
typelearner
Arguments
None.
Examples
Splunk Web:
This example searches all events, takes the last 20 events, and applies the event type learner. The event type learner will add a field to the results that contains a search that searches for keywords and the punctuation type for each event.
| tail 20 | typelearner
xmlunescape
This data-processing command un-escapes the XML entity references (for: &, >, and <) back to their corresponding characters in your search results. You can specify how many search results to unescape XML from by using the max-inputs argument.
Syntax
xmlunescape [max-inputs]
Arguments
| max-inputs | maxinputs=integer (100) | Sets how many results (starting from the top) are passed to xmlunescape. |
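The effect of un-escaping can be reproduced outside Splunk with Python's standard library, which may help when checking what a result should look like. The sketch below is not the command's implementation. It treats each result as a plain string, and it assumes that results beyond maxinputs are passed through unchanged, which the argument description does not state.

```python
from xml.sax.saxutils import unescape

def xmlunescape(results, maxinputs=100):
    """Un-escape &amp;, &lt; and &gt; in the first `maxinputs` results."""
    head = [unescape(text) for text in results[:maxinputs]]
    # Assumption: results past maxinputs are returned as they were.
    return head + list(results[maxinputs:])

cleaned = xmlunescape(["a &lt; b &amp;&amp; c &gt; d"])
```

The standard-library unescape function handles exactly the three entity references named in the description.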
Examples
Splunk Web:
This example searches for all events from the source "xml_escaped", then unescapes XML characters for &, >, and < in all search results.
source="xml_escaped" | xmlunescape
This documentation applies to the following versions of Splunk: 3.3, 3.3.1, 3.3.2, 3.3.3, 3.3.4, 3.4, 3.4.1, 3.4.2, 3.4.3, 3.4.5, 3.4.6, 3.4.8, 3.4.9, 3.4.10, 3.4.11, 3.4.12, 3.4.13, 3.4.14