How search commands work
This documentation does not apply to the most recent version of Splunk.
This topic describes how your search results are changed after they pass through different types of search commands.
The search command pipeline
Splunk's search language uses the pipe operator "|" to chain together a series (or pipeline) of search commands. Splunk evaluates a search from left to right: the results of the command to the left of a pipe operator are fed into the command to its right. The exception to this general rule is a subsearch, which is evaluated first; for more information, see How subsearches work.
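The left-to-right flow can be pictured with a small Python model. This is only a sketch of the idea, not Splunk code: the function names and the sshd event are hypothetical, and each "command" is simply a function that takes a list of rows and returns a new list.

```python
from functools import reduce

# Hypothetical stand-ins for search commands: each takes a list of rows
# (dicts) and returns a new list of rows.
def keep_sendmail(rows):
    return [r for r in rows if "sendmail" in r["_raw"]]

def newest_first(rows):
    return sorted(rows, key=lambda r: r["_time"], reverse=True)

def run_pipeline(rows, commands):
    # Feed the output of each command into the next one, left to right.
    return reduce(lambda current, command: command(current), commands, rows)

# Illustrative events only; the sshd line is invented for contrast.
events = [
    {"_time": "11:24:11", "_raw": "sendmail[21998]: lost input channel"},
    {"_time": "11:25:00", "_raw": "sshd[311]: session opened"},
    {"_time": "11:25:13", "_raw": "sendmail[4669]: stat=Sent"},
]

result = run_pipeline(events, [keep_sendmail, newest_first])
print([r["_time"] for r in result])  # ['11:25:13', '11:24:11']
```

Swapping the order of the two functions gives the same rows here, but in general each command only sees what the previous one passed along.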
At the beginning of the pipeline, if your search doesn't begin with a leading pipe operator, the search command is implied. Generally, the more specific you are before the first pipe operator (the more specific you are about the events you want Splunk to retrieve), the more efficiently the rest of the search commands will return results.
Once you have retrieved the set of events that you are interested in, you can use Splunk's search language to manipulate your data. For example, you can do one or more of the following in a single pipeline:
- filter out the events that you don't want
- reorder the way your results are displayed
- evaluate your numerical data
- extract more information to enrich your understanding of your data
- transform your data into statistical results
In Splunk's Search app, you can view your search results as an event list or an event table. If you visualize your data as a table, each time you pass the search results through a different type of command, Splunk manipulates this table by removing or adding cells, columns, or rows. Also, the type of commands you can chain together depends somewhat on the data you have at any point in the pipeline.
All indexed data
To better understand how search commands act on your data, it helps to visualize all of your indexed data (before you search on it) as a table. While indexing your data, Splunk automatically adds fields that it recognizes in your data. These default fields, and any other fields that you extract after indexing, can be considered the columns of your table. Each individual event in your data can be considered a row in the table. The values of the fields for each event can be considered the table's cells.
Generate search results
A typical Splunk search begins with a generating command that results in the event data that you are interested in. By now, you should be comfortable with generating search results using literal strings, wildcards, Boolean expressions, field name and value expressions, and even using the UI interactively. If not, you can go back and review the Start searching topic in this manual.
When you run a search against the indexed data, Splunk retrieves events that match your search terms. The columns that Splunk matches against depend on the types of search terms you use. For example, if you run a search for literal strings, Splunk matches the keywords against the _raw column; if your search includes explicit field and value pairs, Splunk matches those specific cells in each row.
You can visualize the search results as a table with the same number of columns as the indexed data but fewer rows (the original table minus the rows that did not match the search string). Searching doesn't change any of the cell values.
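As a rough illustration of the two kinds of matching, the following Python sketch treats each event as a dict. It is a simplified model, not Splunk's implementation: the search function is hypothetical, keyword matching is reduced to a plain substring test on _raw, and the rows are abbreviated.

```python
def search(rows, keywords=(), **field_values):
    """Keep rows whose _raw contains every keyword and whose named cells match."""
    return [
        row for row in rows
        if all(k in row["_raw"] for k in keywords)
        and all(row.get(f) == v for f, v in field_values.items())
    ]

# Abbreviated, illustrative rows; real events carry many more columns.
index = [
    {"_raw": "sendmail[4668]: proto=ESMTP, daemon=MTA", "proto": "ESMTP", "daemon": "MTA"},
    {"_raw": "sendmail[21998]: proto=SMTP, daemon=MTA", "proto": "SMTP", "daemon": "MTA"},
    {"_raw": "sendmail[4669]: mailer=local, stat=Sent", "mailer": "local", "stat": "Sent"},
]

by_keyword = search(index, keywords=["MTA"])   # matched against the _raw column
by_field = search(index, proto="SMTP")         # matched against the proto cell
print(len(by_keyword), len(by_field))          # 2 1
```

In both cases the surviving rows are returned unchanged; only the number of rows shrinks.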
Let's look at some sample logs of email activity and performance. Splunk ships with sample sendmail data in "index=sample":
index=sample sendmail
The indexed logs may look something like this in Splunk's event list view:
Apr 8 11:25:13 splunk3 sendmail[4669]: n38IPBSB004668: to=<spamme@splunkit.com>, delay=00:00:02, xdelay=00:00:02, mailer=local, pri=31224, dsn=2.0.0, stat=Sent
Apr 8 11:25:11 splunk3 sendmail[4668]: n38IPBSB004668: from=<spammer@spamdomain.com>, size=1032, class=0, nrcpts=1, msgid=<200904081825.n38IPBMZ021146@virt2.int.splunk.com>, proto=ESMTP, daemon=MTA, relay=[64.127.105.34]
Apr 8 11:25:05 splunk3 sendmail[4647]: n38IP58A004647: lb1.int.splunk.com [10.2.1.2] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 8 11:24:11 splunk3 sendmail[21998]: n38HOAMr021998: from=<0407pc@163.com>, size=0, class=0, nrcpts=0, proto=SMTP, daemon=MTA, relay=61-231-65-253.dynamic.hinet.net [61.231.65.253]
Apr 8 11:24:11 splunk3 sendmail[21998]: n38HOAMr021998: lost input channel from 61-231-65-253.dynamic.hinet.net [61.231.65.253] to MTA after rcpt
If viewed as an event table in Splunk it may look something like:
_time                       daemon delay    from                     mailer msgid                                              nrcpts proto relay                           size stat to                    xdelay
--------------------------- ------ -------- ------------------------ ------ -------------------------------------------------- ------ ----- ------------------------------- ---- ---- --------------------- --------
2009-04-08 11:25:13.000 PDT        00:00:02                          local                                                                                                       Sent <spamme@splunkit.com> 00:00:02
2009-04-08 11:25:11.000 PDT MTA             <spammer@spamdomain.com>        <200904081825.n38IPBMZ021146@virt2.int.splunk.com> 1      ESMTP [64.127.105.34]                 1032
2009-04-08 11:25:05.000 PDT
2009-04-08 11:24:11.000 PDT MTA             <0407pc@163.com>                                                                   0      SMTP  61-231-65-253.dynamic.hinet.net 0
2009-04-08 11:24:11.000 PDT
Generating search commands: crawl, file, savedsearch, (implied) search.
Filter unwanted information
When you use the search command elsewhere in the pipeline, it acts as a filtering command. Filtering commands produce the same kind of result as a generating search: a smaller table. However, depending on the command that is used, the smaller table may have fewer rows or fewer columns.
Example: You can use the fields command to keep the fields you want or to remove the fields you don't want. This results in a table with fewer columns.
index=sample sendmail | fields _time, from, stat, to, size
_time                       from                     stat to                    size
--------------------------- ------------------------ ---- --------------------- ----
2009-04-08 11:25:13.000 PDT                          Sent <spamme@splunkit.com>
2009-04-08 11:25:11.000 PDT <spammer@spamdomain.com>                            1032
2009-04-08 11:25:05.000 PDT
2009-04-08 11:24:11.000 PDT <0407pc@163.com>                                    0
2009-04-08 11:24:11.000 PDT
Filtering commands also do not change any cell values.
Filtering commands: dedup, fields, head, localize, regex, search, set, tail, where.
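To make the "fewer columns" case concrete, here is a minimal Python sketch of a column filter. The fields function below is a hypothetical stand-in for the idea, not the Splunk command itself, and the rows are abbreviated versions of the sample events.

```python
def fields(rows, *names):
    # Keep only the named columns; the number of rows stays the same.
    return [{n: row[n] for n in names if n in row} for row in rows]

rows = [
    {"_time": "2009-04-08 11:25:13", "mailer": "local", "stat": "Sent", "to": "<spamme@splunkit.com>"},
    {"_time": "2009-04-08 11:25:11", "daemon": "MTA", "from": "<spammer@spamdomain.com>", "size": "1032"},
]

narrow = fields(rows, "_time", "from", "stat", "to", "size")
print(narrow[0])
# {'_time': '2009-04-08 11:25:13', 'stat': 'Sent', 'to': '<spamme@splunkit.com>'}
```

The mailer and daemon columns disappear, while every remaining cell keeps its original value.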
Extract more information
You can use extracting commands to add new fields, and even add new events. Extracting commands create new rows or columns from information found in the _raw column for each row.
Example: Use the rex command to extract the qid from the email data. This adds a new column to the table.
index=sample sendmail | rex field=_raw "sendmail\[\d+\]:\s+(?P<qid>\S+):\s"
Extracting commands: addinfo, extract/kv, iplocation, multikv, rex, top, typer, xmlkv.
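The effect of a named capture group can be tried outside Splunk. The sketch below uses Python's re module, which accepts the same (?P<name>...) group syntax, with a pattern written to match the sample events (process ID in brackets, then the qid followed by a colon and a space). The event strings are shortened, and the loop that adds a qid key is only a model of "adding a column".

```python
import re

# Capture the token after "sendmail[<pid>]: " up to the next ": " as qid.
QID = re.compile(r"sendmail\[\d+\]:\s+(?P<qid>\S+):\s")

# Shortened copies of the sample events.
raw_events = [
    "Apr 8 11:25:13 splunk3 sendmail[4669]: n38IPBSB004668: to=<spamme@splunkit.com>, stat=Sent",
    "Apr 8 11:25:11 splunk3 sendmail[4668]: n38IPBSB004668: from=<spammer@spamdomain.com>, size=1032",
    "Apr 8 11:24:11 splunk3 sendmail[21998]: n38HOAMr021998: lost input channel",
]

rows = []
for raw in raw_events:
    row = {"_raw": raw}
    match = QID.search(raw)
    if match:
        row["qid"] = match.group("qid")  # new column; _raw is left untouched
    rows.append(row)

print([row.get("qid") for row in rows])
# ['n38IPBSB004668', 'n38IPBSB004668', 'n38HOAMr021998']
```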
Evaluate your data
You can use evaluating commands to change specific column names or cell values. Depending on the command, an evaluating command may or may not add columns.
Example: Group events into transactions based on the sendmail qid.
index=sample sendmail | transaction qid
Evaluating commands: abstract, addtotals, bucket, cluster, collect, convert, correlate, diff, eval, eventstats, fillnull, format, kmeans, makemv, mvcombine, mvexpand, nomv, outlier, overlap, replace, strcat, transaction, typelearner, xmlunescape.
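Grouping by a shared field value can be sketched in Python as follows. This is a loose model only: it assumes a qid value is already present on each row, the transaction function and the event_count key are hypothetical names, and rows without the field are simply skipped, which may differ from what Splunk does.

```python
from collections import defaultdict

def transaction(rows, field):
    # Collect rows that share the same value of `field` into one group.
    groups = defaultdict(list)
    for row in rows:
        if field in row:
            groups[row[field]].append(row)
    return [
        {field: value, "event_count": len(members), "events": members}
        for value, members in groups.items()
    ]

events = [
    {"_time": "11:25:13", "qid": "n38IPBSB004668"},
    {"_time": "11:25:11", "qid": "n38IPBSB004668"},
    {"_time": "11:24:11", "qid": "n38HOAMr021998"},
]

grouped = transaction(events, "qid")
print([(g["qid"], g["event_count"]) for g in grouped])
# [('n38IPBSB004668', 2), ('n38HOAMr021998', 1)]
```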
Reorder your results
Use reordering commands to sort the rows of the entire table based on the values of the specified column name. These commands do not add or remove rows and do not change any cell values.
Reordering commands: reverse, sort.
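In the table model, reordering is the simplest case to check. The short Python sketch below uses illustrative rows and the built-in sorted function, not Splunk's sort command.

```python
rows = [
    {"from": "<spammer@spamdomain.com>", "size": 1032},
    {"from": "<0407pc@163.com>", "size": 0},
]

# Order the rows by the size column; no rows or cells are added or changed.
by_size = sorted(rows, key=lambda r: r["size"])
print([r["size"] for r in by_size])  # [0, 1032]
```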
Transform your data into statistical results
Transforming commands create an entirely new table of data. These commands change the specified cell values for each event into numerical values that Splunk can use for statistical purposes.
Transforming commands: chart, contingency, rare, stats, timechart, top.
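The following Python sketch shows what "an entirely new table" means in the table model: counting events by a field value. The count_by function and the count column name are hypothetical, and the rows are abbreviated; it is not how any Splunk command is implemented.

```python
from collections import Counter

def count_by(rows, field):
    # Build a new table: one row per distinct value, with a numeric count.
    counts = Counter(row[field] for row in rows if field in row)
    return [{field: value, "count": n} for value, n in counts.most_common()]

events = [
    {"_time": "11:25:13", "stat": "Sent"},
    {"_time": "11:25:11", "daemon": "MTA", "proto": "ESMTP"},
    {"_time": "11:24:11", "daemon": "MTA", "proto": "SMTP"},
]

summary = count_by(events, "daemon")
print(summary)  # [{'daemon': 'MTA', 'count': 2}]
```

The result no longer has one row per event or any of the original columns except the one being counted.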
This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6, 4.0.7, 4.0.8, 4.0.9, 4.0.10, 4.0.11.