Anatomy of a search
A search consists of a series of commands that are delimited by pipe ( | ) characters. The first whitespace-delimited string after each pipe character controls the command used. The remainder of the text for each command is handled in a manner specific to the given command.
This topic discusses the anatomy of a Splunk search, the syntax rules that the commands share, and the syntax rules for fields and field values.
The anatomy of a search
To better understand how search commands act on your data, it helps to visualize all your indexed data as a table. Each search command redefines the shape of your table.
For example, let's take a look at the following search.
sourcetype=syslog ERROR | top user | fields - percent
The Disk represents all of your indexed data. The Disk is a table of a certain size with columns that represent fields and rows that represent events. The first intermediate results table shows fewer rows--representing the subset of events retrieved from the index that matched the search terms "sourcetype=syslog ERROR". The second intermediate results table shows fewer columns, representing the results of the top command, "top user", which summarizes the events into a list of the top 10 users and displays the user, count, and percentage. Then, "fields - percent" removes the column that shows the percentage, so you are left with a smaller final results table.
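The table view described above can be modeled in a few lines of Python. This is only a rough sketch, not how Splunk software is implemented: the sample events, the function names, and the plain substring match for ERROR are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample data: each dict is one event (a table row),
# and each key is a field (a table column).
events = [
    {"sourcetype": "syslog", "_raw": "ERROR disk full", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "ERROR timeout", "user": "bob"},
    {"sourcetype": "syslog", "_raw": "ERROR timeout", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "INFO login ok", "user": "carol"},
    {"sourcetype": "access", "_raw": "ERROR 500", "user": "dave"},
]

def search_terms(rows):
    """Stage 1: keep only the rows that match sourcetype=syslog ERROR (fewer rows)."""
    return [r for r in rows if r["sourcetype"] == "syslog" and "ERROR" in r["_raw"]]

def top_user(rows, limit=10):
    """Stage 2: summarize the rows into user, count, and percent columns."""
    counts = Counter(r["user"] for r in rows)
    total = sum(counts.values())
    return [
        {"user": user, "count": n, "percent": 100.0 * n / total}
        for user, n in counts.most_common(limit)
    ]

def fields_minus(rows, name):
    """Stage 3: drop one column from every row (fewer columns)."""
    return [{k: v for k, v in r.items() if k != name} for r in rows]

# Each stage takes the previous table as input, like commands joined by pipes.
result = fields_minus(top_user(search_terms(events)), "percent")
print(result)
```

Each function returns a new, smaller table, which mirrors how each command in the search reshapes the results it receives.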
About the search pipeline
The "search pipeline" refers to the structure of a Splunk search, in which consecutive commands are chained together using a pipe character, "|". The pipe character tells Splunk software to use the output or result of one command (to the left of the pipe) as the input for the next command (to the right of the pipe). This enables you to refine or enhance the data at each step along the pipeline until you get the results that you want.
A Splunk search starts with search terms at the beginning of the pipeline. These search terms are keywords, phrases, boolean expressions, key/value pairs, etc. that specify which events you want to retrieve from the index(es). See "About retrieving events".
The retrieved events can then be passed as inputs into a search command using a pipe character. Search commands tell Splunk software what to do to the events after you retrieved them from the index(es). For example, you might use commands to filter unwanted information, extract more information, evaluate new fields, calculate statistics, reorder your results, or create a chart. Some commands have functions and arguments associated with them. These functions and their arguments enable you to specify how the commands act on your results and which fields to act on; for example, how to create a chart, what kind of statistics to calculate, and what fields to evaluate. Some commands also enable you to use clauses to specify how you want to group your search results.
- For more information about what you can do with search commands, see "About the search processing language".
- For a list of search commands, see the "Command quick reference" in the Search Reference. For syntax and usage information, see the individual search command reference topics.
Quotes and escaping characters
Generally, you need quotes around phrases and field values that include white spaces, commas, pipes, quotes, or brackets. Quotes must be balanced: an opening quote must be followed by an unescaped closing quote. For example:
- A search such as
error | stats count
will find the number of events containing the string error.
- A search such as
... | search "error | stats count"
would return the raw events containing the literal string error, a pipe character ( | ), stats, and count, in that order.
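To make the difference between the two searches concrete, here is a small Python sketch of a splitter that treats a pipe as a command delimiter only when it is outside quotes and not escaped. The function names split_pipeline and command_name are hypothetical, and the logic is a simplified model of the rules described in this topic, not the actual Splunk parser.

```python
def split_pipeline(search):
    """Split a search string on pipes that are outside quotes and not escaped."""
    segments, current = [], []
    in_quotes = False
    i = 0
    while i < len(search):
        ch = search[i]
        if ch == "\\" and i + 1 < len(search):
            # Keep the backslash and the character it escapes together.
            current.append(search[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "|" and not in_quotes:
            # An unquoted, unescaped pipe ends the current segment.
            segments.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    segments.append("".join(current).strip())
    return segments

def command_name(segment):
    """Return the first whitespace-delimited string of a segment that follows a pipe."""
    parts = segment.split()
    return parts[0] if parts else ""

print(split_pipeline('error | stats count'))
print(split_pipeline('... | search "error | stats count"'))
```

In this model the first search splits into two segments, while the quoted pipe in the second search stays inside the argument to the search command.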
Additionally, you want to use quotes around keywords and phrases if you don't want to search for their default meaning, such as Boolean operators and field-value pairs. For example:
- A search for the keyword AND without meaning the Boolean operator:
error "AND"
- A search for this field-value phrase:
error "startswith=Error"
The backslash character (\) is used to escape quotes, pipes, and itself. Backslash escape sequences are still expanded inside quotes. For example:
- The sequence \| as part of a search will send a pipe character to the command, instead of having the pipe split between commands.
- The sequence \" will send a literal quote to the command, for example for searching for a literal quotation mark or inserting a literal quotation mark into a field using rex.
- The \\ sequence will be available as a literal backslash in the command.
If Splunk software does not recognize a backslash sequence, it will not alter it.
- For example \s in a search string will be available as \s to the command, because \s is not a known escape sequence.
- However, in the search string \\s will be available as \s to the command, because \\ is a known escape sequence that is converted to \.
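The escape rules above can be sketched as a short Python function. This is an illustrative model only: the name expand_escapes is hypothetical, and the set of known sequences is limited to the three that this topic names (an escaped pipe, quote, and backslash).

```python
def expand_escapes(text):
    """Expand the known backslash sequences and leave unknown ones unaltered."""
    known = {"|", '"', "\\"}  # characters that a backslash is described as escaping
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in known:
            # Known sequence: drop the backslash, keep the escaped character.
            out.append(text[i + 1])
            i += 2
        else:
            # Ordinary character, or a backslash that starts an unknown sequence.
            out.append(text[i])
            i += 1
    return "".join(out)

print(expand_escapes(r"\s"))   # unknown sequence, passed through as-is
print(expand_escapes(r"\\s"))  # known sequence \\ collapses to a single backslash
```

Both calls print the same two characters, a backslash followed by s, which matches the two bullet points above.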
Asterisks ( * ) cannot be searched for using a backslash to escape the character. Splunk software treats the asterisk character as a major breaker. Because of this, it will never be in the index. If you want to search for the asterisk character, you will need to run a post-filtering regex search on your data:
index=_internal | regex ".*\*.*"
For more information about major breakers, read "Overview of event processing" in the Getting Data In manual.
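As a rough illustration of what the post-filtering step does, the following Python sketch applies the same pattern to a few made-up strings. Python's re module is used here as a stand-in for the regular expression engine behind the regex command, which is an assumption; for this simple pattern the two are expected to behave the same way. The sample strings are hypothetical.

```python
import re

# The same pattern as in the search above: any text, a literal asterisk, any text.
pattern = re.compile(r".*\*.*")

# Hypothetical raw event text.
events = ["rate = 3 * 4", "no star here", "glob: *.log"]

# Keep only the events whose text contains a literal asterisk.
matching = [e for e in events if pattern.search(e)]
print(matching)
```

The backslash in the pattern escapes the asterisk for the regular expression, so it matches the character itself instead of acting as a quantifier.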
Examples
Example 1: The myfield field is created with the value of 6.
... | eval myfield="6"
Example 2: The myfield field is created with the value of ".
... | eval myfield="\""
Example 3: The myfield field is created with the value of \.
... | eval myfield="\\"
Example 4: This search would produce an error because of unbalanced quotation marks.
... | eval myfield="\"
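The four examples can be checked against the balanced-quote rule with a small Python sketch. The function quotes_balanced is hypothetical and models only the rule stated in this topic: a quote preceded by an escaping backslash does not open or close a quoted string.

```python
def quotes_balanced(text):
    """Return True if every unescaped opening quote has an unescaped closing quote."""
    in_quotes = False
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if text[i] == '"':
            in_quotes = not in_quotes
        i += 1
    return not in_quotes

# The eval expressions from Examples 1 through 4.
examples = ['myfield="6"', 'myfield="\\""', 'myfield="\\\\"', 'myfield="\\"']
for expression in examples:
    print(expression, quotes_balanced(expression))
```

Only the last expression is reported as unbalanced, because its second quote is escaped and so never closes the string.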
Fields
Events and results flowing through the Splunk search pipeline exist as a collection of fields. Fields can come from the Splunk index, for example, _time as the time of the event, source as the filename, and so on. Fields can also be derived from a wide variety of sources at search time, such as event types, tags, regex extractions using the rex command, totals coming from the stats command, and so on.
For a given event, a given field name might be present or absent. If present, it might contain a single value or multiple values. Each value is a text string. Values might be of positive length (a string, or text) or zero length (empty strings, or "").
Numbers are strings that contain the number. For example, a field containing a value of the number 10 contains the characters 1 and 0: "10". Commands that take numbers from values automatically convert them internally to numbers for calculations.
- Null field
- A null field is not present on a particular result or event. Other events or results in the same search might have values for this field. For example, the fillnull command adds a field and default value to events or results that lack fields present on other events or results in the search.
- Empty field
- An empty field is shorthand for a field that contains a single value that is the empty string.
- Empty value
- A value that is the empty string, or "". You can also describe this as a zero-length string.
- Multivalue field
- A field that contains more than one value. For example, events such as email logs often have multivalue fields in the To: and Cc: information. See Manipulate and evaluate fields with multiple values in the Search Manual.
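The distinctions above can be modeled in Python by treating each result as a dictionary that maps a field name to a list of string values. This representation, the sample rows, and the fill_null helper are assumptions made for illustration; the helper only imitates the behavior described for the fillnull command, using "0" as an assumed default value.

```python
# Each result maps a field name to a list of string values.
rows = [
    # "to" is a multivalue field; "bytes" holds the number 10 as the string "10".
    {"user": ["alice"], "to": ["bob@example.com", "carol@example.com"], "bytes": ["10"]},
    # "user" is an empty field (one empty value); "to" and "bytes" are null (absent).
    {"user": [""]},
]

def fill_null(results, value="0"):
    """Add every field seen on any result to the results that lack it."""
    all_fields = []
    for row in results:
        for name in row:
            if name not in all_fields:
                all_fields.append(name)
    return [{name: row.get(name, [value]) for name in all_fields} for row in results]

filled = fill_null(rows)
print(filled[1])

# A command that needs a number converts the string value before calculating.
print(int(rows[0]["bytes"][0]) + 1)
```

The empty user field on the second row is left alone, because an empty field is present on the result, while the absent fields receive the default value.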
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408