Types of searches
When you search, you are usually trying to accomplish one of two things: you are either retrieving events from an index or summarizing results into a tabular or other visualization format. Because of this, you might hear us refer to two types of searches:
- Raw event searches are searches that just retrieve events from an index or indexes and are typically done when you want to analyze a problem. Some examples of these searches include: checking error codes, correlating events, investigating security issues, and analyzing failures. These searches do not usually include search commands, and the results are typically a list of raw events.
- Report-generating searches are searches that perform some type of statistical calculation against a set of results. In these searches, you first retrieve events from an index and then pass them into one or more search commands. These searches always require fields and at least one command from a set of statistical commands. Some examples include: getting a daily count of error events, counting the number of times a specific user has logged in, or calculating the 95th percentile of field values.
Whether you're retrieving raw events or building a report, you should also consider whether you are running a search for sparse or dense information:
- Sparse searches are searches that look for a single event or an event that occurs infrequently within a large set of data. You've probably heard these referred to as "needle in a haystack" or "rare term" searches. Some examples of these searches include: searching for a specific and unique IP address or error code. When running a sparse search, use the Search (timeline) view, because it attempts to extract as much relevant information as possible from the event(s), which helps you find the unique event.
- Dense searches are searches that scan through and report on many events. Some examples of these searches include: counting the number of errors that occurred or finding all events from a specific host. When running a dense search that reports on events across a lot of data, use the Advanced Charting view. In this view, Splunk doesn't process all field information; it processes only the information it needs to complete your search.
A search consists of a series of commands, delimited by pipe (|) characters. The first whitespace-delimited string after each pipe character controls the command used. The remainder of the text for each command is handled in a manner specific to the given command.
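The splitting rule above can be sketched in a few lines of Python. This is an illustrative model only, not Splunk's parser: the function name split_pipeline is hypothetical, labeling the text before the first pipe as "search" is an assumption made for readability, and the handling of quotes and backslashes follows the rules given under Escaping and Quoting.

```python
from typing import List, Tuple


def split_pipeline(search: str) -> List[Tuple[str, str]]:
    """Split a search string into (command, remainder) pairs.

    Illustrative model only. Pipes inside quotes or preceded by a
    backslash do not split commands.
    """
    segments, current = [], []
    in_quotes = False
    i = 0
    while i < len(search):
        ch = search[i]
        if ch == "\\" and i + 1 < len(search):
            # Keep the escape sequence intact; the command receives it.
            current.append(search[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "|" and not in_quotes:
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    segments.append("".join(current))

    commands = []
    for index, segment in enumerate(segments):
        text = segment.strip()
        if index == 0:
            # Assumption: label the leading segment "search" for display.
            commands.append(("search", text))
            continue
        # The first whitespace-delimited string names the command.
        parts = text.split(None, 1)
        name = parts[0] if parts else ""
        rest = parts[1] if len(parts) > 1 else ""
        commands.append((name, rest))
    return commands


print(split_pipeline("error | stats count"))
```

In this model, each command receives its remainder unparsed, which mirrors the statement that the rest of the text is handled in a command-specific manner.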
Escaping and Quoting
The quotation mark character (") is used to group text, which allows embedded spaces, other whitespace, and embedded pipes to be part of a string. Quotes must be balanced: an opening quote must be followed by an unescaped closing quote.
For example, a search such as
error | stats count
finds the number of events containing the string error. In contrast, the search
"error | stats count"
finds the events containing error, a pipe, stats, and count, in that order, and returns the raw events.
The backslash character (\) is used to escape quotes, pipes, and itself.
The sequence \| as part of a search sends a pipe character to the command, instead of having the pipe split between commands. The sequence \" sends a literal quote to the command, for example, when searching for a literal quotation mark or inserting a literal quotation mark into a field using rex. The sequence \\ is available as a literal backslash in the command.
Unrecognized backslash sequences are not altered. For example \s in a search string will be available as \s to the command, because \s is not a known escape sequence. However, in the search string \\s will be available as \s to the command, because \\ is a known escape sequence that is converted to \.
Backslash escape sequences are still expanded inside quotes. Example outcomes for a field named myfield:
- myfield is created with the value of 6
- myfield is created with the value of "
- myfield is created with the value of \
- error: unbalanced quotes.
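The escaping rules can be modeled with a short Python sketch. The functions expand_escapes and quoted_value are hypothetical names, and the four quoted inputs at the end are assumptions chosen to reproduce the outcomes listed above (6, a quotation mark, a backslash, and an unbalanced-quotes error). They are not necessarily the searches that originally produced those outcomes.

```python
def expand_escapes(text: str) -> str:
    """Expand the three known sequences; leave unknown ones unaltered."""
    known = {'\\"': '"', "\\|": "|", "\\\\": "\\"}
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in known:
            out.append(known[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def quoted_value(token: str) -> str:
    """Return the expanded content of a token that opens with a quote.

    Raises ValueError when no unescaped closing quote follows.
    """
    if not token.startswith('"'):
        raise ValueError("expected an opening quote")
    i = 1
    while i < len(token):
        if token[i] == "\\" and i + 1 < len(token):
            i += 2  # skip the escaped character
            continue
        if token[i] == '"':
            return expand_escapes(token[1:i])
        i += 1
    raise ValueError("unbalanced quotes")


# Assumed inputs, written as raw strings so Python adds no escaping of its own.
for token in ['"6"', r'"\""', r'"\\"', r'"\"']:
    try:
        print(token, "->", quoted_value(token))
    except ValueError as err:
        print(token, "-> error:", err)
```

The last input fails because its only candidate closing quote is escaped, so the opening quote is never closed.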
Events and results flowing through the Splunk search pipeline exist as a collection of fields. Fields can come from the Splunk index (for example, _time as the time of the event or source as the filename), or they can be derived from a wide variety of sources at search time, such as eventtypes, tags, regex extractions using the rex command, and totals coming from the stats command.
For a given event, a given field name may be present or absent. If present, it may contain a single value or multiple values. Each value is a text string. Values may be of positive length (a string, or text) or zero length (empty strings, or "").
Numbers are just strings that contain the number. For example, a field containing the number 10 holds the characters 1 and 0: "10". Commands that take numbers from values automatically convert them internally to numbers for calculations.
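As a rough illustration of this convert-on-use behavior, the Python sketch below stores values as strings and converts them only when a calculation needs numbers. The helper names to_number and total are hypothetical. Using Python's float() as the conversion, and skipping values that do not parse, are assumptions: this page does not specify which strings a command treats as numeric.

```python
from typing import List, Optional


def to_number(value: str) -> Optional[float]:
    """Convert a string value to a number, or return None if it is not one.

    Assumption: float() stands in for whatever conversion a command applies.
    """
    try:
        return float(value)
    except ValueError:
        return None


def total(values: List[str]) -> float:
    """Sum the values that convert to numbers, skipping the rest."""
    numbers = (to_number(v) for v in values)
    return sum(n for n in numbers if n is not None)


stored = ["10", "5", "abc"]  # every value is held as text
print(total(stored))
```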
We use the following terms in Splunk documentation and messages to refer to these scenarios.
- Null field: A field that is not present on a particular result or event. Other events or results in the same search may have values for this field. For example, the fillnull search command is used to add a field and default value to events or results which lack fields present on other events or results in the search.
- Empty field: Shorthand for a field which contains a single value that is the empty string.
- Empty value: A value that is the empty string, or ""; a zero-length string.
- Multivalue field: A field which has more than one value. All non-null fields contain an ordered list of strings. The common case is that this is a list of one value. When the list contains more than one entry, we call this a multivalue field. For more information, see "Manipulate and evaluate fields with multiple values" in the User Manual.
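These terms map onto a small data model, sketched here in Python. The Event type and the helper names describe_field and fill_null are hypothetical. Treating an empty list like a missing field is an assumption, as is the default fill value of "0". The fill_null helper only imitates the behavior described for the fillnull command and does not reproduce its options.

```python
from typing import Dict, List

# Assumption: an event maps each present field name to an ordered list of strings.
Event = Dict[str, List[str]]


def describe_field(event: Event, name: str) -> str:
    """Classify a field on one event using the terms defined above."""
    if name not in event or not event[name]:
        return "null"
    values = event[name]
    if len(values) > 1:
        return "multivalue"
    if values == [""]:
        return "empty"
    return "single value"


def fill_null(events: List[Event], default: str = "0") -> List[Event]:
    """Give every event each field seen on any event, using a default value."""
    all_names = sorted({name for event in events for name in event})
    return [
        {name: event.get(name, [default]) for name in all_names}
        for event in events
    ]


events = [
    {"host": ["web1"], "user": ["alice", "bob"], "note": [""]},
    {"host": ["web2"]},
]
for name in ("host", "user", "note"):
    print(name, [describe_field(event, name) for event in events])
print(fill_null(events)[1])
```

In this model, an empty field and a null field stay distinct: note is present on the first event with a zero-length value, while it is absent from the second event until fill_null adds it.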
This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7