How subsearches work
A subsearch is a search with a search pipeline as an argument. Subsearches are contained in square brackets and evaluated first. The result of the subsearch is then used as an argument in the primary or outer search. Subsearches are mainly used for two purposes:
- Parameterize one search, using the output of another search (for example, find every record from IP addresses that visited some specific URL).
- Run a separate search, but stitch the output to the first search using the append command.
The following is an example of using a subsearch to parameterize one search. You're interested in finding all events from the most active host in the last hour, but you can't search for a specific host because it might not be the same host every hour. First, you need to identify which host is most active:
sourcetype=syslog earliest=-1h | top limit=1 host | fields + host
This search returns only one host value. In this example, the result is the host named "crashy". Now, you can search for all the events coming from "crashy":
sourcetype=syslog host=crashy
But, instead of running two searches each time you want this information, you can use a subsearch to give you the hostname and pass it to the outer search:
sourcetype=syslog [search sourcetype=syslog earliest=-1h | top limit=1 host | fields + host]
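The evaluation order described above can be modeled outside Splunk. The following Python sketch is only an illustration: the in-memory events list, the top_host and outer_search helpers, and the host name "steady" are hypothetical, and nothing here calls a Splunk API. It runs the inner step first and then uses that single result to filter the outer step, which is the role the bracketed subsearch plays.

```python
from collections import Counter

# Hypothetical sample events; each has a host and an age in minutes.
events = [
    {"host": "crashy", "age_min": 5, "msg": "kernel panic"},
    {"host": "crashy", "age_min": 20, "msg": "restart"},
    {"host": "steady", "age_min": 30, "msg": "ok"},
    {"host": "crashy", "age_min": 90, "msg": "disk full"},
    {"host": "steady", "age_min": 120, "msg": "ok"},
]

def top_host(evts, max_age_min=60):
    """Inner step: most frequent host among events from the last hour."""
    recent = [e["host"] for e in evts if e["age_min"] <= max_age_min]
    return Counter(recent).most_common(1)[0][0]

def outer_search(evts, host):
    """Outer step: every event for the host chosen by the inner step."""
    return [e for e in evts if e["host"] == host]

inner_result = top_host(events)                # evaluated first
matches = outer_search(events, inner_result)   # parameterized by the inner result
print(inner_result, len(matches))
```

As in the search above, only the inner step is limited to the last hour; the outer step returns every event for the chosen host.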
Use subsearch to correlate data
You can use subsearches to correlate data, including data across different indexes or Splunk servers in a distributed environment.
For example, you may have two or more indexes for different application logs. The event data from these logs may share at least one common field. You can use the values of this field to search for events in one index based on a value that is not in another index:
sourcetype=some_sourcetype NOT [search sourcetype=another_sourcetype | fields field_val]
Note: This is equivalent to the SQL "NOT IN" functionality:
SELECT * FROM some_table
WHERE field_value NOT IN (SELECT field_value FROM another_table)
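The same exclusion logic can be sketched in Python as a set-membership filter. The two lists below are hypothetical stand-ins for events from the two sourcetypes, and the sketch assumes every outer event actually carries the field; it does not model how Splunk treats events where the field is missing.

```python
# Hypothetical rows standing in for events from two sourcetypes.
some_events = [
    {"field_val": "a", "msg": "first"},
    {"field_val": "b", "msg": "second"},
    {"field_val": "c", "msg": "third"},
]
another_events = [{"field_val": "b"}, {"field_val": "d"}]

# Inner step: collect the values returned by the subsearch.
excluded = {e["field_val"] for e in another_events}

# Outer step: keep only events whose field_val is NOT in that set.
kept = [e for e in some_events if e["field_val"] not in excluded]
print([e["field_val"] for e in kept])
```

Here the value "b" appears in both lists, so only the events with "a" and "c" survive the filter.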
Change the format of subsearch results
When you use a subsearch, the format command is implicitly applied to your subsearch results. The format command changes your subsearch results into a single linear search string. This is used when you want to pass the returned values in the returned fields into the primary search.
If your subsearch returned a table, such as:
           | field1 | field2 |
           -------------------
event/row1 | val1_1 | val1_2 |
event/row2 | val2_1 | val2_2 |
The format command returns:
(field1=val1_1 AND field2=val1_2) OR (field1=val2_1 AND field2=val2_2)
For more information, see the format search command reference.
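To make the transformation concrete, here is a minimal Python sketch of the rule as this topic describes it: field/value pairs within a row are joined with AND, and rows are joined with OR. The function name format_rows is hypothetical, and this is a simplified model rather than Splunk's implementation; it does not handle quoting or escaping of values.

```python
def format_rows(rows):
    """Join each row's field=value pairs with AND, and the rows with OR."""
    groups = []
    for row in rows:
        pairs = [f"{field}={value}" for field, value in row.items()]
        groups.append("(" + " AND ".join(pairs) + ")")
    return " OR ".join(groups)

# The two-row table from the example above.
rows = [
    {"field1": "val1_1", "field2": "val1_2"},
    {"field1": "val2_1", "field2": "val2_2"},
]
print(format_rows(rows))
```

For the two-row table in this topic, the sketch prints the same string that the format command is shown returning.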
There are a couple of exceptions to this. First, all internal fields (fields that begin with a leading underscore "_*") are ignored and not reformatted in this way. Second, the "search" and "query" fields have their values rendered directly in the reformatted search string.
Generally, "search" can be useful when you need to append some static data or do some eval on the data in your subsearch and then pass it to the primary search. When you use "search", the first value of the field is used as the actual search term. For example, if field2 was "search" (in the table above), the format command returns:
(field1=val1_1 AND val1_2) OR (field1=val2_1 AND val2_2)
You can also use "search" to modify the actual search string that gets passed to the primary search.
"Query" is useful when you are looking for the values in the fields returned from the subsearch, but not in these exact fields. The "query" field behaves similarly to format. Instead of passing the field/value pairs, as you see with format, it passes the values:
(val1_1 AND val1_2) OR (val2_1 AND val2_2)
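The two exceptions can be modeled in a few lines of Python. This is a simplified sketch of the rules stated in this topic, not Splunk's implementation: the function name format_with_exceptions and the sample _time values are hypothetical. Fields with a leading underscore are skipped, and a field named "search" or "query" contributes its bare value instead of a field=value pair.

```python
def format_with_exceptions(rows):
    """Model of format with internal fields skipped and search/query rendered bare."""
    groups = []
    for row in rows:
        terms = []
        for field, value in row.items():
            if field.startswith("_"):
                continue  # internal fields are ignored
            if field in ("search", "query"):
                terms.append(str(value))  # value only, field name dropped
            else:
                terms.append(f"{field}={value}")
        if terms:
            groups.append("(" + " AND ".join(terms) + ")")
    return " OR ".join(groups)

# The example table with field2 renamed to "search", plus an internal field.
rows = [
    {"_time": "t1", "field1": "val1_1", "search": "val1_2"},
    {"_time": "t2", "field1": "val2_1", "search": "val2_2"},
]
print(format_with_exceptions(rows))
```

With field2 renamed to "search", the sketch prints (field1=val1_1 AND val1_2) OR (field1=val2_1 AND val2_2), matching the "search" example in this topic.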
Let's say you have the following search, which searches for a clID associated with a specific Name. This value is then used to search for several sources.
index="myindex" [ search index="myindex" host="myhost" <Name> | top limit=1 clID | fields + clID ]
The subsearch returns something like:
( (clID="0050834ja") )
If you want to return only the value, 0050834ja, run this search:
index=myindex [ search index=myindex host=myhost MyName | top limit=1 clID | fields + clID | rename clID as search ]
If the field is named search (or query), the subsearch (technically, the implicit | format command at the end of the subsearch) drops the field name and returns ( ( 0050834ja ) ). Multiple results are returned as, for example, ( ( value1 ) OR ( value2 ) OR ( value3 ) ).
This is a special case only when the field is named either "search" or "query". Renaming your fields to anything else will make the subsearch use the new field names.
Performance of subsearches
If your subsearch returns a large table of results, it will impact the performance of your search. You can change the number of results that the format command operates over inline with your search by appending the following to the end of your subsearch:
| format maxresults=<integer>
For more information, see the format search command reference.
You can also control the subsearch with settings in limits.conf for the runtime and maximum number of results returned:
maxout = <integer>
- Maximum number of results to return from a subsearch.
- This number cannot be greater than or equal to 10500.
- Defaults to 100.
maxtime = <integer>
- Maximum number of seconds to run a subsearch before finalizing.
- Defaults to 60.
ttl = <integer>
- Time to cache a given subsearch's results.
- Defaults to 300.
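The effect of maxout on the outer search is easy to overlook, so here is a small Python sketch of it. It assumes the limit behaves as simple truncation of the result list, which is an assumption made for illustration; the apply_maxout helper and the generated host names are hypothetical.

```python
def apply_maxout(rows, maxout=100):
    """Keep at most maxout rows, modeling the subsearch result limit."""
    return rows[:maxout]

# A hypothetical subsearch that produced 250 distinct hosts.
rows = [{"host": f"host{i}"} for i in range(250)]
kept = apply_maxout(rows)

# Only the surviving rows can contribute terms to the outer search.
dropped = len(rows) - len(kept)
print(len(kept), dropped)
```

Under this assumption, 150 of the 250 hosts never reach the outer search, so a subsearch that returns more rows than maxout can silently narrow the final results.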
After running a search, you can click the Actions menu and select "Inspect Search". Scroll down to the remoteSearch component to see the actual query that resulted from your subsearch. Read more about the "Search Job Inspector" in the Search reference manual.
Result output settings for subsearch commands
The limits.conf.spec file indicates that subsearches return a maximum of 100 results by default. However, the actual number of output results varies, because each command can change the default maxout for subsearches that belong to that command. Additionally, the default maxout applies only to subsearches that are intended to be expanded into a search expression, which is not the case for some commands, such as join and append. For example, the append command overrides that default with the value of maxresultrows, unless the user has specified maxout as an argument to append.