How subsearches work


A subsearch is a search pipeline that is passed as an argument to another search. Subsearches are enclosed in square brackets and are evaluated first. The result of the subsearch is then used as an argument to the primary, or outer, search. Use subsearches to match subsets of your data that you cannot describe directly in a search expression but that a search can generate.

For example, if you're interested in finding all events from the most active host in the last hour, you can't search for a specific host because it might not be the same host every hour. First, you need to identify which host is most active.

sourcetype=syslog earliest=-1h | top limit=1 host | fields + host

This search returns only one host value. Once you have this host, which is the most active host in the last hour, you can search for all events from that host. Suppose it is a server named "crashy":

sourcetype=syslog host=crashy

Instead of running two searches each time you want this information, you can use a subsearch to supply the hostname:

sourcetype=syslog [search sourcetype=syslog earliest=-1h | top limit=1 host | fields + host]
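The evaluation order can be modelled outside Splunk. The following Python sketch is only an illustration, not how Splunk is implemented: the sample events and the helper names most_active_host and outer_search are hypothetical, and the earliest=-1h time range is left out for brevity. It runs the inner step first, then uses that single result as an extra filter term for the outer step.

```python
from collections import Counter

# Hypothetical sample events; in Splunk these would come from the index.
events = [
    {"sourcetype": "syslog", "host": "crashy", "msg": "disk full"},
    {"sourcetype": "syslog", "host": "steady", "msg": "ok"},
    {"sourcetype": "syslog", "host": "crashy", "msg": "restart"},
    {"sourcetype": "access", "host": "web01", "msg": "GET /"},
]

def most_active_host(events, sourcetype):
    """Inner step, modelled on `top limit=1 host | fields + host`."""
    counts = Counter(e["host"] for e in events if e["sourcetype"] == sourcetype)
    host, _count = counts.most_common(1)[0]
    return {"host": host}

def outer_search(events, sourcetype, extra_terms):
    """Outer step: filter on sourcetype plus the terms the inner step produced."""
    return [
        e for e in events
        if e["sourcetype"] == sourcetype
        and all(e.get(k) == v for k, v in extra_terms.items())
    ]

terms = most_active_host(events, "syslog")        # evaluated first
results = outer_search(events, "syslog", terms)   # then used by the outer search
```

Because the inner step is recomputed on every run, the outer search follows whichever host is currently the most active, with no hostname hard-coded.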

Modify subsearch limits in limits.conf

You can control the subsearch runtime and number of results by setting these limits in the [subsearch] stanza of a limits.conf file.

[subsearch]
maxout = <integer>
* Maximum number of results to return from a subsearch.
* Defaults to 100.

maxtime = <integer>
* Maximum number of seconds to run a subsearch before finalizing.
* Defaults to 60.

ttl = <integer>
* Time, in seconds, to cache a given subsearch's results.
* Defaults to 300.
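To see why maxout matters, consider a subsearch that would produce more results than the limit allows. The Python sketch below is a simplified model rather than Splunk's implementation: the helper apply_maxout and the generated host names are hypothetical, and it assumes that the results kept are simply the first maxout ones.

```python
MAXOUT = 100  # default from the [subsearch] stanza

def apply_maxout(subsearch_results, maxout=MAXOUT):
    """Keep at most `maxout` results (assumption: the first ones are kept)."""
    return subsearch_results[:maxout]

# Hypothetical subsearch output: 250 distinct host values.
all_hosts = [{"host": f"host{i:03d}"} for i in range(250)]

passed_on = apply_maxout(all_hosts)
dropped = len(all_hosts) - len(passed_on)
```

In this model the outer search receives only 100 of the 250 host values, so events from the other 150 hosts would not be matched. If a subsearch can return many values, check it against maxout before relying on the outer search being complete.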

Use subsearch to correlate data

You can use subsearches to correlate data, including data across different indexes or Splunk servers in a distributed environment.

For example, you may have two or more indexes for different application logs, and the event data from these logs may share at least one common field. You can use the values of this field to find events in one index whose value for that field does not appear in another index:

sourcetype=some_sourcetype NOT [search sourcetype=another_sourcetype | fields field_value]

Note: This is equivalent to the SQL "NOT IN" functionality:

SELECT * FROM some_table WHERE field_value NOT IN (SELECT field_value FROM another_table)
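The same exclusion logic can be sketched in Python. This is an illustrative model only: the helper not_in, the shared field name user, and the sample events are all hypothetical. The inner list plays the role of the subsearch and the outer list plays the role of the primary search.

```python
def not_in(outer, inner, field):
    """Return outer events whose `field` value does not occur in inner."""
    excluded = {e[field] for e in inner if field in e}
    return [e for e in outer if e.get(field) not in excluded]

# Hypothetical events from two sourcetypes that share the field "user".
some_events = [
    {"sourcetype": "some_sourcetype", "user": "alice"},
    {"sourcetype": "some_sourcetype", "user": "bob"},
    {"sourcetype": "some_sourcetype", "user": "carol"},
]
another_events = [
    {"sourcetype": "another_sourcetype", "user": "bob"},
]

unmatched = not_in(some_events, another_events, "user")
```

Here unmatched holds the events for alice and carol, the users that appear in the first sourcetype but not in the second.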

Change the format of subsearch results

The format command is applied implicitly to your subsearch results. You can also call it explicitly to turn your subsearch results into a single linear search string.

If your subsearch returned a table, such as:

host   source           sourcetype
----   --------------   ----------
me     syslog.log       syslog
bob    bob-syslog.log   syslog

The format command returns:

( ( host=me AND source=syslog.log AND sourcetype=syslog ) OR ( host=bob AND source=bob-syslog.log AND sourcetype=syslog ) )
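The transformation follows a simple rule: the fields within a row are joined with AND, and the rows are joined with OR. The Python sketch below models that rule using the host values from the table above. The function name format_rows is hypothetical, and the sketch does no quoting or escaping of values, so the real command's output may differ in detail.

```python
def format_rows(rows):
    """Join each row's field=value pairs with AND, and the rows with OR."""
    groups = [
        "( " + " AND ".join(f"{field}={value}" for field, value in row.items()) + " )"
        for row in rows
    ]
    return "( " + " OR ".join(groups) + " )"

# The result table shown above, one dict per row.
rows = [
    {"host": "me", "source": "syslog.log", "sourcetype": "syslog"},
    {"host": "bob", "source": "bob-syslog.log", "sourcetype": "syslog"},
]

search_string = format_rows(rows)
```

Each additional row adds another OR clause to the string, which is why large subsearch result tables produce long search strings.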

If your subsearch returns a large table of results, the generated search string grows with it and can slow down your search. To change the maximum number of results that the format command operates over, edit the maxresults key in the [format] stanza of limits.conf.

[format]
maxresults = <integer> 
* Maximum number of events for a subsearch to use in generating a search.
* Defaults to 100.

For more information, see the format search command reference.

This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6, 4.0.7, 4.0.8, 4.0.9, 4.0.10, 4.0.11.

