Using the Search Job Inspector
The Search Job Inspector is a tool that lets you take a closer look at what your search is doing and see where Splunk is spending most of its time.
This topic discusses how to use the search job inspector to both troubleshoot the performance of a search job and understand the behavior of knowledge objects such as event types, tags, lookups and so on within the search. For more information, see Manage search jobs in this manual.
View search job properties
You can access the Search Job Inspector for a search job as long as the search artifact still exists (which means that the search has not expired). The search does not need to be running.
To inspect a search:
1. Run the search.
2. In the Job menu, select "Inspect Job".
This opens the Search Job Inspector in a new browser window.
To view the properties of a search artifact:
You can use the URL to inspect a search job artifact if you have its search ID (SID). You can find the SID of a search in the Job Manager (click the Jobs link in the upper right-hand corner) or in Splunk's dispatch directory, $SPLUNK_HOME/var/run/splunk/dispatch. For more information about the Job Manager, see Manage search jobs in this manual.
The URI path for the Search Job Inspector window ends with a sid parameter and a namespace parameter. These are the SID number and the name of the app that the search belongs to. For example, sid=1299600721.22 identifies the search whose SID is 1299600721.22.
Type the search artifact's SID into the URI path, after sid=, and press Enter. As long as you have the necessary ownership permissions to view the search, you can inspect it.
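If you prefer to collect SIDs from the dispatch directory rather than from the Job Manager, the lookup can be sketched in a few lines of Python. This is a minimal sketch, not a Splunk tool: it assumes that each search artifact is stored in a subdirectory named after its SID, and the function names and the /opt/splunk fallback are hypothetical.

```python
import os


def default_dispatch_dir():
    """Build the dispatch directory path from the SPLUNK_HOME environment variable.

    The "/opt/splunk" fallback is an assumption used only for illustration.
    """
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    return os.path.join(splunk_home, "var", "run", "splunk", "dispatch")


def list_sids(dispatch_dir):
    """Return the names of all subdirectories of dispatch_dir, sorted.

    Assumption: every subdirectory is a search artifact named after its SID.
    Plain files in the directory are ignored.
    """
    return sorted(
        entry
        for entry in os.listdir(dispatch_dir)
        if os.path.isdir(os.path.join(dispatch_dir, entry))
    )
```

Calling list_sids(default_dispatch_dir()) on the search head returns candidate SIDs that you can then type into the URI path after sid=.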
Now, what exactly are you looking at?
What the Search Job Inspector shows you
While the search is running, the Search Job Inspector shows you two different panels. Execution costs lists information about the components of the search and how much impact each component has on the overall performance of the search. Search job properties lists other characteristics of the job. When the search finishes, the Search Job Inspector tells you how many results it found and the time it took to complete the search. After the search completes, the Search Job Inspector also displays error messages at the top of the screen. Most of the information is self-explanatory, but this section will discuss the panels in more detail.
The Execution costs panel lets you troubleshoot the efficiency of your search by isolating the performance impact of specific components of search-time event processing. It displays a table of the components that shows:
- the component durations in seconds.
- how many times each component was invoked while the search ran.
- the input and output event counts for each component.
The Search Job Inspector lists the components alphabetically. You will see more or fewer components depending on the search you run. The following table describes the significance of each individual search command and distributed search component.
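Because the panel sorts components alphabetically rather than by cost, it can help to re-sort the rows by duration when you look for the most expensive step. The following Python sketch assumes you have copied the rows out of the panel by hand. The Component type, the slowest function, and the sample numbers are all hypothetical and are not taken from a real search.

```python
from typing import NamedTuple


class Component(NamedTuple):
    name: str
    duration: float      # seconds
    invocations: int
    input_count: int
    output_count: int


def slowest(components, top=3):
    """Return the `top` components with the largest duration, slowest first."""
    return sorted(components, key=lambda c: c.duration, reverse=True)[:top]


# Illustrative rows only; real values depend on the search you run.
EXAMPLE_ROWS = [
    Component("command.search", 4.2, 10, 0, 5000),
    Component("dispatch.evaluate", 0.3, 1, 0, 0),
    Component("dispatch.fetch", 1.1, 12, 0, 0),
    Component("dispatch.timeline", 0.8, 10, 0, 0),
]

for row in slowest(EXAMPLE_ROWS, top=2):
    print(f"{row.name}: {row.duration:.1f}s over {row.invocations} invocations")
```

Keep in mind that some components are reported as parts of others, so the durations in the panel are not guaranteed to add up to the total run time.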
Note: These are the components you will see if you just run a keyword search.
Execution costs of search commands
In general, for each command that is part of the search job, there is a parameter command.<command_name>. The values for these parameters represent the time spent processing each <command_name>. For example, if the table command is used, you will see command.table.
|Search command component name||Description|
|command.search||Once Splunk identifies the events containing the indexed fields matching your search, it looks into the events themselves to identify the ones that match other criteria. These are concurrent operations, not consecutive.|
There is a relationship between the type of commands used and the numbers you can expect to see for Invocations, Input count, and Output count. For searches that generate events, you expect the input count to be 0 and the output count to be some number of events X. If the search is both a generating search and a filtering search, the filtering search would have an input (equal to the output of the generating search, X) and an output=X. The total counts would then be input=X, output=2*X, and the invocation count is doubled.
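The arithmetic in the paragraph above can be written out as a short Python sketch. It only restates the relationship described there: a generating step with input 0 and output X, followed by a filtering step with input X and output X. The function name and the dictionary layout are hypothetical.

```python
def combined_counts(x, invocations_each=1):
    """Sum the counts of a generating step and a filtering step.

    x is the number of events the generating step outputs. The filtering
    step is assumed to pass all x events through, as in the text above.
    """
    generating = {"invocations": invocations_each, "input": 0, "output": x}
    filtering = {"invocations": invocations_each, "input": x, "output": x}
    return {key: generating[key] + filtering[key] for key in generating}


totals = combined_counts(500)
print(totals)  # {'invocations': 2, 'input': 500, 'output': 1000}
```

With X = 500, the totals are input = 500 and output = 1000, and the invocation count is twice that of a single step.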
Execution costs of dispatched searches
|Distributed search component name||Description|
|dispatch.check_disk_usage||The time spent checking the disk usage of this job.|
|dispatch.createProviderQueue||The time to connect to all search peers.|
|dispatch.evaluate||The time spent parsing the search and setting up the data structures needed to run the search. This component also includes the time it takes to evaluate and run subsearches. This is broken down further for each search command that is used. In general, |
|dispatch.fetch||The time spent waiting for or fetching events from search peers.|
|dispatch.preview||The time spent generating preview results.|
|dispatch.process_remote_timeline||The time spent decoding timeline information generated by search peers.|
|dispatch.reduce||The time spent reducing the intermediate report output.|
|dispatch.stream.local||The time spent by search head on the streaming part of the search.|
|dispatch.stream.remote||The time spent executing the remote search in a distributed search environment, aggregated across all peers. Additionally, the time spent executing the remote search on each remote search peer is indicated with: |
|dispatch.timeline||The time spent generating the timeline and fields sidebar information.|
|dispatch.writeStatus||The time spent periodically updating status.csv and info.csv in the job's dispatch directory.|
Search job properties
The Search job properties fields are listed in alphabetical order.
|cursorTime||The earliest time from which no events are later scanned. Can be used to indicate progress. See description for doneProgress.|
|delegate||For saved searches, specifies jobs that were started by the user. Defaults to scheduler.|
|diskUsage||The total amount of disk space used, in bytes.|
|dispatchState||The state of the search. Can be any of QUEUED, PARSING, RUNNING, PAUSED, FINALIZING, FAILED, DONE.|
|doneProgress||A number between 0 and 1.0 that indicates the approximate progress of the search. doneProgress = (latestTime - cursorTime) / (latestTime - earliestTime)|
|dropCount||For real-time searches only, the number of possible events that were dropped due to the |
|earliestTime||The earliest time a search job is configured to start. Can be used to indicate progress. See description for doneProgress.|
|eai:acl||Describes the app and user-level permissions. For example, is the app shared globally, and what users can run or view the search?|
|eventAvailableCount||The number of events that are available for export.|
|eventCount||The number of events returned by the search.|
|eventFieldCount||The number of fields found in the search results.|
|eventIsStreaming||Indicates if the events of this search are being streamed.|
|eventIsTruncated||Indicates if events of the search have not been stored, and thus not available from the events endpoint for the search.|
|eventSearch||Subset of the entire search that is before any transforming commands. The timeline and events endpoint represents the result of this part of the search.|
|eventSorting||Indicates if the events of this search are sorted, and in which order. asc = ascending; desc = descending; none = not sorted|
|isBatchMode||Indicates whether or not the search is running in batch mode. This applies only to searches that include transforming commands.|
|isDone||Indicates if the search has completed.|
|isFailed||Indicates if there was a fatal error executing the search. For example, if the search string had invalid syntax.|
|isFinalized||Indicates if the search was finalized (stopped before completion).|
|isPaused||Indicates if the search has been paused.|
|isPreviewEnabled||Indicates if previews are enabled.|
|isRealTimeSearch||Indicates if the search is a real time search.|
|isRemoteTimeline||Indicates if the remote timeline feature is enabled.|
|isSaved||Indicates that the search job is saved, storing search artifacts on disk for 7 days from the last time that the job has been viewed or touched. Add or edit the |
|isSavedSearch||Indicates if this is a saved search run using the scheduler.|
|isZombie||Indicates if the process running the search is dead, but with the search not finished.|
|keywords||All positive keywords used by this search. A positive keyword is a keyword that is not in a NOT clause.|
|label||Custom name created for this search.|
|latestTime||The latest time a search job is configured to start. Can be used to indicate progress. See description for doneProgress.|
|numPreviews||Number of previews that have been generated so far for this search job.|
|messages||Errors and debug messages.|
|performance||This is another representation of the Execution costs.|
|remoteSearch||The search string that is sent to every search peer.|
|reportSearch||If reporting commands are used, the reporting search.|
|request||GET arguments that the search sends to |
|resultCount||The total number of results returned by the search. In other words, this is the subset of scanned events (represented by the scanCount) that actually matches the search terms.|
|resultIsStreaming||Indicates if the final results of the search are available using streaming (for example, no transforming operations).|
|resultPreviewCount||The number of result rows in the latest preview results.|
|runDuration||Time in seconds that the search took to complete.|
|scanCount||The number of events that are scanned or read off disk.|
|search||The search string.|
|searchProviders||A list of all the search peers that were contacted.|
|sid||The search ID number.|
|statusBuckets||Maximum number of timeline buckets.|
|ttl||The time to live, or time before the search job expires after it completes.|
|Additional info||Links to further information about your search. These links may not always be available.|
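The doneProgress formula given in the table can be checked by hand with a small Python sketch. It assumes the three times are expressed as numbers on the same scale, such as epoch seconds. The function name and the sample values are hypothetical.

```python
def done_progress(earliest_time, latest_time, cursor_time):
    """Apply doneProgress = (latestTime - cursorTime) / (latestTime - earliestTime)."""
    span = latest_time - earliest_time
    if span <= 0:
        raise ValueError("latest_time must be greater than earliest_time")
    return (latest_time - cursor_time) / span


# Illustrative values: a 100-second window in which the cursor has moved
# back from latestTime (100) to 25, so three quarters of the range is done.
print(done_progress(earliest_time=0, latest_time=100, cursor_time=25))  # 0.75
```

Under this formula the value is 0 when cursorTime equals latestTime and 1.0 when cursorTime reaches earliestTime.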
Note: When troubleshooting search performance, it's important to understand the difference between the scanCount and resultCount values. For dense searches, the scanCount and resultCount are similar (scanCount ≈ resultCount). For sparse searches, the scanCount is much greater than the resultCount (scanCount >> resultCount). Measure search performance by the scanCount/time rate rather than the resultCount/time rate. Typically, the scan rate should hover between 10k and 20k events per second for performance to be deemed good.
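To apply the note above to a finished job, you can compute both figures from the scanCount, resultCount, and runDuration properties. The Python sketch below is illustrative only: the function names are hypothetical, the sample numbers are invented, and the 10,000 events per second floor is simply the lower end of the range quoted in the note.

```python
def scan_rate(scan_count, run_duration):
    """Events scanned per second of run time."""
    if run_duration <= 0:
        raise ValueError("run_duration must be positive")
    return scan_count / run_duration


def match_ratio(scan_count, result_count):
    """Fraction of scanned events that ended up in the results.

    Values near 1 suggest a dense search; values near 0 suggest a sparse one.
    """
    if scan_count <= 0:
        raise ValueError("scan_count must be positive")
    return result_count / scan_count


# Illustrative job: 1,500,000 events scanned in 100 seconds, 15,000 results.
rate = scan_rate(1_500_000, 100)
ratio = match_ratio(1_500_000, 15_000)
print(f"{rate:.0f} events/s, match ratio {ratio:.2f}")
print("within guideline" if rate >= 10_000 else "below guideline")
```

A low match ratio combined with a low scan rate points at a sparse search that spends most of its time reading events it later discards.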
If there are errors in your search, these messages (which in previous versions were displayed as banners across the dashboard) are presented as DEBUG messages at the top of the Search Job Inspector window. For example, if there are fields missing from your results, the debug messages will say so.
Note: You won't see these messages until the search has completed.
Examples of Search Job Inspector output
Here's an example of the execution costs for a dedup search, run over All time:
* | dedup punct
The search commands component of the Execution costs panel might look something like this:
The command.search component, and everything under it, gives you the performance impact of the search command portion of your search, which is everything before the pipe character.
command.prededup gives you the performance impact of processing the results of the search command before passing them into the dedup command. The Input count of command.prededup matches the Output count of command.search, and the Input count of command.dedup matches the Output count of command.prededup. In this case, the Output count of command.dedup should match the number of events returned at the completion of the search (which is the value of resultCount, under Search job properties).
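One way to sanity-check counts like these is to confirm that each stage's Input count equals the Output count of the stage before it. The Python sketch below assumes the stages run in the order command.search, command.prededup, command.dedup. The function name is hypothetical and the counts are invented for illustration, not copied from a real job.

```python
def find_count_mismatches(stages):
    """Compare each stage's input count with the previous stage's output count.

    stages is an ordered list of (name, input_count, output_count) tuples.
    Returns a list of (previous_name, name, previous_output, input_count)
    tuples, one for every pair of adjacent stages that disagree.
    """
    mismatches = []
    for (prev_name, _, prev_out), (name, cur_in, _) in zip(stages, stages[1:]):
        if cur_in != prev_out:
            mismatches.append((prev_name, name, prev_out, cur_in))
    return mismatches


# Illustrative counts for a search such as "* | dedup punct".
stages = [
    ("command.search", 0, 12000),
    ("command.prededup", 12000, 12000),
    ("command.dedup", 12000, 85),
]
print(find_count_mismatches(stages))  # []
```

An empty list means the counts chain together as expected, and the last stage's output (85 here) is the figure to compare against resultCount.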
Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has about using the Search Job Inspector.
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14