Best practices for searching
This topic discusses some simple rules of thumb to help you write searches that will run more efficiently. Many factors can affect the speed of your searches: the volume of data that you are searching, how you've constructed your searches, whether or not you've planned your deployment sufficiently to handle the number of users running searches at the same time, and so on. The key to optimizing your search speed is to make sure that Splunk isn't doing more work than necessary.
Types of searches
The recommendations for optimizing searches vary depending on the type of search that you run and the characteristics of the data you're searching. In general, we describe searches based on what you are trying to do: retrieve events or generate reports. If the events you want to retrieve occur frequently in the dataset, we call it a dense search. If the events you want to retrieve are rare in the dataset, we call it a sparse search.
Read more about types of searches.
Raw event searches
Raw event searches return events from a Splunk index without any further processing of the retrieved events. The best rule of thumb when retrieving events from the index is to be specific about the events that you want. You can do this with keywords and field/value pairs that are unique to those events. Keep in mind that sparse searches against large volumes of data take longer than dense searches against the same data set.
- Narrow down your search as much as possible from the start and limit the data that has to be pulled from disk to an absolute minimum. For example, if you're only interested in Web access events, restrict your search to the specific host, index, or source type for that data.
- If you rarely search across more than one type of data at a time, partition your different types of data into separate indexes and restrict your searches to the specific index. For example, store Web access data in one index and firewall data in another. Read more about ways to set up multiple indexes and how to search different indexes.
- Limit your search to the specific time window you need. For example, to see what might have led to errors a few minutes ago, search within the last hour '-1hr', not the last week '-1w'. Read more about how to Change the time range to narrow your search.
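As a rough sketch of the three recommendations above combined, the search below restricts by index, source type, host, and time range before matching anything else. The index name web and the host name www1 are hypothetical placeholders, and the status field is assumed to be available for this source type:

```
index=web sourcetype=access_combined host=www1 status=404 earliest=-1h
```

Compared with searching for 404 across all indexes and all time, this version should give Splunk far fewer events to read from disk.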
Report-generating searches
Report-generating searches perform additional processing on events after they've been retrieved from an index. This processing can include filtering, transforming, and other operations that apply one or more statistical functions to the set of results. Because this processing occurs in memory, the more restrictive and specific you are when specifying the events to retrieve from disk, the faster the search will be.
- If you are building a report, start your search from the Advanced Charting view instead of the timeline view. The timeline view requires a lot of processing to calculate and build the timeline. When you run a search from the Advanced Charting view, it disables preview and the processing overhead associated with it.
- Reports rely on fields, so all the optimization rules for fields apply.
Use fields in your searches
Searches with fields are faster when they use fields that have already been extracted (indexed fields) instead of fields extracted at search time.
- Leverage indexed and default fields whenever you can to help search or filter your data efficiently. At index time, Splunk extracts a set of default fields that are common to each event; these fields include host, source, and sourcetype. Use these fields to filter your data as early as possible in the search so that processing is done on a minimum amount of data. For example, if you're building a report on web access errors, search for those specific errors before the reporting command:
sourcetype=access_* (status=4* OR status=5*) | stats count by status
- Field extractions at search time add processing overhead. If you don't need additional fields in your search, turn off the Discover Fields option in the timeline view or use the fields command to specify only the fields that you want to see in your results.
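For illustration, the fields command might be applied to the web access error search like this. The choice of host and status as the fields to keep is an assumption made for the example, not a recommendation from this topic:

```
sourcetype=access_* (status=4* OR status=5*) | fields host, status
```

Only the listed fields are carried through to the results, so any extractions you don't need for the report can be left out.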
Summarize your data
It can take a lot of time to search through very large data sets. If you regularly generate reports on large volumes of data, use summary indexing to pre-calculate the values that you use most often in your reports. Schedule saved searches to collect metrics on a regular basis, and report on the summarized data instead of on raw data.
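A minimal sketch of this pattern, assuming a scheduled saved search with the hypothetical name summary_web_errors that has summary indexing enabled and writes to the default summary index. The scheduled search pre-calculates the counts:

```
sourcetype=access_* (status=4* OR status=5*) | sistats count by status
```

The report then runs against the summarized data rather than the raw events:

```
index=summary search_name="summary_web_errors" | stats count by status
```

The saved search name, its schedule, and the choice of status as the split-by field are all placeholders. Adjust them to match the metrics that your reports actually use.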
Read more about how to use summary indexing for increased reporting efficiency.
Use the Search Job Inspector
The Search Job Inspector is a tool you can use both to troubleshoot the performance of a search and to understand the execution costs of knowledge objects such as event types, tags, lookups, and other components within the search. It dissects the behavior of your searches so that you can better understand how to optimize them.
Read more about how to use the search job inspector.
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6