Write better searches
This topic discusses some simple rules of thumb to help you write searches that will run more efficiently. Many factors can affect the speed of your searches: the volume of data that you are searching, how you've constructed your searches, whether or not you've planned your deployment sufficiently to handle the number of users running searches at the same time, and so on. The key to optimizing your search speed is to make sure that Splunk isn't doing more work than necessary.
Types of searches
The recommendations for optimizing searches vary depending on the type of search that you run and the characteristics of the data you're searching. In general, we describe searches based on what you are trying to do: retrieve events or generate reports. If the events you want to retrieve occur frequently in the dataset, we call it a dense search. If the events you want to retrieve are rare in the dataset, we call it a sparse search.
Read more in "Getting started with Search".
Raw event searches
Raw event searches return events from a Splunk index without any additional processing of the events that are retrieved. The best rule of thumb when retrieving events from the index is to be specific about the events that you want to retrieve. You can do this with keywords and field/value pairs that are unique to the events. Keep in mind that sparse searches against large volumes of data take longer than dense searches against the same data set.
- Narrow down your search as much as possible from the start and limit the data that has to be pulled from disk to an absolute minimum. For example, if you're only interested in Web access events, restrict your search to the specific host, index, or source type for that data.
- If you rarely search across more than one type of data at a time, partition your different types of data into separate indexes and restrict your searches to the specific index. For example, store Web access data in one index and firewall data in another. Read more about ways to set up multiple indexes and how to search different indexes.
- Limit your search to the specific time window you need. For example, to see what might have led to errors a few minutes ago, search within the last hour '-1hr', not the last week '-1w'. Read more about time ranges in search.
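As a rough illustration of the guidelines above, the following search restricts the index, source type, host, and time range before matching a status value. The index name web and the host name www1 are hypothetical; substitute the names used in your own deployment:

```
index=web sourcetype=access_combined host=www1 status=404 earliest=-1h
```

By contrast, a search for the bare term 404 across all indexes and all time gives Splunk far more data to read from disk before it can discard the events you don't want.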
Report-generating searches
Report-generating searches perform additional processing on events after they've been retrieved from an index. This processing can include filtering, transforming, and other operations using one or more statistical functions against the set of results. Because this processing occurs in memory, the more restrictive and specific you are when specifying the events to retrieve from disk, the faster the search will be.
- If you are building a report, start your search from the Advanced Charting view instead of the timeline view. The timeline view requires a lot of processing to calculate and build the timeline. When you run a search from the Advanced Charting view, it disables preview and the processing overhead associated with it.
- Reports rely on fields, so all the optimization rules for fields apply.
Whether you're retrieving raw events or building a report, you should also consider whether you are running a search for sparse or dense information:
- Sparse searches look for a single event or an event that occurs infrequently within a large set of data. You've probably heard these referred to as "needle in a haystack" or "rare term" searches. Examples include searching for a specific, unique IP address or error code. When running a sparse search, use the Search (timeline) view, because it attempts to extract as much relevant information as possible from the events, which helps you find the unique event.
- Dense searches scan through and report on many events. Examples include counting the number of errors that occurred or finding all events from a specific host. When running a dense search that reports on events across a lot of data, use the Advanced Charting view. In this view, Splunk doesn't process all field information; it extracts only the information it needs to complete your search.
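The two searches below sketch the difference. Both assume a hypothetical index named web that holds Web access data, and the IP address is an invented example value. The first is a sparse search for a rare term:

```
index=web sourcetype=access_combined 10.2.1.44 earliest=-24h
```

The second is a dense search that reports on every event in the same time range:

```
index=web sourcetype=access_combined earliest=-24h | stats count by host
```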
Use fields in your searches
Searches with fields are faster when they use fields that have already been extracted (indexed fields) instead of fields extracted at search time.
- Use indexed and default fields whenever you can to search or filter your data efficiently. At index time, Splunk extracts a set of default fields that are common to each event; these fields include host, source, and sourcetype. Use these fields to filter your data as early as possible in the search so that processing is done on a minimum amount of data. For example, if you're building a report on web access errors, search for those specific errors before the reporting command:
sourcetype=access_* (status=4* OR status=5*) | stats count by status
- Field extractions at search time add processing overhead. If you don't need additional fields in your search, set Search Mode to a setting that disables field discovery to improve search performance in the timeline view, or use the fields command to specify only the fields that you want to see in your results.
The tradeoff to disabling field discovery is that doing so prevents automatic field extraction, except for fields that are required to fulfill your search (such as fields that you are specifically searching on) and default fields such as sourcetype. The search runs faster because Splunk is no longer trying to extract every field possible from your events.
Search mode is set to Smart by default. Set it to Verbose if you are running searches with reporting commands, don't know what fields exist in your data, and think you might need them to help you narrow down your search in some way.
Learn more about the Search Mode setting in "Set search mode to adjust your search experience," in this manual.
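A minimal sketch of the fields approach, assuming Web access data in which a clientip field is extracted at search time (as is typical for the access_combined source type, but check your own data):

```
sourcetype=access_* (status=4* OR status=5*) | fields status, clientip | stats count by status, clientip
```

Here the fields command keeps only status and clientip (plus internal fields such as _time), so the reporting command that follows works with a smaller set of fields.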
Summarize your data
It can take a lot of time to search through very large data sets. If you regularly generate reports on large volumes of data, use summary indexing to pre-calculate the values that you use most often in your reports. Schedule saved searches to collect metrics on a regular basis, and report on the summarized data instead of on raw data.
Read more about how to use summary indexing for increased reporting efficiency.
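The pattern might look something like the following sketch. It assumes a saved search with the hypothetical name web_status_hourly that is scheduled to run every hour with summary indexing enabled, and that writes to the default summary index. The scheduled search pre-calculates the counts for the previous hour:

```
sourcetype=access_* earliest=-1h@h latest=@h | sistats count by status
```

The report then runs against the summarized results rather than the raw events:

```
index=summary search_name="web_status_hourly" earliest=-7d@d | stats count by status
```

Because the report reads one small set of summary results per hour instead of every raw access event, it has much less data to process over a seven-day range.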
Use the Search Job Inspector
The Search Job Inspector is a tool you can use both to troubleshoot the performance of a search and to understand the execution costs of knowledge objects such as event types, tags, lookups, and other components within the search. It dissects the behavior of your searches so that you can better understand how to optimize them.
Read more about how to use the search job inspector.
This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18