Write better searches
This topic discusses some causes of slow searches and suggests simple rules of thumb to help you write searches that will run more efficiently. Many factors can affect the speed of your searches: the volume of data that you are searching, how you've constructed your searches, whether or not you've planned your deployment sufficiently to handle the number of users running searches at the same time, and so on. The key to optimizing your search speed is to make sure that Splunk isn't doing more work than necessary.
Types of searches
The recommendations for optimizing searches vary depending on the type of search that you run and the characteristics of the data you're searching. In general, we describe searches based on what you are trying to do: retrieve events or generate reports. If the events you want to retrieve occur frequently in the dataset, we call it a dense search. If the events you want to retrieve are rare in the dataset, we call it a sparse search.
Raw event searches
Raw event searches return events from a Splunk index without any additional processing to the events that are retrieved. The best rule of thumb to follow when retrieving events from the index is to be specific about the events that you want to retrieve. You can do this with keywords and field/value pairs that are unique to the events. One thing to keep in mind is that sparse searches against large volumes of data will take longer than dense searches against the same data set.
Report-generating searches
Report-generating searches perform additional processing on events after they've been retrieved from an index. This processing can include filtering, transforming, and other operations using one or more statistical functions against the set of results. Because this processing occurs in memory, the more restrictive and specific you are when specifying the events to retrieve from disk, the faster the search will be.
Tips for tuning your searches
In most cases, your search is slow because of the complexity of the query that retrieves events from the index. For example, if your search contains extremely large OR lists, complex subsearches (which break down into OR lists), or certain types of phrase searches, it will take longer to process. This section discusses some tips for tuning your searches so that they are more efficient.
Be more specific. That is, narrow down your search as much as possible from the start and limit the data that has to be pulled from disk to an absolute minimum:
- Add strings which only exist in your desired events.
- Restrict your search to the specific host, index, source, source type, or Splunk server whenever possible. Read more about using fields in your searches in the next section.
- Limit your search to the specific time window you need. For example, to see what might have led to errors a few minutes ago, search within the last 15 minutes '-15min' or last hour '-1hr', not the last week '-1w'. Read more about time ranges in search.
- Limit the quantity of data retrieved. You can do this easily using the head command:
sourcetype=access_* | head 1000
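As a sketch of how these restrictions combine, the following search names an index, a host, a source type, and a keyword, and limits the time range to the last 15 minutes. The index name web, the host name www1, and the keyword error are hypothetical; substitute values from your own data.
index=web host=www1 sourcetype=access_* error earliest=-15min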
Avoid using NOT expressions when possible. That is, if your hosts are a through e, instead of using
(NOT host=d NOT host=e) or
(host!=d AND host!=e), use
(host=a OR host=b OR host=c).
If you rarely search across more than one type of data at a time, partition your different types of data into separate indexes and restrict your searches to the specific index. For example, store Web access data in one index and firewall data in another. This is recommended for sparse data, which may otherwise be buried in a large volume of unrelated data. Read more about ways to set up multiple indexes and how to search different indexes.
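For example, assuming the Web access and firewall data are stored in hypothetical indexes named web and firewall, a search for a rare firewall event only has to scan the firewall index. The field action and its value blocked are also assumptions for illustration.
index=firewall action=blocked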
Use fields in your searches
Searches with fields are faster when they use fields that have already been extracted (indexed fields) instead of fields extracted at search time.
Use indexed and default fields for improved search efficiency
Use indexed and default fields whenever you can to help search or filter your data efficiently. At index time, Splunk extracts a set of default fields that are common to each event; these fields include host, source, and sourcetype. Use these fields to filter your data as early as possible in the search so that processing is done on a minimum amount of data.
For example, if you're building a report on web access errors, search for those specific errors before the reporting command:
sourcetype=access_* (status=4* OR status=5*) | stats count by status
You can also run efficient searches for fields that have been indexed from structured data such as CSV files and JSON data sources. When you do this, replace the equal sign with double colons, like this: <field>::<value>
This syntax works best in searches for fields that have been indexed from structured data, though it can be used to search for default and custom indexed fields as well. You cannot use it to search on search-time fields.
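As an illustration, the following search uses the double-colon syntax on two default indexed fields. The host name www1 is hypothetical.
host::www1 sourcetype::access_combined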
Disable field discovery to improve search performance
If you don't need additional fields in your search, set Search Mode to a setting that disables field discovery to improve search performance in the timeline view, or use the fields command to specify only the fields that you want to see in your results.
The tradeoff to disabling field discovery is that doing so prevents automatic field extraction, except for fields that are required to fulfill your search (such as fields that you are specifically searching on) and default fields such as sourcetype. The search runs faster because Splunk is no longer trying to extract every field possible from your events.
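For example, a search along these lines returns only the fields named in the fields command. The field names clientip and uri_path are assumptions about what your web access data contains.
sourcetype=access_* status=404 | fields clientip, status, uri_path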
Search mode is set to Smart by default. Set it to Verbose if you are running searches with reporting commands, don't know what fields exist in your data, and think you might need them to help you narrow down your search in some way.
See "Set search mode to adjust your search experience," in this manual.
Summarize your data
It can take a lot of time to search through very large data sets. If you regularly generate reports on large volumes of data, use summary indexing to pre-calculate the values that you use most often in your reports. Schedule saved searches to collect metrics on a regular basis, and report on the summarized data instead of on raw data.
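As a rough sketch, suppose a scheduled saved search with the hypothetical name "summary - web status counts" has summary indexing enabled and runs the following search every hour:
sourcetype=access_* | sistats count by status
A report over a longer time range could then read the pre-calculated results from the summary index instead of from the raw events. This assumes that the default summary index, named summary, is the destination:
index=summary search_name="summary - web status counts" | stats count by status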
Read more about how to use summary indexing for increased reporting efficiency.
Use the Search Job Inspector
The Search Job Inspector is a tool you can use both to troubleshoot the performance of a search and to understand the execution costs of knowledge objects such as event types, tags, lookups, search commands, and other components within the search.
When your search is running too slowly, the Search Job Inspector can help you determine how much time each phase of the search takes. It dissects the behavior of your searches so that you can better understand how to optimize them.
Read more about how to use the search job inspector in this manual.
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14