Configure the priority of scheduled searches
This topic discusses the two options you can use to control the priority of concurrent scheduled searches with the search scheduler. The options are real-time scheduling and continuous scheduling:
- Real-time scheduling ensures that scheduled searches always run over the most recent time range, even when several searches are scheduled to run at approximately the same time and the scheduler can run only one search at a time. Because of the way it works, searches with real-time scheduling can end up skipping scheduled runs. However, they are always given priority over searches with continuous scheduling.
- Continuous scheduling ensures that each scheduled run of a search is eventually performed, even if the result is that those searches are delayed.
These settings are managed at the saved search level via savedsearches.conf. Splunk gives all scheduled searches real-time scheduling by default, but when a scheduled search is enabled for summary indexing, Splunk automatically changes its scheduling option to continuous.
To understand the necessity of these two scheduler options, you need to understand how the search scheduler handles concurrent searches.
This topic also explains how the search scheduler handles the "auto-summarization" searches that Splunk automatically sets up for report acceleration. See the subtopic at the end of this topic for more information.
How the search scheduler handles concurrent searches
The Splunk search scheduler limits the number of scheduled searches that can run concurrently. The default, set by the max_searches_perc setting in limits.conf, sets the maximum number of concurrent searches that can be handled by the scheduler to 25% of the max_searches_per_cpu value. By default, max_searches_per_cpu is set to four searches for every CPU in your system plus two. So if your system has only one CPU, the scheduler can run only one search at a time (25% of 6 is 1.5, which works out to one whole search).
Note: We strongly recommend that you avoid changing limits.conf settings unless you know what you are doing.
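The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation as this topic describes it, not Splunk's implementation: the function name and the `extra_searches` parameter are hypothetical, and rounding down to a whole number is an assumption inferred from the one-CPU example (1.5 becomes 1).

```python
def scheduler_concurrency(cpu_count, searches_per_cpu=4, extra_searches=2,
                          max_searches_perc=25):
    """Estimate how many searches the scheduler can run at once.

    Defaults mirror the values described in this topic: four searches per
    CPU plus two, of which the scheduler may use 25%.
    """
    total_searches = searches_per_cpu * cpu_count + extra_searches
    # Assumption: a fractional result is rounded down to a whole search.
    return total_searches * max_searches_perc // 100


one_cpu_limit = scheduler_concurrency(1)   # 25% of 6
four_cpu_limit = scheduler_concurrency(4)  # 25% of 18
```

Under these assumptions, a one-CPU system gets a single scheduler slot and a four-CPU system gets four.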
So, if your scheduler can only run one search at a time, but you have multiple searches scheduled to run on an hourly basis over the preceding hour's data, what happens? The scheduler lines the searches up and runs them in consecutive order for the scheduled time period, but each search returns information for the time frame over which it was scheduled to run.
Example of real-time scheduling versus continuous scheduling
So, given how the scheduler works, how is real-time scheduling different from continuous scheduling, and under what conditions would you prefer one option over the other?
First, say you have two saved, scheduled searches that for the purpose of simplicity we'll call A and B:
- Search A runs every minute and takes 30 seconds to complete
- Search B runs every 5 minutes and takes 2 minutes to complete
Let's also say that you have a Splunk configuration that enables the search scheduler to run only one search at a time.
Both searches are scheduled to run at 1:05pm.
- At 1:05pm, the scheduler runs A for the 1:04 to 1:05 period, and schedules it to run again at 1:06pm. It is 1:05:30pm when search A completes.
- At 1:05:30pm, the scheduler runs search B. Because it takes 2 minutes to run, search B won't complete until 1:07:30pm.
- At 1:06pm, the scheduler wakes up and attempts to run search A, but it cannot run because search B is still in process.
- The scheduler continues to attempt to run search A until 1:06:59pm. At this point, what happens next depends on whether search A is using real-time or continuous scheduling (see below).
If search A is configured to have:
- real-time scheduling, the scheduler skips the 1:05-1:06 run of the search and schedules the next run of search A for 1:07:00pm (for the 1:06 to 1:07 period). The new search run time is based on the current scheduled run time (1:06:00pm).
- continuous scheduling, the scheduler does not advance the schedule and keeps attempting to run the search for the 1:05 to 1:06pm period for as long as it takes. Whatever the eventual search run time is, the next time period that search A covers is 1:06 to 1:07pm.
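The difference between the two options can be illustrated with a small simulation. The sketch below is a simplified model written for this example, not the actual scheduler: it tracks only search A, treats the time that search B occupies the single search slot as a fixed "busy" window, and measures time in seconds after 1:05:00pm. The function name and its parameters are hypothetical.

```python
def simulate(interval, duration, busy, last_run, realtime):
    """Simulate one periodic search that shares a single search slot.

    Times are seconds after the first scheduled run. `busy` lists
    (start, end) windows during which another search holds the slot.
    Returns (runs, skipped): each run is (period_start, period_end,
    started_at) and each skipped entry is (period_start, period_end).
    """
    runs, skipped = [], []
    free_at = 0      # when this search's own previous run finishes
    scheduled = 0    # current scheduled run time
    while scheduled <= last_run:
        start = max(scheduled, free_at)
        # Push the start past any window in which the slot is taken.
        for busy_start, busy_end in sorted(busy):
            if busy_start <= start < busy_end:
                start = busy_end
        period = (scheduled - interval, scheduled)
        if realtime and start >= scheduled + interval:
            # Real-time: the run could not start before the next
            # scheduled time, so this period is skipped.
            skipped.append(period)
        else:
            # Continuous (or an on-time real-time run): the period is
            # searched, even if the run starts late.
            runs.append(period + (start,))
            free_at = start + duration
        scheduled += interval
    return runs, skipped


# Search A: every 60 seconds, 30 seconds per run.
# Search B holds the slot from 1:05:30pm to 1:07:30pm (seconds 30 to 150).
b_window = [(30, 150)]
realtime_runs, realtime_skipped = simulate(60, 30, b_window, 240, realtime=True)
continuous_runs, continuous_skipped = simulate(60, 30, b_window, 240, realtime=False)
# realtime_skipped holds the period (0, 60), that is, 1:05 to 1:06pm.
# continuous_skipped is empty; the 1:05 to 1:06pm period runs late instead.
```

In this model the real-time search drops the 1:05 to 1:06pm period and stays current, while the continuous search covers every period but runs behind schedule until it catches up.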
Real-time scheduling is the default for all scheduled searches. It's designed to ensure that the search returns current data. It assumes there won't be any problems if some scheduled searches are skipped, as long as the most recent run of the search returns up-to-the-minute results.
Continuous scheduling is used for situations where problems arise when there's any gap in the collection of search data. In general this is only important for searches that populate summary indexes, though you may find other uses for it. When a search is enabled for summary indexing, Splunk changes its scheduling option to continuous automatically.
Note: For more information about summary index searches, see "Use summary indexing for increased reporting efficiency" in the Knowledge Manager manual.
Configure the realtime_schedule option
The system uses the realtime_schedule option in savedsearches.conf to determine the next run time of a scheduled search. This is set individually for each saved and scheduled search.
realtime_schedule = 0 | 1
- Set to 1 to use real-time scheduling. With this setting the scheduler makes sure that it is always running the search over the most recent time range. Because searches can't always run concurrently with others, this means that it may skip some search periods. This is the default value for a scheduled search.
- Set to 0 to use continuous scheduling. This setting ensures that scheduled search periods are never skipped. Splunk automatically sets this value to 0 for any scheduled search that is enabled for summary indexing.
The scheduler is designed to give searches with real-time scheduling priority over those with continuous scheduling; it always tries to run the real-time searches first.
Note: You should never set realtime_schedule=1 for a search that populates a summary index, precisely because doing so may cause it to skip search periods. This leads to gaps in the summary index, which makes the summary index unreliable.
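One way to check for the combination that the note warns about is to scan savedsearches.conf for stanzas that enable summary indexing and also set realtime_schedule to 1. The sketch below is a rough illustration only. It assumes the file is simple enough for Python's configparser to read (real .conf files can contain constructs that configparser does not handle), the stanza names and sample contents are made up, and the helper name is hypothetical. It flags only stanzas where realtime_schedule is set to 1 explicitly.

```python
import configparser

# Hypothetical sample contents of a savedsearches.conf file.
SAMPLE_CONF = """
[Hourly error summary]
enableSched = 1
cron_schedule = 0 * * * *
action.summary_index = 1
realtime_schedule = 1

[Latest error count]
enableSched = 1
cron_schedule = */5 * * * *
realtime_schedule = 1

[Daily traffic summary]
enableSched = 1
cron_schedule = 0 0 * * *
action.summary_index = 1
realtime_schedule = 0
"""


def risky_summary_searches(conf_text):
    """Return stanza names that combine summary indexing with
    an explicit realtime_schedule = 1."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    risky = []
    for name in parser.sections():
        stanza = parser[name]
        summary_indexed = stanza.get("action.summary_index", "0").strip() == "1"
        realtime = stanza.get("realtime_schedule", "").strip() == "1"
        if summary_indexed and realtime:
            risky.append(name)
    return risky


flagged = risky_summary_searches(SAMPLE_CONF)
```

With the sample above, only "Hourly error summary" is flagged: it populates a summary index but could skip search periods.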
Report acceleration and the search scheduler
If you are using report acceleration to speed up your slow-completing searches, be aware that this process uses scheduled searches as well. It runs them behind the scenes to generate new report acceleration summaries and then to update those summaries thereafter.
By default, the search scheduler is only allowed to allocate up to 25% of its total search bandwidth for report acceleration summary creation and update. We don't recommend that you change this value, but if you decide you must, go to limits.conf and change the value of the auto_summary_perc attribute to a number that works better for you.
The search scheduler also runs searches that populate report acceleration summaries at the lowest priority. If these "auto-summarization" searches have a scheduling conflict with user-defined alerts, summary-index searches, and regular scheduled searches, the user-defined searches always get run first. This means that you may run into situations where a summary isn't being created or updated because Splunk is busy running higher-priority searches.
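The priority order described in this topic (real-time scheduling first, then continuous scheduling, then auto-summarization) can be pictured as a simple sort. The sketch below is a conceptual illustration, not the scheduler's actual algorithm. The class, the tier names, and the search names are all hypothetical.

```python
from dataclasses import dataclass

# Lower number means the search is picked first when searches conflict.
PRIORITY = {"realtime": 0, "continuous": 1, "auto_summary": 2}


@dataclass
class PendingSearch:
    name: str
    kind: str  # one of the keys in PRIORITY


def run_order(pending):
    """Order conflicting searches by priority tier.

    sorted() is stable, so searches in the same tier keep their
    original relative order.
    """
    return sorted(pending, key=lambda search: PRIORITY[search.kind])


queue = [
    PendingSearch("acceleration summary update", "auto_summary"),
    PendingSearch("summary index populator", "continuous"),
    PendingSearch("error alert", "realtime"),
]
ordered_names = [search.name for search in run_order(queue)]
```

Here the alert runs first, the summary-index search second, and the report acceleration search last, which is why a busy scheduler can leave a summary waiting.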
For more information about report acceleration, see "Manage report acceleration," in this manual.
This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18