Knowledge Manager Manual

 


Configure the priority of scheduled searches

NOTE - Splunk version 4.x reached its End of Life on October 1, 2013. Please see the migration information.

This documentation does not apply to the most recent version of Splunk.

This topic discusses the two options you can use to control the priority of concurrent scheduled searches with the search scheduler. The options are real-time scheduling and continuous scheduling:

  • Real-time scheduling ensures that scheduled searches are always run over the most recent time range, even when a number of searches are scheduled to run at approximately the same time and the scheduler can run only one search at a time. Because of the way it works, searches with real-time scheduling can end up skipping scheduled runs. However, they are always given priority over searches with continuous scheduling.
  • Continuous scheduling ensures that each scheduled run of a search is eventually performed, even if the result is that those searches are delayed.

These settings are managed at the saved search level via savedsearches.conf. Splunk gives all scheduled searches real-time scheduling by default, but when a scheduled search is enabled for summary indexing, Splunk automatically changes its scheduling option to continuous.

To understand the necessity of these two scheduler options, you need to understand how the search scheduler handles concurrent searches.

For more information about scheduling saved searches, see "Create an alert" in the User manual.

How the search scheduler handles concurrent searches

The Splunk search scheduler limits the number of scheduled searches that can run concurrently. By default, the max_searches_perc setting in limits.conf caps the scheduler at 25% of the maximum number of concurrent searches for the system. That maximum is derived from max_searches_per_cpu: by default, four searches for every CPU in your system, plus two. So if your system has only one CPU, the overall maximum is 6 searches and the scheduler can safely run only one search at a time (25% of 6 is 1.5, which rounds down to 1).
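The default arithmetic described above can be sketched in Python. This is an illustration only, not Splunk code: the function name scheduler_search_limit and the parameters cpu_count and base_searches are hypothetical, and rounding the result down is an assumption inferred from the one-CPU example, where 1.5 becomes one search.

```python
def scheduler_search_limit(cpu_count, max_searches_per_cpu=4,
                           base_searches=2, max_searches_perc=25):
    """Return (overall_limit, scheduler_limit) for a given CPU count.

    base_searches is a hypothetical name for the "plus two" in the text.
    Rounding down is an assumption based on the one-CPU example.
    """
    # Overall limit: four searches per CPU, plus two.
    overall_limit = max_searches_per_cpu * cpu_count + base_searches
    # Scheduler share: a percentage of the overall limit, rounded down.
    scheduler_limit = (overall_limit * max_searches_perc) // 100
    return overall_limit, scheduler_limit


if __name__ == "__main__":
    for cpus in (1, 2, 4, 8):
        overall, scheduled = scheduler_search_limit(cpus)
        print(f"{cpus} CPU(s): {overall} searches overall, "
              f"{scheduled} for the scheduler")
```

With the defaults, a one-CPU system yields an overall limit of 6 and a scheduler limit of 1, which matches the example in the text.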

Note: We strongly recommend that you avoid changing limits.conf settings unless you know what you are doing.

So, if your scheduler can only run one search at a time, but you have multiple searches scheduled to run on an hourly basis over the preceding hour's data, what happens? The scheduler lines the searches up and runs them one after another, but each search still returns information for the time frame over which it was scheduled to run, not for the time at which it actually started.
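The following Python sketch illustrates that behavior under simplifying assumptions: a single search slot, fixed search durations, and a one-hour window. The function name run_in_sequence and the search names are hypothetical and are not part of Splunk.

```python
from datetime import datetime, timedelta


def run_in_sequence(scheduled_at, searches, window=timedelta(hours=1)):
    """Run searches one at a time, all scheduled for the same moment.

    Each search covers the window that ends at its scheduled time,
    no matter how late it actually starts.
    """
    results = []
    clock = scheduled_at
    for name, duration in searches:
        started = clock
        clock = started + duration  # the single slot is busy until here
        results.append({
            "name": name,
            "started": started,
            "finished": clock,
            "earliest": scheduled_at - window,  # start of data covered
            "latest": scheduled_at,             # end of data covered
        })
    return results


if __name__ == "__main__":
    scheduled = datetime(2013, 1, 1, 14, 0)  # arbitrary example date
    runs = run_in_sequence(scheduled, [
        ("errors_by_host", timedelta(minutes=3)),   # hypothetical search
        ("logins_by_user", timedelta(minutes=5)),   # hypothetical search
    ])
    for run in runs:
        print(f"{run['name']}: started {run['started']:%H:%M}, "
              f"covers {run['earliest']:%H:%M} to {run['latest']:%H:%M}")
```

In this sketch the second search starts three minutes late, yet it still covers the same hour of data as the first.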

Example of real-time scheduling versus continuous scheduling

So, given how the scheduler works, how is real-time scheduling different from continuous scheduling, and under what conditions would you prefer one option over the other?

First, say you have two saved, scheduled searches that for the purpose of simplicity we'll call A and B:

  • Search A runs every minute and takes 30 seconds to complete
  • Search B runs every 5 minutes and takes 2 minutes to complete

Let's also say that you have a Splunk configuration that enables the search scheduler to run only one search at a time.

Both searches are scheduled to run at 1:05pm.

Time | Scheduler action
1:05:00pm | The scheduler runs A for the 1:04 to 1:05 period, and schedules it to run again at 1:06pm. It is 1:05:30pm when search A completes.
1:05:30pm | The scheduler runs search B. Because it takes 2 minutes to run, search B won't complete until 1:07:30.
1:06:00pm | The scheduler wakes up and attempts to run search A, but it cannot run because search B is still in progress.
1:06:59pm | The scheduler continues to attempt to run search A until 1:06:59. At this point what happens next depends on whether search A is using real-time or continuous scheduling (see below).

If search A is configured to have:

  • real-time scheduling, the scheduler skips the 1:05-1:06 run of the search and schedules the next run of search A for 1:07:00pm (for the 1:06 to 1:07 period). The new search run time is based on the current scheduled run time (1:06:00pm).
  • continuous scheduling, the scheduler does not advance the schedule and keeps trying to run the search for the 1:05 to 1:06pm period until it succeeds. Whenever that run eventually happens, the next run of search A covers the 1:06 to 1:07pm period.
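The two outcomes can be compared with a small simulation. This is a simplified model rather than the scheduler's actual algorithm: it follows only search A, assumes the single search slot is occupied by search B until 1:07:30pm, and ignores later runs of search B. Times are seconds after 1:05:00pm. The names simulate_search_a and clock are hypothetical.

```python
from datetime import datetime, timedelta

BASE = datetime(2013, 1, 1, 13, 5, 0)  # 1:05:00pm on an arbitrary date


def clock(offset_seconds):
    """Format an offset from 1:05:00pm as a 24-hour clock time."""
    return (BASE + timedelta(seconds=offset_seconds)).strftime("%H:%M:%S")


def simulate_search_a(realtime_schedule, slot_free_at=150, first_due=60,
                      period=60, duration=30, runs=3):
    """Return events as (action, period_start, period_end, started_at)."""
    due = first_due          # scheduled run time (1:06:00pm)
    free_at = slot_free_at   # search B finishes at 1:07:30pm
    events = []
    completed = 0
    while completed < runs:
        start = max(due, free_at)
        if realtime_schedule and start >= due + period:
            # Real-time: the whole minute passed without a free slot,
            # so this period is skipped and the schedule advances.
            events.append(("skipped", due - period, due, None))
        else:
            # Continuous (or a real-time run that still fits): run late
            # if necessary, but always for the scheduled period.
            events.append(("ran", due - period, due, start))
            free_at = start + duration
            completed += 1
        due += period
    return events


if __name__ == "__main__":
    for label, flag in (("real-time", True), ("continuous", False)):
        print(label)
        for action, begin, end, started in simulate_search_a(flag):
            when = f" at {clock(started)}" if started is not None else ""
            print(f"  {action} {clock(begin)} to {clock(end)}{when}")
```

In this model the real-time run skips the 1:05 to 1:06 period and covers 1:06 to 1:07 once search B finishes, while the continuous run covers every period in order and simply starts late.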

Real-time scheduling is the default for all scheduled searches. It's designed to ensure that the search returns current data. It assumes there won't be any problems if some scheduled runs are skipped, as long as the most recent run of the search returns up-to-the-minute results.

Continuous scheduling is used for situations where problems arise when there's any gap in the collection of search data. In general this is only important for searches that populate summary indexes, though you may find other uses for it. When a search is enabled for summary indexing, Splunk changes its scheduling option to continuous automatically.

Note: For more information about summary index searches, see "Use summary indexing for increased reporting efficiency" in the Knowledge Manager manual.

Configure the realtime_schedule option

The system uses the realtime_schedule option in savedsearches.conf to determine the next run time of a scheduled search. This is set individually for each saved and scheduled search.

realtime_schedule = 0 | 1

  • Set realtime_schedule to 1 to use real-time scheduling. With this setting the scheduler makes sure that it is always running the search over the most recent time range. Because searches can't always run concurrently with others, this means that it may skip some search periods. This is the default value for a scheduled search.
  • Set realtime_schedule to 0 to use continuous scheduling. This setting ensures that scheduled search periods are never skipped. Splunk automatically sets this value to 0 for any scheduled search that is enabled for summary indexing.

The scheduler is designed to give searches with real-time scheduling priority over those with continuous scheduling; it always tries to run the real-time searches first.

Note: You should never set realtime_schedule=1 for a search that populates a summary index, precisely because doing so may cause it to skip search periods. This leads to gaps in the summary index, which makes the summary index unreliable.
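As a rough way to check a configuration against the guidance above, the following Python sketch parses a sample savedsearches.conf fragment with the standard configparser module and flags summary-indexing searches that still use real-time scheduling. It is a sketch under assumptions: the stanza names and search strings are made up, only the values 1 and true are treated as enabled, and configparser does not handle every construct a real .conf file can contain (for example, backslash line continuations).

```python
import configparser

# Hypothetical sample stanzas; the names and searches are illustrative.
SAMPLE_CONF = """
[Errors in the last hour]
enableSched = 1
cron_schedule = 0 * * * *
search = error | stats count by host

[Summary - hourly error counts]
enableSched = 1
cron_schedule = 0 * * * *
action.summary_index = 1
realtime_schedule = 1
search = error | sistats count by host
"""


def is_enabled(value):
    """Treat "1" and "true" as enabled (a simplification)."""
    return value.strip().lower() in ("1", "true")


def scheduling_report(conf_text):
    """Map each stanza to its scheduling mode and a gap-risk flag."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names case-sensitive
    parser.read_string(conf_text)
    report = {}
    for name in parser.sections():
        stanza = parser[name]
        # realtime_schedule defaults to 1, as described in this topic.
        realtime = is_enabled(stanza.get("realtime_schedule", "1"))
        summary = is_enabled(stanza.get("action.summary_index", "0"))
        report[name] = {
            "mode": "real-time" if realtime else "continuous",
            "risk_of_gaps": realtime and summary,
        }
    return report


if __name__ == "__main__":
    for name, info in scheduling_report(SAMPLE_CONF).items():
        warning = "  <-- summary index may have gaps" if info["risk_of_gaps"] else ""
        print(f"[{name}] {info['mode']}{warning}")
```

The second sample stanza is deliberately misconfigured: changing its realtime_schedule value to 0 gives it continuous scheduling and clears the warning.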

This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7

