Splunk® IT Service Intelligence

Event Analytics Manual

Configure Rules Engine periodic backfill in ITSI

The IT Service Intelligence (ITSI) Rules Engine backfills events that it missed because data on the indexers was temporarily unavailable due to network issues. By default, the backfill process runs every 12 minutes to check for missed events. This process helps the Rules Engine stabilize episode generation and action execution under unstable conditions.

Periodic backfill only works for events that are missed while the Rules Engine real-time rules search is running. It doesn't backfill events generated while the search isn't running, or while indexer nodes are intentionally shut down or restarted, such as during an indexer or Splunk Enterprise upgrade.

The Rules Engine uses the following searches to backfill missing events:

grouping_missed_events_search = search (`itsi_event_management_index_with_close_events` ) OR ( `itsi_event_management_group_index`) NOT orig_sourcetype=snow:incident \
    | stats first(_time) AS _time first(_raw) AS _raw first(source) AS source first(sourcetype) AS sourcetype count(eval(index="itsi_grouped_alerts")) AS c_grouped by event_id \
    | where c_grouped=0 | fields _time, _raw, source, sourcetype

backfill_events_search = search (`itsi_event_management_index_with_close_events` ) OR ( `itsi_event_management_group_index`) NOT orig_sourcetype=snow:incident \
    | stats first(_time) AS _time first(_raw) AS _raw first(source) AS source first(sourcetype) AS sourcetype count(eval(index="itsi_grouped_alerts")) AS c_grouped by event_id \
    | where c_grouped=0 | fields _time, _raw, source, sourcetype | sort 0 _time

Update index in backfill searches for custom indexes

If you use a custom index, add these searches to a local copy of the Rules Engine properties file at $SPLUNK_HOME/etc/apps/SA-ITOA/local/itsi_rules_engine.properties, and update the index name in each search. In the following example, the index name in the searches is changed to itsi_grouped_alerts_prod.

grouping_missed_events_search = search (`itsi_event_management_index_with_close_events` ) OR ( `itsi_event_management_group_index`) NOT orig_sourcetype=snow:incident \
    | stats first(_time) AS _time first(_raw) AS _raw first(source) AS source first(sourcetype) AS sourcetype count(eval(index="itsi_grouped_alerts_prod")) AS c_grouped by event_id \
    | where c_grouped=0 | fields _time, _raw, source, sourcetype

backfill_events_search = search (`itsi_event_management_index_with_close_events` ) OR ( `itsi_event_management_group_index`) NOT orig_sourcetype=snow:incident \
    | stats first(_time) AS _time first(_raw) AS _raw first(source) AS source first(sourcetype) AS sourcetype count(eval(index="itsi_grouped_alerts_prod")) AS c_grouped by event_id \
    | where c_grouped=0 | fields _time, _raw, source, sourcetype | sort 0 _time

For more information about the ITSI Rules Engine and the Rules Engine search, see Overview of the ITSI Rules Engine.

Tune periodic backfill frequency and time windows

Periodic backfill is controlled by the following parameters in $SPLUNK_HOME/etc/apps/SA-ITOA/default/itsi_rules_engine.properties:

periodic_backfill_frequency = 720

periodic_backfill_time_window = 3600

periodic_backfill_to_realtime_gap = 720

periodic_backfill_search_job_check_limit = 15

The default values are optimized for most cases, but you can tune them based on your environment. For more information about editing itsi_rules_engine.properties, see Rules Engine properties reference in ITSI.

Configure backfill frequency

The periodic_backfill_frequency setting controls how often, in seconds, the Rules Engine reprocesses events that were not grouped. By default, ungrouped events are reprocessed every 12 minutes (720 seconds). Consider increasing this setting if you increase the periodic backfill time window.

This setting must be at least 120 seconds higher than event_cache_expiry_time, which defaults to 180 seconds. The minimum periodic backfill frequency is therefore 300 seconds. With a lower value, duplicate events might be grouped.
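
The minimum described above is simple arithmetic, and the following Python sketch shows one way to check a candidate value before you edit the properties file. It is not part of ITSI: the function and constant names are hypothetical, and the values are the defaults stated in this topic.

```python
# Hypothetical helper, not an ITSI API: checks the constraint described above.
EVENT_CACHE_EXPIRY_TIME = 180  # default event_cache_expiry_time, in seconds
REQUIRED_MARGIN = 120          # required margin above the cache expiry, in seconds


def min_backfill_frequency(cache_expiry=EVENT_CACHE_EXPIRY_TIME):
    """Return the smallest allowed periodic_backfill_frequency, in seconds."""
    return cache_expiry + REQUIRED_MARGIN


def is_valid_frequency(frequency, cache_expiry=EVENT_CACHE_EXPIRY_TIME):
    """Return True if the frequency is at least the cache expiry plus the margin."""
    return frequency >= min_backfill_frequency(cache_expiry)


print(min_backfill_frequency())  # 300
print(is_valid_frequency(720))   # True (the default frequency)
print(is_valid_frequency(240))   # False
```

If you change event_cache_expiry_time in your environment, pass the new value as cache_expiry so that the minimum is recomputed.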

Configure the backfill time window

The periodic_backfill_time_window setting defines a sliding time window, in seconds, that the Rules Engine uses to pick up events that were not grouped. By default, the Rules Engine checks for missing events during a 1-hour window (3600 seconds). Therefore, for every time window, it has five chances (3600 / 720) to pick up missed events. Once an event is backfilled or processed by the normal Rules Engine pipeline, it isn't reprocessed.

You can save resources for other Rules Engine processing tasks by reducing the backfill window size. However, the value of periodic_backfill_time_window must be higher than the value of periodic_backfill_frequency.
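
As a rough illustration of how the window and the frequency interact, the following Python sketch computes the number of backfill chances per window and rejects a window that isn't higher than the frequency. The function name is hypothetical, and the use of integer division for values that don't divide evenly is an assumption, not documented ITSI behavior.

```python
# Hypothetical helper, not an ITSI API.
def backfill_chances(time_window=3600, frequency=720):
    """Return how many backfill runs fall within one time window.

    Both arguments are in seconds. Integer division is an assumption for
    values that don't divide evenly.
    """
    if time_window <= frequency:
        raise ValueError(
            "periodic_backfill_time_window must be higher than "
            "periodic_backfill_frequency"
        )
    return time_window // frequency


print(backfill_chances())           # 5 (the defaults: 3600 / 720)
print(backfill_chances(1800, 600))  # 3
```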

Configure the backfill to real-time gap

When periodic backfill begins, the Rules Engine doesn't backfill the events with the most recent timestamps because newly generated events might need some time to be indexed. The periodic_backfill_to_realtime_gap setting, which is 12 minutes (720 seconds) by default, determines how far behind the current time the backfill window ends. The wait time gives each event a chance to be indexed before it's considered for backfill.

Example

In the following example, the periodic backfill search is scheduled to run at 9:12 AM. The periodic_backfill_frequency is 12 minutes, so the next backfill will run 12 minutes later at 9:24 AM, and so on every 12 minutes after that.

The periodic_backfill_to_realtime_gap is also 12 minutes, so the ending backfill boundary is at 9:00 AM. The periodic_backfill_time_window is 60 minutes, so the starting backfill boundary is 8:00 AM.
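
The boundaries in this example can be reproduced with a short Python sketch. It assumes that the ending boundary is the run time minus the real-time gap, and that the starting boundary is the ending boundary minus the time window, as the example describes. The function name and the calendar date are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


# Hypothetical helper, not an ITSI API. All settings are in seconds.
def backfill_boundaries(run_time, time_window=3600, realtime_gap=720):
    """Return (start, end) of the backfill window for a given run time."""
    end = run_time - timedelta(seconds=realtime_gap)
    start = end - timedelta(seconds=time_window)
    return start, end


run_time = datetime(2024, 1, 1, 9, 12)           # arbitrary date, 9:12 AM
start, end = backfill_boundaries(run_time)
next_run = run_time + timedelta(seconds=720)     # periodic_backfill_frequency

print(start.strftime("%H:%M"))     # 08:00
print(end.strftime("%H:%M"))       # 09:00
print(next_run.strftime("%H:%M"))  # 09:24
```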


PeriodicBackfill.png
Last modified on 28 April, 2023

This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.11.0, 4.11.1, 4.11.2, 4.11.3, 4.11.4, 4.11.5, 4.11.6, 4.12.0 Cloud only, 4.12.1 Cloud only, 4.12.2 Cloud only, 4.13.0, 4.13.1, 4.13.2, 4.13.3, 4.14.0 Cloud only, 4.14.1 Cloud only, 4.14.2 Cloud only, 4.15.0, 4.15.1, 4.15.2, 4.15.3, 4.16.0 Cloud only, 4.17.0, 4.17.1, 4.18.0, 4.18.1, 4.19.0, 4.19.1, 4.19.2

