Tune episode and aggregation policy sizing parameters in ITSI
The Split by setting in an IT Service Intelligence (ITSI) aggregation policy makes episodes relevant to one or more specific fields by splitting events into separate episodes based on those field values. Each event processed by a policy is placed in a unique episode based on the split-by field value. For an example of how the split by field works, see Split by field in the aggregation policy documentation.
For example, you can split by host to create episodes based on hosts, where each episode contains the events pertaining to a particular host. You could split by datacenter and application to create episodes about infrastructure for each application, focused on the data center from which it's served.
ITSI provides a file called itsi_rules_engine.properties, located at $SPLUNK_HOME/etc/apps/SA-ITOA/default/, where you can tune the settings that determine event and episode limits. To set custom configurations, open or create a local version of the file at $SPLUNK_HOME/etc/apps/SA-ITOA/local. To see the contents of the entire file, see Rules Engine properties reference in ITSI.
The following settings in itsi_rules_engine.properties control episode sizing limits:
sub_group_limit = 1000000
max_groups_per_sub_group = 50
max_event_in_group = 10000
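To illustrate the default/local split described above, the following Python sketch merges a local override over the default values. It assumes that a value in the local file takes precedence over the same key in the default file and that the file uses simple key = value lines; the actual loader isn't described here. The function names and the override value of 2000000 are hypothetical.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Default values as listed in this topic.
DEFAULT_TEXT = """
sub_group_limit = 1000000
max_groups_per_sub_group = 50
max_event_in_group = 10000
"""

# Hypothetical local file: only the setting being changed is listed.
LOCAL_TEXT = """
sub_group_limit = 2000000
"""


def effective_settings(default_text, local_text):
    """Assumption: local values override default values key by key."""
    merged = parse_properties(default_text)
    merged.update(parse_properties(local_text))
    return merged


settings = effective_settings(DEFAULT_TEXT, LOCAL_TEXT)
```

In this sketch, settings["sub_group_limit"] comes from the local text, while the other two settings keep their default values.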
What's a sub-group?
Sub-groups are the split-by hash keys that an aggregation policy creates when it splits events by fields. Sub-groups are the possible combinations of values from individual split-by fields. For example, if you split events by 'name' and 'severity', the Rules Engine creates separate hash keys for all name-severity combinations:
- name=check_dhcp, sev=1
- name=check_ntp_time, sev=1
- name=check_ntp_time, sev=2
...and so on.
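The combinations listed above can be sketched in Python as follows. This is only an illustration: it uses a plain tuple of field values as the key, whereas the way the Rules Engine actually computes its hash keys isn't described here. The sub_group_key function and the sample events are hypothetical.

```python
def sub_group_key(event, split_by_fields):
    """Build a key from the event's values for the split-by fields, in order."""
    return tuple(event.get(field) for field in split_by_fields)


events = [
    {"name": "check_dhcp", "severity": 1},
    {"name": "check_ntp_time", "severity": 1},
    {"name": "check_ntp_time", "severity": 2},
    {"name": "check_ntp_time", "severity": 1},  # same combination as the second event
]

# Each distinct name-severity combination corresponds to one sub-group.
keys = {sub_group_key(event, ["name", "severity"]) for event in events}
```

Four events produce three sub-groups here, because two of the events share the same name and severity.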
sub_group_limit = 1,000,000
The sub_group_limit setting controls the number of split-by hash keys that can exist for a single aggregation policy that splits events by fields. Sub-groups, or split-by hash keys, are the possible combinations of values from individual split-by fields.
For example, suppose the aggregation policy is split by 'application' and 'datacenter'. It creates separate hash keys for the App-DC combinations of App-OnlineShop and DC-AMEA, App-CRM and DC-APAC, App-CRM and DC-NASA, and so on.
When episodes have been created for 1,000,000 different application-datacenter combinations, the limit is reached. If you exceed this limit, the oldest hash key and the episodes associated with it are cleared from memory. The episodes are still saved in the KV store, and events are stored in the itsi_tracked_alerts and itsi_grouped_alerts indexes.
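The eviction behavior can be pictured with a small Python sketch. It assumes that "oldest" means the hash key that was created first, and it models only the in-memory table, not the KV store or the indexes. The SubGroupTable class is hypothetical and isn't part of ITSI.

```python
from collections import OrderedDict


class SubGroupTable:
    """Hypothetical in-memory table of hash keys and their episodes."""

    def __init__(self, sub_group_limit):
        self.sub_group_limit = sub_group_limit
        self._episodes_by_key = OrderedDict()  # insertion order = age of the key

    def add_episode(self, key, episode_id):
        """Record an episode; return the evicted (key, episodes) pair, if any."""
        evicted = None
        if key not in self._episodes_by_key:
            if len(self._episodes_by_key) >= self.sub_group_limit:
                # Assumption: the key created first is the one cleared from memory.
                evicted = self._episodes_by_key.popitem(last=False)
            self._episodes_by_key[key] = []
        self._episodes_by_key[key].append(episode_id)
        return evicted

    def keys(self):
        return list(self._episodes_by_key)


table = SubGroupTable(sub_group_limit=2)  # tiny limit for illustration only
table.add_episode(("App-OnlineShop", "DC-AMEA"), "episode-1")
table.add_episode(("App-CRM", "DC-APAC"), "episode-2")
evicted = table.add_episode(("App-CRM", "DC-NASA"), "episode-3")
```

With a limit of 2, adding the third combination clears the first key and its episode from the table while the two newer keys remain.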
subgroup_alert_limit_offset = 500
The subgroup_alert_limit_offset setting controls the offset used to calculate when to alert that the number of sub-groups is approaching the sub_group_limit value, which is 1,000,000 by default. The Rules Engine creates a message in the Splunk Messages dropdown when the number of sub-groups is greater than or equal to the value of the sub_group_limit setting minus the value of the subgroup_alert_limit_offset setting.
For example, if the sub-group limit is 1,000,000 and the offset is 500, the Rules Engine sends an alert when the number of sub-groups is greater than or equal to 999,500 (1,000,000 - 500).
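The arithmetic can be written as a short Python sketch. It assumes the quantity being compared is the current number of sub-groups for the policy; the function names are hypothetical.

```python
def sub_group_alert_threshold(sub_group_limit=1_000_000, subgroup_alert_limit_offset=500):
    """Return the sub-group count at which the warning message starts to appear."""
    return sub_group_limit - subgroup_alert_limit_offset


def should_alert(sub_group_count, sub_group_limit=1_000_000, subgroup_alert_limit_offset=500):
    """True when the count is greater than or equal to limit minus offset."""
    threshold = sub_group_alert_threshold(sub_group_limit, subgroup_alert_limit_offset)
    return sub_group_count >= threshold


threshold = sub_group_alert_threshold()
```

With the default values, the threshold is 999,500, so a count of 999,499 doesn't trigger the message and a count of 999,500 does.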
If you receive a message that you're approaching this limit, increase the sub_group_limit setting.
max_groups_per_sub_group = 50
The max_groups_per_sub_group setting controls the number of episodes that can be created for each split-by hash key for an aggregation policy that splits events by fields. For each hash key, only one episode is active at a time, and all previous episodes are inactive. If you exceed the limit, all episodes associated with the hash key are cleared from memory. The episodes are still saved in the KV store, and events are stored in the itsi_tracked_alerts and itsi_grouped_alerts indexes.
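A minimal Python sketch of this per-key limit follows. It assumes that the episode that pushes the count past the limit becomes the first episode of the emptied list; the text above doesn't say what happens to that episode, so treat this as one possible reading. The SubGroupEpisodes class is hypothetical.

```python
class SubGroupEpisodes:
    """Hypothetical in-memory list of episodes for one split-by hash key."""

    def __init__(self, max_groups_per_sub_group=50):
        self.max_groups = max_groups_per_sub_group
        self.episodes = []  # the last entry is the active episode

    def start_episode(self, episode_id):
        """Start a new episode; return the episodes cleared from memory, if any."""
        cleared = []
        if len(self.episodes) >= self.max_groups:
            # Exceeding the limit clears every episode held for this key.
            cleared = self.episodes
            self.episodes = []
        self.episodes.append(episode_id)
        return cleared

    @property
    def active(self):
        """Only the most recent episode is active; earlier ones are inactive."""
        return self.episodes[-1] if self.episodes else None


sub_group = SubGroupEpisodes(max_groups_per_sub_group=3)  # small limit for illustration
for episode_id in ("ep-1", "ep-2", "ep-3"):
    sub_group.start_episode(episode_id)
cleared = sub_group.start_episode("ep-4")
```

In this run, starting the fourth episode clears the first three from memory and leaves ep-4 as the active episode.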
max_event_in_group = 10000
The max_event_in_group setting controls the maximum number of events that can be in a single episode. If you exceed this limit, the episode breaks and a new episode is created.
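The breaking behavior can be sketched in Python as follows. The sketch covers only the event-count limit for a single sub-group and ignores any other breaking criteria a policy might define; the group_into_episodes function is hypothetical.

```python
def group_into_episodes(events, max_event_in_group=10000):
    """Place events into episodes, starting a new episode when the current one is full."""
    episodes = []
    for event in events:
        if not episodes or len(episodes[-1]) >= max_event_in_group:
            episodes.append([])  # the previous episode breaks; a new one starts
        episodes[-1].append(event)
    return episodes


# 25 events with a limit of 10 per episode, for illustration.
episodes = group_into_episodes(list(range(25)), max_event_in_group=10)
sizes = [len(episode) for episode in episodes]
```

Here the 25 events land in three episodes of 10, 10, and 5 events.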
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.11.0, 4.11.1, 4.11.2, 4.11.3, 4.11.4, 4.11.5, 4.11.6, 4.12.0 Cloud only, 4.12.1 Cloud only, 4.12.2 Cloud only, 4.13.0, 4.13.1, 4.13.2, 4.13.3, 4.14.0 Cloud only, 4.14.1 Cloud only, 4.14.2 Cloud only, 4.15.0, 4.15.1, 4.15.2, 4.15.3, 4.16.0 Cloud only, 4.17.0, 4.17.1, 4.18.0, 4.18.1, 4.19.0, 4.19.1, 4.19.2