Splunk® IT Service Intelligence

Event Analytics Manual

Splunk IT Service Intelligence (ITSI) version 4.11.x reached its End of Life on December 6, 2023. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see Before you upgrade IT Service Intelligence.

Troubleshoot the Rules Engine and event grouping in ITSI

Here are some common issues with notable event grouping in IT Service Intelligence (ITSI) and how to resolve them.

Issue: Events aren't being grouped into episodes

Even though notable events are coming into ITSI, aggregation policies aren't grouping them into episodes.

Mitigation

1. First, go to Configuration > Notable Event Aggregation Policies within ITSI and make sure aggregation policies are created and enabled.

2. Make sure the Rules Engine is running. Go to Activity > Jobs and change the App context to All. Search for itsi_event_grouping and make sure the status says Running. If it's not running, go to Settings > Searches, reports, and alerts and change the app context to All. Search for the itsi_event_grouping search again and check that it's enabled. If not, enable it.

3. Make sure you have the correct Java version. See Java requirements in the Install and Upgrade manual.

4. Make sure the itsi_grouped_alerts index exists and contains data. The index is deployed in the SA-IndexCreation directory.

5. If the Rules Engine is running, run the following search to check for error messages that might indicate what's causing the problem:

index=_internal source=*rules_engine* log_level=ERROR

6. If the previous steps don't reveal the cause, change the root log level to debug in $SPLUNK_HOME/etc/apps/SA-ITOA/default/log4j_rules_engine.xml, then finalize the running itsi_event_grouping search so that a new one picks up the updated log level:

<root level= "debug" >
     <!-- <appender-ref ref= "Console" /> (to console) -->
     <appender-ref ref= "RollingFile" /> <!-- And to a rotated file -->
</root>

7. If the problem persists, file a ticket with Splunk Support.
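Step 6 can be scripted rather than edited by hand. The following is a minimal sketch, not an official Splunk tool: with_root_level is a hypothetical helper name, and it assumes the file contains a single <root level="..."> element like the one shown in step 6. It tolerates optional whitespace around the equals sign.

```shell
# Print FILE ($1) with the level attribute of <root ...> set to LEVEL ($2).
# The file itself is not modified; redirect the output to apply the change.
with_root_level() {
  sed -E 's/(<root[[:space:]]+level[[:space:]]*=[[:space:]]*")[^"]*(")/\1'"$2"'\2/' "$1"
}

# Example (assumes SPLUNK_HOME is set; keeps a backup of the original file):
#   conf="$SPLUNK_HOME/etc/apps/SA-ITOA/default/log4j_rules_engine.xml"
#   cp "$conf" "$conf.bak" && with_root_level "$conf.bak" debug > "$conf"
```

Running the helper again with the original level restores the previous setting once troubleshooting is finished.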

Issue: The Rules Engine search (| itsirulesengine) is failing

You see the following error message:

Exception while parsing metadata JSON: Unexpected character in string: '\0A'

Mitigation

  1. Update your Java version to 1.8 or higher. See Java requirements in the Install and Upgrade manual.
  2. Set the JAVA_HOME environment variable to the new Java version.
  3. Restart your Splunk software.
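Before restarting, it can help to confirm which Java version is actually picked up. The sketch below is illustrative only: java_major is a hypothetical helper that parses the quoted version string printed by java -version (for example "1.8.0_281" or "11.0.2") and reports the major version, so that 1.8 is reported as 8. It uses JAVA_HOME when that variable is set and falls back to the java found on PATH otherwise.

```shell
# Print the major Java version found in `java -version` output passed as $1.
java_major() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$ver" in
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;   # legacy scheme: 1.8.0_281 -> 8
    *)   printf '%s\n' "${ver%%[.+-]*}" ;;              # modern scheme: 11.0.2 -> 11
  esac
}

# Check the Java binary that JAVA_HOME (or PATH) points to, if one exists.
java_bin="${JAVA_HOME:+$JAVA_HOME/bin/}java"
if command -v "$java_bin" >/dev/null 2>&1; then
  major=$(java_major "$("$java_bin" -version 2>&1)")
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $major detected"
  else
    echo "Could not confirm Java 1.8 or higher (detected: ${major:-unknown})"
  fi
fi
```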

Issue: Events are being grouped by the default aggregation policy instead of a custom policy

Even though notable events are being ingested into ITSI, they aren't being grouped by any of the custom aggregation policies you've created.

Mitigation

Events are only grouped by the default aggregation policy when they don't match the filtering criteria of any existing policies. Check the filtering criteria of the policies you've created. For more information, see Configure episode filtering and breaking criteria in ITSI.

Issue: Events are added to new episodes instead of to existing active episodes

The sub-group limit (sub_group_limit) has been exceeded, so the hash keys and the episodes associated with them are cleared from memory. Any new events with an old hash key create new sub-groups, which causes new episodes to be created.

Mitigation

  1. Create a local copy of itsi_rules_engine.properties at $SPLUNK_HOME/etc/apps/SA-ITOA/local.
  2. Increase the sub_group_limit setting in the local copy.

For more information about sub-groups and other sizing parameters, see Tune episode and aggregation policy sizing parameters in ITSI.
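The local copy can be created with a small script so that an existing local file is never overwritten. This is a sketch under stated assumptions: ensure_local_copy is a hypothetical helper name, and it assumes the shipped file lives in the app's default directory. It deliberately does not choose a new sub_group_limit value, because the right value depends on your environment.

```shell
# Copy APP_DIR/default/NAME to APP_DIR/local/NAME unless a local copy exists.
# Prints the path of the local file.
ensure_local_copy() {
  app_dir=$1
  name=$2
  mkdir -p "$app_dir/local" || return 1
  if [ ! -e "$app_dir/local/$name" ]; then
    cp "$app_dir/default/$name" "$app_dir/local/$name" || return 1
  fi
  printf '%s\n' "$app_dir/local/$name"
}

# Example (assumes SPLUNK_HOME is set):
#   f=$(ensure_local_copy "$SPLUNK_HOME/etc/apps/SA-ITOA" itsi_rules_engine.properties)
#   grep -n 'sub_group_limit' "$f"    # locate the setting, then edit its value
```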

Issue: Java process not starting when the Rules Engine search is executed

The Java process isn't starting when the itsirulesengine command is executed and no itsi_rules_engine.log files are generated.

Mitigation

  1. If you see the following error message, verify that /opt/splunk/etc/apps/SA-ITOA/bin/itsirulesengine has execute permissions:

    Invalid message received from external search command during setup, see search.log

    Run the following command from /opt/splunk/etc/apps/SA-ITOA/bin to see if you get a permission denied error:

    /opt/splunk/etc/apps/SA-ITOA/bin/itsirulesengine -J-Xmx2048M -Dlog4j.configurationFile=../default/log4j_rules_engine.xml -DitsiRulesEngine.configurationFile=../default/itsi_rules_engine.properties -Dfile.encoding=UTF-8 -Dconfig.file=../default/akka_application.conf

    If you get a permission denied error, add execute permission to the file.
  2. Make sure no duplicate JAR files are present under /opt/splunk/etc/apps/SA-ITOA/lib/java/event_management/libs and that all the JAR files have execute permission for the Splunk user.
  3. Make sure you aren't running a 32-bit JRE or JDK. For more information, see Java requirements in the Install and Upgrade Manual.
  4. Make sure rtsearch, the real-time search capability, isn't disabled in the [role_admin] stanza in $SPLUNK_HOME/etc/apps/itsi/default/authorize.conf.
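The execute-permission check on the JAR files can be sketched as a short shell function. list_non_executable_jars is a hypothetical helper name, not part of ITSI. Run it as the user that runs Splunk, because the -x test reflects the permissions of the current user. It does not detect duplicate JAR files.

```shell
# Print every *.jar file in DIR ($1) that the current user cannot execute.
list_non_executable_jars() {
  for f in "$1"/*.jar; do
    [ -e "$f" ] || continue            # the glob matched nothing
    [ -x "$f" ] || printf '%s\n' "$f"
  done
}

# Example, using the path from step 2:
#   list_non_executable_jars /opt/splunk/etc/apps/SA-ITOA/lib/java/event_management/libs
```

An empty result means every JAR file in the directory is executable by the current user.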

Issue: "Connection Refused" error

You see the following error message:

error=Connection refused java.lang.RuntimeException: Connection refused

Mitigation

Restart the Rules Engine, because this error can mean that splunkd wasn't set up properly:

  1. Within Splunk Web, go to Settings > Searches, reports, and alerts.
  2. In the App dropdown, select All.
  3. Use the filter to locate the itsi_event_grouping search.
  4. Click Actions > Disable.
  5. Wait about 10 seconds, then re-enable the search.

If the restart doesn't solve the problem, check your network permissions.

Issue: Notable event KV store collections are growing very large

This issue occurs because the indexed real-time search repeatedly returns the same events from buckets that use tsidx reduction.

Mitigation

Disable tsidx reduction on the itsi_tracked_alerts and itsi_summary indexes and rebuild all old buckets on these indexes. For more information, see Reduce tsidx disk usage in the Managing Indexers and Clusters of Indexers manual.
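To see whether tsidx reduction is currently enabled on an index, you can inspect the merged index configuration. The sketch below is an illustration under assumptions: conf_value is a hypothetical helper that reads .conf-style text on standard input, and the example assumes that the splunk btool indexes list command prints the stanza header followed by key = value lines, and that enableTsidxReduction is the indexes.conf setting that controls the feature.

```shell
# Print the value of setting KEY ($2) inside stanza [NAME] ($1).
# Reads .conf-style text (stanza headers and "key = value" lines) on stdin.
conf_value() {
  awk -v stanza="[$1]" -v key="$2" '
    /^\[/ { in_stanza = ($0 == stanza); next }
    in_stanza {
      split($0, kv, "=")
      k = kv[1]
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      if (k == key) {
        v = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        print v
      }
    }
  '
}

# Example (assumes SPLUNK_HOME is set):
#   "$SPLUNK_HOME/bin/splunk" btool indexes list itsi_tracked_alerts \
#     | conf_value itsi_tracked_alerts enableTsidxReduction
```

Repeat the check for the itsi_summary index. Parsing the btool output rather than a single indexes.conf file accounts for settings layered across the default and local directories.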

Issue: Upon upgrade, the Rules Engine search command fails

You see the following error message:

Error occurred during initialization of VM

Mitigation

This issue occurs because 32-bit Java can't run the Rules Engine with the new memory settings introduced in version 4.3.x.

  1. Open or create a local copy of commands.conf at $SPLUNK_HOME/etc/apps/SA-ITOA/local.
  2. Add the following stanza:
    [itsirulesengine]
     command.arg.1=-J-Xmx1024M
     # reduced to 1024MB for 32 bit JDK/JRE
  3. Restart the Rules Engine, either by disabling and re-enabling the itsi_event_grouping search, or by restarting your Splunk software.
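To check whether this issue applies to your host, you can look for a 64-bit marker in the output of java -version. The following is a heuristic sketch, not a definitive test: is_64bit_jvm is a hypothetical helper, and it assumes a HotSpot or OpenJDK build, which prints a line such as "64-Bit Server VM" when the JVM is 64-bit.

```shell
# Succeed if the `java -version` output passed as $1 mentions a 64-bit VM.
is_64bit_jvm() {
  case "$1" in
    *64-Bit*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Check the Java binary that JAVA_HOME (or PATH) points to, if one exists.
java_bin="${JAVA_HOME:+$JAVA_HOME/bin/}java"
if command -v "$java_bin" >/dev/null 2>&1; then
  if is_64bit_jvm "$("$java_bin" -version 2>&1)"; then
    echo "64-bit JVM detected"
  else
    echo "No 64-bit marker found; this might be a 32-bit JVM"
  fi
fi
```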

Issue: ITSI is constantly using 100% CPU because multiple java.exe processes are running

This issue occurs because ITSI Event Analytics is incompatible with Splunk Enterprise versions 7.2.4 through 7.2.10.

Mitigation

Perform the workaround in SPL-155648.

Last modified on 24 March, 2022

This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.11.0

