Configure the Rules Engine to handle indexer cluster rolling restarts and upgrades

Configure the Rules Engine to automatically adjust for indexer cluster rolling restarts so you can avoid duplicate processing of events and unexpected breaks in episodes. During an indexer cluster rolling restart, search results are incomplete and real-time searches are restarted every time a new indexer completes its restart process. The ITSI Rules Engine must run searches to rebuild its in-memory state every time it restarts. When those searches return incomplete or inconsistent results, the Rules Engine processes events more than once and breaks episodes unnecessarily.

Because the Rules Engine can't reliably detect the rolling restart or upgrade of the indexer cluster on its own, you must manually configure the cluster masters and search heads so that the Rules Engine can query the cluster masters for the status of the rolling restart or upgrade.

After you perform these setup steps once, the following events take place anytime you initiate an indexer rolling restart or upgrade, and right before every periodic backfill operation:

When the restart or upgrade is initiated, the Rules Engine stops.
Upon startup, the Rules Engine immediately queries the cluster masters to get the status of the rolling restart or upgrade.
If the query returns "true", meaning a rolling restart is in progress, the Rules Engine restarts and repeats the indexer cluster health status check.
The Rules Engine repeats these steps indefinitely until the cluster masters return a healthy status.
Once the cluster masters return a healthy status, the Rules Engine proceeds with rebuilding its in-memory state.
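
The sequence above can be pictured as a simple polling loop. The following Python sketch is illustrative only and is not the Rules Engine's actual implementation: the function name, the `restart_in_progress` callable, and the `delay_seconds` parameter are all hypothetical, and a plain loop stands in for the Rules Engine restarting itself between checks. The optional `max_retries` argument mirrors the retry limit described later in this topic.

```python
import time
from typing import Callable, Optional


def wait_for_healthy_cluster(
    restart_in_progress: Callable[[], bool],
    max_retries: Optional[int] = None,
    delay_seconds: float = 0.0,
) -> bool:
    """Poll a status check until no rolling restart is reported.

    Returns True once the check reports a healthy cluster, or False if
    max_retries status checks all reported a restart in progress.
    max_retries=None means keep checking indefinitely.
    """
    attempts = 0
    while restart_in_progress():  # "true" means a rolling restart is running
        attempts += 1
        if max_retries is not None and attempts >= max_retries:
            return False  # limit reached: stop checking, continue startup
        time.sleep(delay_seconds)  # stand-in for the Rules Engine restarting
    return True  # healthy: safe to rebuild in-memory state


# Simulated cluster: two checks report a restart, the third is healthy.
responses = iter([True, True, False])
healthy = wait_for_healthy_cluster(lambda: next(responses))
print(healthy)
```

In this sketch the status check is injected as a callable so the control flow can be exercised without a live cluster master.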

If the cluster is not configured properly on the search head, or if the credentials on the search head are missing or wrong, the Rules Engine treats the cluster as healthy and proceeds with its startup even if a rolling restart or upgrade is in progress. Therefore, make sure you set up the credentials properly to avoid cluster configuration errors.

Configure cluster masters and search heads

Perform the following steps to enable the Rules Engine to query the cluster masters and get the status of the indexer cluster rolling restart or upgrade.

Step 1: Configure the indexer cluster masters

In order for the Rules Engine to access the cluster master status REST endpoint, it needs an authenticated user with the correct authorization capability. Create a service account with at least the list_indexer_cluster capability.

1. Create a new role called sa_user_cluster_status with the list_indexer_cluster capability.
2. Create a new user and assign it the sa_user_cluster_status role.
3. Note the hostname of the cluster master, which you'll use as the realm parameter when adding the credentials on the search head.
4. Repeat steps 1-3 on all cluster masters that have been added to the search heads as searchable clusters.
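
If you prefer to script the role and user creation instead of using Splunk Web, the following Python sketch shows one way the two requests might be built. It assumes the standard Splunk REST endpoints for roles (`services/authorization/roles`) and users (`services/authentication/users`); the base URL, the user name `svc_itsi_cluster`, and the password placeholder are hypothetical. The sketch only constructs the requests and does not send them. Sending them requires admin credentials on the cluster master, which are left out here.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import Request

ROLE = "sa_user_cluster_status"


def build_requests(base_url: str, username: str, password: str) -> List[Request]:
    """Build (but do not send) the POST requests for the role and the user."""
    role = Request(
        f"{base_url}/services/authorization/roles",
        data=urlencode(
            {"name": ROLE, "capabilities": "list_indexer_cluster"}
        ).encode(),
        method="POST",
    )
    user = Request(
        f"{base_url}/services/authentication/users",
        data=urlencode(
            {"name": username, "password": password, "roles": ROLE}
        ).encode(),
        method="POST",
    )
    return [role, user]


# Placeholder values; replace them with your cluster master and account details.
requests_to_send = build_requests(
    "https://master1:8089", "svc_itsi_cluster", "<password>"
)
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```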

Step 2: Configure the search heads

The Rules Engine needs the plaintext password of the service account to access the cluster master REST endpoint.

On one of the search heads, add the username, password, and realm of the service account to the search head password storage. You can add the information through the storage/passwords REST endpoint. For example:

curl -k -u admin:Chang3d! https://localhost:8089/servicesNS/nobody/SA-ITOA/storage/passwords -d name=<username> -d password=<password> -d realm=<cluster_master_hostname>

The realm is the hostname of the cluster master. Make sure the realm matches the host part of the master_uri field returned from the services/cluster/config endpoint on the search head. For example, if https://localhost:8089/services/cluster/config returns "master_uri": "https://master1:8089", the realm is master1.
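
To make the realm rule concrete, the following Python sketch extracts the host part from a master_uri value using the standard library. The function name is hypothetical, and the sketch assumes you have already copied the master_uri value from the endpoint's response.

```python
from urllib.parse import urlsplit


def realm_from_master_uri(master_uri: str) -> str:
    """Return the host part of master_uri, without scheme or port."""
    # Note: urlsplit lowercases the hostname it returns.
    host = urlsplit(master_uri).hostname
    if host is None:
        raise ValueError(f"no host found in master_uri: {master_uri!r}")
    return host


realm = realm_from_master_uri("https://master1:8089")
print(realm)  # master1
```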

Limit the number of Rules Engine retries

By default, the Rules Engine indefinitely queries the cluster masters until they all return a healthy state. You can limit the number of retries if you want the Rules Engine to only attempt a specified number of status checks. After the specified number of attempts, the Rules Engine posts a message in Splunk Web and continues with the startup process of restoring active groups, backfilling events, and processing events.

Prerequisites

Only users with file system access, such as system administrators, can limit the number of cluster master checks using configuration files.
Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
You can have configuration files with the same name in your default, local, and app directories. Read Where you can place (or find) your modified configuration files in the Splunk Enterprise Admin Manual.

Never change or copy the configuration files in the default directory. The files in the default directory must remain intact and in their original location. Make changes to the files in the local directory.

Steps

Open or create a local itsi_rules_engine.properties file at $SPLUNK_HOME/etc/apps/SA-ITOA/local.
Add the following setting and specify the number of retries:
```
max_cluster_rolling_restart_retry_count = <integer>
```
Restart your Splunk software.
