Change the role of both systems to switch the primary and standby systems
Perform the following tasks if you want to switch the roles of both systems. For example, if System B failed over to System A, switch the roles of both systems so that System A becomes the new primary system and System B becomes the new standby system.
- Run a full sync on your current primary system (for example, System B). This step ensures minimal data loss.
curl -X POST -k -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)" https://localhost:9002/jobs/trigger?name=ReplicationCoordinator
After you start the full sync, you can view the /var/log/caspida/replication/replication.log file on the management node of the primary system for additional information about the progress and status of the sync.
- On System B, run the following commands:
/opt/caspida/bin/CaspidaCleanup
/opt/caspida/bin/Caspida stop
- On System A, run the following command:
/opt/caspida/bin/Caspida stop
- On both System A and System B, set the following properties in /etc/caspida/local/conf/uba-site.properties:
- Enable replication:
replication.enabled=true
- Switch the active and standby systems:
replication.primary.host=<management node of System A>
replication.standby.host=<management node of System B>
- On System A, enable the replication system job by adding the ReplicationCoordinator property to the /etc/caspida/local/conf/caspida-jobs.json file on the management node. The ReplicationCoordinator must be set to true. Below is a sample of the file before adding the property:
/**
 * Copyright 2014 - Splunk Inc., All rights reserved.
 * This is Caspida proprietary and confidential material and its use
 * is subject to license terms.
 */
{
  "systemJobs": [
    {
      // "name" : "ThreatComputation",
      // "cronExpr" : "0 0 0/1 * * ?",
      // "jobArguments" : { "env:CASPIDA_JVM_OPTS" : "-Xmx4096M" }
    }
  ]
}
After adding the property, the file should look like this:
/**
 * Copyright 2014 - Splunk Inc., All rights reserved.
 * This is Caspida proprietary and confidential material and its use
 * is subject to license terms.
 */
{
  "systemJobs": [
    {
      // "name" : "ThreatComputation",
      // "cronExpr" : "0 0 0/1 * * ?",
      // "jobArguments" : { "env:CASPIDA_JVM_OPTS" : "-Xmx4096M" }
    },
    {
      "name" : "ReplicationCoordinator",
      "enabled" : true
    }
  ]
}
- On the management node of System B, remove the following properties from /etc/caspida/local/conf/caspida-jobs.json if they exist:
{ "name" : "ReplicationCoordinator", "enabled" : true }
- Run the following command to synchronize the cluster:
/opt/caspida/bin/Caspida sync-cluster /etc/caspida/local/conf/
- Enable replication:
- On the management node of System A, run the following command:
/opt/caspida/bin/replication/setup -d standby -m primary
If System A has been registered before, run the command again with the reset option:
/opt/caspida/bin/replication/setup -d standby -m primary -r
If you see ERROR: cannot execute ALTER SUBSCRIPTION in a read-only transaction when you run this command, it means subscription_caspida is present on the standby system. Perform the following steps:
- Run the following command to verify that subscription_caspida is on the standby system:
psql -d caspidadb -c "select * from pg_subscription where subname='subscription_caspida'"
- If subscription_caspida is in the table, run the following command to turn off read-only mode on the standby system:
psql -d caspidadb -c 'BEGIN; SET transaction read write; ALTER DATABASE caspidadb SET default_transaction_read_only = off; COMMIT'
- Run the following command again on the primary system. Follow the prompts to delete subscription_caspida from the standby system:
/opt/caspida/bin/replication/setup -d standby -m primary -r
- On the management node of System B, run the following command:
/opt/caspida/bin/replication/setup -d standby -m standby -r
- If System B is running RHEL, CentOS, or Oracle Linux, and the /var/vcap/sys/run/caspida directory does not currently exist, run the following command on each node in the cluster:
sudo mkdir -m a=rwx /var/vcap/sys/run/caspida
- On the management node of System A, run the following command:
/opt/caspida/bin/Caspida start
- On the management node of System B, start Splunk UBA without Caspida services:
/opt/caspida/bin/Caspida start-all --no-caspida
- On System A, verify that the initial sync between the systems has started. For instructions, see How Splunk UBA synchronizes the primary and standby systems.
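The property changes to uba-site.properties in the procedure above are easy to mistype when made by hand on two systems. The following is a minimal sketch of one way to script them. It is not part of Splunk UBA: set_prop is a hypothetical helper written for this illustration, and the host names in the usage comments are placeholders. It assumes the file uses plain key=value lines.

```shell
#!/usr/bin/env bash
# Sketch only: set_prop is a hypothetical helper, not a Splunk UBA tool.

# set_prop FILE KEY VALUE
# Replace the line that starts with "KEY=" in FILE, or append "KEY=VALUE"
# if no such line exists. Running it twice with the same arguments leaves
# the file unchanged, so the key is never duplicated.
set_prop() {
  local file="$1" key="$2" value="$3" tmp
  touch "$file"
  tmp="$(mktemp)"
  awk -v k="$key" -v v="$value" '
    # A line defines the key only if it begins with "KEY=".
    index($0, k "=") == 1 {
      if (!done) { print k "=" v; done = 1 }
      next
    }
    { print }
    END { if (!done) print k "=" v }
  ' "$file" > "$tmp"
  # Write back through the original path so ownership and mode are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example usage on each system (host names are placeholders):
#   SITE=/etc/caspida/local/conf/uba-site.properties
#   set_prop "$SITE" replication.enabled true
#   set_prop "$SITE" replication.primary.host system-a-mgmt.example.com
#   set_prop "$SITE" replication.standby.host system-b-mgmt.example.com
```

Back up the properties file before trying a script like this, and review the result before you synchronize the cluster.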
This documentation applies to the following versions of Splunk® User Behavior Analytics: 5.1.0, 5.1.0.1, 5.2.0, 5.2.1, 5.3.0, 5.4.0, 5.4.1