Migrate Splunk UBA using the backup and restore scripts
Use the backup and restore scripts located in /opt/caspida/bin/utils
to migrate your Splunk UBA deployment to the next larger size on the same operating system. For example, you can migrate from 5 nodes to 7 nodes, or 10 nodes to 20 nodes. If you want to migrate from 7 nodes to 20 nodes, migrate from 7 nodes to 10 nodes first, then from 10 nodes to 20 nodes.
Below is a summary of the migration process using the backup and restore scripts. For example, to migrate from a 3-node cluster to a 5-node cluster:
- Run the uba-backup.sh script on the 3-node cluster. The script stops Splunk UBA, performs the backup, then restarts Splunk UBA on the 3-node cluster.
- Set up the 5-node cluster so that all nodes meet the system requirements, and install Splunk UBA. The version number of the Splunk UBA software must match the version number of the backup. See the Splunk UBA installation checklist in Install and Upgrade Splunk User Behavior Analytics to begin a Splunk UBA installation.
- Verify that Splunk UBA is up and running in the 5-node cluster. See Verify successful installation in Install and Upgrade Splunk User Behavior Analytics.
- Run the uba-restore.sh script on the 5-node cluster. The script stops Splunk UBA, restores the system from the earlier backup, then starts Splunk UBA. A command-level sketch of this process is shown below.
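At the command level, the flow looks roughly like the following sketch. The rsync transfer step, the target-master hostname, and the <backup-folder> placeholder are illustrative assumptions; use whatever transfer mechanism and backup location fit your environment.

# On the master node of the 3-node (source) cluster, create a full backup.
/opt/caspida/bin/utils/uba-backup.sh

# Copy the backup folder to the master node of the 5-node (target) cluster.
# The rsync transfer, hostname, and <backup-folder> placeholder are assumptions.
rsync -av /var/vcap/ubabackup/<backup-folder>/ caspida@target-master:/var/vcap/ubabackup/<backup-folder>/

# On the master node of the 5-node cluster, restore from that backup folder.
/opt/caspida/bin/utils/uba-restore.sh --folder /var/vcap/ubabackup/<backup-folder>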
You can also use the backup and restore scripts to capture backups of your Splunk UBA system, in addition to or in place of the automated incremental backups. See Backup and restore Splunk UBA using automated incremental backups.
Requirements for using the backup and restore scripts
Make sure the following requirements are met before using the backup and restore scripts:
- The target system you are migrating to must be set up with Splunk UBA already up and running.
- The backup system and the target system you are migrating to must have the same version of Splunk UBA running on the same operating system.
- The target system you are migrating to must be the same size or one deployment size larger than the backup system. See Plan and scale your Splunk UBA deployment for information about the supported Splunk UBA deployment sizes.
Back up Splunk UBA using the backup script
Perform a full backup of Splunk UBA using the /opt/caspida/bin/utils/uba-backup.sh
script. View the command line options by using the --help
option. The following table describes the options you can use with the script.
Option | Description |
---|---|
--archive | Create a single archive containing all of the backup data. The archive is created after the backup is completed and Splunk UBA is restarted. |
--archive-type %FORMAT% | Specify the type of archive you want to create. Install a package called pigz on the master node to use multi-threaded compression when creating the archive: yum -y install pigz |
--dateformat %FORMAT% | Override the default date/time format for the backup folder name. If this option is not used, the folder name is based on the ISO 8601 format YYYY-MM-DD. To specify a backup folder name in the typical format used in the United States, specify MM-DD-YYYY. Using this option also overrides the date/time format of the logging messages. |
--folder %FOLDER% | Override the target folder location where the backup is stored. Use this option if you configured a secondary volume for storing backups, such as another 1TB disk on the management node. Don't use NFS because of its performance ramifications. |
--log-time | Add additional logging for how long each section takes, including all function calls and tasks. Use this option to help troubleshoot issues if your backup is taking more than two hours. |
--no-data | Don't back up any data, only the Splunk UBA configuration. |
--no-prestart | Don't start Splunk UBA before the backup begins, because Splunk UBA is already running. Make sure Splunk UBA is up and running before using this option. |
--no-start | Don't start Splunk UBA after the backup is completed. Use this option to perform additional post-backup actions that require Splunk UBA to be offline. |
--restart-on-fail | Restart Splunk UBA if the backup fails. If Splunk UBA encounters an error during the backup, the script attempts to restart Splunk UBA so the system does not remain offline. |
--script %FILENAME% | Run the specified script after the backup is completed. Use this with the --no-start option if your script requires Splunk UBA to be offline. |
--skip-hdfs-fsck | Skip the HDFS file system consistency check. This is useful in large environments if you want to skip this check due to time constraints. |
--use-distcp | Perform a parallel backup of Hadoop. If the HDFS export is taking several hours, use this option to perform a parallel backup, which may be faster. Use the --log-time option to examine how long the HDFS export is taking. |
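For example, a minimal sketch of a backup that skips the pre-start (because Splunk UBA is already running), writes to a dedicated backup volume, and records per-section timing might look like the following. The /backup/uba path is an assumption for illustration; substitute your own backup location.

# Assumes Splunk UBA is already running and /backup/uba is a dedicated backup volume (illustrative path).
/opt/caspida/bin/utils/uba-backup.sh --no-prestart --folder /backup/uba --log-time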
Below is an example backup.
- Log in to the master node of your Splunk UBA deployment as the caspida user using SSH.
- Navigate to the /opt/caspida/bin/utils folder: cd /opt/caspida/bin/utils
- Run the backup script. Below is the command and its output:
[caspida@ubanode1]$ /opt/caspida/bin/utils/uba-backup.sh --no-prestart
UBA Backup Script - Version 1.9.2
Backup started at: Wed Jan 8 12:11:10 PST 2020
Backup running on: ubanode1.example.domain
Logfile: /var/log/caspida/uba-backup-2020-01-08_12-11-10.log
Script Name: uba-backup.sh
Script SHA: 06170431f2791e579bcba055df79d472d9c68614cf6c4c2497eb62ed48422e6a
Parsing any CLI args
 - Disabling UBA pre-start before backup
Node Count: 1
Testing SSH connectivity to UBA node 1 (ubanode1)
Attempting to resolve the IP of UBA node ubanode1
UBA node ubenode1 resolves to 192.168.19.88
Not starting UBA (pre-backup), disabled via CLI
Backup folder: /var/vcap/ubabackup/2020-01-08_12-11-11
Creating backup folder
Changing ownership of the backup folder
WARNING: No datasources were found as active in UBA
Determining current counts/stats from PostgreSQL
Stopping UBA (full)
Starting UBA (partial)
Checking that HDFS isnt in safe-mode
 - Safe mode is disabled
Performing fsck of HDFS (this may take a while)
Creating backup of deployment configuration
Creating backup of local configurations
Creating backup of UBA rules
Creating backup of version information
Creating backup of PostgreSQL caspidadb database on UBA node 1 (spuba50)
Creating backup of PostgreSQL metastore database on UBA node 1 (spuba50)
Logging PostgreSQL sparkserverregistry table
Creating backup of Hadoop HDFS (this may take a while)
 - Checking status of PID 30850 (2020-01-08_12-16-22)
 - Backup job has finished (total size: 683M)
Logging Redis information
Stopping UBA (full)
Creating backup of timeseries data
Creating backup of Redis database (parallel mode)
 - Performing backup of UBA node 1 (spuba50)
 - Waiting for pid 12772 to finish
 - Process finished successfully
Creating summary of backup
Starting UBA (full)
Backup completed successfully
Time taken: 0 hour(s), 10 minute(s), 58 second(s)
You can review the log file in /var/log/caspida/uba-backup-<timestamp>.log.
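If you want a quick pass over the log for problems, a simple grep such as the following works; the exact message wording varies by version, so treat this as a convenience check rather than a definitive verification.

grep -iE 'warn|error|fail' /var/log/caspida/uba-backup-<timestamp>.log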
Restore Splunk UBA using the restore script
After you have created a backup, restore Splunk UBA using the /opt/caspida/bin/utils/uba-restore.sh
script. View the command line options by using the --help
option. The following table describes the options you can use with the script.
Option | Description |
---|---|
--dateformat %FORMAT% | Override the default date/time format for the backup folder name. If this option is not used, the folder name is based on ISO 8601 format YYYY-MM-DD. To specify a backup folder name in the typical format used in the United States, specify MM-DD-YYYY. Using this option also overrides the date/time format of the logging messages. |
--folder %FOLDER% | Override the source folder to perform the restore. By default, the script looks for the backup in the /var/vcap/ubabackup directory. If you used the --folder option when creating the backup to store the backup in a different directory, specify that same directory using the --folder option when restoring Splunk UBA. |
--log-time | Add additional logging for how long each section takes, including all function calls and tasks. Use this option to help troubleshoot issues if your restore is taking a long time. |
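A minimal invocation that combines these options might look like the following sketch; the /backup/uba/<backup-folder> path is an illustrative assumption. A full example with output follows.

# Restore from a backup stored outside the default location and log per-section timing.
# The /backup/uba/<backup-folder> path is an illustrative assumption.
/opt/caspida/bin/utils/uba-restore.sh --folder /backup/uba/<backup-folder> --log-time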
Below is an example restore:
- Log in to the master node of your Splunk UBA deployment as the caspida user using SSH.
- Navigate to the /opt/caspida/bin/utils folder: cd /opt/caspida/bin/utils
- Run the restore script. In this example, we are restoring from a backup in the /home/caspida directory from a single-node system to a 3-node deployment. Below is the command and its output:
[caspida@ubanode1]$ /opt/caspida/bin/utils/uba-restore.sh --folder /home/caspida/2020-01-08_12-11-11/
UBA Restore Script - Version 1.9.2
Backup started at: Wed Jan 8 12:26:57 PST 2020
Backup running on: ubanode1.example.domain
Logfile: /var/log/caspida/uba-restore-2020-01-08_12-26-57.log
Script Name: uba-restore.sh
Script SHA: 4819a5d2ed713a5a040dfeb4dd30fed0a42406f2238d002ade7d293c3460285f
Parsing any CLI args
 - Set source folder to /home/caspida/2020-01-08_12-11-11
Node Count: 3
Backup Node Count: 1
Detected migration from 1-node to 3-node
WARNING: The hostnames from the backup/restore hosts differ, this will be a migration
Execution Mode: Migration
Testing SSH connectivity to UBA node 1 (ubanode1)
Testing SSH connectivity to UBA node 2 (ubanode2)
Testing SSH connectivity to UBA node 3 (ubanode3)
Attempting to resolve the IP of UBA node ubanode1
UBA node ubanode1 resolves to 192.168.19.88
Attempting to resolve the IP of UBA node ubanode2
UBA node ubanode2 resolves to 192.168.19.89
Attempting to resolve the IP of UBA node ubanode3
UBA node ubanode3 resolves to 192.168.19.90
Attempting to retrieve the IP of each node (old)
Stopping UBA (full)
Starting PostgreSQL
Logging PostgreSQL sparkserverregistry table (pre-restore)
Restoring PostgreSQL caspidadb database on UBA node 1 (ubanode1)
Restoring PostgreSQL metastore database on UBA node 1 (ubanode1)
Performing reset of connector stats/JMS schemas
Stopping PostgreSQL
Restoring timeseries data
Backing up existing uba-system-env.sh/uba-tuning.properties
Restoring local configurations
Restoring UBA rules
Restoring uba-system-env.sh/uba-tuning.properties
Syncing UBA cluster
Starting UBA (partial)
Checking that HDFS isnt in safe-mode
 - Safe mode is disabled
Checking if the /user folder exists in HDFS
 - Folder exists, will attempt to remove
Removing existing Hadoop HDFS content (attempt 1)
Waiting for 10 seconds before proceeding
Checking if the /user folder exists in HDFS
 - Folder has been removed, will continue
Restoring Hadoop HDFS (this may take a while)
 - Checking status of PID 5754 (2020-01-08_12-32-47) - Restore is still running, please wait - Folder size: 242.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-33-09) - Restore is still running, please wait - Folder size: 254.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-33-31) - Restore is still running, please wait - Folder size: 543.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-33-53) - Restore is still running, please wait - Folder size: 543.6 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-34-15) - Restore is still running, please wait - Folder size: 546.4 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-34-37) - Restore is still running, please wait - Folder size: 547.1 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-34-59) - Restore is still running, please wait - Folder size: 548.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-35-21) - Restore is still running, please wait - Folder size: 549.6 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-35-43) - Restore is still running, please wait - Folder size: 550.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-36-06) - Restore is still running, please wait - Folder size: 550.8 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-36-28) - Restore is still running, please wait - Folder size: 551.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-36-50) - Restore is still running, please wait - Folder size: 553.0 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-37-12) - Restore is still running, please wait - Folder size: 554.1 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-37-34) - Restore is still running, please wait - Folder size: 554.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-37-56) - Restore is still running, please wait - Folder size: 555.9 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-38-18) - Restore is still running, please wait - Folder size: 556.6 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-38-40) - Restore is still running, please wait - Folder size: 557.6 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-39-02) - Restore is still running, please wait - Folder size: 558.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-39-24) - Restore is still running, please wait - Folder size: 561.9 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-39-46) - Restore is still running, please wait - Folder size: 562.8 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-40-08) - Restore is still running, please wait - Folder size: 563.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-40-30) - Restore is still running, please wait - Folder size: 564.3 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-40-52) - Restore is still running, please wait - Folder size: 564.9 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-41-14) - Restore is still running, please wait - Folder size: 566.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-41-36) - Restore is still running, please wait - Folder size: 566.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-41-58) - Restore is still running, please wait - Folder size: 567.3 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-42-20) - Restore is still running, please wait - Folder size: 568.1 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-42-42) - Restore is still running, please wait - Folder size: 568.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-43-04) - Restore is still running, please wait - Folder size: 570.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-43-26) - Restore is still running, please wait - Folder size: 570.7 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-43-48) - Restore is still running, please wait - Folder size: 571.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-44-10) - Restore is still running, please wait - Folder size: 571.9 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-44-32) - Restore is still running, please wait - Folder size: 572.6 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-44-54) - Restore is still running, please wait - Folder size: 573.2 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-45-16) - Restore is still running, please wait - Folder size: 574.1 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-45-38) - Restore is still running, please wait - Folder size: 589.0 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-46-00) - Restore is still running, please wait - Folder size: 601.5 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-46-23) - Restore is still running, please wait - Folder size: 610.3 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-46-45) - Restore is still running, please wait - Folder size: 646.0 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-47-07) - Restore is still running, please wait - Folder size: 647.0 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-47-29) - Restore is still running, please wait - Folder size: 647.5 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-47-51) - Restore is still running, please wait - Folder size: 665.0 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-48-13) - Restore is still running, please wait - Folder size: 669.1 M (target: 669.3 M)
 - Checking status of PID 5754 (2020-01-08_12-48-35) - Backup job has finished
Changing ownership of Hadoop HDFS files
Updating location of hive warehouse
Updating location of caspida analytics database
Flushing existing Redis data
Stopping Redis for Redis restore
Restoring Redis database (parallel mode)
Triggering Redis restore of node 1 to 3
 - Performing restore of data from UBA node 1 to UBA node 3 (ubanode3)
 - Performing rsync of database from UBA node 1 to UBA node 3
 - Waiting for pid 12992 to finish
 - Process finished successfully
Starting Redis after Redis restore
Retrieving Redis information (pre-fixup)
Performing Redis restore fixup (attempt 1)
Performing Redis restore rebalance (attempt 1)
Successfully finished Redis fixup/rebalance
Retrieving Redis information (post-fixup)
Determining number of Redis keys (post-restore)
 - Retrieving keys from ubanode3 (ubanode3)
Redis key counts match (3207 vs 3207)
Comparing Redis keys
All Redis keys were found
Tuning configuration
Stopping UBA (partial)
Syncing UBA cluster
Copying /opt/caspida/conf to hdfs /user/caspida/config/etc/caspida/conf
Configuring containerization
Starting UBA (full)
Testing Impala
Logging PostgreSQL sparkserverregistry table (post-restore)
Comparing PostgreSQL backup/restore counts/stats
Migrating datasources
 - Deleting file-based datasource: 0_resolution-rainbow.infoblox
 - Deleting file-based datasource: HR-rainbow.csv
 - Migrating datasource: fileaccess
 - Deleting file-based datasource: rainbow.ad_multiline
 - Deleting file-based datasource: rainbow.ad_snare_flat
 - Deleting file-based datasource: rainbow.box
 - Deleting file-based datasource: rainbow.box_events
 - Deleting file-based datasource: rainbow.brivo
 - Deleting file-based datasource: rainbow.cef
 - Deleting file-based datasource: rainbow.ciscosa
 - Deleting file-based datasource: rainbow.o365_msg_trace
 - Deleting file-based datasource: rainbow.pan
 - Deleting file-based datasource: rainbow.splunk_cs
 - Deleting file-based datasource: rainbow.symantecdlp_dmp
 - Deleting file-based datasource: rainbow.symantecdlp_endpoint
 - Deleting file-based datasource: rainbow.webgateway
 - Deleting file-based datasource: rainbow.weblog
Restore completed successfully
Time taken: 0 hour(s), 29 minute(s), 19 second(s)
You can review the log file in /var/log/caspida/uba-restore-<timestamp>.log.
- If you are integrating Splunk UBA with Splunk Enterprise Security (ES), install the Splunk ES SSL certificates in the restored deployment. See Configure the Splunk platform to receive data from the Splunk UBA output connector in the Send and Receive Data from the Splunk Platform manual.
Verify Splunk UBA is up and running
See Verify successful installation in Install and Upgrade Splunk User Behavior Analytics for information about how to verify that Splunk UBA is up and running properly.
You can also run the uba_pre_check.sh
script as part of this verification. See Check system status before and after installation in Install and Upgrade Splunk User Behavior Analytics.
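For example, assuming the script lives in the same /opt/caspida/bin/utils directory as the backup and restore scripts (check the linked topic for the exact location and arguments in your version):

# Illustrative assumption: run the pre-check script from the utils directory on the master node.
/opt/caspida/bin/utils/uba_pre_check.sh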