Splunk® User Behavior Analytics

Administer Splunk User Behavior Analytics


Migrate Splunk UBA using the backup and restore scripts

Use the backup and restore scripts located in /opt/caspida/bin/utils to migrate your Splunk UBA deployment to the next larger size on the same operating system. For example, you can migrate from 5 nodes to 7 nodes, or 10 nodes to 20 nodes. If you want to migrate from 7 nodes to 20 nodes, migrate from 7 nodes to 10 nodes first, then from 10 nodes to 20 nodes.

Below is a summary of the migration process using the backup and restore scripts, followed by a sketch of the commands involved. For example, to migrate from a 3-node cluster to a 5-node cluster:

  1. Run the uba-backup.sh script on the 3-node cluster. The script stops Splunk UBA, performs the backup, then restarts Splunk UBA on the 3-node cluster.
  2. Set up the 5-node cluster so that all nodes meet the system requirements, and install Splunk UBA. The version number of the Splunk UBA software must match the version number of the backup. See the Splunk UBA installation checklist in Install and Upgrade Splunk User Behavior Analytics to begin a Splunk UBA installation.
  3. Verify that Splunk UBA is up and running in the 5-node cluster. See Verify successful installation in Install and Upgrade Splunk User Behavior Analytics.
  4. Run the uba-restore.sh script on the 5-node cluster. The script stops Splunk UBA, restores the system from the earlier backup, then starts Splunk UBA.
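The commands behind these steps might look like the following sketch. The timestamped backup folder name is a placeholder, and you must transfer the backup folder to the master node of the new cluster using a method of your choice:

    # On the master node of the existing 3-node cluster: create a full backup.
    /opt/caspida/bin/utils/uba-backup.sh

    # On the master node of the new 5-node cluster, after copying the backup folder over:
    /opt/caspida/bin/utils/uba-restore.sh --folder /var/vcap/ubabackup/<backup-folder>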

Beyond migration, you can use the backup and restore scripts to capture backups of your Splunk UBA system, either alongside or in place of the automated incremental backups. See Backup and restore Splunk UBA using automated incremental backups.
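If you schedule the backup script yourself, keep in mind that the script stops Splunk UBA while the backup runs. The following crontab entry is only an illustrative sketch; the weekly schedule and the /backup/uba destination are assumptions, not recommendations from this manual:

    # In the caspida user's crontab: run a full backup every Sunday at 01:00
    # and append the script output to a log file.
    0 1 * * 0 /opt/caspida/bin/utils/uba-backup.sh --folder /backup/uba >> /var/log/caspida/uba-backup-cron.log 2>&1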

Requirements for using the backup and restore scripts

Make sure the following requirements are met before using the backup and restore scripts:

  • The target system you are migrating to must be set up with Splunk UBA already up and running.
  • The backup system and the target system you are migrating to must have the same version of Splunk UBA running on the same operating system.
  • The target system you are migrating to must be the same size or one deployment size larger than the backup system. See Plan and scale your Splunk UBA deployment for information about the supported Splunk UBA deployment sizes.

Back up Splunk UBA using the backup script

Perform a full backup of Splunk UBA using the /opt/caspida/bin/utils/uba-backup.sh script. View the command line options by using the --help option. The following options are available for the script.

--archive
    Create a single archive containing all of the backup data. The archive is created after the backup is completed and Splunk UBA is restarted.

--archive-type %FORMAT%
    Specify the type of archive you want to create:
      • Use gzip for good compression and fast archive creation.
      • Use bzip2 for better compression but slower archive creation.
      • Use xz for a balance between the compression and speed of gzip and bzip2.
      • Use tar for the fastest archive creation, but with no compression and large archive sizes.
    Install a package called pigz on the master node to use multi-threaded compression when creating .tgz archives. Use the following command to install this package:
      yum -y install pigz

--dateformat %FORMAT%
    Override the default date/time format for the backup folder name. If this option is not used, the folder name is based on the ISO 8601 format YYYY-MM-DD. To specify a backup folder name in the format typically used in the United States, specify MM-DD-YYYY. Using this option also overrides the date/time format of the logging messages.

--folder %FOLDER%
    Override the target folder location where the backup is stored. Use this option if you configured a secondary volume for storing backups, such as another 1TB disk on the management node. Avoid NFS because of its performance impact.

--log-time
    Add additional logging of how long each section takes, including all function calls and tasks. Use this option to help troubleshoot issues if your backup takes more than two hours.

--no-data
    Don't back up any data, only the Splunk UBA configuration.

--no-prestart
    Don't start Splunk UBA before the backup begins, because Splunk UBA is already running. Make sure Splunk UBA is up and running before using this option.

--no-start
    Don't start Splunk UBA after the backup is completed. Use this option to perform additional post-backup actions that require Splunk UBA to be offline.

--restart-on-fail
    Restart Splunk UBA if the backup fails. If Splunk UBA encounters an error during the backup, the script attempts to restart Splunk UBA so the system does not remain offline.

--script %FILENAME%
    Run the specified script after the backup is completed. Use this with the --no-start option if your script requires Splunk UBA to be offline.

--skip-hdfs-fsck
    Skip the HDFS file system consistency check. This is useful in large environments if you want to skip this check due to time constraints.

--use-distcp
    Perform a parallel backup of Hadoop. If the HDFS export takes several hours, use this option to perform a parallel backup, which may be faster. Use the --log-time option to examine how long the HDFS export takes.
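For example, the following command writes the backup to a secondary volume, creates a gzip archive of it, and logs how long each section takes. This is a sketch only; /backup/uba is a placeholder for your own backup volume:

    /opt/caspida/bin/utils/uba-backup.sh --archive --archive-type gzip --folder /backup/uba --log-time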

Below is an example backup.

  1. Log in to the master node of your Splunk UBA deployment as the caspida user using SSH.
  2. Navigate to the /opt/caspida/bin/utils folder:
    cd /opt/caspida/bin/utils
  3. Run the backup script. Below is the command and its output:
    [caspida@ubanode1]$ /opt/caspida/bin/utils/uba-backup.sh --no-prestart
    UBA Backup Script - Version 1.9.2
    Backup started at: Wed Jan  8 12:11:10 PST 2020
    Backup running on: ubanode1.example.domain
    Logfile:           /var/log/caspida/uba-backup-2020-01-08_12-11-10.log
    Script Name:       uba-backup.sh
    Script SHA:        06170431f2791e579bcba055df79d472d9c68614cf6c4c2497eb62ed48422e6a
    Parsing any CLI args
    - Disabling UBA pre-start before backup
    Node Count: 1
    Testing SSH connectivity to UBA node 1 (ubanode1)
    Attempting to resolve the IP of UBA node ubanode1
      UBA node ubanode1 resolves to 192.168.19.88
    Not starting UBA (pre-backup), disabled via CLI
    Backup folder: /var/vcap/ubabackup/2020-01-08_12-11-11
    Creating backup folder
    Changing ownership of the backup folder
    WARNING: No datasources were found as active in UBA
    Determining current counts/stats from PostgreSQL
    Stopping UBA (full)
    Starting UBA (partial)
    Checking that HDFS isnt in safe-mode
    - Safe mode is disabled
    Performing fsck of HDFS (this may take a while)
    Creating backup of deployment configuration
    Creating backup of local configurations
    Creating backup of UBA rules
    Creating backup of version information
    Creating backup of PostgreSQL caspidadb database on UBA node 1 (spuba50)
    Creating backup of PostgreSQL metastore database on UBA node 1 (spuba50)
    Logging PostgreSQL sparkserverregistry table
    Creating backup of Hadoop HDFS (this may take a while)
      - Checking status of PID 30850 (2020-01-08_12-16-22)
        - Backup job has finished (total size: 683M)
    Logging Redis information
    Stopping UBA (full)
    Creating backup of timeseries data
    Creating backup of Redis database (parallel mode)
      - Performing backup of UBA node 1 (spuba50)
      - Waiting for pid 12772 to finish
        - Process finished successfully
    Creating summary of backup
    Starting UBA (full)
    Backup completed successfully
    Time taken: 0 hour(s), 10 minute(s), 58 second(s)
    

You can review the log file in /var/log/caspida/uba-backup-<timestamp>.log.
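If you are migrating to a new cluster, copy the backup folder to the master node of the target deployment before you run the restore script. For example, you might transfer it with rsync. This is a sketch only; the newmaster host name and the destination path are placeholders:

    # Copy the timestamped backup folder from the old master node to the new one.
    rsync -av /var/vcap/ubabackup/2020-01-08_12-11-11/ caspida@newmaster:/home/caspida/2020-01-08_12-11-11/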

Restore Splunk UBA using the restore script

After you have created a backup, restore Splunk UBA using the /opt/caspida/bin/utils/uba-restore.sh script. View the command line options by using the --help option. The following options are available for the script.

--dateformat %FORMAT%
    Override the default date/time format for the backup folder name. If this option is not used, the folder name is based on the ISO 8601 format YYYY-MM-DD. To specify a backup folder name in the format typically used in the United States, specify MM-DD-YYYY. Using this option also overrides the date/time format of the logging messages.

--folder %FOLDER%
    Override the source folder for the restore. By default, the script looks for the backup in the /var/vcap/ubabackup directory. If you used the --folder option when creating the backup to store it in a different directory, specify that same directory with the --folder option when restoring Splunk UBA.

--log-time
    Add additional logging of how long each section takes, including all function calls and tasks. Use this option to help troubleshoot issues if your restore is taking a long time.
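Before you run the restore, confirm which backup folder you want to restore from. For example, list the timestamped folders in the default backup location, or in the custom location you passed to --folder when you created the backup:

    ls -lt /var/vcap/ubabackup/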

Below is an example restore:

  1. Log in to the master node of your Splunk UBA deployment as the caspida user using SSH.
  2. Navigate to the /opt/caspida/bin/utils folder:
    cd /opt/caspida/bin/utils
  3. Run the restore script. In this example, we are restoring from a backup stored in the /home/caspida directory, migrating from a single-node system to a 3-node deployment. Below is the command and its output:
    [caspida@ubanode1]$ /opt/caspida/bin/utils/uba-restore.sh --folder /home/caspida/2020-01-08_12-11-11/
    UBA Restore Script - Version 1.9.2
    Backup started at: Wed Jan  8 12:26:57 PST 2020
    Backup running on: ubanode1.example.domain
    Logfile:           /var/log/caspida/uba-restore-2020-01-08_12-26-57.log
    Script Name:       uba-restore.sh
    Script SHA:        4819a5d2ed713a5a040dfeb4dd30fed0a42406f2238d002ade7d293c3460285f
    Parsing any CLI args
    - Set source folder to /home/caspida/2020-01-08_12-11-11
    Node Count: 3
    Backup Node Count: 1
    Detected migration from 1-node to 3-node
    WARNING: The hostnames from the backup/restore hosts differ, this will be a migration
    Execution Mode: Migration
    Testing SSH connectivity to UBA node 1 (ubanode1)
    Testing SSH connectivity to UBA node 2 (ubanode2)
    Testing SSH connectivity to UBA node 3 (ubanode3)
    Attempting to resolve the IP of UBA node ubanode1
      UBA node ubanode1 resolves to 192.168.19.88
    Attempting to resolve the IP of UBA node ubanode2
      UBA node ubanode2 resolves to 192.168.19.89
    Attempting to resolve the IP of UBA node ubanode3
      UBA node ubanode3 resolves to 192.168.19.90
    Attempting to retrieve the IP of each node (old)
    Stopping UBA (full)
    Starting PostgreSQL
    Logging PostgreSQL sparkserverregistry table (pre-restore)
    Restoring PostgreSQL caspidadb database on UBA node 1 (ubanode1)
    Restoring PostgreSQL metastore database on UBA node 1 (ubanode1)
    Performing reset of connector stats/JMS schemas
    Stopping PostgreSQL
    Restoring timeseries data
    Backing up existing uba-system-env.sh/uba-tuning.properties
    Restoring local configurations
    Restoring UBA rules
    Restoring uba-system-env.sh/uba-tuning.properties
    Syncing UBA cluster
    Starting UBA (partial)
    Checking that HDFS isnt in safe-mode
    - Safe mode is disabled
    Checking if the /user folder exists in HDFS
    - Folder exists, will attempt to remove
    Removing existing Hadoop HDFS content (attempt 1)
    Waiting for 10 seconds before proceeding
    Checking if the /user folder exists in HDFS
    - Folder has been removed, will continue
    Restoring Hadoop HDFS (this may take a while)
      - Checking status of PID 5754 (2020-01-08_12-32-47)
        - Restore is still running, please wait
        - Folder size: 242.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-33-09)
        - Restore is still running, please wait
        - Folder size: 254.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-33-31)
        - Restore is still running, please wait
        - Folder size: 543.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-33-53)
        - Restore is still running, please wait
        - Folder size: 543.6 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-34-15)
        - Restore is still running, please wait
        - Folder size: 546.4 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-34-37)
        - Restore is still running, please wait
        - Folder size: 547.1 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-34-59)
        - Restore is still running, please wait
        - Folder size: 548.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-35-21)
        - Restore is still running, please wait
        - Folder size: 549.6 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-35-43)
        - Restore is still running, please wait
        - Folder size: 550.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-36-06)
        - Restore is still running, please wait
        - Folder size: 550.8 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-36-28)
        - Restore is still running, please wait
        - Folder size: 551.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-36-50)
        - Restore is still running, please wait
        - Folder size: 553.0 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-37-12)
        - Restore is still running, please wait
        - Folder size: 554.1 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-37-34)
        - Restore is still running, please wait
        - Folder size: 554.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-37-56)
        - Restore is still running, please wait
        - Folder size: 555.9 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-38-18)
        - Restore is still running, please wait
        - Folder size: 556.6 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-38-40)
        - Restore is still running, please wait
        - Folder size: 557.6 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-39-02)
        - Restore is still running, please wait
        - Folder size: 558.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-39-24)
        - Restore is still running, please wait
        - Folder size: 561.9 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-39-46)
        - Restore is still running, please wait
        - Folder size: 562.8 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-40-08)
        - Restore is still running, please wait
        - Folder size: 563.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-40-30)
        - Restore is still running, please wait
        - Folder size: 564.3 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-40-52)
        - Restore is still running, please wait
        - Folder size: 564.9 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-41-14)
        - Restore is still running, please wait
        - Folder size: 566.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-41-36)
        - Restore is still running, please wait
        - Folder size: 566.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-41-58)
        - Restore is still running, please wait
        - Folder size: 567.3 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-42-20)
        - Restore is still running, please wait
        - Folder size: 568.1 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-42-42)
        - Restore is still running, please wait
        - Folder size: 568.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-43-04)
        - Restore is still running, please wait
        - Folder size: 570.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-43-26)
        - Restore is still running, please wait
        - Folder size: 570.7 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-43-48)
        - Restore is still running, please wait
        - Folder size: 571.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-44-10)
        - Restore is still running, please wait
        - Folder size: 571.9 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-44-32)
        - Restore is still running, please wait
        - Folder size: 572.6 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-44-54)
        - Restore is still running, please wait
        - Folder size: 573.2 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-45-16)
        - Restore is still running, please wait
        - Folder size: 574.1 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-45-38)
        - Restore is still running, please wait
        - Folder size: 589.0 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-46-00)
        - Restore is still running, please wait
        - Folder size: 601.5 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-46-23)
        - Restore is still running, please wait
        - Folder size: 610.3 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-46-45)
        - Restore is still running, please wait
        - Folder size: 646.0 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-47-07)
        - Restore is still running, please wait
        - Folder size: 647.0 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-47-29)
        - Restore is still running, please wait
        - Folder size: 647.5 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-47-51)
        - Restore is still running, please wait
        - Folder size: 665.0 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-48-13)
        - Restore is still running, please wait
        - Folder size: 669.1 M (target: 669.3 M)
      - Checking status of PID 5754 (2020-01-08_12-48-35)
      - Backup job has finished
    Changing ownership of Hadoop HDFS files
    Updating location of hive warehouse
    Updating location of caspida analytics database
    Flushing existing Redis data
    Stopping Redis for Redis restore
    Restoring Redis database (parallel mode)
    Triggering Redis restore of node 1 to 3
      - Performing restore of data from UBA node 1 to UBA node 3 (ubanode3)
      - Performing rsync of database from UBA node 1 to UBA node 3
      - Waiting for pid 12992 to finish
        - Process finished successfully
    Starting Redis after Redis restore
    Retrieving Redis information (pre-fixup)
    Performing Redis restore fixup (attempt 1)
    Performing Redis restore rebalance (attempt 1)
    Successfully finished Redis fixup/rebalance
    Retrieving Redis information (post-fixup)
    Determining number of Redis keys (post-restore)
    - Retrieving keys from ubanode3 (ubanode3)
    Redis key counts match (3207 vs 3207)
    Comparing Redis keys
    All Redis keys were found
    Tuning configuration
    Stopping UBA (partial)
    Syncing UBA cluster
    Copying /opt/caspida/conf to hdfs /user/caspida/config/etc/caspida/conf
    Configuring containerization
    Starting UBA (full)
    Testing Impala
    Logging PostgreSQL sparkserverregistry table (post-restore)
    Comparing PostgreSQL backup/restore counts/stats
    Migrating datasources
      - Deleting file-based datasource: 0_resolution-rainbow.infoblox
      - Deleting file-based datasource: HR-rainbow.csv
      - Migrating datasource: fileaccess
      - Deleting file-based datasource: rainbow.ad_multiline
      - Deleting file-based datasource: rainbow.ad_snare_flat
      - Deleting file-based datasource: rainbow.box
      - Deleting file-based datasource: rainbow.box_events
      - Deleting file-based datasource: rainbow.brivo
      - Deleting file-based datasource: rainbow.cef
      - Deleting file-based datasource: rainbow.ciscosa
      - Deleting file-based datasource: rainbow.o365_msg_trace
      - Deleting file-based datasource: rainbow.pan
      - Deleting file-based datasource: rainbow.splunk_cs
      - Deleting file-based datasource: rainbow.symantecdlp_dmp
      - Deleting file-based datasource: rainbow.symantecdlp_endpoint
      - Deleting file-based datasource: rainbow.webgateway
      - Deleting file-based datasource: rainbow.weblog
    Restore completed successfully
    Time taken: 0 hour(s), 29 minute(s), 19 second(s)
    
    You can review the log file in /var/log/caspida/uba-restore-<timestamp>.log.
  4. If you are integrating Splunk UBA with Splunk Enterprise Security (ES), install the Splunk ES SSL certificates in the restored deployment. See Configure the Splunk platform to receive data from the Splunk UBA output connector in the Send and Receive Data from the Splunk Platform manual.

Verify Splunk UBA is up and running

See Verify successful installation in Install and Upgrade Splunk User Behavior Analytics for information about how to verify that Splunk UBA is up and running properly.

You can also run the uba_pre_check.sh script as part of this verification. See Check system status before and after installation in Install and Upgrade Splunk User Behavior Analytics.
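As an additional spot check, you can review the overall service state from the command line. This is a sketch that assumes the standard Caspida control script in /opt/caspida/bin; refer to the linked topics for the complete verification procedures:

    # Show the status of the Splunk UBA services across the cluster.
    /opt/caspida/bin/Caspida status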
