Backup and restore Splunk UBA using automated incremental backups
Attach an additional disk to the Splunk UBA management node in your deployment and configure automated incremental backups.
- Periodic incremental backups are performed without stopping Splunk UBA. You can configure the frequency of these backups by configuring the cron job in /opt/caspida/conf/jobconf/caspida-jobs.json.
- A weekly full backup is performed without stopping Splunk UBA. You can configure the frequency of these backups using the backup.filesystem.full.interval property.
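For orientation, job schedules in this file use the systemJobs structure with Quartz-style cron expressions, as shown in the samples later on this page. A minimal sketch, assuming you want the ReplicationCoordinator job to run hourly (the expression shown is illustrative, not a recommended value):
{
  "systemJobs": [
    {
      // Illustrative schedule: run the replication (backup) job at the top of every hour
      "name"     : "ReplicationCoordinator",
      "cronExpr" : "0 0 0/1 * * ?",
      "enabled"  : true
    }
  ]
}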
You can use incremental backup and restore as an HA/DR solution that is less resource-intensive than the warm standby solution described in Configure warm standby in Splunk UBA. You can use the backups to restore Splunk UBA on the existing server, or to a new and separate server.
Configure incremental backups in Splunk UBA
Perform the following steps to configure incremental backups of your Splunk UBA deployment:
- On the Splunk UBA master node, attach an additional disk dedicated to filesystem backups. For example, mount a device on a local directory on the Splunk UBA management node.
In 20-node clusters, Postgres services run on node 2 instead of node 1. You need two additional disks, one on node 1 and a second on node 2 for the Postgres services, or a shared storage device that can be accessed by both nodes. Make sure that the backup folder is the same on both nodes. By default, the backups are written to the /backup folder.
- Stop Splunk UBA:
/opt/caspida/bin/Caspida stop
- Create a dedicated directory on the management node and change directory permissions so that backup files can be written into the directory. If warm standby is also configured, perform these tasks on the management node in the primary cluster.
sudo mkdir /backup
sudo chmod 777 /backup
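To confirm that the directory exists with the expected permissions, you can run a quick check (the output shown is illustrative):
caspida@node1:~$ ls -ld /backup
drwxrwxrwx 2 root root 4096 Jun 15 14:00 /backup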
- Mount the dedicated device on the backup directory. For example, a new 5TB hard drive mounted on the backup directory:
caspida@node1:~$ df -h /dev/sdc
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc        5.0T    1G  4.9T   1% /backup
If the backup device is on the local disk, mount the disk using its UUID, which can be found in /etc/fstab. See Prepare the server for installation in Install and Upgrade Splunk User Behavior Analytics.
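As a sketch of mounting a local disk by UUID, the device name, UUID, and filesystem type below are placeholders; substitute the values for your system:
# Find the UUID of the backup device (device name is a placeholder)
sudo blkid /dev/sdc
# Example /etc/fstab entry that mounts the device on /backup by UUID
UUID=1c24a345-7b4c-4e58-9a8b-2d01f2a5e6c7  /backup  ext4  defaults  0  2
# Mount everything listed in /etc/fstab, including the new entry
sudo mount -a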
- Add the following properties to /etc/caspida/local/conf/uba-site.properties:
backup.filesystem.enabled=true
backup.filesystem.directory.restore=/backup
- Synchronize the configuration across the cluster:
/opt/caspida/bin/Caspida sync-cluster
- Register filesystem backup:
/opt/caspida/bin/replication/setup filesystem
If the same host has been registered before, run the command again with the reset flag:
/opt/caspida/bin/replication/setup filesystem -r
- Enable Postgres archiving.
- Create the directory where archives will be stored. For example, /backup/wal_archive:
sudo mkdir /backup/wal_archive
sudo chown postgres:postgres /backup/wal_archive
- Create a file called archiving.conf on the PostgreSQL node (node 2 for 20-node deployments, node 1 for all other deployments). On RHEL, Oracle Linux, and CentOS systems:
sudo su
cd /var/vcap/store/pgsql/10/data/conf.d/
sudo mv archiving.conf.sample archiving.conf
sudo su - caspida
On Ubuntu systems:
cd /etc/postgresql/10/main/conf.d/
sudo mv archiving.conf.sample archiving.conf
If your archive directory is not /backup/wal_archive, edit archiving.conf to change the archive directory.
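The settings in archiving.conf come from the sample file shipped with Splunk UBA. As a rough sketch only, standard PostgreSQL WAL archiving directives pointing at /backup/wal_archive look like the following; defer to the contents of your archiving.conf.sample:
# Illustrative PostgreSQL WAL archiving settings
archive_mode = on
archive_command = 'cp %p /backup/wal_archive/%f'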
- Restart PostgreSQL services on the master node:
/opt/caspida/bin/Caspida stop-postgres
/opt/caspida/bin/Caspida start-postgres
- On the master node:
- In the primary cluster, enable the replication system job by adding the ReplicationCoordinator property to the /etc/caspida/local/conf/caspida-jobs.json file. Below is a sample of the file before adding the property:
/**
 * Copyright 2014 - Splunk Inc., All rights reserved.
 * This is Caspida proprietary and confidential material and its use
 * is subject to license terms.
 */
{
  "systemJobs": [
    {
      // "name"         : "ThreatComputation",
      // "cronExpr"     : "0 0 0/1 * * ?",
      // "jobArguments" : { "env:CASPIDA_JVM_OPTS" : "-Xmx4096M" }
    }
  ]
}
After adding the property, the file should look like this:
/**
 * Copyright 2014 - Splunk Inc., All rights reserved.
 * This is Caspida proprietary and confidential material and its use
 * is subject to license terms.
 */
{
  "systemJobs": [
    {
      // "name"         : "ThreatComputation",
      // "cronExpr"     : "0 0 0/1 * * ?",
      // "jobArguments" : { "env:CASPIDA_JVM_OPTS" : "-Xmx4096M" }
    },
    {
      "name"    : "ReplicationCoordinator",
      "enabled" : true
    }
  ]
}
- Run the following command to synchronize the cluster:
/opt/caspida/bin/Caspida sync-cluster
- Start Splunk UBA:
/opt/caspida/bin/Caspida start
How Splunk UBA generates and stores the automated full and incremental backup files
An initial full backup is triggered automatically when the next scheduled job starts, as defined by the ReplicationCoordinator property in the /opt/caspida/conf/jobconf/caspida-jobs.json file. After the initial full backup, a series of incremental backups is performed until the next scheduled full backup. By default, Splunk UBA performs a full backup every 7 days. To change this interval, perform the following tasks:
- Log in to the Splunk UBA master node as the caspida user.
- Edit the backup.filesystem.full.interval property in /etc/caspida/local/conf/uba-site.properties.
- Synchronize the cluster:
/opt/caspida/bin/Caspida sync-cluster /etc/caspida/local/conf
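For example, to perform a full backup every 14 days instead of every 7 (assuming the same Nd interval syntax shown in the examples later on this page), set the property as follows before synchronizing:
backup.filesystem.full.interval = 14d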
You can identify the base directories containing the full backups and the incremental backup directories by the first digit of the directory name.
- A base directory has a sequence number starting with 1.
- An incremental directory has a sequence number starting with 0.
In the following example, the base directory 1000123 contains a full backup taking up 35GB of space, while the incremental directories 0000124, 0000125, and 0000126 each contain around 1.5GB of backup files.
caspida@node1:~$ du -sh /backup/caspida/*
1.5G    /backup/caspida/0000124
1.5G    /backup/caspida/0000125
1.4G    /backup/caspida/0000126
35G     /backup/caspida/1000123
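Because of this naming scheme, you can list full and incremental backups separately under the default /backup/caspida directory:
# Base (full) backup directories: sequence numbers start with 1
ls -d /backup/caspida/1*
# Incremental backup directories: sequence numbers start with 0
ls -d /backup/caspida/0*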
The following restore scenarios are supported, using this example:
- From a base directory with all incremental directories. In our example, this includes all of 1000123, 0000124, 0000125, and 0000126, so Splunk UBA is restored to the latest checkpoint. See Restore Splunk UBA from incremental backups for instructions.
- From a base directory with some of the incremental directories in a contiguous sequence. In our example, we can use 1000123, 0000124, and 0000125. The 1000123 and 0000125 directories cannot be used without 0000124, because that would skip a sequence number. See Restore Splunk UBA from incremental backups for instructions.
- From a base directory only, such as 1000123 in our example. See Restore Splunk UBA from a base directory without incremental backups for instructions.
Generate a full backup on-demand without waiting for the next scheduled job
Perform the following tasks to generate a full backup without waiting for the next scheduled job to do it for you.
- Make sure you have set up your Splunk UBA deployment for automated incremental backups.
- On the master node, edit the /etc/caspida/local/conf/uba-site.properties file and set the backup.filesystem.full.interval property to 0 days. For example:
backup.filesystem.full.interval = 0d
- Synchronize the configuration change across the cluster:
/opt/caspida/bin/Caspida sync-cluster /etc/caspida/local/conf
- Use the following curl command to trigger a new cycle:
curl -X POST -k -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)" https://localhost:9002/jobs/trigger?name=ReplicationCoordinator
- Check the /var/log/caspida/replication/replication.log file to make sure the full backup is starting:
2020-06-15 14:01:56,120 INFO MainProcess.MainThread coordinator.prepCycle.209: Target cycle is: 0000154
2020-06-15 14:02:03,422 INFO MainProcess.MainThread coordinator.isFullBackup.308: Need to perform full backup. Last cycle: 2020-06-11 16:20:10; Interval: 0:00:00
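To watch for these messages as they are written, you can follow the log file, for example:
# Follow the replication log and surface full-backup related messages
tail -f /var/log/caspida/replication/replication.log | grep --line-buffered -i "full backup"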
- (Recommended) Restore the backup.filesystem.full.interval property to its default value of 7 days. You can set the property as follows and synchronize the cluster, or delete the property altogether from the /etc/caspida/local/conf/uba-site.properties file and synchronize the cluster:
backup.filesystem.full.interval = 7d
Restore Splunk UBA
You can restore Splunk UBA from a full backup or from incremental backups. For a successful restore from incremental backups, at least one base directory containing a full backup must exist.
For an example of restoring from a full backup, see Restore Splunk UBA from a full backup.
For an example of restoring from incremental backups, see Restore Splunk UBA from incremental backups.
This documentation applies to the following versions of Splunk® User Behavior Analytics: 5.0.5, 5.0.5.1