Splunk® User Behavior Analytics

Administer Splunk User Behavior Analytics


Backup and restore Splunk UBA using automated incremental backups

Attach an additional disk to the Splunk UBA management node in your deployment and configure automated incremental backups.

  • Periodic incremental backups are performed without stopping Splunk UBA. You can configure the frequency of these backups by configuring the cron job in /opt/caspida/conf/jobconf/caspida-jobs.json.
  • A weekly full backup is performed without stopping Splunk UBA. You can configure the frequency of these backups using the backup.filesystem.full.interval property.
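
For example, the full-backup cadence is controlled by a single property (a sketch; 7d is the documented default):

```properties
# In /etc/caspida/local/conf/uba-site.properties:
backup.filesystem.full.interval = 7d
```

The incremental backup schedule itself comes from the ReplicationCoordinator job entry in /opt/caspida/conf/jobconf/caspida-jobs.json, as described in the configuration steps below.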

You can use incremental backup and restore as an HA/DR solution that is less resource-intensive than the warm standby solution described in Configure warm standby in Splunk UBA. You can use the backups to restore Splunk UBA on the existing server, or to a new and separate server.

Configure incremental backups in Splunk UBA

Perform the following steps to configure incremental backups of your Splunk UBA deployment:

  1. On the Splunk UBA master node, attach an additional disk dedicated for filesystem backup. For example, mount a device on a local directory on the Splunk UBA management node.
    In 20-node clusters, Postgres services run on node 2 instead of node 1. You will need two additional disks - one on node 1 and a second on node 2 for the Postgres services, or you may have a shared storage device that can be accessed by both nodes. Make sure that the backup folder on both nodes is the same. By default, the backups are written to the /backup folder.
  2. Stop Splunk UBA.
    /opt/caspida/bin/Caspida stop
  3. Create a dedicated directory on the management node and change directory permissions so that backup files can be written into the directory. If warm standby is also configured, perform these tasks on the management node in the primary cluster.
    sudo mkdir /backup
    sudo chmod 777 /backup
    
  4. Mount the dedicated device on the backup directory. For example, a new 5TB hard drive mounted on the backup directory:
    caspida@node1:~$ df -h /dev/sdc
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdc        5.0T  1G    4.9T   1% /backup
    
    If the backup device is on the local disk, mount the disk using its UUID, which can be found in /etc/fstab. See Prepare the server for installation in Install and Upgrade Splunk User Behavior Analytics.
  5. Add the following properties into /etc/caspida/local/conf/uba-site.properties:
    backup.filesystem.enabled=true
    backup.filesystem.directory.restore=/backup
    
  6. Synchronize the configuration across the cluster:
    /opt/caspida/bin/Caspida sync-cluster
  7. Register filesystem backup:
    /opt/caspida/bin/replication/setup filesystem

    If the same host has been registered before, run the command again with the reset flag:

    /opt/caspida/bin/replication/setup filesystem -r
  8. Enable Postgres archiving.
    1. Create the directory where archives will be stored. For example, /backup/wal_archive:
      sudo mkdir /backup/wal_archive
      sudo chown postgres:postgres /backup/wal_archive
      
    2. Create a file called archiving.conf on the PostgreSQL node (node 2 for 20-node deployments, node 1 for all other deployments). On RHEL, Oracle Linux, and CentOS systems:
      cd /var/vcap/store/pgsql/10/data/conf.d/
      sudo mv archiving.conf.sample archiving.conf
      

      On Ubuntu systems:

      cd /etc/postgresql/10/main/conf.d/
      sudo mv archiving.conf.sample archiving.conf
      
      If your archive directory is not /backup/wal_archive, edit archiving.conf to change the archive directory.
    3. Restart PostgreSQL services on the master node:
      /opt/caspida/bin/Caspida stop-postgres
      /opt/caspida/bin/Caspida start-postgres
      
  9. On the master node:
    1. In the primary cluster, enable the replication system job by adding the ReplicationCoordinator property into the /etc/caspida/local/conf/caspida-jobs.json file. Below is a sample of the file before adding the property:
      /**
       * Copyright 2014 - Splunk Inc., All rights reserved.
       * This is Caspida proprietary and confidential material and its use
       * is subject to license terms.
       */
      {
        "systemJobs": [
          {
            // "name" : "ThreatComputation",
            // "cronExpr"   : "0 0 0/1 * * ?",
            // "jobArguments" : { "env:CASPIDA_JVM_OPTS" :  "-Xmx4096M" }
          }
        ]
      } 
      

      After adding the property, the file should look like this:

      /**
       * Copyright 2014 - Splunk Inc., All rights reserved.
       * This is Caspida proprietary and confidential material and its use
       * is subject to license terms.
       */
      {
        "systemJobs": [
          {
            // "name" : "ThreatComputation",
            // "cronExpr"   : "0 0 0/1 * * ?",
            // "jobArguments" : { "env:CASPIDA_JVM_OPTS" :  "-Xmx4096M" }
          },
          {
            "name"         : "ReplicationCoordinator",
            "enabled"      : true
          }
        ]
      } 
      
    2. Run the following command to synchronize the cluster:
      /opt/caspida/bin/Caspida sync-cluster
  10. Start Splunk UBA:
    /opt/caspida/bin/Caspida start
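
As a quick sanity check after synchronizing the cluster, you can confirm that the two properties from step 5 are present on each node. The snippet below runs the same check against a mock copy of the file; the real path is /etc/caspida/local/conf/uba-site.properties:

```shell
# Mock copy of uba-site.properties with the two properties from step 5
# (illustrative; the real file lives in /etc/caspida/local/conf/):
cat > /tmp/uba-site.properties <<'EOF'
backup.filesystem.enabled=true
backup.filesystem.directory.restore=/backup
EOF
# Count the backup.filesystem.* properties; expect 2:
count=$(grep -c '^backup\.filesystem\.' /tmp/uba-site.properties)
echo "$count"    # 2
```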

How Splunk UBA generates and stores the automated full and incremental backup files

An initial full backup is triggered automatically when the next scheduled job starts, as defined by the ReplicationCoordinator property in the /opt/caspida/conf/jobconf/caspida-jobs.json file. After the initial full backup, a series of incremental backups is performed until the next scheduled full backup. By default, Splunk UBA performs a full backup every 7 days. To change this interval, perform the following tasks:

  1. Log in to the Splunk UBA master node as the caspida user.
  2. Edit the backup.filesystem.full.interval property in /etc/caspida/local/conf/uba-site.properties.
  3. Synchronize the cluster.
    /opt/caspida/bin/Caspida sync-cluster  /etc/caspida/local/conf
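
For example, to switch to a 14-day full-backup cycle (the 14d value is illustrative):

```properties
# /etc/caspida/local/conf/uba-site.properties
backup.filesystem.full.interval = 14d
```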

You can identify the base directories containing the full backups and the incremental backup directories by the first digit of the directory name.

  • A base directory has a sequence number starting with 1.
  • An incremental directory has a sequence number starting with 0.
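
This naming convention can be checked mechanically. A minimal shell sketch, using directory names from the example that follows:

```shell
# Classify a backup directory by the first digit of its name:
classify() {
  case "$1" in
    1*) echo "base" ;;
    0*) echo "incremental" ;;
    *)  echo "unknown" ;;
  esac
}
classify 1000123    # base
classify 0000124    # incremental
```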

In the following example, the base directory 1000123 contains a full backup taking up 35GB of space, while the incremental directories 0000124, 0000125, and 0000126 each hold roughly 1.5GB of backup files.

caspida@node1:~$ du -sh /backup/caspida/*
1.5G    /backup/caspida/0000124
1.5G    /backup/caspida/0000125
1.4G    /backup/caspida/0000126
35G /backup/caspida/1000123

The following sections show the supported restore scenarios, using this example: restoring from the full backup only, or restoring from the full backup together with the incremental backups.

Generate a full backup on-demand without waiting for the next scheduled job

Perform the following tasks to generate a full backup without waiting for the next scheduled job to do it for you.

  1. Make sure you have set up your Splunk UBA deployment for automated incremental backups.
  2. On the master node, edit the /etc/caspida/local/conf/uba-site.properties file and set the backup.filesystem.full.interval property to 0 days. For example:
    backup.filesystem.full.interval = 0d
  3. Synchronize the configuration change across the cluster:
    /opt/caspida/bin/Caspida sync-cluster  /etc/caspida/local/conf
  4. Use the following curl command to trigger a new cycle:
    curl -X POST -k -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)"  https://localhost:9002/jobs/trigger?name=ReplicationCoordinator
  5. Check the /var/log/caspida/replication/replication.log file to make sure the full backup is starting:
    2020-06-15 14:01:56,120 INFO MainProcess.MainThread coordinator.prepCycle.209: Target cycle is: 0000154
    2020-06-15 14:02:03,422 INFO MainProcess.MainThread coordinator.isFullBackup.308: Need to perform full backup. 
    Last cycle: 2020-06-11 16:20:10; Interval: 0:00:00
    
  6. (Recommended) Restore the backup.filesystem.full.interval property back to its default value of 7 days. You can set the property as follows and synchronize the cluster, or delete the property altogether from the /etc/caspida/local/conf/uba-site.properties file and synchronize the cluster:
    backup.filesystem.full.interval = 7d
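
The Authorization header in the curl command from step 4 is built by extracting the job manager token from uba-default.properties. The extraction pipeline can be seen in isolation against a mock file (abc123 is a made-up token; the real file is /opt/caspida/conf/uba-default.properties):

```shell
# Mock stand-in for /opt/caspida/conf/uba-default.properties:
printf 'jobmanager.restServer.auth.user.token=abc123\n' > /tmp/uba-default-mock.properties
# Same grep/cut pipeline the curl command uses to build the bearer token:
TOKEN=$(grep '^\s*jobmanager.restServer.auth.user.token=' /tmp/uba-default-mock.properties | cut -d'=' -f2)
echo "$TOKEN"    # abc123
```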

Restore Splunk UBA from a full backup

This example shows how to restore from a full backup, using the base directory 1000123 without any accompanying incremental directories.

  1. Prepare the server for the restore operation. If there is any existing data, run:
    /opt/caspida/bin/CaspidaCleanup
  2. Stop all services:
    /opt/caspida/bin/Caspida stop-all
  3. Restore Postgres.
    1. On the Postgres node (node 2 in 20-node deployments, node 1 in all other deployments), clean any existing data. On RHEL, OEL, or CentOS systems, run the following command:
      sudo rm -rf /var/lib/pgsql/10/main/*

      On Ubuntu systems, run the following command:

      sudo rm -rf /var/lib/postgresql/10/main/*
    2. Copy all content under <base directory>/postgres/base to the Postgres node. For example, if you are copying from a different server, use the following command on RHEL, OEL, or CentOS systems:
      sudo scp -r caspida@ubap1:<BACKUP_HOME>/1000123/postgres/base/* /var/lib/pgsql/10/main

      On Ubuntu systems, run the following command:

      sudo scp -r caspida@ubap1:<BACKUP_HOME>/1000123/postgres/base/* /var/lib/postgresql/10/main
    3. Edit the /var/lib/pgsql/10/main/recovery.conf (on RHEL, OEL, or CentOS systems) or /var/lib/postgresql/10/main/recovery.conf (on Ubuntu systems) file, clear all content, and add the following property:
      restore_command = ''
    4. Change ownership of the backup files. On RHEL, OEL, or CentOS systems, run the following command:
      sudo chown -R postgres:postgres /var/lib/pgsql/10/main

      On Ubuntu systems, run the following command:

      sudo chown -R postgres:postgres /var/lib/postgresql/10/main
    5. Start the Postgres service by running the following command on the master node:
      /opt/caspida/bin/Caspida start-postgres
      Monitor the Postgres logs in /var/log/postgresql, which show the recovery process.
    6. Verify that Postgres is restored. Check in the /var/lib/pgsql/10/main (on RHEL, OEL, or CentOS systems) or /var/lib/postgresql/10/main (on Ubuntu systems) directory and verify that the recovery.conf file is renamed to recovery.done.
    7. Once the recovery completes, query Postgres to see if the data is recovered. For example, run the following command from the Postgres CLI:
      psql -d caspidadb -c 'SELECT * FROM dbinfo'
  4. Restore Redis. Redis backups are always full backups, even when Splunk UBA performs an incremental backup. You can therefore restore Redis from any backup directory, such as the most recent incremental backup directory. In this example, Redis is restored from the 0000126 incremental backup directory. The Redis backup file name ends with the node number; be sure to restore each file on the corresponding node. For example, in a 5-node cluster, the Redis files must be restored on nodes 4 and 5. Assuming the backup files are on node 1, run the following command on node 4 to restore Redis:
    sudo scp caspida@node1:<BACKUP_HOME>/0000126/redis/redis-server.rdb.4 /var/vcap/store/redis/redis-server.rdb
    

    Similarly, run the following command on node 5:

    sudo scp caspida@node1:<BACKUP_HOME>/0000126/redis/redis-server.rdb.5 /var/vcap/store/redis/redis-server.rdb
    
    View your /etc/caspida/local/conf/caspida-deployment.conf file to see where Redis is running in your deployment.
  5. Restore InfluxDB. Similar to Redis, InfluxDB backups are full backups. You can restore InfluxDB from the most recent backup directory. In this example, InfluxDB is restored from the 0000126 incremental backup directory. On the management node, which hosts InfluxDB, start InfluxDB, clean it up, and restore from backup files:
    sudo service influxdb start
    influx -execute "DROP DATABASE caspida"
    influx -execute "DROP DATABASE ubaMonitor"
    influxd restore -portable <BACKUP_HOME>/0000126/influx
    
  6. Restore HDFS. In this example, restore HDFS from the base backup directory 1000123 only.
    1. Start the necessary services. On the management node, run the following command:
      /opt/caspida/bin/Caspida start-all --no-caspida
    2. Restore HDFS from the base backup directory:
      nohup hadoop fs -copyFromLocal <BACKUP_HOME>/1000123/hdfs/caspida /user &

      Restoring HDFS can take a long time. Check the process ID to see if the restore is complete. For example, if the PID is 111222, check it by using the following command:

      ps 111222
    3. Change owner in HDFS:
      sudo -u hdfs hdfs dfs -chown -R impala:caspida /user/caspida/analytics
      sudo -u hdfs hdfs dfs -chown -R mapred:hadoop /user/history
      sudo -u hdfs hdfs dfs -chown -R impala:impala /user/hive
      sudo -u hdfs hdfs dfs -chown -R yarn:yarn /user/yarn
      
    4. If the server you are restoring to is different from the one where the backup was taken, run the following commands to update the metadata:
      sudo hive --service metatool -updateLocation hdfs://<RESTORE_HOST>:8020 hdfs://<BACKUP_HOST>:8020
      impala-shell -q "INVALIDATE METADATA"
      
      Note that the host is node1, as defined in the deployment file.
  7. Restore your rules and customized configurations from the latest backup directory:
    1. Restore the configurations:
      sudo cp -pr <BACKUP_HOME>/0000126/conf/* /etc/caspida/local/conf/
    2. Restore the rules:
      sudo rm -Rf /opt/caspida/conf/rules/*
      sudo cp -prf <BACKUP_HOME>/0000126/rule/* /opt/caspida/conf/rules/
      
  8. Start the server:
    /opt/caspida/bin/Caspida sync-cluster /etc/caspida/local/conf
    /opt/caspida/bin/CaspidaCleanup container-grouping
    /opt/caspida/bin/Caspida start
    
    Check the Splunk UBA web UI to make sure the server is operational.
  9. If the backup and restore servers are different, perform the following tasks:
    1. Update the data source metadata:
      curl -X PUT -Ssk -v -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)" https://localhost:9002/datasources/moveDS?name=<DS_NAME>
      
      Replace <DS_NAME> with the data source name displayed in Splunk UBA.
    2. Trigger a one-time sync with Splunk ES. If your Splunk ES host did not change, run the following command:
      curl -X POST 'https://localhost:9002/jobs/trigger?name=EntityScoreUpdateExecutor' -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)" -H 'Content-Type: application/json' -d '{"schedule": false}' -k
      
      If you are pointing to a different Splunk ES host, edit the host in Splunk UBA to automatically trigger a one-time sync.
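
Several steps above background a long-running copy with nohup and suggest checking its PID. A generic way to block until such a process exits, shown as a sketch in which `sleep 2` stands in for the real hadoop command:

```shell
# Start a stand-in long-running job in the background:
sleep 2 &
RESTORE_PID=$!
# Poll until the process is gone; kill -0 only tests for existence,
# it does not send a signal:
while kill -0 "$RESTORE_PID" 2>/dev/null; do
  sleep 1
done
echo "restore finished"
```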

Restore Splunk UBA from incremental backups

To restore Splunk UBA from online incremental backup files, at least one base backup directory containing a full backup must exist.

This example shows how to restore from a base directory 1000123 with all of the incremental directories 0000124, 0000125, and 0000126.
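
The ordering requirement in this example (the base backup first, then the incrementals in ascending sequence) follows directly from the directory-name convention, and can be sketched as:

```shell
# Unordered backup directory names from this example:
dirs="0000125 1000123 0000126 0000124"
# Base directories start with 1, incrementals with 0; sort each group:
base=$(printf '%s\n' $dirs | grep '^1' | sort)
incrementals=$(printf '%s\n' $dirs | grep '^0' | sort)
echo $base $incrementals    # 1000123 0000124 0000125 0000126
```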

  1. Prepare the server for the restore operation. If there is any existing data, run:
    /opt/caspida/bin/CaspidaCleanup
  2. Stop all services:
    /opt/caspida/bin/Caspida stop-all
  3. Restore Postgres.
    1. On the Postgres node (node 2 in 20-node deployments, node 1 in all other deployments), clean any existing data. On RHEL, OEL, or CentOS systems, use the following command:
      sudo rm -rf /var/lib/pgsql/10/main/*

      On Ubuntu systems, use the following command:

      sudo rm -rf /var/lib/postgresql/10/main/*
    2. Copy all content under <base directory>/postgres/base to the Postgres node. For example, if you are copying from a different server, use the following command on RHEL, OEL, or CentOS systems:
      sudo scp -r caspida@ubap1:<BACKUP_HOME>/1000123/postgres/base/* /var/lib/pgsql/10/main

      On Ubuntu systems, use the following command:

      sudo scp -r caspida@ubap1:<BACKUP_HOME>/1000123/postgres/base/* /var/lib/postgresql/10/main
    3. Remove unnecessary WAL files. On RHEL, OEL, or CentOS systems, use the following command:
      sudo rm -rf /var/lib/pgsql/10/main/pg_wal/*

      On Ubuntu systems, use the following command:

      sudo rm -rf /var/lib/postgresql/10/main/pg_wal/*

      Make sure the system has access to the Postgres WAL archive directory. Modify the /var/lib/pgsql/10/main/recovery.conf (on RHEL, OEL, or CentOS systems) or /var/lib/postgresql/10/main/recovery.conf (on Ubuntu systems) file. Remove all contents in the file, and add the following properties:

      restore_command = 'cp <WAL directory>/%f "%p"'
      recovery_target_time = '<recovery timestamp>'
      recovery_target_action = 'promote'
      

      Where <WAL directory> is the directory containing all Postgres WAL files, and <recovery timestamp> is the timestamp in the backup file <BACKUP_HOME>/0000126/postgres/recovery_target_time.
      For example, the recovery.conf file looks like this:

      restore_command = 'cp /backup/wal_archive/%f "%p"'
      recovery_target_time = '2019-09-16 12:36:03'
      recovery_target_action = 'promote'
      
    4. Change ownership of the backup files. On RHEL, OEL, or CentOS systems, use the following command:
      sudo chown -R postgres:postgres /var/lib/pgsql/10/main

      On Ubuntu systems, use the following command:

      sudo chown -R postgres:postgres /var/lib/postgresql/10/main
    5. Start the Postgres services. Run the following command on the master node:
      /opt/caspida/bin/Caspida start-postgres
      Monitor the Postgres logs in /var/log/postgresql, which show the recovery process.
    6. Verify that Postgres is restored. Check in the /var/lib/pgsql/10/main (on RHEL, OEL, CentOS systems) or /var/lib/postgresql/10/main (on Ubuntu systems) directory and verify that the recovery.conf file is renamed to recovery.done.
    7. Once the recovery completes, query Postgres to see if data is recovered. For example, run the following command from the Postgres CLI:
      psql -d caspidadb -c 'SELECT * FROM dbinfo'
  4. Restore Redis. Redis backups are always full backups, even when Splunk UBA performs an incremental backup. You can therefore restore Redis from any backup directory, such as the most recent incremental backup directory. In this example, Redis is restored from the 0000126 incremental backup directory. The Redis backup file name ends with the node number; be sure to restore each file on the corresponding node. For example, in a 5-node cluster, the Redis files must be restored on nodes 4 and 5. Assuming the backup files are on node 1, run the following command on node 4 to restore Redis:
    sudo scp caspida@node1:<BACKUP_HOME>/0000126/redis/redis-server.rdb.4 /var/vcap/store/redis/redis-server.rdb
    

    Similarly, run the following command on node 5:

    sudo scp caspida@node1:<BACKUP_HOME>/0000126/redis/redis-server.rdb.5 /var/vcap/store/redis/redis-server.rdb
    
    View your /etc/caspida/local/conf/caspida-deployment.conf file to see where Redis is running in your deployment.
  5. Restore InfluxDB. Similar to Redis, InfluxDB backups are full backups. You can restore InfluxDB from the most recent backup directory. In this example, InfluxDB is restored from the 0000126 incremental backup directory. On the management node, which hosts InfluxDB, start InfluxDB, clean it up, and restore from backup files:
    sudo service influxdb start
    influx -execute "DROP DATABASE caspida"
    influx -execute "DROP DATABASE ubaMonitor"
    influxd restore -portable <BACKUP_HOME>/0000126/influx
    
  6. Restore HDFS. To restore HDFS, restore the base backup first, then the incremental backups in continuous sequence. In this example, restore from 1000123 first, then 0000124, 0000125, and 0000126.
    1. Start the necessary services. On the management node, run the following command:
      /opt/caspida/bin/Caspida start-all --no-caspida
    2. Restore HDFS from the base backup directory and also restore the incremental backup directories:
      nohup bash -c 'export BACKUPHOME=/backup; hadoop fs -copyFromLocal `ls ${BACKUPHOME}/caspida/1*/hdfs/caspida -d` && for dir in `ls ${BACKUPHOME}/caspida/0*/hdfs/caspida -d`; do hadoop fs -copyFromLocal -f $dir || exit 1;  done; echo Done' & 
      

      If you configured a different directory for your backups, replace /backup in the BACKUPHOME variable accordingly. Restoring HDFS can take a long time. Check the process ID to see if the restore is complete. For example, if the PID is 111222, check it by using the following command:

      ps 111222
      You can also check the nohup.out file and look for "Done" at the end of the file.
    3. Change owner in HDFS:
      sudo -u hdfs hdfs dfs -chown -R impala:caspida /user/caspida/analytics
      sudo -u hdfs hdfs dfs -chown -R mapred:hadoop /user/history
      sudo -u hdfs hdfs dfs -chown -R impala:impala /user/hive
      sudo -u hdfs hdfs dfs -chown -R yarn:yarn /user/yarn
      
    4. If the server you are restoring to is different from the one where the backup was taken, run the following commands to update the metadata:
      hive --service metatool -updateLocation hdfs://<RESTORE_HOST>:8020 hdfs://<BACKUP_HOST>:8020
      impala-shell -q "INVALIDATE METADATA"
      
      Note that the host is node1, as defined in the deployment file.
  7. Restore your rules and customized configurations from the latest backup directory:
    1. Restore the configurations:
      cp -pr <BACKUP_HOME>/0000126/conf/* /etc/caspida/local/conf/
    2. Restore the rules:
      rm -Rf /opt/caspida/conf/rules/*
      cp -prf <BACKUP_HOME>/0000126/rule/* /opt/caspida/conf/rules/
      
  8. Start the server:
    /opt/caspida/bin/Caspida sync-cluster /etc/caspida/local/conf
    /opt/caspida/bin/CaspidaCleanup container-grouping
    /opt/caspida/bin/Caspida start
    
    Check the Splunk UBA web UI to make sure the server is operational.
  9. If the backup and restore servers are different, perform the following tasks:
    1. Update the data source metadata:
      curl -X PUT -Ssk -v -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)" https://localhost:9002/datasources/moveDS?name=<DS_NAME>
      
      Replace <DS_NAME> with the data source name displayed in Splunk UBA.
    2. Trigger a one-time sync with Splunk ES. If your Splunk ES host did not change, run the following command:
      curl -X POST 'https://localhost:9002/jobs/trigger?name=EntityScoreUpdateExecutor' -H "Authorization: Bearer $(grep '^\s*jobmanager.restServer.auth.user.token=' /opt/caspida/conf/uba-default.properties | cut -d'=' -f2)" -H 'Content-Type: application/json' -d '{"schedule": false}' -k
      
      If you are pointing to a different Splunk ES host, edit the host in Splunk UBA to automatically trigger a one-time sync.
Last modified on 08 January, 2022

This documentation applies to the following versions of Splunk® User Behavior Analytics: 5.0.5, 5.0.5.1

