Splunk® User Behavior Analytics

Install and Upgrade Splunk User Behavior Analytics


Plan and scale your Splunk UBA deployment

Install Splunk UBA in a single-server or distributed deployment architecture. A distributed deployment helps you scale your Splunk UBA installation. All servers in your Splunk UBA deployment should meet the System requirements for Splunk UBA.

Scaling your deployment

A distributed Splunk UBA deployment scales horizontally with ETL and processing servers.

Size of cluster   Max events per second   Number of accounts   Number of devices   Maximum data parsers
1                 3K                      < 50K                < 100K              6
3                 10K                     < 50K                < 200K              10
5                 20K                     < 200K               < 300K              12
7                 40K                     < 350K               < 500K              24
10                45K-50K                 350K                 500K                32

Data is load-balanced across the servers to ensure maximum performance. The maximum number of data parsers refers to the number of distributed parsers associated with the data sources.
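For example, if you expect to ingest 15,000 events per second from an environment with 150,000 accounts and 250,000 devices, a 5-server cluster is the smallest size in the table that covers all three dimensions.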

Single-server deployment architecture

Use a single-server deployment to support an ingestion rate of up to 3,000 events per second, assuming analysis of fewer than 50,000 accounts and 100,000 devices.
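At that rate, a single server processes roughly 260 million events per day (3,000 events per second × 86,400 seconds per day).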

Distributed deployment architecture

In a distributed deployment, Splunk UBA assigns each server in the cluster a specific task, and installs specific services on each server to support that task. The services installed on each server vary depending on the size of your cluster.

[Figure: Splunk UBA distributed deployment architecture]

Management server

The management server hosts the web user interface for Splunk UBA.

  • You only need one management server.
  • Typical services installed on this server include: UI server, Job manager master, Neo4j server, InfluxDB server, PostgreSQL, Impala, and Zookeeper Quorum.

Processing server

The processing server handles the data processing tasks for Splunk UBA.

  • The processing server scales horizontally. Add more processing servers to increase throughput of your data.
  • Typical services installed on this server include: HDFS, Apache Storm, Apache Spark, Impala, and Zookeeper Quorum.

ETL server

The ETL server handles the extract, transform, and load tasks for Splunk UBA.

  • The ETL server scales horizontally. Add more ETL servers to increase data ingestion capacity.
  • Typical services installed on this server include: Redis server, Job manager agent, Redis IR server, ETL processes, Apache Kafka, and Zookeeper Quorum.

Services on each server

The services installed on each server vary based on the deployment architecture. The following lists show the services installed on each server in each supported deployment configuration. See Monitor the health of your Splunk UBA deployment for more on each type of service.

1-server deployment

All services run on the single server. There are no separate processing or ETL servers.

3-server deployment

Services on the management server:
  • zookeeper-server
  • postgresql
  • redis-sysmon-server
  • neo4j-service
  • hadoop-hdfs-namenode
  • hive-metastore
  • influxdb
  • impala-server
  • impala-catalog
  • impala-state-store
  • storm-ui
  • storm-nimbus
  • spark-master*
  • spark-history*
  • caspida-jobmanager*
  • caspida-analytics-server*
  • caspida-eventstore*
  • caspida-ui*
  • caspida-outputconnector*
  • caspida-offlinerulexec*
  • caspida-realtimeruleexec*
  • caspida-sysmonitor*
  • caspida-resourcemonitor*

Services on the processing server:
  • zookeeper-server
  • redis-server
  • hadoop-hdfs-datanode
  • storm-supervisor
  • spark-worker*
  • storm-workers*

Services on the ETL server:
  • zookeeper-server
  • hadoop-hdfs-secondarynamenode
  • redis-ir-server
  • kafka-server*
  • caspida-jobagent*

5-server deployment

Services on the management server:
  • zookeeper-server
  • postgresql
  • redis-sysmon-server
  • neo4j-service
  • hadoop-hdfs-namenode
  • hive-metastore
  • influxdb
  • impala-server
  • impala-catalog
  • impala-state-store
  • storm-ui
  • storm-nimbus
  • spark-master*
  • spark-history*
  • caspida-jobmanager*
  • caspida-analytics-server*
  • caspida-eventstore*
  • caspida-ui*
  • caspida-outputconnector*
  • caspida-offlinerulexec*
  • caspida-realtimeruleexec*
  • caspida-sysmonitor*
  • caspida-resourcemonitor*

Services on both processing servers:
  • hadoop-hdfs-datanode
  • storm-supervisor
  • spark-worker*
  • storm-workers*

In addition, one of the processing servers has the redis-server service installed.

Services on one ETL server:
  • zookeeper-server
  • redis-ir-server
  • caspida-jobagent*

Services on the other ETL server:
  • zookeeper-server
  • hadoop-hdfs-secondarynamenode
  • kafka-server*
  • caspida-jobagent*

7-server deployment

Services on the management server:
  • zookeeper-server
  • postgresql
  • redis-sysmon-server
  • neo4j-service
  • hadoop-hdfs-namenode
  • hive-metastore
  • influxdb
  • impala-server
  • impala-catalog
  • impala-state-store
  • storm-ui
  • storm-nimbus
  • caspida-jobmanager*
  • caspida-analytics-server*
  • caspida-eventstore*
  • caspida-ui*
  • caspida-outputconnector*
  • caspida-offlinerulexec*
  • caspida-realtimeruleexec*
  • caspida-sysmonitor*
  • caspida-resourcemonitor*

Services on two of the processing servers (nodes 4 and 6):
  • hadoop-hdfs-datanode
  • storm-supervisor
  • storm-workers*

Services on one processing server (node 5):
  • hadoop-hdfs-datanode
  • redis-server
  • storm-supervisor
  • storm-workers*

Services on one processing server (node 7):
  • hadoop-hdfs-datanode
  • spark-master*
  • spark-worker*
  • spark-history*

Services on one ETL server:
  • zookeeper-server
  • redis-ir-server
  • caspida-jobagent*

Services on the other ETL server:
  • zookeeper-server
  • hadoop-hdfs-secondarynamenode
  • kafka-server*
  • caspida-jobagent*

10-server deployment

Services on the management server:
  • zookeeper-server
  • postgresql
  • redis-sysmon-server
  • neo4j-service
  • hadoop-hdfs-namenode
  • hive-metastore
  • influxdb
  • impala-server
  • impala-catalog
  • impala-state-store
  • storm-ui
  • storm-nimbus
  • caspida-jobmanager*
  • caspida-analytics-server*
  • caspida-eventstore*
  • caspida-ui*
  • caspida-outputconnector*
  • caspida-offlinerulexec*
  • caspida-realtimeruleexec*
  • caspida-resourcemonitor*

Services on one processing server (node 5):
  • redis-server
  • storm-supervisor
  • caspida-sysmonitor*
  • storm-workers*

Services on three processing servers (nodes 6, 7, and 8):
  • hadoop-hdfs-datanode
  • storm-supervisor
  • storm-workers*

Services on one processing server (node 9):
  • hadoop-hdfs-datanode
  • spark-master*
  • spark-worker*
  • spark-history*

Services on one processing server (node 10):
  • hadoop-hdfs-datanode
  • spark-worker*

Services on one ETL server:
  • zookeeper-server
  • redis-ir-server
  • caspida-jobagent*

Services on another ETL server:
  • zookeeper-server
  • hadoop-hdfs-secondarynamenode
  • kafka-server*
  • caspida-jobagent*

Services on the remaining ETL server:
  • caspida-jobagent*

An asterisk (*) indicates services that are started with the caspida start command and stopped with the caspida stop command. To start or stop all services, use caspida start-all or caspida stop-all.
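
For example, here is a minimal sketch of cycling services on a node, assuming the caspida command resolves to the same /opt/caspida/bin/Caspida script used during setup:

    # Stop, then start, only the Caspida services marked with an asterisk above
    /opt/caspida/bin/Caspida stop
    /opt/caspida/bin/Caspida start

    # Stop, then start, all services, including the platform services
    # that are not marked with an asterisk
    /opt/caspida/bin/Caspida stop-all
    /opt/caspida/bin/Caspida start-all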

The node numbers correspond to the Splunk UBA host names in the order you specified them when running the /opt/caspida/bin/Caspida setup command. For example, if you specified ubahost1,ubahost2,ubahost3,ubahost4,ubahost5, then node 1 corresponds to ubahost1, node 2 corresponds to ubahost2, and so on.

You can view the order in which the host names were entered:

  1. Log in to any Splunk UBA node as the caspida user.
  2. Run the following command:
    grep caspida.cluster.nodes /opt/caspida/conf/deployment/caspida-deployment.conf

    Below is a sample output of this command:

    caspida.cluster.nodes=ubahost1,ubahost2,ubahost3,ubahost4,ubahost5
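
If you want each host labeled with its node number, you can split the same property on commas and number the results. This is a minimal sketch using standard shell tools against the configuration file shown above:

    # Print one host per line with its node number; node 1 is the first host entered
    grep caspida.cluster.nodes /opt/caspida/conf/deployment/caspida-deployment.conf \
        | cut -d= -f2 | tr ',' '\n' | nl

    # Sample output:
    #      1  ubahost1
    #      2  ubahost2
    #      3  ubahost3
    #      4  ubahost4
    #      5  ubahost5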