Splunk® Data Stream Processor

Install and administer the Data Stream Processor



On April 3, 2023, Splunk Data Stream Processor will reach its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

Install the Data Stream Processor

To install the Splunk Data Stream Processor, download, extract, and run the installer on each node in your cluster. You must contact your Splunk representative to access the Data Stream Processor download page. The Data Stream Processor is installed from a Gravity package, which builds a Kubernetes cluster onto which DSP is then installed and deployed. See the Gravity documentation for more information.

At a glance, the DSP Installer does the following things:

  • Checks that the system is running on a supported OS.
  • Checks that the system passes pre-installation checks such as meeting the minimum system requirements.
  • Checks that the system is not already running Docker or other conflicting software.
  • Checks that the system has the necessary running services and kernel modules.
  • Installs Docker, Kubernetes, and other software dependencies like SCloud.
  • Prepares Kubernetes to run the Data Stream Processor.
  • Installs the Data Stream Processor.
  • Checks that the Data Stream Processor is ready for use.
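
For reference, a rough manual spot-check of a few of these conditions might look like the following. This is only an illustration and the installer's own checks are authoritative; exact commands vary by distribution.

    # Confirm the OS release that the node is running.
    cat /etc/os-release

    # Verify that Docker is not already installed and running on the node.
    systemctl is-active docker

    # Confirm that commonly required kernel modules are available.
    lsmod | grep -E 'br_netfilter|overlay'

    # Check free disk space under the default Gravity storage location.
    df -h /var/lib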

Extract and run the Data Stream Processor installer

Do the following steps to extract and run the DSP installer.

Prerequisites

  • Your system meets the minimum Hardware and Software requirements for DSP. See Hardware and Software requirements.
  • You do not have FIPS mode enabled on your operating system.
  • You have the required ports open. See Port configuration requirements.
  • If you are installing on RHEL or CentOS, you must temporarily disable SELinux on all of your nodes: setenforce 0
  • You have system administrator (root) permissions. You need administrator (root) permissions so Kubernetes can leverage system components like iptables and kernel modules. If you are not logged in as root, you can use the sudo command.
  • Make sure that your system clocks are synchronized on each node. Consult the system documentation for the particular operating systems on which you are running the Splunk Data Stream Processor. For most environments, Network Time Protocol (NTP) is the best approach. See the example check after this list.
  • Depending on the configuration of your environment, you might need to complete additional prerequisites before installing DSP. Talk to your administrator to see whether any of the steps listed in the additional installation considerations apply to you.
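
For example, on a systemd-based host you can confirm clock synchronization and the SELinux mode with commands like the following. This is a hedged illustration; use whichever NTP tooling (chrony or ntpd) your environment standardizes on.

    # Confirm that the system clock is being synchronized (look for "System clock synchronized: yes").
    timedatectl status

    # If you disabled SELinux with setenforce 0, confirm that the current mode is Permissive.
    getenforce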

Steps

  1. Download the Data Stream Processor installer on each node in your cluster.
  2. On each node in your cluster, extract the Data Stream Processor installer from the tarball.
    tar xf <dsp-version>-linux-amd64.tar

    In order for the DSP installer to complete, you must have at least 3 nodes ready to join the cluster. The DSP installer times out after 5 minutes, and if you do not have these nodes prepared, you may need to start the installation process over again.

  3. On the node that you want to be the master node, navigate to the extracted directory.
    cd <dsp-version>
  4. From the extracted directory, run the DSP installer command. You must run this command with the --flavor=ha flag. The install command also supports the following optional flags:
    ./install [--optional-flags] --flavor=ha 
    
    --accept-license
        Automatically accepts the license agreement printed upon completion.
    --token <token>
        A secure token that prevents rogue nodes from joining the cluster during installation. Your token must be at least six characters long.
    --service-uid <numeric>
        Specifies the Service User ID. For information about how this is used, see Service User in the Gravity documentation. If not specified, a user named planet is created with user ID 1000.
        Note: The ./join command does not support the --service-uid or --service-gid flags. Instead, the worker nodes use whatever values are set on the master node with ./install.
    --service-gid <numeric>
        Specifies the Service Group ID. For information about how this is used, see Service User in the Gravity documentation. If not specified, a group named planet is created.
        Note: The ./join command does not support the --service-uid or --service-gid flags. Instead, the worker nodes use whatever values are set on the master node with ./install.
    --pod-network-cidr <10.244.0.0/16>
        The CIDR range that Kubernetes allocates node subnets and pod IPs from. Must be a minimum of /16 so that Kubernetes can allocate a /24 to each node. If not specified, defaults to 10.244.0.0/16.
    --service-cidr <10.100.0.0/16>
        The CIDR range that Kubernetes allocates service IPs from. If not specified, defaults to 10.100.0.0/16.
    --mount=data:/<mount-path>/gravity/pv --state-dir=/<mount-path>/gravity
        Changes the location where Gravity stores containers and state information. Use these flags instead of the --location flag if you want to use different directories to store mount and state information. By default, Gravity uses /var/lib/gravity for storage. If you do not have enough disk space in /var to support 24 hours of data retention, use these flags to override the default storage path.

    If you use the --mount and --state-dir flags to change the location where Gravity stores containers and state information, you must use the flags both when installing and when joining the nodes.

    • Replace <mount-path> with the path that you'd like Gravity to use for storage. For example, to install everything in /opt/splunk/dsp, you would run: ./install --flavor=ha --mount=data:/opt/splunk/dsp/gravity/pv --state-dir=/opt/splunk/dsp/gravity. See the sketch after these steps for preparing the storage path.
    • /gravity/pv is an example subdirectory to hold the data received from all sources.
    • /gravity is an example subdirectory to hold cluster state information.
  5. Once the initial node has connected to the installer, the installer outputs a join command that you need to run on the other nodes in your cluster. Continue to the next section for the steps.
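
If you plan to override the default storage location with the --mount and --state-dir flags described in step 4, a minimal sketch of preparing the path before running the installer might look like the following. The /opt/splunk/dsp path is only the example path used above; substitute your own mount point.

    # Create the example storage directories on the node (adjust the path for your environment).
    sudo mkdir -p /opt/splunk/dsp/gravity/pv

    # Confirm that the volume backing this path has enough free space for your data retention needs.
    df -h /opt/splunk/dsp

    # Run the installer with the custom storage locations, as documented above.
    sudo ./install --flavor=ha --mount=data:/opt/splunk/dsp/gravity/pv --state-dir=/opt/splunk/dsp/gravity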

Join nodes to the cluster to finish install

You must now join the nodes together to form the cluster.

  1. After some period of time, the installer prints out a command that you must use to join the other nodes to the first master node. Copy the text after gravity join.
    Wed Oct  2 23:59:56 UTC Please execute the following join commands on target nodes:
    Role    Nodes   Command
    ----    -----   -------
    worker  2       gravity join <ip-address-master> --token=<token> --role=worker
    
  2. From the working directory of the other nodes that you want to join the cluster, enter one of the following commands.
    • Join this node to the cluster.
      ./join <ip-address-of-master> --token=<token> --role=worker
    • Join this node to the cluster and change the location where Gravity stores container and state information. By default, Gravity uses /var/lib/gravity to store state information and mounts persistent volumes for containers to /var/data. If you do not have enough disk space in /var to support 24 hours of data retention, then use this command to override the default path used for storage.
      ./join <ip-address-of-master> --token=<token> --role=worker --mount=data:/<mount-path>/gravity/pv --state-dir=/<mount-path>/gravity
  3. When you have a minimum of three nodes in your cluster, the install continues. The installation process may take up to 45 minutes. Keep this terminal window open.
    Fri May 22 14:28:24 UTC	Connecting to installer
    Fri May 22 14:28:28 UTC	Connected to installer
    Fri May 22 14:28:28 UTC	Successfully added "worker" node on 10.202.6.81
    Fri May 22 14:28:28 UTC	Please execute the following join commands on target nodes:
    Role	Nodes	Command
    ----	-----	-------
    worker	2	./gravity join 10.202.6.81 --token=fbb30e2cb9e015bab9e58b27420fcdf8 --role=worker
    
    Fri May 22 14:28:29 UTC	Operation has been created
    Fri May 22 14:28:57 UTC	Successfully added "worker" node on 10.202.2.195
    Fri May 22 14:28:57 UTC	Please execute the following join commands on target nodes:
    Role	Nodes	Command
    ----	-----	-------
    worker	1	./gravity join 10.202.6.81 --token=fbb30e2cb9e015bab9e58b27420fcdf8 --role=worker
    
    Fri May 22 14:29:01 UTC	Successfully added "worker" node on 10.202.4.222
    Fri May 22 14:29:01 UTC	All agents have connected!
    .....
    

At this point, Gravity continues with the install. Once the install has finished, Gravity will output the login credentials to access the DSP UI as well as information about what services are now available.

Cluster is active

To log into DSP:

Hostname: https://localhost:30000
Username: dsp-admin
Password: bf2be8066757ffc8

NOTE: this is the original password created during cluster bootstrapping,
and will not be updated if dsp-admin's password is changed

To see these login instructions again: please run ./print-login

The following services are installed:

SERVICE         IP:PORT
DSP UI          localhost:30000
Login UI        localhost:30002
S2S Forwarder   localhost:30001
API Gateway     localhost:31000

 * Please make sure your firewall ports are open for these services *

To see these services again: please run ./print-services
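
If your nodes run firewalld, opening the listed service ports might look like the following sketch. This is only an illustration; use your own firewall tooling, and consult Port configuration requirements for the complete list of required ports.

    sudo firewall-cmd --permanent --add-port=30000/tcp   # DSP UI
    sudo firewall-cmd --permanent --add-port=30001/tcp   # S2S Forwarder
    sudo firewall-cmd --permanent --add-port=30002/tcp   # Login UI
    sudo firewall-cmd --permanent --add-port=31000/tcp   # API Gateway
    sudo firewall-cmd --reload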

Configure the Data Stream Processor UI redirect URL

By default, the Data Stream Processor uses the IPv4 address of eth0 to derive several properties required by the UI to function properly. If the eth0 network is not directly accessible (for example, it exists inside a private AWS VPC) or is otherwise incorrect, use the configure-ui script to manually define the IP address or hostname that can be used to access DSP.
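
For example, you can check which IPv4 address DSP derives by default by inspecting eth0 on the master node. This is only an illustration; interface names vary by system.

    # Show the IPv4 address currently assigned to eth0.
    ip -4 addr show eth0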

  1. From the master node, enter the following:
    DSP_HOST=<ip-address-of-master-node> ./configure-ui
  2. Then, enter the following:
    ./deploy 
  3. (Optional) DSP exposes four external network ports: 30000 for the DSP UI, 30002 for Authentication and Login, 31000 for the API Services, and 30001 for the Forwarders Service. By default, DSP uses self-signed certificates to connect to these services. To use your own SSL/TLS certificate to connect to these services, see Secure DSP with SSL/TLS certificates. An example of inspecting the default self-signed certificate from the command line follows these steps.
  4. Navigate to the Data Stream Processor UI to verify your changes.
    https://<DSP_HOST>:30000/
  5. On the login page, enter the following:
    User: dsp-admin
    Password: <the dsp-admin password generated from the installer>

    If you are using the Firefox or MS Edge browsers, you must trust the API certificate separately. Navigate to the host of your DSP instance at port 31000. For example, navigate to "https://1.2.3.4:31000" and trust the self-signed certificate.

    If you are using the Google Chrome browser and encounter a "net::ERR_CERT_INVALID" error with no "Proceed Anyway" option when you click on Advanced, click anywhere on the background then type "thisisunsafe" to trust the certificate.

  6. (Optional) If you need to retrieve the dsp-admin password, enter the following on your master node:
    ./print-login
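
As noted in steps 3 and 5, DSP presents self-signed certificates on its external ports by default. If you want to inspect the certificate that the API Gateway serves before trusting it in a browser, a command-line check like the following can help. Replace <DSP_HOST> with the address or hostname you configured above.

    # Display the subject, issuer, and validity dates of the certificate served on the API Gateway port.
    openssl s_client -connect <DSP_HOST>:31000 -showcerts </dev/null | openssl x509 -noout -subject -issuer -dates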

Change your admin password

Perform the following steps to change the dsp-admin password.

  1. From the master node, run the reset password script.
    sudo ./reset-admin-password
  2. Enter your new password.
  3. Navigate back to the DSP UI and log in with your new password.

The print-login script only returns the original password generated by the installer. If you forget your changed admin password, you'll need to reset your password again.

Check the status of your Data Stream Processor deployment

To check the status of your cluster, type the following.

gravity status

A response showing the current health of your cluster is displayed.

$ sudo gravity status
Cluster name:		sadlumiere3129
Cluster status:		active
Application:		dsp, version 1.2.0-daily.20200518.1043150
Gravity version:	6.1.22 (client) / 6.1.22 (server)
Join token:		fbb30e2cb9e015bab9e58b27420fcdf8
Periodic updates:	Not Configured
Remote support:		Not Configured
Last completed operation:
    * 3-node install
      ID:		36003096-505f-420d-ad1b-efc561e0fca6
      Started:		Fri May 22 14:28 UTC (1 hour ago)
      Completed:	Fri May 22 14:29 UTC (1 hour ago)
Cluster endpoints:
    * Authentication gateway:
        - 10.202.6.81:32009
        - 10.202.2.195:32009
        - 10.202.4.222:32009
    * Cluster management URL:
        - https://10.202.6.81:32009
        - https://10.202.2.195:32009
        - https://10.202.4.222:32009
Cluster nodes:
    Masters:
        * ip-10-202-6-81 / 10.202.6.81 / worker
            Status:	healthy
        * ip-10-202-2-195 / 10.202.2.195 / worker
            Status:	healthy
        * ip-10-202-4-222 / 10.202.4.222 / worker
            Status:	healthy
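
As an additional spot check, you can confirm that the DSP UI endpoint is answering. The -k flag is needed because DSP serves a self-signed certificate by default; replace <DSP_HOST> with your master node address or configured hostname.

    # Request only the response headers from the DSP UI endpoint.
    curl -k -I https://<DSP_HOST>:30000/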

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.1.0

