Splunk® Data Stream Processor

Install and administer the Data Stream Processor

This documentation does not apply to the most recent version of DSP.

Install the Data Stream Processor

To install the Splunk Data Stream Processor, download, extract, and run the installer on each node in your cluster. Contact your Splunk representative for access to the Data Stream Processor download page. The Data Stream Processor is installed from a Gravity package, which joins a group of cloud instances or hardware into a cluster and deploys Kubernetes on top of it. All Data Stream Processor services are deployed on top of Kubernetes. See the Gravity documentation for more information.

Extract and run the Data Stream Processor installer

Follow these steps to extract and run the DSP installer.

Prerequisites

  • Before you install the Splunk Data Stream Processor, make sure that your system clocks are synchronized on each node. Consult the system documentation for the particular operating systems on which you are running the Splunk Data Stream Processor. For most environments, Network Time Protocol (NTP) is the best approach.
  • You must have IPv4 Forwarding enabled on each node. See IPv4 Forwarding in the Gravity documentation.
  • You must have system administrator (root) permissions for the installation of Kubernetes and Gravity. Kubernetes leverages system components like iptables and kernel modules which require root access. If you do not have root permissions, you can use the sudo command. Once installed, non-privileged DSP containers and services do not run as the root user, but rather as a service user that you specify. See Service User in the Gravity documentation and step 5 below for details on how to specify the service user.
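The prerequisites above can be checked from the command line before you start. The following is a minimal sketch, not part of the DSP installer: the helper names are illustrative, the clock check applies only to systemd hosts, and the file path argument exists only so the forwarding check can be exercised against a test file instead of /proc.

```shell
#!/bin/sh
# Sketch: pre-install checks to run on each node. The function names and
# messages are illustrative; they are not part of the DSP installer.

# Check IPv4 forwarding. The path is a parameter so the check can be
# tried against a test file instead of /proc/sys/net/ipv4/ip_forward.
check_ipv4_forwarding() {
    f="${1:-/proc/sys/net/ipv4/ip_forward}"
    if [ "$(cat "$f" 2>/dev/null)" = "1" ]; then
        echo "ipv4 forwarding: enabled"
    else
        echo "ipv4 forwarding: disabled (sysctl -w net.ipv4.ip_forward=1)"
    fi
}

# Check that the clock is NTP-synchronized on systemd hosts; otherwise
# print a reminder to verify synchronization manually.
check_clock_sync() {
    if command -v timedatectl >/dev/null 2>&1; then
        timedatectl show --property=NTPSynchronized 2>/dev/null \
            || echo "NTPSynchronized=unknown"
    else
        echo "timedatectl not found: verify NTP synchronization manually"
    fi
}

check_ipv4_forwarding
check_clock_sync
```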

Steps

  1. Download the Data Stream Processor installer on each node in your cluster.
  2. On each node in your cluster, extract the Data Stream Processor installer from the tarball.
    tar xf <dsp-version>-linux-amd64.tar
  3. On the node that you want to be the master node, navigate to the extracted directory.
    cd <dsp-version>
  4. If you are installing DSP with SELinux enabled, temporarily disable SELinux in your Linux OS. The following command disables SELinux until the server is rebooted.
    setenforce 0
    
  5. From the extracted directory, run the DSP installer command. You must run this command with the --flavor=ha flag; the install command also supports several optional flags:
    ./install [--optional-flags] --flavor=ha 
    
    Flag Description
    --accept-license Automatically accepts the license agreement.
    --token <token> A secure token that prevents rogue nodes from joining the cluster during installation. The token must be at least six characters long.
    --service-uid <numeric> Specifies the service user ID. For information about how this is used, see Service User in the Gravity documentation. If not specified, a user named planet is created with user ID 1000.
    --service-gid <numeric> Specifies the service group ID. For information about how this is used, see Service User in the Gravity documentation. If not specified, a group named planet is created.
    Note: The ./join command does not support the --service-uid or --service-gid flags. Instead, the worker nodes use whatever values are set on the master node with ./install.
    --pod-network-cidr <CIDR> The CIDR range that Kubernetes allocates node subnets and pod IPs from. Must be at least a /16 so that Kubernetes can allocate a /24 to each node. If not specified, defaults to 10.244.0.0/16.
    --service-cidr <CIDR> The CIDR range that Kubernetes allocates service IPs from. If not specified, defaults to 10.100.0.0/16.
    --mount=data:/data/gravity/pv --state-dir=/data/gravity Changes the location where Gravity stores containers and state information. Defaults to /var/lib/gravity. This example stores data in the /data directory. Use these flags if you do not have enough disk space in /var to support 24 hours of data retention.

    The installation can take up to 15 minutes. Keep this terminal window open.

  6. If you are installing DSP on a CentOS or RHEL operating system, increase fs.inotify.max_user_watches to 1000000 in /etc/sysctl.d/99-sysctl.conf to prevent Gravity hosts from running out of inotify watches. If your Gravity hosts run out of inotify watches, Gravity throws an out of disk error.
    • On each node, open the 99-sysctl.conf file in /etc/sysctl.d/.
    • Add the following line to the file:
      fs.inotify.max_user_watches=1000000
    • Save your changes.
    • From the command line of the master node, run the following command:
      sysctl -p /etc/sysctl.d/99-sysctl.conf
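The sysctl change in step 6 can be scripted so that running it more than once does not add a duplicate line. This is a sketch, not part of the DSP tooling: set_inotify_watches is an illustrative helper, and the path parameter exists so the edit can be tried against a scratch file. On a real node, run it as root with no argument and then apply the setting with sysctl -p.

```shell
#!/bin/sh
# Sketch: idempotently set fs.inotify.max_user_watches in a sysctl.d file.
# Pass an alternate path (such as a scratch file) to try it without root.
set_inotify_watches() {
    conf="${1:-/etc/sysctl.d/99-sysctl.conf}"
    key="fs.inotify.max_user_watches"
    val="1000000"
    touch "$conf"
    if grep -q "^${key}=" "$conf"; then
        # Replace an existing setting in place rather than appending a duplicate.
        sed -i "s/^${key}=.*/${key}=${val}/" "$conf"
    else
        echo "${key}=${val}" >> "$conf"
    fi
    # On a real node, apply the change immediately (requires root):
    # sysctl -p "$conf"
}
```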

Open the required Gravity Ports and finish the install

Gravity is a toolkit that allows developers to package Kubernetes clusters and applications as a tarball. All Data Stream Processor services are deployed on top of Kubernetes. Complete the following steps to open the required Gravity ports and finish the DSP installation.

  1. Open the ports required by the Data Stream Processor.
  2. After the installation process has finished, the installer prints out a command that you must use to join the other nodes to the first master node. Copy the text after gravity join.
    Wed Oct  2 23:59:56 UTC Please execute the following join commands on target nodes:
    Role    Nodes   Command
    ----    -----   -------
    worker  2       gravity join <ip-address-master> --token=<token> --role=worker
    
  3. On each of the worker nodes, enter the following.
    ./join <ip-address-of-master> --token=<token> --role=worker
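If you save the installer output to a file, the join command can be extracted with standard tools instead of copied by hand. A sketch, using sample text that mirrors the format shown above; the IP address and token values are placeholders.

```shell
#!/bin/sh
# Sketch: pull the "gravity join ..." command out of saved installer output.
extract_join_command() {
    # Print everything from "gravity join" to the end of the line,
    # first match only.
    grep -o 'gravity join .*' | head -n 1
}

# Placeholder sample in the format printed by the installer.
installer_output='Role    Nodes   Command
----    -----   -------
worker  2       gravity join 172.31.24.68 --token=abc123 --role=worker'

join_cmd=$(printf '%s\n' "$installer_output" | extract_join_command)
echo "$join_cmd"
```

On a worker node, the extracted arguments after `gravity join` are what you pass to `./join`.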
    

When a minimum of two nodes have joined your cluster, the installation continues and does the following:

  • Checks that the system is running on a supported OS.
  • Checks that the system passes pre-installation checks, such as meeting the minimum system requirements.
  • Checks that the system is not already running Docker or other conflicting software.
  • Checks that the system has the necessary running services and kernel modules.
  • Installs Docker, Kubernetes, and other software dependencies such as SCloud.
  • Prepares Kubernetes to run the Data Stream Processor.
  • Installs the Data Stream Processor.
  • Checks that the Data Stream Processor is ready for use.
    • The application status hook is invoked. Failures are tagged as "degraded" in gravity status.

Configure the Data Stream Processor UI redirect URL

By default, the Data Stream Processor uses the IPv4 address of eth0 to derive several properties required by the UI to function properly. This will work in many but not all cases.

If the eth0 network is not directly accessible (for example, it is inside a private AWS VPC) or is otherwise incorrect, use the configure-ui script to manually define the IP address or hostname used to access DSP.
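To see which address DSP derives by default, you can inspect the IPv4 address of eth0 yourself. The parsing below is a sketch: the sample text stands in for real `ip -4 addr show eth0` output so it can be tried anywhere, and the address shown is a placeholder.

```shell
#!/bin/sh
# Sketch: extract the first IPv4 address from `ip -4 addr` style output.
first_ipv4() {
    # "inet 172.31.24.68/20 ..." -> "172.31.24.68"
    awk '/inet /{split($2, a, "/"); print a[1]; exit}'
}

# Placeholder sample standing in for: ip -4 addr show eth0
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001
    inet 172.31.24.68/20 brd 172.31.31.255 scope global dynamic eth0'

addr=$(printf '%s\n' "$sample" | first_ipv4)
echo "$addr"

# On a live node you would pipe the real command instead:
# ip -4 addr show eth0 | first_ipv4
```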

  1. From the master node, enter the following:
    DSP_HOST=<ip-address-of-master-node> ./configure-ui
    
  2. Then, enter the following:
    ./deploy 
  3. Navigate to the Data Stream Processor UI to verify your changes.
    https://<DSP_HOST>:30000/
    
  4. On the login page, enter the following:
    User: dsp-admin
    Password: <the dsp-admin password generated from the installer>
    
  5. (Optional) If you need to retrieve the dsp-admin password, enter the following on your master node:
    ./print-login
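The UI address in step 3 is simply DSP_HOST on port 30000 over HTTPS. A small sketch that builds the URL; build_ui_url is an illustrative helper, and the host value is a placeholder.

```shell
#!/bin/sh
# Sketch: build the DSP UI URL from a host, using the UI port from step 3.
# build_ui_url is an illustrative helper, not part of the DSP tooling.
build_ui_url() {
    host="$1"
    port="${2:-30000}"
    printf 'https://%s:%s/\n' "$host" "$port"
}

url=$(build_ui_url 172.31.24.68)
echo "$url"

# A quick reachability probe from the master node might look like:
# curl -k -s -o /dev/null -w '%{http_code}\n' "$url"
```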
    

Check the status of your Data Stream Processor deployment

To check the status of your cluster, do the following:

  1. From a node, type the following.
    gravity status

A response showing the current health of your cluster is displayed.

Cluster status:     active
Application:        dspbeta, version 0.1.5-503873
Join token:     cb8155ed37115fe4f70cd896e4a0eea5
Periodic updates:   Not Configured
Remote support:     Not Configured
Last completed operation:
    * operation_install (5b6cfae2-ee66-4789-a7e9-9d257d99cea9)
      started:      Fri Sep 27 16:02 UTC (3 days ago)
      completed:    Fri Sep 27 16:02 UTC (3 days ago)
Cluster endpoints:
    * Authentication gateway:
        - 172.31.24.68:32009
    * Cluster management URL:
        - https://172.31.24.68:32009
Cluster nodes:  musingbanach888
    Masters:
        * ip-172-31-24-68 / 172.31.24.68 / worker
            Status: healthy
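For scripted health checks, the text output of gravity status can be parsed with standard tools. A sketch, assuming the output format shown above; the sample block stands in for the real command so the parsing can be tried anywhere.

```shell
#!/bin/sh
# Sketch: read the cluster state from `gravity status` text output.
cluster_state() {
    awk -F': *' '/^Cluster status:/ {print $2; exit}'
}

# Placeholder sample mirroring the output shown above.
sample='Cluster status:     active
Application:        dspbeta, version 0.1.5-503873'

state=$(printf '%s\n' "$sample" | cluster_state)
echo "$state"
if [ "$state" = "active" ]; then
    echo "cluster healthy"
fi

# On a node you would pipe the real command instead:
# gravity status | cluster_state
```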

Cluster configuration options

You can view or change the default configurations of your cluster by using the following commands.

  • get-config
  • set-config
  • list-configs
  • get-secret
  • set-secret
  • list-secrets

To set a new configuration or secret:

  1. To set a new configuration, type the following from a node.
    ./set-config <PROPERTY_NAME> <VALUE> 
    
  2. To set a new secret, type the following from a node.
    ./set-secret <SECRET_NAME>
    
  3. Deploy the changes.
    ./deploy
    


To view existing cluster configurations:

  1. From a node, type the following to see the list of configurations, including values.
    ./list-configs   
    
  2. From a node, type the following to see the list of secret keys.
    ./list-secrets
    

To view individual configurations or secrets:

  1. From a node, type the following to see an individual non-secret configuration property.
    ./get-config <PROPERTY_NAME>
    
  2. From a node, type the following to see an individual secret.
    ./get-secret <SECRET_NAME>
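Because a configuration change only takes effect after ./deploy, you may want to wrap the two steps together. This is an illustrative sketch, not part of DSP: set_and_deploy and the DSP_BIN variable are assumptions, and the property name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch: set a configuration property and deploy it in one step.
# DSP_BIN points at the directory containing the set-config and deploy
# scripts; it defaults to the current directory, as in the steps above.
set_and_deploy() {
    prop="$1"
    value="$2"
    "${DSP_BIN:-.}/set-config" "$prop" "$value" || return 1
    "${DSP_BIN:-.}/deploy"
}

# Usage on a node (property name is a placeholder):
# set_and_deploy SOME_PROPERTY some-value
```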
    
Last modified on 02 April, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.0.0

