Splunk® Data Stream Processor

Install and administer the Data Stream Processor

This documentation does not apply to the most recent version of DSP.

Install the Data Stream Processor

To install the Splunk Data Stream Processor, download, extract, and run the installer on each node in your cluster. Contact your Splunk representative for access to the Data Stream Processor download page. The Data Stream Processor is installed from a Gravity package, which joins a group of cloud instances or hardware into a cluster and deploys Kubernetes on top of it. All Data Stream Processor services are deployed on top of Kubernetes. See the Gravity documentation for more information.

Extract and run the Data Stream Processor installer

Follow these steps to extract and run the DSP installer.

Prerequisites

  • Before you install the Splunk Data Stream Processor, make sure that your system clocks are synchronized on each node. Consult the system documentation for the particular operating systems on which you are running the Splunk Data Stream Processor. For most environments, Network Time Protocol (NTP) is the best approach.
  • You must have IPv4 Forwarding enabled on each node. See IPv4 Forwarding in the Gravity documentation.
  • You must have system administrator (root) permissions for the installation of Kubernetes and Gravity. Kubernetes leverages system components like iptables and kernel modules which require root access. If you do not have root permissions, you can use the sudo command. Once installed, non-privileged DSP containers and services do not run as the root user, but rather as a service user that you specify. See Service User in the Gravity documentation and step 5 below for details on how to specify the service user.
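The prerequisites above can be checked from the command line before you start. The following is a minimal sketch, not part of the DSP installer: the helper names are illustrative, the clock check applies only to systemd hosts, and the file path argument exists only so the forwarding check can be exercised against a test file instead of /proc.

```shell
#!/bin/sh
# Sketch: pre-install checks to run on each node. The function names and
# messages are illustrative; they are not part of the DSP installer.

# Check IPv4 forwarding. The path is a parameter so the check can be
# tried against a test file instead of /proc/sys/net/ipv4/ip_forward.
check_ipv4_forwarding() {
    f="${1:-/proc/sys/net/ipv4/ip_forward}"
    if [ "$(cat "$f" 2>/dev/null)" = "1" ]; then
        echo "ipv4 forwarding: enabled"
    else
        echo "ipv4 forwarding: disabled (sysctl -w net.ipv4.ip_forward=1)"
    fi
}

# Check that the clock is NTP-synchronized on systemd hosts; otherwise
# print a reminder to verify synchronization manually.
check_clock_sync() {
    if command -v timedatectl >/dev/null 2>&1; then
        timedatectl show --property=NTPSynchronized 2>/dev/null \
            || echo "NTPSynchronized=unknown"
    else
        echo "timedatectl not found: verify NTP synchronization manually"
    fi
}

check_ipv4_forwarding
check_clock_sync
```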

Steps

  1. Download the Data Stream Processor installer on each node in your cluster.
  2. On each node in your cluster, extract the Data Stream Processor installer from the tarball.
    tar xf <dsp-version>-linux-amd64.tar
  3. On the node that you want to be the master node, navigate to the extracted directory.
    cd <dsp-version>
  4. If you are installing DSP with SELinux enabled, temporarily disable SELinux in your Linux OS. The following command disables SELinux until the server is rebooted.
    setenforce 0
    
  5. From the extracted directory, run the DSP installer command. You must run this command with the --flavor=ha flag; the install command also supports several optional flags:
    ./install [--optional-flags] --flavor=ha 
    
    Flag Description
    --accept-license Automatically accepts the license agreement.
    --token <token> A secure token that prevents rogue nodes from joining the cluster during installation. The token must be at least six characters long.
    --service-uid <numeric> Specifies the service user ID. For information about how this is used, see Service User in the Gravity documentation. If not specified, a user named planet is created with user ID 1000.
    --service-gid <numeric> Specifies the service group ID. For information about how this is used, see Service User in the Gravity documentation. If not specified, a group named planet is created.
    Note: The ./join command does not support the --service-uid or --service-gid flags. Instead, the worker nodes use whatever values are set on the master node with ./install.
    --pod-network-cidr <CIDR> The CIDR range that Kubernetes allocates node subnets and pod IPs from. Must be at least a /16 so that Kubernetes can allocate a /24 to each node. If not specified, defaults to 10.244.0.0/16.
    --service-cidr <CIDR> The CIDR range that Kubernetes allocates service IPs from. If not specified, defaults to 10.100.0.0/16.
    --mount=data:/data/gravity/pv --state-dir=/data/gravity Changes the location where Gravity stores containers and state information. Defaults to /var/lib/gravity. This example stores data in the /data directory. Use these flags if you do not have enough disk space in /var to support 24 hours of data retention.

    The installation can take up to 15 minutes. Keep this terminal window open.

  6. If you are installing DSP on a CentOS or RHEL operating system, increase fs.inotify.max_user_watches to 1000000 in /etc/sysctl.d/99-sysctl.conf to prevent Gravity hosts from running out of inotify watches. If your Gravity hosts run out of inotify watches, Gravity throws an out of disk error.
    • On each node, open the 99-sysctl.conf file in /etc/sysctl.d/.
    • Add the following line to the file:
      fs.inotify.max_user_watches=1000000
    • Save your changes.
    • From the command line of the master node, run the following command:
      sysctl -p /etc/sysctl.d/99-sysctl.conf
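The sysctl change in step 6 can be scripted so that running it more than once does not add a duplicate line. This is a sketch, not part of the DSP tooling: set_inotify_watches is an illustrative helper, and the path parameter exists so the edit can be tried against a scratch file. On a real node, run it as root with no argument and then apply the setting with sysctl -p.

```shell
#!/bin/sh
# Sketch: idempotently set fs.inotify.max_user_watches in a sysctl.d file.
# Pass an alternate path (such as a scratch file) to try it without root.
set_inotify_watches() {
    conf="${1:-/etc/sysctl.d/99-sysctl.conf}"
    key="fs.inotify.max_user_watches"
    val="1000000"
    touch "$conf"
    if grep -q "^${key}=" "$conf"; then
        # Replace an existing setting in place rather than appending a duplicate.
        sed -i "s/^${key}=.*/${key}=${val}/" "$conf"
    else
        echo "${key}=${val}" >> "$conf"
    fi
    # On a real node, apply the change immediately (requires root):
    # sysctl -p "$conf"
}
```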

Open the required Gravity Ports and finish the install

Gravity is a toolkit that allows developers to package Kubernetes clusters and applications as a tarball. All Data Stream Processor services are deployed on top of Kubernetes. Complete the following steps to open the required Gravity ports and finish the DSP installation.

  1. Open the ports required by the Data Stream Processor.
  2. After the installation process has finished, the installer prints out a command that you must use to join the other nodes to the first master node. Copy the text after gravity join.
    Wed Oct  2 23:59:56 UTC Please execute the following join commands on target nodes:
    Role    Nodes   Command
    ----    -----   -------
    worker  2       gravity join <ip-address-master> --token=<token> --role=worker
    
  3. On each of the worker nodes, enter the following.
    ./join <ip-address-of-master> --token=<token> --role=worker
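If you save the installer output to a file, the join command can be extracted with standard tools instead of copied by hand. A sketch, using sample text that mirrors the format shown above; the IP address and token values are placeholders.

```shell
#!/bin/sh
# Sketch: pull the "gravity join ..." command out of saved installer output.
extract_join_command() {
    # Print everything from "gravity join" to the end of the line,
    # first match only.
    grep -o 'gravity join .*' | head -n 1
}

# Placeholder sample in the format printed by the installer.
installer_output='Role    Nodes   Command
----    -----   -------
worker  2       gravity join 172.31.24.68 --token=abc123 --role=worker'

join_cmd=$(printf '%s\n' "$installer_output" | extract_join_command)
echo "$join_cmd"
```

On a worker node, the extracted arguments after `gravity join` are what you pass to `./join`.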
    

When a minimum of two nodes have joined your cluster, the installation continues and does the following:

  • Checks that the system is running on a supported OS.
  • Checks that the system passes pre-installation checks, such as meeting the minimum system requirements.
  • Checks that the system is not already running Docker or other conflicting software.
  • Checks that the system has the necessary running services and kernel modules.
  • Installs Docker, Kubernetes, and other software dependencies such as SCloud.
  • Prepares Kubernetes to run the Data Stream Processor.
  • Installs the Data Stream Processor.
  • Checks that the Data Stream Processor is ready for use.
    • The application status hook is invoked. Failures are tagged as "degraded" in gravity status.

Configure the Data Stream Processor UI redirect URL

By default, the Data Stream Processor uses the IPv4 address of eth0 to derive several properties required by the UI to function properly. This will work in many but not all cases.

If the eth0 network is not directly accessible (for example, it is inside a private AWS VPC) or is otherwise incorrect, use the configure-ui script to manually define the IP address or hostname used to access DSP.
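To see which address DSP derives by default, you can inspect the IPv4 address of eth0 yourself. The parsing below is a sketch: the sample text stands in for real `ip -4 addr show eth0` output so it can be tried anywhere, and the address shown is a placeholder.

```shell
#!/bin/sh
# Sketch: extract the first IPv4 address from `ip -4 addr` style output.
first_ipv4() {
    # "inet 172.31.24.68/20 ..." -> "172.31.24.68"
    awk '/inet /{split($2, a, "/"); print a[1]; exit}'
}

# Placeholder sample standing in for: ip -4 addr show eth0
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001
    inet 172.31.24.68/20 brd 172.31.31.255 scope global dynamic eth0'

addr=$(printf '%s\n' "$sample" | first_ipv4)
echo "$addr"

# On a live node you would pipe the real command instead:
# ip -4 addr show eth0 | first_ipv4
```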

  1. From the master node, enter the following:
    DSP_HOST=<ip-address-of-master-node> ./configure-ui
    
  2. Then, enter the following:
    ./deploy 
  3. Navigate to the Data Stream Processor UI to verify your changes.
    https://<DSP_HOST>:30000/
    
  4. On the login page, enter the following:
    User: dsp-admin
    Password: <the dsp-admin password generated from the installer>
    
  5. (Optional) If you need to retrieve the dsp-admin password, enter the following on your master node:
    ./print-login
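The UI address in step 3 is simply DSP_HOST on port 30000 over HTTPS. A small sketch that builds the URL; build_ui_url is an illustrative helper, and the host value is a placeholder.

```shell
#!/bin/sh
# Sketch: build the DSP UI URL from a host, using the UI port from step 3.
# build_ui_url is an illustrative helper, not part of the DSP tooling.
build_ui_url() {
    host="$1"
    port="${2:-30000}"
    printf 'https://%s:%s/\n' "$host" "$port"
}

url=$(build_ui_url 172.31.24.68)
echo "$url"

# A quick reachability probe from the master node might look like:
# curl -k -s -o /dev/null -w '%{http_code}\n' "$url"
```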
    

Check the status of your Data Stream Processor deployment

To check the status of your cluster, do the following:

  1. From a node, type the following.
    gravity status

A response showing the current health of your cluster is displayed.

Cluster status:     active
Application:        dspbeta, version 0.1.5-503873
Join token:     cb8155ed37115fe4f70cd896e4a0eea5
Periodic updates:   Not Configured
Remote support:     Not Configured
Last completed operation:
    * operation_install (5b6cfae2-ee66-4789-a7e9-9d257d99cea9)
      started:      Fri Sep 27 16:02 UTC (3 days ago)
      completed:    Fri Sep 27 16:02 UTC (3 days ago)
Cluster endpoints:
    * Authentication gateway:
        - 172.31.24.68:32009
    * Cluster management URL:
        - https://172.31.24.68:32009
Cluster nodes:  musingbanach888
    Masters:
        * ip-172-31-24-68 / 172.31.24.68 / worker
            Status: healthy
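For scripted health checks, the text output of gravity status can be parsed with standard tools. A sketch, assuming the output format shown above; the sample block stands in for the real command so the parsing can be tried anywhere.

```shell
#!/bin/sh
# Sketch: read the cluster state from `gravity status` text output.
cluster_state() {
    awk -F': *' '/^Cluster status:/ {print $2; exit}'
}

# Placeholder sample mirroring the output shown above.
sample='Cluster status:     active
Application:        dspbeta, version 0.1.5-503873'

state=$(printf '%s\n' "$sample" | cluster_state)
echo "$state"
if [ "$state" = "active" ]; then
    echo "cluster healthy"
fi

# On a node you would pipe the real command instead:
# gravity status | cluster_state
```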

Cluster configuration options

You can view or change the default configurations of your cluster by using the following commands.

  • get-config
  • set-config
  • list-configs
  • get-secret
  • set-secret
  • list-secrets

To set a new configuration or secret:

  1. To set a new configuration, type the following from a node.
    ./set-config <PROPERTY_NAME> <VALUE> 
    
  2. To set a new secret, type the following from a node.
    ./set-secret <SECRET_NAME>
    
  3. Deploy the changes.
    ./deploy
    


To view existing cluster configurations:

  1. From a node, type the following to see the list of configurations, including values.
    ./list-configs   
    
  2. From a node, type the following to see the list of secret keys.
    ./list-secrets
    

To view individual configurations or secrets:

  1. From a node, type the following to see an individual non-secret configuration property.
    ./get-config <PROPERTY_NAME>
    
  2. From a node, type the following to see an individual secret.
    ./get-secret <SECRET_NAME>
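Because a configuration change only takes effect after ./deploy, you may want to wrap the two steps together. This is an illustrative sketch, not part of DSP: set_and_deploy and the DSP_BIN variable are assumptions, and the property name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch: set a configuration property and deploy it in one step.
# DSP_BIN points at the directory containing the set-config and deploy
# scripts; it defaults to the current directory, as in the steps above.
set_and_deploy() {
    prop="$1"
    value="$2"
    "${DSP_BIN:-.}/set-config" "$prop" "$value" || return 1
    "${DSP_BIN:-.}/deploy"
}

# Usage on a node (property name is a placeholder):
# set_and_deploy SOME_PROPERTY some-value
```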
    
Last modified on 02 April, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.0.0

