Splunk® Data Stream Processor

Install and administer the Data Stream Processor



On April 3, 2023, Splunk Data Stream Processor reached its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.

All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator that has reached its end of life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, after July 1, 2023, we will no longer provide support for versions of DSP prior to DSP 1.4.0. We advise all of our customers to upgrade to DSP 1.4.0 to continue to receive full product support from Splunk.

Troubleshoot your Splunk Data Stream Processor deployment

Use this information to troubleshoot issues relating to the Splunk Data Stream Processor (DSP) installation and deployment.

Support

To report bugs or receive additional support, contact Splunk Customer Support and provide the following information:

  * Pipeline ID: To view the ID of a pipeline, open the pipeline in DSP, then click the pipeline options icon and select Update pipeline metadata.
  * Pipeline name.
  * DSP version: To view your DSP version, in the product UI, click the More Options icon and select About.
  * DSP diagnostic report: A DSP diagnostic report contains all DSP application logs as well as system and monitoring logs. To generate this report, follow the steps after this list.
  * A summary of the problem and any additional relevant information.

To generate the diagnostic report, do the following:

  1. Navigate to the working directory of a DSP controller node. To identify the controller nodes in your DSP cluster, run the ./dsp status nodes command and check the ROLE section of the returned information. By default, the name of the working directory is dsp-<version>-linux-amd64, where <version> is the DSP version that you're running.
  2. Run the following command: sudo dsp report

The command creates a diagnostic report named dsp-report-<timestamp>.tar.gz in the working directory.
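For example, on a controller node running DSP 1.4.0, the sequence might look like the following. The directory name here is an assumption; substitute the version that you're running.

# move into the DSP working directory on the controller node
cd dsp-1.4.0-linux-amd64

# generate the diagnostic report, then confirm that it was written
sudo dsp report
ls dsp-report-*.tar.gz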

[ERROR]: cannot allocate memory

DSP on RHEL 7/CentOS 7 fails with a warning message similar to the following:

Warning FailedCreatePodContainer 5s (x2 over 16s) kubelet, 10.234.0.181 unable to ensure pod container exists: failed to create container for [kubepods burstable poded9bd025-c3e4-4ebb-a5b7-2a7adab9742d] : mkdir /sys/fs/cgroup/memory/kubepods/burstable/poded9bd025-c3e4-4ebb-a5b7-2a7adab9742d: cannot allocate memory

Cause

This warning is caused by a bug in older RHEL 7/CentOS 7 kernels, such as v3.10.0-1127.19.1.el7, in combination with systemd v231 or earlier, where kernel memory cgroups are not cleaned up properly. This manifests as a memory allocation error when new pods are created. For more information, see the Kubernetes bug report: Kubelet CPU/Memory Usage linearly increases using CronJob.

Solution

Upgrade systemd to v232 or later, or disable kernel memory accounting by setting cgroup.memory=nokmem.

Do the following steps to disable kernel memory accounting:

  1. Find the kernel version.
    grubby --default-kernel
  2. Disable kernel memory accounting by adding cgroup.memory=nokmem to the kernel boot parameters.
    grubby --args=cgroup.memory=nokmem --update-kernel /boot/<kernel_version>
  3. Reboot the host.
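After the host comes back up, you can verify that the parameter took effect. A minimal check on a grubby-based system such as RHEL 7/CentOS 7:

# confirm that the running kernel was booted with the new parameter
grep cgroup.memory=nokmem /proc/cmdline

# alternatively, inspect the configured boot arguments of the default kernel
grubby --info="$(grubby --default-kernel)"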

[ERROR]: waiting for agents to join:

You may see this error while running the installer.

Cause

The DSP installer waits ten minutes for nodes to join your cluster. If fewer than three nodes have joined after ten minutes, the installer times out.

Solution

You must remove all nodes from k0s and start the installation process again.

  1. On the controller node, run sudo ./dsp leave --confirm to force the node to leave the k0s cluster.
  2. Make sure that you have all three nodes prepared, and then start the installation process over again.
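After the installer is running again, you can confirm that all three nodes have joined. For example, from the working directory on the controller node:

# list cluster members and their roles; all three nodes should appear
./dsp status nodes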

[ERROR]: The following pre-flight checks failed:

The DSP installer fails to complete because of pre-flight checks.

Cause

The DSP installer runs pre-flight checks to make sure that your system meets the minimum requirements for DSP. If your system does not meet those requirements, the installer quits the installation.

Solution

The installer reports which pre-flight checks failed. Using that information, verify that your system meets the mandatory hardware and software requirements for DSP. See Hardware and Software Requirements.

[ERROR]: The following pre-flight checks failed: XXGB available space left on /var/data, minimum of 175GB is required

The DSP installer fails to complete because there isn't enough space left on /var/data, even if another disk volume or partition is specified with --location.

Cause

The DSP installer runs a pre-flight check to make sure that your system has enough drive space on /var/data, even if you have used --location to install DSP on another drive or partition. If there isn't enough disk space on the volume backing /var/data, the pre-flight check fails with a disk space error.

Solution

Add a symlink from your intended install location to /var/data. For example, if you want to use --location /data, then add the following symlink.

ln -s /data /var/data
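Before and after creating the link, you can sanity-check the setup. A minimal sketch, assuming /data is the intended location:

# confirm that the intended install volume has at least 175GB free
df -h /data

# confirm that /var/data now points to /data
ls -ld /var/data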

The DSP installer fails to complete due to clocks being out of sync

During DSP installation, the console returns the following error message.

Operation failure: servers ip-10-216-29-75 and ip-10-216-29-6 clocks are out of sync: Fri Sep 11 22:23:01.863 UTC and Fri Sep 11 22:23:02.562 UTC respectively, sync the times on servers before install, e.g. using ntp

Cause

The time difference between servers is greater than 300 milliseconds.

Solution

Synchronize the system clocks on each node. For most environments, Network Time Protocol (NTP) is the best approach. Consult the system documentation for the particular operating systems on which you are running the Splunk Data Stream Processor. If you are running DSP on an AWS EC2 environment, see "Setting the time for your Linux instance" in the Amazon Web Services documentation. If you are running DSP on a different environment, see "NTP" in the Debian documentation or the Chrony documentation.
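For example, on RHEL or CentOS nodes, one common approach is to run chrony on every node. This is a sketch; package and service names can differ by distribution.

# install and start the chrony NTP client on each node
sudo yum install -y chrony
sudo systemctl enable --now chronyd

# confirm that the node is synchronized to a time source
chronyc tracking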

Network bridge driver loading issues

Depending on the system configuration, network bridge drivers might not be loaded. If they are not loaded, the installation fails at the /health phase. See the installation checklist.

Installation failure due to disabled Network Bridge Driver

The installation fails with the following error message:

[ERROR]: failed to execute phase "/health" planet is not running yet: &{degraded [{ 10.216.31.29 master  degraded [kubernetes requires net.bridge.bridge-nf-call-iptables sysctl set to 1, https://www.gravitational.com/docs/faq/#bridge-driver]} { 10.216.31.218 master  degraded [kubernetes requires net.bridge.bridge-nf-call-iptables sysctl set to 1, https://www.gravitational.com/docs/faq/#bridge-driver]} { 10.216.31.252 master  healthy []}]} (planet is not running yet: &{degraded [{ 10.216.31.29 master  degraded [kubernetes requires net.bridge.bridge-nf-call-iptables sysctl set to 1, https://www.gravitational.com/docs/faq/#bridge-driver]} { 10.216.31.218 master  degraded [kubernetes requires net.bridge.bridge-nf-call-iptables sysctl set to 1, https://www.gravitational.com/docs/faq/#bridge-driver]} { 10.216.31.252 master  healthy []}]})

You must do the following on each node:

  1. sysctl -w net.bridge.bridge-nf-call-iptables=1
  2. echo net.bridge.bridge-nf-call-iptables=1 >> /etc/sysctl.d/10-bridge-nf-call-iptables.conf
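If the net.bridge.bridge-nf-call-iptables key does not exist at all, the bridge netfilter module itself might not be loaded. A minimal sketch to load it and verify the setting:

# load the bridge netfilter module now, and on every boot
sudo modprobe br_netfilter
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf

# verify that the setting now reports 1
sysctl net.bridge.bridge-nf-call-iptables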

Then, restart the installation process.

Unable to log in to the Splunk Cloud Services CLI

Logging in to the Splunk Cloud Services CLI fails with an error similar to the following:

failed to get session token: failed to get valid response from csrfToken endpoint: Get "https://<ip_addr>/csrfToken": x509: cannot validate certificate for <ip> because it doesn't contain any IP SANs

Cause

The Splunk Cloud Services CLI configuration file contains incorrect settings.

Solution

Make sure that your Splunk Cloud Services CLI settings are configured correctly, given the particular version of the Splunk Cloud Services CLI that you are using. See Configure the Splunk Cloud Services CLI.

DSP UI times out

The DSP UI appears to find the controller node but fails to load.

Cause

The controller node's IP address has been changed, and the DSP UI is trying to redirect your browser to a private IP. Such IP reassignments are common with various public cloud providers when servers are stopped.

Solution

Reconfigure the DSP UI redirect URL. See Configure the Data Stream Processor UI redirect URL.

My data is not making it into my pipeline

If data is not making it into your activated pipelines, check to see whether all the ingestion services are running in Kubernetes.

Cause

One of the ingestion services could be down.

Solution

Make sure that all the ingest services are running: ingest-hec, ingest-s2s, and splunk-streaming-rest. To check the status of the pods in the DSP namespace, run the following command:

kubectl get pods -n dsp
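For example, to narrow the output to just the ingest services (pod names include generated suffixes):

# show only the ingest-related pods; each should report a Running status
kubectl get pods -n dsp | grep -E 'ingest-hec|ingest-s2s|splunk-streaming-rest'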

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.4.0, 1.4.1, 1.4.2, 1.4.3

