Splunk® Data Stream Processor

Install and administer the Data Stream Processor


DSP 1.2.1 is impacted by the CVE-2021-44228 and CVE-2021-45046 security vulnerabilities from Apache Log4j. To fix these vulnerabilities, you must upgrade to DSP 1.2.2-patch02. See Upgrade the Splunk Data Stream Processor from 1.2.1 to 1.2.2-patch02 for upgrade instructions, and Splunk Security Advisory for Apache Log4j (CVE-2021-44228 and CVE-2021-45046) for more information.

Increase internal partitions to improve pipeline throughput

The throughput of a pipeline is highly correlated with its parallelism. You can increase the parallelism of certain pipelines by increasing the number of input partitions of the internal Apache Pulsar message bus. The Splunk Data Stream Processor uses Apache Pulsar as the message bus for the following data sources: Read from Splunk Firehose, Read from Forwarders Service, and Read from the Ingest REST API.

Decreasing the number of partitions later can cause data loss. When you increase the number of input partitions, take care not to overallocate them. If you do need to decrease the number of partitions, contact Splunk Support.

Steps:

  1. From a master node in your cluster, get a list of running Apache Pulsar broker pods.
    kubectl get pods -n pulsar
  2. Log in to a running broker pod. This example uses a pod named broker-0.
    kubectl exec -it broker-0 -n pulsar -- /bin/bash
    
  3. (Optional) Get the current number of partitions.
    pulsar-admin topics get-partitioned-topic-metadata DSP/default-ingest/input
  4. Use the pulsar-admin CLI tool to update the number of partitions. Replace <Number-of-Partitions> with the new total number of partitions.
    pulsar-admin topics update-partitioned-topic -p <Number-of-Partitions> DSP/default-ingest/input
  5. Confirm that the number of partitions has been changed by using the pulsar-admin CLI tool again.
    pulsar-admin topics get-partitioned-topic-metadata DSP/default-ingest/input
    
  6. Log in to the Data Stream Processor and restart your pipelines for changes to take effect.
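
Because decreasing the partition count can cause data loss, it can help to compare the requested count against the current count before you run the update in step 4. The following is a minimal sketch of such a check, not part of the product. The function names parse_partitions, is_increase, and plan_update are hypothetical, and the sketch assumes that get-partitioned-topic-metadata prints JSON containing a "partitions" field. Verify the output format on your broker before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: helper functions to review a partition change before applying it.
# Assumption: the metadata command prints JSON such as { "partitions" : 4 }.

TOPIC="DSP/default-ingest/input"

# Read metadata JSON on stdin and print the partition count.
parse_partitions() {
  sed -n 's/.*"partitions"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only if the requested count is strictly greater than the current count.
# Non-numeric or empty input makes the comparison fail, so it is also rejected.
is_increase() {
  local current="$1" requested="$2"
  [ "$requested" -gt "$current" ] 2>/dev/null
}

# Print the update command for review, or refuse if it is not an increase.
# This function never runs pulsar-admin itself.
plan_update() {
  local current="$1" requested="$2"
  if is_increase "$current" "$requested"; then
    echo "pulsar-admin topics update-partitioned-topic -p $requested $TOPIC"
  else
    echo "refusing: $requested is not greater than current count '$current'" >&2
    return 1
  fi
}

# Example use inside the broker pod (the value 8 is illustrative):
#   current=$(pulsar-admin topics get-partitioned-topic-metadata "$TOPIC" | parse_partitions)
#   plan_update "$current" 8
```

The sketch only prints the command so that you can review it before running it yourself. It does not protect against overallocating partitions, so still choose the new count conservatively.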

To further improve pipeline throughput, you can add a batching function to your pipeline. See batch bytes or batch records.

Last modified on 14 June, 2021

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.1, 1.2.2-patch02

