Splunk® Data Stream Processor

Use the Data Stream Processor



On April 3, 2023, Splunk Data Stream Processor will reach its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.

Sending data from DSP to the Splunk platform

You can send data from the Splunk Data Stream Processor (DSP) into a Splunk Enterprise or a Splunk Cloud environment using the Splunk HTTP Event Collector (HEC). DSP uses the /services/collector/ HEC endpoint to send data to a Splunk index.

The HEC endpoint accepts JSON data in two different schemas: the event schema and the metrics schema.

Some sources, such as Read from Splunk Firehose, produce records with a compatible schema out of the box, while records from other sources, such as Read from Kafka, must be transformed with DSP functions into a compatible form.
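
For reference, the two payload shapes look like the following sketches. The field values here are hypothetical; see the Splunk HEC documentation on the event and metrics formats for the authoritative schemas.

Event schema:

    {
      "time": 1589290200,
      "host": "dsp-node-01",
      "source": "my-source",
      "sourcetype": "my-sourcetype",
      "index": "events_idx",
      "event": {"message": "user login succeeded"}
    }

Metrics schema:

    {
      "time": 1589290200,
      "host": "dsp-node-01",
      "source": "my-source",
      "index": "metrics_idx",
      "event": "metric",
      "fields": {
        "metric_name": "cpu.utilization",
        "_value": 42.5,
        "region": "us-west-1"
      }
    }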

To send data to a HEC endpoint from DSP, send records that conform to the event or metrics schema into a Splunk Enterprise sink function. Several sink functions are available, depending on the configuration of your DSP environment. Use the following descriptions to determine which Splunk Enterprise sink function is best for your use case. Each description includes the SPL2 representation of a Splunk Firehose to Splunk Enterprise pipeline.

Write to the Splunk platform with Batching

This is the recommended sink function for sending data from DSP to Splunk Enterprise. It takes DSP event or metric records as input and performs the common workflow of dropping the attributes field, turning each record into a JSON payload using the event or metrics schema shown above, and batching the bytes of those payloads for better throughput. This function also adds out-of-the-box support for index-based routing with batched data.

SPL2 example:

| from read_splunk_firehose() 
| into splunk_enterprise_indexes(
    "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    cast(map_get(attributes, "index"), "string"),
    "events_idx",
    {"async": "true", "hec-enable-ack": "false", "hec-token-validation": "true"},
    "2MB",
    5000
  );

Write to the Splunk platform

This is a lower-level function: you pass in a HEC JSON payload of type bytes, and the function sends it to HEC as-is. The payload is assumed to be valid JSON in one of the supported schemas. The Write to the Splunk platform with Batching function does three things:
  1. Converts the DSP records to Splunk event JSON or metric JSON.
  2. Batches records for throughput.
  3. Writes the HEC JSON to the Splunk platform.

In contrast, the Write to the Splunk platform function performs only the last step, which gives you more flexibility and composability at the cost of convenience. For example, you can use batch_records as the batching mechanism instead of batch_bytes, but that limits you to specifying an index per payload rather than per event or per metric.

SPL2 example:

| from read_splunk_firehose()
| to_splunk_json index=cast(map_get(attributes, "index"), "string")
| batch_bytes bytes=to_bytes(json) size="2MB" millis=5000
| into splunk_enterprise(
    "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    "events_idx",
    bytes,
    {"async": "true", "hec-enable-ack": "false", "hec-token-validation": "true"});

Write to the Splunk platform (Default for Environment)

Use this function to send data to the default, pre-configured Splunk environment associated with your DSP installation using the default HEC token. This function acts like splunk_enterprise, except that you don't need to explicitly pass in the bytes payload; it is extracted automatically from the incoming data stream.

SPL2 example:

| from read_splunk_firehose()
| to_splunk_json index=cast(map_get(attributes, "index"), "string")
| batch_bytes bytes=to_bytes(json) size="2MB" millis=5000
| into write_index("", "events_idx");


By default, Splunk HEC endpoints are SSL-enabled and exposed over HTTPS. To securely send data to a HEC endpoint exposed over SSL, confirm with your DSP administrator that the proper environment variables are set. See Configure the Data Stream Processor to send data to an SSL-enabled Splunk Enterprise instance. If you are using Splunk Enterprise, you can disable SSL on your HEC endpoint by going to Data Inputs > HTTP Event Collector and clicking Global Settings.
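
To verify that your HEC endpoint is reachable over HTTPS before activating a pipeline, you can send a test event to it directly. This is a minimal sketch: the hostname is hypothetical, 8088 is the default HEC port, and you substitute your own HEC token and index.

    curl https://splunk.example.com:8088/services/collector/event \
      -H "Authorization: Splunk <your-HEC-token>" \
      -d '{"event": "DSP connectivity test", "index": "events_idx"}'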

Example workflow: Send data from Splunk Firehose to a Splunk Enterprise index

If you are receiving streaming data from Splunk Firehose, follow these steps to send data to a Splunk Enterprise index.

  1. Create a connection to the Splunk platform in DSP.
  2. From the Build Pipeline tab, select the Read from Splunk Firehose data source. The Splunk Firehose source function reads streaming data from the DSP Ingest, Collect, HEC, Syslog, and Forwarders services.
  3. (Optional) If your data pipeline receives data from a universal forwarder, you must perform additional transformations on your data. See Get data from a universal forwarder.
  4. Click + to add any other data transformations to your pipeline.
  5. End your pipeline with the Write to the Splunk platform with Batching sink function.
  6. Click Start Preview to verify that your pipeline is sending data.
  7. Save and activate your pipeline.
  8. After activating the pipeline, click View next to your pipeline name to open the view-only version of your pipeline, where you can see metrics across the entire pipeline.
  9. After you see data flowing through your activated pipeline, navigate to the Splunk platform.
  10. From the Search & Reporting app in the Splunk platform, search for your data (to search metric data, see the note after these steps):

    index="your-index-name"

  11. (Optional) If your data pipeline is streaming data but it's not showing up in your index, check the HEC dashboards in the Splunk Monitoring Console to make sure that your HEC endpoint is receiving and indexing data.
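
The events search above does not return metric data. If your pipeline sends records in the metrics schema, query the metrics index instead, for example with the mcatalog command. This is a minimal sketch; the index name is a placeholder for your own metrics index.

    | mcatalog values(metric_name) WHERE index="your-metrics-index"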