Splunk® Data Stream Processor

Use the Data Stream Processor


Create a pipeline with multiple data sources

You can read from multiple data sources in a single pipeline and union their data into one stream.

  1. From the Build Pipeline page, select a data source.
  2. From the Canvas view of your pipeline, click the + icon, and add a union function to your pipeline.
  3. In the Union function configuration panel, click Add a new function.
  4. Add a second source function to your pipeline.
  5. (Optional) Two data streams must have the same schema before you can union them. If your data streams don't have the same schema, use streaming functions, such as Eval and Fields, to normalize your schemas, as shown in the sketch after these steps.
  6. Continue building your pipeline by clicking the + icon to the immediate right of the union function.
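
For example, if one of your streams carries its payload in a field named message while the other stream uses a bytes-typed body field, an Eval expression along the following lines can bring the first stream in line with the second. This is a minimal sketch: the field name message is hypothetical, and the expression uses the same DSL functions (get, cast, to_bytes, as) as the worked example later in this topic.

    as(to_bytes(cast(get("message"), "string")), "body");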

Create a pipeline with two data sources: Kafka and Read from Splunk Firehose

In this example, you create a pipeline with two data sources, Kafka and Read from Splunk Firehose, and union the two data streams by normalizing them to fit the expected Kafka schema.

The following screenshot shows two data streams from two different data sources being unioned into one data stream in a pipeline.
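
The normalization in the steps below maps DSP event fields to the Kafka record schema roughly as follows. This summary restates the Eval expressions shown in step 6:

    body         -> value   (cast to a string, then converted to bytes)
    source_type  -> topic
    time()       -> key     (the current time, converted to bytes)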


Prerequisites

A Kafka connection that the Read from Kafka source function can use. You provide the connection and topic names when you configure that function.

Steps

  1. From the Build Pipeline page, select the Read from Splunk Firehose data source.
  2. From the Canvas view of your pipeline, add a union function to your pipeline.
  3. In the Union function configuration panel, click Add new function.
  4. Select the Read from Kafka source function, and provide your connection and topic names.
  5. Normalize the schemas to match. Hover over the circle in between the Read from Splunk Firehose and Union functions, click the + icon, and add an Eval function.
  6. In the Eval function, type the following Streams DSL. This DSL converts the DSP event schema to the expected Kafka schema. A worked example of the converted record appears after these steps.
    as(to_bytes(cast(get("body"), "string")),"value");
    as(get("source_type"),"topic");
    as(to_bytes(time()), "key");
    
  7. Hover over the circle in between the Eval and Union functions, click the + icon, and add a Fields function.
  8. Because the Kafka record schema contains the value, topic, and key fields, keep only those fields and drop the rest of the fields in your record: in the Fields function, type value, click + Add, type topic, click + Add, and then type key.
  9. Now, let's normalize the other data stream. Hover over the circle in between the Read from Kafka and Union functions, click the + icon, and add another Fields function.
  10. In the Fields function, type value, click + Add, type topic, click + Add, and then type key.
  11. Validate your pipeline.
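
As a rough illustration of what steps 5 through 8 produce, consider a DSP event with the following hypothetical values. The values are made up for illustration; only the field names come from the Eval expressions and Fields configuration above.

    Input DSP event (simplified):   body = "error: disk full", source_type = "syslog"
    Output Kafka record:            value = to_bytes("error: disk full"), topic = "syslog", key = to_bytes(<current time>)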

You now have a pipeline that reads from two data sources, Kafka and Splunk Firehose, and merges the data from both sources into one data stream.
