Create a pipeline with multiple data sources
You can send data from multiple data sources in a single pipeline.
- From the Build Pipeline page, select a data source.
- From the Canvas view of your pipeline, click the + icon, and add a union function to your pipeline.
- In the Union function configuration panel, click Add a new function.
- Add a second source function to your pipeline.
- (Optional) Two data streams must have the same schema before they can be unioned. If your data streams don't have the same schema, you can use the Normalize streaming function to match your schemas.
- Continue building your pipeline by clicking the + icon to the immediate right of the Union function. A rough sketch of the resulting pipeline shape follows these steps.
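To help picture what these steps produce, the following is a minimal sketch of a two-source union pipeline written in a Streams DSL style. The function names and signatures shown here (read_splunk_firehose, read_kafka, union) are illustrative placeholders, not the exact Data Stream Processor API.

```
// Minimal sketch only: function names and signatures are assumed
// placeholders, not the exact Streams DSL API.
source_a = read_splunk_firehose();                    // first data source
source_b = read_kafka("my-connection", "my-topic");   // second data source

// Both streams must share the same schema before they can be unioned.
merged = union(source_a, source_b);

// Continue building to the right of the union, for example by
// adding more functions or writing the merged stream to a sink.
```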
Create a pipeline with two data sources: Kafka and Read from Splunk Firehose
In this example, you create a pipeline with two data sources, Kafka and Read from Splunk Firehose, and union the two data streams by normalizing them to fit the expected Kafka schema.
[Screenshot: two data streams from two different data sources unioned together into one data stream in a pipeline.]
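For reference, the Kafka record schema that both streams are normalized to in this example contains three fields. The types and mappings listed here are inferred from the Eval statements shown in the steps that follow:
- value (bytes): the event body, cast to a string and then to bytes
- topic (string): the event's source_type
- key (bytes): the event time, converted to bytes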
Prerequisites
Steps
- From the Build Pipeline page, select the Read from Splunk Firehose data source.
- From the Canvas view of your pipeline, add a Union function to your pipeline.
- In the Union function configuration panel, click Add new function.
- Select the Read from Kafka source function, and provide your connection and topic names.
- Normalize the schemas to match. Hover over the circle in between the Read from Splunk Firehose and Union functions, click the + icon, and add an Eval function.
- In the Eval function, type the following Streams DSL, which converts the DSP event schema to the expected Kafka schema:
  as(to_bytes(cast(get("body"), "string")), "value");
  as(get("source_type"), "topic");
  as(to_bytes(time()), "key");
- Hover over the circle in between the Eval and Union functions, click the + icon, and add a Fields function.
- Because the Kafka record schema contains the value, topic, and key fields, drop the other fields in your record: in the Fields function, type value, click + Add, type topic, click + Add, and then type key.
- Now, normalize the other data stream. Hover over the circle in between the Read from Kafka and Union functions, click the + icon, and add another Fields function.
- In the Fields function, type value, click + Add, type topic, click + Add, and then type key.
- Validate your pipeline.
You now have a pipeline that reads from two data sources, Kafka and Read from Splunk Firehose, and merges the data from both sources into one data stream.
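To summarize how the pieces fit together, here is a rough sketch of this example pipeline written in a Streams DSL style. Only the three as(...) statements are taken from the steps above; every other function name and signature (read_splunk_firehose, read_kafka, eval, fields, union) is an illustrative assumption, not the exact Data Stream Processor API.

```
// Illustrative sketch only. The as(...) statements come from this example;
// the remaining function names and signatures are assumed placeholders.

// Stream 1: Read from Splunk Firehose, normalized to the Kafka record schema.
firehose = read_splunk_firehose();
firehose = eval(firehose,
    as(to_bytes(cast(get("body"), "string")), "value"),   // body -> value (bytes)
    as(get("source_type"), "topic"),                      // source_type -> topic
    as(to_bytes(time()), "key"));                         // event time -> key (bytes)
firehose = fields(firehose, "value", "topic", "key");     // drop all other fields

// Stream 2: Read from Kafka, keeping only the matching fields.
kafka = read_kafka("my-connection", "my-topic");
kafka = fields(kafka, "value", "topic", "key");

// Union the two normalized streams into a single data stream.
merged = union(firehose, kafka);
// Continue the pipeline from here, for example by sending merged to a sink.
```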