Connecting Kafka to your DSP pipeline as a data source
When creating a data pipeline in Splunk Data Stream Processor, you can connect to an Apache Kafka or Confluent Kafka broker and use it as a data source. You can get data from Kafka into a pipeline, transform the data as needed, and then send the transformed data out from the pipeline to a destination of your choosing.
If you have a Universal license, you can also use Kafka as a data destination. See Connecting Kafka to your DSP pipeline as a data destination for information about this use case. See Licensing for the Splunk Data Stream Processor for information about licensing.
DSP supports three types of connections for accessing Kafka brokers:
|Kafka connection type||Description|
|SASL-authenticated||Username and password authentication is used. You can choose to protect your credentials using SCRAM (Salted Challenge Response Authentication Mechanism) or leave them in plaintext. The connection is encrypted using SSL.|
|SSL-authenticated||Two-way SSL authentication is used, so that DSP and the Kafka broker authenticate each other using the SSL protocol. Additionally, the connection is encrypted using SSL.|
|Unauthenticated||No authentication takes place between DSP and the Kafka broker. The connection is not encrypted.|
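For readers more familiar with Kafka client configuration, the three connection types can be roughly compared to standard Kafka client security settings. The sketch below is an illustration only: the property names and values are standard Kafka client settings, but the mapping to DSP connection types is an assumption based on the descriptions in the table, and this topic does not state which SCRAM variant DSP uses. The CONNECTION_TYPES dictionary and the summarize helper are hypothetical names.

```python
# Rough comparison of DSP connection types with standard Kafka client
# security settings. The mapping is an assumption drawn from the table
# above, not a description of how DSP is implemented.
CONNECTION_TYPES = {
    "SASL-authenticated": {
        "security.protocol": "SASL_SSL",
        # Standard Kafka mechanisms: PLAIN sends plaintext credentials,
        # the SCRAM variants protect them. Which variant DSP uses is
        # not stated in this topic.
        "sasl.mechanism": ("PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"),
        "authenticated": True,
        "encrypted": True,
    },
    "SSL-authenticated": {
        "security.protocol": "SSL",
        "sasl.mechanism": (),
        "authenticated": True,
        "encrypted": True,
    },
    "Unauthenticated": {
        "security.protocol": "PLAINTEXT",
        "sasl.mechanism": (),
        "authenticated": False,
        "encrypted": False,
    },
}


def summarize(connection_type: str) -> str:
    """Return a one-line summary of a connection type's security posture."""
    settings = CONNECTION_TYPES[connection_type]
    auth = "authenticated" if settings["authenticated"] else "not authenticated"
    enc = "encrypted" if settings["encrypted"] else "not encrypted"
    return f"{connection_type}: {settings['security.protocol']}, {auth}, {enc}"


for name in CONNECTION_TYPES:
    print(summarize(name))
```

Only the unauthenticated type leaves traffic unencrypted, which is worth keeping in mind when choosing a connection type for production data.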
To connect to Kafka as a data source, you must complete the following tasks:
- Create a connection that allows DSP to access your Kafka data.
  - To create a SASL-authenticated connection, see Create a SASL-authenticated DSP connection to Kafka.
  - To create an SSL-authenticated connection, see Create an SSL-authenticated DSP connection to Kafka.
  - To create an unauthenticated connection, see Create an unauthenticated DSP connection to Kafka.
- Create a pipeline that starts with the Kafka source function. See the Building a pipeline chapter in the Use the Data Stream Processor manual for instructions on how to build a data pipeline.
- Configure the Kafka source function to use your Kafka connection. See Get data from Kafka in the Function Reference manual.
- (Optional) Convert the value field in the Kafka records from bytes to a more commonly supported data type such as strings. This conversion makes the field human-readable during data preview and compatible with a wider range of streaming functions. See Deserialize and preview data from Kafka in DSP.
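To illustrate why the optional conversion step matters, the following sketch shows in plain Python what changes when a bytes payload is decoded to a string. It is not DSP syntax; in DSP the conversion is done with the streaming functions described in the linked topic. The deserialize_value helper, the sample record, and its field contents are hypothetical, and the payload is assumed to be UTF-8 encoded JSON.

```python
import json


def deserialize_value(record: dict, encoding: str = "utf-8") -> dict:
    """Return a copy of the record with a bytes 'value' decoded to a string."""
    decoded = dict(record)
    raw = record["value"]
    if isinstance(raw, (bytes, bytearray)):
        decoded["value"] = bytes(raw).decode(encoding)
    return decoded


# Hypothetical record: the payload arrives as raw bytes.
sample = {"topic": "web-logs", "value": b'{"host": "web-01", "status": 200}'}

readable = deserialize_value(sample)
print(type(sample["value"]).__name__)    # bytes
print(type(readable["value"]).__name__)  # str

# Once the value is a string, string-oriented processing such as JSON
# parsing can work on it.
event = json.loads(readable["value"])
print(event["host"], event["status"])
```

The original record is left untouched, so the raw bytes remain available if a different encoding has to be tried.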
When you activate the pipeline, the source function starts collecting data from Kafka.
If your data fails to get into DSP, check the connection settings. Make sure that the broker is correct and, if you are using an authenticated connection, that the credentials and certificates are correct. DSP does not check whether the credentials you enter are valid.
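As a first troubleshooting step outside of DSP, you can confirm that the broker address is reachable at the network level. The sketch below uses only the Python standard library. The parse_broker and broker_reachable helpers are hypothetical names, and port 9092 is assumed as the default only because it is the conventional Kafka port. A successful TCP connection does not prove that credentials or certificates are valid.

```python
import socket
from typing import Tuple


def parse_broker(address: str, default_port: int = 9092) -> Tuple[str, int]:
    """Split a 'host:port' broker address, falling back to a default port."""
    host, sep, port = address.rpartition(":")
    if not sep:
        # No colon in the address, so treat the whole string as the host.
        return address, default_port
    return host, int(port)


def broker_reachable(address: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the broker address succeeds."""
    host, port = parse_broker(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, broker_reachable("kafka.example.com:9093") returns False if the host name does not resolve or the port is blocked, which points to a network or address problem rather than a credential problem.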
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.3.0, 1.3.1, 1.4.0