Connecting your DSP pipeline to a Splunk index

You can send data from the Splunk Data Stream Processor (DSP) to a Splunk Enterprise or Splunk Cloud Platform environment using the Splunk HTTP Event Collector (HEC). DSP uses the /services/collector/ HEC endpoint to send data to a Splunk index.

The HEC endpoint accepts JSON data in two different schemas: the event schema and the metric schema.

Some sources (such as Splunk DSP Firehose) have a compatible schema out of the box, while other sources (such as Kafka) must be transformed via DSP functions to produce a compatible record.
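As a rough illustration of the two schemas, the sketch below builds one event payload and one metric payload as Python dictionaries and serializes them to JSON bytes. All field values (host, source, index names, metric name) are made up, and the helper to_hec_bytes is hypothetical. The key layout follows the HEC event and single-metric formats as commonly documented, so check it against the HEC documentation for your Splunk version.

```python
import json

# Hypothetical example values; only the key layout matters here.
event_payload = {
    "time": 1700000000,
    "host": "web-01",
    "source": "dsp-example",
    "sourcetype": "access_combined",
    "index": "events_idx",
    "event": "GET /index.html 200",
}

# Single-metric form: the literal string "metric" in "event",
# with the metric name and value carried inside "fields".
metric_payload = {
    "time": 1700000000,
    "host": "web-01",
    "index": "metrics_idx",
    "event": "metric",
    "fields": {"metric_name": "cpu.usage", "_value": 42.5},
}


def to_hec_bytes(payload: dict) -> bytes:
    """Serialize one HEC payload to compact UTF-8 JSON bytes (hypothetical helper)."""
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


event_bytes = to_hec_bytes(event_payload)
metric_bytes = to_hec_bytes(metric_payload)
```

A record from a source such as Kafka would need to be reshaped into one of these layouts before it reaches a Splunk index sink function.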

To send data to a HEC endpoint from DSP, you must send records that conform to the event or metric schema into a Splunk index sink function. Several sink functions are available, depending on the configuration of your DSP environment. Use the following table to determine which Splunk index sink function is best for your use case. The table also includes the SPL2 representation of a Splunk DSP Firehose to Splunk index pipeline for each function.

Function name Description SPL2 Example

Send to a Splunk Index with Batching

This is the recommended sink function for sending data from DSP to a Splunk index. It takes DSP event or metric records as input and performs the common workflow of dropping the attributes field, turning records into JSON payloads that follow the event or metric schema, and batching the bytes of those payloads for better throughput. This function also adds out-of-the-box support for index-based routing with batched data.

| from splunk_firehose()
| into splunk_enterprise_indexes(
    "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    cast(map_get(attributes, "index"), "string"),
    "events_idx",
    {"hec-enable-ack": "false", "hec-token-validation": "true"},
    "2MB",
    5000
  );

Send to a Splunk Index

This is a lower-level function: you pass in a HEC JSON payload of type bytes, and the function sends it to HEC. The bytes are presumed to be a JSON payload in an appropriate schema. By comparison, the Send to a Splunk Index with Batching function does three things:

Converts the DSP records to Splunk event JSON and/or metric JSON.
Batches records for throughput.
Writes the HEC JSON to the Splunk platform.

In contrast, the Send to a Splunk Index function performs only the last step. It therefore gives you more flexibility and composability at the cost of convenience. For instance, you can use batch_records as the batching mechanism instead of batch_bytes, but then you can specify an index only per payload rather than per event or per metric.

| from splunk_firehose()
| to_splunk_json index=cast(map_get(attributes, "index"), "string")
| batch_bytes bytes=to_bytes(json) size="2MB" millis=5000
| into splunk_enterprise(
    "b5c57cbd-1470-4639-9938-deb3509cbbc8",
    "events_idx",
    bytes,
    {"hec-enable-ack": "false", "hec-token-validation": "true"});
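To make the per-event versus per-payload distinction concrete, here is a minimal Python sketch of size-based batching. It is not DSP's implementation of batch_bytes: the function batch_by_bytes, the index names, and the 80-byte limit are all hypothetical, and the time-based flush (the millis argument) is left out. It assumes HEC accepts several JSON objects concatenated in one request body, so each serialized event can keep its own index field inside a batch.

```python
import json
from typing import Iterable, Iterator, List


def batch_by_bytes(payloads: Iterable[bytes], max_bytes: int) -> Iterator[bytes]:
    """Concatenate serialized payloads into batches of at most max_bytes.

    A single payload larger than max_bytes is emitted on its own.
    The time-based flush is omitted to keep the sketch short.
    """
    current: List[bytes] = []
    size = 0
    for payload in payloads:
        # Flush the running batch before it would exceed the limit.
        if current and size + len(payload) > max_bytes:
            yield b"".join(current)
            current, size = [], 0
        current.append(payload)
        size += len(payload)
    if current:
        yield b"".join(current)


# Each record carries its own index, so routing survives batching.
records = [
    {"index": "events_idx", "event": "a"},
    {"index": "audit_idx", "event": "b"},
    {"index": "events_idx", "event": "c"},
]
serialized = [json.dumps(r).encode("utf-8") for r in records]
batches = list(batch_by_bytes(serialized, max_bytes=80))
```

Because the index is serialized into every event before batching, one batch can mix events destined for different indexes. Batching whole records first and serializing afterward would leave only one place to set the index for the entire payload.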

Send to a Splunk Index (Default for Environment)

Use this function to send data to the default, pre-configured Splunk environment associated with your DSP installation using the default HEC token. This function acts like splunk_enterprise, except that you don't need to explicitly pass in the bytes payload. The payload is extracted automatically from the incoming data stream.

| from splunk_firehose()
| to_splunk_json index=cast(map_get(attributes, "index"), "string")
| batch_bytes bytes=to_bytes(json) size="2MB" millis=5000
| into index("","events_idx");

By default, Splunk HEC endpoints are reachable via SSL and exposed over HTTPS. To securely send data to a HEC endpoint exposed over SSL, confirm with your DSP administrator that the proper environment variables have been set. See Configure the Data Stream Processor to send data to a self-signed Splunk Enterprise instance. If you are using Splunk Enterprise, you can disable SSL on your HEC endpoint by going to Data Inputs > HTTP Event Collector and clicking Global Settings.
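When troubleshooting connectivity outside of DSP, it can help to see what an HTTPS request to HEC looks like. The following Python sketch only builds the request and a TLS context; it does not send anything. The host name, token, and helper build_hec_request are placeholders, port 8088 is assumed to be the HEC port in use, and the commented-out CA path is hypothetical.

```python
import json
import ssl
import urllib.request


def build_hec_request(base_url: str, token: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request to the HEC collector endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector",
        data=body,
        headers={
            # HEC expects the token in this "Splunk <token>" form.
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The default context verifies the server certificate and host name.
tls_context = ssl.create_default_context()
# For a self-signed instance, trust its CA explicitly (hypothetical path):
# tls_context.load_verify_locations("/path/to/ca.pem")

request = build_hec_request(
    "https://splunk.example.com:8088",       # placeholder host
    "00000000-0000-0000-0000-000000000000",  # placeholder token
    json.dumps({"event": "hello"}).encode("utf-8"),
)

# To actually send the request:
# urllib.request.urlopen(request, context=tls_context, timeout=10)
```

Keeping certificate verification on and trusting the self-signed CA explicitly is generally a safer choice than disabling SSL on the endpoint.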

Example Workflow: Send data from Splunk DSP Firehose to a Splunk Enterprise index

If you are receiving streaming data from Splunk DSP Firehose, follow these steps to send data to a Splunk Enterprise index.

1. Create a DSP connection to a Splunk index.
2. From the Build Pipeline tab, select the Splunk DSP Firehose data source. The Splunk DSP Firehose source function reads streaming data from the Ingest, Collect, HEC, and Forwarders services.
3. (Optional) If your data pipeline is receiving data from the universal forwarder, you must apply additional transformations to your data. See Process data from a universal forwarder in DSP.
4. Click + to add any other data transformations that you want to your data pipeline.
5. End your pipeline with the Send to a Splunk Index with Batching sink function.
6. Click Start Preview to verify that your pipeline is sending data.
7. Save and activate your pipeline. View the metrics displayed on the functions to confirm that data is flowing through the pipeline.
8. After you see data flowing through your activated pipeline, navigate to the Splunk platform.
9. From the Search & Reporting app in the Splunk platform, search for your data:
   index="your-index-name"
10. (Optional) If data is streaming through your pipeline but not showing up in your index, check the HEC dashboards in the Splunk Monitoring Console to make sure that your HEC endpoint is receiving and indexing data.
