Splunk® Data Stream Processor

Function Reference


DSP 1.2.0 is impacted by the CVE-2021-44228 and CVE-2021-45046 security vulnerabilities from Apache Log4j. To fix these vulnerabilities, you must upgrade to DSP 1.2.4. See Upgrade the Splunk Data Stream Processor to 1.2.4 for upgrade instructions.

On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor will reach their end of support date. See the Splunk Software Support Policy for details.

Send data to Microsoft Azure Event Hubs (Beta)

Use the Send to Microsoft Azure Event Hubs sink function to send data to Azure Event Hubs.

This is a beta function and not ready for production.

Prerequisites

Before you can use this function, you must do the following:

  • Create a connection. See Create a connection to Microsoft Azure Event Hubs in the Connect to Data Sources and Destinations with DSP manual. When configuring this sink function, set the connection_id argument to the ID of that connection.
  • Create the destination event hub in your Azure Event Hubs namespace. For information about creating an event hub, search for "Quickstart: Create an event hub using Azure portal" in the Azure Event Hubs documentation.

    If you activate your pipeline before creating the event hub specified in the event_hub_name argument, the pipeline fails to send data to Azure Event Hubs and returns an error.

Function input schema

collection<record<R>>
This function takes in collections of records with schema R.

Required arguments

connection_id
Syntax: string
Description: The Azure Event Hubs connection ID.
Example in Canvas View: "576205b3-f6f5-4ab7-8ffc-a4089a95d0c4"
event_hub_name
Syntax: string
Description: The name of the destination event hub.
Example in Canvas View: My Event Hub

Make sure that the destination event hub exists in your Azure Event Hubs namespace. If you activate your pipeline before the specified event hub is created, the pipeline fails to send data to Azure Event Hubs and returns an error.

value
Syntax: expression<bytes>
Description: The event body or payload to send to Azure Event Hubs.
Example in Canvas View: to_bytes(cast(body, "string"))

Optional arguments

key
Syntax: expression<string>
Description: The partition key to assign to the event. The default value is null.
Example in Canvas View: "1"
parameters
Syntax: map<string, string>
Description: Key-value pairs that specify how this sink function sends data to Azure Event Hubs. The following keys are supported:
  • batch_window: A number in string format. The maximum amount of time, in milliseconds, to wait for data to accumulate in a batch before sending the batch to Azure Event Hubs. This batch window can range from 10 milliseconds to 10,000 milliseconds, inclusive. The default value is 1000 milliseconds. See Event batching on this page for more information.
  • batch_size: A number in string format. The maximum amount of data, in bytes, to accumulate in a batch before sending the batch to Azure Event Hubs. This batch size can range from 1 byte to 100,000 bytes, inclusive. The default value is 20000 bytes. See Event batching on this page for more information.
  • unordered: A boolean indicating that strict ordering of events is not required. If the order of the events is not important, you can improve the throughput of this sink function by setting unordered to true and setting the key argument (the partition key) to an empty string or null. The default value is false.
Example in Canvas View: unordered = false
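
The documented ranges for the parameters map can be checked before a pipeline is activated. The following Python sketch is not part of DSP. The validate_parameters helper is hypothetical, and it assumes that numeric values must be whole numbers written as strings and that unordered accepts only the lowercase strings "true" and "false".

```python
# Hypothetical pre-flight check; not a DSP API.
# Inclusive ranges taken from the parameter descriptions above.
RANGES = {
    "batch_window": (10, 10_000),   # milliseconds
    "batch_size": (1, 100_000),     # bytes
}


def validate_parameters(parameters):
    """Return a list of problems found in a parameters map (empty if none)."""
    problems = []
    for name, value in parameters.items():
        # The argument type is map<string, string>, so every value is a string.
        if not isinstance(value, str):
            problems.append(f"{name}: value must be a string")
            continue
        if name in RANGES:
            low, high = RANGES[name]
            is_number = value.isascii() and value.isdigit()
            if not is_number or not low <= int(value) <= high:
                problems.append(f"{name}: expected a number from {low} to {high}")
        elif name == "unordered":
            # Assumption: only lowercase "true" and "false" are accepted.
            if value not in ("true", "false"):
                problems.append('unordered: expected "true" or "false"')
        else:
            problems.append(f"{name}: unsupported key")
    return problems
```

For example, validate_parameters({"batch_window": "5000", "batch_size": "10000", "unordered": "false"}) returns an empty list, while a batch_window of "5" is reported as out of range.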

SPL2 example

When working in the SPL View, you can write the function by providing the arguments in this exact order.

...| into event_hubs("connection_id", "My Event Hub", to_bytes(cast(body, "string")), "1", {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});

Alternatively, you can use named arguments in any order and leave out optional arguments you don't want to declare. All unprovided arguments use their default values. The following example skips the key argument and only declares the parameters argument.

...| into event_hubs("connection_id", "My Event Hub", to_bytes(cast(body, "string")), parameters: {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});

If you want to use a mix of unnamed and named arguments in your functions, you need to list all unnamed arguments in the correct order before providing the named arguments.

Event batching

This sink function collects pipeline events into a batch, and then sends the batch to Azure Event Hubs when either of the thresholds specified by batch_window or batch_size is reached. However, because Azure Event Hubs batches incoming data according to a different internal logic, the batches that are sent out of your data pipeline may not correspond exactly with the batches that are stored at the destination event hub.
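
The "whichever threshold is reached first" behavior described above can be illustrated with a small Python model. This is a simplified sketch, not the DSP implementation. The Batcher class is hypothetical, it checks the thresholds only when an event arrives (a real sink would presumably also use a timer), and it assumes an event is added to the batch before the thresholds are evaluated.

```python
class Batcher:
    """Toy model of batching on a time window or a size limit."""

    def __init__(self, batch_window_ms=1000, batch_size_bytes=20000):
        # Defaults mirror the documented defaults for batch_window and batch_size.
        self.batch_window_ms = batch_window_ms
        self.batch_size_bytes = batch_size_bytes
        self.events = []
        self.size = 0
        self.opened_at = None  # arrival time of the first event in the batch

    def add(self, payload, now_ms):
        """Add one event; return the completed batch, or None if still open."""
        if self.opened_at is None:
            self.opened_at = now_ms
        self.events.append(payload)
        self.size += len(payload)
        window_reached = now_ms - self.opened_at >= self.batch_window_ms
        size_reached = self.size >= self.batch_size_bytes
        if window_reached or size_reached:
            return self.flush()
        return None

    def flush(self):
        """Return the current batch and start a new, empty one."""
        batch = self.events
        self.events, self.size, self.opened_at = [], 0, None
        return batch
```

With Batcher(batch_window_ms=1000, batch_size_bytes=10), three events of 4, 4, and 2 bytes arriving within 200 milliseconds are flushed together because the size threshold is reached first. Two 1-byte events that arrive 1000 milliseconds apart are flushed because the window threshold is reached first.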

If an event exceeds the maximum batch capacity, the job fails and is not restarted.
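
Because an oversized event stops the job, it can be useful to find such events before they reach the sink. The Python sketch below shows the idea. It is an illustration only: the split_oversized helper is hypothetical, and it assumes that the maximum batch capacity equals the configured batch_size in bytes, which this page does not state explicitly.

```python
def split_oversized(payloads, capacity_bytes=20000):
    """Separate payloads that fit within the capacity from those that do not.

    Assumption: capacity_bytes stands in for the maximum batch capacity,
    taken here to be the batch_size setting (default 20000 bytes).
    """
    sendable, oversized = [], []
    for payload in payloads:
        if len(payload) > capacity_bytes:
            oversized.append(payload)
        else:
            sendable.append(payload)
    return sendable, oversized
```

In a pipeline, the equivalent step would be a filter placed upstream of the sink function so that oversized events are dropped or routed elsewhere.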

Last modified on 14 April, 2021

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.4.2, 1.4.3

