Send data to Microsoft Azure Event Hubs (Beta)

Use the Send to Microsoft Azure Event Hubs sink function to send data to Azure Event Hubs.

This is a beta function and is not ready for production use.

Prerequisites

Before you can use this function, you must do the following:

Create a connection. See Create a connection to Microsoft Azure Event Hubs in the Connect to Data Sources and Destinations manual. When configuring this sink function, set the connection_id argument to the ID of that connection.
Create the destination event hub in your Azure Event Hubs namespace. For information about creating an event hub, search for "Quickstart: Create an event hub using Azure portal" in the Azure Event Hubs documentation.
If you activate your pipeline before creating the event hub specified in the event_hub_name argument, the pipeline fails to send data to Azure Event Hubs and returns an error.

Function input schema

collection<record<R>>: This function takes in collections of records with schema R.

Required arguments

connection_id

Syntax: string

Description: The Azure Event Hubs connection ID.

Example in Canvas View: "576205b3-f6f5-4ab7-8ffc-a4089a95d0c4"

event_hub_name

Syntax: string

Description: The name of the destination event hub. Make sure that the destination event hub exists in your Azure Event Hubs namespace. If you activate your pipeline before the specified event hub is created, the pipeline fails to send data to Azure Event Hubs and returns an error.

Example in Canvas View: My Event Hub

value

Syntax: expression<bytes>

Description: The event body or payload to send to Azure Event Hubs.

Example in Canvas View: to_bytes(cast(body, "string"))

Optional arguments

key

Syntax: expression<string>

Description: The partition key to assign to the event. The default value is null.

Example in Canvas View: "1"

parameters

Syntax: map<string, string>

Description: Key-value pairs that specify how this sink function sends data to Azure Event Hubs. The following keys are supported:

batch_window: A number in string format. The maximum amount of time, in milliseconds, to wait for events to accumulate in a batch before sending the batch to Azure Event Hubs. The value can range from 10 to 10,000 milliseconds, inclusive. The default value is 1000. See Event batching on this page for more information.
batch_size: A number in string format. The maximum amount of data, in bytes, to accumulate in a batch before sending the batch to Azure Event Hubs. The value can range from 1 to 100,000 bytes, inclusive. The default value is 20000. See Event batching on this page for more information.
unordered: A Boolean indicating that strict ordering of events is not required. If the order of the events is not important, you can improve the throughput of this sink function by setting unordered to true and setting the key argument (the partition key) to an empty string or null. The default value is false.

Example in Canvas View: unordered = false
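
The parameters map takes string values only, so numbers and Booleans must be rendered as strings before they are passed in. The following Python sketch shows one way to check values against the ranges documented above and build such a map. The helper name build_parameters and the constant names are hypothetical and are not part of the product. The sketch only illustrates the documented ranges and defaults.

```python
from typing import Dict

# Inclusive ranges taken from the parameter descriptions above.
BATCH_WINDOW_RANGE_MS = (10, 10_000)
BATCH_SIZE_RANGE_BYTES = (1, 100_000)


def build_parameters(
    batch_window_ms: int = 1000,    # documented default for batch_window
    batch_size_bytes: int = 20000,  # documented default for batch_size
    unordered: bool = False,        # documented default for unordered
) -> Dict[str, str]:
    """Hypothetical helper: validate the values and render them as strings."""
    low, high = BATCH_WINDOW_RANGE_MS
    if not low <= batch_window_ms <= high:
        raise ValueError(f"batch_window must be between {low} and {high} ms")

    low, high = BATCH_SIZE_RANGE_BYTES
    if not low <= batch_size_bytes <= high:
        raise ValueError(f"batch_size must be between {low} and {high} bytes")

    # The map type is map<string, string>, so every value becomes a string.
    return {
        "batch_window": str(batch_window_ms),
        "batch_size": str(batch_size_bytes),
        "unordered": "true" if unordered else "false",
    }


# Builds the same map that the SPL2 example on this page uses.
print(build_parameters(batch_window_ms=5000, batch_size_bytes=10000))
```

Validating the values before activating a pipeline surfaces out-of-range settings early. How the function itself reacts to out-of-range values is not described on this page.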

SPL2 example

When working in the SPL View, you can write the function by providing the arguments in this exact order.

...| into event_hubs("connection_id", "My Event Hub", to_bytes(cast(body, "string")), "1", {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});

Alternatively, you can use named arguments in any order and leave out optional arguments you don't want to declare. All unprovided arguments use their default values. The following example skips the key argument and only declares the parameters argument.

...| into event_hubs("connection_id", "My Event Hub", to_bytes(cast(body, "string")), parameters: {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});

If you want to use a mix of unnamed and named arguments in your functions, you need to list all unnamed arguments in the correct order before providing the named arguments.

Event batching

This sink function collects pipeline events into a batch, and then sends the batch to Azure Event Hubs when either of the thresholds specified by batch_window or batch_size is reached. However, because Azure Event Hubs batches incoming data according to a different internal logic, the batches that are sent out of your data pipeline may not correspond exactly with the batches that are stored at the destination event hub.

If an event exceeds the maximum batch capacity, the job fails and is not restarted.
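
The batching behavior described above can be sketched in Python as follows. This is a simplified model, not the function's actual implementation. The names BatchSketch and EventTooLargeError are hypothetical, the time threshold is only checked when an event arrives (a real sink would presumably also use a timer), and the sketch assumes that "maximum batch capacity" means the configured batch_size.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class EventTooLargeError(Exception):
    """Hypothetical error: a single event cannot fit in one batch."""


@dataclass
class BatchSketch:
    batch_window_ms: int = 1000     # documented default for batch_window
    batch_size_bytes: int = 20000   # documented default for batch_size
    _events: List[bytes] = field(default_factory=list)
    _bytes: int = 0
    _opened_at_ms: Optional[int] = None

    def add(self, event: bytes, now_ms: int) -> Optional[List[bytes]]:
        """Buffer one event; return the batch if either threshold is reached."""
        if len(event) > self.batch_size_bytes:
            # Assumption: models the non-restarting job failure described above.
            raise EventTooLargeError(f"event of {len(event)} bytes exceeds batch capacity")

        if self._opened_at_ms is None:
            self._opened_at_ms = now_ms  # the first event opens the window

        self._events.append(event)
        self._bytes += len(event)

        window_reached = now_ms - self._opened_at_ms >= self.batch_window_ms
        size_reached = self._bytes >= self.batch_size_bytes
        if window_reached or size_reached:
            return self._flush()
        return None

    def _flush(self) -> List[bytes]:
        """Hand back the buffered events and start an empty batch."""
        batch = self._events
        self._events = []
        self._bytes = 0
        self._opened_at_ms = None
        return batch


# Three small events; the third one brings the batch to the 10-byte threshold.
batcher = BatchSketch(batch_window_ms=1000, batch_size_bytes=10)
for ts, payload in [(0, b"abcd"), (100, b"efgh"), (200, b"ij")]:
    batch = batcher.add(payload, now_ms=ts)
    if batch is not None:
        print(f"send {len(batch)} events at t={ts} ms")
```

Whichever threshold is reached first triggers the send, so a small batch_window favors low latency and a large batch_size favors fewer, larger sends.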
