Send data to Microsoft Azure Event Hubs (Beta)
Use the Send to Microsoft Azure Event Hubs sink function to send data to Azure Event Hubs.
This is a beta function and not ready for production.
Prerequisites
Before you can use this function, you must do the following:
- Create a connection. See "Create a connection to Microsoft Azure Event Hubs" in the Connect to Data Sources and Destinations manual. When configuring this sink function, set the connection_id argument to the ID of that connection.
- Create the destination event hub in your Azure Event Hubs namespace. For information about creating an event hub, search for "Quickstart: Create an event hub using Azure portal" in the Azure Event Hubs documentation.
If you activate your pipeline before creating the event hub specified in the event_hub_name argument, the pipeline fails to send data to Azure Event Hubs and returns an error.
Function input schema
- collection<record<R>>
- This function takes in collections of records with schema R.
Required arguments
- connection_id
- Syntax: string
- Description: The Azure Event Hubs connection ID.
- Example in Canvas View: "576205b3-f6f5-4ab7-8ffc-a4089a95d0c4"
- event_hub_name
- Syntax: string
- Description: The name of the destination event hub.
- Example in Canvas View: My Event Hub
Make sure that the destination event hub exists in your Azure Event Hubs namespace. If you activate your pipeline before the specified event hub is created, the pipeline fails to send data to Azure Event Hubs and returns an error.
- value
- Syntax: expression<bytes>
- Description: The event body or payload to send to Azure Event Hubs.
- Example in Canvas View: to_bytes (cast(body, "string"))
Optional arguments
- key
- Syntax: expression<string>
- Description: The partition key to assign to the event. The default value is null.
- Example in Canvas View: "1"
- parameters
- Syntax: map<string, string>
- Description: Key-value pairs that specify how this sink function sends data to Azure Event Hubs. The following keys are supported:
  - batch_window: A number in string format. The amount of time, in milliseconds, to wait for data to accumulate in a batch before sending the batch to Azure Event Hubs. This batch window can range from 10 milliseconds to 10,000 milliseconds, inclusive. The default value is 1000. See Event batching on this page for more information.
  - batch_size: A number in string format. The amount of data, in bytes, to accumulate in a batch before sending the batch to Azure Event Hubs. This batch size can range from 1 byte to 100,000 bytes, inclusive. The default value is 20000. See Event batching on this page for more information.
  - unordered: A boolean indicating that strict ordering of events is not required. If the order of the events is not important, you can improve the throughput of this sink function by setting unordered to true and setting the partition key (the key argument) to an empty string or null. The default value is false.
- Example in Canvas View: unordered = false
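The documented ranges for the parameters keys can be checked before a map is pasted into a pipeline. The following Python sketch is not part of the product: check_parameters is a hypothetical helper that only mirrors the ranges and defaults listed above, and it assumes that unordered is passed as the string "true" or "false", as in the SPL2 example on this page.

```python
# Documented inclusive ranges for the numeric keys.
LIMITS = {
    "batch_window": (10, 10_000),   # milliseconds
    "batch_size": (1, 100_000),     # bytes
}

# Documented default values, kept as strings because the map is map<string, string>.
DEFAULTS = {"batch_window": "1000", "batch_size": "20000", "unordered": "false"}


def check_parameters(parameters):
    """Return the effective settings, raising ValueError on invalid values.

    Hypothetical helper: the sink function performs its own validation.
    This only mirrors the documented ranges as a local sanity check.
    """
    merged = {**DEFAULTS, **parameters}
    unknown = set(merged) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")
    for name, (low, high) in LIMITS.items():
        value = int(merged[name])  # values are numbers in string format
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    # Assumption: the boolean is written as the string "true" or "false".
    if merged["unordered"] not in ("true", "false"):
        raise ValueError("unordered must be 'true' or 'false'")
    return merged
```

For example, check_parameters({"batch_window": "5000"}) returns the map with the defaults filled in for batch_size and unordered, while check_parameters({"batch_window": "5"}) raises ValueError because 5 is below the 10 millisecond minimum.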
SPL2 example
When working in the SPL View, you can write the function by providing the arguments in this exact order.
...| into event_hubs("connection_id", "My Event Hub", to_bytes (cast(body, "string")), "1", {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});
Alternatively, you can use named arguments in any order and leave out optional arguments you don't want to declare. All unprovided arguments use their default values. The following example skips the key argument and only declares the parameters argument.
...| into event_hubs("connection_id", "My Event Hub", to_bytes (cast(body, "string")), parameters: {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});
If you want to use a mix of unnamed and named arguments in your functions, you need to list all unnamed arguments in the correct order before providing the named arguments.
Event batching
This sink function collects pipeline events into a batch, and then sends the batch to Azure Event Hubs when either of the thresholds specified by batch_window or batch_size is reached. However, because Azure Event Hubs batches incoming data according to a different internal logic, the batches that are sent out of your data pipeline may not correspond exactly with the batches that are stored at the destination event hub.
If a single event exceeds the maximum batch capacity, the job fails and is not restarted.
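The "whichever threshold is reached first" behavior can be illustrated with a small simulation. This Python sketch is only a model under stated assumptions, not the sink function's actual implementation. The function name batch_events is hypothetical. The window is assumed to start at the first event in each batch. A window flush is modeled as happening when the next event arrives, whereas a real sink would presumably use a timer. The failure case for oversized events is not modeled.

```python
def batch_events(events, batch_window_ms=1000, batch_size_bytes=20000):
    """Group (timestamp_ms, payload_bytes) pairs into batches.

    A batch is closed when either the time window has elapsed since the
    first event in the batch, or the accumulated payload size reaches
    batch_size_bytes. Assumes events are sorted by timestamp.
    """
    batches = []
    current = []
    current_bytes = 0
    window_start = None
    for timestamp_ms, payload in events:
        # Time threshold: close the batch if its window has elapsed.
        if current and timestamp_ms - window_start >= batch_window_ms:
            batches.append(current)
            current, current_bytes = [], 0
        if not current:
            window_start = timestamp_ms  # a new batch starts its own window
        current.append(payload)
        current_bytes += len(payload)
        # Size threshold: close the batch once enough bytes have accumulated.
        if current_bytes >= batch_size_bytes:
            batches.append(current)
            current, current_bytes = [], 0
    if current:
        batches.append(current)  # flush whatever remains at end of input
    return batches


events = [(0, b"aaaa"), (100, b"bbbb"), (200, b"cccc"), (5000, b"dddd")]
batches = batch_events(events, batch_window_ms=1000, batch_size_bytes=8)
```

With these inputs the simulation yields three batches. The first two events are flushed together when the 8-byte size threshold is reached, the third event is flushed alone because the window elapses before the fourth event arrives, and the fourth event is flushed at the end of input.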
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6