Splunk® Data Stream Processor

Function Reference

Send data to Microsoft Azure Event Hubs (Beta)

Send data to Microsoft Azure Event Hubs. If you use this function in an on-premises DSP environment, you need the DSP Universal license.

This is a beta function and is not ready for production use.

Before you can use this function, you must do the following:

  • Create an Azure Event Hubs connection. See Create a DSP connection to Microsoft Azure Event Hubs. When configuring this sink function, use the ID of that connection for the connection_id argument.
  • Create the destination event hub in your Azure Event Hubs namespace. For information about creating an event hub, search for "Quickstart: Create an event hub using Azure portal" in the Azure Event Hubs documentation.

    If you activate your pipeline before creating the event hub specified in the event_hub_name argument, the pipeline fails to send data to Azure Event Hubs and returns an error.

Function input schema

collection<record<R>>
This function takes in collections of records with schema R.

Required arguments

connection_id
Syntax: string
Description: The Azure Event Hubs connection ID.
Example: "576205b3-f6f5-4ab7-8ffc-a4089a95d0c4"
event_hub_name
Syntax: string
Description: The name of the destination event hub.
Example: "My Event Hub"

Make sure that the destination event hub exists in your Azure Event Hubs namespace. If you activate your pipeline before the specified event hub is created, the pipeline fails to send data to Azure Event Hubs and returns an error.

value
Syntax: expression<bytes>
Description: The event body or payload to send to Azure Event Hubs.
Example: to_bytes(cast(body, "string"))

Optional arguments

key
Syntax: expression<string>
Description: The partition key to assign to the event. The default value is null.
Example: "1"
parameters
Syntax: map<string, string>
Description: Key-value pairs that specify how this sink function sends data to Azure Event Hubs. The following keys are supported:
  • batch_window: A number in string format. The maximum time, in milliseconds, to let events accumulate in a batch before the batch is sent to Azure Event Hubs. The batch window can range from 10 milliseconds to 10,000 milliseconds, inclusive. The default value is 1000 milliseconds. See Event batching on this page for more information.
  • batch_size: A number in string format. The maximum amount of data, in bytes, to let accumulate in a batch before the batch is sent to Azure Event Hubs. The batch size can range from 1 byte to 100,000 bytes, inclusive. The default value is 20000 bytes. See Event batching on this page for more information.
  • unordered: A boolean in string format indicating whether strict ordering of events can be relaxed. If the order of the events is not important, you can improve the throughput of this sink function by setting unordered to true and setting the key argument to an empty string or null. The default value is false.
Example: {"batch_window": "5000", "batch_size": "10000", "unordered": "false"}
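
The documented ranges and defaults for the parameters map can be checked before a pipeline is activated. The following Python sketch is illustrative only: validate_parameters is a hypothetical helper, not part of DSP, and it assumes that the unordered value is spelled as the lowercase strings "true" or "false", which this page does not state explicitly.

```python
def validate_parameters(parameters):
    """Check a parameters map against the ranges documented on this page.

    Hypothetical helper (not a DSP API). Returns the effective settings
    with the documented defaults filled in for any missing key.
    """
    allowed = {"batch_window", "batch_size", "unordered"}
    unknown = set(parameters) - allowed
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")

    # The argument type is map<string, string>, so every value must be a string.
    for key, value in parameters.items():
        if not isinstance(value, str):
            raise TypeError(f"{key} must be a string, got {type(value).__name__}")

    # batch_window: 10 to 10,000 milliseconds inclusive, default 1000.
    batch_window = int(parameters.get("batch_window", "1000"))
    if not 10 <= batch_window <= 10_000:
        raise ValueError("batch_window must be between 10 and 10000 milliseconds")

    # batch_size: 1 to 100,000 bytes inclusive, default 20000.
    batch_size = int(parameters.get("batch_size", "20000"))
    if not 1 <= batch_size <= 100_000:
        raise ValueError("batch_size must be between 1 and 100000 bytes")

    # Assumption: booleans are written as lowercase "true" or "false".
    unordered = parameters.get("unordered", "false")
    if unordered not in ("true", "false"):
        raise ValueError('unordered must be "true" or "false"')

    return {
        "batch_window": batch_window,
        "batch_size": batch_size,
        "unordered": unordered == "true",
    }
```

Calling validate_parameters({}) returns the documented defaults, and a value such as {"batch_window": "5"} raises a ValueError because it falls below the 10 millisecond minimum.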

SPL2 example

You can write the function by providing the arguments positionally, in this exact order.

...| into event_hubs("connection_id", "My Event Hub", to_bytes(cast(body, "string")), "1", {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});

Alternatively, you can use named arguments in any order and leave out optional arguments you don't want to declare. All unprovided arguments use their default values. See SPL2 syntax for more details. The following example skips the key argument and only declares the parameters argument.

...| into event_hubs("connection_id", "My Event Hub", to_bytes(cast(body, "string")), parameters: {"batch_window": "5000", "batch_size": "10000", "unordered": "false"});

Event batching

This sink function collects pipeline events into a batch, and then sends the batch to Azure Event Hubs when either of the thresholds specified by batch_window or batch_size is reached. However, because Azure Event Hubs batches incoming data according to a different internal logic, the batches that are sent out of your data pipeline may not correspond exactly with the batches that are stored at the destination event hub.

If a single event exceeds the maximum batch capacity, the pipeline job fails and is not restarted.
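
The two-threshold behavior described in this section can be illustrated with a small simulation. The following Python sketch is not the DSP implementation, which this page does not document. The names batch_events and BatchOverflowError are hypothetical, the window is assumed to start when the first event enters an empty batch, and the "maximum batch capacity" is assumed to be the batch_size value.

```python
class BatchOverflowError(Exception):
    """Raised when one event is larger than the whole batch capacity."""


def batch_events(events, batch_window_ms=1000, batch_size_bytes=20000):
    """Group (arrival_ms, payload) pairs into batches.

    A batch is closed when either the time window has elapsed or adding
    the next payload would exceed the size limit. Events must be sorted
    by arrival time. Returns a list of batches (lists of payloads).
    """
    batches = []
    current = []          # payloads in the batch being built
    current_bytes = 0     # total size of the payloads in `current`
    window_start = None   # arrival time of the first event in `current`

    def flush():
        nonlocal current, current_bytes, window_start
        if current:
            batches.append(current)
        current, current_bytes, window_start = [], 0, None

    for arrival_ms, payload in events:
        # Assumption: an oversized event is fatal, mirroring the failure
        # described in the text above.
        if len(payload) > batch_size_bytes:
            raise BatchOverflowError(
                f"event of {len(payload)} bytes exceeds {batch_size_bytes}"
            )
        # Time threshold: the window elapsed before this event arrived.
        if window_start is not None and arrival_ms - window_start >= batch_window_ms:
            flush()
        # Size threshold: this event would not fit in the current batch.
        if current_bytes + len(payload) > batch_size_bytes:
            flush()
        if window_start is None:
            window_start = arrival_ms
        current.append(payload)
        current_bytes += len(payload)

    flush()  # send whatever is left at the end of the stream
    return batches
```

In this sketch, a smaller batch_size_bytes produces more, smaller batches, while a longer batch_window_ms lets slow streams fill a batch before it is sent. Either threshold closes the batch, whichever is reached first.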

Last modified on 10 November, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0

