How the destination for Edge Processor works

To send data from an Edge Processor to a storage location such as an index or an Amazon S3 bucket, you must define the location as a destination in the Edge Processor service. Each destination contains the connection information that an Edge Processor needs to send data to a given location.

The steps for adding a destination to the Edge Processor service vary depending on whether the destination is part of the Splunk Cloud Platform deployment that's connected to your cloud tenant:

When you connect your tenant to a Splunk Cloud Platform deployment as part of the first-time setup of the Edge Processor solution, all the indexers and indexes that the service account can access become available as destinations. For information about working with destinations that are associated with this connection, see Send data from Edge Processors to the Splunk Cloud Platform deployment connected to your tenant.
For destinations that are not part of the connected deployment, such as Amazon S3 buckets or indexes from other Splunk platform deployments, you must use the Destinations page in the Edge Processor service to add and configure them. See the following pages for more information:

You can confirm the destinations that are available by checking the Destinations page, and view additional details about a given destination by selecting it on the Destinations page.

What happens to my data if a destination becomes unavailable?

Edge Processors currently provide no data delivery guarantees. However, to help prevent data loss, an Edge Processor instance holds data in a queue if it is unable to send data to a destination or if it receives more data than it can send. If the queue fills up before the destination becomes available again, the Edge Processor applies back pressure to incoming data until the queued data can be sent to the destination, and it continues trying to add data to the queue unless it needs to restart or shut down. If the Edge Processor instance shuts down or restarts while data is being sent, that data cannot be written to a persistent queue, which can cause data loss.
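The queueing and back pressure behavior described above can be pictured with a small toy model. This is only an illustrative sketch, not Edge Processor code: the class name ToySendingQueue and its offer and drain methods are hypothetical, and the oldest-first drain order is an assumption made for the illustration.

```python
import queue


class ToySendingQueue:
    """Toy model of a bounded sending queue (hypothetical, not Edge Processor code)."""

    def __init__(self, max_batches):
        # The queue holds at most max_batches batches of events.
        self._queue = queue.Queue(maxsize=max_batches)

    def offer(self, batch):
        """Try to enqueue a batch.

        Returns False when the queue is full, which signals the caller
        to hold the batch and retry later (back pressure).
        """
        try:
            self._queue.put_nowait(batch)
            return True
        except queue.Full:
            return False

    def drain(self, send):
        """Send every queued batch, oldest first, and return how many were sent.

        Call this once the destination is available again.
        """
        sent = 0
        while True:
            try:
                batch = self._queue.get_nowait()
            except queue.Empty:
                return sent
            send(batch)
            sent += 1
```

In this model, a caller that receives False from offer keeps the batch and tries again later, which is one simple way to think about back pressure. Because the queue lives only in memory here, a restart would lose its contents.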

Queued data is stored on the hard drive of the Edge Processor host. By default, the queue is configured to hold up to 10,000 batches of events. Depending on which receiver you use, each batch can contain anywhere from 1 to 128 events. The amount of data in the queue and how quickly the queue fills up vary depending on the rate at which the Edge Processor is receiving data.
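As a rough illustration of these numbers, the following sketch estimates how many events the default queue can hold and how long it might take to fill. The function names are hypothetical, and the estimate assumes a constant incoming event rate, a fixed number of events per batch, and a destination that accepts no data at all during the outage.

```python
DEFAULT_QUEUE_BATCHES = 10_000  # default queue size stated in this topic
MIN_EVENTS_PER_BATCH = 1        # smallest batch size stated in this topic
MAX_EVENTS_PER_BATCH = 128      # largest batch size stated in this topic


def queue_capacity_events(batches=DEFAULT_QUEUE_BATCHES):
    """Return the (minimum, maximum) number of events a full queue can hold."""
    return batches * MIN_EVENTS_PER_BATCH, batches * MAX_EVENTS_PER_BATCH


def seconds_until_full(events_per_second, events_per_batch,
                       batches=DEFAULT_QUEUE_BATCHES):
    """Estimate the seconds needed to fill an empty queue.

    Assumes a constant incoming rate and that nothing is sent
    to the destination while the queue fills.
    """
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    if not MIN_EVENTS_PER_BATCH <= events_per_batch <= MAX_EVENTS_PER_BATCH:
        raise ValueError("events_per_batch must be between 1 and 128")
    return batches * events_per_batch / events_per_second


print(queue_capacity_events())        # (10000, 1280000)
print(seconds_until_full(5000, 128))  # 256.0
```

Under these assumptions, the default queue holds between 10,000 and 1,280,000 events, and an Edge Processor receiving 5,000 events per second in full 128-event batches would fill it in a little over 4 minutes.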

If your pipeline uses either the branch or route command and the queue for one of your destinations is full, then data might be delivered more than once to the other healthy destinations, causing data duplication.

Once the destination is available again, the Edge Processor sends the queued events to the destination. It might take some time for newer data to be processed by an Edge Processor because the data in the queue is prioritized. If you want to adjust the size of the queue, see the solution instructions in An Edge Processor fails to send data, and logs a "Dropping data because sending_queue is full" error.
