Batch processor

The batch processor is an OpenTelemetry Collector component that groups spans, metrics, or logs into batches based on size or time. Batching reduces the number of submission requests that exporters make, helps the data compress better, and helps regulate the flow of telemetry from one or more receivers in a pipeline.

To ensure that batching happens after data sampling and filtering, add the batch processor after the memory_limiter processor and after any sampling or filtering processors in the pipeline.
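The ordering described above could be sketched as follows for a traces pipeline. This is a minimal illustration, not the default configuration: the otlp receiver, the otlphttp exporter and its endpoint, the probabilistic_sampler processor, and all numeric values are assumptions chosen for the example.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  memory_limiter:
    check_interval: 2s      # illustrative value
    limit_mib: 512          # illustrative value
  probabilistic_sampler:
    sampling_percentage: 25 # illustrative value
  batch:

exporters:
  otlphttp:
    endpoint: https://collector.example.com:4318  # hypothetical endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Processors run in list order: memory protection first,
      # sampling next, batching last.
      processors: [memory_limiter, probabilistic_sampler, batch]
      exporters: [otlphttp]
```

Because processors run in the order listed, placing batch last means that only the data that survives sampling is grouped into batches.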

Get started

Note

This component is included in the default configuration of the Splunk Distribution of the OpenTelemetry Collector when deploying in host monitoring (agent) or data forwarding (gateway) mode. See Collector deployment modes for more information.

For details about the default configuration, see Configure the Collector for Kubernetes with Helm, Collector for Linux default configuration, or Collector for Windows default configuration. You can customize your configuration at any time as explained in this document.

Follow these steps to configure and activate the component:

1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform:

   - Install the Collector for Linux with the installer script
   - Install the Collector for Windows with the installer script
   - Install the Collector for Kubernetes using Helm

2. Configure the processor as described in this document.
3. Restart the Collector.

Sample configuration

The Splunk Distribution of the OpenTelemetry Collector adds the batch processor with the default configuration:

processors:
  batch:

The processor is included in all pipelines of the service section of your configuration file:

service:
  pipelines:
    metrics:
      processors: [batch]
    logs:
      processors: [batch]
    traces:
      processors: [batch]

Basic batching example

The following example configures the batch processor to send a batch once 5,000 spans, metric data points, or log records have been collected. The timeout setting acts as a fallback: if the size threshold isn't reached within 15 seconds, the processor sends whatever it has collected so far.

processors:
  batch/custom:
    send_batch_size: 5000
    timeout: 15s
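A named instance such as batch/custom only takes effect when a pipeline references it by that exact name. The following is a minimal sketch, assuming a traces pipeline whose otlp receiver, memory_limiter processor, and otlphttp exporter are defined elsewhere in the same file (those three names are assumptions for illustration):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # Reference the named instance, not the plain "batch" entry.
      processors: [memory_limiter, batch/custom]
      exporters: [otlphttp]
```

Using a named instance lets one pipeline use custom thresholds while other pipelines keep the default batch settings.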

Batching by metadata

Starting from version 0.78 of the OpenTelemetry Collector, you can batch telemetry based on metadata. For example:

processors:
  batch:
    # batch data by the tenant_id metadata key
    metadata_keys:
    - tenant_id

    # limit to 10 batcher processes before raising errors
    metadata_cardinality_limit: 10

To use metadata as batching criteria, add the include_metadata: true setting to your receiver's configuration, so that the batch processor can use the available metadata keys.
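As a sketch of how the two pieces fit together, the following fragment turns on include_metadata for an otlp receiver and batches by the tenant_id key. It assumes that clients send a tenant_id value in their request metadata (for example, as a gRPC or HTTP header); the receiver choice is an assumption for illustration.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        # Pass client request metadata into the pipeline.
        include_metadata: true
      http:
        include_metadata: true

processors:
  batch:
    # One batcher per distinct tenant_id value.
    metadata_keys:
      - tenant_id
    metadata_cardinality_limit: 10
```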

Caution

Batching by metadata might increase memory consumption, as each metadata combination triggers the allocation of a new background task in the Collector. The maximum number of distinct combinations is defined using the metadata_cardinality_limit setting, which defaults to 1000.

Settings

The following table shows the configuration options for the batch processor:
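For quick reference, the settings can also be written out explicitly in YAML. The following sketch uses a hypothetical instance name, and the defaults noted in the comments are those of the upstream OpenTelemetry Collector batch processor; verify them against the Collector version you run.

```yaml
processors:
  batch/explicit:  # hypothetical instance name
    # Number of spans, data points, or log records that triggers a send,
    # regardless of the timeout (upstream default).
    send_batch_size: 8192
    # Time after which a batch is sent regardless of its size (upstream default).
    timeout: 200ms
    # Upper limit on batch size; 0 means no upper limit (upstream default).
    # When set, it must be greater than or equal to send_batch_size.
    send_batch_max_size: 0
    # Empty list: no metadata-based batching (upstream default).
    metadata_keys: []
    # Maximum number of distinct metadata combinations (default: 1000).
    metadata_cardinality_limit: 1000
```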

Troubleshooting

If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.

Available to Splunk Observability Cloud customers

Submit a case in the Splunk Support Portal.
Contact Splunk Support.

Available to prospective customers and free trial users

Ask a question and get answers through community support at Splunk Answers.
Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups in the Get Started with Splunk Community manual.

This page was last updated on Feb 11, 2025.
