Group by attributes processor
The Group by Attributes processor is an OpenTelemetry Collector component that reassociates spans, log records, and metric data points with a resource that matches the specified attributes. As a result, all spans, log records, or metric data points with the same values for the specified attributes are grouped under the same resource.
The supported pipeline types are traces, metrics, and logs. See Process your data with pipelines for more information.
Get started
Follow these steps to configure and activate the component:
Deploy the Splunk Distribution of OpenTelemetry Collector to your host or container platform.
Configure the groupbyattrs processor as described in the next section.
Restart the Collector.
Sample configurations
To activate the Group by Attributes processor, add groupbyattrs to the processors section of your configuration file. Specify an array of attribute keys to use to "group" spans, log records, or metric data points together, as in the following example:
processors:
  groupbyattrs:
    keys:
      - foo
      - bar
The keys property describes which attribute keys are considered for grouping:
If the processed span, log record, or metric data point has at least one of the specified attribute keys, it is moved to a resource with the same values for these attributes. The resource is created if none exists with the same attributes.
If none of the specified attribute keys is present in the processed span, log record, or metric data point, it remains associated with the same resource, without any change.
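The two rules above can be illustrated with a minimal Python sketch. This is not the Collector's implementation: the function name group_by_attrs and the data layout (plain dicts of attributes) are hypothetical, and the sketch assumes that matched attributes are moved from the record to the resource, as the metrics example later on this page describes.

```python
def group_by_attrs(resources, keys):
    """Regroup records under resources according to the given attribute keys.

    Hypothetical model, not the Collector API:
    resources is a list of (resource_attrs, records) pairs, where
    resource_attrs and each record are dicts of attribute name -> hashable value.
    """
    grouped = {}  # resource identity -> (resource attrs, records)
    for resource_attrs, records in resources:
        for record in records:
            # Grouping attributes that are actually present on this record.
            matched = {k: record[k] for k in keys if k in record}
            # Target resource: original attributes, overridden by matched ones.
            # With no match, this is the original resource, unchanged.
            target = {**resource_attrs, **matched}
            # Assumption: matched attributes leave the record once they
            # are set on the resource.
            remaining = {k: v for k, v in record.items() if k not in matched}
            identity = frozenset(target.items())
            grouped.setdefault(identity, (target, []))[1].append(remaining)
    return list(grouped.values())


example = [({"source": "prom"}, [{"foo": "1", "id": "a"},
                                 {"foo": "1", "id": "b"},
                                 {"id": "c"}])]
for attrs, records in group_by_attrs(example, ["foo", "bar"]):
    print(attrs, records)
```

In this sketch the first two records end up under a new resource that carries foo="1", while the third record, which has neither foo nor bar, stays under the original resource.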
To complete the configuration, include the processor in any pipeline of the service section of your configuration file. For example:
service:
  pipelines:
    metrics:
      processors: [groupbyattrs]
    logs:
      processors: [groupbyattrs]
    traces:
      processors: [groupbyattrs]
See config.go for the config spec.
Typical use cases
Use the processor to perform the following actions:
Extract resources from "flat" data formats, such as Fluent Bit logs or Prometheus metrics.
Associate Prometheus metrics with a resource that describes the relevant host, based on a label present on all metrics.
Optimize data packaging by extracting common attributes.
Compact multiple records that share the same resource and InstrumentationLibrary attributes but are spread across multiple ResourceSpans, ResourceMetrics, or ResourceLogs into a single ResourceSpans, ResourceMetrics, or ResourceLogs, when an empty list of keys is provided. Such fragmentation happens, for example, when you use the groupbytrace processor, or when data comes in multiple requests. Compacted data takes less memory, is processed and serialized more efficiently, and requires fewer export requests.
Tip
Use the groupbyattrs processor together with the batch processor, as a consecutive step. Grouping records together under a matching resource and InstrumentationLibrary reduces the fragmentation of data.
Advanced configuration examples
Group metrics by host
Consider the following metrics, all originally associated with the same resource:
Resource {host.name="localhost",source="prom"}
  Metric "gauge-1" (GAUGE)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "gauge-1" (GAUGE) // Identical to previous Metric
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "mixed-type" (SUM)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
  Metric "dont-move" (Gauge)
    DataPoint {id="eth0"}
Use the following configuration to re-associate the metrics with either host-A or host-B, based on the value of the host.name attribute.
processors:
  groupbyattrs:
    keys:
      - host.name
The output of the processor is:
Resource {host.name="localhost",source="prom"}
  Metric "dont-move" (Gauge)
    DataPoint {id="eth0"}
Resource {host.name="host-A",source="prom"}
  Metric "gauge-1"
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (SUM)
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
Resource {host.name="host-B",source="prom"}
  Metric "gauge-1"
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {id="eth0"}
The groupbyattrs processor has accomplished the following:
The DataPoints for the gauge-1 metric were originally split under 2 metric instances, and have been merged in the output.
The DataPoints of the mixed-type gauge and mixed-type sum metrics have not been merged under the same metric, because their DataType is different.
The dont-move metric DataPoints don't have a host.name attribute, and therefore have remained under the original resource.
The new resources inherited the attributes from the original resource (source="prom"), and the specified attributes from the processed metrics (host.name="host-A" or host.name="host-B").
The specified grouping attributes that are set on the new resources are also removed from the metric DataPoints.
While not shown in this example, the processor also merges collections of records under matching InstrumentationLibrary.
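The outcome described above can be reproduced with a small Python sketch. It is a simplified model under stated assumptions, not the processor's code: the function name regroup_metrics and the tuple-based metric layout are hypothetical, and metrics are treated as mergeable when they share both a name and a data type.

```python
def regroup_metrics(resource_attrs, metrics, keys):
    """Move data points to per-key resources and merge identical metrics.

    Hypothetical model, not the Collector API:
    metrics is a list of (name, data_type, data_points) tuples, and each
    data point is a dict of attribute name -> hashable value.
    """
    out = {}  # resource identity -> (resource attrs, {(name, type): points})
    for name, data_type, points in metrics:
        for point in points:
            # Grouping attributes present on this data point.
            matched = {k: point[k] for k in keys if k in point}
            # New resource inherits the original attributes, then overrides.
            target = {**resource_attrs, **matched}
            # Grouping attributes are removed from the data point.
            rest = {k: v for k, v in point.items() if k not in matched}
            _, by_metric = out.setdefault(frozenset(target.items()), (target, {}))
            # Metrics merge only when both name and data type match.
            by_metric.setdefault((name, data_type), []).append(rest)
    return list(out.values())


a = {"host.name": "host-A", "id": "eth0"}
b = {"host.name": "host-B", "id": "eth0"}
metrics = [
    ("gauge-1", "GAUGE", [a, a, b]),
    ("gauge-1", "GAUGE", [a, a, b]),
    ("mixed-type", "GAUGE", [a, a, b]),
    ("mixed-type", "SUM", [a, a]),
    ("dont-move", "GAUGE", [{"id": "eth0"}]),
]
resource = {"host.name": "localhost", "source": "prom"}
for attrs, by_metric in regroup_metrics(resource, metrics, ["host.name"]):
    print(attrs, {key: len(points) for key, points in by_metric.items()})
```

Keying each metric by the (name, data type) pair is what keeps the mixed-type gauge and sum apart while the two gauge-1 instances collapse into one.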
Compact data
In some cases, data might come to the Collector in single requests, or become fragmented due to use of the groupbytrace processor. Even after batching there might be multiple duplicated ResourceSpans, ResourceMetrics, or ResourceLogs objects, which leads to additional memory consumption, increased processing costs, inefficient serialization, and a higher number of export requests.
To remedy this, use the groupbyattrs processor to compact the data by matching Resource and InstrumentationLibrary properties.
For example, consider the following input:
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=1, ...}
  InstrumentationLibrary {name="OtherLibrary"}
    Spans
      Span {span_id=2, ...}
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=3, ...}
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=4, ...}
Resource {host.name="otherhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=5, ...}
Use the following configuration to re-associate the spans with matching Resource and InstrumentationLibrary.
processors:
  batch:
  groupbyattrs:

service:
  pipelines:
    traces:
      processors: [batch, groupbyattrs]
      ...
The output of the processor is:
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=1, ...}
      Span {span_id=3, ...}
      Span {span_id=4, ...}
  InstrumentationLibrary {name="OtherLibrary"}
    Spans
      Span {span_id=2, ...}
Resource {host.name="otherhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=5, ...}
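The compaction shown above amounts to merging entries by resource and then by instrumentation library. The following Python sketch models that idea only. The function name compact and the nested-tuple layout are hypothetical, spans are reduced to their IDs, and an instrumentation library is identified by its name alone.

```python
def compact(resource_spans):
    """Merge spans that share a resource and an instrumentation library.

    Hypothetical model, not the Collector API:
    resource_spans is a list of (resource_attrs, libraries) pairs, where
    libraries is a list of (library_name, span_ids) pairs.
    """
    merged = {}  # resource identity -> (resource attrs, {library: span ids})
    for resource_attrs, libraries in resource_spans:
        identity = frozenset(resource_attrs.items())
        _, by_library = merged.setdefault(identity, (resource_attrs, {}))
        for library_name, span_ids in libraries:
            # Spans under the same resource and library are concatenated.
            by_library.setdefault(library_name, []).extend(span_ids)
    return list(merged.values())


fragmented = [
    ({"host.name": "localhost"}, [("MyLibrary", [1]), ("OtherLibrary", [2])]),
    ({"host.name": "localhost"}, [("MyLibrary", [3])]),
    ({"host.name": "localhost"}, [("MyLibrary", [4])]),
    ({"host.name": "otherhost"}, [("MyLibrary", [5])]),
]
for attrs, by_library in compact(fragmented):
    print(attrs, by_library)
```

Four input entries collapse into two, one per distinct resource, which mirrors the reduction in duplicated ResourceSpans objects described in this section.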
Settings
The following table shows the configuration options for the groupbyattrs processor:
Internal metrics
The groupbyattrs processor records the following internal metrics:
Metric | Description |
---|---|
 | The number of spans that had attributes grouped |
 | The number of spans that did not have attributes grouped |
 | Distribution of groups extracted for spans |
 | The number of logs that had attributes grouped |
 | The number of logs that did not have attributes grouped |
 | Distribution of groups extracted for logs |
 | The number of metrics that had attributes grouped |
 | The number of metrics that did not have attributes grouped |
 | Distribution of groups extracted for metrics |
Troubleshooting
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
Submit a case in the Splunk Support Portal.
Contact Splunk Support.
Available to prospective customers and free trial users
Ask a question and get answers through community support at Splunk Answers.
Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups in the Get Started with Splunk Community manual.