
Kubernetes attributes processor

The Kubernetes attributes processor is an OpenTelemetry Collector component that manages resource attributes using Kubernetes metadata. The processor automatically discovers resources, extracts metadata from them, and adds the metadata to relevant spans, metrics, and logs as resource attributes. The supported pipeline types are traces, metrics, and logs. See Process your data with pipelines for more information.

Caution

Don’t remove the Kubernetes attributes processor from your configuration. Default attributes extracted by the processor, such as k8s.pod.name, are required for Splunk Observability Cloud capabilities, such as Kubernetes navigator, Related Content, and accurate subscription usage.

Get started

The Helm chart for the Splunk Distribution of OpenTelemetry Collector already includes the Kubernetes attributes processor, which is activated by default for all deployment modes. See Install the Collector for Kubernetes using Helm.
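If you deploy through Helm, the processor works out of the box once the chart is installed. The following is a minimal values.yaml sketch for reference only; the field names follow the chart’s documented structure, so verify them against your chart version:

clusterName: my-cluster
splunkObservability:
  realm: us0
  accessToken: <access_token>

The chart then renders a Collector configuration that already contains the k8sattributes processor in every pipeline.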

To manually configure the Kubernetes attributes processor, follow these steps:

  1. Configure role-based access control

  2. Discovery filters

  3. Extract metadata

  4. Association lists

  5. Kubernetes labels and annotations

Sample configuration

The Splunk Distribution of OpenTelemetry Collector for Kubernetes adds the k8sattributes processor with the default configuration:

processors:
  k8sattributes:

You can include the processor in all pipelines of the service section of your configuration file:

service:
  pipelines:
    metrics:
      processors: [k8sattributes/demo]
    logs:
      processors: [k8sattributes/demo]
    traces:
      processors: [k8sattributes/demo]
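A pipeline also needs receivers and exporters to be complete. The following sketch shows the processor wired into a full traces pipeline; the otlp receiver and otlphttp exporter are illustrative choices, not requirements:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [k8sattributes/demo]
      exporters: [otlphttp]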

Configuration example

The following example contains a list of extracted metadata, Kubernetes annotations and labels, and an association list:

k8sattributes/demo:
  auth_type: "serviceAccount"
  passthrough: false
  filter:
    node_from_env_var: KUBE_NODE_NAME
  extract:
    metadata:
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.pod.start_time
    annotations:
      - key_regex: opentel.* # extracts keys and values of annotations matching regex `opentel.*`
        from: pod
    labels:
      - key_regex: opentel.* # extracts keys and values of labels matching regex `opentel.*`
        from: pod
  pod_association:
    - sources:
        - from: resource_attribute
          name: k8s.pod.ip
    - sources:
        - from: resource_attribute
          name: k8s.pod.uid
    - sources:
        - from: connection

Advanced use cases

Configure role-based access control

The Kubernetes attributes processor requires get, watch, and list permissions on both the pods and namespaces resources for all namespaces and pods included in the configured filters.

The following example shows how to give a ServiceAccount the necessary permissions for all pods and namespaces in a cluster. Replace <col_namespace> with the namespace where you’ve deployed the Collector:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: collector
  namespace: <col_namespace>

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "watch", "list"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: collector
    namespace: <col_namespace>
roleRef:
  kind: ClusterRole
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io

Discovery filters

You can use the Kubernetes attributes processor in Collectors deployed either as agents or as gateways, using DaemonSets or Deployments respectively. See Collector deployment modes for more information.

Agent configuration

In host monitoring (agent) mode, the processor detects IP addresses of pods sending spans, metrics, or logs to the agent and uses this information to extract metadata from pods.

When running the Collector in host monitoring (agent) mode, apply a discovery filter so that only pods from the same host the Collector is running on are discovered. Using a discovery filter also optimizes resource consumption on large clusters.

To automatically filter pods by the node the processor is running on, configure the Downward API to inject the node name as an environment variable. For example:

spec:
  containers:
  - env:
    - name: KUBE_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName

Then, set the filter.node_from_env_var field to the name of the environment variable that contains the name of the node. For example:

k8sattributes:
  filter:
    node_from_env_var: KUBE_NODE_NAME

Gateway configuration

The processor can’t resolve the IP address of the pods that emit telemetry data when running in data forwarding (gateway) mode. To receive the correct IP addresses in a Collector gateway, configure the agents to forward addresses.

To forward IP addresses to gateways, configure the Collectors in host monitoring (agent) mode to run in passthrough mode. This ensures that agents detect IP addresses and pass them as an attribute attached to all telemetry resources.

k8sattributes:
  passthrough: true

Then, configure the Collector gateways as usual. The processor automatically detects the IP addresses of spans, logs, and metrics sent by the agents or by other sources, and calls the Kubernetes API to extract metadata.
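For example, the two sides of this setup might look as follows. This is a minimal sketch: the agent only stamps pod IPs, while the gateway resolves them against the Kubernetes API:

# Agent (host monitoring) mode: tag resources with the pod IP only.
processors:
  k8sattributes:
    passthrough: true

# Gateway (data forwarding) mode: resolve the forwarded pod IPs
# against the Kubernetes API to extract metadata.
processors:
  k8sattributes:
    auth_type: "serviceAccount"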

Extract metadata

Use the metadata option to define what resource attributes you want to add. You can only use attribute names from existing metadata defined in pod_association.resource_attribute. The processor ignores empty or nonexistent values.

The following attributes are added by default:

  • k8s.namespace.name

  • k8s.pod.name

  • k8s.pod.uid

  • k8s.pod.start_time

  • k8s.deployment.name

  • k8s.node.name

You can change this list by adding a metadata section. For example:

k8sattributes:
  auth_type: "serviceAccount"
  passthrough: false
  filter:
    node_from_env_var: KUBE_NODE_NAME
  extract:
    metadata:
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.pod.start_time

Caution

Make sure that default attributes, such as k8s.pod.name, are always extracted, as they’re required for Splunk Observability Cloud capabilities, such as Kubernetes navigator, Related Content, and accurate subscription usage.

The following container-level attributes require additional attributes to identify a container in a pod:

  • Container spec attributes: Set only if k8s.container.name is available as a resource attribute.

    • container.image.name

    • container.image.tag

  • Container attributes: Set only if k8s.container.name is available as a resource attribute.

    • container.id: Must be available in the metadata.

Note

Set the k8s.container.restart_count resource attribute to retrieve the association with a particular container instance. If k8s.container.restart_count is not set, the last container instance is used.
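For example, the following extract section adds the container-level attributes on top of the defaults. This sketch assumes that incoming telemetry already carries k8s.container.name as a resource attribute:

k8sattributes:
  extract:
    metadata:
      # Default attributes required by Splunk Observability Cloud
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.pod.start_time
      # Container-level attributes, resolved through k8s.container.name
      - k8s.container.name
      - container.image.name
      - container.image.tag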

Association lists

Define rules for associating data passing through the processor with pod metadata using the pod_association field, which represents a list of associations executed in the specified order until the first match.

Each association is a list of sources. Sources contain rules. The processor executes all rules and produces a metadata cache key as a result. For example:

pod_association:
 # List of associations
  - sources:
      # List of sources. Each contains rules
      - from: resource_attribute
        name: k8s.pod.name
      - from: resource_attribute
        name: k8s.namespace.name

To apply an association, each source has to be successfully retrieved from a log, trace, or metric. If you don’t configure association rules, the processor associates resources using the connection’s address.

Each source rule consists of a pair of from and name statements, representing the rule type and attribute name respectively. You can define two types of from statements:

  • from: connection: Extracts the IP address attribute from the connection context, if available.

  • from: resource_attribute: Specifies the attribute name to search in the list of attributes.

The following example shows the two types of from source statements in pod association rules:

pod_association:
  - sources:
    - from: resource_attribute
      name: ip
  - sources:
    - from: resource_attribute
      name: k8s.pod.ip
  - sources:
    - from: resource_attribute
      name: host.name
  - sources:
    - from: connection
      name: ip

Kubernetes labels and annotations

The Kubernetes attributes processor can also set resource attributes from Kubernetes labels and annotations of pods and namespaces. You can configure this through the annotations and labels lists inside extract.

The processor extracts annotations and labels from pods and namespaces and adds them to spans, metrics, and logs. You can specify each item using the following parameters:

  • tag_name: Name used to tag telemetry.

  • key: Key used to extract the value.

  • from: Kubernetes object used to extract the value. The two possible values are pod and namespace. The default value is pod.

For example:

annotations:
# Extracts value of annotation from pods with key `annotation-one`
# and inserts it as a tag with key `a1`
  - tag_name: a1
    key: annotation-one
    from: pod
# Extracts value of annotation from namespaces with key `annotation-two`
# with regular expressions and inserts it as a tag with key `a2`
  - tag_name: a2
    key: annotation-two
    regex: field=(?P<value>.+)
    from: namespace

labels:
# Extracts value of label from namespaces with key `label1`
# and inserts it as a tag with key `l1`
  - tag_name: l1
    key: label1
    from: namespace
# Extracts value of label from pods with key `label2` with
#  regular expressions and inserts it as a tag with key `l2`
  - tag_name: l2
    key: label2
    regex: field=(?P<value>.+)
    from: pod

Settings

The following settings are available for the Kubernetes attributes processor:

  • auth_type (string; default: serviceAccount): How to authenticate to the K8s API server. This can be one of none (for no auth), serviceAccount (to use the standard service account token provided to the agent pod), or kubeConfig (to use credentials from ~/.kube/config).

  • passthrough (bool; default: false): Passthrough mode only annotates resources with the pod IP and doesn’t try to extract any other metadata. It doesn’t need access to the K8s cluster API. The agent or Collector must receive spans directly from services to be able to correctly detect the pod IPs.

  • extract (struct; see fields): The extract section allows specifying extraction rules to extract data from Kubernetes pod specs.

  • filter (struct; see fields): The filter section allows specifying filters to filter pods by labels, fields, namespaces, nodes, and so on.

  • pod_association (slice; see fields): The pod_association section allows defining rules for tagging spans, metrics, and logs with pod metadata.

  • exclude (struct; see fields): The exclude section allows defining names of pods that should be ignored while tagging.
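Putting the top-level sections together, a configuration skeleton looks as follows. All values are illustrative:

k8sattributes:
  auth_type: "serviceAccount"
  passthrough: false
  filter:
    node_from_env_var: KUBE_NODE_NAME
  extract:
    metadata:
      - k8s.pod.name
  pod_association:
    - sources:
        - from: connection
  exclude:
    pods:
      - name: <pod_name_to_ignore>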

Fields of extract

  • metadata (slice): Extracts pod or namespace metadata from a list of metadata fields. The field accepts a list of strings. The metadata fields supported right now are: k8s.pod.name, k8s.pod.uid, k8s.deployment.name, k8s.node.name, k8s.namespace.name, k8s.pod.start_time, k8s.replicaset.name, k8s.replicaset.uid, k8s.daemonset.name, k8s.daemonset.uid, k8s.job.name, k8s.job.uid, k8s.cronjob.name, k8s.statefulset.name, k8s.statefulset.uid, k8s.container.name, container.image.name, container.image.tag, and container.id. Specifying anything other than these values results in an error. By default, the following fields are extracted and added to spans, metrics, and logs as attributes:

    • k8s.pod.name
    • k8s.pod.uid
    • k8s.pod.start_time
    • k8s.namespace.name
    • k8s.node.name
    • k8s.deployment.name (if the pod is controlled by a deployment)
    • k8s.container.name (requires an additional attribute to be set: container.id)
    • container.image.name (requires one of the following additional attributes to be set: container.id or k8s.container.name)
    • container.image.tag (requires one of the following additional attributes to be set: container.id or k8s.container.name)

  • annotations (slice; see fields): Extracts data from pod annotations and records it as resource attributes. A list of FieldExtractConfig type; see the fields of annotations for more details.

  • labels (slice; see fields): Extracts data from pod labels and records it as resource attributes. A list of FieldExtractConfig type; see the fields of labels for more details.

Fields of annotations

  • tag_name (string): The name of the resource attribute added to logs, metrics, or spans. When not specified, a default tag name is used with the format k8s.pod.annotations.<annotation key> or k8s.pod.labels.<label key>. When key_regex is present, tag_name supports back references to captured groups. For example, if your pod spec contains the following labels:

    app.kubernetes.io/component: mysql
    app.kubernetes.io/version: 5.7.21

    and you'd like to add tags for all labels with prefix app.kubernetes.io/ and also trim the prefix, then you can specify the following extraction rules:

    extract:
      labels:
        - tag_name: $$1
          key_regex: kubernetes.io/(.*)

    This adds the component and version tags to the spans or metrics.

  • key (string): The annotation (or label) name. This must exactly match an annotation (or label) name.

  • key_regex (string): A regular expression used to extract keys that match the regex. Out of key and key_regex, only one option is expected to be configured at a time.

  • regex (string): An optional field used to extract a substring from a complex field value. The supplied regular expression must contain one named capturing group with the name "value". For example, if your pod spec contains the following annotation:

    kubernetes.io/change-cause: 2019-08-28T18:34:33Z APP_NAME=my-app GIT_SHA=58a1e39 CI_BUILD=4120

    and you'd like to extract the GIT_SHA and the CI_BUILD values as tags, then you must specify the following two extraction rules:

    extract:
      annotations:
        - tag_name: git.sha
          key: kubernetes.io/change-cause
          regex: GIT_SHA=(?P<value>\w+)
        - tag_name: ci.build
          key: kubernetes.io/change-cause
          regex: CI_BUILD=(?P<value>[\w]+)

    This adds the git.sha and ci.build resource attributes.

  • from (string): The source of the labels or annotations. Allowed values are "pod" and "namespace". The default is pod.

Fields of labels

  • tag_name (string): The name of the resource attribute added to logs, metrics, or spans. When not specified, a default tag name is used with the format k8s.pod.annotations.<annotation key> or k8s.pod.labels.<label key>. When key_regex is present, tag_name supports back references to captured groups. For example, if your pod spec contains the following labels:

    app.kubernetes.io/component: mysql
    app.kubernetes.io/version: 5.7.21

    and you'd like to add tags for all labels with prefix app.kubernetes.io/ and also trim the prefix, then you can specify the following extraction rules:

    extract:
      labels:
        - tag_name: $$1
          key_regex: kubernetes.io/(.*)

    This adds the component and version tags to the spans or metrics.

  • key (string): The annotation (or label) name. This must exactly match an annotation (or label) name.

  • key_regex (string): A regular expression used to extract keys that match the regex. Out of key and key_regex, only one option is expected to be configured at a time.

  • regex (string): An optional field used to extract a substring from a complex field value. The supplied regular expression must contain one named capturing group with the name "value". For example, if your pod spec contains the following annotation:

    kubernetes.io/change-cause: 2019-08-28T18:34:33Z APP_NAME=my-app GIT_SHA=58a1e39 CI_BUILD=4120

    and you'd like to extract the GIT_SHA and the CI_BUILD values as tags, then you must specify the following two extraction rules:

    extract:
      annotations:
        - tag_name: git.sha
          key: kubernetes.io/change-cause
          regex: GIT_SHA=(?P<value>\w+)
        - tag_name: ci.build
          key: kubernetes.io/change-cause
          regex: CI_BUILD=(?P<value>[\w]+)

    This adds the git.sha and ci.build resource attributes.

  • from (string): The source of the labels or annotations. Allowed values are "pod" and "namespace". The default is pod.

Fields of filter

  • node (string): Represents a Kubernetes node or host. If specified, any pods not running on the specified node are ignored by the tagger.

  • node_from_env_var (string): Extracts the node name from an environment variable. The value must be the name of the environment variable. This is useful when the node that an OTel agent runs on can't be predicted. In such cases, the Kubernetes Downward API can be used to add the node name to each pod as an environment variable, and the processor can then read this value and filter pods by it. For example, the node name can be passed to each agent with the Downward API as follows:

    env:
      - name: K8S_NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName

    Then, set node_from_env_var to K8S_NODE_NAME to filter all pods by the node that the agent is running on. For more information on the Downward API, see https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/

  • namespace (string): Filters all pods by the provided namespace. All other pods are ignored.

  • fields (slice; see fields): Filters pods by generic Kubernetes fields. Only the following operations are supported:

    • equals
    • not-equals

    See the fields of fields for more details.

  • labels (slice; see fields): Filters pods by generic Kubernetes pod labels. Only the following operations are supported:

    • equals
    • not-equals
    • exists
    • not-exists

    See the fields of labels for more details.
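For example, the following filter sketch combines the namespace, labels, and fields options. The keys and values are assumptions for illustration:

k8sattributes:
  filter:
    namespace: my-namespace
    labels:
      - key: app
        value: frontend
        op: equals
    fields:
      - key: status.phase
        value: Running
        op: equals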

Fields of fields

  • key (string): The key or name of the field or label that a filter can apply on.

  • value (string): The value associated with the key that the filter operation specified by the op field applies on.

  • op (string): The filter operation to apply on the given key-value pair. The following operations are supported: equals, not-equals, exists, does-not-exist.

Fields of labels

  • key (string): The key or name of the field or label that a filter can apply on.

  • value (string): The value associated with the key that the filter operation specified by the op field applies on.

  • op (string): The filter operation to apply on the given key-value pair. The following operations are supported: equals, not-equals, exists, does-not-exist.

Fields of pod_association

  • sources (slice; see fields): List of pod association sources which should be taken to identify the pod.

Fields of sources

  • from (string): The source of the association. Allowed values are "connection" and "resource_attribute".

  • name (string): The extracted key name. For example: ip, pod_uid, k8s.pod.ip.

Fields of exclude

  • pods (slice; see fields): A list of pod names to ignore while tagging.

Fields of pods

  • name (string): The name of the pod to exclude.
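For example, to keep the processor from tagging telemetry that originates from your own monitoring pods, exclude them by name. The pod name here is an assumption:

k8sattributes:
  exclude:
    pods:
      - name: my-agent-pod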

Metrics

The following resource attributes are available.

Resource Attributes

  • k8s.cluster.uid (string): Gives the cluster UID, identified with the kube-system namespace.
  • k8s.namespace.name (string): The name of the namespace that the pod is running in.
  • k8s.pod.name (string): The name of the Pod.
  • k8s.pod.uid (string): The UID of the Pod.
  • k8s.pod.hostname (string): The hostname of the Pod.
  • k8s.pod.start_time (string): The start time of the Pod.
  • k8s.pod.ip (string): The IP address of the Pod.
  • k8s.deployment.name (string): The name of the Deployment.
  • k8s.deployment.uid (string): The UID of the Deployment.
  • k8s.replicaset.name (string): The name of the ReplicaSet.
  • k8s.replicaset.uid (string): The UID of the ReplicaSet.
  • k8s.daemonset.name (string): The name of the DaemonSet.
  • k8s.daemonset.uid (string): The UID of the DaemonSet.
  • k8s.statefulset.name (string): The name of the StatefulSet.
  • k8s.statefulset.uid (string): The UID of the StatefulSet.
  • k8s.container.name (string): The name of the Container in a Pod template. Requires container.id.
  • k8s.job.name (string): The name of the Job.
  • k8s.job.uid (string): The UID of the Job.
  • k8s.cronjob.name (string): The name of the CronJob.
  • k8s.node.name (string): The name of the Node.
  • k8s.node.uid (string): The UID of the Node.
  • container.id (string): Container ID. Usually a UUID, as for example used to identify Docker containers. The UUID might be abbreviated. Requires k8s.container.restart_count.
  • container.image.name (string): Name of the image the container was built on. Requires container.id or k8s.container.name.
  • container.image.repo_digests (slice): Repo digests of the container image as provided by the container runtime.
  • container.image.tag (string): Container image tag. Defaults to "latest" if not provided, unless the digest is also in the image path. Requires container.id or k8s.container.name.

Activate or deactivate specific metrics

You can activate or deactivate specific metrics by setting the enabled field in the metrics section for each metric. For example:

receivers:
  samplereceiver:
    metrics:
      metric-one:
        enabled: true
      metric-two:
        enabled: false

The following is an example of host metrics receiver configuration with activated metrics:

receivers:
  hostmetrics:
    scrapers:
      process:
        metrics:
          process.cpu.utilization:
            enabled: true

Note

Deactivated metrics aren’t sent to Splunk Observability Cloud.

Billing

  • If you’re in an MTS-based subscription, all metrics count towards metrics usage.

  • If you’re in a host-based plan, metrics listed as active (Active: Yes) in this document are considered default and are included free of charge.

Learn more at Infrastructure Monitoring subscription usage (Host and metric plans).

Known limitations

The Kubernetes attributes processor doesn’t work well in the following cases.

Host networking mode

The processor can’t identify pods running in the host network mode. Enriching telemetry data generated by such pods only works if the association rule isn’t based on the IP address attribute.
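If your host-network pods stamp their own identifying attributes on telemetry, an association list that matches on pod name and namespace, as in the earlier example, still works. A sketch:

k8sattributes:
  pod_association:
    - sources:
        - from: resource_attribute
          name: k8s.pod.name
        - from: resource_attribute
          name: k8s.namespace.name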

Sidecar

The processor can’t detect containers from the same pods when running as a sidecar. Instead, use the Kubernetes Downward API to inject environment variables into the pods and use their values as tags.
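A minimal sketch of that workaround: inject pod metadata with the Downward API and surface it through the standard OTEL_RESOURCE_ATTRIBUTES environment variable, which OpenTelemetry SDKs read at startup. The container name is an assumption:

spec:
  containers:
    - name: app
      env:
        - name: K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: OTEL_RESOURCE_ATTRIBUTES
          # Kubernetes expands $(K8S_POD_NAME) because it is declared above
          value: k8s.pod.name=$(K8S_POD_NAME)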

Troubleshooting

If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.


  • Ask a question and get answers through community support at Splunk Answers.

  • Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups in the Get Started with Splunk Community manual.

This page was last updated on Dec 12, 2024.