
Advanced configuration for Kubernetes

See the following advanced configuration options for the Collector for Kubernetes.

For basic Helm chart configuration, see Configure Helm for Kubernetes. For log configuration, refer to Configure logs for Kubernetes.

Note

The values.yaml file lists all supported configurable parameters for the Helm chart, along with a detailed explanation of each parameter. Review it to understand how to configure this chart.

The Helm chart can also be configured to support different use cases, such as trace sampling and sending data through a proxy server. See Examples of chart configuration for more information.

Override the default configuration

You can override the default OpenTelemetry agent configuration to use your own configuration. To do this, include a custom configuration using the agent.config parameter in the values.yaml file. For example:

agent:
  enabled: true

  # Metric collection from k8s control plane components.
  controlPlaneMetrics:
    apiserver:
      enabled: true
    controllerManager:
      enabled: true
    coredns:
      enabled: false
    proxy:
      enabled: true
    scheduler:
      enabled: false

This custom configuration is merged into the default agent configuration.

Caution

After the files are merged, you must fully redefine some parts of the configuration, for example service, pipelines, logs, and processors.

Override a control plane configuration

If any of the control plane metric receivers are activated under the agent.controlPlaneMetrics configuration section, then the Helm chart will configure the Collector to use the activated receivers to collect metrics from the control plane.

To collect control plane metrics, the Helm chart configures the Collector on each node to use the receiver creator to instantiate control plane receivers at runtime. The receiver creator has a set of discovery rules that determine which control plane receivers to create. The default discovery rules can vary depending on the Kubernetes distribution and version. See Receiver creator receiver for more information.

If your control plane is using non-standard specifications, then you can provide a custom configuration to allow the Collector to successfully connect to it.

The Collector relies on pod-level network access to collect metrics from the control plane pods. Since most cloud Kubernetes-as-a-service distributions don't expose the control plane pods to the end user, collecting metrics from these distributions is not supported.

Availability and configuration instructions

The following distributions are supported:

  • Kubernetes 1.22 (kops created)

  • OpenShift version 4.9

The following distributions are not supported:

  • AKS

  • EKS

  • EKS/Fargate

  • GKE

  • GKE/Autopilot

See the agent template for the default configurations for the control plane receivers.

Refer to the following documentation for information on the configuration options and supported metrics for each control plane receiver:

Known issue

There is a known limitation for the Kubernetes proxy control plane receiver. When using a Kubernetes cluster created via kops, a network connectivity issue prevents proxy metrics from being collected. The limitation can be addressed by updating the kubeProxy metric bind address in the kops cluster specification:

  1. Set kubeProxy.metricsBindAddress: 0.0.0.0 in the kops cluster specification.

  2. Run kops update cluster {cluster_name} and kops rolling-update cluster {cluster_name} to deploy the change.
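
The change in step 1 might look like the following fragment of the kops cluster specification. This is a minimal sketch: it assumes you open the specification with kops edit cluster {cluster_name} and shows only the kubeProxy field named above, with every other field left unchanged.

```yaml
# Fragment of the kops cluster specification (all other fields omitted).
spec:
  kubeProxy:
    # Bind the kube-proxy metrics endpoint to all interfaces so that
    # the Collector can reach it over the pod network.
    metricsBindAddress: 0.0.0.0
```

Binding to 0.0.0.0 makes the metrics endpoint reachable on every interface of the node, so check that this fits your network policy before rolling out the update.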

Using custom configurations for non-standard control plane components

You can override the default configuration values used to connect to the control plane. If your control plane uses nonstandard ports or custom TLS settings, you need to override the default configurations. The following example shows how to connect to a nonstandard API server that uses port 3443 for metrics and custom TLS certs stored in the /etc/myapiserver/ directory.

agent:
  config:
    receivers:
      receiver_creator:
        receivers:
          # Template for overriding the discovery rule and configuration.
          # smartagent/{control_plane_receiver}:
          #   rule: {rule_value}
          #   config:
          #     {config_value}
          smartagent/kubernetes-apiserver:
            rule: type == "port" && port == 3443 && pod.labels["k8s-app"] == "kube-apiserver"
            config:
              clientCertPath: /etc/myapiserver/clients-ca.crt
              clientKeyPath: /etc/myapiserver/clients-ca.key
              skipVerify: true
              useHTTPS: true
              useServiceAccount: false

Run the container in non-root user mode

Collecting logs often requires reading log files that are owned by the root user. By default, the container runs with securityContext.runAsUser = 0 which gives the root user permission to read those files. To run the container in non-root user mode, set .agent.securityContext. The log data permissions will be adjusted to match the securityContext configurations. For instance:


agent:
  securityContext:
    runAsUser: 20000
    runAsGroup: 20000

Note

Running the Collector agent for log collection in non-root mode is not currently supported in CRI-O and OpenShift environments. For more details, see the related GitHub feature request issue.

Use the Network Explorer to collect telemetry

Network Explorer allows you to collect network telemetry and send it to the OpenTelemetry Collector gateway.

To enable the Network Explorer, set the enabled flag to true:

networkExplorer:
  enabled: true

Caution

Activating the Network Explorer automatically activates the OpenTelemetry Collector gateway.

Prerequisites

Network Explorer is only supported in the following Kubernetes-based environments on Linux hosts:

  • Red Hat Linux 7.6+

  • Ubuntu 16.04+

  • Debian Stretch+

  • Amazon Linux 2

  • Google COS

Modify the reducer footprint

The reducer is a single pod per Kubernetes cluster. If your cluster contains a large number of pods, nodes, and services, you can increase the resources allocated to it.

The reducer processes telemetry in multiple stages, with each stage partitioned into one or more shards, where each shard is a separate thread. Increasing the number of shards in each stage expands the capacity of the reducer. There are three stages: ingest, matching, and aggregation. You can set between 1 and 32 shards for each stage. By default, there is one shard per reducer stage.

The following example sets the reducer to use 4 shards per stage.

networkExplorer:
  reducer:
    ingestShards: 4
    matchingShards: 4
    aggregationShards: 4
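
Because each stage has its own setting, the shard counts do not have to match. The following sketch assumes, purely for illustration, that the ingest stage is the bottleneck in your cluster; the specific numbers are hypothetical and only need to stay within the 1 to 32 range described above.

```yaml
networkExplorer:
  reducer:
    # Hypothetical sizing: more threads for ingest, fewer for later stages.
    ingestShards: 8
    matchingShards: 4
    aggregationShards: 2
```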

Customize network telemetry generated by the Network Explorer

Metrics can be deactivated, either individually or by entire categories. See the values.yaml for a complete list of categories and metrics.

To disable an entire category, give the category name, followed by .all:

networkExplorer:
  reducer:
    disableMetrics:
      - tcp.all

Disable individual metrics by their names:

networkExplorer:
  reducer:
    disableMetrics:
      - tcp.bytes

You can mix categories and names. For example, to disable all http metrics and the udp.bytes metric, use:

networkExplorer:
  reducer:
    disableMetrics:
      - http.all
      - udp.bytes

Reactivate metrics

To activate metrics you have deactivated, use enableMetrics.

The disableMetrics flag is evaluated before enableMetrics, so you can deactivate an entire category and then reactivate the individual metrics in that category that you are interested in.

For example, to deactivate all internal and http metrics but keep ebpf_net.collector_health, use:

networkExplorer:
  reducer:
    disableMetrics:
      - http.all
      - ebpf_net.all
    enableMetrics:
      - ebpf_net.collector_health

Configure features using gates

Use the agent.featureGates, clusterReceiver.featureGates, and gateway.featureGates configs to activate or deactivate features of the otel-collector agent, clusterReceiver, and gateway, respectively. These configs are used to populate the otelcol binary startup argument --feature-gates.

For example, to activate feature1 in the agent, activate feature2 in the clusterReceiver, and deactivate feature2 in the gateway, run:

helm install {name} --set agent.featureGates=+feature1 --set clusterReceiver.featureGates=feature2 --set gateway.featureGates=-feature2 {other_flags}
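
The same settings can also be kept in a values file instead of being passed with --set. The following is a sketch of the equivalent values.yaml fragment; feature1 and feature2 are the placeholder gate names from the command above, not real feature gates.

```yaml
agent:
  featureGates: "+feature1"    # activate feature1 in the agent
clusterReceiver:
  featureGates: "feature2"     # activate feature2 in the clusterReceiver
gateway:
  featureGates: "-feature2"    # deactivate feature2 in the gateway
```

The values are quoted so that the leading + and - characters are always read as part of a plain string.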

Set the pod security policy manually

Support for Pod Security Policies (PSP) was removed in Kubernetes 1.25. If you still rely on PSPs in an older cluster, you can add a PSP manually:

  1. Run the following command to install the PSP. Don't forget to add the --namespace kubectl argument if needed:

cat <<EOF | kubectl apply -f -
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: splunk-otel-collector-psp
  labels:
    app: splunk-otel-collector-psp
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  hostNetwork: true
  hostIPC: false
  hostPID: false
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'hostPath'
  - 'secret'
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
EOF

  2. Add the following custom ClusterRole rule in your values.yaml file along with all other required fields like clusterName, splunkObservability or splunkPlatform:

rbac:
  customRules:
    - apiGroups:     [extensions]
      resources:     [podsecuritypolicies]
      verbs:         [use]
      resourceNames: [splunk-otel-collector-psp]

  3. Install the Helm chart:

helm install my-splunk-otel-collector -f my_values.yaml splunk-otel-collector-chart/splunk-otel-collector
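
For reference, a complete my_values.yaml for these steps might look like the following sketch. It assumes you send data to Splunk Observability Cloud and that the realm and access token are set under splunkObservability; the cluster name, realm, and token shown here are placeholders, so substitute your own values.

```yaml
# my_values.yaml (sketch; every value marked "placeholder" is hypothetical)
clusterName: my-cluster            # placeholder cluster name
splunkObservability:
  realm: us0                       # placeholder realm
  accessToken: "<access-token>"    # placeholder; supply your own token
rbac:
  customRules:
    # Allow the chart's ClusterRole to use the PSP created in step 1.
    - apiGroups:     [extensions]
      resources:     [podsecuritypolicies]
      verbs:         [use]
      resourceNames: [splunk-otel-collector-psp]
```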