Install on Linux

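Splunk OpenTelemetry Collector for Linux is a package that provides integrated collection and forwarding for all data types. Install the package using one of the methods described in the following sections: the installer script, a deployment or configuration management option, or a manual installation.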

Note

Splunk only supports the SignalFx Smart Agent on x86_64 and AMD64 platforms. The SignalFx Smart Agent Receiver is subject to the same limitation.

Installer script

The following Linux distributions and versions are supported:

  • Amazon Linux: 2

  • CentOS, Red Hat, or Oracle: 7, 8

  • Debian: 9, 10, 11

  • SUSE: 12, 15, for Collector versions 0.34.0 or higher. Log collection with Fluentd is not currently supported on SUSE.

  • Ubuntu: 16.04, 18.04, 20.04, and 22.04. Log collection with Fluentd is not currently supported on Ubuntu 22.04.

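You must have systemd installed to use this script. The installer script deploys and configures the Splunk OpenTelemetry Collector package and, unless you run it with the --without-fluentd option, Fluentd (the td-agent package) for log collection.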

Do the following to install the package using the installer script:

  1. Ensure that you have curl and sudo installed.

  2. Download and execute the installer script.

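  3. Replace the following variables for your environment: SPLUNK_REALM (the realm to send data to, for example, us0), SPLUNK_MEMORY_TOTAL_MIB (the total memory in MiB to allocate to the Collector, for example, 512), and SPLUNK_ACCESS_TOKEN (the access token used to authenticate requests).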

curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh;
sudo sh /tmp/splunk-otel-collector.sh --realm SPLUNK_REALM --memory SPLUNK_MEMORY_TOTAL_MIB -- SPLUNK_ACCESS_TOKEN
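
Before running the installer, it can help to confirm that the prerequisites are present. The following is a minimal sketch for a POSIX shell, not part of the installer script. The missing_tools function name is hypothetical, and checking for systemctl is only an assumed proxy for systemd being installed.

```shell
# Print the name of each listed tool that is not found on PATH.
# missing_tools is a hypothetical helper, not part of the installer script.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# The installer needs curl and sudo; systemctl is used here as a sign
# that systemd is available (an assumption of this sketch).
missing=$(missing_tools curl sudo systemctl)
if [ -n "$missing" ]; then
    echo "Install these before running the installer script:"
    echo "$missing"
else
    echo "All prerequisites found."
fi
```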

Run additional script options

To display additional configuration options supported by the script, use the -h flag.

curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh;
sh /tmp/splunk-otel-collector.sh -h

Configure memory allocation

To configure memory allocation, change the --memory parameter. By default, this parameter is set to 512 MiB, or 512 x 2^20 bytes, of memory. Increase this setting to allocate more memory, as shown in the following example.

curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh;
sudo sh /tmp/splunk-otel-collector.sh --realm SPLUNK_REALM --memory SPLUNK_MEMORY_TOTAL_MIB \
    -- SPLUNK_ACCESS_TOKEN
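
As a quick check of the units, 1 MiB is 2^20 (1048576) bytes. The following sketch converts a --memory value to bytes with shell arithmetic; the variable names are illustrative only.

```shell
# Convert a --memory value in MiB to bytes (1 MiB = 2^20 = 1048576 bytes).
memory_mib=512
memory_bytes=$((memory_mib * 1024 * 1024))
echo "${memory_mib} MiB = ${memory_bytes} bytes"
# prints "512 MiB = 536870912 bytes"
```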

Use pre-configured repos

By default, apt/yum/zypper repo definition files are created to download the package and Fluentd deb/rpm packages from https://splunk.jfrog.io/splunk and https://packages.treasuredata.com, respectively.

To skip these steps and use pre-configured repos on the target system that provide the splunk-otel-collector and td-agent deb/rpm packages, specify the --skip-collector-repo and/or --skip-fluentd-repo options. For example:

curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh && \
sudo sh /tmp/splunk-otel-collector.sh --realm SPLUNK_REALM --skip-collector-repo --skip-fluentd-repo \
 -- SPLUNK_ACCESS_TOKEN

Configure Fluentd

Note

If you don’t need to collect logs, run the installer script with the --without-fluentd option to skip installation of Fluentd and the plugins and dependencies described in this section.

By default, the Fluentd service is installed and configured to forward log events with the @SPLUNK label to the package, which then sends these events to the HEC ingest endpoint determined by the --realm <SPLUNK_REALM> option. For example, https://ingest.<SPLUNK_REALM>.signalfx.com/v1/log.

The following Fluentd plugins are also installed:

  • capng_c for enabling Linux capabilities.

  • fluent-plugin-systemd for systemd journal log collection.

Additionally, the following dependencies are installed as prerequisites for the Fluentd plugins:

Debian-based systems:

  • build-essential

  • libcap-ng0

  • libcap-ng-dev

  • pkg-config

RPM-based systems:

  • Development Tools

  • libcap-ng

  • libcap-ng-devel

  • pkgconfig

You can specify the following parameters to configure the package to send log events to a custom Splunk HTTP Event Collector (HEC) endpoint URL:

  • hec-url = "<URL>"

  • hec-token = "<TOKEN>"

HEC lets you send data and application events to a Splunk deployment over the HTTP and Secure HTTP (HTTPS) protocols. See Set up and use HTTP Event Collector in Splunk Web.
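
The following sketch shows how the two parameters might be combined with the installer command. It assumes that hec-url and hec-token are passed to the installer script as the --hec-url and --hec-token options; confirm the exact option names with the -h flag. The URL and token values are hypothetical placeholders, and the command is printed for review rather than run.

```shell
# Placeholder values; replace them with your own. All three are hypothetical.
SPLUNK_REALM="us0"
HEC_URL="https://splunk.example.com:8088/services/collector"
HEC_TOKEN="00000000-0000-0000-0000-000000000000"

# Assumption: the hec-url and hec-token parameters map to the
# --hec-url and --hec-token installer options.
# Confirm with: sh /tmp/splunk-otel-collector.sh -h
install_cmd="sudo sh /tmp/splunk-otel-collector.sh --realm ${SPLUNK_REALM} --hec-url ${HEC_URL} --hec-token ${HEC_TOKEN} -- SPLUNK_ACCESS_TOKEN"

# Print the command for review instead of running it.
echo "$install_cmd"
```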

The main Fluentd configuration is installed to /etc/otel/collector/fluentd/fluent.conf. Custom Fluentd source configuration files can be added to the /etc/otel/collector/fluentd/conf.d directory after installation.

Note the following:

  • In this directory, all files with the .conf extension are automatically included by Fluentd.

  • The td-agent user must have permissions to access the configuration files and the paths defined within.

  • By default, Fluentd is configured to collect systemd journal log events from /var/log/journal.

After any configuration modification, run sudo systemctl restart td-agent to restart the td-agent service.
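
As an illustration of a custom source file, the following sketch stages a Fluentd configuration that tails an application log and routes it to the @SPLUNK label. The file name, tag, and log paths are hypothetical; the in_tail source and the none parser are standard Fluentd plugins. The file is written to a temporary directory, and the privileged steps are left as comments.

```shell
# Stage a custom Fluentd source file in a temporary directory.
staging_dir=$(mktemp -d)
conf_file="${staging_dir}/myapp.conf"   # hypothetical file name

cat > "$conf_file" <<'EOF'
# Tail a hypothetical application log and route it to the @SPLUNK label.
<source>
  @type tail
  @label @SPLUNK
  tag myapp
  path /var/log/myapp/app.log
  pos_file /var/log/td-agent/myapp.pos
  <parse>
    @type none
  </parse>
</source>
EOF

echo "Staged ${conf_file}"

# On the target host, install the file where td-agent can read it, then restart:
#   sudo install -m 0644 "$conf_file" /etc/otel/collector/fluentd/conf.d/myapp.conf
#   sudo systemctl restart td-agent
```

The td-agent user also needs read access to the log path named in the source block.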

If the td-agent package is upgraded after initial installation, you might need to set the Linux capabilities for the new version by performing the following steps for td-agent versions 4.1 or later:

  1. Check for the enabled capabilities:

    sudo /opt/td-agent/bin/fluent-cap-ctl --get -f /opt/td-agent/bin/ruby
    Capabilities in '/opt/td-agent/bin/ruby',
    Effective:   dac_override, dac_read_search
    Inheritable: dac_override, dac_read_search
    Permitted:   dac_override, dac_read_search
    
  2. If the output from the previous command does not include dac_override and dac_read_search as shown above, run the following commands:

    sudo td-agent-gem install capng_c
    sudo /opt/td-agent/bin/fluent-cap-ctl --add "dac_override,dac_read_search" -f /opt/td-agent/bin/ruby
    sudo systemctl daemon-reload
    sudo systemctl restart td-agent
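
The check in step 1 can be scripted. The following sketch inspects the "Effective:" line of the fluent-cap-ctl output for both capabilities. The has_required_caps function is a hypothetical helper, and the sample text is the output shown in step 1.

```shell
# Return success if the given fluent-cap-ctl --get output lists both
# capabilities on its "Effective:" line.
# has_required_caps is a hypothetical helper.
has_required_caps() {
    effective=$(printf '%s\n' "$1" | grep 'Effective:' || true)
    case "$effective" in
        *dac_override*) ;;
        *) return 1 ;;
    esac
    case "$effective" in
        *dac_read_search*) ;;
        *) return 1 ;;
    esac
    return 0
}

# Sample output copied from step 1. On a real host, capture it with:
#   caps=$(sudo /opt/td-agent/bin/fluent-cap-ctl --get -f /opt/td-agent/bin/ruby)
caps="Capabilities in '/opt/td-agent/bin/ruby',
Effective:   dac_override, dac_read_search
Inheritable: dac_override, dac_read_search
Permitted:   dac_override, dac_read_search"

if has_required_caps "$caps"; then
    echo "Capabilities are set; no action needed."
else
    echo "Capabilities are missing; run the commands in step 2."
fi
```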
    

Deployments

Splunk offers the configuration management options described in this section.

Amazon ECS EC2

Note

Available for Prometheus only.

Splunk provides a task definition to deploy the Splunk OpenTelemetry Collector to ECS EC2. The task definition is a text file, in JSON format, that describes one or more containers that form your application.

To start an Amazon ECS EC2 integration:

  1. Log in to Splunk Observability Cloud.

  2. In the left navigation menu, select Data Setup to open the Integrate Your Data page.

  3. Click the Amazon ECS EC2 tile to open the Amazon ECS EC2 guided setup.

See the AWS ECS EC2 README in GitHub if you need more information.

Amazon Fargate

Note

Available for Prometheus only. Not yet available for Amazon EKS.

Splunk provides a guided setup to deploy the Splunk OpenTelemetry Collector on Amazon Fargate as a sidecar (additional container) to Amazon ECS tasks.

To access the Amazon Fargate guided setup, follow these steps:

  1. Log in to Splunk Observability Cloud.

  2. In the left navigation menu, select Data Setup to open the Integrate Your Data page.

  3. Click the Amazon Fargate tile to open the Amazon Fargate guided setup.

See the AWS Fargate Deployment README in GitHub if you need more information.

Ansible

Splunk provides an Ansible role that installs the package configured to collect data (metrics, traces, and logs) from Linux machines and send that data to Observability Cloud.

Before installing the Ansible collection, do the following:

Ansible Galaxy is Ansible’s official hub for sharing Ansible content. See Ansible Collection for Splunk OpenTelemetry Collector for more information about the playbook.

Run the following command to install the Ansible collection from Ansible Galaxy:

ansible-galaxy collection install signalfx.splunk_otel_collector

To use the role, include the signalfx.splunk_otel_collector.collector role invocation in your playbook. Note that this role requires root access. The following example shows how to use the role in a playbook with minimal required configuration:

- name: Install Splunk OpenTelemetry Collector
  hosts: all
  become: yes
  tasks:
    - name: "Include splunk_otel_collector"
      include_role:
        name: "signalfx.splunk_otel_collector.collector"
      vars:
        splunk_access_token: YOUR_ACCESS_TOKEN
        splunk_realm: SPLUNK_REALM

The following table describes the variables that can be configured for this role:

Variable | Description | Required
--- | --- | ---
splunk_access_token | The Splunk access token to authenticate requests. | Yes
splunk_realm | The realm to send the data to. This value is set for the Collector service. The default value is us0. | No
splunk_ingest_url | The Splunk ingest URL, for example, https://ingest.us0.signalfx.com. This value is set for the Collector service. The default value is https://ingest.{{ splunk_realm }}.signalfx.com. | No
splunk_api_url | The Splunk API URL, for example, https://api.us0.signalfx.com. This value is set for the Collector service. The default value is https://api.{{ splunk_realm }}.signalfx.com. | No
splunk_trace_url | The Splunk trace endpoint URL, for example, https://ingest.us0.signalfx.com/v2/trace. This value is set for the Collector service. The default value is {{ splunk_ingest_url }}/v2/trace. | No
splunk_hec_url | The Splunk HEC endpoint URL, for example, https://ingest.us0.signalfx.com/v1/log. This value is set for the Collector service. The default value is {{ splunk_ingest_url }}/v1/log. | No
splunk_otel_collector_version | The version of the package to install, for example, 0.25.0. The default value is latest. | No
splunk_otel_collector_config | The path to the YAML configuration file. Set this variable to /etc/otel/collector/gateway_config.yaml to install the package in Gateway mode. The default location is /etc/otel/collector/agent_config.yaml. | No
splunk_config_override | The custom configuration that is merged into the default configuration. | No
splunk_config_override_list_merge | The variable used to configure the list_merge option for merging lists in splunk_config_override with lists in the default configuration. Allowed options are replace, keep, append, prepend, append_rp, or prepend_rp. The default value is replace. You can find information about this variable on the Ansible Documentation site. | No
splunk_otel_collector_config_source | The source path to a configuration file on your control host that is uploaded and set in place of the value set in splunk_otel_collector_config on remote hosts. Use this variable to submit a custom configuration, for example, ./custom_collector_config.yaml. The default value is "", which means that nothing is copied and the configuration file set with splunk_otel_collector_config is used. | No
splunk_bundle_dir | The path to the bundle directory. The default path is provided by the package. If you change the path from the default value, the path must be an existing directory on the node. This value is set for the Collector service. The default location is /usr/lib/splunk-otel-collector/agent-bundle. | No
splunk_collectd_dir | The path to the collectd configuration directory for the bundle. The default path is provided by the package. If you change the path from the default value, the path must be an existing directory on the node. This value is set for the Collector service. The default location is /usr/lib/splunk-otel-collector/agent-bundle. | No
splunk_service_user and splunk_service_group | The user or group ownership for the service. The user or group is created if it does not exist. The default value is splunk-otel-collector. | No
splunk_otel_collector_proxy_http and splunk_otel_collector_proxy_https | The proxy addresses for the http_proxy and https_proxy environment variables, respectively, used by the service if at least one of them is not empty. Each value must be a full URL, for example, http://user:pass@10.0.0.42. Ansible itself does not use this proxy during deployment. The default value is "". | No
splunk_memory_total_mib | The amount of allocated memory in MiB. The default value is 512, or 512 x 2^20 bytes, of memory. | No
splunk_ballast_size_mib | The memory ballast size in MiB. The default value is 1/3 of the value set in splunk_memory_total_mib. | No
install_fluentd | The option to install or manage Fluentd and dependencies for log collection. The dependencies include capng_c for enabling Linux capabilities, fluent-plugin-systemd for systemd journal log collection, and the required libraries or development tools. The default value is true. | No
td_agent_version | The version of td-agent (Fluentd package) that is installed. The default value is 3.3.0 for Debian jessie, 3.7.1 for Debian stretch, and 4.3.0 for other distros. | No
splunk_fluentd_config | The path to the Fluentd configuration file on the remote host. The default location is /etc/otel/collector/fluentd/fluent.conf. | No
splunk_fluentd_config_source | The source path to a Fluentd configuration file on your control host that is uploaded and set in place of the value set in splunk_fluentd_config on remote hosts. Use this variable to submit a custom Fluentd configuration, for example, ./custom_fluentd_config.conf. The default value is "", which means that nothing is copied and the configuration file set with splunk_fluentd_config is used. | No
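
Several of the URL defaults build on one another: the ingest and API URLs derive from the realm, and the trace and HEC URLs derive from the ingest URL. The following sketch mirrors that chain with shell variables that stand in for the Ansible variables of the same name. It only illustrates the documented defaults and is not how the role evaluates them.

```shell
# Mirror how the role's URL defaults build on one another.
# Shell variables stand in for the Ansible variables of the same name.
splunk_realm="us0"   # documented default realm
splunk_ingest_url="https://ingest.${splunk_realm}.signalfx.com"
splunk_api_url="https://api.${splunk_realm}.signalfx.com"
splunk_trace_url="${splunk_ingest_url}/v2/trace"
splunk_hec_url="${splunk_ingest_url}/v1/log"

printf '%s\n' "$splunk_ingest_url" "$splunk_api_url" "$splunk_trace_url" "$splunk_hec_url"
```

Overriding splunk_realm alone changes all four URLs, whereas overriding splunk_ingest_url changes only the trace and HEC defaults.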

Heroku

Splunk OpenTelemetry Collector for Heroku is a buildpack for the Collector. The buildpack installs and runs the Collector on a Dyno to receive, process, and export metric and trace data for Splunk Observability Cloud. See Heroku for the steps to install the buildpack.

Pivotal Cloud Foundry

Splunk provides a script to create a BOSH release of the Collector. The release is intended to be run by the Pivotal Cloud Foundry (PCF) tile. See Pivotal Cloud Foundry for the script.

Puppet

Splunk provides a Puppet module to install and configure the package. A module is a collection of resources, classes, files, definitions, and templates. See Splunk OpenTelemetry Collector Puppet Module to download the module.

Manual

Splunk offers the manual configuration options described in this section.

Docker

Run the following command to install the package using Docker:

docker run --rm -e SPLUNK_ACCESS_TOKEN=12345 -e SPLUNK_REALM=us0 \
    -p 13133:13133 -p 14250:14250 -p 14268:14268 -p 4317:4317 -p 6060:6060 \
    -p 7276:7276 -p 8888:8888 -p 9080:9080 -p 9411:9411 -p 9943:9943 \
    --name otelcol quay.io/signalfx/splunk-otel-collector:latest
    # Use a semantic versioning (semver) tag instead of the latest tag.
    # Semantic versioning is a formal convention for determining the version
    # number of new software releases.

The following list provides more information on the docker run command options:

  • --rm automatically removes the container when it exits.

  • -e sets simple (non-array) environment variables in the container you’re running, or overwrites variables that are defined in the Dockerfile of the image you’re running.

  • -p publishes a container’s port(s) to the host.

Run the following command to execute an interactive bash shell on the container and see the status of the Collector:

docker exec -it containerID bash

See docker-compose.yml in GitHub to download a docker-compose example.

Create a custom Docker configuration

You can provide a custom configuration file instead of the default configuration file. Use the environment variable SPLUNK_CONFIG or the --config command line argument to provide the path to this file.

You can also use the environment variable SPLUNK_CONFIG_YAML to pass the custom configuration YAML itself, rather than the path to a file. This is useful in environments where access to the underlying file system is not readily available. For example, in AWS Fargate, you can store your custom configuration YAML in a parameter in the AWS Systems Manager Parameter Store, then in your container definition specify SPLUNK_CONFIG_YAML to get the configuration from the parameter.

Command line arguments take precedence over environment variables. This applies to --config and --mem-ballast-size-mib. SPLUNK_CONFIG takes precedence over SPLUNK_CONFIG_YAML. For example:

# A volume mount may be required to load the custom configuration file.
# Use a semantic versioning (semver) tag instead of the latest tag.
# Semantic versioning is a formal convention for determining the version
# number of new software releases.
docker run --rm -e SPLUNK_ACCESS_TOKEN=12345 -e SPLUNK_REALM=us0 \
    -e SPLUNK_CONFIG=/etc/collector.yaml -p 13133:13133 -p 14250:14250 \
    -p 14268:14268 -p 4317:4317 -p 6060:6060 -p 8888:8888 \
    -p 9080:9080 -p 9411:9411 -p 9943:9943 \
    -v "${PWD}/collector.yaml":/etc/collector.yaml:ro \
    --name otelcol quay.io/signalfx/splunk-otel-collector:latest
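
The precedence rules can be summarized as a small decision chain. The following sketch is an illustrative model only, not the Collector's actual code. The resolve_config_source function and its messages are hypothetical.

```shell
# Illustrative model of the precedence rules; not the Collector's actual code.
# resolve_config_source is a hypothetical helper.
#   $1: value of the --config argument (empty if not given)
#   $2: value of SPLUNK_CONFIG (empty if unset)
#   $3: value of SPLUNK_CONFIG_YAML (empty if unset)
resolve_config_source() {
    if [ -n "$1" ]; then
        echo "file from --config: $1"
    elif [ -n "$2" ]; then
        echo "file from SPLUNK_CONFIG: $2"
    elif [ -n "$3" ]; then
        echo "inline YAML from SPLUNK_CONFIG_YAML"
    else
        echo "default configuration"
    fi
}

resolve_config_source "" "/etc/collector.yaml" "receivers: {}"
# prints "file from SPLUNK_CONFIG: /etc/collector.yaml"
resolve_config_source "/tmp/override.yaml" "/etc/collector.yaml" ""
# prints "file from --config: /tmp/override.yaml"
```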

If the custom configuration includes a memory_limiter processor, then the ballast_size_mib parameter should be the same as the SPLUNK_BALLAST_SIZE_MIB environment variable. For example:

extensions:
  memory_ballast:
    # In general, the ballast should be set to 1/3 of the Collector's memory.
    # The ballast is a large allocation of memory that provides stability to the heap.
    # The limit should be 90% of the Collector's memory.
    # Specify the ballast size by setting the value of the
    # SPLUNK_BALLAST_SIZE_MIB env variable.
    # The total memory size must be more than 99 MiB for the Collector to start.
    size_mib: ${SPLUNK_BALLAST_SIZE_MIB}
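
As a worked example of the sizing guidance in the comments above, the following sketch derives a ballast size and a memory limit from a total of 512 MiB. It uses integer shell arithmetic, which rounds down; how the Collector itself rounds these values is not stated here, so treat the results as approximate.

```shell
# Worked example of the sizing guidance, using integer arithmetic (rounds down).
total_mib=512
ballast_mib=$((total_mib / 3))        # one third of total memory
limit_mib=$((total_mib * 90 / 100))   # 90% of total memory

if [ "$total_mib" -le 99 ]; then
    echo "Total memory must be more than 99 MiB" >&2
else
    echo "SPLUNK_BALLAST_SIZE_MIB=${ballast_mib} (memory limit: ${limit_mib} MiB)"
fi
# prints "SPLUNK_BALLAST_SIZE_MIB=170 (memory limit: 460 MiB)"
```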

Use the following configuration to collect and log CPU metrics. The cat command reads the YAML from the here-document, and the command substitution assigns it to the CONFIG_YAML shell variable. The docker run command then expands CONFIG_YAML and passes its value to the container as the SPLUNK_CONFIG_YAML environment variable. Note that YAML requires whitespace indentation to be maintained.

CONFIG_YAML=$(cat <<-END
receivers:
   hostmetrics:
      collection_interval: 1s
      scrapers:
         cpu:
exporters:
   logging:
      logLevel: debug
service:
   pipelines:
      metrics:
         receivers: [hostmetrics]
         exporters: [logging]
END
)

# Quote the expansion so the multi-line YAML is passed as a single argument.
# Use a semantic versioning (semver) tag instead of the latest tag.
# Semantic versioning is a formal convention for determining the version
# number of new software releases.
docker run --rm \
    -e SPLUNK_CONFIG_YAML="${CONFIG_YAML}" \
    --name otelcol quay.io/signalfx/splunk-otel-collector:latest

Debian or RPM packages

All Intel, AMD, and ARM systemd-based operating systems are supported, including CentOS, Debian, Oracle, Red Hat, and Ubuntu. Manually installing an integration is useful for containerized environments, or if you want to use other common deployment options.

Observability Cloud provides a default configuration for each installation method. Each installation method has its own set of environment variables, and their values depend on the installation method, as well as your specific needs.

Note

systemctl is the main tool used to examine and control the state of the systemd system and service manager. systemctl is a requirement to run the Collector as a service. If you don’t have systemctl, you need to start the Collector manually.

Do the following to install the package using a Debian or RPM package:

  1. Set up the package repository and install the package, as shown in the following examples. The first example shows the Debian package and the subsequent examples show the RPM package. A default configuration is installed to /etc/otel/collector/agent_config.yaml, if it does not already exist:

    # Debian
    curl -sSL https://splunk.jfrog.io/splunk/otel-collector-deb/splunk-B3CD4420.gpg > /etc/apt/trusted.gpg.d/splunk.gpg
    echo 'deb https://splunk.jfrog.io/splunk/otel-collector-deb release main' > /etc/apt/sources.list.d/splunk-otel-collector.list
    apt-get update
    apt-get install -y splunk-otel-collector
    
    # RPM with yum
    yum install -y libcap
    # Required for enabling cap_dac_read_search and cap_sys_ptrace capabilities.
    
    cat <<EOH > /etc/yum.repos.d/splunk-otel-collector.repo
    [splunk-otel-collector]
    name=Splunk OpenTelemetry Collector Repository
    baseurl=https://splunk.jfrog.io/splunk/otel-collector-rpm/release/\$basearch
    gpgcheck=1
    gpgkey=https://splunk.jfrog.io/splunk/otel-collector-rpm/splunk-B3CD4420.pub
    enabled=1
    EOH
    
    yum install -y splunk-otel-collector
    
    # RPM with dnf
    dnf install -y libcap
    # Required for enabling cap_dac_read_search and cap_sys_ptrace capabilities.
    
    cat <<EOH > /etc/yum.repos.d/splunk-otel-collector.repo
    [splunk-otel-collector]
    name=Splunk OpenTelemetry Collector Repository
    baseurl=https://splunk.jfrog.io/splunk/otel-collector-rpm/release/\$basearch
    gpgcheck=1
    gpgkey=https://splunk.jfrog.io/splunk/otel-collector-rpm/splunk-B3CD4420.pub
    enabled=1
    EOH
    
    dnf install -y splunk-otel-collector
    
    # RPM with zypper
    zypper install -y libcap-progs
    # Required for enabling cap_dac_read_search and cap_sys_ptrace capabilities.
    
    cat <<EOH > /etc/zypp/repos.d/splunk-otel-collector.repo
    [splunk-otel-collector]
    name=Splunk OpenTelemetry Collector Repository
    baseurl=https://splunk.jfrog.io/splunk/otel-collector-rpm/release/\$basearch
    gpgcheck=1
    gpgkey=https://splunk.jfrog.io/splunk/otel-collector-rpm/splunk-B3CD4420.pub
    enabled=1
    EOH
    
    zypper install -y splunk-otel-collector
    
  2. Configure the splunk-otel-collector.conf environment file with the appropriate variables. You need this environment file to start the splunk-otel-collector systemd service. When you install the package in step 1, a sample environment file is installed to /etc/otel/collector/splunk-otel-collector.conf.example. This file includes the required environment variables for the default configuration.

  3. Run sudo systemctl restart splunk-otel-collector.service to start or restart the service.
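
One way to approach step 2 is to copy the sample environment file into place and then edit the copy. The following sketch does that without overwriting an existing file. The install_env_file function is a hypothetical helper, and on a real host the copy must be run with root privileges.

```shell
# Copy the sample environment file into place unless the target already exists.
# install_env_file is a hypothetical helper.
#   $1: path to the sample file, $2: path to the environment file
install_env_file() {
    if [ -e "$2" ]; then
        echo "Keeping existing $2"
        return 0
    fi
    cp "$1" "$2" && echo "Created $2; edit it before starting the service."
}

example=/etc/otel/collector/splunk-otel-collector.conf.example
target=/etc/otel/collector/splunk-otel-collector.conf

if [ -f "$example" ]; then
    install_env_file "$example" "$target" || echo "Copy failed; rerun as root." >&2
else
    echo "Sample file not found; install the package first."
fi
```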

Binary file

Download pre-built binaries (otelcol_linux_amd64 or otelcol_linux_arm64) from GitHub releases.
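
To pick between the two binaries, you can map the machine architecture reported by uname -m to a binary name. The following is a minimal sketch; the binary_for_arch function is hypothetical, and the set of architecture strings it recognizes is an assumption.

```shell
# Map a machine architecture (as printed by uname -m) to a binary name.
# binary_for_arch is a hypothetical helper.
binary_for_arch() {
    case "$1" in
        x86_64|amd64)  echo "otelcol_linux_amd64" ;;
        aarch64|arm64) echo "otelcol_linux_arm64" ;;
        *)             echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

binary=$(binary_for_arch "$(uname -m)") && echo "Download ${binary} from GitHub releases."
```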

More options

Once you have installed the package, you can perform these actions: