Manually configure metrics collection on a *nix host for Splunk App for Infrastructure

To configure data collection, you must log in to an account with permissions to use sudo for root access. Do not log in as the root user.

Manually install the collectd agent to collect system metrics on a *nix host instead of using the script when:

You are installing collectd on a closed network.
You already have collectd on the host from which you want to collect data.
You do not have trusted URLs from which you can download the required packages and dependencies.

For more information, see About using collectd.

You can also configure collectd to forward metrics data to a local universal forwarder. For more information, see Send collectd data to a universal forwarder.

If you manually configure metrics collection, you also need to manually configure log collection. For more information, see Manually configure log collection on a *nix host for Splunk App for Infrastructure.

Before configuring metrics collection manually, confirm your system has the required dependencies. For more information, see *nix data collection requirements.

1. Install collectd version 5.7.x or 5.8.x

If you have not already installed collectd on your host, install version 5.7.x or 5.8.x now.

If you have an earlier version of collectd, you must update to a compatible version.

For a full list of collectd install commands, see collectd package sources, install commands, and locations.

To install collectd on a Debian or Ubuntu host, enter:

$ sudo apt-get install collectd

To install collectd on a CentOS, Red Hat, or Fedora host, enter:

$ sudo yum install collectd

To install collectd on a SUSE or openSUSE host, enter:

$ sudo zypper install collectd

To install collectd on a Solaris host, enter:

$ pkgadd -d http://get.opencsw.org/now
$ /opt/csw/bin/pkgutil -U
$ /opt/csw/bin/pkgutil -i collectd 
$ /usr/sbin/pkgchk -L CSWcollectd # list files
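
After installing, you may want to confirm that the installed collectd is one of the supported versions. The following is a minimal sketch, not part of the product. The function names are hypothetical, and it assumes that `collectd -h` prints a line of the form `collectd <version>, http://collectd.org/`. Verify this on your host before relying on it.

```shell
# Extract the version number from `collectd -h` output read on stdin.
# ASSUMPTION: the help text contains a line like "collectd 5.8.1, http://collectd.org/".
collectd_version_from_help() {
  sed -n 's/^collectd \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Return 0 if the given version string is 5.7.x or 5.8.x, otherwise return 1.
is_supported_collectd_version() {
  case "$1" in
    5.7|5.7.*|5.8|5.8.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, after defining both functions in your shell:

$ is_supported_collectd_version "$(collectd -h | collectd_version_from_help)" && echo "supported"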

2. Install the libcurl package

If you have not already installed the libcurl package on your system, install it now. For Linux systems, the version of libcurl you have to install depends on the Linux OS version you're running.

To install libcurl3 on a Debian 7, 8, or 9 system or on an Ubuntu 14 or 16 system, enter:

$ sudo apt-get install libcurl3

To install libcurl4 on a Debian 10 system or on an Ubuntu 18.04, 18.10, or 19 system, enter:

$ sudo apt-get install libcurl4

To install libcurl on a CentOS, Red Hat, or Fedora system, enter:

$ sudo yum install libcurl

To install libcurl on a SUSE or openSUSE system, enter:

$ sudo zypper install libcurl4

To install libcurl on a Solaris system, enter:

$ pkgadd -d http://get.opencsw.org/now
$ /opt/csw/bin/pkgutil -U
$ /opt/csw/bin/pkgutil -i libcurl4_feature 
$ /usr/sbin/pkgchk -L CSWlibcurl4-feature # list files
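
The Debian and Ubuntu version rules in this step can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, and it covers only the releases this topic lists. The distribution ID and version are assumed to come from the `ID` and `VERSION_ID` fields of `/etc/os-release`.

```shell
# Print the libcurl package name for a Debian or Ubuntu release.
# $1 = distribution ID (debian or ubuntu), $2 = release version such as 9 or 18.04
libcurl_package_for() {
  distro=$1
  major=${2%%.*}   # keep only the part before the first dot
  case "$distro:$major" in
    debian:7|debian:8|debian:9|ubuntu:14|ubuntu:16) echo libcurl3 ;;
    debian:10|ubuntu:18|ubuntu:19) echo libcurl4 ;;
    *) echo "no libcurl mapping listed for $distro $2" >&2; return 1 ;;
  esac
}
```

For example, on a Debian or Ubuntu host that provides /etc/os-release:

$ . /etc/os-release
$ sudo apt-get install "$(libcurl_package_for "$ID" "$VERSION_ID")"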

3. (Optional) Install the libyajl package

If you monitor Docker containers on a Linux or Mac OS X host, you must have the libyajl version 2 package on your system. If you do not already have the package, install it now.

To install libyajl on a Debian or Ubuntu system, enter:

$ sudo apt-get install libyajl2

To install libyajl on a CentOS, Red Hat, or Fedora system, enter:

$ sudo yum install yajl

To install libyajl on a SUSE or openSUSE system, enter:

$ sudo zypper install libyajl2

To install libyajl on a Mac OS X system, enter:

$ brew install yajl

4. Copy the plug-ins to collectd's plug-in directory

For information about plug-in locations, see collectd package sources, install commands, and locations. On Solaris systems, you cannot monitor Docker containers that you deployed without an orchestration tool.

The write_splunk collectd plug-in is a replacement for the write_http plug-in that directs metrics data to the Splunk HTTP Event Collector (HEC).

write_splunk creates these five dimensions when you integrate a system:

host
ip
os
os_version
kernel_version

You cannot delete the dimensions the plug-in creates.

If you want to monitor process metrics, copy the processmon.so plug-in. If you monitor Docker containers on Linux systems, copy the docker.so plug-in. Use the docker.so plug-in to monitor Docker containers you didn't deploy with an orchestration tool such as Docker Swarm, Kubernetes, or OpenShift. To monitor Docker containers you deployed with Kubernetes or OpenShift, see the Splunk App for Infrastructure topics for those platforms.

If you're monitoring a Linux system, the plug-in locations depend on which version of libcurl you're using.

Debian and Ubuntu systems with libcurl3

For Debian and Ubuntu systems with libcurl3 and all other Linux systems, copy the write_splunk plug-in, processmon.so plug-in, and docker.so plug-in.

write_splunk plug-in

$ wget https://<hostname>:8000/en-US/static/app/splunk_app_infrastructure/unix_agent/unix-agent.tgz
$ tar xvzf unix-agent.tgz
$ cp unix-agent/write_splunk.so <plug-in_directory>

processmon plug-in

$ cp unix-agent/processmon.so <plug-in_directory>

docker plug-in

$ cp unix-agent/docker.so <plug-in_directory>

Debian and Ubuntu systems with libcurl4

For Debian and Ubuntu systems with libcurl4, copy the write_splunk plug-in, processmon.so plug-in, and docker.so plug-in.

write_splunk plug-in

$ wget https://<hostname>:8000/en-US/static/app/splunk_app_infrastructure/unix_agent/unix-agent.tgz
$ tar xvzf unix-agent.tgz
$ cp unix-agent/deb_libcurl4/write_splunk.so <plug-in_directory>

processmon plug-in

$ cp unix-agent/deb_libcurl4/processmon.so <plug-in_directory>

docker plug-in

$ cp unix-agent/deb_libcurl4/docker.so <plug-in_directory>
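
The choice between the two archive directories for Linux hosts can be sketched as follows. The function names are hypothetical, and the sketch assumes you have already extracted unix-agent.tgz in the current directory and know your collectd plug-in directory.

```shell
# Print the directory inside the extracted archive that holds the Linux plug-ins.
# $1 = distribution ID (for example debian, ubuntu, centos), $2 = libcurl package in use
plugin_source_dir() {
  case "$1:$2" in
    debian:libcurl4|ubuntu:libcurl4) echo "unix-agent/deb_libcurl4" ;;
    *)                               echo "unix-agent" ;;
  esac
}

# Copy the named plug-ins into the collectd plug-in directory.
# $1 = source directory, $2 = plug-in directory, remaining args = plug-in names without .so
copy_plugins() {
  src=$1
  dest=$2
  shift 2
  for name in "$@"; do
    cp "$src/$name.so" "$dest/" || return 1
  done
}
```

For example, on an Ubuntu host with libcurl4 (run with sudo if the plug-in directory is not writable by your account):

$ copy_plugins "$(plugin_source_dir ubuntu libcurl4)" <plug-in_directory> write_splunk processmon docker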

For Solaris systems, copy the write_splunk-solaris plug-in:

$ wget https://<hostname>:8000/en-US/static/app/splunk_app_infrastructure/unix_agent/unix-agent.tgz
$ tar xvzf unix-agent.tgz
$ cp write_splunk-solaris.so "/opt/csw/lib/collectd/write_splunk.so"

For Mac OS X systems, copy the write_splunk plug-in:

$ curl -ksL -o osx-agent.tgz https://<hostname>:8000/en-US/static/app/splunk_app_infrastructure/osx_agent/osx-agent.tgz
$ tar -xzf osx-agent.tgz
$ cp osx-agent/write_splunk.so <plug-in_directory>

If you monitor Docker containers on Mac OS X systems, also copy the docker.so plug-in:

$ cp osx-agent/docker.so <plug-in_directory>

5. Configure collectd.conf to send data to the Splunk App for Infrastructure

To configure collectd.conf, you must add the <Plugin write_splunk> stanza, add a stanza for every other metric you want to monitor, and modify the Hostname field. For information about required plug-in locations, see collectd package sources, install commands, and locations.

Add a LoadPlugin for each plug-in you want to use.

<LoadPlugin "write_splunk">
FlushInterval 30
</LoadPlugin>
LoadPlugin cpu
LoadPlugin uptime
LoadPlugin memory
LoadPlugin df
LoadPlugin load
LoadPlugin disk
LoadPlugin interface
LoadPlugin docker
LoadPlugin processmon
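
Because each plug-in you configure also needs a LoadPlugin line, a quick consistency check on collectd.conf can be useful. The following is a rough sketch with a hypothetical function name. It only recognizes uncommented `<Plugin name>` openers and `LoadPlugin name` or `<LoadPlugin "name">` lines, and it does not account for the AutoLoadPlugin setting.

```shell
# List plug-ins that have a <Plugin name> stanza but no LoadPlugin line.
# $1 = path to collectd.conf
missing_loadplugins() {
  conf=$1
  # Collect names from uncommented <Plugin name> or <Plugin "name"> openers.
  sed -n 's/^[[:space:]]*<Plugin[[:space:]]\{1,\}"\{0,1\}\([A-Za-z0-9_]\{1,\}\)"\{0,1\}>.*/\1/p' "$conf" |
    sort -u |
    while IFS= read -r name; do
      # Accept both the one-line form and the block form of LoadPlugin.
      if ! grep -Eq "^[[:space:]]*<?LoadPlugin[[:space:]]+\"?$name\"?>?[[:space:]]*\$" "$conf"; then
        echo "$name"
      fi
    done
}
```

The function prints one plug-in name per line and prints nothing when every stanza has a matching LoadPlugin line. The location of collectd.conf depends on your platform.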

Add metrics configuration stanzas for each metric you want to configure. The following stanzas are default stanzas that the easy install script configures. There is no stanza for the uptime metric. A stanza for the processmon plug-in is optional. Include a processmon stanza to specify whitelists and blacklists and report IO metrics for monitored processes. The following processmon stanza is just an example that includes the settings you can configure.

Plug-in: write_splunk
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin write_splunk>
server "<receiving_server>"
port "<HEC PORT>"
token "<HEC TOKEN>"
ssl true
verifyssl false
Dimension "entity_type:nix_host"
Dimension "key2:value2"
</Plugin>

`server`: The IP or hostname of the Splunk deployment to which you are sending data. If you are sending data to a distributed deployment, the IP or hostname of the indexer. If you deploy a load balancer, the IP or hostname of the load balancer.
`port`: The HEC port.
`token`: The HEC token.

Plug-in: CPU
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin cpu>
ReportByCpu false
ReportByState true
ValuesPercentage true
</Plugin>

Plug-in: Memory
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin memory>
ValuesAbsolute false
ValuesPercentage true
</Plugin>

Plug-in: DF
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin df>
FSType "ext2"
FSType "ext3"
FSType "ext4"
FSType "XFS"
FSType "rootfs"
FSType "overlay"
FSType "hfs"
FSType "apfs"
FSType "zfs"
FSType "ufs"
ReportByDevice true
ValuesAbsolute false
ValuesPercentage true
IgnoreSelected false
</Plugin>

Plug-in: Load
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin load>
ReportRelative true
</Plugin>

Plug-in: Disk
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin disk>
Disk ""
IgnoreSelected true
UdevNameAttr "DEVNAME"
</Plugin>

Plug-in: Interface
Supported OS: Linux, Solaris, Mac OS X
Stanza:

<Plugin interface>
IgnoreSelected true
</Plugin>

Plug-in: Docker
Supported OS: Linux, Mac OS X
Stanza:

<Plugin docker>
dockersock "/var/run/docker.sock"
apiversion "v1.20"
</Plugin>

By default, collectd fails if you're running more than 100 Docker containers. To monitor 100 or more Docker containers, add the `ReadBufferSize` parameter to the Docker plug-in. The max value is `32000`.

Plug-in: Process monitoring
Supported OS: Linux
Stanza:

<Plugin processmon>
ReadIo true
whitelist "process1."
whitelist "process2."
blacklist "process3.*"
</Plugin>

This plug-in is optional. If you don't configure a `processmon` stanza, the plug-in monitors every process and doesn't collect IO metrics. If you both blacklist and whitelist a process, the plug-in blacklists the process. The plug-in uses POSIX Extended Regular Expression syntax for the regular expression you enter to whitelist or blacklist processes. The plug-in uses the `comm` field in `/proc/[pid]/stat` for process names. For more information, see the Linux Programmer's Manual.
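
To get a feel for how the processmon whitelist and blacklist rules combine, you can try patterns with `grep -E`, which also uses POSIX Extended Regular Expression syntax. This sketch is not the plug-in's implementation. The function name is hypothetical, and it assumes that patterns match anywhere in the process name and that, once a whitelist exists, only whitelisted processes are monitored.

```shell
# Return 0 if a process name would be monitored, 1 if it would not.
# $1 = process name, $2 = whitelist ERE (may be empty), $3 = blacklist ERE (may be empty)
# ASSUMPTION: unanchored matching; a non-empty whitelist excludes everything else.
would_monitor() {
  name=$1
  whitelist=$2
  blacklist=$3
  # The blacklist wins when a name matches both lists.
  if [ -n "$blacklist" ] && printf '%s\n' "$name" | grep -Eq -- "$blacklist"; then
    return 1
  fi
  # With no whitelist, every process that is not blacklisted is monitored.
  if [ -z "$whitelist" ]; then
    return 0
  fi
  printf '%s\n' "$name" | grep -Eq -- "$whitelist"
}
```

To test several patterns at once, join them with `|`, for example `would_monitor bash 'collectd|bash' 'splunkd'`.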

Update the Hostname field with the IP or hostname of the system that's running collectd. The Hostname must be unique to the system because Splunk App for Infrastructure (SAI) uses it to identify the entity.
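
If you script this step, the Hostname line can be rewritten in place. The following is a minimal sketch with a hypothetical function name. It assumes the value contains no `|`, `&`, or backslash characters, and it adds a Hostname line at the top of the file when no uncommented one exists.

```shell
# Set the Hostname option in a collectd.conf file.
# $1 = path to collectd.conf, $2 = unique hostname or IP for this system
set_collectd_hostname() {
  conf=$1
  name=$2
  tmp=$(mktemp) || return 1
  if grep -q '^[[:space:]]*Hostname[[:space:]]' "$conf"; then
    # Replace the existing uncommented Hostname line and keep everything else.
    sed "s|^[[:space:]]*Hostname[[:space:]].*|Hostname \"$name\"|" "$conf" > "$tmp"
  else
    # No Hostname line yet: put one at the top of the file.
    { printf 'Hostname "%s"\n' "$name"; cat "$conf"; } > "$tmp"
  fi && cat "$tmp" > "$conf"   # write back through the original file to keep its permissions
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Run the function with sudo privileges if your account cannot write to collectd.conf, and back up the file first.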

Here is a recommended collectd.conf file that includes every plug-in.

#
# Config file for collectd(1).
# Please read collectd.conf(5) for a list of options.
# http://collectd.org/
#

##############################################################################
# Global #
#----------------------------------------------------------------------------#
# Global settings for the daemon. #
##############################################################################

Hostname "collectd.server.sample"
FQDNLookup false
#BaseDir "/var/lib/collectd"
#PIDFile "/var/run/collectd.pid"
#PluginDir "/usr/lib64/collectd"
#TypesDB "/usr/share/collectd/types.db"

#----------------------------------------------------------------------------#
# When enabled, plugins are loaded automatically with the default options #
# when an appropriate <Plugin ...> block is encountered. #
# Disabled by default. #
#----------------------------------------------------------------------------#
#AutoLoadPlugin false

#----------------------------------------------------------------------------#
# When enabled, internal statistics are collected, using "collectd" as the #
# plugin name. #
# Disabled by default. #
#----------------------------------------------------------------------------#
#CollectInternalStats false

#----------------------------------------------------------------------------#
# Interval at which to query values. This may be overwritten on a per-plugin #
# base by using the 'Interval' option of the LoadPlugin block: #
# <LoadPlugin foo> #
# Interval 60 #
# </LoadPlugin> #
#----------------------------------------------------------------------------#
Interval 60

#MaxReadInterval 86400
#Timeout 2
#ReadThreads 5
#WriteThreads 5

# Limit the size of the write queue. Default is no limit. Setting up a limit is
# recommended for servers handling a high volume of traffic.
WriteQueueLimitHigh 1000000
WriteQueueLimitLow 800000

##############################################################################
# Logging #
#----------------------------------------------------------------------------#
# Plugins which provide logging functions should be loaded first, so log #
# messages generated when loading or configuring other plugins can be #
# accessed. #
##############################################################################

LoadPlugin syslog
LoadPlugin logfile
<LoadPlugin "write_splunk">
FlushInterval 30
</LoadPlugin>

##############################################################################
# LoadPlugin section #
#----------------------------------------------------------------------------#
# Lines beginning with a single `#' belong to plugins which have been built #
# but are disabled by default. #
# #
# Lines beginning with `##' belong to plugins which have not been built due #
# to missing dependencies or because they have been deactivated explicitly. #
##############################################################################

#LoadPlugin csv
LoadPlugin cpu
LoadPlugin uptime
LoadPlugin memory
LoadPlugin df
LoadPlugin load
LoadPlugin disk
LoadPlugin interface
LoadPlugin docker
LoadPlugin processmon

##############################################################################
# Plugin configuration #
#----------------------------------------------------------------------------#
# In this section configuration stubs for each plugin are provided. A desc- #
# ription of those options is available in the collectd.conf(5) manual page. #
##############################################################################

<Plugin logfile>
LogLevel info
File "/etc/collectd/collectd.log"
Timestamp true
PrintSeverity true
</Plugin>

<Plugin syslog>
LogLevel info
</Plugin>

<Plugin cpu>
ReportByCpu false
ReportByState true
ValuesPercentage true
</Plugin>

<Plugin memory>
ValuesAbsolute false
ValuesPercentage true
</Plugin>

<Plugin df>
FSType "ext2"
FSType "ext3"
FSType "ext4"
FSType "XFS"
FSType "rootfs"
FSType "overlay"
FSType "hfs"
FSType "apfs"
FSType "zfs"
FSType "ufs"
ReportByDevice true
ValuesAbsolute false
ValuesPercentage true
IgnoreSelected false
</Plugin>

<Plugin load>
ReportRelative true
</Plugin>

<Plugin disk>
Disk ""
IgnoreSelected true
UdevNameAttr "DEVNAME"
</Plugin>

<Plugin interface>
IgnoreSelected true
</Plugin>

<Plugin docker>
dockersock "/var/run/docker.sock"
apiversion "v1.20"
</Plugin>

<Plugin processmon>
ReadIo true
whitelist "collectd"
whitelist "bash"
blacklist "splunkd"
</Plugin>

<Plugin write_splunk>
server "<splunk infrastructure app server>"
port "<HEC PORT>"
token "<HEC TOKEN>"
ssl true
verifyssl false
Dimension "entity_type:nix_host"
</Plugin>

Optionally, you can also add dimensions as Dimension "key:value" to the write_splunk plug-in.
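
As an illustration of extra dimensions, the following sketch prints a write_splunk stanza with any number of additional Dimension lines. The function name and the example values (server name, port, token, and the `env` and `team` dimensions) are hypothetical. Substitute your own.

```shell
# Print a <Plugin write_splunk> stanza on stdout.
# $1 = receiving server, $2 = HEC port, $3 = HEC token,
# remaining args = extra dimensions in key:value form
render_write_splunk_stanza() {
  server=$1
  port=$2
  token=$3
  shift 3
  printf '<Plugin write_splunk>\n'
  printf 'server "%s"\n' "$server"
  printf 'port "%s"\n' "$port"
  printf 'token "%s"\n' "$token"
  printf 'ssl true\nverifyssl false\n'
  printf 'Dimension "entity_type:nix_host"\n'
  # One Dimension line per extra key:value argument.
  for dim in "$@"; do
    printf 'Dimension "%s"\n' "$dim"
  done
  printf '</Plugin>\n'
}
```

For example, this prints a stanza with two extra dimensions that you can paste into collectd.conf:

$ render_write_splunk_stanza splunk.example.com 8088 "<HEC TOKEN>" env:prod team:web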

6. Start the collectd service

Start collectd on Linux systems:

$ sudo service collectd restart

Start collectd on Solaris systems:

$ sudo svcadm enable cswcollectd

Start collectd on Mac OS X systems:

$ sudo brew services restart collectd
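
After starting the service, you may want to check the collectd log for problems. This sketch uses a hypothetical function name and assumes the logfile plug-in settings from the sample configuration above (`PrintSeverity true`), so that error lines carry an `[error]` tag. The log path is whatever you set in the `File` option.

```shell
# Print the number of error-level lines in a collectd log file.
# $1 = path to the log file
# ASSUMPTION: error lines contain the literal tag "[error]".
count_collectd_log_errors() {
  grep -c '\[error\]' "$1"
}
```

For example, with the log location from the sample configuration:

$ count_collectd_log_errors /etc/collectd/collectd.log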