Splunk® Enterprise

Workload Management


Configure Linux systemd for workload management

Before you can configure workload management on Linux distributions running systemd, you must configure systemd to manage splunkd as a service by creating a unit file that defines a cgroup hierarchy.

[Diagram: the cgroup hierarchy on Linux machines running under systemd, with 80 percent of the available system CPU and memory reserved for splunkd.]

For more information, see cgroups.

You must configure cpu and memory cgroups for workload management on all search heads and indexers.
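
Before continuing, you can verify that the cpu and memory cgroup controllers are available on a host. This is a minimal sketch that assumes the cgroup v1 layout mounted under /sys/fs/cgroup, which is the layout described in this topic:

# Confirm that the cpu and memory cgroup controllers are mounted (cgroup v1 layout).
mount -t cgroup | grep -E 'cpu|memory'

# Both directories exist if the controllers are available.
ls -d /sys/fs/cgroup/cpu /sys/fs/cgroup/memory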

Configure systemd to manage splunkd as a service

There are two ways to configure systemd to manage splunkd as a service:

  • Run the splunk enable boot-start command in systemd-managed mode (see the example below).
  • Create the systemd unit file manually, as described in this topic.

Configuring systemd using enable boot-start requires Splunk Enterprise version 7.2.2 or later.
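
If you use the enable boot-start method, the command looks approximately like the following sketch. The -systemd-managed and -user options are shown as an assumption based on the enable boot-start behavior in version 7.2.2 and later; confirm the exact syntax for your version:

# Stop splunkd, then enable boot-start in systemd-managed mode as the Splunk user.
$SPLUNK_HOME/bin/splunk stop
sudo $SPLUNK_HOME/bin/splunk enable boot-start -systemd-managed 1 -user <username>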

Permissions requirements for systemd

systemd has the following permissions requirements:

  • Non-root users must have super user permissions to manually configure systemd on Linux.
  • Non-root users must have super user permissions to run the start, stop, and restart commands under systemd.

For instructions on how to create a new user with super user permissions, see your Linux documentation. The specific steps can vary depending on the Linux distribution.
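
For example, on many RPM-based distributions you can grant an existing Splunk user sudo rights by adding it to the wheel group. This is a sketch that assumes a user named splunkuser; Debian-based distributions typically use the sudo group instead:

# RHEL/CentOS style: add the user to the wheel group.
sudo usermod -aG wheel splunkuser

# Debian/Ubuntu style: add the user to the sudo group.
sudo usermod -aG sudo splunkuser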

You must use sudo to run systemctl start|stop|restart commands. If you do not use sudo, you are prompted to authenticate as a privileged user. For example:

==== AUTHENTICATING FOR org.freedesktop.systemd1.manage-units ===
Authentication is required to manage system services or units.
Multiple identities can be used for authentication:
 1.  <username_1>
 2.  <username_2>
Choose identity to authenticate as (1-2): 2
Password: 
==== AUTHENTICATION COMPLETE ===
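
To avoid the interactive prompt, run the commands with sudo. The Splunkd.service unit name assumes the default SPLUNK_SERVER_NAME; substitute your own unit name if you changed it:

sudo systemctl start Splunkd.service
sudo systemctl stop Splunkd.service
sudo systemctl restart Splunkd.service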

Configure systemd manually

Follow these steps to configure systemd to manage splunkd as a service:

  1. Confirm that your Linux machine is running systemd. See Is Linux running systemd?
  2. Before you create, delete, or modify the systemd unit file, you must stop splunkd:
    $SPLUNK_HOME/bin/splunk stop
    
  3. If you enabled Splunk software to start at boot using enable boot-start, run disable boot-start to remove both the splunk init script from /etc/init.d and its symbolic links.
    sudo $SPLUNK_HOME/bin/splunk disable boot-start
    
  4. Open the $SPLUNK_HOME/etc/splunk-launch.conf file and note the value of SPLUNK_SERVER_NAME. The default value is Splunkd.
  5. In the /etc/systemd/system directory, create a unit file named <SPLUNK_SERVER_NAME>.service, such as Splunkd.service.

    You can change the SPLUNK_SERVER_NAME to any name you choose by directly editing the splunk-launch.conf file.

  6. Add the following content to the <SPLUNK_SERVER_NAME>.service unit file:
    [Unit]
    After=network.target
    
    [Service]
    Type=simple
    Restart=always
    ExecStart=/home/<username>/splunk/bin/splunk _internal_launch_under_systemd
    LimitNOFILE=65536
    SuccessExitStatus=51 52
    RestartPreventExitStatus=51
    RestartForceExitStatus=52
    KillMode=mixed
    KillSignal=SIGINT
    TimeoutStopSec=10min
    User=<username>
    Delegate=true
    MemoryLimit=100G
    CPUShares=1024
    PermissionsStartOnly=true
    ExecStartPost=/bin/bash -c "chown -R <username>:<username> /sys/fs/cgroup/cpu/system.slice/%n"
    ExecStartPost=/bin/bash -c "chown -R <username>:<username> /sys/fs/cgroup/memory/system.slice/%n"
    
    [Install]
    WantedBy=multi-user.target
    

    Regarding these lines in the unit file:

    ExecStartPost=/bin/bash -c "chown -R <username>:<username> /sys/fs/cgroup/cpu/system.slice/%n"
    ExecStartPost=/bin/bash -c "chown -R <username>:<username> /sys/fs/cgroup/memory/system.slice/%n"
    

    If a group named <username> does not exist on the system, the splunkd service does not start. To work around this issue, manually update the unit file with the correct group name.

    The following unit file properties are set specifically for Splunk workload management:
    Type=simple
    Restart=always
    Delegate=true
    Do not change these values unless you are familiar with systemd or receive guidance from Splunk support.

    Do not use the following unit file properties. These properties can cause splunkd to fail on restart.
    RemainAfterExit=yes
    ExecStop

    For more information, see Systemd unit file properties.

  7. Reload the unit file.
    sudo systemctl daemon-reload
    
  8. Start splunkd as a systemd service.
    sudo systemctl start Splunkd.service
    
  9. Verify that splunkd is running as a systemd service:
    sudo systemctl status <SPLUNK_SERVER_NAME>.service
    

    When you create the splunkd service, systemd creates corresponding CPU and memory cgroups in these locations (see the verification sketch after these steps):

    CPU: /sys/fs/cgroup/cpu/system.slice/<SPLUNK_SERVER_NAME>.service
    Memory: /sys/fs/cgroup/memory/system.slice/<SPLUNK_SERVER_NAME>.service
    
  10. For distributed deployments, repeat steps 1-9 on all search heads and indexers.
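
The following sketch shows one way to confirm the results of steps 8 and 9. It assumes the default Splunkd.service unit name and the cgroup v1 layout described earlier in this topic:

# Check the service state and the main PID.
sudo systemctl status Splunkd.service

# Confirm that systemd created the cpu and memory cgroups for the unit.
ls -d /sys/fs/cgroup/cpu/system.slice/Splunkd.service \
      /sys/fs/cgroup/memory/system.slice/Splunkd.service

# The splunkd process IDs appear in each cgroup.
cat /sys/fs/cgroup/cpu/system.slice/Splunkd.service/cgroup.procs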

systemd unit file properties

The following table lists the unit file properties you must specify to run splunkd as a service under systemd:

Property                    Expected Value
Restart                     always
Type                        simple
ExecStart                   $SPLUNK_HOME/bin/splunk _internal_launch_under_systemd
ExecStartPost               chown -R <USER>:<GROUP of USER> /sys/fs/cgroup/<cpu or memory>/system.slice/%n
Delegate                    true
SuccessExitStatus           51 52
RestartPreventExitStatus    51
RestartForceExitStatus      52
RemainAfterExit             no (default)
MemoryLimit                 Example: 12G
CPUShares                   Example: 8192. (Allowed range is 2 to 262144. Default is 1024.)
User, Group                 <Splunk Owner> <Splunk Group>

For more information on systemd unit file properties, see Service unit configuration.
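
To confirm the values that systemd applied to the running unit, you can query the unit properties directly. This sketch assumes the default Splunkd.service unit name; note that property names can differ across systemd versions (for example, newer releases replace MemoryLimit and CPUShares with MemoryMax and CPUWeight):

# Show selected resource-control and restart properties of the running unit.
systemctl show Splunkd.service -p MemoryLimit -p CPUShares -p Delegate -p Restart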

Manage clusters under systemd

When managing an indexer cluster under systemd:

You must use the sudo command to start, stop, and restart the cluster master or individual peer nodes using systemctl start|stop|restart commands. You do not need sudo to perform a rolling restart using the splunk rolling-restart cluster-peers command, or to take a peer offline using the splunk offline command.

When managing a search head cluster under systemd:

You must use the sudo command to start, stop, and restart cluster members using systemctl start|stop|restart commands. You do not need sudo to perform a rolling restart using the splunk rolling-restart shcluster-members command, or to remove a cluster member using the splunk remove shcluster-member command.
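
For example, on a search head cluster member, the service-level and Splunk-level commands look like this. The sketch assumes the default Splunkd.service unit name, and the rolling restart command is run from the cluster captain:

# Service-level restart of one member: requires sudo under systemd.
sudo systemctl restart Splunkd.service

# Splunk-level rolling restart of all members: does not require sudo.
$SPLUNK_HOME/bin/splunk rolling-restart shcluster-members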

Next step

After you set up cgroups on your Linux operating system, you can configure workload management in Splunk Enterprise. See Configure workload management.


This documentation applies to the following versions of Splunk® Enterprise: 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9


Comments

The following can be added under the [Service] stanza to ensure that THP is always disabled before starting the Splunk service.

ExecStartPre=/bin/sh -c "echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled && echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag"

Kchew
May 30, 2019

Update: MemoryMax isn't a valid option in RHEL 7.6, so sadly it will have to wait a few years for all these slow/extended-support releases to be phased out before Splunk can use it on all installs.

Intermediate
May 30, 2019

It turns out that "MemoryLimit" has been deprecated in newer releases of systemd and replaced by "MemoryMax". The "MemoryMax" variable accepts a percentage of available system memory, so Splunk could just configure it to 100% by default.
I'm not yet clear if this is applicable in the systemd version used in RHEL/CentOS 7 or older distributions like Debian stable.

Reference: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html

Intermediate
May 30, 2019

Why does Splunk set
"MemoryLimit=100G"
on all installs?! That is, it will set 100G on systems with much more and on systems with much less than that amount of RAM. It should be set dynamically, say to at least 80% of available system memory at the time.

We only found this out when we ran into performance problems with indexers that had 500 GiB of RAM and workload pools for ingest.

Intermediate
May 27, 2019
