Splunk® Enterprise

Workload Management

This documentation does not apply to the most recent version of Splunk.

Configure systemd distributions for workload management

Before you can configure workload management on Linux distributions running systemd, you must configure systemd to manage splunkd as a service by creating a unit file that defines a cgroup hierarchy.

[Diagram: the cgroup hierarchy on Linux machines running under systemd. 80 percent of the total system CPU and memory is reserved for splunkd.]

Linux cgroups must be properly configured for workload management on all search heads and indexers.

You must have root access to set up systemd.
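
A quick pre-flight check can catch a missing prerequisite before you edit any unit files. The following is a minimal sketch, not a Splunk tool: the function name missing_cgroup_controllers is hypothetical, and it assumes a cgroup v1 layout in which the cpu and memory controllers appear as directories under /sys/fs/cgroup.

```shell
# Hypothetical pre-flight helper (not part of Splunk).
# Prints the name of each cgroup controller directory that is missing
# under the given cgroup root (default: /sys/fs/cgroup).
# Assumption: cgroup v1 layout with separate cpu and memory hierarchies.
missing_cgroup_controllers() {
    cgroup_root=${1:-/sys/fs/cgroup}
    for controller in cpu memory; do
        [ -d "$cgroup_root/$controller" ] || printf '%s\n' "$controller"
    done
}

# Example use:
#   [ "$(id -u)" -eq 0 ] || echo "root access is required" >&2
#   missing_cgroup_controllers /sys/fs/cgroup
```

Empty output means both controller directories were found. Any controller name that is printed needs attention before you continue.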

Configure cgroups on systemd

Follow these steps to configure cgroups for workload management on systemd:

  1. Confirm that your Linux machine is running systemd. See "Is Linux running systemd?"
  2. Stop splunkd.
    $SPLUNK_HOME/bin/splunk stop
    
  3. If you have previously enabled Splunk software to start at boot using the enable boot-start command, run disable boot-start to remove both the splunk start script located in /etc/init.d and symbolic links.
    [sudo] $SPLUNK_HOME/bin/splunk disable boot-start
    
  4. Open $SPLUNK_HOME/etc/splunk-launch.conf and get the value of SPLUNK_SERVER_NAME. The default value is Splunkd.
  5. In the /etc/systemd/system directory, create a unit file named SPLUNK_SERVER_NAME.service. For example, Splunkd.service.

    You must use the same SPLUNK_SERVER_NAME on all search heads and indexers.

  6. Add the following content to the Splunkd.service unit file.
    [Unit]
    Description=Splunk service
    Wants=network.target
    After=network.target
     
    [Service]
    Restart=always
    Type=simple
    #Replace $SPLUNK_HOME with the absolute path to your Splunk installation, for example /opt/splunk.
    ExecStart=$SPLUNK_HOME/bin/splunk _internal_launch_under_systemd --accept-license --no-prompt --answer-yes
    Delegate=true
    #Splunk defines successful exit codes other than 0 to indicate special exit scenarios which are
    #used by splunk operations like rolling-restart, offline etc.
    SuccessExitStatus=51 52
    RestartPreventExitStatus=51
    RestartForceExitStatus=52
    RemainAfterExit=no
    #On some systemd installations, systemd does not create cgroups for the memory and cpu controllers under system.slice
    #and instead runs the process under the root cgroups. Specifying MemoryLimit and CPUShares forces systemd to create
    #cgroups under system.slice. See the property descriptions in "systemd unit file properties".
    MemoryLimit=100G
    CPUShares=8192
    #If you want to run splunk as root user, comment out the following five lines:
    PermissionsStartOnly=true
    User=splunk
    Group=splunk
    ExecStartPost=/bin/bash -c "chown -R <USER Specified above>:<GROUP of User> /sys/fs/cgroup/cpu/system.slice/%n"
    ExecStartPost=/bin/bash -c "chown -R <USER Specified above>:<GROUP of User> /sys/fs/cgroup/memory/system.slice/%n"
     
    [Install]
    WantedBy=multi-user.target
    

    The following unit file property values are set specifically for Splunk workload management: Restart=always, Type=simple, Delegate=true. Change these values only if you understand the consequences or are directed to do so by Splunk Support.

    Do not set RemainAfterExit=yes in the unit file. Setting this property to yes causes failure of both single-instance restart using Splunk Web and rolling restart in search head cluster and indexer cluster deployments.

    Do not use the ExecStop and KillMode properties in the unit file. These properties can also cause Splunk restart to fail.

    Setting CPUShares to a higher value gives splunkd more CPU relative to other system processes.

    For more information on unit file configuration settings, see Systemd unit file properties.

  7. Reload the unit file.
    systemctl daemon-reload
    
  8. Start splunkd.
    $SPLUNK_HOME/bin/splunk start
    

    This starts splunkd as a systemd service.

    splunkd detects that it is running under systemd and automatically translates splunk start|stop commands to systemctl start|stop commands.

  9. Verify that splunkd is running as a systemd service:
    systemctl status <SPLUNK_SERVER_NAME>.service
    

    Note that you do not need to create your own cgroups. When you create a Splunk service, systemd creates corresponding CPU and Memory cgroups in these locations:

    CPU: /sys/fs/cgroup/cpu/system.slice/<SPLUNK_SERVER_NAME>.service
    Memory: /sys/fs/cgroup/memory/system.slice/<SPLUNK_SERVER_NAME>.service
    
  10. For distributed deployments, repeat steps 1-9 above on all search heads and indexers.
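
The unit file name and the cgroup locations in the steps above all derive from SPLUNK_SERVER_NAME, so they can be computed rather than typed by hand on each machine. The following is a minimal sketch under stated assumptions: the helper names splunk_unit_name and splunk_cgroup_dirs are hypothetical, and the parsing assumes that splunk-launch.conf holds the setting as an uncommented SPLUNK_SERVER_NAME=value line.

```shell
# Hypothetical helpers (not part of Splunk).

# Print the unit file name for the SPLUNK_SERVER_NAME found in the given
# splunk-launch.conf. Falls back to the documented default, Splunkd.
splunk_unit_name() {
    conf_file=$1
    name=""
    if [ -r "$conf_file" ]; then
        # Use the last uncommented SPLUNK_SERVER_NAME=... line and trim
        # whitespace around the value.
        name=$(sed -n 's/^[[:space:]]*SPLUNK_SERVER_NAME[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$conf_file" | tail -n 1)
    fi
    printf '%s.service\n' "${name:-Splunkd}"
}

# Print the CPU and memory cgroup directories that systemd is expected
# to create for the given unit name.
splunk_cgroup_dirs() {
    unit=$1
    printf '/sys/fs/cgroup/cpu/system.slice/%s\n' "$unit"
    printf '/sys/fs/cgroup/memory/system.slice/%s\n' "$unit"
}

# Example use:
#   unit=$(splunk_unit_name "$SPLUNK_HOME/etc/splunk-launch.conf")
#   systemctl status "$unit"
#   splunk_cgroup_dirs "$unit" | while read -r dir; do
#       [ -d "$dir" ] && echo "found $dir" || echo "missing $dir"
#   done
```

Running the same helpers on every search head and indexer is one way to confirm that each machine uses the same SPLUNK_SERVER_NAME.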

    systemd unit file properties

    The following table describes unit file properties that you must specify to run splunkd as a service under systemd.

    Property | Expected Value | Description
    ---------|----------------|------------
    Restart | always | Restarts the splunk service regardless of the reason the service stopped.
    Type | simple | The process configured with ExecStart (shown below) is the main process of the service.
    ExecStart | $SPLUNK_HOME/bin/splunk _internal_launch_under_systemd | Commands with their arguments that are executed when the service is started.
    ExecStartPost | chown -R <USER>:<GROUP of User> /sys/fs/cgroup/<cpu or memory>/system.slice/%n | Additional commands that are executed after the command in ExecStart.
    Delegate | true | Enables delegation of further resource control partitioning to processes of the unit. When enabled, the unit can create and manage its own private sub-hierarchy of control groups below the control group of the unit itself. This gives the Splunk service permission to create and manage workload pools underneath the main service level cgroup.
    SuccessExitStatus | 51 52 | Takes a list of exit status definitions which, when returned by the main service process, will be considered successful termination, in addition to the normal successful exit code 0 and the signals SIGHUP, SIGINT, SIGTERM, and SIGPIPE.
    RestartPreventExitStatus | 51 | Takes a list of exit status definitions which, when returned by the main service process, will prevent automatic service restarts, regardless of the restart setting configured with Restart=. For the Splunk service, this value must be 51.
    RestartForceExitStatus | 52 | Takes a list of exit status definitions which, when returned by the main service process, will force automatic service restarts, regardless of the restart setting configured with Restart=. For the Splunk service, this value must be 52. This value is used for splunk commands such as ./splunk offline.
    RemainAfterExit | no (default) | Set value to no to ensure that systemd notices that processes belonging to the Splunk service have exited and the service is not active. systemd will then proceed to restart the Splunk service. Otherwise Splunk restart might not work as expected.
    MemoryLimit | Example: "12G" | Specifies the limit on maximum memory usage for the executed processes. The limit determines how much process and kernel memory tasks in this unit can use. By default, memory size is specified in bytes. If the value has a suffix of K, M, G or T, the specified memory size is parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. Alternatively, you can specify a relative percentage of the installed physical memory on the system. If assigned the special value "infinity", no memory limit is applied. This controls the "memory.limit_in_bytes" control group attribute.
    CPUShares | Example: "8192" | Assigns the specified CPU time share weight to the processes executed. These options take an integer value and control the "cpu.shares" control group attribute. The allowed range is 2 to 262144. Default is 1024.
    User, Group | <Splunk Owner> <Splunk Group> | Sets the Linux user or group that the processes are executed as, respectively. Takes a single user or group name, or a numeric ID as argument. Remove the User and Group properties from the unit file to run as root.
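
The MemoryLimit suffix rule described above (K, M, G, or T with base 1024) can be checked with a little arithmetic. The following is a rough sketch, not systemd's own parser: the function name memory_limit_to_bytes is hypothetical, and it handles only plain byte counts and the four uppercase suffixes. Percentages and "infinity" are rejected with a non-zero exit status.

```shell
# Hypothetical helper: convert a MemoryLimit-style value to bytes.
# Supports a plain byte count or a K, M, G, or T suffix (base 1024).
memory_limit_to_bytes() {
    value=$1
    number=${value%[KMGT]}        # strip one trailing suffix letter, if any
    suffix=${value#"$number"}     # whatever was stripped (may be empty)
    case $number in
        ''|*[!0-9]*) return 1 ;;  # reject empty or non-numeric input
    esac
    case $suffix in
        '') factor=1 ;;
        K)  factor=1024 ;;
        M)  factor=$((1024 * 1024)) ;;
        G)  factor=$((1024 * 1024 * 1024)) ;;
        T)  factor=$((1024 * 1024 * 1024 * 1024)) ;;
    esac
    echo $((number * factor))
}

# Example use, comparing against the value the kernel reports
# (the path assumes the default Splunkd.service unit name):
#   memory_limit_to_bytes 100G
#   cat /sys/fs/cgroup/memory/system.slice/Splunkd.service/memory.limit_in_bytes
```

For example, 12G works out to 12 × 1024³ = 12884901888 bytes.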

    For more information on systemd unit file properties, see Service unit configuration.
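
CPUShares is a relative weight, so its effect depends on what else is competing for CPU. The following simplified sketch estimates the share splunkd receives when one other cgroup at the same level is also busy. The function name cpu_share_percent is hypothetical, and the model assumes exactly two competing sibling cgroups under full contention. When the system is idle, cpu.shares does not cap usage.

```shell
# Hypothetical helper: approximate percentage of CPU time that a cgroup
# with weight $1 receives when competing with one sibling of weight $2.
# The result is truncated to a whole number by integer division.
cpu_share_percent() {
    splunk_shares=$1
    other_shares=$2
    echo $((splunk_shares * 100 / (splunk_shares + other_shares)))
}

# Example use: the unit file value (8192) against a service left at the
# default weight (1024).
#   cpu_share_percent 8192 1024
```

With the values from the example unit file, 8192 against the default of 1024 gives splunkd roughly 88 percent of contended CPU time under this model.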

    Next step

    After you set up cgroups on your Linux operating system, you can configure workload management in Splunk Enterprise. For more information, see Configure workload management.

This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1


Comments

Provided systemd unit does not perform graceful shutdown of splunkd.
Sometimes splunkd is being killed on unit stop and sometimes not.
Service displays correct status, only after splunkd was killed by systemd.
Service displays failed state after splunkd graceful shutdown.

[root@spl ~]# systemctl status Splunkd -l
...
Active: inactive (dead)
...
Main PID: 27500 (code=killed, signal=TERM)
...
Dec 20 00:48:57 spl splunk[27500]: Dying on signal #15 (si_code=0), sent by PID 1 (UID 0)
Dec 20 00:48:58 spl systemd[1]: Stopped Splunk service.

[root@spl ~]# systemctl status Splunkd -l
...
Active: failed (Result: exit-code) ...
...
Main PID: 30353 (code=exited, status=8)
...
Dec 20 00:52:03 spl systemd[1]: Splunkd.service: main process exited, code=exited, status=8/n/a
Dec 20 00:52:04 spl systemd[1]: Stopped Splunk service.
Dec 20 00:52:04 spl systemd[1]: Unit Splunkd.service entered failed state.
Dec 20 00:52:04 spl systemd[1]: Splunkd.service failed.

David

Davidkachan
December 19, 2018

For SLES12, possibly others: If you run as non-root, you'll lose the ability to issue any of the splunk controlling commands: start, stop, restart. Each call will be redirected to systemctl, which requires root password. Even running with `sudo splunk restart` still prompts for root password (after sudo auth).

Twinspop
December 17, 2018

Hi Nickhillscpl,

Thanks for bringing this to our attention! Yes, the line you pointed out was not worded correctly. I've updated it to read: "#If you want to run splunk as root user, comment out the following five lines:"

Thanks again.

Sroback splunk, Splunker
October 16, 2018

In the example service, it says:
"#If you want to run splunk as non-root user, comment the following five lines."

I think this is wrong: Should it not say "un-comment" - in which case the 5 lines following should be prefixed with #.

-or-

The line should read: "#If you want to run splunk as the *root* user, comment *out* the following five lines."

As it stands the example is confusing, and badly worded.

Nickhillscpl
October 16, 2018
