
Transparent huge memory pages and Splunk performance
Some distributions of Linux (for example, Red Hat, CentOS, and Ubuntu) have an advanced memory management scheme called Transparent Huge Pages (THP). THP acts as an abstraction layer that lets the memory management units (MMUs) in a Linux host work with huge memory pages. With THP, this work occurs without specific action on behalf of the administrator or the software that runs on the host.
Every CPU in a modern host has an MMU. The MMU manages memory in pages, and huge pages are structures that let MMUs manage multiple gigabytes and terabytes of memory more efficiently.
THP has been associated with degradation of Splunk Enterprise performance in at least some Linux kernel versions (for example, the 2.6.32 kernel used in Red Hat Enterprise Linux 6). When enabled, THP can significantly degrade overall system performance on systems that run Splunk Enterprise because of several issues:
- The implementation is too aggressive at coalescing memory pages for short-lived processes (such as many Splunk searches).
- It can prevent the jemalloc memory allocation implementation from releasing memory back to the operating system after use. The jemalloc implementation is a more scalable version of the malloc implementation and has been used in newer distributions of Linux.
- For some workloads, it can cause I/O regressions surrounding swapping of huge pages.
On systems with THP enabled, Splunk has observed a degradation of at least 30% in indexing and search performance, with a similar percentage increase in latency. For this reason, Splunk recommends that you disable THP in your Linux system configuration unless that system runs an application that requires THP.
In future versions of Linux, kernel engineers might find ways to improve the behavior of THP for the Splunk workload.
See the following pages for additional information about THP and how to disable it. Specific steps on disabling the feature are not shown here because the procedure differs on each distribution.
- "How do I disable Transparent Huge Pages (THP) and confirm that it is disabled?" (http://answers.splunk.com/answers/188875/how-do-i-disable-transparent-huge-pages-thp-and-co.html) on Splunk Answers.
- "Performance Issues with Transparent Huge Pages" (https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge)
- "Update on Hugepages" (https://dbakerber.wordpress.com/2015/03/11/update-on-hugepages-rewrite-to-fix-formatting-issues/) on Andrew Kerber's Oracle DBA Weblog.
- Your Linux distribution documentation on memory management.
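Whichever procedure applies to your distribution, you can confirm the result by reading the THP status file in sysfs, where the active mode is the value shown in square brackets (for example, "always madvise [never]"). The following is a minimal sketch, not a Splunk-provided tool. It assumes the mainline kernel path /sys/kernel/mm/transparent_hugepage/enabled and falls back to the redhat_transparent_hugepage path mentioned in the comments on this page. The function names are hypothetical.

```python
from pathlib import Path

# Assumed sysfs locations: the mainline kernel path first, then the
# Red Hat Enterprise Linux 6 path mentioned in the comments on this page.
CANDIDATE_PATHS = (
    "/sys/kernel/mm/transparent_hugepage/enabled",
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
)


def active_thp_mode(text):
    """Return the bracketed token, e.g. 'never' from 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_thp_mode(paths=CANDIDATE_PATHS):
    """Return the active mode from the first readable path, or None."""
    for path in paths:
        try:
            return active_thp_mode(Path(path).read_text())
        except OSError:
            # File missing or unreadable on this host; try the next path.
            continue
    return None


if __name__ == "__main__":
    mode = read_thp_mode()
    if mode is None:
        print("THP status file not found or unreadable")
    else:
        print(f"THP mode: {mode}")
```

A result of "never" indicates that THP is disabled. A result of "always" or "madvise" indicates that THP is still active in some form.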
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 8.0.0
Comments
Can Splunk Engineering please confirm that this is still required in 2019 with RHEL/CentOS 7.6 with kernel 3.10+ and RHEL 8 which has a newer kernel?
Quoting from Ckurtz back in January 2017:
"
AFAIK, the severe performance bug was fixed in later Linux kernels (circa kernel 4.x+, but not back-ported to rhel/centos 7.2) so that madvise is the default, and is reasonably safe to use. However, if the box is performing double duty, cycles will be spent to manage and defrag any use of THP outside of the Splunk process. Since we require dedicated ownership/use of the system, enabling THP for "alien" processes is not recommended.
"
What about the bug (SPL-133708)? Is that not yet fixed?
Anil
June 30, 2017
Here's another way to disable THP with tuned: https://kb.informatica.com/solution/23/Documents/Disable%20Transparent%20Huehpages%20on%20Linux%207.pdf
Splunk Engineering (via a Support Case) just confirmed that disabling THP is still required in RHEL 7. Engineering's comment was:
"AFAIK, the severe performance bug was fixed in later Linux kernels (circa kernel 4.x+, but not back-ported to rhel/centos 7.2) so that madvise is the default, and is reasonably safe to use. However, if the box is performing double duty, cycles will be spent to manage and defrag any use of THP outside of the Splunk process. Since we require dedicated ownership/use of the system, enabling THP for "alien" processes is not recommended."
I've given the Docs Team more details so they can update the page.
Thanks for reporting, Lewis! I filed a bug (SPL-133708).
I believe there is a problem with the script that checks to see if THP is disabled:
From https://access.redhat.com/solutions/46111
-----
/sys/kernel/mm/redhat_transparent_hugepage/defrag
/sys/kernel/mm/redhat_transparent_hugepage/enabled
NOTE: Some third party application install scripts check value of above files and complain even if THP is disabled at boot time using transparent_hugepage=never, this is due to the fact when THP is disabled at boot time, the value of /sys/kernel/mm/redhat_transparent_hugepage/defrag will not be changed, however this is expected and system will never go in THP defragmentation code path when it is disabled at boot and THP defrag need not to be disabled separately.
-----
A warning should not be raised in the Health Check if THP is disabled at boot time even if the defrag file is set to "always". Currently the Health Check is raising false warnings...
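The logic this comment argues for can be sketched as follows. This is a hypothetical illustration only, not the actual Health Check code: the function names are invented, and the inputs are assumed to be the raw contents of the enabled and defrag files, where the active value appears in square brackets.

```python
def active_value(text):
    """Return the bracketed token from a THP sysfs file, or None."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def thp_warnings(enabled_text, defrag_text):
    """Hypothetical check: ignore defrag entirely when THP itself is off."""
    if active_value(enabled_text) == "never":
        # THP is disabled, so the defrag setting has no effect.
        return []
    warnings = ["THP is enabled"]
    if active_value(defrag_text) != "never":
        warnings.append("THP defrag is enabled")
    return warnings
```

With enabled set to "never" and defrag still showing "always", this sketch returns no warnings, which matches the behavior described in the Red Hat note.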
Question: This article seems to apply to Linux kernel version 2.6.32, which is RHEL 6.
Does this THP issue also apply to RHEL 7.2 with kernel 3.10.0-327.36.1.el7.x86_64?
This is a recommendation for all servers that run Splunk Enterprise.
We specifically did not include detailed instructions because we want customers to proactively ensure that they have the most up-to-date information on disabling the setting, depending on the distribution they have.
Is this recommendation directed only at hosts performing indexing? Or is this best practice on search heads too?
To disable for RHEL/CentOS: echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
Or try with grub, but that may be broken: see https://access.redhat.com/site/solutions/422283
Hi Intermediate,
As we still support Linux kernel version 2.6, this notice still applies.