HTTP thread limit issues
When you run Splunk Enterprise in a way that uses many HTTP connections for Representational State Transfer (REST) operations (for example, as a deployment server in a large distributed environment), you might encounter undesirable behavior, including errors in splunkd.log like the following:
03-19-2015 14:36:10.971 -0500 WARN HttpListener - Can't handle request for /services/broker/connect/8D0E0E2C-8EB5-40D2-9E8A-083F8E9B2516/ISP1065C/241655/windows-x64/8089, max thread limit for REST HTTP server is 6008, threads already in use is 6008
This error occurs because Splunk Enterprise 6.0 and later limits the number of REST HTTP connections that an instance can use, to prevent service failure caused by resource exhaustion.
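To confirm that an instance is hitting this limit, you can scan splunkd.log for the warning and pull out the two numbers it reports. The following is a minimal sketch, not a Splunk tool: the function name parse_thread_limit_warning is hypothetical, and the pattern is based only on the sample line above, so the wording may differ between versions.

```python
import re

# Pattern built from the sample HttpListener warning shown above.
# Assumption: other Splunk versions may word this message differently.
THREAD_LIMIT_RE = re.compile(
    r"max thread limit for REST HTTP server is (?P<limit>\d+), "
    r"threads already in use is (?P<in_use>\d+)"
)


def parse_thread_limit_warning(line):
    """Return (limit, in_use) for a thread-limit warning line, else None."""
    if "HttpListener" not in line:
        return None
    match = THREAD_LIMIT_RE.search(line)
    if match is None:
        return None
    return int(match.group("limit")), int(match.group("in_use"))


sample = (
    "03-19-2015 14:36:10.971 -0500 WARN HttpListener - Can't handle request for "
    "/services/broker/connect/8D0E0E2C-8EB5-40D2-9E8A-083F8E9B2516/ISP1065C/241655/"
    "windows-x64/8089, max thread limit for REST HTTP server is 6008, "
    "threads already in use is 6008"
)
print(parse_thread_limit_warning(sample))  # (6008, 6008)
```

When both numbers are equal, as in the sample, every REST HTTP thread is already in use.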
How Splunk Enterprise calculates threads and sockets for REST HTTP operations
Splunk Enterprise needs threads and file descriptors to perform REST HTTP operations. Threads let the processes perform tasks, and sockets let the processes communicate with the network. If Splunk Enterprise runs out of either HTTP sockets or threads, it can't complete REST calls to its backend and any such calls fail. Splunk Enterprise thus reserves threads and file descriptors to use for these services.
Splunk Enterprise uses the following formulas to compute the thread limit for HTTP REST:
MAX_THREADS = MAX_RAM / (256K * sizeof(void *))
MAX_HTTP_REST_THREADS = MAX_THREADS / 3
When it starts, Splunk Enterprise determines the amount of available memory in the host, in bytes. By default, it divides this number by the default stack size and pointer size to get the total number of available threads. It then divides the result by three. This final number is the number of threads available for REST HTTP operations.
- The way Splunk Enterprise calculates the initial MAX_RAM value varies from platform to platform. For example, on a CentOS system, you can retrieve the total memory with the following command:
cat /proc/meminfo | grep MemTotal
The command reports the value in kB. Multiply the result by 1024 to get the MAX_RAM value in bytes.
- 256K (=262144 bytes) is the default stack size
- sizeof(void *) is the size of a pointer, which is hardware and compiler specific and is not easy to retrieve except programmatically. Typically, this value is 8 on a 64-bit system, because it takes 8 bytes (64 bits) to store a memory address. On a 32-bit system, this value is typically 4.
- The calculated MAX_THREADS value must be less than the value of "ulimit -u" and it must be greater than 20 but less than 150000, which is the hard limit.
For example, on a 64-bit CentOS system with 8GB RAM (results rounded to the nearest integer):
MAX_THREADS = 8251723776 / (262144 * 8) = 3935
MAX_HTTP_REST_THREADS = 3935 / 3 = 1312
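The calculation above can be reproduced with a short script. This is a sketch of the documented formulas only, not Splunk's actual implementation: the function names are hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the figures in the worked example. The text does not say what Splunk Enterprise does when the value falls outside the documented bounds, so the sketch only reports whether it is in range.

```python
STACK_SIZE = 256 * 1024   # default stack size from the text (262144 bytes)
MIN_THREADS = 20          # MAX_THREADS must be greater than this
HARD_LIMIT = 150000       # MAX_THREADS must be less than this


def estimate_thread_limits(max_ram_bytes, pointer_size=8):
    """Return (MAX_THREADS, MAX_HTTP_REST_THREADS) per the documented formulas.

    Assumption: results are rounded to the nearest integer, which matches
    the worked example; the real rounding behavior is not documented here.
    """
    max_threads = round(max_ram_bytes / (STACK_SIZE * pointer_size))
    max_rest_threads = round(max_threads / 3)
    return max_threads, max_rest_threads


def within_documented_bounds(max_threads, max_user_processes):
    """Check MAX_THREADS against `ulimit -u` and the 20 / 150000 bounds."""
    return (MIN_THREADS < max_threads < HARD_LIMIT
            and max_threads < max_user_processes)


# 64-bit host with MemTotal * 1024 = 8251723776 bytes, as in the example.
print(estimate_thread_limits(8251723776))  # (3935, 1312)
```

On a 32-bit host, pass pointer_size=4 instead of relying on the default.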
Splunk Enterprise then checks the number of file descriptors available on the system, as configured by the ulimit command, and divides that number by three. The result is the number of file descriptors available for sockets for REST HTTP operations. For example, if the limit on open file descriptors is 36000, then Splunk Enterprise reserves 12000 for sockets for REST HTTP operations.
The number of available file descriptors is separate from the number of threads. Both a thread and a file descriptor must be available before Splunk Enterprise can make a REST call.
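The file descriptor side of the calculation can be sketched in the same way. The function names below are hypothetical, the resource module is available on Unix-like systems only, and reading the soft limit of the current process is an assumption: the value that matters is the limit in effect for the user that runs Splunk Enterprise.

```python
import resource  # standard library, Unix-like systems only


def rest_socket_budget(open_file_limit):
    """One third of the file descriptor limit, as described in the text."""
    return open_file_limit // 3


def current_open_file_limit():
    """Soft limit on open files for this process (what `ulimit -n` reports)."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft_limit


print(rest_socket_budget(36000))  # 12000, matching the example in the text

limit = current_open_file_limit()
if limit == resource.RLIM_INFINITY:
    print("open file limit is unlimited")
else:
    print(limit, rest_socket_budget(limit))
```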
Override automatic socket and thread configuration
You can override this automatic configuration by making changes to server.conf. Increasing the number of threads can increase the amount of memory that the Splunk Enterprise instance uses.
- In $SPLUNK_HOME\etc\system\local, create or edit server.conf.
- In the [httpServer] stanza, set the maxThreads attribute to specify the number of threads that Splunk Enterprise should use for REST HTTP operations.
- Set the maxSockets attribute to specify the number of sockets that should be available for REST HTTP operations.
- Save the file.
- Restart Splunk Enterprise. The changes should take effect immediately.
The following example sets the number of HTTP threads to 100000 and the number of sockets to 50000:
[httpServer]
maxThreads=100000
maxSockets=50000
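If you manage several instances, you might generate this stanza from a script rather than type it by hand. The following sketch only renders the text of the stanza. The function name render_http_server_stanza is hypothetical, and the sketch does not check the values against the limits that Splunk Enterprise accepts.

```python
def render_http_server_stanza(max_threads, max_sockets):
    """Render an [httpServer] stanza with the two attributes described above."""
    for name, value in (("maxThreads", max_threads), ("maxSockets", max_sockets)):
        # Reject non-integers (and bools, which are a subclass of int).
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError(f"{name} must be an integer, got {value!r}")
    return (
        "[httpServer]\n"
        f"maxThreads={max_threads}\n"
        f"maxSockets={max_sockets}\n"
    )


# Values taken from the example in the text.
print(render_http_server_stanza(100000, 50000), end="")
```

Merge the output into the existing server.conf rather than overwriting the file, so that other stanzas stay in place.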
This documentation applies to the following versions of Splunk® Enterprise: 6.5.7, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5