Collect pstacks
Support might ask you to gather thread call stacks with pstack or eu-stack, for example if your deployment experiences:
- unexplained high CPU, along with identified threads using high CPU,
- a frozen Splunk instance that does nothing when it clearly should be working, or
- unexplainably slow behavior in splunkd (that is, not limited by disk or CPU).
On *nix
Find or install pstack
Pstack is available by default on Red Hat and CentOS Linux, and on Solaris. It can be installed on several other flavors of Linux.
Test whether pstack is installed:
which pstack
/usr/bin/pstack
If you get an error message instead of a location, you might still be able to install pstack. On RHEL and its derivatives (CentOS, Oracle Linux, and so on), pstack is part of the gdb package.
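The availability check above can be extended to the other tools this page mentions. The following is a minimal sketch, not an official Splunk script: the function name find_stack_tool is hypothetical, and the order in which tools are tried is an assumption. Finding pstack on the PATH does not guarantee that it supports threads, so still inspect its output.

```shell
#!/bin/sh
# find_stack_tool NAME...: print the first of the named tools that is on
# the PATH and return 0, or return 1 if none of them is installed.
# (Hypothetical helper for illustration.)
find_stack_tool() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Try the tools named on this page, in an assumed order of preference.
tool=$(find_stack_tool pstack gstack eu-stack gdb) || tool=""
if [ -n "$tool" ]; then
    echo "Found stack tool: $tool"
else
    echo "No stack tool found. On RHEL and its derivatives, install the gdb package."
fi
```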
Error on Linux from pstack: no symbols
On Linux flavors that aren't based on RHEL, pstack might be useless for troubleshooting because it does not support threads.
If you get output from pstack such as:
29175: splunkd -p 8089 start
(No symbols found)
0x7fd3740e96d9: ???? (100, 0, 7fffa6befd00, 100000010, 25bb080, ffffffff00000010) + ffff8001594106da
Then you probably have the x86-64-specific pstack binary, which is less capable than the Red Hat gdb-based one because it does not understand POSIX threaded applications. Ensure that the gdb package is installed, and try the gstack command as a substitute for pstack. gstack is available on Ubuntu, for example. If gstack is not available, a very bare-bones gstack is provided here:
pid=$1
echo 'thread apply all bt' | gdb --quiet -nx /proc/$pid/exe $pid
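The "(No symbols found)" symptom described above can also be detected automatically. This is a minimal sketch under stated assumptions: the function name pstack_output_usable is hypothetical, and it only looks for the marker string shown in the sample output, so treat a passing result as a hint rather than proof that the stacks are complete.

```shell
#!/bin/sh
# pstack_output_usable: read pstack output on stdin and succeed unless it
# contains the "No symbols found" marker. (Hypothetical helper.)
pstack_output_usable() {
    ! grep -q 'No symbols found'
}

# Example usage, assuming $pid holds the splunkd process ID:
# if pstack "$pid" | pstack_output_usable; then
#     echo "pstack output looks usable"
# else
#     echo "pstack lacks thread support here; try gstack instead"
# fi
```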
gdb
Installable on nearly any Unix.
# ps aux |grep splunkd
root 31038 0.5 0.6 245292 104884 ? Sl Sep07 66:45 splunkd -p 17011 restart
root 31039 0.0 0.0 47012 7076 ? Ss Sep07 4:47 splunkd -p 17011 restart
# gdb -p 31038        #this will freeze splunk temporarily
... lots of output you don't care about ...
(gdb)                 <-this is the prompt
(gdb) thread apply all bt
<... interesting output here...>
(gdb) quit            # important! otherwise splunk is frozen forever
#
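Because an interactive gdb session keeps splunkd frozen until you quit, it can be safer to run gdb non-interactively so that it detaches on its own. The following is a minimal sketch rather than a Splunk-provided tool: the function name dump_stacks and the output path are hypothetical. It relies on gdb's standard -batch, -nx, -ex, and -p options.

```shell
#!/bin/sh
# dump_stacks PID: print backtraces for every thread of PID, then exit.
# (Hypothetical helper for illustration.)
dump_stacks() {
    # Refuse anything that is not a plain decimal process ID.
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: dump_stacks PID" >&2
            return 2
            ;;
    esac
    # -batch: run the -ex commands and exit, which detaches from the process
    # -nx:    do not read any .gdbinit files
    gdb -batch -nx -ex 'thread apply all bt' -p "$1"
}

# Example, using the splunkd PID from the ps output above:
# dump_stacks 31038 > /tmp/splunkd-stacks.out
```

The process is still paused while gdb collects the backtraces, but it resumes as soon as gdb exits.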
Run pstack
To run pstack from the *nix command line:
# ps aux |grep splunkd
root 31038 0.5 0.6 245292 104884 ? Sl Sep07 66:45 splunkd -p 17011 restart
root 31039 0.0 0.0 47012 7076 ? Ss Sep07 4:47 splunkd -p 17011 restart
# pstack 31038
<... output here ...>
It is usually beneficial to get multiple pstacks separated by 1 second. Here is an example of getting 100 pstacks separated by 1 second and storing them in /tmp. It assumes that the shell variable splunkd_pid already holds the process ID of splunkd:
i=0; while [ $i -lt 100 ] ; do date > /tmp/pstack$i.out; pstack $splunkd_pid >> /tmp/pstack$i.out; let "i+=1"; sleep 1; done
Note that this script requires bash (let is not a portable expression).
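If bash is not available, the same loop can be written with POSIX arithmetic expansion instead of let. This is a minimal sketch under stated assumptions: the function name collect_stacks, its argument order, and the STACK_CMD override variable are all hypothetical. The defaults mirror the example above (100 samples, 1 second apart, written to /tmp).

```shell
#!/bin/sh
# collect_stacks PID [COUNT] [INTERVAL] [OUTDIR]
# Write COUNT timestamped stack dumps of PID to OUTDIR/pstackN.out,
# sleeping INTERVAL seconds between samples. (Hypothetical helper.)
collect_stacks() {
    pid=$1
    count=${2:-100}
    interval=${3:-1}
    outdir=${4:-/tmp}
    # STACK_CMD is an assumed override so gstack or eu-stack can be swapped in.
    stack_cmd=${STACK_CMD:-pstack}
    i=0
    while [ "$i" -lt "$count" ]; do
        out="$outdir/pstack$i.out"
        date > "$out"                        # timestamp first, as in the bash loop
        "$stack_cmd" "$pid" >> "$out" 2>&1   # then the stack dump
        i=$((i + 1))                         # POSIX arithmetic, no 'let'
        sleep "$interval"
    done
}

# Example, assuming splunkd_pid is already set:
# collect_stacks "$splunkd_pid"
```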
This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14