Splunk® Enterprise

Troubleshooting Manual

Collect pstacks

Support might ask you to gather thread call stacks with pstack, for example if your deployment experiences:

  • unexplained high CPU usage, with specific threads identified as the heavy consumers,
  • a frozen Splunk instance that does nothing when it clearly should be working, or
  • unexplained slow behavior in splunkd (that is, slowness not caused by disk or CPU limits).

On *nix

Find or install pstack

Pstack is available by default on Red Hat and CentOS Linux and on Solaris. It can be installed on several other flavors of Linux.

Test whether pstack is installed:

 which pstack
/usr/bin/pstack

If you get an error message instead of a location, you might still be able to install pstack. On RHEL and its derivatives (CentOS, Oracle Linux, and so on), pstack is part of the gdb package.
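
The lookup above can also be scripted so that it reports whichever stack tool is present. The following is a minimal sketch, not something Splunk ships: the function name find_stack_tool is hypothetical, and the candidate list (pstack, then gstack) is taken from this topic.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first stack tool found on PATH.
find_stack_tool() {
    for tool in pstack gstack; do
        if command -v "$tool" >/dev/null 2>&1; then
            command -v "$tool"
            return 0
        fi
    done
    # Nothing found: point at the package named in this topic.
    echo "neither pstack nor gstack found; on RHEL-based systems, check the gdb package" >&2
    return 1
}
```

Calling find_stack_tool prints a path such as /usr/bin/pstack when a tool is installed, and returns a nonzero status otherwise.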

Error on Linux from pstack: no symbols

On Linux flavors that aren't based on RHEL, pstack might be useless for troubleshooting because it does not support threads.

If you get output from pstack such as:

29175: splunkd -p 8089 start
(No symbols found)
0x7fd3740e96d9: ???? (100, 0, 7fffa6befd00, 100000010, 25bb080, ffffffff00000010) + ffff8001594106da 

Then you probably have the x86-64-specific pstack binary, which is less capable than the Red Hat gdb-based one because it does not understand POSIX threaded applications. Ensure that the gdb package is installed, and try the gstack command as a substitute for pstack. gstack is available on Ubuntu, for example. If gstack is not available, a very bare-bones gstack is provided here:

#!/bin/sh
# Bare-bones gstack: print a backtrace for every thread of the given PID.
pid=$1
echo 'thread apply all bt' | gdb --quiet -nx "/proc/$pid/exe" "$pid"
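
To decide automatically whether a captured dump is usable, one option is to look for the marker text from the sample output above. This is a minimal sketch under one assumption: that the limited pstack variant prints the literal text (No symbols found). The function name stack_output_usable is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the stack dump on stdin looks usable.
# Assumption: the limited pstack prints the literal "(No symbols found)".
stack_output_usable() {
    dump=$(cat)
    # An empty capture is not usable either.
    [ -n "$dump" ] || return 1
    case $dump in
        *'(No symbols found)'*) return 1 ;;   # retry with gstack or gdb
    esac
    return 0
}
```

For example, pstack 29175 | stack_output_usable returns a nonzero status for the output shown above, which is the cue to switch to gstack.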

gdb

gdb is installable on nearly any Unix. Attach to splunkd, dump the backtrace of every thread, then quit so that Splunk resumes:

# ps aux |grep splunkd
root     31038  0.5  0.6 245292 104884 ?       Sl   Sep07  66:45 splunkd -p 17011 restart
root     31039  0.0  0.0  47012  7076 ?        Ss   Sep07   4:47 splunkd -p 17011 restart
# gdb -p 31038  # this freezes Splunk while gdb is attached
... lots of output you don't care about ...
(gdb)   <- this is the gdb prompt
(gdb) thread apply all bt
<... interesting output here ...>
(gdb) quit  # important! Splunk stays frozen until you quit gdb
#
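
To reduce the risk of leaving splunkd stopped at a forgotten (gdb) prompt, the same backtrace can be collected without an interactive session. This sketch relies on gdb's standard -p, -batch, and -ex options. The function name gdb_dump_threads and the overridable GDB variable are hypothetical conveniences, not part of Splunk or gdb.

```shell
#!/bin/sh
# Hypothetical wrapper: attach to PID, dump all thread backtraces, then exit.
# GDB can be overridden (for example, for a dry run); it defaults to gdb.
gdb_dump_threads() {
    pid=$1
    case $pid in
        ''|*[!0-9]*) echo "usage: gdb_dump_threads PID" >&2; return 2 ;;
    esac
    # -batch makes gdb exit after running the -ex command; gdb detaches
    # from the attached process when it exits.
    "${GDB:-gdb}" -p "$pid" -batch -ex 'thread apply all bt'
}
```

With the PID from the listing above, gdb_dump_threads 31038 > /tmp/splunkd_bt.out would capture one dump to a file. The output path is only an example.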

Run pstack

To run pstack from the *nix command line:

# ps aux |grep splunkd
root     31038  0.5  0.6 245292 104884 ?       Sl   Sep07  66:45 splunkd -p 17011 restart
root     31039  0.0  0.0  47012  7076 ?        Ss   Sep07   4:47 splunkd -p 17011 restart
# pstack 31038
<... output here ...>
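
When ps lists more than one splunkd process, as above, picking the PID can be scripted. This is a minimal sketch that assumes the process worth sampling is the multi-threaded one, as in the sample listing where 31038 has STAT Sl. On Linux, ps marks multi-threaded processes with a lowercase l in the STAT column. The function name main_splunkd_pids is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read `ps aux` output on stdin and print the PIDs of
# multi-threaded splunkd processes.
# Columns in `ps aux`: $2 = PID, $8 = STAT, $11 = first word of COMMAND.
main_splunkd_pids() {
    awk '$11 ~ /(^|\/)splunkd$/ && $8 ~ /l/ { print $2 }'
}
```

Run it as ps aux | main_splunkd_pids. Check the result against the full ps listing before you attach any tool to the process.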

A single pstack is only a snapshot, so it is usually beneficial to get multiple pstacks separated by 1 second and compare them. Here is an example of getting 100 pstacks separated by 1 second and storing them in /tmp:

splunkd_pid=31038  # replace with the PID of your splunkd process
i=0; while [ $i -lt 100 ]; do date > /tmp/pstack$i.out; pstack $splunkd_pid >> /tmp/pstack$i.out; i=$((i+1)); sleep 1; done

This loop uses POSIX arithmetic expansion rather than the bash-only let, so it runs in any POSIX shell.
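
For repeated use, the one-line loop above can be turned into a function with the count, interval, and output directory as parameters. This is a minimal sketch: the function name collect_stacks and the STACK_TOOL variable are hypothetical, and the pstackN.out file naming follows the loop in this topic.

```shell
#!/bin/sh
# Hypothetical helper: take COUNT stack dumps of PID, INTERVAL seconds apart.
# Usage: collect_stacks PID [COUNT] [INTERVAL] [OUTDIR]
# STACK_TOOL selects the dump command and defaults to pstack.
collect_stacks() {
    pid=$1 count=${2:-100} interval=${3:-1} outdir=${4:-/tmp}
    i=0
    while [ "$i" -lt "$count" ]; do
        out="$outdir/pstack$i.out"
        date > "$out"                                  # timestamp goes first
        "${STACK_TOOL:-pstack}" "$pid" >> "$out" 2>&1  # then the stack dump
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        [ "$i" -lt "$count" ] && sleep "$interval"
    done
    return 0
}
```

For example, STACK_TOOL=gstack followed by collect_stacks 31038 100 1 /tmp collects the same 100 samples with gstack instead of pstack.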

On Windows

On Windows, you can also gather many stack dumps at once, as on *nix. For instructions, see:

http://wiki.splunk.com/Community:GatherWindowsStacks

This documentation applies to the following versions of Splunk® Enterprise: 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.1.3


Comments

It took me a while to figure out that the leading % in the documented loop was a shell prompt, not part of the command. Here are my changes:

Removed the % at the start
Replaced $splunkd_pid with just the PID, without a leading $
Added an echo, since the loop was silent ... echo $i

Here's mine:

i=0; while [ $i -lt 100 ]; do date > /opt/gstack_output/gstack$i.out; echo $i; gstack 21868 >> /opt/gstack_output/gstack$i.out; let "i+=1"; sleep 1; done

Rewritex
April 5, 2017
