Intro to troubleshooting Splunk Enterprise
This topic is intended as a first step in either diagnosing your Splunk Enterprise problem yourself or asking for help.
Narrow down the problem
Start by isolating the smallest case that reproduces the problem. For example, if the error occurs in a dashboard or alert, run the underlying search on its own first to see whether the error appears there. When troubleshooting searches, it's almost always best to remove the dashboard layer as soon as possible.
As another example, does the problem exist in one app but not in another? For one user but not for admins?
Basically, is there any case in which this does work?
Did the error start occurring after the product was functioning normally?
Yes! So what has changed? Remember to think of both Splunk and non-Splunk factors. Was there a server outage? Network problems? Has any configuration or topology changed?
No, it never functioned normally. Check the operating environment and installation. Start with the system requirements in the Installation Manual.
Resources to help you
Splunk has configuration files in several locations, with rules about which files take precedence over each other. Use btool to check which settings your Splunk instance is using. Read about btool in this manual.
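As a rough sketch of the btool check described above, the following Python helper builds and runs `splunk cmd btool <conf> list --debug`, which lists the merged settings along with the file each one comes from. The function names and the `/opt/splunk` fallback path are assumptions for illustration only; adjust them to your own installation.

```python
import os
import subprocess


def btool_command(conf_name, splunk_home=None):
    """Build the argument list for `splunk cmd btool <conf_name> list --debug`."""
    # Assumption: SPLUNK_HOME is set, or Splunk lives in /opt/splunk.
    home = splunk_home or os.environ.get("SPLUNK_HOME", "/opt/splunk")
    splunk_bin = os.path.join(home, "bin", "splunk")
    return [splunk_bin, "cmd", "btool", conf_name, "list", "--debug"]


def run_btool(conf_name, splunk_home=None):
    """Run btool and return its output, or None if the splunk binary is not found."""
    cmd = btool_command(conf_name, splunk_home)
    if not os.path.exists(cmd[0]):
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout
```

For example, `run_btool("inputs")` would return the effective inputs.conf settings as text, which you can then search for the stanza or setting you are investigating.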
The *.conf files are case-sensitive. Check settings and values against the spec and example configuration files in the Admin manual.
Many settings in the .conf files aren't exposed in Splunk Web. It's best to leave these alone unless you know what changing them might do.
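Because the .conf files are case-sensitive, a setting typed in the wrong case is easy to miss. The sketch below shows one way to flag such names by comparing them against the names listed in a spec file. The function name is hypothetical, and how you extract the two lists of names from your own files is left to you.

```python
def find_case_mismatches(settings, spec_names):
    """Return (written, expected) pairs for names that match a spec name only when case is ignored."""
    spec = set(spec_names)
    by_lower = {name.lower(): name for name in spec}
    mismatches = []
    for name in settings:
        if name in spec:
            continue  # exact match, nothing to report
        expected = by_lower.get(name.lower())
        if expected is not None:
            mismatches.append((name, expected))
    return mismatches


# Illustrative input: two props.conf setting names, one written in the wrong case.
written = ["should_linemerge", "TIME_FORMAT"]
spec = ["SHOULD_LINEMERGE", "TIME_FORMAT", "MAX_TIMESTAMP_LOOKAHEAD"]
print(find_case_mismatches(written, spec))
```

Names that do not match any spec entry even when case is ignored are not reported here; those may be typos or custom settings and need a separate look.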
Splunk log files
Splunk has various internal log files that can help you diagnose problems. Read about the log files in this manual.
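As a starting point for reading the internal logs, here is a minimal Python sketch that pulls warning and error lines out of splunkd.log. It assumes the log is at `$SPLUNK_HOME/var/log/splunk/splunkd.log` and that the log level appears as a separate whitespace-delimited word on each line. The function names and the `/opt/splunk` fallback are illustrative, and the match is deliberately coarse.

```python
import os


def splunkd_log_path(splunk_home=None):
    """Return the assumed location of splunkd.log under the Splunk home directory."""
    # Assumption: SPLUNK_HOME is set, or Splunk lives in /opt/splunk.
    home = splunk_home or os.environ.get("SPLUNK_HOME", "/opt/splunk")
    return os.path.join(home, "var", "log", "splunk", "splunkd.log")


def lines_with_level(lines, levels=("ERROR", "WARN")):
    """Keep only lines that contain one of the given levels as a standalone word."""
    wanted = set(levels)
    return [line.rstrip("\n") for line in lines if wanted.intersection(line.split())]
```

To use it, open the file returned by `splunkd_log_path()` and pass the file object to `lines_with_level()`. A message that happens to contain the word ERROR will also match, so treat the result as a first pass rather than an exact filter.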
Understand how your data gets into Splunk
The Distributed Deployment Manual has a high-level overview of the Splunk data pipeline, breaking it into input, parsing, indexing, and search segments.
For more detail on each segment, see this Community Wiki article about how indexing works.
I've figured out exactly where the problem is
Hey, well done!
Check the (continuously growing) chapter in this manual on some of the most common symptoms and solutions.
Test potential fixes or workarounds
Once you've found a way to fix the problem, test it! Test any noninvasive changes first. Then, test any changes that would create minor interruptions. Make sure no new issues arise from your tested solution.
Always test invasive or major changes in a sandbox environment before moving them to your production system! Your sandbox should be an independent system that mirrors the affected environment.
This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.3.0, 6.3.1, 6.3.1511, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.4.0, 6.4.1