Splunk® Enterprise

Troubleshooting Manual

Generate a diagnostic file

When you contact Splunk Support for help with Splunk software, Support often requests a diagnostic (diag) file to help troubleshoot the issue.

About diag

A diag file provides a snapshot of the configurations and logs from the Splunk software along with select information about the platform instance. The diag collection process gathers information such as server specifications, operating system (OS) version, file system information, and current network connections. A diag collection also includes the contents of the $SPLUNK_HOME installation path, such as app configurations, internal log files, and index metadata. The diag collection process does not collect or store any indexed data. You can use the Splunk Enterprise command line or Splunk Web to initiate the diag collection process.

In some environments, custom app objects such as lookup tables might contain sensitive data. For additional anonymization options to use on a diag file, see Anonymize data samples to send to Support.

You can review your diag file before sending it to Support to confirm that no proprietary data is included. The diag collection process attempts to exclude sensitive information from any output when using the commands below, and in any anonymized data samples sent to Splunk Support. However, Splunk cannot guarantee compliance with your company security policy.
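One way to carry out this review is to list the contents of the diag bundle and flag file names that deserve a closer look. The following Python sketch uses only the standard library. The function name flag_members and the default patterns are illustrative assumptions, not part of Splunk software, and matching on file names is no substitute for inspecting file contents.

```python
import fnmatch
import tarfile


def flag_members(diag_path, patterns=("*.csv", "*/passwd")):
    """Return names of files in a diag tarball that match any glob pattern.

    The default patterns are hypothetical examples; adjust them to your
    own security policy.
    """
    flagged = []
    with tarfile.open(diag_path, "r:gz") as bundle:
        for member in bundle.getmembers():
            if not member.isfile():
                continue  # skip directories and links
            if any(fnmatch.fnmatch(member.name, p) for p in patterns):
                flagged.append(member.name)
    return sorted(flagged)
```

Call flag_members with the path to a generated diag-<server name>-<date>.tar.gz file, then open any flagged files by hand before you send the bundle.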

Generate diags using Splunk Web

As a Splunk Enterprise admin, you can generate diags across your deployment using Splunk Web.

You can select which instances in your deployment to generate diags for and which configurations to use. You can recreate a diag using settings you chose in the past. You can manage previously created diag bundles, including deleting files, viewing the status of diag creation, and downloading diags to your local machine. After you have diags on your local machine, you can upload them to an existing Support case.

To generate and view diags in Splunk Web, you need the get_diag capability.

Follow these steps to access the Splunk Web diag generation page.

  1. Log in to Splunk Web on a search head or monitoring console in your deployment.
  2. Click Settings > Instrumentation.

Decide which instance to use to generate diags

Generating diags in Splunk Web is supported for a remote instance that has at least one of the following server roles:

  • A search head that is the only search head in a deployment.
  • A clustered search head.
  • A clustered indexer.
  • An indexer cluster manager.

Splunk Cloud customers can generate diag files for self-hosted instances only. Typically, the self-hosted instances are forwarders.

If you are on a search head and cannot generate a diag for one of your remote instances, try again from your monitoring console. Since the monitoring console in distributed mode adds all instances as search peers to the instance hosting the monitoring console, this is a useful instance to generate diags from.

Choose which files to include in your diags

Use components to choose which directories are included. By default, all components except rest are included. You can adjust the thoroughness with which some components are collected by using additional options. See Include or exclude content using components.

Components and options you select in Splunk Web override any local settings.

How diags are generated and stored in your deployment

Diags are stored in the $SPLUNK_HOME path. If you run the diag command to generate a diag on a remote instance, the diag artifacts are transferred to the instance where the command was invoked.

Run diag at the command line

Running the diag command produces diag-<server name>-<date>.tar.gz in your Splunk home directory, which you can upload to your Splunk Support case through the website or the built-in upload functionality. If your Support case is about forwarding, Support will probably need a diag for both your forwarder and your receiver. Label each diag so it is clear which is from the forwarder and which is from the receiver.

Be sure to run the diag command using a user account with sufficient access to read files in $SPLUNK_HOME.

To run diag at the command line, follow these steps:

  1. Using a shell prompt, go to the folder $SPLUNK_HOME/bin in *nix or %SPLUNK_HOME%\bin in Windows.
  2. Run the following command:
    splunk diag

*nix example:
./splunk diag

Windows example:
splunk diag

Exclude files from diag

You can tell diag to leave some files out of the bundle. One way to do this is with path exclusions. In Splunk Web, use the Exclude patterns option. At the command line, use the --exclude flag. For example:

splunk diag --exclude "*/passwd"

You can repeat the flag to exclude more than one pattern:

splunk diag --exclude "*/passwd" --exclude "*/dispatch/*"

Files excluded by the --exclude feature are listed in excluded_filelist.txt in the diag bundle to ensure Splunk Support can interpret the diag.
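To confirm that your exclusions took effect, you can read excluded_filelist.txt straight from the bundle without unpacking it. The following Python sketch is one way to do that. The helper name read_excluded_filelist is hypothetical, and locating the file by its base name is an assumption about where it sits inside the tarball.

```python
import tarfile


def read_excluded_filelist(diag_path):
    """Return the lines of excluded_filelist.txt from a diag tarball.

    Returns an empty list if the bundle has no such file. Searching by
    base name is an assumption about the bundle layout.
    """
    with tarfile.open(diag_path, "r:gz") as bundle:
        for member in bundle.getmembers():
            base_name = member.name.rsplit("/", 1)[-1]
            if member.isfile() and base_name == "excluded_filelist.txt":
                raw = bundle.extractfile(member).read()
                return raw.decode("utf-8", errors="replace").splitlines()
    return []
```

The sketch returns the lines as-is and makes no assumption about what each line contains.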

Include or exclude content using components

A more robust way to exclude content is with components. The following options select which categories of information should be collected.

  --collect=<list>      Declare a set of components to gather, as a
                      comma-separated list, overriding any prior choices
  --enable=<component_name>
                      Add a component to the work list
  --disable=<component_name>
                      Remove a component from the work list

Unless otherwise noted, the following components are available both at the command line and in Splunk Web.

Each entry lists the component name, then its description and any options.

conf_replication_summary
A directory listing of replication summaries produced by search head clustering. This component is not available in Splunk Web.
consensus
Copies of the consensus protocol files used for search head cluster member coordination from var/run/splunk/_raft.
dispatch
The search dispatch directories. See Dispatch directory and search artifacts in the Search Manual.
etc
The entire contents of the $SPLUNK_HOME/etc directory, which contains configuration information, including .conf files.
  • By default, diag excludes lookup files in etc/apps and etc/users starting in Splunk Enterprise version 6.5.0. To include lookups, use the option --include-lookups.
  • By default, diag excludes files in $SPLUNK_HOME/etc larger than 10 MB. To modify this limit, use --etc-filesize-limit=<level>, where level is the file size in kilobytes and 0 disables this filter.
file_validate
The results of the latest file integrity check. See Check the integrity of your Splunk software files in the Admin Manual.
index_files
Files from the index that describe their contents (Hosts.data, Sources.data, Sourcetypes.data, and bucketManifests). User data is not collected. If diag collects index files on larger deployments, it might take a while to run. Read about index files in the Splexicon.
  • --index-files=level sets the index data file gathering level: manifests, or full, meaning manifests + metadata files. Default: manifests.
index_listing
Directory listings of the index contents are gathered, in order to see file names, directory names, sizes, timestamps, and the like. This information is recorded in systeminfo.txt.
  • --index-listing=level sets the index directory listing level: light (hot buckets only), or full, meaning all index buckets. Default: light.
kvstore
Directory listing of the Splunk key value store files.
log
The contents of $SPLUNK_HOME/var/log/... See What Splunk Enterprise logs about itself.
  • Set the log age to gather using --log-age=<days>. Log files over this many days old are not included, 0 disables this filter. Default: 60.
  • By default diag gathers at most three Windows crash .dmp files. To gather every .dmp file, use --all-dumps=<bool>.
  • Fully gather files in $SPLUNK_HOME/var/log smaller than the size specified by --log-filesize-limit=size. For log files larger than this size, gather only this many bytes from the end of the file (capture truncated trailing bytes). Default: 1GB.
  • To redact search terms from audit.log and remote_searches.log, use --filter-searchstrings. To not modify these log files, use --no-filter-searchstrings.
pool
If search head pooling is enabled, the contents of the pool dir. By default diag excludes lookup files in pool starting in Splunk Enterprise version 6.5.0. To include lookups, use the option --include-lookups.
rest
splunkd httpd REST endpoint gathering. Collects output of various splunkd urls into xml files to capture system state. Off by default.
searchpeers
Directory listing of the "searchpeers" location, that is, the data provided by search heads on indexers/search nodes.
app:<app_name>
If you have an app installed that extends diag, adding app-specific troubleshooting data, it will offer a component like this. For information on what type of data the app provides, see the app documentation, review the content stored in the produced tar file, or contact the app developers. This component is not available in Splunk Web. An app might offer additional app-specific flags, in the form --app_name:setting.

For example, for initial analysis, Support most commonly requests only log files and configuration files. To collect only those two components, use:

$SPLUNK_HOME/bin/splunk diag --collect=log,etc
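If you script diag runs, it can help to assemble the component flags in one place. The following Python sketch builds an argument list from the --collect, --enable, --disable, and --exclude flags described in this topic. The function name build_diag_args and its parameters are hypothetical, and the sketch only constructs the arguments; it does not run diag.

```python
def build_diag_args(collect=None, enable=(), disable=(), exclude=()):
    """Build an argument list for `splunk diag` from component choices.

    collect: iterable of component names for --collect, or None to keep
             the default work list.
    enable/disable: component names, one --enable/--disable flag each.
    exclude: glob patterns, one --exclude flag each.
    """
    args = ["splunk", "diag"]
    # --collect overrides any prior choices, so it goes first.
    if collect is not None:
        args.append("--collect=" + ",".join(collect))
    args += ["--enable=" + name for name in enable]
    args += ["--disable=" + name for name in disable]
    for pattern in exclude:
        args += ["--exclude", pattern]
    return args
```

For example, build_diag_args(collect=["log", "etc"]) returns the arguments for the command shown above. To run the result, you could pass the list to subprocess.run from the $SPLUNK_HOME/bin directory, replacing the first element with the path to the splunk executable on your system.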

Defaults can also be controlled in server.conf. Refer to server.conf.spec in the Admin Manual for more information. Apps do not currently support setting defaults for their options in server.conf.

Redact search strings

Diag by default removes some types of sensitive information from search strings in diag files. Read about configuring search string redaction in server.conf.spec.

These options cause diag to redact or hide data from the output diag.

   --filter-searchstrings
                       Attempt to redact search terms from audit.log &
                       remote_searches.log that may be private or personally
                       identifying
   --no-filter-searchstrings
                       Do not modify audit.log & remote_searches.log

Run the diag command on a remote instance

To gather diags from remote Splunk Enterprise installations, you need:

  • A local instance with Splunk Enterprise installed.
  • A local login credential that has the get_diag capability. The admin role has this capability by default.
  • A login credential for the remote Splunk Enterprise instance.
  • Sufficient space to store the remote diag file locally in the $SPLUNK_HOME path.

Remote diag collection does not work with universal forwarders. The options available when including or excluding components using remote diag collection are: --basename, --all-dumps, and --exclude.

  1. Using a shell prompt, go to the folder $SPLUNK_HOME/bin.
  2. Run the command ./splunk diag -uri "https://<host>:<mgmtPort>"
  3. When prompted, type the login credential and password.
  4. The diag runs and the file is transferred to the local Splunk Enterprise instance. Depending on the size of the diag file and the speed of the connection, this can take time to complete.
  5. Using a shell prompt, go to the folder $SPLUNK_HOME and look for the file diag-<host>_<mgmtPort>-<date>.tar.gz.
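If you collect diags from several remote instances, the last step can be scripted. The following Python sketch looks in a given directory for files that follow the diag-<host>_<mgmtPort>-<date>.tar.gz naming shown in step 5. The function name find_remote_diags is hypothetical, and ordering the results by modification time is a choice made for this sketch.

```python
import glob
import os


def find_remote_diags(splunk_home, host, mgmt_port):
    """Return paths of diag files fetched from host:mgmt_port, newest first.

    Relies on the diag-<host>_<mgmtPort>-<date>.tar.gz naming; sorting by
    modification time is a choice made for this sketch.
    """
    file_pattern = "diag-{}_{}-*.tar.gz".format(glob.escape(host), mgmt_port)
    pattern = os.path.join(glob.escape(splunk_home), file_pattern)
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
```

Pass your own $SPLUNK_HOME path, the remote host name, and the management port that you used in the -uri argument.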

Upload a file to Splunk Support

If you have already opened a case with Splunk Support, you can use the diag CLI command to send a diag file to the open case after the file is generated. You can also use the command to upload a supporting file to a case, such as a previously generated diag file or other debugging data. Files that you upload using the CLI must be 5 GB or less in size. File upload does not work with universal forwarders.

To generate and upload a diag, the CLI syntax is:

splunk diag --upload

To upload a file you already have, the CLI syntax is:

splunk diag --upload-file=<filename>.tar.gz

The diag command interactively prompts for key values, such as your splunk.com user name and password, choice of open cases for that user, and the upload description.

If you have the open case number or other key values, you can set those flags in the diag command directly:

  Upload:
    Flags to control uploading files  Ex: splunk diag --upload
[...]
    --case-number=case-number
                        Case number to attach to, e.g. 200500
    --upload-user=UPLOAD_USER
                        splunk.com username to use for uploading
    --upload-description=UPLOAD_DESCRIPTION
                        description of file upload for Splunk support
    --firstchunk=chunk-number
                        For resuming upload of a multi-part upload; select the
                        first chunk to send
    --chunksize=chunk-size
                        Optional set the chunksize in bytes to be uploaded
  • You are always prompted for the splunk.com password on the command line when using the diag --upload switch.
  • The user names for splunk.com are not email addresses, and do not include @domain.com.
  • To upload a diag file to a case, the host must be allowed to connect to https://api.splunk.com.
  • The --firstchunk flag is only used if there's a failure during the upload of the diag file. When an upload failure happens, a full summary status including a sample diag --upload command with the last chunk number is provided. The summary is also logged in the $SPLUNK_HOME/var/log/splunk/splunk_rapid_diag.log file.
  • The --chunksize flag defaults to 100000000 (100 MB). The chunk size is used to divide a large diag file into smaller portions for uploading. If you have repeated diag upload failures, use this switch to lower the chunk size.

Example: splunk diag --upload --case-number=$number --upload-user=$user_name --upload-description="$brief_description"
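To get a feel for how --chunksize affects an upload, you can estimate the number of chunks a file needs. The following Python sketch assumes that the file is split into fixed-size chunks with a smaller final chunk, and that the 5 GB limit means decimal gigabytes. Both are assumptions, and the name plan_upload is hypothetical. It does not model how diag numbers chunks for --firstchunk.

```python
DEFAULT_CHUNK_SIZE = 100_000_000  # bytes; the --chunksize default stated above
MAX_UPLOAD_BYTES = 5 * 10**9      # assumption: "5 GB" read as decimal gigabytes


def plan_upload(file_size, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return how many chunks a file of file_size bytes would need.

    Raises ValueError if the file exceeds the assumed CLI upload limit
    or if chunk_size is not positive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if file_size > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 5 GB CLI upload limit")
    # Integer ceiling division: a partial final chunk still counts as one.
    return (file_size + chunk_size - 1) // chunk_size
```

Under these assumptions, a 250,000,000-byte diag needs three chunks at the default size, and halving the chunk size to 50,000,000 bytes raises that to five smaller uploads.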

Diag CLI examples

Exclude a lookup table

These two examples exclude content at the file level. A lookup table can be in one of several formats, such as .csv, .dat, or text.

Exclude all .csv files, or all .dat files, in $SPLUNK_HOME:

splunk diag --exclude "*.csv"

or

splunk diag --exclude "*.dat"

Note: These examples exclude all files of that type, not only lookup tables. If you have .csv or .dat files that will be helpful for Support in troubleshooting your issue, exclude only your lookup tables. That is, write out the file names instead of using an asterisk.

Exclude the dispatch directory

This example excludes content at the component level. Exclude the dispatch directory to avoid gathering search artifacts, which can be very costly on a pooled search head:

$SPLUNK_HOME/bin/splunk diag --disable=dispatch

Exclude multiple components

To exclude multiple components, use the --disable flag once for each component.

Exclude the dispatch directory and all files in the shared search head pool:

$SPLUNK_HOME/bin/splunk diag --disable=dispatch --disable=pool

Note: This does not gather a full set of the configuration files in use by that instance. Such a diag is useful only for the logs gathered from $SPLUNK_HOME/var/log/splunk. See What Splunk Enterprise logs about itself in this manual.

Gather only logs

To include only the Splunk Enterprise internal log files:

$SPLUNK_HOME/bin/splunk diag --collect=log

Generate a diag, then upload it

$SPLUNK_HOME/bin/splunk diag --upload

Fetch a diag from a remote instance, then upload it

$SPLUNK_HOME/bin/splunk diag --uri https://splunkserver.example.com:8089
$SPLUNK_HOME/bin/splunk diag --upload-file=<diag_from_prior_command>

Save the settings for diag in server.conf

You can update the default settings for diag in the [diag] stanza of server.conf.

[diag]

EXCLUDE-<class> = <glob expression>
* Specifies a glob / shell pattern to be excluded from diags generated on this instance. 
* Example: */etc/secret_app/local/*.conf

Flags that you append to splunk diag override server.conf settings.

Diag contents

Primarily, a diag contains server logs, from $SPLUNK_HOME/var/log/splunk and $SPLUNK_HOME/var/log/introspection, and the configuration files, from $SPLUNK_HOME/etc.

Specifically, by path name, there are:

_raft/...
Files containing the state of the consensus protocol produced by search head clustering from var/run/splunk/_raft
composite.xml
The generated file that splunkd uses at runtime to control its component system (pipelines & processors), from var/run/splunk/composite.xml
diag.log
A copy of all the messages diag produces to the screen when running, including progress indicators, timing, messages about files excluded by heuristic rules (for example, for the size heuristic, the setting and the size of the file), errors, and exceptions.
dispatch/...
A copy of some of the data from the search dispatch directory. Results files (the output of searches) are not included, nor are other similar files (events/*).
etc/...
A copy of the contents of the configuration files. All files and directories under $SPLUNK_HOME/etc/auth are excluded by default.
excluded_filelist.txt
A list of files which diag would have included, but did not because of some restriction (exclude rule, size restriction). This is primarily to confirm the behavior of exclusion rules for customers, and to enable Splunk technical support to understand why they can't see data they are looking for.
introspection/...
The log files from $SPLUNK_HOME/var/log/introspection
log/...
The log files from $SPLUNK_HOME/var/log/splunk
rest-collection/...
Output of several splunkd http endpoints that contain information not available in logs. This includes file input/monitor/tailing status information, server-level admin banners, and clustering status info if on a cluster.
scripts/...
A single utility script may exist here for support reasons. It is identical for every diag.
systeminfo.txt
Generated output of various system commands to determine things like available memory, open splunk sockets, size of disk/filesystems, operating system version, ulimits.
Also contained in systeminfo.txt are listings of filenames/sizes etc from a few locations.
  • Some of the splunk index directories (or all of the index directories, if full listing is requested.)
  • The searchpeers directory (replicated files from search heads)
  • Search Head Clustering -- The summary files used in synchronization from var/run/splunk/snapshot
Typically var/...
The paths to the indexes are a little 'clever', attempting to resemble the paths actually in use. For example, on Windows, if an index is in e:\someother\largedrive, that index's files will be in e/someother/largedrive inside the diag. By default only the .bucketManifest for each index is collected.
app_ext/<app_name>/...
If you have an app installed which extends diag, the content it adds to the produced tar.gz file will be stored here.

Behavior on failure

If diag collection fails, diag cleans up the temporary files it created and writes the errors to a text file.

For example:

Starting splunk diag...
[etc .... etc]
Getting index listings...
Copying Splunk configuration files...
Exception occurred while generating diag, we are deeply sorry.
Traceback (most recent call last):
  File "/opt/splunk/lib/python2.7/site-packages/splunk/clilib/info_gather.py", line 1959, in main
    create_diag(options, log_buffer)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/clilib/info_gather.py", line 1862, in create_diag
    copy_etc(options)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/clilib/info_gather.py", line 1626, in copy_etc
    raise Exception("OMG!")
Exception: OMG!

Diag failure, writing out logged messages to '/tmp/diag-fail-F2B94h.txt', please send output + this file to either an existing or new case ; http://www.splunk.com/support
We will now try to clean out the temp directory...

For most errors, the diag command tries to guess at the original problem, but it also writes out a text file for use in triaging the diag collection process. You should create a support case and attach any files to the case.
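If you run diag from a script, you can pick the failure file path out of the command output so that it is easy to attach to a case. The following Python sketch uses a regular expression modeled on the sample failure message shown above. The exact wording of that message might differ between versions, so treat the pattern and the helper name find_failure_file as assumptions.

```python
import re

# Pattern modeled on the sample failure message in this topic; the exact
# wording may differ between Splunk versions.
FAIL_FILE_RE = re.compile(r"writing out logged messages to '([^']+)'")


def find_failure_file(diag_output):
    """Return the path of the diag failure text file, or None if absent."""
    match = FAIL_FILE_RE.search(diag_output)
    return match.group(1) if match else None
```

A None result only means that the pattern did not match, not that the diag succeeded.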

Additional resources

Have questions? Visit the Splunk Community to search for questions and answers about diags.

Last modified on 21 December, 2023

This documentation applies to the following versions of Splunk® Enterprise: 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2, 9.4.0

