Splunk® Enterprise

Troubleshooting Manual

Splunk Enterprise version 6.x is no longer supported as of October 23, 2019. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.
This documentation does not apply to the most recent version of Splunk.

Anonymize data samples to send to Support

Splunk Enterprise contains an anonymize function. The anonymizer combs through sample log files or event files and replaces identifying data (such as usernames, IP addresses, and domain names) with fictional values that keep the same word length and event type. For example, it might turn the string user=carol@adalberto.com into user=plums@wonderful.com. This lets Splunk Enterprise users share log data without revealing confidential or personal information from their networks.

The anonymized file is written to the same directory as the source file, with ANON- prepended to its filename. For example, /tmp/messages is anonymized as /tmp/ANON-messages. In Windows, a file \temp\messages becomes \temp\ANON-messages.
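
The naming rule above can be sketched as a small shell helper. This is only an illustration for scripting around the tool: anon_path is a hypothetical function name, not part of Splunk Enterprise, and it simply applies the "same directory, ANON- prefix" rule described above.

```shell
# Sketch: predict where the anonymizer writes its output for a source file.
# anon_path is a hypothetical helper name, not a Splunk command.
anon_path() {
    src=$1
    # Same directory as the source, with ANON- prepended to the filename.
    printf '%s/ANON-%s\n' "$(dirname "$src")" "$(basename "$src")"
}

anon_path /tmp/messages    # prints /tmp/ANON-messages
```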

You can anonymize files from the Splunk Enterprise CLI. See About the CLI for instructions on accessing the Splunk Enterprise CLI.

Simple method

The easiest way to anonymize a file is with the anonymizer tool's defaults, as shown in the session below. Note that you currently need to have $SPLUNK_HOME/bin as your current working directory.

From the CLI, with $SPLUNK_HOME/bin as your current working directory, type the following:

> ./splunk anonymize file -source </path/to/filename>

It is good practice to copy the file to a safe location (such as /tmp) and anonymize the copy rather than the original. For example:

> cp -p /var/log/messages /tmp
> cd $SPLUNK_HOME/bin
> ./splunk anonymize file -source /tmp/messages
Processing files: ['/tmp/messages']
Getting named entities
        Processing /tmp/messages
Adding named entities to list of public terms: Set(['secErrStr', 'MD_SB_DISKS', 'TTY', 'target', 'precision ', 'lpj', 'ip', 'pci', 'hard', 'last bus', 'override with idebus', 'SecKeychainFindGenericPassword err', 'vector', 'USER', 'irq ', 'com  user', 'uid'])
        Processing /tmp/messages for terms.
        Calculating replacements for 4672 terms.
Wrote dictionary scrubbed terms with replacements to "/tmp/INFO-mapping.txt"
Wrote suggestions for dictionary to "/tmp/INFO-suggestions.txt"
Writing out /tmp/ANON-messages
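
The session above can be wrapped in a small function so that the original file is never touched. This is a minimal sketch, not an official script: anonymize_copy is a hypothetical helper name, and it assumes that SPLUNK_HOME is set and that the working directory argument is an absolute path (the function changes directory before calling the CLI).

```shell
# Sketch: anonymize a copy of a file and leave the original untouched.
# anonymize_copy is a hypothetical helper; SPLUNK_HOME must be set and
# the second argument must be an absolute directory path.
anonymize_copy() {
    src=$1
    workdir=$2
    cp -p "$src" "$workdir"/ || return 1
    copy="$workdir/$(basename "$src")"
    # Run the CLI from $SPLUNK_HOME/bin, as this topic requires.
    (cd "$SPLUNK_HOME/bin" && ./splunk anonymize file -source "$copy") || return 1
    # Print the output path only if the anonymized file was written.
    [ -f "$workdir/ANON-$(basename "$src")" ] && echo "$workdir/ANON-$(basename "$src")"
}
```

For example, anonymize_copy /var/log/messages /tmp corresponds to the three commands in the session above.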

Advanced method

You can customize the anonymizer by telling it what terms to anonymize, what terms to leave alone, and what terms to use as replacements. The advanced form of the command is:

./splunk anonymize file -source <filename> [-public_terms <file>] [-private_terms <file>] [-name_terms <file>] [-dictionary <file>] [-timestamp_config <file>]

  • filename
    • Default: None
    • Path and name of the file to anonymize.
  • public_terms
    • Default: $SPLUNK_HOME/etc/anonymizer/public-terms.txt
    • A list of locally-used words that will not be anonymized if they are in the file. It serves as an appendix to the dictionary file.
    • Here is a sample entry:
2003 2004 2005 2006 abort aborted am apr april aug august auth
authorize authorized authorizing bea certificate class com complete
  • private_terms
    • Default: $SPLUNK_HOME/etc/anonymizer/private-terms.txt
    • A list of words that will be anonymized if found in the file, because they may denote confidential information.
  • name_terms
    • Default: $SPLUNK_HOME/etc/anonymizer/names.txt
    • A global list of common English personal names that Splunk uses to replace anonymized words.
    • Splunk always replaces a word with a name of the exact same length, to keep each event's data pattern the same.
    • Splunk uses each name in name_terms once to replace a character string of equal length throughout the file. After it runs out of names, it uses randomized character strings instead, but it still maps each replaced pattern to a single anonymized string.
  • dictionary
    • Default: $SPLUNK_HOME/etc/anonymizer/dictionary.txt
    • A global list of common words that will not be anonymized, unless overridden by entries in the private_terms file.
  • timestamp_config
    • Default: $SPLUNK_HOME/etc/anonymizer/anonymizer-time.ini
    • Splunk's built-in file that determines how timestamps are parsed.
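
As an illustration of how the optional arguments combine, the sketch below assembles a command line from the flags listed above and prints it instead of running it. build_anonymize_cmd is a hypothetical helper name, and /tmp/my-private-terms.txt is a made-up path for a locally maintained term file. The sketch does not handle paths that contain spaces.

```shell
# Sketch: assemble an anonymize command with optional term files.
# build_anonymize_cmd is a hypothetical helper. The flag names come from
# the syntax line above. The command is printed, not executed.
build_anonymize_cmd() {
    src=$1
    private=$2
    public=$3
    cmd="./splunk anonymize file -source $src"
    # Add each optional flag only when a file was supplied.
    if [ -n "$private" ]; then cmd="$cmd -private_terms $private"; fi
    if [ -n "$public" ]; then cmd="$cmd -public_terms $public"; fi
    echo "$cmd"
}

# Hypothetical local term file; all other options keep their defaults.
build_anonymize_cmd /tmp/messages /tmp/my-private-terms.txt
# prints ./splunk anonymize file -source /tmp/messages -private_terms /tmp/my-private-terms.txt
```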

Output Files

The anonymizer creates three new files in the same directory as the source file.

  • ANON-filename
    • The anonymized version of the source file.
  • INFO-mapping.txt
    • This file contains a list of which terms were anonymized into which strings.
    • Here is a sample entry:
Replacement Mappings
kb900485 --> LO200231
1718 --> 1608
transitions --> tstymnbkxno
reboot --> SPLUNK
cdrom --> pqyvi
  • INFO-suggestions.txt
    • A report of terms found in the file that, based on their appearance and frequency, you may want to add to public-terms.txt or private-terms.txt for more accurate anonymization of your local data.
    • Here is a sample entry:
Terms to consider making private (currently not scrubbed):
['uid', 'pci', 'lpj', 'hard']
Terms to consider making public (currently scrubbed):
['jun', 'security', 'user', 'ariel', 'name', 'logon', 'for', 'process', 'domain', 'audit']
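
Before you send a file to Support, it can help to confirm how a particular term was replaced. The sketch below looks up a term in INFO-mapping.txt. It assumes that every mapping line has the form "original --> replacement", as in the sample above, and that terms contain no spaces. lookup_replacement is a hypothetical helper name, not a Splunk command.

```shell
# Sketch: look up the replacement for a term in INFO-mapping.txt.
# Assumes lines of the form "original --> replacement" (see sample above).
# lookup_replacement is a hypothetical helper name.
lookup_replacement() {
    term=$1
    mapping=$2
    # Print the third field of the matching line; fail if the term is absent.
    awk -v t="$term" '$2 == "-->" && $1 == t { print $3; found = 1 }
                      END { exit !found }' "$mapping"
}
```

For example, with the sample mapping shown above, lookup_replacement reboot /tmp/INFO-mapping.txt prints SPLUNK. A non-zero exit status means that the term does not appear in the mapping, so check the INFO-suggestions.txt report or add the term to your private terms file.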

Linux tip: Anonymize all log files from a diag at once

Here are the steps to generate a diagnostic (diag file) and then anonymize the logs of that diag.

1. Generate the diag. For example:

./splunk diag --exclude "*/passwd"

2. Uncompress the diag. For example:

cd pathtomyuncompresseddiag/
tar xfz my-diag-hostname.tar.gz

3. Run anonymize on each file of the diag. If you run this command on all *.log files, note which log files now have a counterpart with the ANON- prefix. For example:

find pathtomyuncompresseddiag/ -type f -name '*.log*' | xargs -I{} ./splunk anonymize file -source '{}'

4. Keep the files that now have the ANON- prefix and delete the non-anonymized versions from the diag directory.

5. Compress the diag.

tar cfz my-diag-hostname.tar.gz pathtomyuncompresseddiag

6. Upload the diag to the Support case using the ADD FILE button in the case.
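
Step 4 above can be sketched as a shell function. This is one possible approach, not an official procedure: remove_originals is a hypothetical helper name, and it deletes a log file only when an ANON- counterpart exists next to it. Files without a counterpart are kept and reported so that you can review them by hand before compressing the diag.

```shell
# Sketch of step 4: delete each log file that has an ANON- counterpart.
# remove_originals is a hypothetical helper name. Files with no anonymized
# counterpart are kept and reported on stderr for manual review.
remove_originals() {
    diag_dir=$1
    find "$diag_dir" -type f -name '*.log*' ! -name 'ANON-*' |
    while IFS= read -r f; do
        anon="$(dirname "$f")/ANON-$(basename "$f")"
        if [ -f "$anon" ]; then
            rm -- "$f"
        else
            echo "no anonymized copy, keeping: $f" >&2
        fi
    done
}
```

For example, remove_originals pathtomyuncompresseddiag would run between steps 3 and 5. Files that do not match *.log* are not touched, so review them separately if they might contain sensitive data.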

Last modified on 22 June, 2016

This documentation applies to the following versions of Splunk® Enterprise: 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11
