Splunk® Enterprise

Getting Data In

Splunk Enterprise version 6.x is no longer supported as of October 23, 2019. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.
This documentation does not apply to the most recent version of Splunk.

Anonymize data

This topic discusses how to anonymize data that comes into Splunk Enterprise, such as credit card and Social Security numbers.

You might want to mask sensitive personal data when you index log events. Credit card numbers and Social Security numbers are two examples of data that you might not want to appear in an index. This topic describes how to mask part of a confidential field to protect privacy while leaving enough of the value visible to track events.

Splunk Enterprise lets you anonymize data in two ways:

  • Through a regular expression (regex) transform
  • Through a sed script

Anonymize data with a regular expression transform

You can configure transforms.conf to mask data by means of regular expressions.

This example masks all but the last four characters of the SessionId and Ticket fields in an application server log.

An example of the desired output, with all but the last four characters of the SessionId value masked:

SessionId=########7BEA&

A sample input:

"2006-09-21, 02:57:11.58",  122, 11, "Path=/LoginUser Query=CrmId=ClientABC&
SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET,IP=,
Content=", ""
"2006-09-21, 02:57:11.60",  122, 15, "UserData:<User CrmId="clientabc" 
UserId="p12345678"><EntitlementList></EntitlementList></User>", ""
"2006-09-21, 02:57:11.60",  122, 15, "New Cookie: SessionId=3A1785URH117BEA&
AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=
&PinPGRate=&PinMenu=&", ""

To mask the data, modify the props.conf and transforms.conf files in your $SPLUNK_HOME/etc/system/local/ directory.

Configure props.conf

Edit $SPLUNK_HOME/etc/system/local/props.conf and add the following stanza:

[<spec>]
TRANSFORMS-anonymize = session-anonymizer, ticket-anonymizer

Note the following:

  • <spec> must be one of the following:
    • <sourcetype>, the source type of an event.
    • host::<host>, where <host> is the host of an event.
    • source::<source>, where <source> is the source of an event.
  • In this example, session-anonymizer and ticket-anonymizer are arbitrary names for the transforms. Each name must match the name of a stanza in transforms.conf that defines the action of that transform.

Configure transforms.conf

In $SPLUNK_HOME/etc/system/local/transforms.conf, add your TRANSFORMS:

[session-anonymizer]
REGEX = (?m)^(.*)SessionId=\w+(\w{4}[&"].*)$
FORMAT = $1SessionId=########$2
DEST_KEY = _raw

[ticket-anonymizer]
REGEX = (?m)^(.*)Ticket=\w+(\w{4}&.*)$
FORMAT = $1Ticket=########$2
DEST_KEY = _raw

Note the following:

  • REGEX specifies the regular expression that matches the string in the event that you want to anonymize.
  • FORMAT specifies the masked value. $1 is the text that precedes the field name, and $2 is the last four characters of the field value plus the rest of the text that the regex matches.
  • DEST_KEY = _raw writes the value from FORMAT back to the raw text of the event, which modifies the event that is indexed.

Note: By default, the regex processor does not handle multi-line events. To work around this, place (?m) before the regular expression in transforms.conf to specify that the event is multi-line.
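The effect of the session-anonymizer settings can be checked outside Splunk Enterprise before you deploy them. The following is a minimal sketch in Python, assuming that Python's re module handles this pattern the same way as the regex engine in Splunk Enterprise. The function name mask_session_id is hypothetical, and the event is the third event from the sample input.

```python
import re

# Same pattern as the session-anonymizer REGEX setting. Python's re module
# stands in for the Splunk Enterprise regex engine here (an assumption).
SESSION_REGEX = re.compile(r'(?m)^(.*)SessionId=\w+(\w{4}[&"].*)$')

# Same replacement as the FORMAT setting; $1 and $2 are written \1 and \2.
SESSION_FORMAT = r'\1SessionId=########\2'


def mask_session_id(raw: str) -> str:
    """Mask all but the last four characters of the SessionId value."""
    return SESSION_REGEX.sub(SESSION_FORMAT, raw)


# Third event from the sample input, with its original line breaks.
event = (
    '"2006-09-21, 02:57:11.60",  122, 15, "New Cookie: SessionId=3A1785URH117BEA&\n'
    'AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=\n'
    '&PinPGRate=&PinMenu=&", ""'
)

masked = mask_session_id(event)
print(masked.splitlines()[0])
```

The first line of the event now ends with SessionId=########7BEA& and the other two lines are unchanged. The (?m) prefix is what lets ^ and $ match at the line boundaries inside the multi-line event.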

Anonymize data through a sed script

You can also anonymize your data by using a sed script to replace or substitute strings in events.

Most UNIX users are familiar with sed, a UNIX utility that reads a file and modifies the input as specified by a list of commands. Splunk Enterprise lets you use sed-like syntax in props.conf to anonymize your data.

Define the sed script in props.conf

Edit or create a copy of props.conf in $SPLUNK_HOME/etc/system/local.

Create a props.conf stanza that uses SEDCMD to indicate a sed script:

[<spec>]
SEDCMD-<class> = <sed script>

Note the following:

  • <spec> must be one of the following:
    • <sourcetype>, the source type of an event.
    • host::<host>, where <host> is the host of an event.
    • source::<source>, where <source> is the source of an event.
  • The sed script applies only to the _raw field at index time. Splunk Enterprise currently supports the following subset of sed commands:
    • replace (s)
    • character substitution (y).

Note: After making changes to props.conf, restart Splunk Enterprise to enable the configuration changes.

Replace strings with regex match

The syntax for a sed replace is:

SEDCMD-<class> = s/<regex>/<replacement>/flags

Note the following:

  • regex is a Perl-compatible regular expression.
  • replacement is a string to replace the regex match. It uses "\n" for back-references, where n is a single digit.
  • flags can be either "g" to replace all matches or a number to replace a specified match.
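The difference between the "g" flag and a numeric flag can be illustrated with a small Python sketch. The helper sed_substitute is hypothetical and only imitates the behavior described above with Python's re module; it is not how Splunk Enterprise implements SEDCMD, and the sample line is made up.

```python
import re


def sed_substitute(text: str, regex: str, replacement: str, flag: str) -> str:
    """Hypothetical helper: apply s/<regex>/<replacement>/<flag> to one string.

    flag is "g" (replace every match) or a number such as "2" (replace only
    that match, counting from 1).
    """
    if flag == "g":
        return re.sub(regex, replacement, text)

    target = int(flag)
    seen = 0

    def replace_nth(match: re.Match) -> str:
        nonlocal seen
        seen += 1
        if seen == target:
            # expand() resolves \1-style back-references for this match.
            return match.expand(replacement)
        return match.group(0)  # leave every other match unchanged

    return re.sub(regex, replace_nth, text)


line = "id=1111 id=2222 id=3333"  # made-up data for illustration
print(sed_substitute(line, r"id=\d{2}(\d{2})", r"id=xx\1", "g"))
print(sed_substitute(line, r"id=\d{2}(\d{2})", r"id=xx\1", "2"))
```

With "g", all three values are masked. With "2", only the second value changes and the first and third stay as they were.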


In the following example, you want to index data containing Social Security and credit card numbers. At index time, you want to mask these values so that only the last four digits are evident in your events. Your props.conf stanza might look like this:

[source::.../accounts.log]
SEDCMD-accounts = s/ssn=\d{5}(\d{4})/ssn=xxxxx\1/g s/cc=(\d{4}-){3}(\d{4})/cc=xxxx-xxxx-xxxx-\2/g

Now, in your accounts events, Social Security numbers appear as ssn=xxxxx6789 and credit card numbers appear as cc=xxxx-xxxx-xxxx-1234.
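To try the two substitutions on test data before indexing, they can be reproduced in a few lines of Python. This is a sketch under the assumption that Python's re module treats these two patterns the same way as Splunk Enterprise does; mask_accounts and the sample event are made up for illustration.

```python
import re

# The two substitutions from the SEDCMD-accounts example, applied in order.
# Back-references (\1, \2) keep the last four digits of each value.
SED_RULES = [
    (re.compile(r'ssn=\d{5}(\d{4})'), r'ssn=xxxxx\1'),
    (re.compile(r'cc=(\d{4}-){3}(\d{4})'), r'cc=xxxx-xxxx-xxxx-\2'),
]


def mask_accounts(raw: str) -> str:
    """Apply every rule to the raw event text, replacing all matches (the g flag)."""
    for pattern, replacement in SED_RULES:
        raw = pattern.sub(replacement, raw)
    return raw


sample = 'user=jdoe ssn=123456789 cc=1234-5678-9012-1234'  # made-up event
print(mask_accounts(sample))
```

The printed event contains ssn=xxxxx6789 and cc=xxxx-xxxx-xxxx-1234, which matches the masked forms described above.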

Substitute characters

The syntax for a sed character substitution is:

SEDCMD-<class> = y/<string1>/<string2>/

This substitutes each occurrence of the characters in string1 with the characters in string2.


Let's say you have a file you want to index, abc.log, and you want to substitute the capital letters "A", "B", and "C" for every lowercase "a", "b", or "c" in your events. Add the following to your props.conf:

[source::.../abc.log]
SEDCMD-abc = y/abc/ABC/

Now, if you search for source="*/abc.log", you should not find the lowercase letters "a", "b", and "c" in your data at all. Splunk Enterprise substituted "A" for each "a", "B" for each "b", and "C" for each "c".
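The y command maps characters by position: the first character of string1 becomes the first character of string2, and so on. A minimal Python sketch of the same mapping follows. The helper sed_transliterate is hypothetical, and the length check reflects standard sed behavior, which is an assumption about how Splunk Enterprise treats mismatched strings.

```python
def sed_transliterate(text: str, string1: str, string2: str) -> str:
    """Hypothetical helper: imitate y/<string1>/<string2>/ on one string."""
    if len(string1) != len(string2):
        # Standard sed rejects strings of different lengths (assumed here).
        raise ValueError("string1 and string2 must be the same length")
    # Build a character-by-character mapping and apply it to every character.
    return text.translate(str.maketrans(string1, string2))


print(sed_transliterate("a cab backed up", "abc", "ABC"))
```

This prints "A CAB BACked up": every "a", "b", and "c" is replaced, and all other characters are left alone.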

Last modified on 28 June, 2016

This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15
