Splunk® Enterprise

Getting Data In

Splunk version 4.x reached its End of Life on October 1, 2013. Please see the migration information.
This documentation does not apply to the most recent version of Splunk.

Whitelist or blacklist specific incoming data

Use whitelist and blacklist rules to explicitly tell Splunk which files to consume when monitoring directories. You can also apply these settings to batch inputs. When you define a whitelist, Splunk indexes only the files in that list. When you define a blacklist, Splunk ignores the files in that list and consumes everything else. You define whitelists and blacklists in the particular input's stanza in inputs.conf.

You don't have to define both a whitelist and a blacklist; they are independent settings. If you define both and a file matches both of them, that file is not indexed: the blacklist overrides the whitelist.
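The precedence rule can be sketched in a few lines of Python. This is only an illustrative model, not Splunk's implementation: the function name `is_consumed` and the sample paths are hypothetical, and Python's `re` module stands in for Splunk's regex matching.

```python
import re

def is_consumed(path, whitelist=None, blacklist=None):
    """Model of the rules described above (illustrative only)."""
    # A blacklist match always wins, even if the whitelist also matches.
    if blacklist is not None and re.search(blacklist, path):
        return False
    # With a whitelist defined, only matching files are consumed.
    if whitelist is not None:
        return re.search(whitelist, path) is not None
    # No whitelist: everything that is not blacklisted is consumed.
    return True

# Hypothetical paths, for illustration.
print(is_consumed("/mnt/logs/app.log", whitelist=r"\.log$"))                    # True
print(is_consumed("/mnt/logs/app.log", whitelist=r"\.log$", blacklist=r"app"))  # False
print(is_consumed("/mnt/logs/notes.txt", whitelist=r"\.log$"))                  # False
print(is_consumed("/mnt/logs/notes.txt"))                                       # True
```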

Whitelist and blacklist rules use regular expression syntax to define the match on the file name/path. Also, your rules must be contained within a configuration stanza, for example [monitor://<path>]; those outside a stanza (global entries) are ignored.

Instead of whitelisting or blacklisting your data inputs, you can filter specific events and send them to different queues or indexes. Read more about routing and filtering data. You can also use the crawl feature to predefine files you want Splunk to index or not index automatically when they are added to your file system.

Important: Define whitelist and blacklist entries with exact regex syntax; the "..." wildcard used for input paths is not supported.
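To see why the "..." wildcard does not carry over, recall that in a regular expression each "." matches exactly one character. The short Python check below illustrates this; the paths are hypothetical and Python's `re` module is used only as a stand-in for Splunk's regex matching.

```python
import re

# Hypothetical paths, used only for illustration.
three_chars = "/mnt/abc/app.log"
nested = "/mnt/logs/web/app.log"

# In a regex, "..." is three single-character wildcards,
# not a match across directory levels.
dots_rule = r"^/mnt/.../app\.log$"
# ".*" is the regex way to span any number of characters, including "/".
star_rule = r"^/mnt/.*/app\.log$"

def matches(rule, path):
    return re.search(rule, path) is not None

print(matches(dots_rule, three_chars))  # True: "abc" is exactly three characters
print(matches(dots_rule, nested))       # False: "logs/web" is not three characters
print(matches(star_rule, nested))       # True
```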

Whitelist (allow) files

To define the files you want Splunk to exclusively index, add the following line to your monitor stanza in the /local/inputs.conf file for the app this input was defined in:

whitelist = <your_custom regex>

For example, if you want Splunk to monitor only files with the .log extension:

[monitor:///mnt/logs]
    whitelist = \.log$

You can whitelist multiple files in one line, using the "|" (OR) operator. For example, to whitelist file names that end in query.log OR my.log:

whitelist = query\.log$|my\.log$

Or, to whitelist only files named exactly query.log or my.log (the leading "/" ties the match to the start of the file name):

whitelist = /query\.log$|/my\.log$

Note: The "$" anchors the regex to the end of the file path. Do not put a space before or after the "|" operator; a space would become part of the pattern.
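The difference between the two whitelist rules above can be checked quickly in Python. The sample paths are hypothetical, the helper `matching` is made up for this sketch, and Python's `re` module is assumed to behave like Splunk's regex matching for these simple patterns.

```python
import re

suffix_rule = r"query\.log$|my\.log$"    # any path ending in query.log or my.log
exact_rule = r"/query\.log$|/my\.log$"   # file name must be exactly query.log or my.log

# Hypothetical paths, for illustration.
paths = [
    "/mnt/logs/query.log",
    "/mnt/logs/my.log",
    "/mnt/logs/old_query.log",
    "/mnt/logs/query.log.1",
]

def matching(rule, candidates):
    """Return the candidates that the rule matches anywhere in the path."""
    return [p for p in candidates if re.search(rule, p)]

print(matching(suffix_rule, paths))
# ['/mnt/logs/query.log', '/mnt/logs/my.log', '/mnt/logs/old_query.log']
print(matching(exact_rule, paths))
# ['/mnt/logs/query.log', '/mnt/logs/my.log']
```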

For information on how whitelists interact with wildcards in input paths, see "Wildcards and whitelisting".

Blacklist (ignore) files

To define the files you want Splunk to exclude from indexing, add the following line to your monitor stanza in the /local/inputs.conf file for the app this input was defined in:

blacklist = <your_custom regex>

Important: If you create a separate blacklist line for each file you want to ignore, Splunk activates only the last one. Combine the patterns into a single blacklist line with the "|" operator instead.
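Because only one blacklist line per stanza takes effect, several patterns have to be merged into one regex. A minimal sketch of doing that programmatically follows; the helper `combine_rules` is hypothetical, and it wraps each pattern in a non-capturing group so that anchors such as "$" stay attached to their own alternative.

```python
import re

def combine_rules(patterns):
    """Join several regex patterns into one alternation (illustrative helper)."""
    return "|".join("(?:%s)" % p for p in patterns)

combined = combine_rules([r"\.txt$", r"\.gz$"])
print("blacklist = " + combined)  # blacklist = (?:\.txt$)|(?:\.gz$)

# Quick check against hypothetical paths.
for path in ["/mnt/logs/a.txt", "/mnt/logs/a.gz", "/mnt/logs/a.log"]:
    print(path, bool(re.search(combined, path)))
```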

If you want Splunk to ignore only files with the .txt extension:

[monitor:///mnt/logs]
    blacklist = \.(txt)$

If you want Splunk to ignore all files with either the .txt extension OR the .gz extension (note the "|" between the alternatives):

[monitor:///mnt/logs]
    blacklist = \.(txt|gz)$

If you want Splunk to ignore entire directories beneath a monitor input, refer to this example:

[monitor:///mnt/logs]
    blacklist = (archive|historical|\.bak$)

The above example tells Splunk to ignore all files under /mnt/logs/ within the archive or historical directories, and all files ending in .bak.
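Note that the rule above is not anchored to directory boundaries, so it also matches any path that merely contains "archive" or "historical". The Python check below shows this with hypothetical paths. The tighter variant `strict_rule` is an assumption introduced here for comparison, not something from the original example, and Python's `re` module again stands in for Splunk's regex matching.

```python
import re

doc_rule = r"(archive|historical|\.bak$)"
# Assumed tighter variant: require "/" on both sides of the directory name.
strict_rule = r"/(archive|historical)/|\.bak$"

# Hypothetical paths, for illustration.
paths = [
    "/mnt/logs/archive/app.log",
    "/mnt/logs/web/historical/app.log",
    "/mnt/logs/app.log.bak",
    "/mnt/logs/app.log",
    "/mnt/logs/archived_report.log",
]

def ignored(rule, candidates):
    """Return the candidates a blacklist rule would match."""
    return [p for p in candidates if re.search(rule, p)]

print(ignored(doc_rule, paths))     # first three paths plus archived_report.log
print(ignored(strict_rule, paths))  # first three paths only
```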

If you want Splunk to ignore files whose names contain a specific string, you can use a rule like this:

[monitor:///mnt/logs]
    blacklist = 2009022[89]file\.txt$

The above example ignores the webserver20090228file.txt and webserver20090229file.txt files under /mnt/logs/.
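A note on character classes: inside square brackets, "|" is a literal pipe character rather than the OR operator, so [8|9] matches "8", "9", or "|", while [89] matches only "8" or "9". The following Python check illustrates the difference. The third file name is a hypothetical edge case, and Python's `re` module is assumed to treat character classes the same way as Splunk's regex matching.

```python
import re

with_pipe = r"2009022[8|9]file\.txt$"  # class contains '8', '|' and '9'
digits_only = r"2009022[89]file\.txt$" # class contains only '8' and '9'

names = [
    "webserver20090228file.txt",
    "webserver20090229file.txt",
    "webserver2009022|file.txt",  # hypothetical edge case with a literal pipe
]

def hits(rule, candidates):
    """Return the candidates that the rule matches."""
    return [n for n in candidates if re.search(rule, n)]

print(hits(with_pipe, names))    # all three names
print(hits(digits_only, names))  # only the first two names
```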


This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18


Comments

Moreymic - While this topic describes how to specify whitelists/blacklists by directly editing the inputs.conf file, you can also specify a whitelist or blacklist in Splunk Manager (the web interface) when adding a file or directory input. Click on the "more settings" option on the "files & directories" data input page; you'll see the whitelist/blacklist fields at the end of the page. You need to specify a regular expression.

Sgoodman, Splunker
March 12, 2012

For a Windows installation, this is terribly confusing. Am I doing this from inside the web-based interface? The "monitor stanza in the /local/inputs.conf file" means absolutely nothing to me. Do Windows expressions work? For example, if I wanted to omit .7z files, could I just put *.7z in the Blacklist options when setting up a new Data Input?

Moreymic
March 6, 2012

If you are going to whitelist multiple files then you need to group them in parentheses, i.e.: whitelist = (query\.log$|my\.log$). I only tested whitelists, and only on Linux 2.6 64-bit.

Tgow
October 19, 2011

Can you apply filters to the contents of a compressed archive? e.g. file.tar.gz:/folder_1/folder_a etc...

Quixand
October 7, 2011
