Splunk® Enterprise

Getting Data In

Specify input paths with wildcards

You can configure inputs manually by editing the inputs.conf file. Input path specifications in inputs.conf use Splunk-defined wildcards rather than regular expressions (regexes). This topic discusses how to specify these wildcards in a path in inputs.conf. To use wildcards, you must define your file and directory monitor inputs in inputs.conf.

Wildcard overview

A wildcard is a character that you can substitute for one or more unspecified characters when searching text or selecting multiple files or directories. You can use wildcards to specify the input path for a file or directory monitor input.

Wildcard: ...
Regular expression equivalent: .*
Description: The ellipsis wildcard recurses through directories and any number of levels of subdirectories to find matches. If you specify a folder separator (for example, /var/log/.../file), it does not match the first folder level, only subfolders.
Examples: /foo/.../bar.log matches the files /foo/1/bar.log, /foo/2/bar.log, /foo/1/2/bar.log, etc., but does not match /foo/bar.log or /foo/3/notbar.log. Because a single ellipsis recurses through all folders and subfolders, /foo/.../bar.log matches the same as /foo/.../.../bar.log.

Wildcard: *
Regular expression equivalent: [^/]*
Description: The asterisk wildcard matches anything in that specific folder path segment. Unlike "...", "*" does not recurse through subfolders.
Examples: /foo/*/bar matches the files /foo/1/bar, /foo/2/bar, etc., but does not match /foo/bar or /foo/1/2/bar. /foo/m*r/bar matches /foo/mr/bar, /foo/mir/bar, /foo/moor/bar, etc. /foo/*.log matches all files with the .log extension, such as /foo/bar.log. It does not match /foo/bar.txt or /foo/bar/test.log.

A single period (.) is not a wildcard. It is treated as a literal period, which is the regular expression equivalent of \. (an escaped period).

For more specific matches, combine the ... and * wildcards. For example, /foo/.../bar/* matches any file in the /bar directory within the specified path.
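The regular expression equivalents listed above can be checked with a short script. The following Python sketch is an illustration only, not Splunk code. The function names wildcard_to_regex and path_matches are hypothetical, the sketch assumes forward-slash paths, and it treats every character other than the two wildcards literally, so it ignores the metacharacter handling described in the next section.

```python
import re

def wildcard_to_regex(path):
    """Translate a path containing '...' and '*' into an anchored regex string."""
    parts = []
    i = 0
    while i < len(path):
        if path.startswith("...", i):
            parts.append(".*")         # ellipsis: any characters, including "/"
            i += 3
        elif path[i] == "*":
            parts.append("[^/]*")      # asterisk: anything within one path segment
            i += 1
        else:
            parts.append(re.escape(path[i]))  # literal character, so "." becomes "\."
            i += 1
    return "^" + "".join(parts) + "$"

def path_matches(pattern, candidate):
    """Return True if the candidate path matches the wildcard pattern."""
    return re.search(wildcard_to_regex(pattern), candidate) is not None

print(path_matches("/foo/.../bar.log", "/foo/1/2/bar.log"))  # True
print(path_matches("/foo/.../bar.log", "/foo/bar.log"))      # False
print(path_matches("/foo/*/bar", "/foo/1/2/bar"))            # False
print(path_matches("/foo/*.log", "/foo/bar.log"))            # True
```

The results agree with the examples given for each wildcard: the ellipsis needs at least one intermediate folder, and the asterisk never crosses a folder separator.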

Wildcards and regular expression metacharacters

When determining the set of files or directories to monitor, Splunk Enterprise splits elements of a monitoring stanza into segments. Segments are blocks of text between directory separator characters ("/" or "\") in the stanza definition. If you specify a monitor stanza that contains segments with both wildcards and regular expression metacharacters (such as (, ), [, ], and |), those characters behave differently depending on where the wildcard is in the stanza.

If a monitoring stanza contains a segment with regular expression metacharacters before a segment with wildcards, the metacharacters are treated literally, as if you wanted to monitor files or directories with those characters in the file or directory names. For example:

[monitor:///var/log/log(a|b).log]

monitors the /var/log/log(a|b).log file. The (a|b) is not treated as a regular expression because no wildcards are present.

[monitor:///var/log()/log*.log]

monitors all files in the /var/log()/ directory that begin with log and have the extension .log. The () is not treated as a regular expression because it is in the segment before the wildcard.

If the regular expression metacharacters occur within or after a segment that contains a wildcard, Splunk Enterprise treats the metacharacters as a regular expression and matches files to monitor accordingly. For example:

[monitor:///var/log()/log(a|b)*.log]

monitors all files in the /var/log()/ directory that begin with either loga or logb and have the extension .log. The first set of () is not treated as a regular expression because the wildcard is in the following segment. The second set of () does get treated as a regular expression because it is in the same segment as the wildcard '*'.

[monitor:///var/.../log(a|b).log]

monitors all files named loga.log or logb.log in any subdirectory of the /var/ directory. Splunk Enterprise treats (a|b) as a regular expression because of the wildcard '...' in the previous stanza segment.

[monitor:///var/.../log[A-Z0-9]*.log]

monitors all files in any subdirectory of the /var/ directory that:

  • begin with log, then
  • contain a single capital letter (from A-Z) or number (from 0-9), then
  • contain any other characters, then
  • end in .log.

The expression [A-Z0-9]* is treated as a regex because of the wildcard '...' in the previous stanza segment.
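The segment rule described in this section can be modeled in a few lines of Python. This is a rough sketch under stated assumptions, not the actual Splunk implementation: the names stanza_path_to_regex, translate_segment, and has_wildcard are hypothetical, only forward-slash paths are handled, and a period is always treated as a literal period.

```python
import re

def has_wildcard(segment):
    return "..." in segment or "*" in segment

def translate_segment(segment):
    """Translate wildcards in a segment; leave regex metacharacters as regex."""
    out = []
    i = 0
    while i < len(segment):
        if segment.startswith("...", i):
            out.append(".*")
            i += 3
        elif segment[i] == "*":
            out.append("[^/]*")
            i += 1
        elif segment[i] == ".":
            out.append(r"\.")          # a single period is a literal period
            i += 1
        else:
            out.append(segment[i])     # (, ), [, ], | keep their regex meaning
            i += 1
    return "".join(out)

def stanza_path_to_regex(path):
    """Segments before the first wildcard segment are literal; the rest are regex."""
    segments = path.split("/")
    first = next((i for i, s in enumerate(segments) if has_wildcard(s)), len(segments))
    literal = [re.escape(s) for s in segments[:first]]
    pattern = [translate_segment(s) for s in segments[first:]]
    return "^" + "/".join(literal + pattern) + "$"

def monitored(path, candidate):
    return re.search(stanza_path_to_regex(path), candidate) is not None

print(monitored("/var/log/log(a|b).log", "/var/log/log(a|b).log"))      # True
print(monitored("/var/log/log(a|b).log", "/var/log/loga.log"))          # False
print(monitored("/var/log()/log(a|b)*.log", "/var/log()/logb-2.log"))   # True
print(monitored("/var/.../log[A-Z0-9]*.log", "/var/app/loga.log"))      # False
```

In this model the position of the first wildcard segment is the only thing that decides whether (a|b) is matched as literal text or as an alternation, which mirrors the examples above.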

Input examples

To monitor /apache/foo/logs, /apache/bar/logs, /apache/bar/1/logs:

[monitor:///apache/.../logs/*]

To monitor /apache/foo/logs, /apache/bar/logs, but not /apache/bar/1/logs or /apache/bar/2/logs:

[monitor:///apache/*/logs]

To monitor any file directly under /apache/ that ends in .log:

[monitor:///apache/*.log]

To monitor any file under /apache/ under any level of subdirectory that ends in .log:

[monitor:///apache/.../*.log]

When "..." is followed by a folder separator, files that sit directly in the folder at the level of the wildcard are excluded. For example, for the stanza

[monitor:///var/log/.../*.log]

the tailing logic becomes '^\/var\/log/.*/[^/]*\.log$'

Therefore, /var/log/subfolder/test.log matches, but /var/log/test.log does not match and is excluded. To monitor all .log files in /var/log and in all of its subfolders, use:

[monitor:///var/log/]
whitelist=\.log$
recurse=true
# recurse is true by default
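The behavior of the regular expression quoted above is easy to confirm. The following Python sketch simply applies that expression, and the whitelist expression from the alternative stanza, to the two paths mentioned in the text. It assumes that Python's re module behaves like PCRE for these simple patterns; the variable names are illustrative.

```python
import re

# Regular expression quoted in the text for [monitor:///var/log/.../*.log]
tailing = re.compile(r"^\/var\/log/.*/[^/]*\.log$")

# Whitelist from the alternative stanza that monitors /var/log/ recursively
whitelist = re.compile(r"\.log$")

for path in ("/var/log/subfolder/test.log", "/var/log/test.log"):
    print(path, bool(tailing.search(path)), bool(whitelist.search(path)))

# /var/log/subfolder/test.log True True
# /var/log/test.log False True
```

The first expression requires a folder separator after the part matched by the ellipsis, so a file directly in /var/log/ cannot match. The whitelist expression only looks at the end of the path, so it accepts both files.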

Wildcards and whitelisting

Splunk Enterprise defines whitelists and blacklists with standard Perl-compatible Regular Expression (PCRE) syntax.

When you specify wildcards in a file input path, Splunk Enterprise creates an implicit whitelist for that stanza. The longest wildcard-free path becomes the monitor stanza, and Splunk Enterprise translates the wildcards into regular expressions.

Splunk Enterprise anchors the converted expression to the right end of the file path, so that the entire path must be matched.

For example, if you specify

[monitor:///foo/bar*.log]

Splunk Enterprise translates this into

[monitor:///foo/]
whitelist = bar[^/]*\.log$

On Windows, if you specify

[monitor://C:\Windows\foo\bar*.log]

Splunk Enterprise translates it into

[monitor://C:\Windows\foo\]
whitelist = bar[^/]*\.log$

Note: In Windows, whitelist and blacklist rules do not support regular expressions that include backslashes. Use two backslashes (\\) to escape wildcards.
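As an illustration of the implicit whitelist, the following Python sketch splits a monitor path into its longest wildcard-free base path and a whitelist expression. It is a simplified model rather than Splunk code: the function names split_monitor_path and translate are hypothetical, only forward-slash paths are handled, and regular expression metacharacters are not considered.

```python
def translate(text):
    """Convert '...' to '.*', '*' to '[^/]*', and '.' to an escaped period."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("...", i):
            out.append(".*")
            i += 3
        elif text[i] == "*":
            out.append("[^/]*")
            i += 1
        elif text[i] == ".":
            out.append(r"\.")
            i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def split_monitor_path(path):
    """Return (base_path, whitelist); whitelist is None if there are no wildcards."""
    segments = path.split("/")
    first = next((i for i, s in enumerate(segments) if "..." in s or "*" in s), None)
    if first is None:
        return path, None
    base = "/".join(segments[:first]) + "/"       # longest wildcard-free path
    rest = "/".join(segments[first:])
    return base, translate(rest) + "$"            # anchored to the end of the path

base, whitelist = split_monitor_path("/foo/bar*.log")
print("[monitor://" + base + "]")    # [monitor:///foo/]
print("whitelist = " + whitelist)    # whitelist = bar[^/]*\.log$
```

For the /foo/bar*.log example, the sketch produces the same base path and whitelist that are shown above.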

This documentation applies to the following versions of Splunk® Enterprise: 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2


Comments

> Note: Because a single ellipse recurses through all directories and subdirectories, /foo/.../bar.log matches the same as /foo/.../.../bar.log.

Actually, doesn't the second pattern require at least two subdirectories after /foo, whereas the first only requires one? In other words, wouldn't the second pattern exclude /foo/a/bar.log?

Perhaps this should be phrased as
> /foo/*/.../bar.log matches the same file names as /foo/.../.../bar.log, but the former is the preferred way to express this.

Matt harden
July 10, 2015
