Monitor files and directories

This documentation does not apply to the most recent version of Splunk.

Splunk has two file input processors: monitor and upload. Use monitor for most data sources that come from files and directories. Use upload for one-time inputs, such as an archive of historical data.

This topic discusses how to add monitor and upload inputs using Splunk Web and the configuration files. You can also add, edit, and list monitor inputs using the CLI, which is covered in a separate topic.

How monitor works in Splunk

Specify a path to a file or directory, and Splunk's monitor processor consumes any new input from it. This is how you monitor live application logs, such as those coming from J2EE or .NET applications, Web access logs, and so on. Splunk continues to index the data in this file or directory as it comes in. You can also specify a mounted or shared directory, including network filesystems, as long as the Splunk server can read from the directory. If the specified directory contains subdirectories, Splunk recursively examines them for new files.

Splunk checks for the file or directory specified in a monitor configuration when the Splunk server starts or restarts. If the file or directory is not present at startup, Splunk checks for it again at 24-hour intervals, counted from the time of the last restart. Subdirectories of monitored directories are scanned continuously. To add new inputs without restarting Splunk, use Splunk Web or the command line interface. If you want Splunk to find potential new inputs automatically, use crawl.

When using monitor, keep the following restrictions in mind:

Note: You cannot currently use both monitor and file system change monitor to follow the same directory or file. If you want to see changes in a directory, use file system change monitor. If you want to index new events in a directory, use monitor.

Note: Monitor input stanzas may not overlap. That is, monitoring /a/path while also monitoring /a/path/subdir will produce unreliable results. Similarly, monitor input stanzas which watch the same directory with different whitelists, blacklists, and wildcard components are not supported.
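
As a rough illustration of the overlap rule, the following inputs.conf sketch contrasts a layout that should be safe with one that is not supported. The paths are hypothetical and chosen only for the example. Lines that begin with # are comments.

# Two sibling directories: these stanzas do not overlap.
[monitor:///var/log/app_a]
[monitor:///var/log/app_b]
# Not supported: the second path sits inside the first.
# [monitor:///var/log]
# [monitor:///var/log/app_a]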

Why use upload or batch

Use the "Upload a local file" or "Index a file on the Splunk server" option to index a static file one time. The file will not be monitored on an ongoing basis.

Use the batch input type in inputs.conf to load files once and destructively. By default, Splunk's batch processor watches the directory $SPLUNK_HOME/var/spool/splunk. If you move a file into this directory, Splunk indexes it and then deletes it.

Note: For best practices on loading file archives, see "How to index different sized archives" on the Community Wiki.

Monitor files and directories in Splunk Web

Add inputs from files and directories via Splunk Web.

1. Click Manager in the upper right-hand corner of Splunk Web.

2. Under System configurations, click Data Inputs.

3. Click Files and directories.

4. Click New to add an input.

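5. Choose the radio button for the kind of input you want: ongoing monitoring of a file or directory, or a one-time load using "Upload a local file" or "Index a file on the Splunk server".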

6. Specify the path to the file or directory. If you select Upload a local file, use the Browse... button.

To monitor a shared network drive, enter the following: <myhost>/<mypath> (or \\<myhost>\<mypath> on Windows). Make sure Splunk has read access to the mounted drive as well as to the files you want to monitor.

7. Under the Host heading, select the host name. You have several choices if you are using Monitor or Batch methods. Learn more about setting host value.

Note: Host only sets the host field in Splunk. It does not direct Splunk to look on a specific host on your network.

8. Now set the Source Type. Source type is a default field that Splunk adds to events. Splunk uses it to determine processing characteristics such as timestamps and event boundaries.

9. After specifying the source, host, and source type, click Submit.

Define input stanzas in inputs.conf

To add an input, add a stanza for it to inputs.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. If you have not worked with Splunk's configuration files before, read about configuration files in this manual before you begin.

You can set any number of attributes and values following an input type. If you do not specify a value for one or more attributes, Splunk uses the defaults that are preset in $SPLUNK_HOME/etc/system/default/.

Note: To ensure new events are indexed when you copy over an existing file with new contents, set CHECK_METHOD = modtime in props.conf for the source. This checks the modtime of the file and re-indexes when it changes. Note that the entire file is indexed, which can result in duplicate events.
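
For instance, a props.conf stanza along these lines would apply the setting to a single source. This is a minimal sketch: the path /var/log/reports/daily.csv is a hypothetical file, not one named in this manual.

[source::/var/log/reports/daily.csv]
# Re-index the whole file whenever its modification time changes.
CHECK_METHOD = modtime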

The following are options that you can use in both monitor and batch input stanzas. See the sections following for more attributes that are specific to each type of input.

host = <string>

index = <string>

sourcetype = <string>

source = <string>

queue = <string> (parsingQueue, indexQueue, etc)

host_regex = <regular expression>

host_segment = <integer>
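
The sketch below shows how several of these attributes might be combined in one stanza. It assumes a hypothetical layout in which each host writes to its own directory, /var/log/<hostname>/, and it assumes that host_segment counts path segments starting from 1. The source type name app_log is also made up for the example. Comments go on their own lines, because text after a value is treated as part of the value.

[monitor:///var/log/]
# Take the host value from the third path segment:
# /var/log/<hostname>/... gives <hostname>.
host_segment = 3
# Send events to the main index.
index = main
# Hypothetical source type name.
sourcetype = app_log

If every file under the path comes from the same machine, a fixed host = <string> value is the simpler choice.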

Monitor syntax and examples

Monitor input stanzas direct Splunk to watch all files in <path> (or just <path> itself if it represents a single file). You must specify the input type and then the path, so a path that starts at root has three slashes: two from monitor:// and one that begins the path. You can use wildcards in the path. For more information, read how to "Specify input paths with wildcards".

[monitor://<path>]
<attribute1> = <val1>
<attribute2> = <val2>
...

The following are additional attributes you can use when defining monitor input stanzas.

crcSalt = <string>

followTail = 0|1

_whitelist = <regular expression>

_blacklist = <regular expression>

Example 1. To load anything in /apache/foo/logs or /apache/bar/logs, etc.

[monitor:///apache/.../logs]

Example 2. To load anything in /apache/ that ends in .log.

[monitor:///apache/*.log]
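
The additional attributes can be combined with a path in the same way. The following is a sketch only, with hypothetical directories. It assumes the commonly documented behavior of these settings: _whitelist and _blacklist are matched against the file path, followTail = 1 starts reading at the end of a file, and the literal value <SOURCE> for crcSalt mixes the full source path into the checksum that Splunk uses to recognize files.

[monitor:///var/log/myapp]
# Index only files whose path ends in .log.
_whitelist = \.log$
# Skip existing content and pick up only what is written from now on.
followTail = 1
[monitor:///var/log/otherapp]
# Ignore compressed and temporary files.
_blacklist = \.(gz|tmp)$
# Treat files with identical beginnings but different paths as distinct.
crcSalt = <SOURCE>

The two stanzas watch different directories, so they do not conflict with the rule against overlapping monitor inputs.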

Batch syntax and examples

Use batch to set up a one-time, destructive input of data from a source. For continuous, non-destructive inputs, use monitor. Remember that after the batch input is indexed, Splunk deletes the file.

[batch://<path>]
move_policy = sinkhole
<attribute1> = <val1>
<attribute2> = <val2>
...

Important: When defining batch inputs, you must include the setting move_policy = sinkhole. This loads the file destructively. Do not use this input type for files that you do not want to consume destructively.

Note: source = <string> and <KEY> = <string> are not used by batch.

Example: This example batch loads all files from the directory /system/flight815/.

[batch:///system/flight815/*]
move_policy = sinkhole
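
A batch stanza can also carry the attributes shared with monitor, apart from source. The sketch below assumes a hypothetical drop directory, /var/spool/import/, and a made-up source type name. Every file it matches is deleted after indexing, so point it only at copies.

[batch:///var/spool/import/*.log]
# Required for batch inputs: index each file, then delete it.
move_policy = sinkhole
# Send events to the main index.
index = main
# Hypothetical source type name.
sourcetype = imported_log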

This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6, 4.0.7, 4.0.8, 4.0.9, 4.0.10, 4.0.11.

