Monitor files and directories

This documentation does not apply to the most recent version of Splunk.

Splunk has two file input processors: monitor and upload. For the most part, you can use monitor to add all your data sources from files and directories. However, you may want to use upload when you want to add one-time inputs, such as an archive of historical data.

This topic describes how to add monitor and upload inputs using Splunk Web and the configuration files. You can also add, edit, and list monitor inputs using the CLI, which is covered in a separate topic.

How monitor works in Splunk

Specify a path to a file or directory, and Splunk's monitor processor consumes any new input written to it. Use monitor for live application logs such as those from J2EE or .NET applications, Web access logs, and so on. Splunk continues to index the data in the file or directory as it arrives. You can also specify a mounted or shared directory, including network filesystems, as long as the Splunk server can read from the directory. If the specified directory contains subdirectories, Splunk recursively examines them for new files.

Splunk checks for the file or directory specified in a monitor configuration when the Splunk server starts or restarts. If the file or directory is not present at startup, Splunk checks for it again at 24-hour intervals from the time of the last restart. Subdirectories of monitored directories are scanned continuously. To add new inputs without restarting Splunk, use Splunk Web or the command line interface. If you want Splunk to find potential new inputs automatically, use crawl.
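
As a minimal sketch of the behavior described above, a monitor input in inputs.conf (covered later in this topic) can be as short as a single stanza header. The path /var/log/myapp is a hypothetical example, not a Splunk default.

```ini
# Hypothetical path: watch everything under /var/log/myapp, including
# subdirectories, and index new data as it is written.
[monitor:///var/log/myapp]
```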

When using monitor, note the following:

Note: You cannot currently use both monitor and file system change monitor to follow the same directory or file. If you want to see changes in a directory, use file system change monitor. If you want to index new events in a directory, use monitor.

Note: Monitor input stanzas may not overlap. That is, monitoring /a/path while also monitoring /a/path/subdir will produce unreliable results. Similarly, monitor input stanzas that watch the same directory with different whitelists, blacklists, and wildcard components are not supported.
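
To illustrate the overlap rule, the sketch below reuses the /a/path example from the note above. The commented-out pair is the unsupported configuration. The single stanza relies on monitor's recursive scanning to cover the subdirectory.

```ini
# Not supported: these two stanzas overlap, so results are unreliable.
# [monitor:///a/path]
# [monitor:///a/path/subdir]

# Instead, one stanza on the parent directory also covers /a/path/subdir,
# because monitor examines subdirectories recursively.
[monitor:///a/path]
```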

Why use upload or batch

Use the Upload a local file or Index a file on the Splunk server options to index a static file one time. The file will not be monitored on an ongoing basis.

Use the batch input type in inputs.conf to load files once and destructively. By default, Splunk's batch processor reads from $SPLUNK_HOME/var/spool/splunk. If you move a file into this directory, Splunk indexes it and then deletes it.

Note: For best practices on loading file archives, see "How to index different sized archives" on the Community Wiki.

Configure with Splunk Web

Add inputs from files and directories via Splunk Web.

1. Click Manager in the upper right-hand corner of Splunk Web.

2. Under System configurations, click Data Inputs.

3. Click Files and directories.

4. Click Add new to add an input.

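5. Select a Source radio button. The options referred to in this topic are Monitor a file or directory, Upload a local file, and Index a file on the Splunk server.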

6. Specify the Full path to the file or directory.

To monitor a shared network drive, enter the following: <myhost>/<mypath> (or \\<myhost>\<mypath> on Windows). Make sure Splunk has read access to the mounted drive, as well as to the files you wish to monitor.

7. Under the Host section, set the host name value. You have several choices for this setting. Learn more about setting the host value in "About default fields".

Note: Host only sets the host field. It does not direct Splunk to look on a specific host on your network.

8. Set the Source type. Source type is a default field added to events. Source type is used to determine processing characteristics such as timestamps and event boundaries.

9. Set the Index. Leave the value as "default", unless you have defined multiple indexes to handle different types of events. In addition to indexes for user data, Splunk has a number of utility indexes, which show up in the dropdown box.

10. Click Save.
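
The Splunk Web steps above correspond roughly to a stanza like the following in inputs.conf. This is a sketch only: the path, host name, source type, and index values are hypothetical placeholders, and it assumes the Monitor a file or directory source was selected.

```ini
# Step 6: full path to the file or directory (hypothetical path).
[monitor:///var/log/httpd]
# Step 7: value for the host field (hypothetical name).
host = webserver01
# Step 8: source type assigned to the events (hypothetical choice).
sourcetype = access_combined
# Step 9: index that receives the events (hypothetical choice).
index = main
```

Comments sit on their own lines here because a trailing # after a value would be read as part of the value.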

Advanced options for file/directory monitoring

If you choose Monitor a file or directory as the source, the page includes an Advanced Options section, where you can configure additional settings.

For detailed information on whitelists and blacklists, see Whitelist or blacklist specific incoming data in this manual.

Configure with inputs.conf

To add an input, add a stanza to inputs.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. If you have not worked with Splunk's configuration files before, read "About configuration files" before you begin.

You can set multiple attributes in an input stanza. If you do not specify a value for an attribute, Splunk uses the default that's preset in $SPLUNK_HOME/etc/system/default/.

Note: To ensure that new events are indexed when you copy over an existing file with new contents, set CHECK_METHOD = modtime in props.conf for the source. This checks the modtime of the file and re-indexes it when it changes. Be aware that the entire file will be re-indexed, which can result in duplicate events.
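
A minimal sketch of that props.conf setting follows. The source path is a hypothetical example. Substitute the file that gets overwritten in your environment.

```ini
# props.conf
# Hypothetical source: a file that is periodically replaced with new contents.
[source::/var/log/myapp/report.log]
# Re-index the file whenever its modification time changes.
CHECK_METHOD = modtime
```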

Configuration settings

The following are options that you can use in both monitor and batch input stanzas. See the sections that follow for attributes that are specific to each type of input.

host = <string>

index = <string>

sourcetype = <string>

source = <string>

queue = parsingQueue | indexQueue

_TCP_ROUTING = <tcpout_group_name>,<tcpout_group_name>,...

host_regex = <regular expression>

host_segment = <integer>
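
As a sketch of how these shared attributes combine in one stanza, the example below sets the index and source type and derives the host from the path. The path and values are hypothetical, and the comment on host_segment assumes that segments are counted from 1, starting at the first directory in the path.

```ini
# Hypothetical layout: /data/logs/<hostname>/<logfile>
[monitor:///data/logs]
index = main
sourcetype = syslog
# Assumption: segment 3 of /data/logs/web01/messages is "web01",
# so each event takes its host from that directory name.
host_segment = 3
```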

Monitor syntax and examples

Monitor input stanzas direct Splunk to watch all files in the <path> (or just <path> itself if it represents a single file). You must specify the input type and then the path, so put three slashes in your path if you're starting at root. You can use wildcards for the path. For more information, read how to "Specify input paths with wildcards".

[monitor://<path>]
<attribute1> = <val1>
<attribute2> = <val2>
...

The following are additional attributes you can use when defining monitor input stanzas:

crcSalt = <string>

followTail = 0|1

whitelist = <regular expression>

blacklist = <regular expression>

alwaysOpenFile = 0 | 1

time_before_close = <integer>

recursive = true|false

followSymlink = true|false
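
The sketch below shows two of these attributes together. The path and regular expressions are hypothetical, and the comments describe the intended effect on the assumption that whitelist and blacklist are matched against the full file path.

```ini
[monitor:///var/log/myapp]
# Only index files whose path ends in .log ...
whitelist = \.log$
# ... and skip any file that sits under a directory named debug.
blacklist = /debug/
```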

Example 1. To load anything in /apache/foo/logs or /apache/bar/logs, etc.

[monitor:///apache/.../logs]

Example 2. To load anything in /apache/ that ends in .log.

[monitor:///apache/*.log]

Batch syntax and examples

Use batch to set up a one-time, destructive input of data from a source. This input is useful when, for example, you have a directory of files whose data you want to index but that you do not want to keep on disk.

Caution: For continuous, non-destructive inputs, use monitor. Remember, after the batch input is indexed, Splunk deletes the file.

[batch://<path>]
move_policy = sinkhole
<attribute1> = <val1>
<attribute2> = <val2>
...

Important: When defining batch inputs, you must include the setting, move_policy = sinkhole. This loads the file destructively. Do not use this input type for files you do not want to consume destructively.

Note: source = <string> and <KEY> = <string> are not used by batch.

Example: This stanza batch loads all files from the directory /system/flight815/ but does not recurse through any subdirectories under it. Remove the asterisk to make Splunk recurse:

[batch:///system/flight815/*]
move_policy = sinkhole

For details on using the asterisk in input paths, see "Specify input paths with wildcards".
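
As a further sketch, a batch stanza can also carry the shared attributes listed under Configuration settings. The directory, source type, and index below are hypothetical. The path uses three slashes because it starts at root.

```ini
# Hypothetical archive directory: index every file once, then delete it.
[batch:///opt/archive/old_logs]
move_policy = sinkhole
sourcetype = syslog
index = main
```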

This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8.

