Monitor files and directories
Splunk Enterprise has three file input processors: monitor, MonitorNoHandle, and upload.
You can use monitor to add nearly all your data sources from files and directories. However, you might want to use upload to add one-time inputs, such as an archive of historical data.
On hosts that run Windows Vista or Windows Server 2008 and later, you can use MonitorNoHandle to monitor files which the system rotates automatically. MonitorNoHandle works only on Windows hosts.
You can add inputs to monitor or upload using Splunk Web, the CLI, or inputs.conf.
You can add inputs to MonitorNoHandle using either the CLI or inputs.conf.
You can use the "Set Sourcetype" page to see how Splunk Enterprise will index data from a file. See "The 'Set Sourcetype' page" for details.
How monitor works in Splunk Enterprise
Specify a path to a file or directory and the Splunk Enterprise monitor processor consumes any new data written to that file or directory. This is how you can monitor live application logs such as those coming from Java 2 Platform Enterprise Edition (J2EE) or .NET applications, Web access logs, and so on. Splunk Enterprise continues to monitor and index the file or directory as new data appears. You can also specify a mounted or shared directory, including network file systems, so long as Splunk Enterprise can read from the directory. If the specified directory contains subdirectories, Splunk Enterprise recursively examines them for new files.
Splunk Enterprise checks for the file or directory specified in a monitor configuration on start and restart. If the file or directory is not present on start, Splunk Enterprise continues to check for it every 24 hours from the time of the last restart. Splunk Enterprise also scans subdirectories of monitored directories continuously. To add new inputs without restarting Splunk Enterprise, use Splunk Web or the CLI. If you want Splunk Enterprise to find potential new inputs automatically, use the crawl CLI command.
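As a minimal sketch of the behavior described above, a monitor input in inputs.conf might look like the following. The directory path and the sourcetype name are hypothetical placeholders, not values from this manual.

```ini
# inputs.conf (illustrative sketch)
# Monitor a directory; Splunk Enterprise also examines its subdirectories.
# The path /var/log/myapp and the sourcetype myapp_log are assumed examples.
[monitor:///var/log/myapp]
sourcetype = myapp_log
index = main
disabled = false
```

If the path does not exist when Splunk Enterprise starts, the checks described above apply.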
When using monitor, note the following:
- On most file systems, files can be read even as they are being written to. However, Windows file systems can prevent files from being read while they are being written to, and some Windows programs open files in these modes. If you need to read files while they are being written to, you can use the MonitorNoHandle input.
- Files or directories can be included or excluded via whitelists and blacklists.
- Upon restart, Splunk Enterprise continues processing files where it left off.
- Splunk Enterprise decompresses archive files before it indexes them. It can handle these common archive file types:
tar, gz, bz2, tar.gz, tgz, tbz, tbz2, zip, and
- If you add new data to an existing archive file, Splunk Enterprise will re-index the entire file, not just the new data in the file. This can result in duplication of events.
- Splunk Enterprise detects log file rotation and does not process renamed files it has already indexed (with the exception of .tar and .gz archives; for more information see "Log file rotation" in this manual).
- The entire dir/filename path must not exceed 1024 characters.
- Disabling or deleting a file-based input using the command line or System does not stop the input's files from being indexed. Rather, it stops files from being checked again, but all the initial content will be indexed. To stop all in-process data, you must restart the Splunk Enterprise server.
- Splunk Enterprise does not index files with a .splunk filename extension. This is because Splunk Enterprise expects files with that extension to be metadata information files. If you need to index files with a .splunk extension, use the add oneshot CLI command.
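To illustrate the whitelist and blacklist note in the list above, the following sketch restricts a monitored directory by file name. Both settings take regular expressions that are matched against the file path. The path and the patterns are assumed examples.

```ini
# inputs.conf (illustrative sketch)
# Path and patterns are hypothetical.
[monitor:///var/log/myapp]
# Only consider files whose names end in .log
whitelist = \.log$
# Skip compressed copies left behind by log rotation
blacklist = \.(gz|bz2)$
```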
Monitor inputs may overlap. So long as the stanza names are different, Splunk Enterprise treats them as independent stanzas and files matching the most specific stanza will be treated in accordance with its settings.
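For example, the two stanzas below overlap, because the second path lies inside the first. This is a sketch under assumed paths. With this configuration, the access log would be handled according to the more specific second stanza, while other files under /var/log follow the first.

```ini
# inputs.conf (illustrative sketch; paths are hypothetical)
# General stanza covering the whole directory tree
[monitor:///var/log]
sourcetype = syslog

# More specific stanza for a single file inside that tree
[monitor:///var/log/httpd/access.log]
sourcetype = access_combined
```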
Why use upload or batch?
To index a static file once, select Upload in Splunk Web. Splunk Enterprise indexes the file one time and does not continue to monitor it.
You can also use the CLI add oneshot or spool commands for the same purpose. See "Use the CLI" for details.
Use the batch input type in inputs.conf to load files once and destructively. By default, the Splunk batch processor is located in $SPLUNK_HOME/var/spool/splunk. If you move a file into this directory, Splunk indexes it and then deletes it.
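A batch input on a directory of your own might be sketched as follows. The directory name is a hypothetical example. Because batch deletes files after indexing them, point it only at copies you do not need to keep.

```ini
# inputs.conf (illustrative sketch)
# The directory /opt/archive/incoming is an assumed example.
[batch:///opt/archive/incoming]
# Required for batch inputs: index each file, then delete it
move_policy = sinkhole
```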
Note: For best practices on loading file archives, see "How to index different sized archives" on the Community Wiki.
Why use MonitorNoHandle?
This Windows-only input lets you read files on Windows systems as Windows writes to them. It does this by using a kernel-mode filter driver to capture raw data as it gets written to the file. Use this input stanza on a file which the system locks open for writing, such as the Windows DNS server log file.
MonitorNoHandle only works on Windows Vista or Windows Server 2008 and later operating systems. You can only monitor single files with MonitorNoHandle. You cannot monitor directories. If a file you choose to monitor already exists, Splunk does not index its current contents, only new information that comes into the file as it gets written to.
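A MonitorNoHandle stanza for the DNS server log case mentioned above might be sketched like this. The file path is an assumption, since the location of the DNS log depends on how the server is configured. The stanza must name a single file, not a directory.

```ini
# inputs.conf on a Windows host (illustrative sketch)
# The log path below is an assumed example; substitute the actual file.
[MonitorNoHandle://C:\Windows\System32\dns\dns.log]
index = main
disabled = false
```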
Modify input settings
Use Splunk Web
This documentation applies to the following versions of Splunk® Enterprise: 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15