Splunk® Enterprise

Getting Data In

Monitor files and directories

Splunk Enterprise has three file input processors: monitor, MonitorNoHandle, and upload.

You can use monitor to add nearly all your data sources from files and directories. However, you might want to use upload to add one-time inputs, such as an archive of historical data.

On hosts that run Windows Vista or Windows Server 2008 and later, you can use MonitorNoHandle to monitor files which the system rotates automatically. The MonitorNoHandle input works only on Windows hosts.

Add inputs to monitor or upload using any of these methods:

  • Splunk Web
  • The CLI
  • inputs.conf

You can add inputs to MonitorNoHandle using either the CLI or inputs.conf.

Use the "Set Sourcetype" page to see how the data from a file will be indexed. See The "Set Sourcetype" page for details.

How the monitor processor works

Specify a path to a file or directory, and the monitor processor consumes any new data written to that file or directory. This is how you can monitor live application logs, such as Web access logs and logs from Java 2 Platform Enterprise Edition (J2EE) or .NET applications.

Splunk Enterprise monitors and indexes the file or directory as new data appears. You can also specify a mounted or shared directory, including network file systems, as long as Splunk Enterprise can read from the directory. If the specified directory contains subdirectories, the monitor process recursively examines them for new files, as long as the directories can be read.
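
As a minimal sketch of what such an input can look like in inputs.conf, the stanzas below monitor one file and one directory. The paths, the index name, and the source type are hypothetical placeholders, not values from this topic. Adjust them for your environment.

```ini
# Hypothetical example: monitor a single application log file.
[monitor:///var/log/myapp/app.log]
disabled = 0
# Assumption: an index named "main" exists and is the intended destination.
index = main
# Assumption: "myapp_log" is a placeholder source type name.
sourcetype = myapp_log

# Hypothetical example: monitor a directory, including its subdirectories.
[monitor:///var/log/myapp]
disabled = 0
index = main
```

Changes to inputs.conf typically take effect after you restart Splunk Enterprise.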

You can include or exclude files or directories from being read by using whitelists and blacklists.
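
The following sketch shows one way to filter a monitored directory, assuming the whitelist and blacklist settings in inputs.conf, which take regular expressions that are matched against the file path. The directory and the patterns are hypothetical.

```ini
# Hypothetical example: read only .log files under this directory.
[monitor:///var/log/myapp]
whitelist = \.log$

# Hypothetical example: read everything under this directory
# except files that end in .gz, .zip, or .bz2.
[monitor:///var/log/other]
blacklist = \.(gz|zip|bz2)$
```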

If you disable or delete a monitor input, Splunk Enterprise does not stop indexing the files that the input references. It only stops checking those files again. To stop all in-process data indexing, you must stop and restart the Splunk server.

How Splunk Enterprise handles monitoring of files during restarts

When the Splunk server is restarted, it continues processing files where it left off. It first checks for the file or directory specified in a monitor configuration. If the file or directory is not present on start, Splunk Enterprise checks for it every 24 hours from the time of the last restart. The monitor process scans subdirectories of monitored directories continuously.

Monitor inputs can overlap. As long as the stanza names are different, Splunk Enterprise treats them as independent stanzas, and it handles files that match the most specific stanza according to that stanza's settings.
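
To illustrate overlapping inputs, the sketch below defines a broad stanza and a more specific one. The paths and index names are hypothetical, and the indexes are assumed to exist already. Based on the behavior described above, files under /var/log/nginx would be expected to follow the settings of the second stanza, because it is the more specific match.

```ini
# Hypothetical broad input: everything under /var/log.
[monitor:///var/log]
# Assumption: "os_logs" is a placeholder index name.
index = os_logs

# Hypothetical more specific input that overlaps the one above.
[monitor:///var/log/nginx]
# Assumption: "web" is a placeholder index name.
index = web
```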

How Splunk Enterprise monitors archive files

Archive files (such as a .tar or .zip file) are decompressed before being indexed. The following types of archive files are supported:

  • .tar
  • .gz
  • .bz2
  • .tar.gz and .tgz
  • .tbz and .tbz2
  • .zip
  • .z

If you add new data to an existing archive file, the entire file is reindexed, not just the new data. This can result in event duplication.

How Splunk Enterprise monitors files that the operating system rotates on a schedule

The monitoring process detects log file rotation and does not process renamed files that it has already indexed (with the exception of .tar and .gz archives). See How Splunk Enterprise handles log file rotation.

How Splunk Enterprise monitors nonwritable Windows files

Windows can prevent Splunk Enterprise from reading open files. If you need to read files while they are being written to, you can use the MonitorNoHandle input.

Restrictions on file monitoring

Splunk Enterprise cannot monitor a file whose path exceeds 1024 characters.

Files with a .splunk filename extension are also not monitored, because files with that extension contain Splunk metadata. If you need to index files with a .splunk extension, use the add oneshot CLI command.

Why use upload or batch?

To index a static file once, select Upload in Splunk Web.

You can also use the CLI add oneshot or spool commands for the same purpose. See Use the CLI for details.

If you have Splunk Enterprise, you can use the batch input type in inputs.conf to load files once and destructively. By default, the Splunk batch processor is located in $SPLUNK_HOME/var/spool/splunk. If you move a file into this directory, the file is indexed and then deleted.
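
A batch input in inputs.conf can be sketched as follows. The directory and source type are hypothetical. The move_policy = sinkhole setting is what makes the input destructive, so point it only at files you can afford to lose after indexing.

```ini
# Hypothetical example: load every file placed in this directory once,
# then delete it after indexing.
[batch:///opt/archive/incoming]
# sinkhole tells the batch processor to delete each file after it is indexed.
move_policy = sinkhole
# Assumption: "archived_logs" is a placeholder source type name.
sourcetype = archived_logs
```

Use a monitor input instead of batch for files that must remain on disk.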

Note: For best practices on loading file archives, see How to index different sized archives on the Community Wiki.

Why use MonitorNoHandle?

This Windows-only input lets you read files on Windows systems as Windows writes to them. It does this by using a kernel-mode filter driver to capture raw data as it gets written to the file. Use this input stanza on a file that the system locks open for writing, such as the Windows DNS server log file.

Caveats for using MonitorNoHandle

The MonitorNoHandle input has the following caveats:

  • MonitorNoHandle works only on Windows Vista or Windows Server 2008 and later operating systems. It does not work with earlier versions of Windows, nor does it work on operating systems that are not Windows.
  • You can only monitor single files with MonitorNoHandle. To monitor more than one file, you must create a MonitorNoHandle input stanza for each file.
  • You cannot monitor directories with MonitorNoHandle.
  • If a file you choose to monitor with MonitorNoHandle already exists, Splunk Enterprise does not index its current contents, only new information that comes into the file as processes write to it.
  • When you monitor a file with MonitorNoHandle, the source field for the file is MonitorNoHandle, not the name of the file. If you want to have the source field be the name of the file, you must set it explicitly in inputs.conf. See Monitor files and directories with inputs.conf.
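
The caveats above can be sketched as an inputs.conf stanza like the following. The file path is an assumed location for the Windows DNS server log and might differ on your host. Using the source setting to override the source field is also an assumption here, so confirm it against the inputs.conf reference for your version.

```ini
# Hypothetical example: one stanza per file, no directories or wildcards.
[MonitorNoHandle://C:\Windows\System32\dns\dns.log]
disabled = 0
# Assumption: set source explicitly so that events carry the file name
# instead of "MonitorNoHandle" in the source field.
source = C:\Windows\System32\dns\dns.log
```

To monitor a second locked file, add a second MonitorNoHandle stanza with its own path.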

This documentation applies to the following versions of Splunk® Enterprise: 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.3.0, 7.3.1, 7.3.3, 7.3.2, 8.0.0


