Monitor files and directories
To monitor files and directories in Splunk Cloud Platform, you must use a universal or a heavy forwarder in nearly all cases. You perform the data collection on the forwarder and then send the data to the Splunk Cloud Platform instance.
Forwarders have three file input processors: monitor, MonitorNoHandle, and upload.
While you must use a forwarder for monitor and MonitorNoHandle input processors, you do not need to use a forwarder to upload a single file. You can upload a single file at a time to Splunk Cloud Platform using Splunk Web.
If you have Splunk Enterprise, you can monitor files using the CLI, Splunk Web, or the inputs.conf configuration file directly on your Splunk Enterprise instance. You can also use a universal or heavy forwarder, as you would with Splunk Cloud Platform.
You can use the monitor input to add nearly all your data sources from files and directories. However, you might want to use the upload input to index a file, such as an archive of historical data, only one time.
On machines that run Windows Vista or Windows Server 2008 and higher, you can use the MonitorNoHandle input to monitor files that Windows rotates automatically. The MonitorNoHandle input works only on Windows machines.
You can add monitor or upload inputs using these methods:
- On a heavy forwarder: See Monitor files and directories with Splunk Web.
- On a universal forwarder configured for Splunk Cloud Platform: See Forward data from files and directories to Splunk Cloud Platform.
- On a universal or heavy forwarder: See Monitor Splunk Enterprise files and directories with the CLI or Monitor files and directories with inputs.conf.
You can add MonitorNoHandle inputs using either the CLI or the inputs.conf file.
If you use Splunk Web on a heavy forwarder to configure file monitor inputs, you can use the Set Sourcetype page to see how the Splunk platform indexes a file. See The Set Sourcetype page for details.
How the monitor processor works
When you specify a path to a file or directory, the monitor processor consumes any new data written to that file or directory. By specifying a path this way, you can monitor live application logs such as those coming from Web access logs, Java 2 Platform Enterprise Edition (J2EE), or .NET applications. The forwarder uses memory for each monitored file, even if the file is ignored.
The forwarder monitors and indexes the file or directory as new data appears. You can also specify a mounted or shared directory, including network file systems, as long as the forwarder can read from the directory. If the specified directory contains subdirectories, the monitor process recursively examines them for new files, as long as those directories can be read.
Monitor inputs can overlap. As long as the stanza names are different, the forwarder treats them as independent stanzas and handles each file according to the settings of the most specific stanza that matches it. You can include or exclude files or directories from being read by using allow lists or exclude lists.
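The overlap and filtering behavior described above can be sketched in inputs.conf as follows. This is a minimal illustration, not a recommended configuration: the paths, the index name `web`, and the regular expressions are assumptions chosen for the example. In inputs.conf, allow lists and exclude lists are set with the `whitelist` and `blacklist` settings, which take regular expressions.

```ini
# Broad stanza (hypothetical path): monitor /var/log recursively.
[monitor:///var/log]
# Allow list: only read files whose path ends in .log
whitelist = \.log$
# Exclude list: skip debug logs even though they match the allow list
blacklist = debug\.log$
index = main

# More specific stanza that overlaps the one above. Per the rule above,
# files under this directory would be expected to follow these settings.
[monitor:///var/log/httpd]
sourcetype = access_combined
# "web" is a hypothetical index name; it must already exist.
index = web
```

Because the two stanza names differ, each stanza is evaluated on its own, so settings in the broad stanza are not inherited by the specific one.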
If you disable or delete a monitor input, the forwarder does not stop indexing the files that the input references. It only stops checking those files again. To stop all in-process data indexing, you must restart the forwarder.
How the forwarder handles the monitoring of files during restarts
When you restart a forwarder, it continues processing files where it left off before the restart. It first checks for the file or directory specified in a monitor configuration. If the file or directory is not present on start, the forwarder checks for it every 24 hours from the time of the last restart. The monitor process scans subdirectories of monitored directories continuously.
How the forwarder monitors archive files
To monitor archive files, forwarders decompress them, such as a TAR or ZIP file, prior to processing. The forwarder then processes these files in a single thread. The following types of archive files are supported:
- TAR.GZ and TGZ
- TBZ and TBZ2
If you add new data to an existing archive file, the forwarder reprocesses the entire file rather than just the new data. This can result in event duplication.
How the forwarder monitors files that the operating system rotates on a schedule
The monitoring processor detects file rotation and does not process renamed files that it has already processed (with the exception of .tar and .gz archives). See How the Splunk platform handles log file rotation.
How the forwarder monitors nonwritable Windows files
Windows can prevent a forwarder from reading open files. If you need to read files while they are being written to, use the MonitorNoHandle input.
Restrictions on file monitoring
The forwarder cannot monitor a file whose path exceeds 1024 characters (256 characters on Windows).
Forwarders also do not monitor files with a .splunk filename extension because files with that extension contain Splunk metadata. If you need to index files with a .splunk extension, use the add oneshot CLI command.
When to use upload or batch?
To index a static file once, select Upload in Splunk Web on Splunk Cloud Platform or Splunk Enterprise.
Otherwise, use the CLI commands add oneshot or spool on a forwarder to index a static file. See Use the CLI for details.
You can use the batch input type in the inputs.conf file to load files once and destructively. By default, the Splunk batch processor is located in the $SPLUNKFORWARDER_HOME/var/spool/splunk directory on the forwarder. If you move a file into this directory, the forwarder processes and deletes the file.
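A batch input on a directory of your own choosing might look like the following sketch. The directory path is hypothetical. The `move_policy = sinkhole` setting is what makes the input destructive, so point it only at files you can afford to lose after indexing.

```ini
# Hypothetical drop directory for one-time, destructive loading.
[batch:///opt/import/historical]
# Required for batch inputs: delete each file after it is indexed.
move_policy = sinkhole
index = main
```

Use a monitor input instead if the files must remain on disk after the forwarder reads them.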
When to use MonitorNoHandle
This Windows-only input lets you read files on Windows systems as Windows writes to them. You must use either a universal or heavy forwarder to use the input for Splunk Cloud Platform. The input uses a kernel-mode filter driver to capture raw data as the data gets written to the file. You can use this input on files that the system locks open for writing, such as the Windows DNS server log file.
Restrictions for using MonitorNoHandle
The MonitorNoHandle input has the following restrictions:
- MonitorNoHandle only works on Windows Vista or Windows Server 2008 and higher operating systems. It does not work with earlier versions of Windows, nor does it work on operating systems that are not Windows.
- You can only monitor single files with MonitorNoHandle. To monitor more than one file, you must create a MonitorNoHandle input stanza for each file.
- You cannot monitor directories with MonitorNoHandle.
- If a file you choose to monitor with MonitorNoHandle already exists, the forwarder does not index its current contents, only new information that comes into the file as processes write to it.
- When you monitor a file with MonitorNoHandle, the source field for the file is MonitorNoHandle, not the name of the file. If you want to have the source field be the name of the file, you must set the field explicitly in inputs.conf. See Monitor files and directories with inputs.conf.
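The restrictions above suggest a stanza shaped like the following sketch. The file path is an assumed location for the Windows DNS server log, and the example assumes that the generic `source` input setting is the way to override the source field, as the last restriction describes. Check both against your environment before use.

```ini
# One stanza per file; directories are not supported.
# The path below is an assumed DNS log location, not a required one.
[MonitorNoHandle://C:\Windows\System32\dns\dns.log]
disabled = 0
index = main
# Without this line, the source field would be MonitorNoHandle.
source = C:\Windows\System32\dns\dns.log
```

To monitor a second file, add another `[MonitorNoHandle://<path>]` stanza with its own settings.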
This documentation applies to the following versions of Splunk Cloud Platform™: 8.1.2103, 8.2.2106, 8.2.2107, 8.2.2105, 8.2.2109, 8.2.2111, 8.2.2112, 8.2.2201 (latest FedRAMP release), 8.2.2202, 8.2.2203