Splunk® Enterprise

Getting Data In

Splunk version 4.x reached its End of Life on October 1, 2013. This documentation does not apply to the most recent version of Splunk.

Edit inputs.conf

To add an input, add a stanza to inputs.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. If you have not worked with Splunk's configuration files before, read "About configuration files" before you begin.

You can set multiple attributes in an input stanza. If you do not specify a value for an attribute, Splunk uses the default that's preset in $SPLUNK_HOME/etc/system/default/.

Note: To ensure that new events are indexed when you copy over an existing file with new contents, set CHECK_METHOD = modtime in props.conf for the source. This checks the modtime of the file and re-indexes it when it changes. Be aware that the entire file will be re-indexed, which can result in duplicate events.
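
As a rough illustration of the note above, a props.conf stanza for one overwritten file might look like the following. The source path is hypothetical; substitute the path of the file you copy over.

```ini
# props.conf (sketch; /var/log/app/report.csv is a hypothetical source path)
[source::/var/log/app/report.csv]
# Re-index the whole file whenever its modification time changes
CHECK_METHOD = modtime
```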

Configuration settings

There are separate stanza types for monitor and batch. See "Monitor files and directories" for detailed information about monitor and batch.

The following are attributes that you can use in both monitor and batch input stanzas. See the sections that follow for attributes that are specific to each type of input.

host = <string>

  • Sets the host key/field to a static value for this stanza.
  • Sets the host key's initial value. The key is used during parsing/indexing, in particular to set the host field. It is also the host field used at search time.
  • The <string> is prepended with 'host::'.
  • If not set explicitly, this defaults to the IP address or fully qualified domain name of the host where the data originated.

index = <string>

  • Set the index where events from this input will be stored.
  • The <string> is prepended with 'index::'.
  • Defaults to main, or whatever you have set as your default index.
  • For more information about the index field, see "How indexing works" in the Admin manual.

sourcetype = <string>

  • Sets the sourcetype key/field for events from this input.
  • Explicitly declares the source type for this data, as opposed to allowing it to be determined automatically. This is important both for searchability and for applying the relevant formatting for this type of data during parsing and indexing.
  • Sets the sourcetype key's initial value. The key is used during parsing/indexing, in particular to set the source type field during indexing. It is also the source type field used at search time.
  • The <string> is prepended with 'sourcetype::'.
  • If not set explicitly, Splunk picks a source type based on various aspects of the data. There is no hard-coded default.
  • For more information about source types, see "Why source types matter", in this manual.

queue = parsingQueue | indexQueue

  • Specifies where the input processor should deposit the events that it reads.
  • Set to "parsingQueue" to apply props.conf and other parsing rules to your data.
  • Set to "indexQueue" to send your data directly into the index.
  • Defaults to parsingQueue.
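
A minimal sketch that combines the attributes above in a single monitor stanza (the stanza syntax is covered under "Monitor syntax and examples"). The monitored path, host name, and index name are assumptions chosen for illustration, and the index is assumed to exist already.

```ini
# inputs.conf (sketch; the path and all values are hypothetical)
[monitor:///var/log/httpd/access_log]
host = web01.example.com
index = web
sourcetype = access_combined
# parsingQueue is the default; it is shown only to make the choice explicit
queue = parsingQueue
```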

_TCP_ROUTING = <tcpout_group_name>,<tcpout_group_name>,...

  • Specifies a comma-separated list of tcpout group names.
  • Using this attribute, you can selectively forward your data to specific indexer(s) by specifying the tcpout group(s) that the forwarder should use when forwarding your data.
  • The tcpout group names are defined in outputs.conf in [tcpout:<tcpout_group_name>] stanzas.
  • This setting defaults to the groups present in 'defaultGroup' in [tcpout] stanza in outputs.conf.
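
The relationship between the two files can be sketched as follows. The monitored path, the group name security_indexers, and the indexer address are all hypothetical.

```ini
# inputs.conf on the forwarder (sketch; path and group name are hypothetical)
[monitor:///var/log/secure]
_TCP_ROUTING = security_indexers

# outputs.conf on the same forwarder
[tcpout:security_indexers]
# Hypothetical indexer address and receiving port
server = 10.1.1.197:9997
```

Events from this input would go only to the listed group rather than to the groups named in defaultGroup.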

host_regex = <regular expression>

  • If specified, the regex extracts host from the filename of each input.
  • Specifically, the first group of the regex is used as the host.
  • If the regex fails to match, the value of the default "host =" attribute is used.
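
A small sketch of host_regex, assuming the regex is matched against the full path of each file and that each host writes to its own subdirectory of a hypothetical /var/log tree.

```ini
# inputs.conf (sketch; the directory layout is hypothetical)
[monitor:///var/log]
# For /var/log/web01/access.log, the first capture group yields "web01"
host_regex = /var/log/(\w+)
```

Note that under the same assumption a file directly under /var/log, such as /var/log/messages, would get the host value "messages", so the regex should reflect the actual layout.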

host_segment = <integer>

  • If specified, a segment of the path is set as host, using <integer> to determine which segment. For example, if host_segment = 2, host is set to the second segment of the path. Path segments are separated by the '/' character.
  • If the value is not an integer, or is less than 1, the value of the default "host =" attribute is used.
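
A comparable sketch for host_segment, assuming segments are counted from the first directory after the leading slash and using the same hypothetical directory layout.

```ini
# inputs.conf (sketch; the directory layout is hypothetical)
[monitor:///var/log]
# For /var/log/web01/access.log the segments are var (1), log (2), web01 (3)
host_segment = 3
```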

Monitor syntax and examples

Monitor input stanzas direct Splunk to watch all files in the <path> (or just <path> itself if it represents a single file). You must specify the input type and then the path, so put three slashes in your path if you're starting at root. You can use wildcards for the path. For more information, read how to "Specify input paths with wildcards".

[monitor://<path>]
<attribute1> = <val1>
<attribute2> = <val2>

The following are additional attributes you can use when defining monitor input stanzas:

source = <string>

  • Sets the source key/field for events from this input.
  • Note: Overriding the source key is generally not recommended. Typically, the input layer provides a more accurate string to aid in problem analysis and investigation, accurately recording the file from which the data was retrieved. Consider using source types, tagging, and search wildcards before overriding this value.
  • The <string> is prepended with 'source::'.
  • Defaults to the input file path.

crcSalt = <string>

  • Use this setting to force Splunk to consume files that have matching CRCs (cyclic redundancy checks). Splunk only performs CRC checks against the first few lines of a file. This behavior prevents Splunk from indexing the same file twice, even though you may have renamed it, as happens with rolling log files. However, because the CRC is based on only the first few lines of the file, legitimately different files can have matching CRCs, particularly if they have identical headers.
  • If set, <string> is added to the CRC.
  • If set to <SOURCE>, the full source path is added to the CRC. This ensures that each file being monitored has a unique CRC.
  • Be cautious about using this attribute with rolling log files; it could lead to the log file being re-indexed after it has rolled.
  • Note: This setting is case sensitive.
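
For instance, a directory of report files that all begin with the same header lines could be handled along these lines. The path is hypothetical.

```ini
# inputs.conf (sketch; the path is hypothetical)
[monitor:///opt/reports/*.csv]
# Add each file's full path to its CRC so files with identical headers stay distinct
crcSalt = <SOURCE>
```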

ignoreOlderThan = <time window>

  • Causes the monitored input to stop checking files for updates if their modtime has passed the <time window> threshold. This improves the speed of file tracking operations when monitoring directory hierarchies with large numbers of historical files (for example, when active log files are co-located with old files that are no longer being written to).
  • Note: A file whose modtime falls outside <time window> when monitored for the first time will not get indexed.
  • Value must be: <number><unit>. For example, "7d" indicates one week. Valid units are "d" (days), "m" (minutes), and "s" (seconds).
  • Defaults to 0 (disabled).
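
A short sketch using the "7d" value mentioned above on a hypothetical directory that mixes active and historical logs.

```ini
# inputs.conf (sketch; the path is hypothetical)
[monitor:///var/log/archive]
# Stop checking files whose modtime is more than 7 days in the past
ignoreOlderThan = 7d
```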

followTail = 0|1

  • If set to 1, monitoring begins at the end of the file (like tail -f).
  • This only applies to files the first time they are picked up.
  • After that, Splunk's internal file position records keep track of the file.
  • Defaults to 0.

whitelist = <regular expression>

  • If set, files from this path are monitored only if they match the specified regex.

blacklist = <regular expression>

  • If set, files from this path are NOT monitored if they match the specified regex.
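
The two filters can be sketched as follows. Both paths and both regular expressions are hypothetical examples, not recommended values.

```ini
# inputs.conf (sketch; paths and patterns are hypothetical)
[monitor:///var/log/app]
# Monitor only files whose path ends in .log
whitelist = \.log$

[monitor:///var/log/archive]
# Skip compressed files
blacklist = \.(gz|bz2|zip)$
```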

alwaysOpenFile = 0 | 1

  • If set to 1, Splunk opens a file to check if it has already been indexed.
  • Only useful for files that don't update modtime.
  • Should only be used for monitoring files on Windows, and mostly for IIS logs.
  • Note: This flag should only be used as a last resort, as it increases load and slows down indexing.

time_before_close = <integer>

  • Modtime delta required before Splunk can close a file on EOF.
  • Tells the system not to close files that have been updated in the past <integer> seconds.
  • Defaults to 3.

recursive = true|false

  • If set to false, Splunk will not go into subdirectories found within a monitored directory.
  • Defaults to true.

followSymlink = true|false

  • If false, Splunk will ignore symbolic links found within a monitored directory.
  • Defaults to true.

Example 1. To load anything in /apache/foo/logs or /apache/bar/logs, etc.
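
A stanza along these lines would match the description, assuming the ... wildcard recurses through any number of directory levels as described in "Specify input paths with wildcards":

```ini
# inputs.conf (sketch; assumes "..." matches any number of directory levels)
[monitor:///apache/.../logs]
```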


Example 2. To load anything in /apache/ that ends in .log.
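
A stanza along these lines would match the description, assuming the * wildcard matches anything within a single path segment:

```ini
# inputs.conf (sketch; assumes "*" matches within one path segment only)
[monitor:///apache/*.log]
```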


Batch syntax and examples

Use batch to set up a one-time, destructive input of data from a source. For continuous, non-destructive inputs, use monitor. Remember, after the batch input is indexed, Splunk deletes the file.

[batch://<path>]
move_policy = sinkhole
<attribute1> = <val1>
<attribute2> = <val2>

Important: When defining batch inputs, you must include the setting move_policy = sinkhole. This loads the file destructively. Do not use the batch input type for files you do not want to consume destructively.

Example: This example batch loads all files from the directory /system/flight815/, but does not recurse through any subdirectories under it:

[batch:///system/flight815/*]
move_policy = sinkhole

For details on using the asterisk in input paths, see "Specify input paths with wildcards".

This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7

