props.conf

props.conf controls what parameters apply to events during indexing based on settings tied to each event's source, host, or sourcetype.

IMPORTANT: You can use wildcards in your props.conf <spec>, much as you can in inputs.conf, but only in host:: and source:: specs.
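
As a minimal sketch of wildcard specs, patterned on the stanzas in props.conf.example, the following applies settings by host prefix and by source path. The host prefix and directory name are hypothetical and chosen only for illustration.

```ini
# Hypothetical host prefix: matches any host whose name starts with "web".
[host::web*]
TRUNCATE = 5000

# Hypothetical path: matches .log files under a directory named app_logs.
[source::.../app_logs/*.log]
SHOULD_LINEMERGE = false
```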


props.conf.spec

# Copyright (C) 2005-2007 Splunk Inc.  All Rights Reserved.  Version 3.0 
#
# This file contains possible attribute/value pairs for a "props.conf" file.
#
# The processing properties of Splunk are configured through the files
# $SPLUNK_HOME/etc/bundles/<bundle name>/props.conf
# There is a props.conf in $SPLUNK_HOME/etc/bundles/default/.  To set custom configurations, 
# place a props.conf in $SPLUNK_HOME/etc/bundles/local/ or your own custom bundle directory.
# Here is an example props.conf stanza:
[<spec>]
attribute1 = val1
attribute2 = val2
...
# A props.conf file can contain multiple stanzas with different <specs>.
<spec> can be:
1. <sourcetype>, the sourcetype of an event.
2. host::<host>, where <host> is the host for an event.
3. reportinghost::<host>, where <host> is the host reporting an event.
4. source::<source>, where <source> is the source for an event.
5. rule::<rulename>, where <rulename> is a unique name of a sourcetype classification rule.
6. delayedrule::<rulename>, where <rulename> is a unique name of a
   delayed sourcetype classification rule.  These are only considered
   as a last resort before generating a new sourcetype based on the
   source seen.
If the same <spec> is found in two bundle directories, the following precedence rules apply:
Attributes in $SPLUNK_HOME/etc/bundles/local are read first and take the highest precedence.
Attributes in $SPLUNK_HOME/etc/bundles/default are read last and take the lowest precedence.
Attributes in other directories are loaded in alphabetical order by bundle name.
Overriding is performed attribute by attribute, so if a specific attribute is not specified in 
"local" but is specified in another bundle, it is taken from that other bundle.
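The attribute-by-attribute override can be sketched with the same stanza defined in two bundles. The values shown for the default bundle are illustrative only and are not claimed to match the shipped file.

```ini
# In $SPLUNK_HOME/etc/bundles/default/props.conf (illustrative values)
[syslog]
TRUNCATE = 10000
MAX_EVENTS = 256

# In $SPLUNK_HOME/etc/bundles/local/props.conf
[syslog]
TRUNCATE = 2000

# Resulting settings for [syslog] under the rules above:
#   TRUNCATE   = 2000   (set in local, which overrides default)
#   MAX_EVENTS = 256    (not set in local, so taken from default)
```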
**************
The possible attributes/value pairs for props.conf, and their default values, are:
# International characters
CHARSET = <string>
        * When set, Splunk will assume the input from the given <spec> is in the specified encoding.  
        * A list of valid encodings can be retrieved using the command "iconv -l" on most *nix systems.  
        * If an invalid encoding is specified, a warning will be logged during initial configuration 
        and further input from that <spec> will be discarded.  
        * If the source encoding is valid, but some characters from the <spec> are not valid in the
        specified encoding, then the characters will be escaped as hex (e.g. "\xF3").
        * Defaults to ASCII.
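A minimal sketch, assuming a hypothetical group of hosts that emit Latin-1 text. ISO-8859-1 is an encoding name normally present in the output of "iconv -l"; confirm the exact spelling on your own system.

```ini
# Hypothetical host prefix; wildcards are allowed here because this is a host:: spec.
[host::legacy*]
CHARSET = ISO-8859-1
```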
# Line breaking
TRUNCATE = <non-negative integer>
        * Change the default maximum line length.  
        * Set to 0 if you do not want truncation ever (very long lines are, however, often a sign of 
        garbage data).
        * Defaults to 10000.
    
    
# Multiline events
SHOULD_LINEMERGE = <true/false>
        * When set to true, Splunk combines several input lines into a single event, based on the 
        following configuration attributes.
        * Defaults to true.
        
        
# The following are used only when SHOULD_LINEMERGE = True
AUTO_LINEMERGE = <true/false>
        * Directs Splunk to use automatic learning methods to determine where to break lines into events.
        * Defaults to true.
BREAK_ONLY_BEFORE_DATE = <true/false>
        * When set to true, Splunk will create a new event if and only if it encounters
        a new line with a date.
        * Defaults to false.
BREAK_ONLY_BEFORE = <regular expression>
        * When set, Splunk will create a new event if and only if it encounters
        a new line that matches the regular expression.
        * Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
        * When set, and the regular expression matches the current line,
        Splunk is guaranteed to create a new event for the next input line.
        * Splunk may still break before the current line if another rule matches.
        * Defaults to empty.
MUST_NOT_BREAK_AFTER = <regular expression>
        * When set and the current line matches the regular expression, Splunk will
        not break on any subsequent lines until the MUST_BREAK_AFTER expression
        matches.
        * Defaults to empty.
MUST_NOT_BREAK_BEFORE = <regular expression>
        * When set and the current line matches the regular expression, Splunk will not break the last 
        event before the current line.
        * Defaults to empty.
MAX_EVENTS = <integer>
        * Specifies the maximum number of input lines that will be added to any event. 
        * Splunk will break after the specified number of lines are read.
        * Defaults to 256.
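As a sketch of the multiline attributes working together, the stanza below assumes a hypothetical sourcetype in which every event starts with a date such as 2007-06-14 and continuation lines (for example, stack traces) do not. The sourcetype name, the regular expression, and the line limit are all assumptions.

```ini
# Hypothetical sourcetype name.
[my_app_log]
SHOULD_LINEMERGE = true
# Start a new event only on lines that begin with a YYYY-MM-DD date.
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
# Allow longer events than the default of 256 lines.
MAX_EVENTS = 500
```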
        
     
# Timestamp extraction configuration
DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
        * Specifies the file to configure the timestamp extractor.
        * This configuration may also be set to "NONE" to prevent the timestamp extractor from running 
        or "CURRENT" to assign the current system time to each event.
        * Defaults to /etc/datetime.xml (i.e., $SPLUNK_HOME/etc/datetime.xml).
MAX_TIMESTAMP_LOOKAHEAD = <integer>
        * Specifies how far (in characters) into an event Splunk should look for a timestamp.
        * Defaults to 150.
TIME_PREFIX = <regular expression>
        * Specifies the necessary condition for a timestamp to be extracted. 
        * The timestamping algorithm will only look for a timestamp after the prefix in the event.
        * Defaults to empty.
TIME_FORMAT = <strptime-style format>
        * Specifies a strptime format string to extract the date. 
        * This method of date extraction does not support in-event timezones. 
        * TIME_FORMAT starts reading after the TIME_PREFIX. 
        * The <strptime-style format> must contain the hour, minute, month, and day.
        * Defaults to empty.
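The three attributes above can be combined as in the following sketch. It assumes a hypothetical sourcetype whose events carry a "ts=" marker directly before the timestamp; the sourcetype name and the event layout are not from the original spec.

```ini
# Hypothetical events look like:
#   ts=2007-06-14 13:45:10 level=INFO msg="started"
[my_app_log]
# Only look for a timestamp after the literal "ts=".
TIME_PREFIX = ts=
# strptime format with year, month, day, hour, minute, and second.
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# The timestamp ends within the first 22 characters, so 30 is ample.
MAX_TIMESTAMP_LOOKAHEAD = 30
```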
TZ = <posix timezone string>
        * The algorithm for determining the time zone for a particular event is as follows:
          - If the event has a timezone in its raw text (e.g., UTC, -08:00), use that as the timezone.
          - If TZ is set to a valid timezone string, use that as the timezone for the event.
          - Otherwise, use the timezone of the system that is running splunkd.
        * Defaults to empty.
MAX_DAYS_AGO = <integer>
        * Specifies the maximum number of days in the past, from the current date, for an extracted date to be valid.  
        * If set to 10, for example, dates that are more than 10 days old are ignored.
        * Defaults to 1000.
        * PLEASE NOTE: If your data is older than 1000 days, you must change this setting.
MAX_DAYS_HENCE = <integer>
        * Specifies the maximum number of days in the future, from the current date, 
        for an extracted date to be valid.  
        * If set to 3, for example, dates that are more than 3 days in the future will be ignored.  
        * False positives are less likely with a tighter window.
        * The default value allows dates that are tomorrow.  
        * If your machines have the wrong date set or are in a timezone that is one day ahead, 
        increase this value to at least 3.
        * Defaults to 2.
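For archived data older than the 1000-day default, a sketch along these lines widens the accepted date window. The path and the value 3650 (roughly ten years) are hypothetical.

```ini
# Hypothetical archive directory holding multi-year-old logs.
[source::.../archive/*.log]
MAX_DAYS_AGO = 3650
```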
# Transform configuration
# You can use TRANSFORMS or REPORT to create extracted fields.
# Please note that TRANSFORMS should only be used when speed is of the essence
# and the extraction rule will not change.  REPORT allows for more flexibility.
# For more information, see documentation at: http://www.splunk.com/doc/current/admin/ExtractFields
TRANSFORMS<class> = <"transform name","transform name",...> {see transforms.conf.spec}
        * Splunk configures classes of regular expressions for each event.  
        * For each class, Splunk takes the configuration from the highest precedence configuration block
        (see precedence rules at the beginning of this file).
        * If a particular class is specified for a source, it will override the same class if it is 
        specified for a sourcetype. 
        * Similarly, if a particular class is specified in the local bundle for a sourcetype, it will 
        override that class for the default bundle for that sourcetype.
 
        * The following is an example TRANSFORMS class in the default bundle for
        all sourcetypes:
                TRANSFORMS-annotation = filetype,loglevel,os,browser,language,ip,email,url
# Report configuration
REPORT<class> = <"transform name","transform name",...> {see transforms.conf.spec}
        * Like TRANSFORMS, this configures extractions, but only those which should be run at report time. 
        * TRANSFORMS are not run at report time, only at index time.
KV_MODE = <none/auto/multi>
        * Specifies the key/value extraction mode for the data. 
        * Set KV_MODE to:
          -- "none" to disable key/value extraction.
          -- "auto" to extract key/value pairs separated by equal signs.
          -- "multi" to invoke multikv, which expands a tabular event into multiple events.
        * Defaults to auto.
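A short sketch of the non-default modes, using two hypothetical sourcetypes: one that holds tabular command output and one whose events contain equal signs that should not be treated as field separators.

```ini
# Hypothetical sourcetype with tabular output (header row plus data rows).
[my_table_output]
KV_MODE = multi

# Hypothetical sourcetype where automatic key/value extraction is unwanted.
[my_freeform_text]
KV_MODE = none
```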
# Sourcetype configuration
sourcetype = <string>
        * Can only be set for a [source::<source>] stanza.
        * Anything from that <source> is assigned the specified sourcetype.
        * Defaults to empty.
    
# The following attribute/value pairs can only be set for a stanza that begins with [<sourcetype>]:
invalid_cause = <string>
        * Can only be set for a [sourcetype] stanza.
        * Splunk will not index any data with invalid_cause set.
        * Set <string> to "archive" to send the file to the archive processor (specified in unarchive_cmd).
        * Set to any other string to throw an error in the splunkd.log if running Splunklogger in debug mode.
        * Defaults to empty.
        
is_valid = <true/false>
        * Automatically set by invalid_cause.
        * DO NOT SET THIS.
        * Defaults to true.
unarchive_cmd = <string>
        * Only called if invalid_cause is set to "archive".
        * <string> specifies the shell command to run to extract an archived source.
        * Must be a shell command that takes input on stdin and produces output on stdout.
        * DOES NOT WORK ON BATCH PROCESSED FILES. Use preprocessing_script.
        * Defaults to empty.
preprocessing_script = <string>
        * Can only be set for a [sourcetype] stanza.
        * For batch processing, run a preprocessing script on the data stream using the binary found in
        $SPLUNK_HOME/bin.
        * DOES NOT WORK ON TAILING.  Use unarchive_cmd.
        * Defaults to empty.
LEARN_MODEL = <true/false>
        * For known sourcetypes, the fileclassifier will add a model file to the learned bundle.
        * To disable this behavior for diverse sourcetypes (such as sourcecode, where there is no good
        exemplar to make a sourcetype) set LEARN_MODEL = false.
        * Defaults to empty.
maxDist = <int>
        * Determines how different a sourcetype model may be from the current file.  
        * The larger the value, the more forgiving.
        * For example, if the value is very small (e.g., 10), then files of the specified 
        sourcetype should not vary much.
        * A larger value indicates that files of the given sourcetype vary quite a bit.
        * Defaults to 300.
    
# rule:: and delayedrule:: configuration
MORE_THAN<optional_unique_value>_<number> = <regular expression> (empty)
LESS_THAN<optional_unique_value>_<number> = <regular expression> (empty)
An example attribute value would be:
           [rule::bar_some]
           sourcetype = source_with_lots_of_bars
           # if more than 80% of lines have "----", but less than 70% have "####"
           # declare this a "source_with_lots_of_bars"
           MORE_THAN_80 = ----
           LESS_THAN_70 = ####
     A rule can have many MORE_THAN and LESS_THAN patterns, and all
     are required for the rule to match.
# Segmentation configuration
SEGMENTATION = <string>
        * Specifies the segmenter from segmenters.conf to use at index time.
        * You can set segmentation for any of the <spec> outlined at the top of this file.
SEGMENTATION-<segment selection> = <string>
        * Specifies that SplunkWeb should use a specific segmenter for the given <segment selection>
        choice. 
        * Example segment selection choices are: all, inner, outer, raw.
        
        
# Binary file configuration
NO_BINARY_CHECK = <bool>
        * When set to true, Splunk will process binary files.
        * By default, binary files are ignored.
        * Defaults to false.
    
    
# File checksum configuration
CHECK_METHOD = <endpoint_md5, entire_md5, modtime>
        * By default, if the checksums of the first and last 256 bytes of a file match existing stored 
        checksums, Splunk lists the file as already indexed and thus ignores it.
        * Set this to "entire_md5" to use the checksum of the entire file.
        * Alternatively, set this to "modtime" to check only the modification time of the file.
        * Defaults to endpoint_md5.
    
    
    
     
# Internal settings
# NOT YOURS.  DO NOT SET.
_actions = <string> ("new,edit,delete")
   * Internal field used for user-interface control of objects.
   * Defaults to "new,edit,delete".
pulldown_type = <bool>
   * Internal field used for user-interface control of sourcetypes.
   * Defaults to empty.

props.conf.example

# Copyright (C) 2005-2007 Splunk Inc.  All Rights Reserved.  Version 3.0 
#
# The following are example props.conf configurations.
# To use one or more of these configurations, copy the configuration block into
# props.conf in $SPLUNK_HOME/etc/bundles/local/ (or your own custom bundle).
########
# Line merging settings
########
# The following example will linemerge source data into multi-line events for apache_error sourcetype.
[apache_error]
SHOULD_LINEMERGE = True
########
# Settings for tuning
########
# The following example limits the number of characters indexed per event from host::small_events.
[host::small_events]
TRUNCATE = 256
# The following example turns off DATETIME_CONFIG (which can speed up indexing) for any path
# that ends in /mylogs/*.log.
[source::.../mylogs/*.log]
DATETIME_CONFIG = NONE
  
########
# Timestamp extraction configuration
########
# The following example sets Eastern Time Zone if host matches nyc*.
[host::nyc*]
# from 2007 onward
TZ = EST-5EDT,M3.2.0,M11.1.0
# 2006 and before:
# TZ = EST-5EDT,M4.1.0/02:00:00,M10.5.0/02:00:00
# The following example uses a custom datetime.xml that has been created and placed in a custom bundle.
# This will set all events coming in from hosts starting with LA to use this custom file.
[host::LA*]
DATETIME_CONFIG = /etc/bundles/custom_time/datetime.xml
########
# Transform configuration
########
# The following example will create a search field for host::foo if tied to a stanza in transforms.conf.
[host::foo]
TRANSFORMS-foo=foobar
# The following example will create an extracted field for sourcetype access_combined
# if tied to a stanza in transforms.conf.
[access_combined]
REPORT-baz = foobaz
########
# Sourcetype configuration
########
# The following example sets a sourcetype for the file web_access.log.
[source::.../web_access.log]
sourcetype = splunk_web_access 
# The following example will decompress gzipped syslog events.
[syslog]
invalid_cause = archive
unarchive_cmd = gzip -cd -
        
# The following example learns a custom sourcetype and limits the range between different examples
# with a smaller than default maxDist.
[custom_sourcetype]
LEARN_MODEL = true
maxDist = 30
# rule:: and delayedrule:: configuration
# The following examples create sourcetype rules for custom sourcetypes with custom regex.
[rule::bar_some]
sourcetype = source_with_lots_of_bars
MORE_THAN_80 = ----
[delayedrule::baz_some]
sourcetype = my_sourcetype
LESS_THAN_70 = ####
########        
# File configuration
########
# Binary file configuration
# The following example will process binary files from host::sourcecode.
[host::sourcecode]
NO_BINARY_CHECK = true 
    
# File checksum configuration
# The following example will check the entirety of every file in the web_access dir rather than 
# skipping files that appear to be the same.
[source::.../web_access/*]
CHECK_METHOD = entire_md5

This documentation applies to the following versions of Splunk: 3.1.4