Configure rule-based source type recognition
You can use rule-based source type recognition to expand the range of source types that Splunk software recognizes.
You must configure rule-based source type recognition using configuration files. If you use Splunk Cloud Platform, you must configure source type recognition using a heavy forwarder before you send the data to Splunk Cloud Platform. You might also need to file a support ticket to confirm that your Splunk Cloud Platform deployment recognizes the source type rules.
If you use Splunk Enterprise, create a rule:: stanza in the props.conf configuration file that associates a specific source type with a set of qualifying criteria. When it consumes data, the Splunk platform assigns the specified source type to file inputs that meet the rule qualifications.
How rule-based source type recognition works
You can create two types of rules in the props.conf file: rules and delayed rules. The difference between the two is the point at which the Splunk platform checks them during the source typing process. As it processes each set of incoming data, the Splunk platform uses several methods to determine source types:
- After the Splunk platform checks for explicit source type definitions, it looks at any rule:: stanzas that you defined in the props.conf file and tries to match source types to the data based on the classification rules specified in those stanzas.
- If the Splunk platform is unable to find a matching source type using the available rule:: stanzas, it tries to use automatic source type matching, where it identifies patterns similar to source types it has learned in the past.
- If that method fails, the Splunk platform then checks any delayedrule:: stanzas in the props.conf file and tries to match the data to source types using the rules in those stanzas.
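The order of checks described above can be modeled with a short Python sketch. This is a simplified illustration, not Splunk code: the function name, its parameters, and the idea of representing each check as a callable are all assumptions made for the example, and the full precedence list contains more steps than the four shown here.

```python
from typing import Callable, Optional, Sequence

# Hypothetical model: each classifier inspects the sampled lines and
# returns a source type name, or None if it cannot classify the data.
Classifier = Callable[[Sequence[str]], Optional[str]]


def resolve_sourcetype(
    lines: Sequence[str],
    explicit: Optional[str],
    rules: Sequence[Classifier],
    automatic: Classifier,
    delayed_rules: Sequence[Classifier],
) -> Optional[str]:
    """Return the first source type found, in the order the text describes."""
    # 1. An explicit source type definition wins outright.
    if explicit is not None:
        return explicit
    # 2. rule:: stanzas are checked next (iteration order is an assumption).
    for rule in rules:
        sourcetype = rule(lines)
        if sourcetype is not None:
            return sourcetype
    # 3. Automatic source type matching runs if no rule:: stanza matched.
    sourcetype = automatic(lines)
    if sourcetype is not None:
        return sourcetype
    # 4. delayedrule:: stanzas are the last resort in this model.
    for rule in delayed_rules:
        sourcetype = rule(lines)
        if sourcetype is not None:
            return sourcetype
    return None
```

In this model, a delayed rule is reached only when every earlier check returns nothing, which is why delayed rules suit generic catch-all source types.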
For details on the precedence rules that the Splunk platform uses to assign source types to data, see How the Splunk platform assigns source types.
You can configure the Splunk platform so that rule:: stanzas contain classification rules for specialized source types, while delayedrule:: stanzas contain classification rules for generic source types. That way, the Splunk platform applies the generic source types to broad ranges of events that don't qualify for more specialized source types.
For example, you could use rule:: stanzas to catch data with specific syslog source types, such as sendmail_syslog or cisco_syslog, and then configure a delayedrule:: stanza to apply the generic syslog source type to any remaining syslog data.
Create source typing rules in props.conf
To set source typing rules, edit the props.conf configuration file in $SPLUNK_HOME/etc/system/local/ or in your own custom application directory in $SPLUNK_HOME/etc/apps/. For information on configuration files in general, see About configuration files in the Admin Manual.
Create a rule by following these steps:
- In props.conf, add a rule:: or delayedrule:: stanza. Provide a name for the rule in the stanza header.

    [rule::<rule_name>] OR [delayedrule::<rule_name>]

- Declare the source type in the body of the stanza.

    [rule::<rule_name>] OR [delayedrule::<rule_name>]
    sourcetype=<source_type>

- After the source type declaration, set a numerical value in the MORE_THAN and LESS_THAN attributes, corresponding to the percentage of input lines that must contain the string specified by the regular expression. For example, MORE_THAN_80 means at least 80% of the lines must contain the associated expression. LESS_THAN_20 means that less than 20% of the lines can contain the associated expression.

  You can set any number of MORE_THAN and LESS_THAN conditions in a rule. The rule's source type is assigned to a data file only if the data satisfies all the statements in the rule. For example, you can define a rule that assigns a specific source type to a file input only if more than 60% of the lines match one regular expression and less than 20% match another regular expression.

  Follow this syntax to define the source type assignment rules:

    [rule::<rule_name>] OR [delayedrule::<rule_name>]
    sourcetype=<source_type>
    MORE_THAN_[0-99] = <regex>
    LESS_THAN_[1-100] = <regex>

Despite its nomenclature, the MORE_THAN_ attribute actually means more than or equal to. Similarly, the LESS_THAN_ attribute means less than or equal to.
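To make the percentage semantics concrete, here is a minimal Python sketch of how a rule with several conditions could be evaluated. It is an illustration only: the function names and the (threshold, pattern) tuple layout are hypothetical, Python's re module stands in for the regular expression engine that the Splunk platform uses, and treating an empty input as 0% is an assumption. The comparisons are inclusive, following the note above.

```python
import re
from typing import Sequence, Tuple


def percent_matching(lines: Sequence[str], pattern: str) -> float:
    """Return the percentage of lines that contain a match for pattern."""
    if not lines:
        return 0.0  # assumption: an empty sample matches nothing
    regex = re.compile(pattern)
    hits = sum(1 for line in lines if regex.search(line))
    return 100.0 * hits / len(lines)


def rule_matches(
    lines: Sequence[str],
    more_than: Sequence[Tuple[int, str]],
    less_than: Sequence[Tuple[int, str]],
) -> bool:
    """Return True only if every condition in the rule holds."""
    # MORE_THAN_<n>: at least n percent of the lines must match.
    for threshold, pattern in more_than:
        if percent_matching(lines, pattern) < threshold:
            return False
    # LESS_THAN_<n>: at most n percent of the lines may match.
    for threshold, pattern in less_than:
        if percent_matching(lines, pattern) > threshold:
            return False
    return True
```

For example, rule_matches(lines, [(60, r"^ERROR")], [(20, r"^DEBUG")]) models a rule with one MORE_THAN_60 condition and one LESS_THAN_20 condition, where both must hold for the source type to be assigned.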
You can test regular expressions by using them in searches with the rex search command.
Examples
The following examples show ways to configure source type rules.
Postfix syslog files
    # postfix_syslog sourcetype rule
    [rule::postfix_syslog]
    sourcetype = postfix_syslog
    # If 80% of lines match this regex, then it must be this type
    MORE_THAN_80=^\w{3} +\d+ \d\d:\d\d:\d\d .* postfix(/\w+)?\[\d+\]:
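Before deploying a rule like this, it can help to check the regular expression against a few representative lines. The rex search command is one way to do that; the Python sketch below is an offline alternative. The sample log lines are invented for illustration, and Python's re module is assumed to behave the same as the Splunk regular expression engine for this particular pattern.

```python
import re

# Regular expression copied from the postfix_syslog rule.
POSTFIX_PATTERN = re.compile(
    r"^\w{3} +\d+ \d\d:\d\d:\d\d .* postfix(/\w+)?\[\d+\]:"
)

# Hypothetical sample lines: four from postfix, one from another daemon.
sample_lines = [
    "Jan  5 10:15:32 mail01 postfix/smtpd[1234]: connect from unknown[192.0.2.10]",
    "Jan  5 10:15:33 mail01 postfix/cleanup[1240]: 4F2A1: message-id=<x@example.com>",
    "Jan  5 10:15:33 mail01 postfix/qmgr[987]: 4F2A1: from=<a@example.com>, size=512",
    "Jan  5 10:15:34 mail01 postfix[1300]: warning: example warning text",
    "Jan  5 10:15:40 mail01 sshd[999]: Accepted publickey for admin",
]

# Record which lines match, then compute the share the rule would compare
# against its MORE_THAN_80 threshold.
matches = [bool(POSTFIX_PATTERN.search(line)) for line in sample_lines]
match_percent = 100.0 * sum(matches) / len(sample_lines)

for line, matched in zip(sample_lines, matches):
    print("MATCH   " if matched else "NO MATCH", line)
print(f"{match_percent:.0f}% of sample lines match")
```

A share that sits right at the threshold, as in this sample, suggests testing with a larger and more varied set of lines before relying on the rule.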
Delayed rule for breakable text
    # breaks text on ascii art and blank lines if more than 10% of lines have
    # ascii art or blank lines, and less than 10% have timestamps
    [delayedrule::breakable_text]
    sourcetype = breakable_text
    MORE_THAN_10 = (^(?:---|===|\*\*\*|___|=+=))|^\s*$
    LESS_THAN_10 = [: ][012]?[0-9]:[0-5][0-9]
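The two conditions in the breakable_text rule must both hold for the source type to apply. As a rough check, the following Python sketch evaluates both regular expressions from the rule against a small invented text sample. The sample content and variable names are hypothetical, Python's re module is assumed to be close enough to the Splunk regular expression engine for these patterns, and the inclusive comparisons follow the note in the previous section.

```python
import re

# Regular expressions copied from the breakable_text delayed rule.
ASCII_ART_OR_BLANK = re.compile(r"(^(?:---|===|\*\*\*|___|=+=))|^\s*$")
TIMESTAMP = re.compile(r"[: ][012]?[0-9]:[0-5][0-9]")


def percent(lines, regex):
    """Return the percentage of lines that contain a match for regex."""
    return 100.0 * sum(1 for line in lines if regex.search(line)) / len(lines)


# Hypothetical free-form text with dividers, blank lines, and one time.
sample = [
    "=== Release notes ===",
    "",
    "This build changes the default settings.",
    "--- Known issues ---",
    "",
    "The installer can stall on slow disks.",
    "Retry the installation if that happens.",
    "*** End of notes ***",
    "Generated at 09:15 by the build host.",
    "Contact the maintainers with questions.",
]

art_pct = percent(sample, ASCII_ART_OR_BLANK)   # compared with MORE_THAN_10
time_pct = percent(sample, TIMESTAMP)           # compared with LESS_THAN_10
qualifies = art_pct >= 10 and time_pct <= 10    # both conditions must hold
print(art_pct, time_pct, qualifies)
```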
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408