Splunk Cloud Platform

Getting Data In

Configure rule-based source type recognition

You can use rule-based source type recognition to expand the range of source types that Splunk software recognizes.

You must configure rule-based source type recognition using configuration files. If you use Splunk Cloud Platform, you must configure source type recognition on a heavy forwarder before you send the data to Splunk Cloud Platform. You might also need to file a support ticket to confirm that your Splunk Cloud Platform deployment recognizes the source type rules.

If you use Splunk Enterprise, create a rule:: stanza in the props.conf configuration file that associates a specific source type with a set of qualifying criteria. When it consumes data, the Splunk platform assigns the specified source type to file inputs that meet the rule qualifications.

How rule-based source type recognition works

You can create two types of rules in the props.conf file: rules and delayed rules. The difference between the two is the point at which the Splunk platform checks them during the source typing process. As it processes each set of incoming data, the Splunk platform uses several methods to determine source types:

  • After the Splunk platform checks for explicit source type definitions, it looks at any rule:: stanzas that you defined in the props.conf file and tries to match source types to the data based on the classification rules specified in those stanzas.
  • If the Splunk platform is unable to find a matching source type using the available rule:: stanzas, it tries to use automatic source type matching, where it identifies patterns similar to source types it has learned in the past.
  • If that method fails, the Splunk platform then checks any delayedrule:: stanzas in the props.conf file and tries to match the data to source types using the rules in those stanzas.

For details on the precedence rules that the Splunk platform uses to assign source types to data, see How the Splunk platform assigns source types.

You can configure the Splunk platform so that rule:: stanzas contain classification rules for specialized source types, while delayedrule:: stanzas contain classification rules for generic source types. That way, the Splunk platform applies the generic source types to broad ranges of events that aren't qualified for more specialized source types.

For example, you could use rule:: stanzas to catch data with specific syslog source types, such as sendmail_syslog or cisco_syslog, and then configure a delayedrule:: stanza to apply the generic syslog source type to any remaining syslog data.
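
As a rough sketch of that layering, the following props.conf fragment pairs two specialized rules with a generic delayed rule. The rule names, the regular expressions, and the 80% thresholds are illustrative assumptions, not the rules that ship with the Splunk platform. Adapt them to the format of your own data.

```
# Hypothetical specialized rule: most lines carry a sendmail process tag
[rule::sendmail_syslog]
sourcetype = sendmail_syslog
MORE_THAN_80 = ^\w{3} +\d+ \d\d:\d\d:\d\d \S+ sendmail\[\d+\]:

# Hypothetical specialized rule: most lines carry a Cisco-style %FACILITY-SEVERITY-MNEMONIC: tag
[rule::cisco_syslog]
sourcetype = cisco_syslog
MORE_THAN_80 = %[A-Z0-9_]+-\d-[A-Z0-9_]+:

# Hypothetical generic fallback: most lines start with a syslog-style timestamp and host
[delayedrule::syslog]
sourcetype = syslog
MORE_THAN_80 = ^\w{3} +\d+ \d\d:\d\d:\d\d \S+ 
```

Because the generic pattern lives in a delayedrule:: stanza, it is checked only after the rule:: stanzas and automatic source type matching have failed to produce a match.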

Create source typing rules in props.conf

To set source typing rules, edit the props.conf configuration file in $SPLUNK_HOME/etc/system/local/ or in your own custom application directory in $SPLUNK_HOME/etc/apps/. For information on configuration files in general, see About configuration files in the Admin Manual.

Create a rule by following these steps:

  1. In props.conf, add a rule:: or delayedrule:: stanza. Provide a name for the rule in the stanza header.
    [rule::<rule_name>] OR [delayedrule::<rule_name>]
    
  2. Declare the source type in the body of the stanza.
    [rule::<rule_name>] OR [delayedrule::<rule_name>]
    sourcetype=<source_type>
    
  3. After the source type declaration, add MORE_THAN_<number> and LESS_THAN_<number> attributes. The number in each attribute name is the percentage of input lines that must match the regular expression assigned to that attribute. For example, MORE_THAN_80 means at least 80% of the lines must contain the associated expression. LESS_THAN_20 means that at most 20% of the lines can contain the associated expression.

    You can set any number of MORE_THAN and LESS_THAN conditions in a rule. The Splunk platform assigns the rule's source type to a data file only if the data satisfies every condition in the rule. For example, you can define a rule that assigns a specific source type to a file input only if more than 60% of the lines match one regular expression and less than 20% match another regular expression.

    Follow this syntax to define the source type assignment rules:
    [rule::<rule_name>] OR [delayedrule::<rule_name>]
    sourcetype=<source_type>
    MORE_THAN_[0-99] = <regex>
    LESS_THAN_[1-100] = <regex>
    

Despite their names, the MORE_THAN_ attribute means more than or equal to, and the LESS_THAN_ attribute means less than or equal to.
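
The following fragment is a minimal sketch of a rule that combines both kinds of condition, along the lines of the 60% and 20% case described in the steps. The stanza name, source type name, and regular expressions are hypothetical and only illustrate the syntax.

```
# Hypothetical rule: both conditions must hold for the source type to be assigned
[rule::example_app_log]
sourcetype = example_app_log
# At least 60% of lines must start with an ISO 8601-style date and time
MORE_THAN_60 = ^\d{4}-\d\d-\d\d[T ]\d\d:\d\d:\d\d
# At most 20% of lines may look like indented stack trace frames
LESS_THAN_20 = ^\s+at\s
```

A file that is mostly timestamped lines with occasional stack traces would qualify under these assumptions, while a file dominated by stack trace frames would not.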

You can test regular expressions by using them in searches with the rex search command.

Examples

The following examples show ways to configure source type rules.

Postfix syslog files

# postfix_syslog sourcetype rule
[rule::postfix_syslog]
sourcetype = postfix_syslog
# If at least 80% of lines match this regex, assign this source type
MORE_THAN_80=^\w{3} +\d+ \d\d:\d\d:\d\d .* postfix(/\w+)?\[\d+\]:

Delayed rule for breakable text

# breaks text on ascii art and blank lines if more than 10% of lines have
# ascii art or blank lines, and less than 10% have timestamps
[delayedrule::breakable_text]
sourcetype = breakable_text
MORE_THAN_10 = (^(?:---|===|\*\*\*|___|=+=))|^\s*$
LESS_THAN_10 = [: ][012]?[0-9]:[0-5][0-9]
Last modified on 27 October, 2021
This documentation applies to the following versions of Splunk Cloud Platform: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403 (latest FedRAMP release), 9.2.2406

