segmenters.conf

segmenters.conf defines schemes for how events are tokenized in Splunk's index. These schemes are applied to events from particular sources, hosts, or sourcetypes via props.conf.
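As a rough illustration of that link, a props.conf stanza might select a segmenter by name. This is a sketch, not taken from the spec on this page: the source path and the segmenter name my-segmenter are hypothetical, and SEGMENTATION is assumed to be the props.conf attribute that selects the scheme. Check props.conf.spec for your version before relying on it.

```ini
# props.conf (sketch; the source path and segmenter name are hypothetical)
# SEGMENTATION is assumed to name a stanza defined in segmenters.conf.
[source::/var/log/myapp.log]
SEGMENTATION = my-segmenter
```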


To edit this configuration for your local Splunk server, make your edits in $SPLUNK_HOME/etc/bundles/local/segmenters.conf.


You can create this file by copying examples from $SPLUNK_HOME/etc/bundles/README/segmenters.conf.example.


Never edit files in the default bundle in $SPLUNK_HOME/etc/bundles/default, or your changes may be overwritten in an upgrade.


segmenters.conf.spec

# Copyright (C) 2005-2007 Splunk Inc.  All Rights Reserved.  Version 3.0 
#
# This file contains all possible options for a "segmenters.conf" file.
#
# Splunk's division (tokenizing) of data to be indexed and of data to be
# displayed in SplunkWeb is controlled by
# $SPLUNK_HOME/etc/bundles/<bundle name>/segmenters.conf
#
# A configuration looks like:
[tokenizer-name]
attribute1 = val1
attribute2 = val2
...
# The precedence rules are the same as in props.conf.spec.
# A configuration without a set of attribute/value pairs will use the
# [default] attribute/value pairs.
# The possible attribute/value pairs are:
MAJOR = <space separated list of strings>
  * These will be the major breakers, that is, sequences of characters
    that will delimit indexed tokens. Typically these will be single
    characters. Also, \s represents the space; \n, the newline; \r,
    the carriage return; and \t, the tab.
MINOR = <space separated list of strings>
  * These will be the minor breakers. In addition to the tokens specified
    by the major breakers, for each minor breaker found, Splunk will index
    the token from the last major breaker to the current minor breaker and
    from the last minor breaker to the current minor breaker.
FILTER = <regular expression> (empty)
  * If set, specifies that division should take place only if the regular
    expression matches. Moreover, division will take place only on the
    first group of the expression.
LOOKAHEAD = <integer> (-1)
  * If set and non-negative, division will only occur up to the specified
    character. If FILTER is set as well, LOOKAHEAD is applied after filtering.
MINOR_LEN = <integer> (-1)
  * If set and non-negative, specifies how long a minor token can be. Longer
    minor tokens are discarded without prejudice.
MAJOR_LEN = <integer> (-1)
  * If set and non-negative, specifies how long a major token can be. Longer
    major tokens are discarded without prejudice.
MINOR_COUNT = <integer> (-1)
  * If set and non-negative, specifies how many minor tokens to emit. After the
    specified number of minor tokens have been emitted, later ones will be
    discarded without prejudice.
MAJOR_COUNT = <integer> (-1)
  * If set and non-negative, specifies how many major tokens to emit. After the
    specified number of major tokens have been emitted, later ones will be
    discarded without prejudice.
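
For illustration, a stanza in the local segmenters.conf that uses the attributes above might look like the following sketch. The stanza name and every value are hypothetical, chosen only to show the syntax; they are not Splunk's shipped defaults.

```ini
# segmenters.conf (sketch; stanza name and values are hypothetical)
[my-segmenter]
# Major breakers: space, newline, carriage return, tab, comma, parentheses.
MAJOR = \s \n \r \t , ( )
# Minor breakers: additional tokens are indexed at these characters.
MINOR = / : = @ . -
# Only divide up to the 1000th character of the event.
LOOKAHEAD = 1000
# Discard minor tokens longer than 100 characters.
MINOR_LEN = 100
# -1 is the default listed in the spec for the length and count limits.
MAJOR_COUNT = -1
```

With these values, division stops at the 1000th character of each event and minor tokens longer than 100 characters are discarded, while the number of major tokens stays at the spec's default of -1.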

This documentation applies to the following versions of Splunk: 3.0, 3.0.1, 3.0.2, 3.1, 3.1.1, 3.1.2, 3.1.3.

