Admin Manual

 



Change the characters that separate segments within events

This documentation does not apply to the most recent version of Splunk.


You can change the breakers (also called cleaners) that the server uses to tokenize data before indexing.


Changing the breakers can clean up the way the server segments terms, or improve performance and disk space usage by reducing the number of terms indexed per event.


There are two types of breakers: major and minor. Major breakers correspond most closely to word boundaries, whereas minor breakers split a word into sections that are indexed independently.


For example, take the following:


bob.smith@splunk.com splunk.com/people/~bobsmith/calc.pl

From the event above, the server would produce the following indexed terms:


bob.smith@splunk.com
bob.smith@splunk
bob.smith
bob
splunk.com/people/~bobsmith/calc.pl
splunk.com/people/~bobsmith/calc
splunk.com/people/~bobsmith
splunk.com/people
splunk.com
splunk

The major breaker in the example above is the space character.


The minor breakers are . / and @
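
The term list above can be reproduced with a short sketch. This is an illustration inferred from the example only, not the server's actual implementation: it assumes that each major segment is indexed whole, and that every prefix ending just before a minor breaker is indexed as well. The function name segment and its parameters are hypothetical.

```python
import re


def segment(event, major_breakers=" ", minor_breakers="./@"):
    """Return the terms indexed for one event (illustrative sketch only)."""
    # Split the event into major segments on any run of major breakers.
    major_pattern = "[" + re.escape(major_breakers) + "]+"
    terms = []
    for token in re.split(major_pattern, event):
        if not token:
            continue
        # Every prefix that ends just before a minor breaker is also a term.
        prefixes = [
            token[:i]
            for i, ch in enumerate(token)
            if i > 0 and ch in minor_breakers
        ]
        # Emit the whole segment first, then its prefixes from longest to shortest.
        for term in [token] + prefixes[::-1]:
            if term not in terms:
                terms.append(term)
    return terms


event = "bob.smith@splunk.com splunk.com/people/~bobsmith/calc.pl"
for term in segment(event):
    print(term)
```

Under the same assumptions, calling segment(event, minor_breakers="") yields only the two whole segments instead of ten terms, which shows how dropping minor breakers reduces the number of terms indexed per event.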


Breakers are specified in an XML config file.


The server ships with two default files:


SPLUNK_HOME/etc/myinstall/pluginconfs/
    cleaners.xml
    majorOnly_cleaners.xml

Each file is a list of breaking separators. Every separator has a name, the characters it breaks on, and an attribute that says whether it is a major or a minor breaker.


The following example defines '[' as a major breaker:


<breakingSeparator name="lsqrBracketBreak" isMajor="1">
    <value>[</value>
</breakingSeparator>
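
Before editing a cleaners file, it can help to list the breakers it defines. The sketch below does this with Python's standard xml.etree.ElementTree module. Several things in it are assumptions, not taken from the shipped files: the wrapper element cleaners, the second entry periodBreak, and the idea that any isMajor value other than "1" marks a minor breaker.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the first <breakingSeparator> entry comes from
# the documentation. The <cleaners> root and the periodBreak entry are
# assumptions made for illustration.
SAMPLE = """
<cleaners>
    <breakingSeparator name="lsqrBracketBreak" isMajor="1">
        <value>[</value>
    </breakingSeparator>
    <breakingSeparator name="periodBreak" isMajor="0">
        <value>.</value>
    </breakingSeparator>
</cleaners>
"""


def list_breakers(xml_text):
    """Return (name, characters, kind) for every breakingSeparator element."""
    root = ET.fromstring(xml_text.strip())
    breakers = []
    for sep in root.iter("breakingSeparator"):
        value = sep.findtext("value", default="")
        # Assumption: isMajor="1" means major, anything else means minor.
        kind = "major" if sep.get("isMajor") == "1" else "minor"
        breakers.append((sep.get("name"), value, kind))
    return breakers


for name, value, kind in list_breakers(SAMPLE):
    print(f"{name}: {value!r} ({kind})")
```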

To change or add a breaker, edit the cleaners.xml file and then restart the server.


The best way to create your own cleaners file is to copy either cleaners.xml or majorOnly_cleaners.xml from SPLUNK_HOME/etc/myinstall/pluginconfs/ and then edit or remove entries as necessary.


If you create your own cleaner, you must edit the multiIndex.xml configuration so that it uses your new cleaner.


Every index has one cleaner. A default cleaner is defined at the top of the multiIndex.xml file (<defaultCleaningConfig>) and is used if a cleaner is not specified in the index tag.


To use your cleaner, either change the uri of the default cleaner (<defaultCleaningConfig>) or add a <cleaningConfig> tag with its uri to a specific <database> tag.


<databases>
    <database>
        <name>main</name>
        <dbHomePath>$$SPLUNK_DB]]/defaultdb/db</dbHomePath>
        <coldDBPath>$$SPLUNK_DB]]/defaultdb/colddb</coldDBPath>
        <cleaningConfig>$$SPLUNK_HOME]]/etc/myinstall/pluginconfs/mycleaners.xml</cleaningConfig>
             ...
    </database>
</databases>
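
The same change can be scripted. The sketch below sets the <cleaningConfig> of one index in a <databases> fragment shaped like the excerpt above, using Python's standard xml.etree.ElementTree module. The function name set_cleaner, the sample fragment, and the cleaner path are hypothetical, and a real multiIndex.xml may contain other elements that this sketch does not account for.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the <databases> excerpt in this topic.
SAMPLE = """
<databases>
    <database>
        <name>main</name>
    </database>
</databases>
"""


def set_cleaner(databases_xml, index_name, cleaner_uri):
    """Return the XML with <cleaningConfig> set on the named index."""
    root = ET.fromstring(databases_xml.strip())
    for db in root.iter("database"):
        if db.findtext("name") == index_name:
            node = db.find("cleaningConfig")
            if node is None:
                # The index had no cleaner of its own, so add the tag.
                node = ET.SubElement(db, "cleaningConfig")
            node.text = cleaner_uri
            return ET.tostring(root, encoding="unicode")
    raise ValueError(f"no <database> named {index_name!r}")


# The path below is a placeholder, not a documented location.
print(set_cleaner(SAMPLE, "main", "/path/to/pluginconfs/mycleaners.xml"))
```

As with a manual edit, the server still has to be restarted for the new cleaner to take effect.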

NOTE - it's not advised to change cleaners on an index that already contains data. You should clean the index after changing the cleaners config.

This documentation applies to the following versions of Splunk: 2.1, 2.2, 2.2.1, 2.2.3, 2.2.6

