Indexing performance

You can improve Splunk's indexing performance by adjusting settings in its configuration files. This topic describes some basic changes you can make.


Negative impact on indexing performance

Processors

Splunk has several internal processors. If Splunk isn't indexing your data as quickly as you expect, you can find out which processor is responsible for the delay by running the following search:


index::_internal NOT sendout group=pipeline | timechart sum(cpu_seconds) by processor

This search charts the CPU time used by each of Splunk's internal processors over time. If one processor consumes noticeably more CPU time than the others, you can adjust the settings that affect it to reduce its load.


Below are some tuning parameters in Splunk's configuration files that affect indexing performance.


indexes.conf

indexes.conf controls how Splunk's indexes are configured. You can change the following entries to improve indexing performance.


indexThreads = <non-negative number> (0) The number of extra threads to use for a specific index. Increasing the number of index threads can improve indexing performance, but the benefit depends on the capability of your hardware. Do not set indexThreads higher than the number of processors in the server this instance runs on. For example, on a single-core system, never set it higher than 1.
maxMemMB = <non-negative number> (50) The amount of memory, in MB, to allocate for indexing. This amount is allocated per index thread. For example, if indexThreads is set to 2 and maxMemMB is set to 300, indexing uses 600 MB of memory.
maxDataSize = <non-negative number> (750) The maximum size, in MB, that the hot database (db hot) can grow to. Values larger than the default are not recommended unless you have a 64-bit system.
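
As a rough sketch of how these three settings might look together, the fragment below uses the values from the memory example above. The stanza name [main] and the assumption of a server with at least two processors are illustrative only; the values are not recommendations for any particular deployment.

```ini
# indexes.conf (illustrative sketch, not a recommended configuration)
# Assumption: the server has at least 2 processors and memory to spare.
[main]
# Must not exceed the number of processors in the server.
indexThreads = 2
# Allocated per index thread: 2 threads x 300 MB = 600 MB in total.
maxMemMB = 300
# Left at 750; larger values are only advisable on a 64-bit system.
maxDataSize = 750
```

Because maxMemMB is multiplied by indexThreads, it is worth working out the total before raising either value.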

props.conf

props.conf controls what parameters apply to events during indexing based on settings tied to each event's source, host, or sourcetype.


DATETIME_CONFIG = <filename relative to $SPLUNK_HOME> (/etc/datetime.xml) Specifies the file that configures the timestamp extractor. You can also set this to "NONE" to prevent the timestamp extractor from running, or to "CURRENT" to assign the current system time to each event.
TIME_FORMAT = <strptime-style format> (empty) Specifies a strptime format string used to extract the date. Specifying a strptime format for date extraction accelerates event indexing.
MAX_TIMESTAMP_LOOKAHEAD = <integer> (150) Specifies how many characters into an event Splunk should look for a timestamp. If you know your timestamp is in the first n characters of the event, set this to n to increase indexing speed.
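
The timestamp settings above could be combined along the following lines. This is a minimal sketch: the source path is hypothetical, and it assumes every event from that source begins with a timestamp of the form 2008-06-01 12:34:56, which is 19 characters long.

```ini
# props.conf (illustrative sketch)
# Hypothetical source; assumes each event starts with a timestamp
# such as: 2008-06-01 12:34:56
[source::/var/log/myapp/app.log]
# Tell the extractor the exact format so it does not have to guess.
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# The timestamp occupies the first 19 characters of each event.
MAX_TIMESTAMP_LOOKAHEAD = 19
```

If the timestamp can appear later in the event, a lookahead this short would miss it, so the value has to match your actual data.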

segmenters.conf

segmenters.conf defines schemes for how events will be tokenized in Splunk's index.


MAJOR = <space separated list of strings> The list of major breakers. Move MINOR breakers into the MAJOR breaker list, or remove breakers from the MAJOR breaker list, to change the size and number of raw data events.
MINOR = <space separated list of strings> The characters that represent tokens to index by, in addition to the MAJOR breaker list. Reduce or remove this list to increase indexing performance.
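
A reduced segmenter might be sketched as follows. The stanza name is hypothetical, and both breaker lists are deliberately abbreviated for illustration: they are not Splunk's shipped defaults and not a complete or tested breaker set.

```ini
# segmenters.conf (illustrative sketch, not the shipped defaults)
# Hypothetical segmenter name.
[my_reduced_segmenter]
# Assumption: / and : were previously MINOR breakers and have been moved here.
MAJOR = [ ] < > ( ) { } | ! ; , ' " / :
# Shortened MINOR list: fewer minor breakers means fewer tokens to index.
MINOR = . - @
```

Trimming the MINOR list trades search flexibility on sub-tokens for indexing speed, so check that the searches you rely on still match before applying a change like this.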

Read more about how to configure custom segmentation.

This documentation applies to the following versions of Splunk: 3.2, 3.2.1, 3.2.2, 3.2.3, 3.2.4, 3.2.5, 3.2.6

