Index time versus search time

Splunk Enterprise documentation contains references to the terms "index time" and "search time". These terms distinguish between the types of processing that occur during indexing, and the types that occur when a search is run.

It is important to consider this distinction when administering Splunk Enterprise. For example, say that you want to use custom source types and hosts. You should define those custom source types and hosts before you start indexing, so that the indexing process can tag events with them. After indexing, you cannot change the host or source type assignments.

If you do not create the custom source types and hosts until after you have begun to index data, you have two options. You can re-index the data, so that the custom source types and hosts apply to the existing data as well as to new data. Alternatively, you can manage the issue at search time by tagging the events with alternate values.
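As a minimal sketch of assigning these values before indexing begins, an input stanza in inputs.conf can set the host and source type for everything that input reads. The monitored path, the host name app-server-01, and the source type myapp:log are hypothetical placeholders, not values taken from this topic.

```ini
# inputs.conf (hypothetical path and values)
[monitor:///var/log/myapp/app.log]
host = app-server-01
sourcetype = myapp:log
```

Because these assignments are written with each event at index time, changing the stanza later affects only data indexed after the change.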

Conversely, as a general rule, it is better to perform most knowledge-building activities, such as field extraction, at search time. Index-time custom field extraction can degrade performance at both index time and search time. When you add to the number of fields extracted during indexing, the indexing process slows. Later, searches on the index are also slower, because the index has been enlarged by the additional fields, and a search on a larger index takes longer.

You can avoid such performance issues by instead relying on search-time field extraction. For details on search-time field extraction, see "About fields" and "When Splunk Enterprise extracts fields" in the Knowledge Manager Manual.
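To illustrate why search-time extraction is the lighter option, such an extraction can be a single props.conf setting, and changing it does not require re-indexing. The following sketch assumes a hypothetical source type myapp:log whose events contain text such as status=404. The class name status_code and the field name status are also illustrative.

```ini
# props.conf (search-time extraction; hypothetical source type and field)
[myapp:log]
EXTRACT-status_code = status=(?<status>\d{3})
```

An index-time equivalent would instead need a TRANSFORMS- setting in props.conf, a transforms.conf stanza with WRITE_META = true, and a fields.conf entry with INDEXED = true, and it would apply only to data indexed after the change.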

At index time

Index-time processes take place between the point when the data is consumed and the point when it is written to disk.

The following processes occur during index time:

Default field extraction (such as host, source, sourcetype, and timestamp)
Static or dynamic host assignment for specific inputs
Default host assignment overrides
Source type customization
Custom index-time field extraction
Structured data field extraction
Event timestamping
Event linebreaking
Event segmentation (also happens at search time)
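Several of these index-time processes are controlled per source type in props.conf. The sketch below assumes a hypothetical source type myapp:log with single-line events that begin with a bracketed timestamp such as [2024-01-15 10:30:00]. The values are illustrative and would need to match the real data.

```ini
# props.conf (index-time settings; hypothetical source type)
[myapp:log]
# Event linebreaking: treat each line as a separate event
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
# Event timestamping: the timestamp follows an opening bracket
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
```

Like other index-time settings, these apply only to data indexed after they are in place.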

At search time

Search-time processes take place while a search is run, as events are collected by the search. The following processes occur at search time:

Event segmentation (also happens at index time)
Event type matching
Search-time field extraction (automatic and custom field extractions, including multivalue fields and calculated fields)
Field aliasing
Addition of fields from lookups
Source type renaming
Tagging
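Some of these search-time processes can also be configured per source type in props.conf. The sketch below assumes a hypothetical source type myapp:log with fields named clientip, bytes, and status, and a hypothetical lookup file http_status.csv containing the columns status and status_description.

```ini
# props.conf (search-time settings; hypothetical source type and fields)
[myapp:log]
# Field aliasing
FIELDALIAS-client = clientip AS src_ip
# Calculated field
EVAL-response_kb = bytes / 1024
# Addition of fields from a lookup
LOOKUP-status_desc = http_status_lookup status OUTPUT status_description

# transforms.conf (lookup definition referenced above)
[http_status_lookup]
filename = http_status.csv
```

Because these settings are applied when a search runs, they can be changed at any time without re-indexing, and they affect all matching events, including those already in the index.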

The data pipeline

The data pipeline provides a more detailed way to think about the progression of data through the system. The data pipeline is particularly useful for understanding how to assign configurations and work across a distributed deployment. See "How data moves through Splunk: the data pipeline" in Distributed Deployment.
