What's a Splunk index?
Splunk stores its indexes as flat files in a datastore on your file system. It manages these files to support flexible searching and fast data retrieval, and eventually archives them according to a user-configurable schedule. Because everything is kept in flat files, Splunk does not require any third-party database software running in the background.
During indexing, Splunk processes incoming raw data to enable fast search and analysis, storing the result in an index. As part of the indexing process, Splunk adds knowledge to the data in various ways, including by:
- Separating the datastream into individual, searchable events.
- Creating or identifying timestamps.
- Extracting fields such as host, source, and sourcetype.
- Performing user-defined actions on the incoming data, such as identifying custom fields, masking sensitive data, writing new or modified keys, applying breaking rules for multi-line events, filtering unwanted events, and routing events to specified indexes or servers.
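As a rough illustration of how some of these steps can be configured, the following props.conf sketch sets event breaking, timestamp recognition, and data masking for a single source type. The source type name my_app_log, the assumed log layout (each event begins with a timestamp such as 2012-01-31 10:15:00), and the masked pattern are hypothetical; adapt them to your own data.

```
# props.conf (sketch; "my_app_log" is a hypothetical source type)
[my_app_log]
# Break the stream into events wherever a new line starts with a date
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2})
# The timestamp sits at the start of each event, for example 2012-01-31 10:15:00
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
# Mask anything shaped like a US Social Security number before it is indexed
SEDCMD-mask_ssn = s/\d{3}-\d{2}-\d{4}/XXX-XX-XXXX/g
```

These settings take effect at index time, so they apply only to data indexed after the change, not to events already in an index.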
To start the indexing process, specify the data inputs using Splunk Web, the CLI, or the inputs.conf file. You can add more inputs at any time, and Splunk will begin indexing them as well. See "What Splunk can index" in the Getting Data In manual.
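For example, a minimal inputs.conf stanza that monitors a single log file might look like the sketch below. The file path and the source type name my_app_log are hypothetical placeholders, not values from this manual.

```
# inputs.conf (sketch; path and source type are hypothetical)
[monitor:///var/log/my_app/app.log]
sourcetype = my_app_log
# "main" is the default index; this line can be omitted to get the same result
index = main
disabled = false
```

The same kind of input can also be defined through Splunk Web or the CLI; editing the file directly is only one of the three routes described above.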
Splunk, by default, puts all user data into a single, preconfigured index. It also employs several other indexes for internal purposes. You can add new indexes and manage existing ones to meet your data requirements. See "Manage indexes" in this manual.
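As a sketch of what a new index definition can look like, the indexes.conf stanza below creates an index with its own storage paths and retention period. The index name my_app_index and the 90-day retention value are assumptions chosen for illustration; see "Manage indexes" for the supported procedure and the full list of settings.

```
# indexes.conf (sketch; "my_app_index" is a hypothetical index name)
[my_app_index]
homePath   = $SPLUNK_DB/my_app_index/db
coldPath   = $SPLUNK_DB/my_app_index/colddb
thawedPath = $SPLUNK_DB/my_app_index/thaweddb
# Age out data after about 90 days (90 * 86400 seconds)
frozenTimePeriodInSecs = 7776000
```

An input can then send its data to this index by setting index = my_app_index in the corresponding inputs.conf stanza.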