Splunk® Enterprise

Getting Data In

How Splunk software handles your data

Splunk Enterprise consumes data and indexes it, transforming it into searchable knowledge in the form of events. The data pipeline shows the main processes that act on the data during indexing. These processes constitute event processing. After the data is processed into events, you can associate the events with knowledge objects to enhance their usefulness.

The data pipeline

Incoming data moves through the data pipeline, which is described in "How data moves through Splunk deployments: The data pipeline" in the Distributed Deployment Manual.

This diagram shows the main steps in the data pipeline.

[Diagram: the data pipeline]

Event processing

Event processing occurs in two stages, parsing and indexing. All data enters through the parsing pipeline as large chunks. During parsing, Splunk software breaks these chunks into events. It then hands off the events to the indexing pipeline, where final processing occurs.

During both parsing and indexing, Splunk software transforms the data. You can configure most of these processes to adapt them to your needs.

In the parsing pipeline, Splunk software performs a number of actions, including:

  • Extracting a set of default fields for each event, including host, source, and sourcetype.
  • Configuring character set encoding.
  • Identifying line termination using line breaking rules. You can also modify line termination settings interactively, using the "Set Sourcetype" page in Splunk Web.
  • Identifying or creating timestamps. At the same time that it processes timestamps, Splunk software identifies event boundaries. You can modify timestamp settings interactively, using the "Set Sourcetype" page.
  • Anonymizing data, based on your configuration. You can mask sensitive event data (such as credit card or social security numbers) at this stage.
  • Applying custom metadata to incoming events, based on your configuration.
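The parsing steps listed above can be illustrated with a small sketch. This is not Splunk code and does not reflect how Splunk software is implemented; it is a minimal Python illustration under stated assumptions. The function name parse_chunk, the timestamp pattern, and the masking pattern are all hypothetical. The field names host, source, and sourcetype come from the list above, and _raw and _time are borrowed from Splunk's default field names for the event text and timestamp.

```python
import re
from datetime import datetime

# Hypothetical patterns chosen for this illustration only.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def parse_chunk(chunk, host, source, sourcetype):
    """Break a raw text chunk into event dicts (illustrative only)."""
    events = []
    for line in chunk.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = TIMESTAMP_RE.match(line)
        if match or not events:
            # A line that starts with a timestamp marks an event boundary.
            # The first line always opens an event, even without a timestamp.
            timestamp = None
            if match:
                timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append({
                "_time": timestamp,
                "_raw": line,
                # Default fields attached to every event.
                "host": host,
                "source": source,
                "sourcetype": sourcetype,
            })
        else:
            # A continuation line (for example, a stack trace) joins the
            # event that precedes it.
            events[-1]["_raw"] += "\n" + line
    for event in events:
        # Anonymize: keep only the last four digits of anything that looks
        # like a social security number.
        event["_raw"] = SSN_RE.sub(r"XXX-XX-\1", event["_raw"])
    return events
```

For example, calling parse_chunk on a chunk that holds two timestamped lines with an indented continuation line between them returns two events, with the continuation line merged into the first. In a real deployment you control these behaviors through configuration rather than code.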

In the indexing pipeline, Splunk software performs additional processing, including:

  • Breaking all events into segments that can then be searched. You can determine the level of segmentation, which affects indexing and searching speed, search capability, and efficiency of disk compression.
  • Building the index data structures.
  • Writing the raw data and index files to disk, where post-indexing compression occurs.

The distinction between parsing and indexing pipelines matters mainly for forwarders. Heavy forwarders can parse data locally and then forward the parsed data on to receiving indexers, where the final indexing occurs. Universal forwarders perform only minimal parsing, in specific cases such as handling structured data files. All other parsing occurs on the receiving indexer.

For information about events and what happens to them during the indexing process, see "Overview of event processing" in this manual.

Enhance and refine events

After the data has been transformed into events, you can make the events more useful by associating them with knowledge objects, such as event types, field extractions, and reports. For information about managing Splunk knowledge, see the Knowledge Manager manual, starting with "What is Splunk knowledge?".

This documentation applies to the following versions of Splunk® Enterprise: 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.2.0, 7.2.1

