Components and the data pipeline
Each segment of the data pipeline corresponds to one or more Splunk Enterprise processing components. For example, data input is a pipeline segment. You can use either an indexer or a forwarder to input data.
Most segments of the data pipeline can be handled by multiple component types. The component that you employ for a segment depends on how you structure your deployment.
For example, although you can input data directly into an indexer, you would typically do so only in small deployments consisting of a single instance. In larger deployments, and frequently also in single-instance deployments, you would instead input data through a forwarder. Delegating input tasks to a forwarder can offer greater flexibility for your deployment.
For information on the data pipeline, see How data moves through Splunk Enterprise: The data pipeline.
For information on the processing components, see Scale your deployment with Splunk Enterprise components.
How components support the data pipeline
This table lists, for each data pipeline segment, the components that can handle it:
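|Data pipeline segment|Components|
|Data input|indexer, universal forwarder, heavy forwarder|
|Parsing|indexer, heavy forwarder|
|Indexing|indexer|
|Search|indexer, search head|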
Parsing can also occur on other components under limited circumstances:
- Various components, such as search heads and indexer cluster manager nodes, process their own internal data. When doing so, they perform parsing locally.
- When a universal forwarder ingests structured data, it performs the parsing locally. The indexer does not further parse the structured data. See Extract fields from files with structured data in Getting Data In.
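The parsing rules above can be summarized as a small decision function. This is only an illustrative sketch: the function name, the component strings, and the `structured` flag are assumptions made for the example, not part of any Splunk API. It also assumes that a heavy forwarder is set up to parse the data it forwards.

```python
# Components that, per the list above, parse their own internal data locally.
SELF_PARSING = {"search head", "indexer cluster manager node"}


def parsing_location(source: str, structured: bool = False) -> str:
    """Return the component that parses data originating at `source`.

    `source` and the returned value are plain labels used only in this
    sketch; they are not Splunk configuration values.
    """
    if source in SELF_PARSING:
        # Internal data is parsed where it is produced.
        return source
    if source == "heavy forwarder":
        # Assumption: the heavy forwarder parses before forwarding.
        return "heavy forwarder"
    if source == "universal forwarder" and structured:
        # Structured data is parsed on the universal forwarder itself.
        return "universal forwarder"
    # Default case: the receiving indexer parses the data.
    return "indexer"


print(parsing_location("universal forwarder"))
print(parsing_location("universal forwarder", structured=True))
```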
Some typical interactions between components
The following examples show some of the ways that you can distribute and manage Splunk Enterprise functionality.
Forward data to an indexer
In most deployments, forwarders handle data input only, collecting data and sending it on to a Splunk Enterprise indexer. The indexer then performs both parsing and indexing. In some deployments, however, the forwarders also parse the data before sending it to the indexer, which then only indexes.
See How data moves through Splunk Enterprise: The data pipeline for the distinction between parsing and indexing.
There are two types of forwarders:
- Universal forwarders. These maintain a small footprint on their host machine. They perform minimal processing on the incoming data streams before forwarding them on to an indexer. The indexer then parses and indexes the data.
- Heavy forwarders. These retain much of the functionality of a full Splunk Enterprise instance. They can parse data before forwarding it to the receiving indexer. When a heavy forwarder parses the data, the indexer handles only the indexing segment.
Both types of forwarders tag data with metadata, such as host, source, and source type, before forwarding it on to the indexer.
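As a rough model of this division of labor, the sketch below tags a payload with the three metadata fields named above and reports which segments remain for the indexer. The class, function names, and sample values (`web01`, `/var/log/app.log`, `app_log`) are hypothetical and exist only for illustration. The sketch assumes a heavy forwarder that parses before forwarding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ForwardedData:
    """A payload plus the metadata a forwarder attaches before sending."""
    payload: str
    host: str
    source: str
    sourcetype: str


def tag_metadata(payload: str, host: str, source: str, sourcetype: str) -> ForwardedData:
    # Both forwarder types attach host, source, and source type.
    return ForwardedData(payload, host, source, sourcetype)


def indexer_segments(forwarder_type: str) -> List[str]:
    """Segments left for the indexer, given the forwarder type."""
    if forwarder_type == "universal":
        return ["parsing", "indexing"]
    if forwarder_type == "heavy":
        # Assumption: the heavy forwarder has already parsed the data.
        return ["indexing"]
    raise ValueError(f"unknown forwarder type: {forwarder_type!r}")


data = tag_metadata("raw log line", host="web01",
                    source="/var/log/app.log", sourcetype="app_log")
print(data.sourcetype, indexer_segments("universal"))
```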
Forwarders allow you to use resources efficiently while processing large quantities or disparate types of data. They also support a range of deployment topologies through their load balancing, data filtering, and routing capabilities.
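To make the three capabilities concrete, here is a generic sketch that filters events, routes them to a receiver group, and rotates through the receivers in that group. It is not Splunk's forwarding algorithm. The round-robin rotation, the event dictionaries, the group names, and the receiver addresses are all assumptions chosen for illustration.

```python
from itertools import cycle
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, str]


def forward(
    events: Iterable[Event],
    keep: Callable[[Event], bool],
    route: Callable[[Event], str],
    groups: Dict[str, List[str]],
) -> Iterator[Tuple[str, Event]]:
    """Yield (receiver, event) pairs for every event that passes the filter."""
    # One independent round-robin rotation per receiver group.
    rotations = {name: cycle(receivers) for name, receivers in groups.items()}
    for event in events:
        if not keep(event):          # data filtering
            continue
        group = route(event)         # routing
        yield next(rotations[group]), event  # load balancing


# Hypothetical receiver groups and events.
groups = {"web": ["idx1:9997", "idx2:9997"], "default": ["idx3:9997"]}
events = [
    {"sourcetype": "access_combined", "raw": "GET /"},
    {"sourcetype": "debug", "raw": "noise"},
    {"sourcetype": "access_combined", "raw": "POST /"},
    {"sourcetype": "syslog", "raw": "boot"},
]
for receiver, event in forward(
    events,
    keep=lambda e: e["sourcetype"] != "debug",
    route=lambda e: "web" if e["sourcetype"] == "access_combined" else "default",
    groups=groups,
):
    print(receiver, event["raw"])
```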
For an extended discussion of forwarders, including configuration and detailed use cases, see About forwarding and receiving in Forwarding Data.
Search across multiple indexers
In distributed search, you separate the search segment from the parsing and indexing segments. Search heads send search requests to indexers, merge the results, and return them to the user. This topology is particularly useful for horizontal scaling. To expand your deployment beyond the departmental level, you will likely employ distributed search.
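The request fan-out and result merge can be pictured with a toy model. The `Indexer` and `SearchHead` classes below are hypothetical stand-ins, not Splunk interfaces, and the choice to sort merged results newest-first by a `_time` field is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class Indexer:
    """Toy indexer that holds events and answers search requests."""

    def __init__(self, name: str, events: List[Event]) -> None:
        self.name = name
        self.events = events

    def search(self, predicate: Callable[[Event], bool]) -> List[Event]:
        return [e for e in self.events if predicate(e)]


class SearchHead:
    """Toy search head that queries every indexer and merges the results."""

    def __init__(self, indexers: List[Indexer]) -> None:
        self.indexers = indexers

    def search(self, predicate: Callable[[Event], bool]) -> List[Event]:
        # Send the same request to all indexers in parallel.
        workers = max(1, len(self.indexers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            partials = list(pool.map(lambda i: i.search(predicate), self.indexers))
        # Merge the partial results; newest first is an assumption here.
        merged = [event for part in partials for event in part]
        merged.sort(key=lambda e: e["_time"], reverse=True)
        return merged


head = SearchHead([
    Indexer("a", [{"_time": 1, "raw": "error x"}, {"_time": 3, "raw": "ok"}]),
    Indexer("b", [{"_time": 2, "raw": "error y"}]),
])
for event in head.search(lambda e: "error" in e["raw"]):
    print(event)
```

Adding capacity in this model means adding another `Indexer` to the list, which mirrors the horizontal scaling described above.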
For an extended discussion of distributed search, including configuration and detailed use cases, see About distributed search in Distributed Search.
Configurations and the data pipeline
For guidance on where to configure Splunk Enterprise settings, see Configuration parameters and the data pipeline in the Admin Manual. The topic lists configuration settings and the data pipeline segments that they act upon.
If you know which components in your Splunk Enterprise topology handle which segments of the data pipeline, you can use that topic to determine where to configure any particular setting. For example, if you use a search head to handle the search segment, you need to configure all search-related settings on the search head.
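That lookup can be written down as a simple mapping from pipeline segment to the components that handle it in your deployment. The topology below is a hypothetical example, and the function name is an assumption. You would still consult the Admin Manual topic to learn which segment a given setting acts upon.

```python
from typing import Dict, List

# Hypothetical deployment: which components handle which segment.
TOPOLOGY: Dict[str, List[str]] = {
    "input": ["universal forwarder"],
    "parsing": ["indexer"],
    "indexing": ["indexer"],
    "search": ["search head"],
}


def where_to_configure(segment: str, topology: Dict[str, List[str]]) -> List[str]:
    """Return the components on which to configure settings for `segment`."""
    try:
        return topology[segment]
    except KeyError:
        raise ValueError(f"unknown pipeline segment: {segment!r}") from None


# A search-related setting belongs on the search head in this topology.
print(where_to_configure("search", TOPOLOGY))
```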
For more information
In summary, the fundamental components of a Splunk Enterprise distributed environment are forwarders, indexers, and search heads. For related topics, see How data moves through Splunk deployments: The data pipeline and Components that help to manage your deployment.
This documentation applies to the following versions of Splunk® Enterprise: 8.1.0, 8.1.1