Indexers in a distributed deployment
Important: To better understand this topic, you should be familiar with Splunk Enterprise distributed environments, covered in Distributed Deployment.
The indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are:
- Indexing incoming data.
- Searching the indexed data.
For larger-scale needs, indexing is split out from the data input function and sometimes from the search management function as well. In these larger, distributed deployments, the indexer might reside on its own machine and handle only indexing, along with searching of its indexed data. In those cases, other Splunk Enterprise components take over the non-indexing roles.
For instance, you might have a set of Windows and Linux machines generating events, which need to go to a central indexer for consolidation. Usually the best way to do this is to install a lightweight instance of Splunk Enterprise, known as a forwarder, on each of the event-generating machines. These forwarders handle data input and send the data across the network to the indexer residing on its own machine.
Similarly, in cases where you have a large amount of indexed data and numerous concurrent users searching on it, it can make sense to split off the search management function from indexing. In this type of scenario, known as distributed search, one or more search heads distribute search requests across multiple indexers. The indexers still perform the actual searching of their own indexes, but the search heads manage the overall search process across all the indexers and present the consolidated search results to the user.
Here is an example of a scaled-out deployment:
While the fundamental issues of indexing and event processing remain the same for distributed deployments, it is important to take into account deployment needs when planning your indexing strategy.
Forward data to an indexer
To forward remote data to an indexer, you use forwarders, which are Splunk Enterprise instances that receive data inputs and then consolidate and send the data to a Splunk Enterprise indexer. Forwarders come in two flavors:
- Universal forwarders. These maintain a small footprint on their host machine. They perform minimal processing on the incoming data streams before forwarding them on to an indexer, also known as the receiver.
- Heavy forwarders. These retain most of the functionality of a full Splunk Enterprise instance. They can parse data before forwarding it to the receiving indexer. (See How indexing works for the distinction between parsing and indexing.) They can store indexed data locally and also forward the parsed data to a receiver for final indexing on that machine as well.
Both types of forwarders tag data with metadata such as host, source, and source type, before forwarding it on to the indexer.
Forwarders allow you to use resources efficiently when processing large quantities or disparate types of data coming from remote sources. They also enable a number of interesting deployment topologies, by offering capabilities for load balancing, data filtering, and routing.
For an extended discussion of forwarders, including configuration and detailed use cases, read Forwarding Data.
Search across multiple indexers
In distributed search, search heads send search requests to indexers and then merge the results back to the user. This is useful for a number of purposes, including horizontal scaling, access control, and managing geo-dispersed data.
For an extended discussion of distributed search and search heads, including configuration and detailed use cases, see Distributed Search.
Deploy indexers in a distributed environment
To implement a distributed environment similar to the diagram earlier in this topic, you need to install and configure three types of components:
- Forwarders (typically, universal forwarders)
- Search head(s)
Install and configure the indexers
By default, all full Splunk Enterprise instances serve as indexers. For horizontal scaling, you can install multiple indexers on separate machines.
To learn how to install a Splunk Enterprise instance, read the Installation Manual.
Then return to this manual for information on configuring each individual indexer to meet the needs of your specific deployment. Start with the chapter Manage indexers and continue with the chapters that follow.
Install and configure the forwarders
A typical distributed deployment has a large number of forwarders feeding data to a few indexers. For most forwarding purposes, the universal forwarder is the best choice. The universal forwarder is a separate downloadable from the full Splunk Enterprise instance.
To learn how to install and configure forwarders, read Forwarding Data.
Install and configure the search head(s)
You can install one or more search heads to handle your distributed search needs. Search heads are just full Splunk Enterprise instances that have been specially configured.
To learn how to configure a search head, read Distributed Search.
Other deployment tasks
You can use the Splunk Enterprise deployment server to simplify the job of updating the deployment components. For details on how to configure a deployment server, see Updating Splunk Enterprise Instances.
Install a cluster of indexers
If data availability, data fidelity, and data recovery are key issues for your deployment, then you should consider deploying an indexer cluster, rather than a series of individual indexers. For further information, see About indexer clusters and index replication.
Install an indexer
About managing indexes
This documentation applies to the following versions of Splunk® Enterprise: 9.0.0, 9.0.1