Splunk® Enterprise

Distributed Deployment Manual

Splunk Enterprise version 6.x is no longer supported as of October 23, 2019. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.

Distributed Splunk Enterprise overview

This manual describes how to distribute various components of Splunk Enterprise functionality across multiple machines. By distributing Splunk Enterprise, you can scale its functionality to handle the data needs for enterprises of any size and complexity.

In single-machine deployments, one instance of Splunk Enterprise handles the entire end-to-end process, from data input through indexing to search. A single-machine deployment can be useful for testing and evaluation purposes and might serve the needs of department-sized environments. For larger environments, however, where data originates on many machines and where many users need to search the data, you'll want to distribute functionality across multiple Splunk Enterprise instances. This manual describes how to deploy and use Splunk Enterprise in such a distributed environment.

How Splunk Enterprise scales

Splunk Enterprise performs three key functions as it moves data through the data pipeline. First, it consumes data from files, the network, or elsewhere. Then it indexes the data. (Actually, it first parses and then indexes the data, but for purposes of this discussion, we consider parsing to be part of the indexing process.) Finally, it runs interactive or scheduled searches on the indexed data.
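
The three functions can be pictured as stages that hand data to one another. The following Python sketch is a toy model only, not Splunk Enterprise code: the function names `consume`, `build_index`, and `search` are hypothetical, and the whitespace tokenizer stands in for the real parsing step.

```python
from collections import defaultdict


def consume(raw_lines):
    """Input stage: yield non-empty lines from any iterable source."""
    for line in raw_lines:
        line = line.strip()
        if line:
            yield line


def build_index(events):
    """Indexing stage: parse each event into lowercase tokens, then map
    each token to the positions of the events that contain it."""
    store = []
    index = defaultdict(set)
    for event in events:
        position = len(store)
        store.append(event)
        for token in event.lower().split():
            index[token].add(position)
    return store, index


def search(store, index, term):
    """Search stage: return stored events that contain the term."""
    return [store[i] for i in sorted(index.get(term.lower(), ()))]


if __name__ == "__main__":
    raw = ["ERROR disk full", "", "INFO login ok", "error timeout"]
    store, index = build_index(consume(raw))
    print(search(store, index, "error"))
```

In a distributed deployment, each of these stages can run on a different instance; in a single-machine deployment, one instance runs all three.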

You can split this functionality across multiple specialized instances of Splunk Enterprise, ranging in number from just a few to thousands, depending on the quantity of data you're dealing with and other variables in your environment. You might, for example, create a deployment with many instances that only consume data, several other instances that index the data, and one or more instances that handle search requests. These specialized instances are known collectively as components. There are several types of components.

For a typical mid-size deployment, for example, you can deploy lightweight versions of Splunk Enterprise, called forwarders, on the machines where the data originates. The forwarders consume data locally and then forward the data across the network to another Splunk Enterprise component, called the indexer. The indexer does the heavy lifting; it indexes the data and runs searches. It should reside on a machine by itself. The forwarders, on the other hand, can easily co-exist on the machines generating the data, because the data-consuming function has minimal impact on machine performance. This diagram shows several forwarders sending data to a single indexer:

[Diagram: several forwarders sending data to a single indexer]

As you scale up, you can add more forwarders and indexers. For a larger deployment, you might have hundreds of forwarders sending data to a number of indexers. You can use load balancing on the forwarders, so that they distribute their data across some or all of the indexers. Not only does load balancing help with scaling, but it also provides a fail-over capability if one of the indexers goes down. The forwarders automatically switch to sending their data to any indexers that remain alive. In this diagram, each forwarder load-balances its data across two indexers:

[Diagram: each forwarder load-balancing its data across two indexers]
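
The load-balancing and fail-over behavior can be illustrated with a small simulation. This is a sketch under stated assumptions, not the forwarder's actual algorithm: it rotates to the next indexer on every event, and the `Forwarder` class, its `send` method, and the indexer names are all hypothetical.

```python
class Forwarder:
    """Toy forwarder that rotates across indexers and skips dead ones."""

    def __init__(self, indexers):
        self.indexers = list(indexers)
        self._next = 0  # position of the next indexer to try

    def send(self, event, alive, received):
        """Deliver the event to the next live indexer and return its name.

        `alive` is the set of indexers currently reachable (assumed to be
        known to the forwarder); `received` records what each indexer got.
        """
        for _ in range(len(self.indexers)):
            target = self.indexers[self._next]
            self._next = (self._next + 1) % len(self.indexers)
            if target in alive:
                received.setdefault(target, []).append(event)
                return target
        raise RuntimeError("no live indexer available")


if __name__ == "__main__":
    forwarder = Forwarder(["idx1", "idx2"])
    received = {}
    forwarder.send("e1", {"idx1", "idx2"}, received)
    forwarder.send("e2", {"idx1", "idx2"}, received)
    # idx1 goes down; later events fail over to idx2.
    forwarder.send("e3", {"idx2"}, received)
    print(received)
```

While both indexers are alive, events alternate between them; once one goes down, every event lands on the survivor without any change on the sending side.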

To coordinate and consolidate search activities across multiple indexers, you can also separate out the functions of indexing and searching. In this type of deployment, called distributed search, each indexer just indexes data and performs searches across its own indexes. A Splunk Enterprise instance dedicated to search management, called the search head, coordinates searches across the set of indexers, consolidating the results and presenting them to the user:

[Diagram: a search head coordinating searches across a set of indexers]
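
Distributed search follows a scatter-gather pattern: the search head sends the same search to every indexer, each indexer searches only its own data, and the search head merges the partial results. The Python sketch below models that pattern with threads. It is illustrative only; the function names, the `(timestamp, text)` event shape, and the newest-first ordering are assumptions, not Splunk Enterprise APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def search_indexer(events, term):
    """What each indexer does: search only its own events."""
    return [event for event in events if term in event[1]]


def search_head(indexers, term):
    """Fan the search out to every indexer, then merge the partial
    results into one list ordered newest first."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(
            lambda events: search_indexer(events, term), indexers.values()
        )
        merged = [hit for partial in partials for hit in partial]
    return sorted(merged, key=lambda event: event[0], reverse=True)


if __name__ == "__main__":
    # Each indexer holds a different slice of the data.
    indexers = {
        "idx1": [(1, "error a"), (3, "ok")],
        "idx2": [(2, "error b")],
    }
    print(search_head(indexers, "error"))
```

The user sees a single result set and does not need to know which indexer held which event.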

For the largest environments, you can deploy a pool of several search heads sharing a single configuration set. With search head pooling, you can coordinate simultaneous searches across a large number of indexers:

[Diagram: a pool of search heads coordinating searches across many indexers]

These diagrams illustrate a few basic deployment topologies. You can actually combine the functions of data input, indexing, and search in a great variety of ways. For example, you can set up the forwarders so that they route data to multiple indexers, based on specified criteria. You can also configure forwarders to process data locally before sending the data on to an indexer for storage. In another scenario, you can deploy a single instance that serves as both search head and indexer, searching across not only its own indexes but the indexes on other indexers as well. You can mix and match Splunk Enterprise components as needed. The possible scenarios are nearly limitless.

This manual describes how to scale a deployment to fit your exact needs, whether you're managing data for a single department, for a global enterprise, or for anything in between.

Use indexer clusters

Clusters are groups of Splunk Enterprise indexers configured to replicate each other's data, so that the system keeps multiple copies of all data. This process is known as index replication. By maintaining multiple, identical copies of Splunk Enterprise data, clusters prevent data loss while promoting data availability for searching.

Splunk Enterprise clusters feature automatic failover from one indexer to the next. This means that, if one or more indexers fail, incoming data continues to get indexed and indexed data continues to be searchable.
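
The effect of index replication on availability can be shown with a toy model. The sketch below is not how Splunk Enterprise clusters are implemented: the `ToyCluster` class, its `copies` parameter, and the least-loaded placement rule are assumptions chosen to keep the example short.

```python
class ToyCluster:
    """Toy cluster that stores every event on several indexers."""

    def __init__(self, indexers, copies=2):
        self.copies = copies  # assumed number of copies kept per event
        self.storage = {name: set() for name in indexers}
        self.alive = set(indexers)

    def index(self, event):
        """Store the event on up to `copies` live indexers."""
        # Placement rule (an assumption): prefer the least-loaded indexers.
        ranked = sorted(self.alive, key=lambda n: (len(self.storage[n]), n))
        targets = ranked[: self.copies]
        if not targets:
            raise RuntimeError("no live indexer available")
        for name in targets:
            self.storage[name].add(event)
        return targets

    def fail(self, name):
        """Mark an indexer as down; its copies become unreachable."""
        self.alive.discard(name)

    def search(self):
        """Return every event still reachable on a live indexer."""
        found = set()
        for name in self.alive:
            found |= self.storage[name]
        return sorted(found)


if __name__ == "__main__":
    cluster = ToyCluster(["a", "b", "c"], copies=2)
    for event in ["e1", "e2", "e3"]:
        cluster.index(event)
    cluster.fail("a")
    print(cluster.search())   # all events remain searchable
    cluster.index("e4")       # indexing continues on the survivors
    print(cluster.search())
```

Because every event lives on two indexers in this model, losing any single indexer leaves at least one copy of each event available for search, and new data keeps arriving on the indexers that remain.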

Besides enhancing data availability, clusters have other key features that you should consider when you're scaling a deployment. For example, they include a capability to coordinate configuration updates easily across all indexers in the cluster. They also include a built-in distributed search capability. For more information on clusters, see "About clusters and index replication" in the Managing Indexers and Clusters manual.

Manage your Splunk Enterprise deployment

Splunk Enterprise provides a few key tools to help manage a distributed deployment:

  • Deployment server. This component provides a way to centrally manage configurations and content updates across your entire deployment. See "About deployment server" in the Updating Splunk Enterprise Instances manual for details.
  • Deployment monitor. This app can help you manage and troubleshoot your deployment. It tracks the status of your forwarders and indexers and provides early warning if problems develop. Read the Deploy and Use Splunk Enterprise Deployment Monitor App manual for details.

What comes next

The rest of this Overview section starts by describing the data pipeline, from the point that the data enters Splunk Enterprise to when it becomes available for users to search. Next, it describes how Splunk Enterprise functionality can be split into modular components. It then correlates the available Splunk Enterprise components with their roles in facilitating the data pipeline.

The remaining sections of this manual describe the Splunk Enterprise components in detail, explaining how to use them to create a distributed Splunk Enterprise deployment.

For information on capacity planning based on the scale of your deployment, read "Hardware capacity planning for your Splunk Enterprise deployment" in the Installation Manual.

Last modified on 27 April, 2015
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14