Splunk® Enterprise

Installation Manual

Splunk version 4.x reached its End of Life on October 1, 2013. Please see the migration information.
This documentation does not apply to the most recent version of Splunk.

High availability reference architecture

Splunk provides the flexibility and capability to handle machine data for any type of computing environment, including the most stringent needs of medium and large enterprises. In some environments, maintaining data integrity and high availability can be of critical importance.

How you define high availability, and the approach you take to implement it, will vary greatly according to the needs of your particular business and the state of your existing system. This topic will help you make the right decisions about how best to deploy Splunk to promote a highly available, highly reliable system. It does not attempt to dictate any single approach to high availability. Rather, it offers a starting point for planning an approach that suits your enterprise.

As part of planning a highly available Splunk deployment, you must also take into account all aspects of your existing system: not only its components and topology, but also its overall reliability and availability. The specifics of your current system will determine how you integrate Splunk into it.

Before reading this topic, you should already be familiar with Splunk deployments and components, as described in "Distributed Splunk overview".

Note: This topic is intended for planning purposes only. It is not meant to serve as a detailed implementation guide. If you want to implement a high availability Splunk deployment, contact Splunk Professional Services for guidance.

The elements of a high availability architecture

Splunk both collects data and queries it. To implement end-to-end Splunk availability, you need to consider both functions.

If you are using Splunk forwarders in a load-balanced configuration, in which you send data alternately to multiple Splunk indexers in a group, then you already have high availability on the data collection side of Splunk. If one indexer goes down, the forwarder will automatically start sending the data to the other indexers in the load-balanced group.
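The failover behavior described above can be pictured with a small Python model. This is only an illustrative sketch, not Splunk's forwarder implementation: the `Forwarder` class, the `route_events` helper, and the indexer names are all hypothetical, and simple rotation stands in for whatever distribution scheme a real load-balanced group uses.

```python
from itertools import cycle


class Forwarder:
    """Toy model of a forwarder that rotates across a group of indexers."""

    def __init__(self, indexers):
        self.indexers = list(indexers)
        self._rotation = cycle(self.indexers)

    def send(self, event, is_up):
        """Pick the next reachable indexer for this event and return its name."""
        # Try each indexer in the group at most once per event.
        for _ in range(len(self.indexers)):
            target = next(self._rotation)
            if is_up(target):
                return target
        raise RuntimeError("no indexer in the group is reachable")


def route_events(events, indexers, down=()):
    """Return (event, indexer) pairs, skipping any indexer listed in `down`."""
    forwarder = Forwarder(indexers)
    down = set(down)
    return [(e, forwarder.send(e, lambda name: name not in down)) for e in events]
```

With both hypothetical indexers up, `route_events(["e1", "e2"], ["idx1", "idx2"])` alternates between them. Passing `down=["idx1"]` sends every event to `idx2`, which mirrors the automatic switch to the remaining indexers in the group.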

To provide high availability for Splunk's data querying capability, you must maintain availability for:

  • The indexer(s)
  • The indexed data

The rest of this topic describes ways to maintain high availability for querying Splunk data.

Depending on your requirements, you might also need to consider availability of other components of your Splunk deployment, such as search heads and forwarders. You must also provide high availability for non-Splunk (but Splunk-dependent) aspects of your system, such as your data sources, hardware, and network.

There are two basic choices for implementing high availability for Splunk indexers and data:

  • Use a highly reliable storage system
  • Use a mirrored cluster of Splunk indexers

High reliability storage

There are a number of ways that you can use an underlying storage system to promote high availability for Splunk. The exact architecture you implement will depend on your existing environment and specific needs.

For example, in a typical SAN-based architecture, you could install your Splunk indexers directly on the SAN and then mount the Splunk volumes on server nodes. If a node goes down, you can remount its volume on another node. The new node takes on the identity of the failed node, with the same configurations and access to the same set of indexed data. You just need to point your search head at that node in place of the old one. You can further configure your SAN to attain the level of redundancy you require.

Data replication across indexer clusters

Another way to achieve high availability of both the indexed data and the indexing/searching capabilities is to create primary and secondary clusters of mirrored indexers. If an indexer in the primary cluster fails, you can reconfigure forwarders and search heads to point to its mirror on the secondary cluster.

Here's an example of this strategy. It starts by showing two forwarders using load balancing to distribute data to the indexers in the primary cluster:


[Figure HA 1: two forwarders load balancing data across the indexers in the primary cluster]


The primary indexers index the data locally and also forward the raw (unindexed) data onwards to secondary indexers, which then index the data a second time:


[Figure HA 2: primary indexers forwarding raw data to their secondary indexers]


You now have copies of the indexed data in two places. Each indexer in the secondary cluster contains an exact copy of the data on its corresponding indexer in the primary cluster. You can search against either the primary or the secondary cluster:


[Figure HA 3: a search head searching against either the primary or the secondary cluster]
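The index-and-forward arrangement can be sketched as a toy Python model. It is a simplification under stated assumptions: the `Indexer` class and `build_mirrored_pair` helper are hypothetical names, an in-memory list stands in for the on-disk index, and forwarding is assumed to always succeed, which a real deployment cannot take for granted.

```python
class Indexer:
    """Toy indexer: stores events locally and optionally forwards raw copies."""

    def __init__(self, name, mirror=None):
        self.name = name
        self.mirror = mirror   # secondary indexer that receives raw copies, if any
        self.indexed = []      # stands in for the on-disk index

    def receive(self, raw_event):
        """Index the event locally, then pass the raw event to the mirror."""
        self.indexed.append(raw_event)
        if self.mirror is not None:
            self.mirror.receive(raw_event)


def build_mirrored_pair(name):
    """Create a primary indexer and the secondary indexer that mirrors it."""
    secondary = Indexer(name + "-secondary")
    primary = Indexer(name + "-primary", mirror=secondary)
    return primary, secondary
```

In this model, every event that a primary receives ends up in both `primary.indexed` and `secondary.indexed`, so a search against either side sees the same data.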


If one of the indexers in the primary cluster goes down, the forwarders' load-balancing capability means that they will automatically start sending all their data to the remaining indexer(s) in that cluster. Those indexers will continue to send copies of their data on to their mirrored instances in the secondary cluster:


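[Figure HA 4: with one primary indexer down, the forwarders send all data to the remaining primary indexer, which continues forwarding to its mirror]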


You can continue to search across the full set of data. You just redirect the search head(s) to point to the secondary instance of the downed indexer. At the same time, the search head can continue to point to the remaining indexer in the primary cluster. Alternatively, you can redirect the search head to point exclusively to the secondary indexers. In either case, searching continues across the entire set of data:


[Figure HA 5: the search head redirected to the secondary instance of the downed indexer]
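The redirection logic can be expressed as a short Python sketch. The `select_search_peers` function and the indexer names are hypothetical, chosen for illustration only. In practice the redirect is a manual reconfiguration of the search head, not something this code performs.

```python
def select_search_peers(mirrors, down=()):
    """Choose one live indexer per primary/secondary pair.

    mirrors: dict mapping each primary indexer name to its secondary mirror.
    down:    names of indexers that are currently unavailable.
    """
    down = set(down)
    peers = []
    for primary, secondary in mirrors.items():
        if primary not in down:
            peers.append(primary)      # keep searching the primary
        elif secondary not in down:
            peers.append(secondary)    # fall back to the mirror
        else:
            raise RuntimeError(f"no live copy of the data held by {primary}")
    return peers
```

For example, with `mirrors = {"p1": "s1", "p2": "s2"}` and `down=["p1"]`, the function returns `["s1", "p2"]`: the search head points at the mirror of the downed indexer and at the remaining primary. Pointing exclusively at the secondary indexers, the alternative the text mentions, would amount to returning `list(mirrors.values())`.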


There are many ways you can implement specific aspects of this architecture. For guidance, talk with Splunk Professional Services.

This second solution has the advantage that it depends less on the capabilities of your underlying storage system. On the downside, it requires double the hardware (since you're doubling the indexers), as well as a license for twice the indexing volume (since you're indexing everything twice).
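The cost trade-off is simple arithmetic, sketched here in Python. The function name and the sample figures are hypothetical. The sketch assumes one secondary indexer per primary and that every event is indexed exactly twice, as in the example above.

```python
def mirrored_cluster_cost(primary_indexers, daily_volume_gb):
    """Estimate resources for a mirrored deployment.

    primary_indexers: number of indexers in the primary cluster.
    daily_volume_gb:  daily indexing volume before mirroring, in GB.
    """
    return {
        # One secondary indexer for every primary indexer.
        "indexers": 2 * primary_indexers,
        # Every event is indexed once on the primary and once on its mirror.
        "licensed_daily_volume_gb": 2 * daily_volume_gb,
    }
```

Under these assumptions, a deployment with 2 primary indexers handling 100 GB per day would need 4 indexers and a license covering 200 GB per day.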

This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7


Comments

In the second scenario, how can we ensure that the replica indexer has an exact copy of the primary indexer's data? Could someone please help us with this?

Psivakum
July 4, 2013

Djbyler - Please contact your Splunk sales rep for information about HA licenses. They're the best source for definitive information on this issue.

Sgoodman, Splunker
October 15, 2012

I have a question about the following statement:

"This second solution has the advantage that it depends less on the capabilities of your underlying storage system. On the downside, it requires double the hardware (since you're doubling the indexers), as well as a license for twice the indexing volume (since you're indexing everything twice)."

Would this scenario be a valid use case for an HA License, where the second indexing cluster uses the HA license pool?

Djbyler
October 15, 2012
