Use forwarders to get data into the indexer cluster

The main reasons to use forwarders with indexer clusters are:

To ensure that all incoming data gets indexed. By activating the forwarder's optional indexer acknowledgment feature, you can ensure that all incoming data gets indexed and stored on the cluster. See "How indexer acknowledgment works."

To handle potential node failure. With load-balanced forwarders, if one peer in the group goes down, the forwarder continues to send its data to the remaining peers in the group. See "How load balancing works."

To simplify the process of connecting data sources and peer nodes. By enabling indexer discovery on your forwarders, the forwarders automatically load balance across all available peer nodes, including any that are later added to the cluster. See "Advantages of the indexer discovery method."

To use forwarders to get data into clusters, you must perform two types of configuration:

1. Connect forwarders to peer nodes.

2. Configure the data inputs to each forwarder.

Before continuing, you must be familiar with forwarders and how to use them to get data into Splunk Enterprise. For an introduction to forwarders, read "About forwarding and receiving" in the Forwarding Data manual. Subsequent topics in that manual describe all aspects of deploying and configuring forwarders.

Connect forwarders to peer nodes

There are two ways to connect forwarders to peer nodes:

Use the indexer discovery feature. With indexer discovery, each forwarder queries the manager node for a list of all peer nodes in the cluster. It then uses load balancing to forward data to the set of peer nodes. In the case of a multisite cluster, a forwarder can optionally query the manager for a list of all peers on a single site. See "Use indexer discovery to connect forwarders to peer nodes."

Connect forwarders directly to peer nodes. This is the traditional method for establishing forwarder/indexer connectivity. You specify the peer nodes directly on the forwarders as receivers. See "Connect forwarders directly to peer nodes."

Advantages of the indexer discovery method

Indexer discovery has advantages over the traditional method:

When new peer nodes join the cluster, you do not need to reconfigure and restart your forwarders to connect to the new peers. The forwarder automatically gets the updated list of peers from the manager node. It uses load balancing to forward to all peers in the list.

You can add new forwarders without needing to determine the current set of cluster peers. You just configure indexer discovery on the new forwarders.

You can use weighted load balancing when forwarding data across the set of peers. With indexer discovery, the manager node can track the total amount of disk space on each peer and communicate that information to the forwarders. The forwarders then adjust the amount of data they send to each peer, based on its disk capacity.

Configure the data inputs to each forwarder

After you specify the connection between the forwarders and the receiving peers using the method you prefer, you must specify the data inputs to each forwarder, so that the forwarder has data to send to the cluster. You usually do this by editing the forwarder's inputs.conf file.

For detailed information on configuring data inputs, read the Getting Data In manual, starting with "What Splunk can index." The topic in that manual entitled "Use forwarders" provides an introduction to specifying data inputs on forwarders.

How indexer acknowledgment works

To ensure end-to-end data fidelity, you must explicitly enable indexer acknowledgment on each forwarder sending data to the cluster.

In brief, indexer acknowledgment works like this: The forwarder sends data continuously to a peer node, in blocks of approximately 64 KB. The receiving peer, also known as the "source peer", then streams each data block to its target peers. The forwarder keeps a copy of each block in memory until it gets an acknowledgment for that block from the source peer. While waiting, it continues to send more data blocks.

When indexer acknowledgment is enabled, its default behavior ensures that the replication factor number of peers receive each block of data. To handle ingestion delays due to slow indexers, you can change the behavior to limit the number of peer nodes that must receive the data before acknowledgment occurs. In doing so, you trade a guarantee of high data availability for better cluster performance.

The main scenarios are:

Full acknowledgment (default)
Limited acknowledgment

Full acknowledgment behavior

By default, the cluster waits until a number of peers equal to the replication factor has received the data before returning acknowledgment to the forwarder. For example, in a replication factor 3 cluster, the source peer and its two target peers must all receive the data before the source peer returns acknowledgment to the forwarder.

If all goes well, the source peer:

1. receives the block of data, parses and indexes it, and writes the data (raw data and index data) to its file system.

2. streams copies of the raw data to each of its target peers to fulfill the replication factor.

3. receives notification from each target peer of either a successful or unsuccessful write.

4. sends an acknowledgment back to the forwarder.

At the conclusion of this process, the source peer plus its target peers have all received copies of the data block.

The acknowledgment assures the forwarder that the data was successfully written to the cluster. Upon receiving the acknowledgment, the forwarder releases the block from memory.

If the forwarder does not receive the acknowledgment, there was a failure along the way. Either the source peer went down or that peer was unable to contact its set of target peers. The forwarder then automatically resends the block of data. If the forwarder is using load balancing, it sends the block to another peer in the load-balanced group. If the forwarder is not set up for load balancing, it attempts to resend the data to the same peer as before.
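The hold-and-resend behavior described above can be sketched as a small simulation. This is an illustrative model only, not the forwarder's actual implementation: the `Peer` class, the `send_block` function, the peer names, and the `max_attempts` limit are all hypothetical, and the peers are tried in a fixed order here (rather than however the forwarder really chooses them) so that the result is repeatable.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical stand-in for a source peer in the cluster."""
    name: str
    healthy: bool = True
    stored: list = field(default_factory=list)

    def receive(self, block: bytes) -> bool:
        """Return True (an acknowledgment) only if the block was stored."""
        if not self.healthy:
            return False  # no acknowledgment comes back from a failed peer
        self.stored.append(block)
        return True


def send_block(block: bytes, peers: list, max_attempts: int = 10) -> str:
    """Send one block, moving to another peer until it is acknowledged.

    The caller's `block` stays in memory for the whole loop; it can be
    released only after this function returns the acknowledging peer.
    """
    for attempt in range(max_attempts):
        # Assumption: try peers in order; a real forwarder may choose differently.
        peer = peers[attempt % len(peers)]
        if peer.receive(block):
            return peer.name
    raise RuntimeError("no acknowledgment received; block is still held")


peers = [Peer("A", healthy=False), Peer("B"), Peer("C")]
acked_by = send_block(b"x" * 64 * 1024, peers)
print(acked_by)  # prints "B": peer A is down, so the block is resent to B
```

With a single peer in the list, the same loop models a forwarder without load balancing, which keeps retrying the same peer.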

Limited acknowledgment behavior

To ensure that data continues to flow into the cluster at the expected rate, even in clusters with a slow or malfunctioning peer node, you can configure the acknowledgment behavior so that it does not wait for the full replication factor number of peers to receive the data. Instead, the source peer returns acknowledgment to the forwarder once some smaller number of peer nodes have received the data.

For example, you can configure the behavior so that only one peer (that is, the source peer) must receive the data before returning the acknowledgment. Or, in a replication factor 3 cluster, you can specify that only two peers (the source peer and one of the target peers) must receive the data before the source peer returns an acknowledgment to the forwarder.

By doing so, you can avoid situations where the forwarder might hang unnecessarily while awaiting acknowledgment, due to a peer node performing poorly.

Limited acknowledgment only determines the timing for when the source peer returns an acknowledgment to the forwarder. The source peer continues to stream the data to its target peers, even after returning the acknowledgment, thus ensuring that the replication factor number of copies of the data resides on peers in the cluster.

Configure indexer acknowledgment behavior

The ack_factor setting in each peer node's server.conf determines the number of peers that must receive data before the source peer returns acknowledgment to the forwarder.

By default, ack_factor is set to 0, which signifies the replication factor number of peers. So, if ack_factor=0 and the cluster has a replication factor of 3, then a total of 3 peer nodes must receive each data block before acknowledgment is returned to the forwarder.

To limit the number of peer nodes that must receive the data, you can set ack_factor to an integer between 1 and the replication factor. For example, to limit the number of peer nodes that must receive the data to one, set ack_factor=1. To limit the number of peer nodes to two, set ack_factor=2, and so on.

Note:

The setting is valid only if useACK=true and mode=peer.
All peer nodes in the cluster must set ack_factor to the same value.
This setting requires a restart to take effect.
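The relationship between ack_factor and the replication factor can be expressed as a short function. This is a sketch for reasoning about the setting, not product code: the function name `required_acks` is hypothetical, and raising an error for out-of-range values is an assumption, since the text above does not say how such values are handled.

```python
def required_acks(ack_factor: int, replication_factor: int) -> int:
    """Number of peers that must receive a block before acknowledgment."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if ack_factor == 0:
        # Default: full acknowledgment, wait for replication factor peers.
        return replication_factor
    if 1 <= ack_factor <= replication_factor:
        # Limited acknowledgment: wait for exactly ack_factor peers.
        return ack_factor
    # Assumption: treat anything else as a configuration error.
    raise ValueError("ack_factor must be 0 or between 1 and the replication factor")


for ack_factor in (0, 1, 2):
    print(ack_factor, required_acks(ack_factor, replication_factor=3))
# prints:
# 0 3
# 1 1
# 2 2
```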

For more information on how indexer acknowledgment works, read "Protect against loss of in-flight data" in the Forwarding Data manual.

How load balancing works

In load balancing, the forwarder distributes incoming data across several receiving peer nodes. Each node gets a portion of the total data, and together the receiving nodes get all the data.

Splunk forwarders perform "automatic load balancing". The forwarder routes data to different nodes based on a specified time interval. For example, assume you have a load-balanced group consisting of three peer nodes: A, B, and C. At the interval specified by the autoLBFrequency attribute in outputs.conf (30 seconds by default), the forwarder switches the data stream to another node in the group, selected at random. So, every 30 seconds, the forwarder might switch from node B to node A to node C, and so on. If one node is down, the forwarder immediately switches to another.

Note: Each of the forwarder's inputs has its own data stream. At the specified interval, the forwarder switches the data stream to the newly selected node, if it is safe to do so. If it cannot safely switch the data stream to the new node, it keeps the connection to the previous node open and continues to send the data stream to that node until the data has been safely sent.
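The interval-based switching described above can be modeled with a short simulation. This is a simplified sketch under stated assumptions, not the forwarder's actual logic: the `pick_next` and `schedule` functions are hypothetical, the "safe to switch" check from the note is left out, and the sketch assumes that "another node" means a node other than the current one.

```python
import random


def pick_next(nodes, current, rng):
    """Pick a node at random, excluding the current one when possible."""
    candidates = [n for n in nodes if n != current]
    if not candidates:
        return current  # only one node in the group, nothing to switch to
    return rng.choice(candidates)


def schedule(nodes, duration_s, interval_s=30, seed=0):
    """Return (start_time_s, node) pairs for one simulated data stream."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    current = rng.choice(nodes)
    timeline = [(0, current)]
    for t in range(interval_s, duration_s, interval_s):
        current = pick_next(nodes, current, rng)
        timeline.append((t, current))
    return timeline


# Two minutes of a stream across nodes A, B, and C at the default interval.
timeline = schedule(["A", "B", "C"], duration_s=120)
for start, node in timeline:
    print(f"t={start:>3}s -> node {node}")
```

The `interval_s` parameter plays the role of autoLBFrequency; a smaller value spreads one stream across the nodes more quickly, at the cost of more frequent switching.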

Load balancing, in conjunction with indexer acknowledgment, is of key importance in a clustered deployment because it helps ensure that you don't lose any data in case of node failure. If a forwarder does not receive indexer acknowledgment from the node it is sending data to, it resends the data to the next available node in the load-balanced group.

Forwarders using the indexer discovery feature always use load balancing to send data to the set of peer nodes. You can enable weighted load balancing, which means that the forwarder distributes data based on each peer's disk capacity. For example, a peer with a 400GB disk receives twice the data of a peer with a 200GB disk. See "Use weighted load balancing."
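The 400GB versus 200GB example can be checked with a few lines of arithmetic. The sketch below is only an illustration of capacity-proportional weighting: the function names and peer names are hypothetical, and choosing a peer with a weighted random pick is an assumption, since the text above does not describe how the forwarder applies the weights.

```python
import random


def capacity_shares(capacities_gb):
    """Fraction of the data each peer would receive, by disk capacity."""
    total = sum(capacities_gb.values())
    return {peer: cap / total for peer, cap in capacities_gb.items()}


def pick_peer(capacities_gb, rng):
    """Pick one peer at random, weighted by its disk capacity."""
    peers = list(capacities_gb)
    weights = [capacities_gb[p] for p in peers]
    return rng.choices(peers, weights=weights, k=1)[0]


capacities = {"peer1": 400, "peer2": 200}  # hypothetical peers, sizes in GB
shares = capacity_shares(capacities)
print(shares)  # peer1 gets about two thirds of the data, peer2 about one third

rng = random.Random(0)
picks = [pick_peer(capacities, rng) for _ in range(10)]
print(picks)
```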

For further information on:

load balancing with indexer discovery, see "Use indexer discovery to connect forwarders to peer nodes."
load balancing without indexer discovery, see "Set up load balancing" in the Forwarding Data manual.
how load balancing works with indexer acknowledgment, read "Protect against loss of in-flight data" in the Forwarding Data manual.
