How data distribution works
Splunk servers running on any supported OS platform can forward data to one another (and to other systems) in real time. This allows data inputs gathered on one Splunk server in a specific environment to be sent to another Splunk server for indexing and search. Splunk servers can also forward data to groups of other Splunk servers to enable horizontal scaling via clustered indexing, and can clone data to multiple groups of Splunk servers to provide data redundancy in high-availability environments.
Data distribution covers all configurations in which one Splunk server (the forwarder) sends data to one or more other Splunk servers (the receivers) before the data is indexed. The forwarder can also index data locally.
Note: All Splunk instances in a distributed cluster must be running the same version of Splunk, although they can run on any of the supported OSes. Each receiving Splunk server must have a unique, valid Splunk Enterprise license.
Forwarding
Forwarding is the simplest data distribution setup: one server (the forwarder) sends data to another server (the receiver) for indexing.
Learn how to enable forwarding and receiving.
Routing
With routing enabled, the forwarder matches patterns in the events themselves to selectively send some events to one server and other events to another server.
Learn how to enable data routing.
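The idea of pattern-based routing can be sketched in a few lines of Python. This is only a conceptual model, not Splunk's implementation or its configuration syntax; the patterns, the server names, and the fallback destination are all hypothetical.

```python
import re

# Hypothetical routing rules: (pattern, destination). Illustrative names only.
ROUTES = [
    (re.compile(r"\bERROR\b"), "server_a"),
    (re.compile(r"\bsshd\b"), "server_b"),
]
DEFAULT_DESTINATION = "server_c"  # assumed fallback for events matching no rule


def route_event(event):
    """Return the destination for an event, using the first matching pattern."""
    for pattern, destination in ROUTES:
        if pattern.search(event):
            return destination
    return DEFAULT_DESTINATION
```

In this sketch the first matching rule wins; whether Splunk evaluates routing conditions in that order is not stated here, so treat the ordering as an assumption.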
Cloning
Cloning refers specifically to a forwarder sending every event to two or more other Splunk servers to provide for data redundancy.
Learn how to enable cloning.
Data balancing
Data balancing refers to sending data in a balanced fashion to groups of servers. This setup supports large volumes of data. All of the forwarders send data to some number of receivers, and the data is spread across the receivers for indexing in round-robin fashion.
Data balanced target groups are made up of multiple servers. Learn how to set up data balancing.
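Round-robin distribution, as described above, might look like the following Python sketch. It is a simplified model for illustration, not Splunk code; the function name and the receiver names in the usage note are hypothetical.

```python
from itertools import cycle


def distribute_round_robin(events, receivers):
    """Assign each event to the next receiver in turn; return {receiver: [events]}."""
    assignments = {receiver: [] for receiver in receivers}
    next_receiver = cycle(receivers)  # endlessly repeats the receiver list in order
    for event in events:
        assignments[next(next_receiver)].append(event)
    return assignments
```

For example, distributing five events across two receivers named "idx1" and "idx2" gives the first, third, and fifth events to "idx1" and the second and fourth to "idx2".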
Buffering during data balancing
If a server becomes inaccessible during data balancing, Splunk continues to send events to all accessible servers.
Eventually, Splunk stops trying to send to an unresponsive server and notes that the server has gone offline. If all servers are inaccessible, Splunk writes to a buffer on the forwarder's side.
Data buffering values are set in outputs.conf on the forwarding side.
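The failover and buffering behavior can be modeled roughly as follows. This Python sketch is a toy model under stated assumptions, not Splunk's implementation: the class and method names are hypothetical, the buffer is assumed to be bounded and to drop its oldest event when full, and buffered events are assumed to be flushed in order once a receiver becomes reachable again.

```python
from collections import deque


class ForwarderModel:
    """Toy forwarder: balances across reachable receivers, buffers on total outage."""

    def __init__(self, receivers, max_buffered=1000):
        self.receivers = list(receivers)
        self.accessible = set(receivers)  # receivers currently reachable
        self.delivered = {r: [] for r in receivers}
        # Assumption: bounded buffer that discards the oldest event when full.
        self.buffer = deque(maxlen=max_buffered)
        self._turn = 0

    def send(self, event):
        """Deliver to the next accessible receiver, or buffer if none is reachable."""
        candidates = [r for r in self.receivers if r in self.accessible]
        if not candidates:
            self.buffer.append(event)
            return None
        target = candidates[self._turn % len(candidates)]
        self._turn += 1
        self.delivered[target].append(event)
        return target

    def mark_inaccessible(self, receiver):
        """Stop sending to a receiver that has gone offline."""
        self.accessible.discard(receiver)

    def mark_accessible(self, receiver):
        """Mark a receiver reachable again and flush buffered events in order."""
        self.accessible.add(receiver)
        pending = list(self.buffer)
        self.buffer.clear()
        for event in pending:
            self.send(event)
```

The `max_buffered` parameter stands in for the buffering values that the real forwarder reads from outputs.conf; it does not correspond to an actual setting name.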
Target groups
Rather than sending data to a single receiver, forwarders can send to target groups. A target group consists of one or more receiving servers:
[target group 1]
server 1, server 2
[target group 2]
server 3
[target group 3]
server 4, server 5, server 6
Cloning sends every event to all target groups.
Routing sends specific events to one target group and different events to other target groups.
You can also set up default groups, which receive all the data not sent to target groups. If more than one group is specified, Splunk clones events to all listed default groups.
defaultGroup=<groupname1>,<groupname2>...
Learn more about target group configuration.
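To make the relationship between routing, cloning, and default groups concrete, here is a small Python sketch. It models only the group-selection logic described in this section and is not Splunk's configuration parser; the group names, server names, and function names are hypothetical.

```python
# Hypothetical target groups mirroring the layout shown in this section.
TARGET_GROUPS = {
    "group1": ["server1", "server2"],
    "group2": ["server3"],
    "group3": ["server4", "server5", "server6"],
}


def parse_default_groups(value):
    """Split a comma-separated defaultGroup value into a list of group names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def groups_for_event(routed_group, default_groups):
    """Return the groups that receive an event.

    An explicitly routed event goes to its one group; any other event is
    cloned to every listed default group.
    """
    if routed_group is not None:
        return [routed_group]
    return list(default_groups)
```

With a defaultGroup value of "group1, group3", an event that is not routed elsewhere is cloned to both groups, while an event routed to "group2" goes only there.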
Security
Any Splunk server can route some or all of its incoming data in real time to other Splunk servers and to other systems via TCP, either in clear text or via SSL. Learn how to set up SSL.
Send to third-party systems
By default, data is routed between Splunk servers as cooked data, meaning that events have been parsed and tagged. However, Splunk can be configured to either receive or send raw data in order to interact with third-party systems.
Learn how to configure Splunk to send to or receive from third-party software.
Distributed search
Splunk servers can distribute search requests to other Splunk servers and merge the results before returning them to the user. Distributed search combines with balanced indexing to provide horizontal scaling, allowing you to search and index hundreds of gigabytes or terabytes per day. Additionally, distributed search allows select users to correlate data across different data silos.
Learn more about distributed search.
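The merge step of a distributed search can be pictured with the Python sketch below. It is an illustration only, not how Splunk merges results internally: it assumes each server returns its results as (timestamp, event) tuples already sorted newest first, and the function name is hypothetical.

```python
import heapq


def merge_results(*per_server_results):
    """Merge per-server result lists, each sorted newest-first by timestamp.

    Each result is a (timestamp, event) tuple; the merged list is also newest-first.
    """
    return list(
        heapq.merge(*per_server_results, key=lambda result: result[0], reverse=True)
    )
```

Because each input is already sorted, the merge only compares the head of each list, so the combined result is produced without re-sorting everything on the server that issued the search.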
Configuration files for data distribution
- The forwarder uses the TCP output processor, configured by outputs.conf.
- Configure the receiver via inputs.conf.
- Conditions for routing are established in transforms.conf and linked to specific sources, source types or hosts in props.conf.
This documentation applies to the following versions of Splunk: 3.2, 3.2.1, 3.2.2, 3.2.3, 3.2.4, 3.2.5, 3.2.6





