Use forwarders to get data into Splunk Cloud Platform
You can get data into Splunk Cloud Platform in a number of ways. The best way depends on the source of the data and what you want to do with that data. You use one or more instances of the following tools to get data into Splunk Cloud Platform:
- Forwarders.
A forwarder is a Splunk Enterprise instance that has been optimized to send data. You might use multiple forwarders to send data depending on the volume and location of your source data.
- Inputs Data Manager (IDM).
The IDM is a hosted solution specific to Splunk Cloud Platform for scripted and modular inputs. In a majority of cases, an IDM eliminates the need for customer-managed infrastructure. For more information, see Work with Inputs Data Manager (IDM) in the Splunk Cloud Platform Admin Manual.
Customers on the Splunk Cloud Platform Victoria Experience don't need to use an IDM. For more information, see Determine your Splunk Cloud Platform Experience.
Forwarder types for getting data into Splunk Cloud Platform
Usually, to get data from your customer site to Splunk Cloud Platform, you use a forwarder. Splunk forwarders send data from a data source to your Splunk Cloud Platform deployment for indexing, which makes the data searchable. Forwarders are lightweight processes, so they can usually run on the machines where the data originates.
There are two types of forwarders that you can use to get data in:
- To forward data to Splunk Cloud Platform, you typically use the Splunk universal forwarder. A universal forwarder is a dedicated, streamlined version of Splunk Enterprise that contains only the essential components needed to send data. The universal forwarder is often the best tool for forwarding data to indexers. Its main limitation is that it forwards only unparsed data, with a few exceptions. To send event-based data to indexers, you must use either a heavy forwarder or an IDM.
- A heavy forwarder is a full Splunk Enterprise instance with some features disabled to achieve a smaller footprint. For data sources that must be collected using programmatic means, for example, APIs and database access, or if you need to route and filter data, use a heavy forwarder to avoid running these kinds of inputs in the search head tier. Because the heavy forwarder adds metadata to the messages, you might see as much as two to three times the network traffic when you use a heavy forwarder.
Splunk Cloud Platform and the forwarder credentials app
When you work with forwarders to send data to Splunk Cloud Platform, you must download an app that has the credentials specific to your Splunk Cloud Platform instance. You install the forwarder credentials app on your universal forwarder, heavy forwarder, or deployment server, and it lets you connect to Splunk Cloud Platform.
Use a deployment server to deliver configurations to multiple forwarders
If you have multiple forwarders, you might need to use a deployment server to manage them. A deployment server is a Splunk Enterprise instance that acts as a centralized configuration manager. It groups together and collectively manages any number of forwarders. Forwarder instances that are remotely configured by deployment servers are called deployment clients. The deployment server distributes content, such as configuration files and apps, to deployment clients. This simplifies the process of delivering these files to multiple locations for each configuration change.
Work with an intermediate forwarding tier
An intermediate forwarder tier is a collection of heavy forwarders that act as an aggregation point for data sent from other forwarders in your network. Once the data arrives at the intermediate tier, it is forwarded to Splunk Cloud Platform. By sending data from other forwarders through the intermediate forwarder tier, you can limit the hosts that will communicate with Splunk Cloud Platform over the internet.
As your intermediate tier uses heavy forwarders, you can also route or drop data before it leaves your network for Splunk Cloud Platform. For more information about data parsing and routing, see Use a heavy forwarder to get data into Splunk Cloud Platform.
When implementing an intermediate forwarder tier, try to maintain a 2:1 or greater ratio of intermediate forwarders to your Splunk Cloud Platform indexers. You can accomplish this by adding intermediate forwarder nodes, or by configuring your intermediate forwarder nodes to use multiple pipelines. For more information, see Configure a forwarder to handle multiple pipeline sets in the Forwarder Manual.
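The ratio guidance above can be turned into a quick sizing estimate. The following is a minimal sketch, not an official sizing tool. It assumes that each pipeline set on an intermediate forwarder counts as one forwarder toward the ratio, which is one reading of the guidance above. The function name and parameters are hypothetical.

```python
import math


def intermediate_forwarders_needed(indexers, pipelines_per_forwarder=1, ratio=2.0):
    """Estimate how many intermediate forwarder nodes reach the target ratio.

    Assumption (not stated by the source): every pipeline set counts as one
    forwarder when comparing the intermediate tier to the indexer count.
    """
    if indexers < 1 or pipelines_per_forwarder < 1:
        raise ValueError("indexers and pipelines_per_forwarder must be at least 1")
    # Total pipelines required to satisfy ratio:1 against the indexers.
    required_pipelines = math.ceil(ratio * indexers)
    # Nodes required when each node contributes several pipelines.
    return math.ceil(required_pipelines / pipelines_per_forwarder)


print(intermediate_forwarders_needed(6))      # single pipeline per node
print(intermediate_forwarders_needed(6, 2))   # two pipeline sets per node
```

Under this assumption, doubling the pipeline sets per node halves the number of nodes needed, at the cost of more CPU and memory use on each node.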
Use a universal forwarder to get data into Splunk Cloud Platform
The universal forwarder is the best choice for a large set of data collection requirements from systems in your environment. It is a purpose-built data collection mechanism with minimal resource requirements. The universal forwarder is the default choice for collecting and forwarding log data.
By default, the universal forwarder can forward a maximum of 256 KB of data per second. As a best practice, do not exceed this limit. For more information, read Possible thruput limits in the Splunk Enterprise Troubleshooting Manual.
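To put the default limit in perspective, the following sketch converts the 256 KB per second ceiling into a daily volume. It assumes a sustained rate for a full 24 hours and binary units (1 GB = 1024 × 1024 KB). Both are simplifications, and the function name is hypothetical.

```python
DEFAULT_MAX_KB_PER_SECOND = 256   # default universal forwarder limit cited above
SECONDS_PER_DAY = 24 * 60 * 60
KB_PER_GB = 1024 * 1024           # assumption: binary units


def daily_ceiling_gb(max_kb_per_second=DEFAULT_MAX_KB_PER_SECOND):
    """Return the most data, in GB, one forwarder can send in a day at this rate."""
    return max_kb_per_second * SECONDS_PER_DAY / KB_PER_GB


print(f"{daily_ceiling_gb():.2f} GB per day at the default limit")
```

A source that produces more than this amount per day on a single host is likely to fall behind at the default setting.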
The universal forwarder provides the following functionality:
- Checkpoint and restart function for lossless data collection.
- Efficient protocol that minimizes network bandwidth utilization.
- Throttling capabilities.
- Built-in load balancing across available indexers.
- Optional network encryption using SSL or TLS.
- Data compression (use only without SSL or TLS).
- Multiple input methods (files, Windows Event logs, network inputs, scripted inputs).
- Limited event filtering capabilities (Windows event logs only).
- Parallel ingestion pipeline support to increase throughput and reduce latency.
The universal forwarder does not parse log sources into events, so it cannot perform any action that requires the ability to format the logs. It also ships with a stripped-down version of Python, which makes it incompatible with any modular input apps that require a full Splunk platform to function.
It is normal for a large number of universal forwarders (from hundreds to tens of thousands) to be deployed on endpoints and servers in a Splunk platform environment and to be centrally managed, either with a Splunk deployment server, or a third-party configuration management tool, such as Puppet or Chef.
Some endpoints do not allow installation of the universal forwarder, such as network devices, appliances, and systems that send logs using the syslog protocol. These are special considerations and are not covered in this document.
The following diagram shows how universal forwarders send data to Splunk Cloud Platform:
Use a heavy forwarder to get data into Splunk Cloud Platform
The heavy forwarder is a full Splunk Enterprise instance configured to act as a forwarder with indexing disabled. A heavy forwarder generally performs no other Splunk Enterprise functions (for example, do not use a heavy forwarder to search data). The main difference between a universal forwarder and a heavy forwarder is that the heavy forwarder contains the full parsing pipeline, performing the identical functions an indexer performs, without writing and indexing events on disk. This configuration enables the heavy forwarder to parse and act on individual events. For example, a heavy forwarder can mask data or perform filtering and routing based on event data. Because it is a full Splunk Enterprise installation, it can host modular inputs that require a full Python stack to function properly for data collection or serve as an endpoint for the Splunk HTTP event collector (HEC).
The heavy forwarder has the following characteristics:
- Parses data into events
- Filters and routes based on individual event data
- Has a larger resource footprint than the universal forwarder
- Has a larger network bandwidth footprint than the universal forwarder
- Has a GUI for management
In general, heavy forwarders are not installed on endpoints for the purpose of data collection. Instead, they are used on standalone systems to implement data collection nodes (DCN) or intermediary forwarding tiers. Use a heavy forwarder only when requirements to collect data from other systems can't be met with a universal forwarder. Instances in which a heavy forwarder is required include the following examples:
- Reading data from an RDBMS for the purposes of ingesting it into Splunk Cloud Platform (database inputs).
- Collecting data from systems that are reachable using an API, such as cloud services, VMware monitoring, and proprietary systems.
- Providing a dedicated tier to host the HTTP event collector service.
- Implementing an intermediary forwarding tier that requires a parsing forwarder for routing, filtering, and masking.
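For the dedicated HTTP event collector tier in the list above, the following sketch shows how a client might build a request for a heavy forwarder that hosts HEC. The host name, token, and sourcetype are placeholders. Port 8088 and the `/services/collector/event` endpoint are assumed to be left at their usual defaults, so confirm both against your own HEC configuration. The request is built but not sent, so the sketch runs without a network connection.

```python
import json
import urllib.request


def build_hec_request(host, token, event, sourcetype="_json", port=8088):
    """Build (but do not send) a POST request for an HTTP event collector.

    host, token, and sourcetype are placeholders for illustration only.
    """
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=f"https://{host}:{port}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HEC expects the token in an Authorization header of this form.
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_hec_request(
    host="hf.example.com",                           # hypothetical heavy forwarder
    token="00000000-0000-0000-0000-000000000000",   # placeholder token
    event={"message": "hello from a test client"},
)
print(request.get_method(), request.full_url)
# To send the event, pass the request to urllib.request.urlopen(request).
```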
If you want to set up a heavy forwarder to send data to Splunk Cloud Platform, request a deployment server license from Splunk Support. This license allows the heavy forwarder to carry out functions above and beyond what is covered by the forwarder license. See Data collection in the Splunk Cloud Platform Service Description.
For more information about heavy forwarders, see the Splunk Forwarding Data manual.
The following example shows a heavy forwarder used to send data to Splunk Cloud Platform:
This documentation applies to the following versions of Splunk Cloud Platform™: 8.0.2006, 8.0.2007, 8.1.2009, 8.1.2011, 8.1.2012, 8.1.2101, 8.1.2103, 8.2.2104, 8.2.2105 (latest FedRAMP release), 8.2.2106, 8.2.2107, 8.2.2109