Use forwarders to get data into Splunk Cloud
This section helps you decide on the best way to get data into your Splunk Cloud instance. The best way depends on the source of the data and what you intend to do with it. You use one or more of the following tools to get data into Splunk Cloud:
- Forwarders. A forwarder is a version of Splunk Enterprise optimized to send data. A universal forwarder is a purpose-built data collection mechanism with minimal resource requirements, whereas a heavy forwarder is a full Splunk Enterprise deployment configured to act as a forwarder with indexing disabled. See Work with forwarders.
- Inputs Data Manager (IDM). The IDM is a hosted solution for Splunk Cloud for scripted and modular inputs. In a majority of cases, an IDM obviates the need for customer-managed infrastructure. See Work with Inputs Data Manager.
Work with forwarders
Usually, to get data from your customer site to Splunk Cloud, you use a forwarder. There are two types of forwarders commonly used to get data in:
- A universal forwarder is a dedicated, streamlined version of Splunk Enterprise that contains only the essential components needed to send data. The universal forwarder is usually the best tool for forwarding data to indexers. Its main limitation is that it forwards only unparsed data. To send event-based data to indexers, you must use a heavy forwarder or IDM.
- A heavy forwarder is a full Splunk Enterprise instance, with some features disabled to achieve a smaller footprint. For data sources that have to be collected using programmatic means (APIs, database access), or if you need to do data routing and filtering, deploying a heavy forwarder as a data collection node (DCN) is recommended. It is not recommended that you run these kinds of inputs on the search head tier in anything other than a development environment. Because the heavy forwarder adds metadata to the messages, you may see as much as two to three times the network traffic when you use a heavy forwarder.
Work with the forwarder app
When you work with forwarders, you download a forwarder app (from Splunk Cloud Home > Universal Forwarder > Download Universal Forwarder Credentials) that has the credentials specific to your Splunk Cloud instance. You install this app on your universal forwarder, heavy forwarder, or deployment server, and it allows you to connect to Splunk Cloud.
Work with a deployment server
If you have multiple forwarders, you may need to use a deployment server to manage them. A deployment server is a Splunk Enterprise instance that acts as a centralized configuration manager, grouping together and collectively managing any number of forwarders. Instances that are remotely configured by deployment servers are called deployment clients. The deployment server distributes updated content, such as configuration files and apps, to deployment clients. Units of such content are known as deployment apps.
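The grouping idea can be pictured with a short Python sketch. This is only a conceptual illustration, not how the deployment server is implemented or configured: the group names, host name patterns, and deployment app names are all hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical grouping rules: group name -> (host name patterns, deployment apps).
GROUPS = {
    "linux_servers": (["web-*", "db-*"], ["cloud_credentials", "linux_inputs"]),
    "windows_servers": (["win-*"], ["cloud_credentials", "windows_inputs"]),
}


def apps_for_client(hostname, groups=GROUPS):
    """Return the sorted deployment apps that a deployment client should receive."""
    apps = set()
    for patterns, group_apps in groups.values():
        # A client belongs to a group if its host name matches any pattern.
        if any(fnmatch(hostname, pattern) for pattern in patterns):
            apps.update(group_apps)
    return sorted(apps)


print(apps_for_client("web-01"))
```

A client that matches no group receives no deployment apps in this sketch.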
Work with an intermediate forwarding tier
In some cases, you may use an intermediate forwarding tier. An intermediate forwarding tier is a collection of universal forwarders that simply relays data to Splunk Cloud. This can be helpful if you want to limit the number of servers that need direct access to the Internet. It also reduces the overhead of updating firewall rules each time a server is added to or removed from the environment.
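As a rough sketch of what this topology implies for configuration, the following Python helpers render the two files involved: an outputs.conf for endpoint forwarders that points at the intermediate tier, and an inputs.conf for the intermediate forwarders that listens for forwarded data. The host names, the group name, and port 9997 are assumptions for illustration; the intermediate forwarders themselves would still use the forwarder app to reach Splunk Cloud. Check the stanzas against the outputs.conf and inputs.conf references for your version before using them.

```python
def render_endpoint_outputs(intermediate_hosts, port=9997, group="intermediate_tier"):
    """Render outputs.conf text for an endpoint forwarder (names are hypothetical)."""
    servers = ", ".join(f"{host}:{port}" for host in intermediate_hosts)
    return (
        "[tcpout]\n"
        f"defaultGroup = {group}\n"
        "\n"
        f"[tcpout:{group}]\n"
        f"server = {servers}\n"
    )


def render_intermediate_inputs(port=9997):
    """Render inputs.conf text so an intermediate forwarder accepts forwarded data."""
    return f"[splunktcp://{port}]\n"


print(render_endpoint_outputs(["fwd1.example.com", "fwd2.example.com"]))
print(render_intermediate_inputs())
```

With this layout, only the intermediate forwarders need firewall rules that permit outbound traffic to Splunk Cloud.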
Some endpoints do not allow installation of the universal forwarder, such as network devices, appliances, and systems that log using the syslog protocol. These require special consideration and are not covered in this document.
Use a Universal Forwarder to get data into Splunk Cloud
The universal forwarder is the best choice for a large set of data collection requirements from systems in your environment. It is a purpose-built data collection mechanism with very minimal resource requirements. The universal forwarder should be the default choice for collecting and forwarding log data. The universal forwarder provides:
- Checkpoint/restart function for lossless data collection.
- Efficient protocol that minimizes network bandwidth utilization.
- Throttling capabilities.
- Built-in load balancing across available indexers.
- Optional network encryption using SSL/TLS.
- Data compression (use only without SSL/TLS).
- Multiple input methods (files, Windows Event logs, network inputs, scripted inputs).
- Limited event filtering capabilities (Windows event logs only).
- Parallel ingestion pipeline support to increase throughput/reduce latency.
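The checkpoint/restart idea in the list above can be illustrated with a minimal Python sketch. It is not the universal forwarder's actual mechanism: the checkpoint file format and the `send` callback are hypothetical, and the sketch ignores log rotation. It only shows why saving the read position after a successful send lets collection resume without losing data.

```python
import json
import os


def forward_new_lines(log_path, checkpoint_path, send):
    """Send lines appended since the last checkpoint; return how many were sent."""
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume complete lines only; a partial trailing line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()

    if lines:
        send(lines)  # If this raises, the checkpoint below is never advanced.
        with open(checkpoint_path, "w") as f:
            json.dump({"offset": offset + end}, f)
    return len(lines)
```

Because the offset is written only after `send` returns, a crash or restart between calls causes the same lines to be read again rather than skipped.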
With few exceptions for well-structured data (JSON, CSV, TSV), the universal forwarder does not parse log sources into events, so it cannot perform any action that requires understanding of the format of the logs. It also ships with a stripped-down version of Python, which makes it incompatible with any modular input apps that require a full Splunk stack to function. It is normal for a large number of universal forwarders (hundreds to tens of thousands) to be deployed on endpoints and servers in a Splunk environment and centrally managed, either with a Splunk deployment server or with a third-party configuration management tool such as Puppet or Chef.
The following example shows universal forwarders used to send data to Splunk Cloud:
Use a Heavy Forwarder to get data into Splunk Cloud
The heavy forwarder is a full Splunk Enterprise deployment configured to act as a forwarder with indexing disabled. A heavy forwarder generally performs no other Splunk roles. The key difference between a universal forwarder and a heavy forwarder is that the heavy forwarder contains the full parsing pipeline, performing the identical functions an indexer performs, without actually writing and indexing events on disk. This enables the heavy forwarder to understand and act on individual events, for example, to mask data or to perform filtering and routing based on event data. Since it is a full Splunk Enterprise installation, it can host modular inputs that require a full Python stack to function properly for data collection or serve as an endpoint for the Splunk HTTP event collector (HEC).
The heavy forwarder has the following characteristics:
- Parses data into events.
- Filters and routes based on individual event data.
- Has a larger resource footprint than the universal forwarder.
- Has a larger network bandwidth footprint than the universal forwarder.
- Provides a GUI for management.
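To make "filters and routes based on individual event data" concrete, here is a small Python sketch of per-event filtering and masking. On a heavy forwarder this behavior is set up through Splunk configuration rather than written in Python; the DEBUG rule and the card-number pattern are hypothetical examples chosen for illustration.

```python
import re

# Hypothetical pattern: a 16-digit number written as four dash-separated groups.
CARD_RE = re.compile(r"\b\d{4}-\d{4}-\d{4}-(\d{4})\b")


def process_event(event):
    """Return the event to forward, or None if the event should be dropped."""
    # Filtering: drop events that carry no value downstream.
    if " DEBUG " in event:
        return None
    # Masking: keep only the last four digits of anything that looks like a card number.
    return CARD_RE.sub(r"XXXX-XXXX-XXXX-\1", event)


print(process_event("2020-06-01 INFO paid with 1234-5678-9012-3456"))
```

Both decisions depend on the content of a single event, which is why they require a forwarder that parses data into events first.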
In general, heavy forwarders are not installed on endpoints for the purpose of data collection. Instead, they are used on standalone systems to implement data collection nodes (DCN) or intermediate forwarding tiers. Use a heavy forwarder only when requirements to collect data from other systems cannot be met with a universal forwarder. Examples of such requirements include:
- Reading data from RDBMS for the purposes of ingesting it into Splunk (database inputs).
- Collecting data from systems that are reachable via an API (cloud services, VMware monitoring, proprietary systems, etc.).
- Providing a dedicated tier to host the HTTP event collector service.
- Implementing an intermediate forwarding tier that requires a parsing forwarder for routing/filtering/masking.
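For the HTTP event collector case in the list above, a client sends events to the collector over HTTP. The following Python sketch builds such a request without sending it. The host name, port 8088, and token value are placeholders, and the sketch assumes the standard `/services/collector/event` endpoint and `Authorization: Splunk <token>` header; confirm both against the HEC documentation for your deployment.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype=None):
    """Build (but do not send) an HTTP event collector request for one event."""
    payload = {"event": event}
    if sourcetype:
        payload["sourcetype"] = sourcetype
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; replace with values from your own environment.
request = build_hec_request(
    "https://hf.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"message": "user logged in"},
    sourcetype="app_json",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch can run without a collector.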
If you want to set up a heavy forwarder to send data to Splunk Cloud, request a deployment server license from Splunk support to allow it to carry out functions above and beyond what is covered by the forwarder license. See the section on data collection in the Splunk Cloud Service Description.
The following example shows a heavy forwarder used to send data to Splunk Cloud:
Work with Inputs Data Manager (IDM)
The IDM is a hosted solution for Splunk Cloud for scripted and modular inputs. In a majority of cases, an IDM obviates the need for customer-managed infrastructure. IDM is an ideal solution for cloud inputs because it allows you to avoid using a forwarder to get data to your Splunk Cloud instance. However, an IDM is not a one-to-one replacement for a heavy forwarder. You still need to use a heavy forwarder if you need to perform parsing or activities other than standard scripted and modular data inputs. As a best practice, install cloud-based add-ons on an IDM, and install on-premises-based add-ons on a universal forwarder or heavy forwarder.
Note: If the add-on is tightly integrated with an Enterprise Security search head, you should not use IDM.
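The guidance in this topic can be condensed into a small decision helper. The Python sketch below is a simplification under stated assumptions: the function and parameter names are hypothetical, and it ignores special cases such as add-ons that are tightly integrated with an Enterprise Security search head.

```python
def choose_collection_tool(source_is_cloud, needs_api_or_modular_input, needs_event_processing):
    """Suggest a data collection tool based on the guidance in this topic."""
    # Parsing, filtering, routing, or masking requires the full parsing pipeline.
    if needs_event_processing:
        return "heavy forwarder"
    # Modular or API-based inputs need more than the universal forwarder provides.
    if needs_api_or_modular_input:
        return "IDM" if source_is_cloud else "heavy forwarder"
    # Default choice for collecting and forwarding log data.
    return "universal forwarder"


print(choose_collection_tool(
    source_is_cloud=True,
    needs_api_or_modular_input=True,
    needs_event_processing=False,
))
```

Treat the result as a starting point for discussion rather than a rule; real deployments often combine several of these tools.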
The following graphic shows typical topology for an IDM in which data is sent directly from a Cloud data source to the IDM:
This documentation applies to the following versions of Splunk Cloud™: 8.0.2003, 8.0.2004, 8.0.2006