How the Edge Processor solution works
The Edge Processor solution combines Splunk-managed cloud services, on-premises data processing software, and the Search Processing Language, version 2 (SPL2) to support data processing at the edge of your network. The Edge Processor solution consists of the following main components:
|Component|Description|Details|
|---|---|---|
|Edge Processor|A data processing engine that allocates resources for processing and routing data|You install Edge Processors on machines in your local network. Edge Processors provide an on-premises data plane that lets you reduce and sanitize your data before sending it outside of your network.|
|Edge Processor service|A cloud service that provides a centralized console for managing Edge Processors|Splunk hosts the Edge Processor service as part of Splunk Cloud Platform. The Edge Processor service provides a cloud control plane that lets you deploy configurations, monitor the status of your Edge Processors, and gain visibility into the amount of data that is moving through your network.|
|Pipeline|A set of data processing instructions written in SPL2, which is the data search and preparation language used by Splunk software|In the Edge Processor service, you create pipelines to specify what data to process, how to process it, and what destination to send the processed data to. Then, you apply pipelines to your Edge Processors to configure them to start processing data according to those instructions.|
By using the Edge Processor solution, you can process data in your own local network while also managing and monitoring your data ingest ecosystem from a Splunk-managed cloud service.
The following diagram provides an overview of these elements:
- The components that comprise the Edge Processor solution, and whether each component is hosted in the Splunk cloud environment or your local environment. See the System architecture section on this page for more information.
- The path your data takes as it moves from source to destination through an Edge Processor. See the Data pathway section on this page for more information.
System architecture
The primary components of the Edge Processor solution include the Edge Processor service, Edge Processors, and SPL2 pipelines that support data processing.
Edge Processor service
The Edge Processor service is a cloud service hosted by Splunk. It is part of the data management experience, which is a set of services that fulfill a variety of data ingest and processing use cases.
You can use the Edge Processor service to do the following:
- Configure and install Edge Processors in your local environment for on-location data processing.
- Create and apply SPL2 pipelines that determine how each Edge Processor processes and routes the data that it receives.
- Define source types to identify the kind of data that you want to process and determine how Edge Processors break and merge that data into distinct events.
- Create connections to the destinations that you want your Edge Processors to send processed data to.
You access the Edge Processor service by logging in to your tenant in the Splunk cloud environment. Your tenant is connected with your Splunk Cloud Platform deployment, and uses it as an identity provider for managing user accounts and logins. To log in and access the Edge Processor service, use the same username and password as you would when logging in to your Splunk Cloud Platform deployment.
The connection between the tenant and the Splunk Cloud Platform deployment also allows the Edge Processor solution to use the deployment as a storage location for the logs and metrics that are generated by Edge Processors. The Edge Processor service retrieves these logs and metrics from the deployment and displays them in the user interface of the service.
These Edge Processor logs and metrics contain only information about the operational status of a given Edge Processor. They do not contain any of the actual data that you are ingesting and processing through Edge Processors. See the Edge Processors section that follows for more details.
Edge Processors
An Edge Processor is a data processing engine that allocates resources for processing and routing data. You can install an Edge Processor on a single server node in your network or on a cluster of multiple server nodes. Multi-instance Edge Processors provide more powerful data processing capabilities than single-instance Edge Processors. Be aware that multiple Edge Processor instances cannot run on the same machine, so you must install each instance on a different machine.
Each Edge Processor instance is associated with a supervisor, which contacts the cloud service at regular intervals to check for system updates, provide telemetry data, and confirm that the instance is still connected to the service. When you use the Edge Processor service to change your Edge Processor configurations or pipeline definitions, or when Splunk releases new features or bug fixes for Edge Processors, the supervisor detects these changes and automatically updates the instance as needed.
The supervisor sends the following information from the Edge Processor instance to the Edge Processor service in the cloud:
- Configuration information. This includes details such as the following:
  - The list of applied pipelines
  - The datasets that represent the selected data sources and destinations
  - The names of the Splunk indexes that the Edge Processor sends internal logs and metrics to
  - The version of the Edge Processor software that the instance is running
- Heartbeats that indicate the status of the Edge Processor instance and confirm if the instance is still connected to the service. These heartbeats include information such as the following:
  - Whether the instance is running or stopped
  - How much CPU and memory the instance is consuming
  - The version of the Edge Processor software that the instance is running
As an Edge Processor works to process data, it generates logs and metrics containing operational information such as the amount of data that was processed and any events, warnings, or errors that have occurred. The Edge Processor sends these logs and metrics to the Splunk Cloud Platform deployment that is connected to the tenant.
The information that an Edge Processor instance and its supervisor send to the cloud does not contain any of the actual data that is being ingested and processed. The data that you send through an Edge Processor is transmitted only to the destinations that you choose in the Edge Processor configuration settings and the applied pipelines.
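As a rough illustration of the two kinds of information that the supervisor reports, the following Python sketch models a supervisor report. The class names, field names, and the build_supervisor_report function are hypothetical and do not reflect the actual format that Splunk uses. The sketch only shows that the report carries configuration and status details and has no field for ingested event data.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class ConfigurationInfo:
    """Configuration details reported by the supervisor (hypothetical field names)."""
    applied_pipelines: List[str]
    source_and_destination_datasets: List[str]
    internal_log_indexes: List[str]
    software_version: str


@dataclass
class Heartbeat:
    """Status details confirming that the instance is still connected (hypothetical field names)."""
    running: bool
    cpu_percent: float
    memory_mb: float
    software_version: str


def build_supervisor_report(config: ConfigurationInfo, heartbeat: Heartbeat) -> Dict[str, dict]:
    """Combine configuration and heartbeat details into one dictionary.

    Neither part has a field for the events that the Edge Processor ingests.
    """
    return {"configuration": asdict(config), "heartbeat": asdict(heartbeat)}
```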
Pipelines
A pipeline is a set of data processing instructions written in SPL2. When you create a pipeline, you write a specialized SPL2 statement that specifies which data to process, how to process it, and where to send the results. For example, you can create a pipeline that filters for syslog events and sends them to a dedicated index in Splunk Cloud Platform. When you apply a pipeline to an Edge Processor, the Edge Processor uses those instructions to process all the data that it receives from data sources such as Splunk forwarders, HTTP clients, and logging agents.
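The three parts of a pipeline (which data to process, how to process it, and where to send the results) can be pictured with a small Python model. This is only an analogy: actual pipelines are written in SPL2, and the Pipeline class, its field names, and the syslog_index destination name below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]


@dataclass
class Pipeline:
    """Hypothetical model of a pipeline; real pipelines are SPL2 statements."""
    matches: Callable[[Event], bool]     # which data to process
    transform: Callable[[Event], Event]  # how to process it
    destination: str                     # where to send the results

    def run(self, events: List[Event]) -> List[Tuple[str, Event]]:
        """Return (destination, processed event) pairs for every matching event."""
        return [(self.destination, self.transform(e)) for e in events if self.matches(e)]


# Mirrors the example in the text: keep syslog events and send them,
# unchanged, to a dedicated index (the index name is an assumption).
syslog_to_index = Pipeline(
    matches=lambda e: e.get("sourcetype") == "syslog",
    transform=lambda e: e,
    destination="syslog_index",
)
```

Calling syslog_to_index.run on a mixed list of events would return only the syslog events, each paired with the syslog_index destination.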
The Edge Processor solution supports a subset of SPL2 commands and functions. Pipelines can include only the commands and functions that are part of the EdgeProcessor profile. For information about the specific SPL2 commands and functions that you can use to write pipelines for Edge Processors, see Edge Processor pipeline syntax. For a summary of how the EdgeProcessor profile supports different commands and functions compared to other SPL2 profiles, see the following pages in the SPL2 Search Reference:
- Compatibility Quick Reference for SPL2 commands
- Compatibility Quick Reference for SPL2 evaluation functions
Data pathway
Data moves through the Edge Processor solution as follows:
- A tool, machine, or piece of software in your network generates data such as event logs or traces.
- An agent, such as a Splunk forwarder, receives the data and then sends it to an Edge Processor. Alternatively, the device or software that generated the data can send it to an Edge Processor without using an agent.
- The Edge Processor filters and transforms the data, and then sends the resulting processed data to a destination such as a Splunk index. The SPL2 pipelines applied to the Edge Processor determine how it processes and routes the data.
Edge Processors route processed data to the destinations specified in the pipelines that you applied. If no pipeline applies to the incoming data, then the Edge Processor routes that data, unprocessed, to its default destination. If you don't specify a default destination, then the Edge Processor drops the unprocessed data.
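The routing rules can be summarized in a short Python sketch. The route function and its argument shapes are hypothetical and are not part of any Splunk API. The sketch also assumes that one event can match more than one pipeline, which the text does not state.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = Dict[str, object]
# Each pipeline is modeled as (applies_to, process, destination); all names are illustrative.
PipelineSpec = Tuple[Callable[[Event], bool], Callable[[Event], Event], str]


def route(
    event: Event,
    pipelines: List[PipelineSpec],
    default_destination: Optional[str] = None,
) -> List[Tuple[str, Event]]:
    """Return (destination, event) pairs describing where one incoming event goes."""
    deliveries = []
    for applies_to, process, destination in pipelines:
        if applies_to(event):
            # An applicable pipeline processes the event and sets its destination.
            deliveries.append((destination, process(event)))
    if deliveries:
        return deliveries
    if default_destination is not None:
        # No applicable pipeline: send the event unprocessed to the default destination.
        return [(default_destination, event)]
    # No applicable pipeline and no default destination: the event is dropped.
    return []
```

An empty result corresponds to dropped data, which is why setting a default destination matters if you want to keep events that no pipeline handles.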
As the Edge Processor receives and processes data, it records metrics that indicate the volume of data that was received, processed, and sent to a destination. These metrics are stored in the _metrics index of the Splunk Cloud Platform deployment that is connected to your tenant. The Edge Processor service surfaces the metrics in the dashboard, providing detailed overviews of the amount of data that is moving through the system.
For information about how to set up and use specific components of the Edge Processor solution, see the following resources:
- About the Edge Processor solution
- First-time setup instructions for the Edge Processor solution
This documentation applies to the following versions of Splunk Cloud Platform™: 9.0.2209, 9.0.2303, 9.0.2305 (latest FedRAMP release)