Splunk Cloud Platform

Use Edge Processors

Set up alerts for Edge Processor metrics

As an Edge Processor administrator, you can set up alerts that trigger when Edge Processor metrics meet certain criteria. These alerts help you monitor the health and status of your Edge Processors so that you can troubleshoot potential issues. You set up these alerts from the Splunk Cloud Platform deployment in your cloud tenant.

The following table lists searches that you can use to set up alerts for Edge Processor metrics, along with action items that you can take when an alert condition occurs. You create these searches and alerts using Splunk Cloud Platform functionality. For more information on how to configure alerts in Splunk Cloud Platform, see Getting started with alerts in the Splunk Cloud Platform Alerting Manual.

Metric: Edge Processor queue size
Alert trigger condition: The queue size is above a certain threshold, for example 70%. This indicates that you need to increase your queue size.
Example search: SPL query to see the latest queue size for each exporter:
| mstats latest(exporter_queue_size) as current_queue_size where index=_metrics by exporter
Action item: Increase your queue size to process more data. See these topics for more information:

Metric: Destination connection
Alert trigger condition: The Edge Processor fails to connect to a destination. This indicates that your destination configuration might be incorrect or the destination might be offline.
Example search: SPL query to see connectivity failures per dataset:
| mstats sum(egress_heartbeat_error_total) as heartbeat_failures_total where index=_metrics by dataset_name
Action item: Verify that the destination information is correct for Edge Processors by checking the edge.log file. See View logs for the Edge Processor solution for more information.

Metric: Destination data send failure
Alert trigger condition: The Edge Processor fails to send data to a destination, and the number of resulting errors is above a certain threshold. This indicates that your destination configuration might be incorrect or the destination might be offline.
Example search: SPL query to see total send errors per dataset:
| mstats sum(write_to_sink_errors_total) as export_failures_total where index=_metrics by dataset_name
Action item: Verify that the destination information is correct for Edge Processors by checking the edge.log file. See View logs for the Edge Processor solution for more information.

Metric: CPU usage
Alert trigger condition: The CPU usage on your host is above a certain threshold, leaving little idle CPU time. This indicates that the host CPU can't handle the required workload.
Example search: SPL query to see the CPU usage by state for each host:
| mstats sum(system.cpu.time) where index=_metrics by host,state
Action item: Determine what is causing the high CPU usage and take action accordingly, such as increasing CPU specifications or adding another host to manage traffic. See An Edge Processor instance is in the "Warning" status for more information.

Metric: Memory usage
Alert trigger condition: The memory usage on your host is above a certain threshold. This indicates that the host memory can't handle the required workload.
Example search: SPL query to see memory usage in bytes per host:
| mstats latest(system.memory.usage) where index=_metrics by host
Action item: Determine what is causing the high memory usage and take action accordingly, such as increasing memory specifications. See An Edge Processor instance is in the "Warning" status for more information.
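
As a rough sketch of how one of these example searches can become an alert condition, the following search extends the queue size search so that it returns results only for exporters whose queue is more than 70% full. The capacity value of 1000 and the field names assumed_queue_capacity and queue_used_pct are assumptions made for illustration and do not come from the Edge Processor documentation. Substitute the queue capacity and threshold that apply to your deployment.

```spl
| mstats latest(exporter_queue_size) as current_queue_size where index=_metrics by exporter
| eval assumed_queue_capacity = 1000
| eval queue_used_pct = round(100 * current_queue_size / assumed_queue_capacity, 1)
| where queue_used_pct > 70
```

If you save a search like this as an alert that triggers when the number of results is greater than 0, the alert fires whenever at least one exporter exceeds the threshold. The same pattern, a where clause applied to the aggregated field, might also suit the error-count searches, for example by keeping only rows where heartbeat_failures_total or export_failures_total is greater than a value you choose.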
Last modified on 26 April, 2024

This documentation applies to the following versions of Splunk Cloud Platform: 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)

