
Set up alerts for Edge Processor metrics
As an Edge Processor administrator, you can set up alerts that trigger when Edge Processor metrics meet certain criteria, so that you can monitor the health and status of your Edge Processors and troubleshoot potential issues. You set up these alerts from the Splunk Cloud Platform deployment in your cloud tenant.
The following table lists the searches that you can use to set up alerts for Edge Processor metrics, along with actions that you can take when an alert triggers. You create these searches and alerts using Splunk Cloud Platform functionality. For more information on how to configure alerts in Splunk Cloud Platform, see Getting started with alerts in the Splunk Cloud Platform Alerting Manual.
Metrics | Alert trigger conditions | Example search | Action item |
---|---|---|---|
Edge Processor metrics | If no metrics are received for a certain period, for example, 15 minutes. This indicates that your pipelines are not operating as expected. | stats latest(_time) | Verify your pipeline configuration and determine why your metrics might be missing. See Verify your Edge Processor and pipeline configurations for more information. |
Edge Processor queue size | If the queue size is above a certain threshold, for example, 70%. This indicates that you need to increase your queue size. | stats latest(exporter_queue_size) by service_instance_id | Increase your queue size to process more data. See these topics for more information: Verify your Edge Processor and pipeline configurations; Troubleshoot the Edge Processor solution. |
Destination connection | If the Edge Processor fails to connect to a destination. This indicates that the destination might be misconfigured or offline. | stats sum(egress_heartbeat_error_total) by dataset_name | Verify that the destination information for your Edge Processors is correct by checking edge.log. See Verify your Edge Processor and pipeline configurations for more information. |
Destination data send failure | If the Edge Processor fails to send data to a destination and the resulting errors are above a certain threshold. This indicates that the destination might be misconfigured or offline. | stats sum(write_to_sink_errors_total) by dataset_name | Verify that the destination information for your Edge Processors is correct by checking edge.log. See Verify your Edge Processor and pipeline configurations for more information. |
CPU usage | If CPU usage on your host is above a certain threshold, that is, the share of CPU time spent in the idle state is low. This indicates that the host CPU can't handle the required workload. | stats sum(system.cpu.time) by host,state | Determine what is causing the high CPU usage and take action accordingly, such as by increasing CPU specifications or adding another host to share the traffic. See Troubleshoot the Edge Processor solution for more information. |
Memory usage | If memory usage on your host is above a certain threshold. This indicates that the host memory can't handle the required workload. | stats latest(system.memory.usage) by host | Determine what is causing the high memory usage and take action accordingly, such as by increasing memory specifications. See Troubleshoot the Edge Processor solution for more information. |
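
The example searches in the table are aggregation fragments rather than complete searches. As a rough sketch only, the queue size fragment could be extended into an alert search as follows. The items in angle brackets are placeholders and not values from this documentation: the base search that returns your Edge Processor metrics and the queue capacity used to turn the raw value into a percentage are both assumptions that you need to supply for your own deployment. This sketch also assumes that exporter_queue_size is reported as an item count rather than as a percentage.

```spl
<your_base_search_for_edge_processor_metrics>
| stats latest(exporter_queue_size) AS queue_size by service_instance_id
| eval queue_pct = round(queue_size / <your_queue_capacity> * 100, 1)
| where queue_pct > 70
```

Because the final where command keeps only the instances that are over the threshold, an alert based on this search can use a trigger condition of number of results greater than 0. If your metrics are stored in a metrics index, the aggregation would typically be done with the mstats command instead of stats.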
This documentation applies to the following versions of Splunk Cloud Platform™: 9.0.2209, 9.0.2303, 9.0.2305 (latest FedRAMP release)