Set up alerts for the splunkd health report
The splunkd health report generates alerts for all features in the health status tree. When a feature indicator meets its threshold condition, the feature's health status changes, for example from green to red, and an alert fires. Use health status alerts to maintain visibility into the health of your deployment, whether or not you are logged in to Splunk Web.
You can configure health report alerts as follows:
- Enable/disable alerts on the global, feature, or indicator level.
- Send alert notifications via email or PagerDuty.
- Set the health status color (yellow or red) that triggers an alert.
- Set a minimum duration that must elapse between alerts.
You can configure health report alerts by directly editing health.conf or by querying the server/health-config endpoint.
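For orientation, the settings covered in this topic are all plain health.conf settings. A hypothetical global stanza combining them (the values shown are illustrative, not recommendations) might look like this:

[health_reporter]
alert.disabled = 0
alert.threshold_color = red
alert.min_duration_sec = 300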
Disable health report alerts
Alerts are enabled by default for all features in the splunkd health report. You can disable alerts at the global, feature, or indicator level. Disabling alerts at the global level overrides enabled alerts at the feature level. Likewise, disabling alerts at the feature level overrides enabled alerts at the indicator level.
Disabling alerts is useful for reducing noise from non-critical features and minimizing false positives when performing maintenance tasks.
Disable alerts using health.conf
To disable alerts for all features in the splunkd health report:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- In the [health_reporter] stanza, set alert.disabled = 1. For example:

[health_reporter]
full_health_log_interval = 30
suppress_status_update = 600
alert.disabled = 1
To enable alerts for all features in the splunkd health report, set alert.disabled = 0.
To disable alerts for a single feature:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- In the stanza for the particular feature, set alert.disabled = 1. For example:

[feature:indexers]
...
indicator:missing_peers:yellow = 1
indicator:missing_peers:red = 1
alert.disabled = 1
To enable alerts for a feature, set alert.disabled = 0.
To disable alerts for a single feature indicator:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- In the stanza for the particular feature, set alert:<indicator_name>.disabled = 1. For example, in the following stanza, alerting for the indicator s2s_connections is disabled:

[feature:s2s_autolb]
...
indicator:s2s_connections:yellow = 20
indicator:s2s_connections:red = 70
alert:s2s_connections.disabled = 1
To enable alerts for an indicator, set alert:<indicator_name>.disabled = 0.
Disable alerts using REST endpoint
To disable alerts for features and indicators, send a POST request to server/health-config/{feature_name}. For example, to disable alerts for the batchreader feature on the instance you are monitoring, run the following command:
curl -k -u admin:pass https://<host>:<mPort>/services/server/health-config/feature:batchreader -d alert.disabled=1
For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.
To access server/health-config endpoints, a role must have the edit_health capability.
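The curl command above maps onto a simple request shape. The following is a minimal Python sketch of that shape, assuming a local instance on the default management port; the helper name and its defaults are illustrative, not part of the Splunk API. Only the endpoint path and the alert.disabled setting come from this topic.

```python
from urllib.parse import urlencode

# Illustrative helper (not part of the Splunk API): build the POST
# target URL and form body for a server/health-config update. The
# endpoint path and setting names come from the documentation above.
def health_config_post(stanza, settings, host="localhost", port=8089):
    url = f"https://{host}:{port}/services/server/health-config/{stanza}"
    return url, urlencode(settings)

# Same request as the curl example: disable alerts for the
# batchreader feature.
url, body = health_config_post("feature:batchreader", {"alert.disabled": 1})
```

The returned url and body correspond to the curl arguments; you still send them with the HTTP client of your choice, authenticating as a user that holds the edit_health capability.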
Set up health report alert actions
You can set up alert actions that run when an alert fires, such as sending alert notifications via email, mobile device, or PagerDuty.
Alert actions apply on the global level only. Multiple alert actions of the same type are not supported. For example, you cannot have multiple email actions or multiple PagerDuty actions.
Before you can send health email alert notifications, you must configure email notification settings in Splunk Web. For instructions, see Email notification action in the Alerting Manual.
Set up email notifications in health.conf
To set up email alert notifications:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- Add the [alert_action:email] stanza and specify the recipients. For example:

[alert_action:email]
disabled = 0
action.to = <recipient@example.com>
action.cc = <recipient_2@example.com>
action.bcc = <other_recipients@example.com>
Set up PagerDuty notifications in health.conf
Before you can send alert notifications to PagerDuty, you must install the PagerDuty App from Splunkbase. You must also add a new service to your PagerDuty integration, and copy the integration key. For more information, see PagerDuty documentation.
To set up PagerDuty alert notifications:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- Add the [alert_action:pagerduty] stanza and specify the integration key. For example:

[alert_action:pagerduty]
disabled = 0
action.integration_url_override = <integration key>
For more information, see health.conf.example.
Set up alert notifications using REST
To set up alert notifications, send a POST request to server/health-config/{alert_action}. For example, to set up an email alert notification:
curl -k -u admin:pass https://localhost:8089/services/server/health-config/alert_action:email -d action.to=admin@example.com -d action.cc=admin2@example.com
For endpoint details, see server/health-config/{alert_action} in the REST API Reference Manual.
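The example above covers the email action. Because server/health-config/{alert_action} takes the stanza name and its settings, the PagerDuty action can presumably be configured the same way, reusing the stanza name and setting shown earlier in this topic. A sketch, with the integration key as a placeholder you copy from PagerDuty:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/alert_action:pagerduty -d action.integration_url_override=<integration key>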
Set the alert threshold color
You can set the threshold color that triggers an alert. Possible alert threshold values are yellow or red. If the threshold value is yellow, an alert fires for both yellow and red. If the value is red, an alert fires for red only. The default alert threshold value is red.
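The threshold rule above can be captured in a toy model (illustrative Python, not Splunk source code): a status color triggers an alert when it is at least as severe as the configured threshold color.

```python
# Toy model of the alert.threshold_color rule described above
# (illustrative only, not Splunk source code).
SEVERITY = {"green": 0, "yellow": 1, "red": 2}

def alert_fires(status_color, threshold_color):
    # threshold_color is "yellow" or "red" per the documentation
    return SEVERITY[status_color] >= SEVERITY[threshold_color]

# Threshold yellow: fires for yellow and red. Threshold red: red only.
assert alert_fires("yellow", "yellow") and alert_fires("red", "yellow")
assert alert_fires("red", "red") and not alert_fires("yellow", "red")
```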
Set the alert threshold color in health.conf
To set the alert threshold color on the global or feature level:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- Add the alert.threshold_color setting to the [health_reporter] or [feature:<feature_name>] stanza. For example:

[feature:replication_failures]
...
alert.threshold_color = yellow
indicator:replication_failures:red = 10
indicator:replication_failures:yellow = 5
To set the alert threshold color on the indicator level:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- Add the alert:<indicator_name>.threshold_color setting to the feature stanza. For example:

[feature:replication_failures]
...
indicator:replication_failures:red = 10
indicator:replication_failures:yellow = 5
alert:replication_failures.threshold_color = yellow
Alert threshold color settings at the indicator level override alert threshold color settings at the feature level.
Set the alert threshold color using REST
To set the alert threshold color for a feature or indicator, send a POST request to server/health-config/{feature_name}. For example, to set the alert threshold color for the Replication Failures feature:
curl -k -u admin:pass https://localhost:8089/services/server/health-config/feature:replication_failures -d alert.threshold_color=yellow
For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.
Set minimum duration between alerts
You can set the amount of time an unhealthy health status persists before an alert fires using the alert.min_duration_sec setting. You can use this setting to help reduce noise from feature health status changes that might be rapidly flipping between states, for example, between green and yellow or yellow and red.
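Conceptually, the setting acts as a debounce: an alert fires only once the unhealthy status has persisted for the configured number of seconds. A toy Python model of that rule (illustrative only, not Splunk source code):

```python
# Toy model (not Splunk source code): with alert.min_duration_sec,
# an alert fires only after the unhealthy color has persisted for
# at least that many seconds.
def should_alert(unhealthy_since, now, min_duration_sec):
    return (now - unhealthy_since) >= min_duration_sec

# With alert.min_duration_sec = 600, a status that flips back to
# green after 5 minutes never produces an alert.
assert not should_alert(unhealthy_since=0, now=300, min_duration_sec=600)
assert should_alert(unhealthy_since=0, now=600, min_duration_sec=600)
```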
Set minimum duration between alerts in health.conf
To set the minimum duration between alerts on the global or feature level:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- Add the alert.min_duration_sec setting to the [health_reporter] or [feature:<feature_name>] stanza. For example:
[feature:replication_failures]
...
alert.min_duration_sec = 600
indicator:replication_failures:red = 10
indicator:replication_failures:yellow = 5
To set the minimum duration between alerts on the indicator level:
- Edit $SPLUNK_HOME/etc/system/local/health.conf.
- Add the alert:<indicator_name>.min_duration_sec setting to the [feature:<feature_name>] stanza. For example:
[feature:replication_failures]
...
indicator:replication_failures:red = 10
indicator:replication_failures:yellow = 5
alert:replication_failures.min_duration_sec = 600
Minimum duration between alerts settings on the indicator level override settings on the feature level.
Set minimum duration between alerts using REST
To set the minimum duration between alerts for a feature or indicator, send a POST request to server/health-config/{feature_name}. For example, to set the minimum duration between alerts for the Replication Failures feature:
curl -k -u admin:pass https://localhost:8089/services/server/health-config/feature:replication_failures -d alert.min_duration_sec=600
For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.
This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9