Splunk® Enterprise

Monitoring Splunk Enterprise

Set up alerts for the splunkd health report

The splunkd health report generates alerts for all features in the health status tree. When a feature indicator meets the threshold condition, the feature's health status changes, for example from green to red, and an alert fires. Use health status alerts to maintain visibility into the health of your deployment, whether or not you are logged into Splunk Web.

You can configure health report alerts as follows:

  • Enable/disable alerts on the global, feature, or indicator level.
  • Send alert notifications via email or PagerDuty.
  • Set the health status color (yellow or red) that triggers an alert.
  • Set a minimum duration that must elapse between alerts.

You can configure health report alerts by directly editing health.conf or querying the server/health-config endpoint.

Disable health report alerts

Alerts are enabled by default for all features in the splunkd health report. You can disable alerts at the global, feature, or indicator level. Disabling alerts at the global level overrides enabled alerts at the feature level. Likewise, disabling alerts at the feature level overrides enabled alerts at the indicator level.

Disabling alerts is useful for reducing noise from non-critical features and minimizing false positives when performing maintenance tasks.
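
The override order described above can be sketched in a few lines of Python. This is only an illustration of the stated precedence, not how splunkd implements it; the function name alerts_enabled and its parameters are hypothetical.

```python
def alerts_enabled(global_disabled: bool,
                   feature_disabled: bool,
                   indicator_disabled: bool) -> bool:
    """Return True if an indicator's alert can fire.

    Models the precedence stated in the text: a disabled global
    setting overrides the feature level, and a disabled feature
    setting overrides the indicator level.
    """
    if global_disabled:
        return False  # global level wins over everything below it
    if feature_disabled:
        return False  # feature level wins over its indicators
    return not indicator_disabled


# Alerts enabled at the indicator level but disabled globally do not fire.
print(alerts_enabled(global_disabled=True,
                     feature_disabled=False,
                     indicator_disabled=False))
```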

Disable alerts using health.conf

To disable alerts for all features in the splunkd health report:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. In the health_reporter stanza, set alert.disabled = 1. For example:
    [health_reporter]
    full_health_log_interval = 30
    suppress_status_update = 600
    alert.disabled = 1
    

    To enable alerts for all features in the splunkd health report, set alert.disabled = 0.

To disable alerts for a single feature:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. In the stanza for the particular feature, set alert.disabled = 1. For example:
    [feature:indexers]
    ...
    indicator:missing_peers:yellow = 1
    indicator:missing_peers:red = 1
    alert.disabled = 1
    

    To enable alerts for a feature, set alert.disabled = 0.

To disable alerts for a single feature indicator:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. In the stanza for a particular feature, set alert:<indicator_name>.disabled = 1. For example, in the following stanza, alerting for the indicator s2s_connections is disabled:
    [feature:s2s_autolb]
    ...
    indicator:s2s_connections:yellow = 20
    indicator:s2s_connections:red = 70
    alert:s2s_connections.disabled = 1
    

    To enable alerts for an indicator, set alert:<indicator_name>.disabled = 0.
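
To check a stanza after editing it, a small script can read the indicator-level setting back. The following Python sketch assumes the stanza parses as simple INI text and looks only at the text it is given; it does not model Splunk configuration file layering. The helper name indicator_alert_disabled is hypothetical.

```python
import configparser

# Sample stanza taken from the example above (without the "..." placeholder).
SAMPLE = """
[feature:s2s_autolb]
indicator:s2s_connections:yellow = 20
indicator:s2s_connections:red = 70
alert:s2s_connections.disabled = 1
"""


def indicator_alert_disabled(conf_text: str, feature: str, indicator: str) -> bool:
    """Return True if alert:<indicator>.disabled is set to 1 for the feature."""
    # Use only "=" as the delimiter, because setting names contain colons.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    section = f"feature:{feature}"
    if not parser.has_section(section):
        return False
    value = parser.get(section, f"alert:{indicator}.disabled", fallback="0")
    return value.strip() == "1"


print(indicator_alert_disabled(SAMPLE, "s2s_autolb", "s2s_connections"))
```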

Disable alerts using REST endpoint

To disable alerts for features and indicators, send a POST request to server/health-config/{feature_name}. For example, to disable alerts for the batchreader feature on the instance you are monitoring, run the following command:

curl -k -u admin:pass https://<host>:<mPort>/services/server/health-config/feature:batchreader -d alert.disabled=1

For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.

To access server/health-config endpoints, a role must have the edit_health capability.
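
The same request can be prepared from a script. The following Python sketch builds, but does not send, a POST request equivalent to the curl command above. The host localhost, port 8089, and the admin:pass credentials are placeholders, and the helper name build_disable_alert_request is hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_disable_alert_request(host: str, port: int, feature: str,
                                user: str, password: str) -> urllib.request.Request:
    """Build a POST request that sets alert.disabled=1 for one feature."""
    url = f"https://{host}:{port}/services/server/health-config/feature:{feature}"
    # Form-encoded body, matching curl's -d alert.disabled=1.
    body = urllib.parse.urlencode({"alert.disabled": "1"}).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    # HTTP basic authentication, matching curl's -u user:password.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


req = build_disable_alert_request("localhost", 8089, "batchreader", "admin", "pass")
print(req.get_method(), req.full_url)
```

To send the request, pass it to urllib.request.urlopen. The curl -k flag skips certificate verification, which has no counterpart in this sketch; how to handle the certificate of your management port is left to your environment.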

Set up health report alert actions

You can set up alert actions that run when an alert fires, such as sending alert notifications via email, mobile device, or PagerDuty.

Alert actions apply on the global level only. Multiple alert actions of the same action type are not supported. For example, you cannot have multiple email actions or multiple PagerDuty actions.

Before you can send health email alert notifications, you must configure email notification settings in Splunk Web. For instructions, see Email notification action in the Alerting Manual.

Set up email notifications in health.conf

To set up email alert notifications:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the [alert_action:email] stanza and specify the recipients. For example:
    [alert_action:email]
    disabled = 0
    action.to = <recipient@example.com>
    action.cc = <recipient_2@example.com>
    action.bcc = <other_recipients@example.com>
    

Set up PagerDuty notifications in health.conf

Before you can send alert notifications to PagerDuty, you must install the PagerDuty App from Splunkbase. You must also add a new service to your PagerDuty integration, and copy the integration key. For more information, see PagerDuty documentation.

To set up PagerDuty alert notifications:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the [alert_action:pagerduty] stanza and specify the integration key. For example:
    [alert_action:pagerduty]
    disabled = 0
    action.integration_url_override = <integration key>
    

For more information, see health.conf.example.

Set up alert notifications using REST

To set up alert notifications, send a POST request to server/health-config/{alert_action}. For example, to set up an email alert notification:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/alert_action:email -d action.to=admin@example.com -d action.cc=admin2@example.com

For endpoint details, see server/health-config/{alert_action} in the REST API Reference Manual.

Set the alert threshold color

You can set the threshold color that triggers an alert. Possible alert threshold values are yellow or red. If the threshold value is yellow, an alert fires for both yellow and red. If the value is red, an alert fires for red only. The default alert threshold value is red.
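
The threshold rule above amounts to a severity comparison. The following Python sketch is an illustration only; the names SEVERITY and alert_fires are hypothetical and not part of any Splunk API.

```python
# Health status colors ordered from least to most severe.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def alert_fires(status: str, threshold_color: str = "red") -> bool:
    """Return True if a health status meets the alert threshold color.

    The default threshold is red, as stated in the text.
    """
    if threshold_color not in ("yellow", "red"):
        raise ValueError("threshold_color must be 'yellow' or 'red'")
    return SEVERITY[status] >= SEVERITY[threshold_color]


# A yellow threshold fires for yellow and red; a red threshold fires for red only.
for status in ("green", "yellow", "red"):
    print(status, alert_fires(status, "yellow"), alert_fires(status, "red"))
```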

Set the alert threshold color in health.conf

To set the alert threshold color on the global or feature level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert.threshold_color setting to the [health_reporter] or [feature:<feature_name>] stanza. For example:
    [feature:replication_failures]
    ...
    alert.threshold_color = yellow
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    

To set the alert threshold color on the indicator level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert:<indicator name>.threshold_color setting to the feature stanza. For example:
    [feature:replication_failures]
    ...
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    alert:replication_failures.threshold_color = yellow
    

Alert threshold color settings at the indicator level override alert threshold color settings at the feature level.
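
One way to picture this override is a lookup that prefers the most specific setting. The following Python sketch makes two assumptions beyond the text: that a feature-level setting likewise overrides a global one, and that an unset value is represented by None. The function name effective_threshold_color is hypothetical.

```python
from typing import Optional


def effective_threshold_color(global_color: Optional[str] = None,
                              feature_color: Optional[str] = None,
                              indicator_color: Optional[str] = None) -> str:
    """Return the threshold color that applies to one indicator.

    The indicator level overrides the feature level (stated in the text).
    The feature level overriding the global level is an assumption.
    """
    for color in (indicator_color, feature_color, global_color):
        if color is not None:
            return color
    return "red"  # documented default threshold color


print(effective_threshold_color(feature_color="red", indicator_color="yellow"))
```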

Set the alert threshold color using REST

To set the alert threshold color for a feature or indicator, send a POST request to server/health-config/{feature_name}. For example, to set the alert threshold color for the Replication Failures feature:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/feature:replication_failures -d alert.threshold_color=yellow

For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.

Set minimum duration between alerts

Use the alert.min_duration_sec setting to specify how long, in seconds, an unhealthy health status must persist before an alert fires. This setting helps reduce noise from features whose health status flips rapidly between states, for example, between green and yellow or between yellow and red.
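
The effect on a flapping feature can be illustrated with a small simulation. The following Python sketch assumes that the health status is observed as timestamped samples and that the persistence timer resets whenever the status drops below the threshold color. Both are assumptions for illustration, and the names SEVERITY and first_alert_time are hypothetical.

```python
from typing import List, Optional, Tuple

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def first_alert_time(samples: List[Tuple[int, str]],
                     min_duration_sec: int,
                     threshold_color: str = "red") -> Optional[int]:
    """Return the timestamp of the first alert, or None if none fires.

    samples is a time-ordered list of (timestamp_sec, status) pairs.
    """
    unhealthy_since = None
    for timestamp, status in samples:
        if SEVERITY[status] >= SEVERITY[threshold_color]:
            if unhealthy_since is None:
                unhealthy_since = timestamp  # start of the unhealthy run
            if timestamp - unhealthy_since >= min_duration_sec:
                return timestamp
        else:
            unhealthy_since = None  # status recovered, reset the timer
    return None


flapping = [(0, "red"), (300, "green"), (400, "red"), (700, "green")]
steady = [(0, "red"), (300, "red"), (600, "red")]
print(first_alert_time(flapping, 600))  # no run lasts 600 seconds
print(first_alert_time(steady, 600))
```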

Set minimum duration between alerts in health.conf

To set the minimum duration between alerts on the global or feature level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert.min_duration_sec setting to the [health_reporter] or [feature:<feature_name>] stanza. For example:
    [feature:replication_failures]
    ...
    alert.min_duration_sec = 600
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    

To set the minimum duration between alerts on the indicator level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert:<indicator name>.min_duration_sec setting to the [feature:<feature_name>] stanza. For example:
    [feature:replication_failures]
    ...
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    alert:replication_failures.min_duration_sec = 600
    

Minimum duration between alerts settings on the feature level override settings on the indicator level.

Set minimum duration between alerts using REST

To set the minimum duration between alerts for a feature or indicator, send a POST request to server/health-config/{feature_name}. For example, to set the minimum duration between alerts for the Replication Failures feature:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/feature:replication_failures -d alert.min_duration_sec=600

For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.

This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.3.0, 7.3.1, 7.3.2