Splunk® Enterprise

Monitoring Splunk Enterprise

Set up alerts for the splunkd health report

The splunkd health report generates alerts for all features in the health status tree. When a feature indicator meets the threshold condition, the feature's health status changes, for example from green to red, and an alert fires. Use health status alerts to maintain visibility into the health of your deployment, whether or not you are logged into Splunk Web.

You can configure health report alerts as follows:

  • Enable/disable alerts on the global, feature, or indicator level.
  • Send alert notifications via email, mobile, VictorOps, or PagerDuty.
  • Set the health status color (yellow or red) that triggers an alert.
  • Set a minimum duration that must elapse between alerts.

You can configure health report alerts by directly editing health.conf or querying the server/health-config endpoint.

Disable health report alerts

Alerts are enabled by default for all features in the splunkd health report. You can disable alerts at the global, feature, or indicator level. Disabling alerts at the global level overrides enabled alerts at the feature level. Likewise, disabling alerts at the feature level overrides enabled alerts at the indicator level.

Disabling alerts is useful for reducing noise from non-critical features and minimizing false positives when performing maintenance tasks.
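
The override rules above can be sketched as a small shell function. This is an illustration only: the function name effective_alert_state is hypothetical and not part of Splunk. It takes the alert.disabled value (0 or 1) at the global, feature, and indicator level and reports whether an alert for that indicator can still fire.

```shell
# Hypothetical helper (not part of Splunk). Arguments are the alert.disabled
# values (0 or 1) at the global, feature, and indicator level, in that order.
effective_alert_state() {
    global_disabled=$1
    feature_disabled=$2
    indicator_disabled=$3
    # Disabling at a higher level overrides enabled alerts below it,
    # so an alert can fire only when no level is disabled.
    if [ "$global_disabled" = 1 ] || [ "$feature_disabled" = 1 ] || [ "$indicator_disabled" = 1 ]; then
        echo disabled
    else
        echo enabled
    fi
}

effective_alert_state 0 1 0   # feature disabled, indicator enabled: prints "disabled"
```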

Disable alerts using health.conf

To disable alerts for all features in the splunkd health report:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. In the health_reporter stanza, set alert.disabled = 1. For example:
    [health_reporter]
    full_health_log_interval = 30
    suppress_status_update = 600
    alert.disabled = 1
    

    To enable alerts for all features in the splunkd health report, set alert.disabled = 0.

To disable alerts for a single feature:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. In the stanza for the particular feature, set alert.disabled = 1. For example:
    [feature:indexers]
    ...
    indicator:missing_peers:yellow = 1
    indicator:missing_peers:red = 1
    alert.disabled = 1
    

    To enable alerts for a feature, set alert.disabled = 0.

To disable alerts for a single feature indicator:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. In the stanza for a particular feature, set alert:<indicator_name>.disabled = 1. For example, in the following stanza, alerting for the indicator s2s_connections is disabled:
    [feature:s2s_autolb]
    ...
    indicator:s2s_connections:yellow = 20
    indicator:s2s_connections:red = 70
    alert:s2s_connections.disabled = 1
    

    To enable alerts for an indicator, set alert:<indicator_name>.disabled = 0.
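
After you edit health.conf, one way to check which values the merged configuration contains is the btool command line tool. The following is a minimal sketch, not an official procedure. The wrapper name health_btool is hypothetical, and the fallback path /opt/splunk is an assumption that applies only when SPLUNK_HOME is not set. By default the wrapper only prints the command so that you can review it first.

```shell
# Hypothetical wrapper: prints the btool command that lists the merged
# health.conf settings for one stanza. Pass "run" as the second argument
# to execute the command instead of printing it.
health_btool() {
    stanza=$1
    mode=${2:-print}
    # /opt/splunk is an assumed default, used only if SPLUNK_HOME is unset.
    splunk_bin="${SPLUNK_HOME:-/opt/splunk}/bin/splunk"
    if [ "$mode" = run ]; then
        "$splunk_bin" btool health list "$stanza" --debug
    else
        echo "$splunk_bin btool health list $stanza --debug"
    fi
}

health_btool "feature:s2s_autolb"
```

The --debug flag makes btool show which file each setting comes from, which helps confirm that the value in etc/system/local is the one in effect.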

Disable alerts using REST endpoint

To disable alerts for features and indicators, send a POST request to server/health-config/{feature_name}. For example, to disable alerts for the batchreader feature on the instance you are monitoring, run the following command:

curl -k -u admin:pass https://<host>:<mPort>/services/server/health-config/feature:batchreader -d alert.disabled=1

For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.

To access server/health-config endpoints, a role must have the edit_health capability.
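
If you send these requests often, a small wrapper can assemble the command for you. The following sketch is hypothetical: the function name health_config_post is not part of Splunk, admin:pass is the same placeholder credential used above, and it assumes that posting alert.disabled=0 re-enables alerts in the same way that setting alert.disabled = 0 does in health.conf. The function prints the curl command rather than running it.

```shell
# Hypothetical helper: prints (does not run) a curl command that POSTs one
# or more settings to server/health-config/{feature_name}.
health_config_post() {
    host=$1
    port=$2
    feature=$3
    shift 3
    # admin:pass is a placeholder credential, as in the example above.
    cmd="curl -k -u admin:pass https://$host:$port/services/server/health-config/feature:$feature"
    # Append each remaining argument as a form field.
    for setting in "$@"; do
        cmd="$cmd -d $setting"
    done
    echo "$cmd"
}

# Re-enable alerts for the batchreader feature.
health_config_post localhost 8089 batchreader alert.disabled=0
```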

Set up health report alert actions

You can set up alert actions that run when feature health status changes trigger an alert. Alert actions include sending notifications via email, mobile, VictorOps, or PagerDuty. Alert actions apply on the global level only. Multiple alert actions for the same action type are not supported. For example, you cannot configure more than one email action or more than one PagerDuty action.

When you set up health report alert actions, you must specify the alert actions in the [health_reporter] stanza in $SPLUNK_HOME/etc/system/local/health.conf. You can specify alert actions in a comma-separated list. For example:

[health_reporter] 
alert_actions=email, mobile, VictorOps, PagerDuty

You must also set up the specific alert action using the appropriate alert action stanza, as shown in the following sections.
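
Because each action listed in alert_actions needs its own stanza, a quick consistency check can catch a missing one. The following sketch is hypothetical and not a Splunk tool: the function check_alert_actions reads a health.conf file and reports any listed action that has no [alert_action:<name>] stanza. It compares names without regard to case as a convenience of this sketch. Whether splunkd treats the names as case-insensitive is not confirmed here.

```shell
# Hypothetical check: every action named in alert_actions should have a
# matching [alert_action:<name>] stanza in the same file.
check_alert_actions() {
    conf=$1
    rc=0
    # Extract the alert_actions value, split on commas, trim blanks, lowercase.
    actions=$(sed -n 's/^alert_actions[[:space:]]*=[[:space:]]*//p' "$conf" \
        | tr ',' '\n' | tr -d ' \t\r' | tr '[:upper:]' '[:lower:]')
    for action in $actions; do
        if ! grep -qi "^\[alert_action:$action]" "$conf"; then
            echo "missing stanza: [alert_action:$action]"
            rc=1
        fi
    done
    return "$rc"
}

# Demonstration against a scratch file, not a live health.conf.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
[health_reporter]
alert_actions = email, mobile

[alert_action:email]
disabled = 0
action.to = recipient@example.com
EOF
check_alert_actions "$demo_conf" || true   # prints "missing stanza: [alert_action:mobile]"
rm -f "$demo_conf"
```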

Set up email alert notifications in health.conf

Before you can send health email alert notifications, you must configure email notification settings in Splunk Web. For instructions on how to configure email notifications, see Email notification action in the Alerting Manual.

To set up email alert notifications using Splunk Web:

  1. Log in to Splunk Web as a user in the admin role.
  2. Go to Settings > Health report manager.
  3. Click Manage Alerts.
  4. Configure the email recipient for alerts.
  5. (Optional) Populate the CC and BCC fields with additional email addresses.
  6. Click Save.

To manually configure the email settings:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the [alert_action:email] stanza and specify the recipients. For example:
    [alert_action:email] 
    disabled = 0
    action.to = <recipient@example.com>
    action.cc = <recipient_2@example.com>
    action.bcc = <other_recipients@example.com>
    

Set up mobile alert notifications in health.conf

Before you configure mobile alert notifications for the health report, you must download the Splunk Mobile iOS app from the App Store to your phone, and download the Splunk Cloud Gateway app from Splunkbase to your Splunk instance. For details on how to install and set up the Splunk Cloud Gateway app, see Install Splunk Cloud Gateway.

To set up mobile alert notifications:

  1. In Splunk Web, click Settings > Alert actions. In the Send to mobile row, confirm that splunk_app_cloudgateway is enabled.
  2. In Splunk Web, click Apps > Splunk Cloud Gateway > Register. Enter the activation code displayed in the Splunk Mobile app on your mobile device.
  3. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  4. Add the [alert_action:mobile] stanza and specify the recipients. For example:
    [alert_action:mobile] 
    disabled = 0
    action.alert_recipients = admin
    

Set up VictorOps alert notifications in health.conf

Before you can send alert notifications to VictorOps, you must install the VictorOps App from Splunkbase. You must also create an API key and optionally a routing key in VictorOps. For instructions on how to create an API key for VictorOps, see the Splunk Integration Guide for VictorOps.

To set up VictorOps alert notifications:

  1. In Splunk Web, click Settings > Alert actions. In the VictorOps row, confirm the victorops_app is enabled and click Setup VictorOps Notifications.
  2. Enter the API key and optional Routing Key.
  3. Click Save.
  4. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  5. Add the [alert_action:victorops] stanza and specify the alert message type. For example:
    [alert_action:victorops] 
    disabled = 0
    action.message_type = CRITICAL
    

For more information on VictorOps alert settings, including valid alert message types and optional settings, see health.conf.example in the Admin Manual.

Set up PagerDuty alert notifications in health.conf

Before you can send alert notifications to PagerDuty, you must install the PagerDuty App from Splunkbase. You must also add a new service to your PagerDuty integration, and copy the integration key. For more information, see the PagerDuty documentation.

To set up PagerDuty alert notifications:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the [alert_action:pagerduty] stanza and specify the integration key. For example:
    [alert_action:pagerduty]
    disabled = 0
    action.integration_url_override = <integration key>
    

For more information, see health.conf.example.

Set up alert actions using REST

To set up alert actions, send a POST request to server/health-config/{alert_action}. For example, to set up an email alert notification:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/alert_action:email -d action.to=admin@example.com -d action.cc=admin2@example.com

For endpoint details, see server/health-config/{alert_action} in the REST API Reference Manual.

Set up distributed health report alert actions

You can set up alert actions, such as sending email or mobile alert notifications, directly on the distributed health report's central instance. This lets you configure alert actions in one location, which simplifies alert configuration compared to the single-instance (local) health report, where you must set up alert actions on each instance individually.

The distributed health report must be enabled to receive alerts from your deployment.

To set up distributed health report email alert actions:

  1. Log in to the distributed health report's central instance. In a typical environment, this is the cluster manager or search head captain.
  2. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  3. Make sure the distributed health report is enabled, as shown:
    [distributed_health_reporter]
    disabled = 0
    
  4. Log in to Splunk Web as a user in the admin role.
  5. Go to Settings > Health report manager.
  6. Click Manage Alerts.
  7. Configure the email recipient for alerts.
  8. (Optional) Populate the CC and BCC fields with additional email addresses.
  9. Click Save.

For details on how to configure alternate alert actions, see Set up health report alert actions.

If you enable alerts on both the central instance and on an individual instance, the distributed health report can send duplicate alerts. To avoid duplicate alerts, disable alerts on individual instances.

Set the alert threshold color

You can set the threshold color that triggers an alert. Possible alert threshold values are yellow or red. If the threshold value is yellow, an alert fires for both yellow and red. If the value is red, an alert fires for red only. The default alert threshold value is red.
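
The threshold rule can be expressed as a short shell function. This is a sketch for illustration, and the function name alert_fires is hypothetical. It takes the configured threshold color and the current health status color and reports whether an alert fires.

```shell
# Hypothetical helper (not part of Splunk).
alert_fires() {
    threshold=$1   # alert.threshold_color: yellow or red
    health=$2      # current health status color: green, yellow, or red
    case "$threshold:$health" in
        # A yellow threshold fires on yellow and red; a red threshold on red only.
        yellow:yellow|yellow:red|red:red) echo yes ;;
        *) echo no ;;
    esac
}

alert_fires yellow red   # prints "yes"
alert_fires red yellow   # prints "no"
```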

Set the alert threshold color in health.conf

To set the alert threshold color on the global or feature level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert.threshold_color setting to the [health_reporter] or [feature:<feature_name>] stanza. For example:
    [feature:replication_failures]
    ...
    alert.threshold_color = yellow
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    

To set the alert threshold color on the indicator level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert:<indicator name>.threshold_color setting to the feature stanza. For example:
    [feature:replication_failures]
    ...
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    alert:replication_failures.threshold_color = yellow
    

Alert threshold color settings at the indicator level override alert threshold color settings at the feature level.
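
The following sketch shows one way to reason about which threshold color applies to an indicator. The function name effective_threshold is hypothetical. It uses the indicator-over-feature rule and the default of red stated in this topic. The fallback from the feature level to the global [health_reporter] level is an assumption of this sketch.

```shell
# Hypothetical helper (not part of Splunk). Pass an empty string for any
# level where the threshold color is not set.
effective_threshold() {
    global=$1
    feature=$2
    indicator=$3
    if [ -n "$indicator" ]; then
        echo "$indicator"   # indicator level overrides feature level
    elif [ -n "$feature" ]; then
        echo "$feature"
    elif [ -n "$global" ]; then
        echo "$global"      # assumed fallback to the global setting
    else
        echo red            # default alert threshold value
    fi
}

effective_threshold "" red yellow   # prints "yellow"
```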

Set the alert threshold color using REST

To set the alert threshold color for a feature or indicator, send a POST request to server/health-config/{feature_name}. For example, to set the alert threshold color for the Replication Failures feature:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/feature:replication_failures -d alert.threshold_color=yellow

For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.

Set minimum duration between alerts

Use the alert.min_duration_sec setting to set the amount of time, in seconds, that an unhealthy health status must persist before an alert fires. This setting helps reduce noise from features whose health status flips rapidly between states, for example, between green and yellow or yellow and red.
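
As a rough illustration of how the setting suppresses short-lived status changes, the following sketch compares how long a status has been unhealthy with the configured minimum. The function name should_alert is hypothetical, and treating a duration exactly equal to the minimum as sufficient is an assumption of this sketch.

```shell
# Hypothetical helper (not part of Splunk). Times are in epoch seconds.
should_alert() {
    unhealthy_since=$1    # when the status turned unhealthy
    now=$2                # current time
    min_duration_sec=$3   # value of alert.min_duration_sec
    elapsed=$((now - unhealthy_since))
    # Assumption: the alert fires once elapsed time reaches the minimum.
    if [ "$elapsed" -ge "$min_duration_sec" ]; then
        echo fire
    else
        echo hold
    fi
}

should_alert 1000 1300 600   # 300 seconds elapsed: prints "hold"
should_alert 1000 1600 600   # 600 seconds elapsed: prints "fire"
```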

Set minimum duration between alerts in health.conf

To set the minimum duration between alerts on the global or feature level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert.min_duration_sec setting to the [health_reporter] or [feature:<feature_name>] stanza. For example:
    [feature:replication_failures]
    ...
    alert.min_duration_sec = 600
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    

To set the minimum duration between alerts on the indicator level:

  1. Edit $SPLUNK_HOME/etc/system/local/health.conf.
  2. Add the alert:<indicator name>.min_duration_sec setting to the [feature:<feature_name>] stanza. For example:
    [feature:replication_failures]
    ...
    indicator:replication_failures:red = 10
    indicator:replication_failures:yellow = 5
    alert:replication_failures.min_duration_sec = 600
    

Minimum duration between alerts settings on the feature level override settings on the indicator level.

Set minimum duration between alerts using REST

To set the minimum duration between alerts for a feature or indicator, send a POST request to server/health-config/{feature_name}. For example, to set the minimum duration between alerts for the Replication Failures feature:

curl -k -u admin:pass https://localhost:8089/services/server/health-config/feature:replication_failures -d alert.min_duration_sec=600

For endpoint details, see server/health-config/{feature_name} in the REST API Reference Manual.

Last modified on 24 February, 2023

This documentation applies to the following versions of Splunk® Enterprise: 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2, 9.4.0

