Splunk® Enterprise

Monitoring Splunk Enterprise

Search: Search Head Clustering

This topic is a reference for all of the Monitoring Console dashboards related to search head clustering. See About the Monitoring Console.

Status and Configuration

The Status and Configuration dashboard provides a high-level overview of your search head cluster.

Configuration Replication

The Configuration Replication dashboard provides insight into configuration changes that a user makes on any search head cluster member (for example, a new event type) and how these changes propagate through the cluster. Use this dashboard if you notice a significant lag in this propagation.

Action reference: The following are low-level actions exposed in the Count of Actions Over Time and Time Spent on Actions Over Time panels. These panels can be helpful for troubleshooting.

Action                       Description
accept_push                  On the captain, accept replicated changes from a member.
acquire_mutex                Acquire a mutex (mutual exclusion) that "protects" the configuration system.
add_commit                   On a member, record a change.
base_initialize              Initialize a configuration "root" (e.g. $SPLUNK_HOME/etc).
check_range                  Compare two ranges of configuration changes.
compute_common               Find the latest common change between a member and the captain.
pull_from                    On a member, pull changes from the captain.
purge_eligible               On a member, purge sufficiently old changes from the repo.
push_to                      On a member, push changes to the captain.
release_and_reacquire_mutex  Release, then re-acquire a mutex that "protects" the configuration system. This is similar to acquire_mutex.
reply_pull                   On the captain, reply to a member's pull_from request.
repo_initialize              Initialize a configuration repo (from disk).

This information is intended primarily for Splunk Support. If you have issues with configuration replication, you can look at this dashboard for clues, but expect to use it mainly to gather information after you file a Support case rather than to diagnose the problem on your own.

Artifact Replication

The Artifact Replication dashboard contains several panels that describe the cluster's backlog of search artifacts to replicate. See Search head clustering architecture in the Distributed Search Manual.

The Warnings and Errors Patterns panel groups warning and error events based on text within the messages. The grouping functionality uses the cluster command.

If your search head cluster is replicating artifacts on time, its Count of Artifacts Awaiting Replication will be at or near zero. A few artifacts awaiting replication is likely not a warning sign. A consistently high, and especially a growing, number of artifacts could indicate a problem with replication. If many artifacts are waiting, someone using another search head might not get a locally cached copy and can experience slowness in search availability.

The Median Count of Artifacts to Replicate panel shows, as its name says, a median. This means that if the backlog has narrow spikes, you won't see them at larger time ranges.
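The effect of the median on narrow spikes can be illustrated with a small sketch. The sample values, the one-minute sampling interval, and the helper name bucket_stats are all assumptions made up for this illustration. They are not taken from the dashboard's underlying search.

```python
from statistics import median

# Hypothetical backlog samples, one per minute: a steady backlog of 1
# artifact with a three-minute spike in the middle. Not real Splunk data.
samples = [1] * 30 + [40, 45, 40] + [1] * 27


def bucket_stats(values, bucket_size):
    """Return a (median, max) pair for each consecutive bucket of values."""
    buckets = [values[i:i + bucket_size] for i in range(0, len(values), bucket_size)]
    return [(median(b), max(b)) for b in buckets]


fine = bucket_stats(samples, 3)     # 3-minute buckets (short time range)
coarse = bucket_stats(samples, 30)  # 30-minute buckets (long time range)

print(fine[10])    # the bucket that holds the spike
print(coarse[1])   # the same spike, absorbed into a wider bucket
```

With 3-minute buckets, the spike fills its bucket, so the median shows it. With 30-minute buckets, the spike is only 3 of 30 samples, so the median stays at the baseline while the maximum still reflects the spike.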

The Artifact Replication Job Activity panel shows the rate of change of replication jobs, reported as the backlog change. The backlog change can be negative if your cluster is catching up with its backlog. In this panel, the red flag to look for is a backlog that grows consistently, that is, a backlog change that is always positive. If this happens, the Median Count of Artifacts to Replicate panel above shows a continually growing backlog.
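As a rough sketch of how to read the backlog change, the following example derives it from a series of backlog counts and checks for the "always positive" pattern. The sample series and the function names backlog_changes and grows_consistently are hypothetical. The panel itself might compute the rate differently.

```python
def backlog_changes(backlog):
    """Return the change in backlog between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(backlog, backlog[1:])]


def grows_consistently(backlog):
    """True if every change between consecutive samples is positive."""
    changes = backlog_changes(backlog)
    return bool(changes) and all(change > 0 for change in changes)


# Hypothetical backlog counts sampled at regular intervals.
catching_up = [12, 9, 5, 2, 0]        # negative changes: cluster is catching up
falling_behind = [3, 7, 12, 18, 25]   # positive changes: backlog keeps growing

print(backlog_changes(catching_up))
print(grows_consistently(catching_up))
print(grows_consistently(falling_behind))
```

A series with occasional positive changes is not flagged here. Only a series in which every change is positive matches the red flag described above.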

Scheduler Delegation

See Search head clustering architecture in the Distributed Search Manual.

In the Scheduler Status panel, note that max_pending and max_running are "highwater marks" over a 30-second period. That is, they are the highest number of jobs that were pending or running in a 30-second span. You can select one of several functions in this panel. The "maximum" function works in a straightforward manner with these statistics. But take a moment to think through what "average," "median," or "90th percentile" mean. For example, suppose max_pending is 4 for one 30-second period. If you then average the values of max_pending over several periods, you get the average of the high values, not the average of all values. So if the number of pending jobs fluctuates a lot, the average max_pending might not be close to a straight average of the number of pending jobs.
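The difference between averaging highwater marks and taking a straight average can be worked through with made-up numbers. In this sketch, the once-per-second sampling and the job counts are assumptions chosen for illustration. Only the 30-second window comes from the description above.

```python
from statistics import mean

WINDOW = 30  # seconds per highwater-mark period

# Hypothetical count of pending jobs, sampled once per second for 90 seconds.
# Each 30-second window is mostly quiet, with one brief burst at the end.
pending = [0] * 29 + [4] + [1] * 29 + [6] + [0] * 29 + [5]

# One max_pending value per 30-second window: the highest count in that window.
max_pending = [max(pending[i:i + WINDOW]) for i in range(0, len(pending), WINDOW)]

avg_of_highwater = mean(max_pending)  # what "average" of max_pending gives you
avg_of_all = mean(pending)            # straight average of every sample

print(max_pending)
print(avg_of_highwater)
print(round(avg_of_all, 2))
```

Because each window contributes only its peak, the average of max_pending in this example is about ten times the straight average of the pending counts.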

App Deployment

The App Deployment dashboard monitors apps as they are deployed from a deployer to search head cluster members.

See About deployment server and forwarder management in the Updating Splunk Enterprise Instances Manual.

In the Apps status panel, a persistent discrepancy indicates that the deployer has not finished deploying apps to its members.

Troubleshoot these dashboards

The search head clustering dashboards require the monitored instances to run Splunk Enterprise 6.2.0 or later.

Make sure you have completed all of the Monitoring Console setup steps.

In particular:

For the App Deployment dashboard:

  • The deployer needs to be a search peer of the Monitoring Console, or the Monitoring Console can be hosted on the deployer. See Add instances as search peers.
  • The deployer needs to have the deployer role. (The Monitoring Console might auto-detect this role.) Check this in Monitoring Console > Settings > General Setup.
  • The deployer needs to be manually labeled as a member of the search head cluster. (The Monitoring Console does not auto-detect this.) Set this in Monitoring Console > Settings > General Setup.
  • The deployer must forward its logs. See Monitoring Console prerequisites.

This documentation applies to the following versions of Splunk® Enterprise: 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.3.0, 7.3.1

