Search: Distributed Search

This topic is a reference for the Monitoring Console dashboards related to distributed search. See About the Monitoring Console in this manual.

What do these views show?

The distributed search views expose the health, activity, and performance of the distributed search framework.

These views focus on communication between a search head and its peers during searches. In contrast, the search head clustering dashboards describe communication between search heads.

There are two basic ways to use these views:

1. Navigate to the health check at the top of the view, which is specific to this product area, and verify that the basic checks pass.

2. If your users report distributed search problems, use these views to understand how the components are performing. For example, users might see messages such as "search peers could not participate in the search", or messages about search peers being unavailable or taking too long to respond. If you know which instance is reporting the problems, go directly to the Distributed search: Instance view. If not, start at the Distributed search: Deployment view. Review the history of how these instances have behaved to get a better idea of the nature of the problem.

Interpret results in these views

For either view (Instance or Deployment), you can choose to examine search heads or search peers by selecting Search heads or Indexers at the top of the page. The metrics displayed on the dashboard change depending on which role you select.

On the Instance view, select a search head to see how the search head is communicating with its peers, from the operating context of this search head.

What to look for in these views

Scan for red flags in the Health Checks at the top of each view. The health checks are not comprehensive across the entire distributed search infrastructure. Rather, they are a high-level check of basic things.

The Snapshot panel exposes response times to a request and times for bundle replication. These times are vital signs: each operation should complete very quickly (in under a second). Generally, if any of these times is a few seconds or longer, something is wrong.

In the Deployment view, select Search heads at the top of the page and use column sorting to inspect timing metrics:

Dispatch directories are reaped per operation, so times over 15 seconds indicate problems.
Bundle directory reaping should also be much less than 15 seconds.
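The 15-second guideline above can also be applied to values copied out of the dashboard. The following is a minimal sketch, not a Splunk API: the function name, the host names, and the sample timings are all hypothetical, and only the 15-second threshold comes from this topic.

```python
# Threshold taken from the guidance above; everything else is illustrative.
REAP_THRESHOLD_SECONDS = 15.0


def slow_reaps(reap_times, threshold=REAP_THRESHOLD_SECONDS):
    """Return (host, seconds) pairs whose reaping time exceeds the threshold.

    reap_times is a hypothetical mapping of search head name to the
    reaping time in seconds, as read from the Deployment view.
    Results are sorted slowest first, like a descending column sort.
    """
    slow = {host: secs for host, secs in reap_times.items() if secs > threshold}
    return sorted(slow.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample values, not real measurements.
    sample = {"sh1": 0.4, "sh2": 22.5, "sh3": 15.0, "sh4": 31.0}
    for host, secs in slow_reaps(sample):
        print(f"{host}: reaping took {secs:.1f}s (over {REAP_THRESHOLD_SECONDS:.0f}s)")
```

A value of exactly 15 seconds is not flagged here, because the text only calls out times over 15 seconds.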

The three Heartbeat metrics are a vital sign for the search head. When they are high, the search peers might be oversubscribed and having trouble responding to communication requests in a timely manner. Response times over 1 second are not ideal and could indicate a developing problem. Response times over 5 or 10 seconds start to run into timeouts, and when that happens, searches fail. To continue troubleshooting, compare these metrics with the Resource Usage: Machine view for the affected peer. See Intermittent authentication timeouts on search peers in the Troubleshooting Manual for more information.
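The heartbeat guidance above amounts to a simple banding of response times. The sketch below shows one way to express it, assuming that 5 seconds (the lower of the two figures mentioned) marks the point where timeouts become likely. The function name and labels are hypothetical and are not part of the Monitoring Console.

```python
# Bands derived from the guidance above. Using 5 seconds as the lower bound
# of the timeout range is an assumption; actual timeouts depend on configuration.
WARNING_SECONDS = 1.0
CRITICAL_SECONDS = 5.0


def classify_heartbeat(seconds):
    """Map a heartbeat response time in seconds to an illustrative label."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds > CRITICAL_SECONDS:
        return "critical"  # likely to run into timeouts, searches may fail
    if seconds > WARNING_SECONDS:
        return "warning"   # not ideal, possibly a developing problem
    return "ok"


if __name__ == "__main__":
    # Hypothetical readings for three peers.
    for peer, secs in {"idx1": 0.2, "idx2": 2.8, "idx3": 9.5}.items():
        print(peer, classify_heartbeat(secs))
```

Peers labeled "warning" or "critical" are the ones to compare against the Resource Usage: Machine view.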

For additional help with distributed search problems, see General troubleshooting issues in the Distributed Search Manual.

Troubleshoot these views

All of the metrics that these views leverage were introduced in Splunk Enterprise 6.3.0. If a component of your deployment is on a Splunk Enterprise version older than 6.3.0, these views will not include data from that component.
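If a component is missing from these views, a quick check of its version against 6.3.0 can rule out this cause. The following is a minimal sketch that assumes versions are plain dotted numeric strings such as "6.2.9". The helper names and the sample inventory are hypothetical.

```python
# Minimum version that reports the metrics these views rely on (from the text above).
MIN_VERSION = (6, 3, 0)


def parse_version(version):
    """Convert a dotted numeric version string into a tuple of at least three ints."""
    parts = [int(part) for part in version.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad "6.3" to (6, 3, 0); no-op for longer versions
    return tuple(parts)


def reports_metrics(version):
    """Return True if the version is 6.3.0 or later."""
    return parse_version(version) >= MIN_VERSION


if __name__ == "__main__":
    # Hypothetical inventory of instances and their versions.
    inventory = {"sh1": "6.5.1", "idx1": "6.2.9", "idx2": "6.3"}
    for host, version in inventory.items():
        if not reports_metrics(version):
            print(f"{host} ({version}) predates 6.3.0 and will not appear in these views")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "6.10.0" as older than "6.3.0".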

The snapshot panels use data from a variety of endpoints.

All historical panels in these views get their data from metrics.log.
