Troubleshoot distributed search
This documentation does not apply to the most recent version of Splunk.
Contents
- General configuration issues
- Clock skew between search heads and search peers can affect search behavior
- Search head pooling configuration issues
- Clock skew between search heads and shared storage can affect search behavior
- Permission problems on the shared storage server can cause pooling failure
- NFS client concurrency limits can cause search timeouts or slow search behavior
- Warning about unique serverName attribute
- Artifacts and incorrectly-displayed items in Manager UI after upgrade
- Distributed search error messages
This topic describes issues to be aware of when configuring or using distributed search.
General configuration issues
Clock skew between search heads and search peers can affect search behavior
Keep the clocks on your search heads and search peers in sync, using NTP (Network Time Protocol) or a similar mechanism. If the clocks are out of sync by more than a few seconds, searches can fail or search artifacts can expire prematurely.
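As a rough illustration of this check, the following Python sketch flags search peers whose clock reading differs from the search head's by more than a tolerance. The function name, the 5-second default (a stand-in for "a few seconds"), and the idea of collecting epoch readings yourself, for example over SSH, are assumptions made for illustration and are not part of Splunk.

```python
from typing import Dict, List


def peers_out_of_sync(head_epoch: float,
                      peer_epochs: Dict[str, float],
                      tolerance_s: float = 5.0) -> List[str]:
    """Return the names of peers whose clock differs from the search head's
    by more than tolerance_s seconds, sorted by name.

    tolerance_s is a hypothetical threshold. The readings must be sampled
    at (nearly) the same moment for the comparison to be meaningful.
    """
    return sorted(
        name for name, epoch in peer_epochs.items()
        if abs(epoch - head_epoch) > tolerance_s
    )
```

A check like this only reports skew; the fix is still to synchronize the hosts through NTP or a similar service.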
Search head pooling configuration issues
When implementing search head pooling, there are a few potential issues you should be aware of, mainly having to do with coordination among search heads.
Clock skew between search heads and shared storage can affect search behavior
Keep the clocks on your search heads and shared storage server in sync, using NTP (Network Time Protocol) or a similar mechanism. If the clocks are out of sync by more than a few seconds, searches can fail or search artifacts can expire prematurely.
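One way to estimate the offset between a search head and the shared storage server is to create a file on the shared volume and compare its modification time with the local clock. The sketch below assumes that the storage server stamps the modification time on newly written files, which is typical for NFS but not guaranteed for every mount configuration. The function name and the probe file prefix are hypothetical.

```python
import os
import tempfile
import time


def shared_storage_skew_seconds(shared_dir: str) -> float:
    """Estimate (storage clock - local clock) in seconds.

    Writes a small probe file into shared_dir, reads back its mtime, and
    subtracts the midpoint of the local timestamps taken before and after.
    A positive result suggests the storage clock is ahead of the local one.
    The probe file is always removed afterwards.
    """
    before = time.time()
    fd, path = tempfile.mkstemp(dir=shared_dir, prefix=".clock_probe_")
    try:
        with os.fdopen(fd, "wb") as probe:
            probe.write(b"x")
            probe.flush()
            os.fsync(probe.fileno())  # push the write through to the server
        mtime = os.stat(path).st_mtime
    finally:
        os.unlink(path)
    after = time.time()
    return mtime - (before + after) / 2.0
```

Run it on each search head against the pool's shared directory. Results within a second or so of zero are likely measurement noise; attribute caching on the client can also affect the reading.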
Permission problems on the shared storage server can cause pooling failure
On each search head, the user account Splunk runs as must have read/write permissions to the files on the shared storage server.
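The following Python sketch shows one way to look for permission problems. It is meant to be run on a search head as the same user account that Splunk runs as, and it lists every path under the shared storage mount point that the account cannot both read and write. The function name and the mount point you pass in are assumptions; `os.access` results can also be imprecise on some network filesystems, so treat the output as a hint rather than proof.

```python
import os
from typing import List


def paths_missing_read_write(shared_root: str) -> List[str]:
    """Return paths under shared_root that the current user cannot
    both read and write, sorted alphabetically."""
    needed = os.R_OK | os.W_OK
    problems = []
    if not os.access(shared_root, needed):
        problems.append(shared_root)
    for dirpath, dirnames, filenames in os.walk(shared_root):
        # Check sub-directories here too: os.walk silently skips
        # directories it cannot list, so they never appear as dirpath.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, needed):
                problems.append(path)
    return sorted(problems)
```

An empty list means no problems were detected for the account running the script.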
NFS client concurrency limits can cause search timeouts or slow search behavior
Search performance in a search head pool is a function of the throughput of the shared storage and the search workload. Together, the concurrent search users and the concurrently running scheduled searches determine the total IOPS that the shared volume needs to support. IOPS requirements also vary with the kind of searches being run. To adequately provision a device to be shared between search heads, you need to know the number of concurrent users submitting searches and the number of jobs/apps that will be executed simultaneously.
If searches are timing out or running slowly, you might be exhausting the maximum number of concurrent requests supported by the NFS client. To solve this problem, increase your client concurrency limit. For example, on a Linux NFS client, adjust the tcp_slot_table_entries setting.
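Before changing the limit, it helps to know its current value. On Linux, the sunrpc kernel module exposes this setting through procfs at `/proc/sys/sunrpc/tcp_slot_table_entries`. The sketch below reads it; the function name is hypothetical, and the file exists only when the sunrpc module is loaded.

```python
from pathlib import Path
from typing import Optional

# procfs location of the Linux sunrpc setting (present once sunrpc is loaded).
SLOT_TABLE_PATH = Path("/proc/sys/sunrpc/tcp_slot_table_entries")


def read_slot_table_entries(path: Path = SLOT_TABLE_PATH) -> Optional[int]:
    """Return the current tcp_slot_table_entries value, or None if the
    file is missing, unreadable, or does not contain an integer."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None
```

The same value can be changed as root with `sysctl -w sunrpc.tcp_slot_table_entries=<value>`. A suitable value depends on your workload, and the new limit may apply only to NFS mounts made after the change.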
Warning about unique serverName attribute
Each search head in the pool must have a unique serverName attribute. Splunk validates this condition when each search head starts. If it finds a problem, it generates this error message:
serverName "<xxx>" has already been claimed by a member of this search head pool in <full path to pooling.ini on shared storage> There was an error validating your search head pooling configuration. For more information, run 'splunk pooling validate'
The most common cause of this error is that another search head in the pool is already using the current search head's serverName. To fix the problem, change the current search head's serverName attribute in .system/local/server.conf.
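To find which search heads share a name, you can compare the serverName values in their server.conf files. The Python sketch below assumes that you have gathered a copy of each search head's server.conf in one place, that serverName is set in the `[general]` stanza, and that the files are simple enough for Python's `configparser` to read. The function name and the file paths are hypothetical.

```python
import configparser
from collections import defaultdict
from typing import Dict, List


def find_duplicate_server_names(conf_paths: List[str]) -> Dict[str, List[str]]:
    """Map each serverName that appears in more than one server.conf
    to the list of files that use it."""
    seen = defaultdict(list)
    for path in conf_paths:
        # strict=False tolerates repeated keys; interpolation=None keeps
        # '%' characters in values from being treated specially.
        parser = configparser.ConfigParser(interpolation=None, strict=False)
        parser.read(path)
        name = parser.get("general", "serverName", fallback=None)
        if name:
            seen[name].append(path)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

Files that do not set serverName explicitly are skipped, so an empty result does not rule out a clash caused by default values.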
There are a few other conditions that also can generate this error:
- The current search head's serverName has been changed.
- The current search head's GUID has been changed. This is usually due to system/local/server.conf being deleted.
To fix these problems, run:
splunk pooling replace-member
This updates the pooling.ini file with the current search head's serverName->GUID mapping, overwriting any previous mapping.
Artifacts and incorrectly-displayed items in Manager UI after upgrade
When upgrading pooled search heads, you must copy all updated apps, including those that ship with Splunk (such as the Search app and the data preview feature, which is implemented as an app), to the search head pool's shared storage after the upgrade is complete. If you do not, you might see artifacts or other incorrectly displayed items in Manager.
To fix the problem, copy all updated apps from an upgraded search head to the shared storage for the search head pool, taking care to exclude the local sub-directory of each app.
Important: Excluding the local sub-directory of each app from the copy process prevents the overwriting of configuration files on the shared storage with local copies of configuration files.
Once the apps have been copied, restart Splunk on all search heads in the pool.
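The copy step can be sketched in Python as follows. This is a minimal illustration, not a Splunk-provided tool: the function name is hypothetical, and you supply the apps directory of an upgraded search head and the apps directory on the shared storage. Only the local sub-directory directly under each app is skipped, so configuration files already on the shared storage under local are left untouched.

```python
import os
import shutil
from pathlib import Path
from typing import List


def copy_apps_excluding_local(src_apps: Path, dest_apps: Path) -> List[str]:
    """Copy every app directory from src_apps into dest_apps, skipping each
    app's top-level 'local' sub-directory. Returns the app names copied."""
    copied = []
    for app in sorted(p for p in src_apps.iterdir() if p.is_dir()):
        app_root = os.fspath(app)

        def ignore_top_level_local(directory, names, app_root=app_root):
            # Skip 'local' only at the root of the app, not deeper down.
            if directory == app_root and "local" in names:
                return ["local"]
            return []

        shutil.copytree(app, dest_apps / app.name,
                        ignore=ignore_top_level_local, dirs_exist_ok=True)
        copied.append(app.name)
    return copied
```

The `dirs_exist_ok` argument requires Python 3.8 or later. Existing files on the shared storage with the same names are overwritten, which is the intent for updated apps.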
Distributed search error messages
This table lists some of the more common search-time error messages associated with distributed search:
| Error message | Meaning |
|---|---|
| status=down | The specified remote peer is not available. |
| status=not a splunk server | The specified remote peer is not a Splunk server. |
| duplicate license | The specified remote peer is using a duplicate license. |
| certificate mismatch | Authentication with the specified remote peer failed. |
This documentation applies to the following versions of Splunk: 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6