Splunk® Enterprise

Managing Indexers and Clusters of Indexers

Splunk Enterprise version 7.2 is no longer supported as of April 30, 2021. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.

Troubleshoot SmartStore

Like other Splunk Enterprise features, SmartStore provides several tools that you can use to troubleshoot your deployment:

  • Log files
  • CLI commands
  • REST endpoints

This topic discusses each of these tools in the SmartStore troubleshooting context. It also covers some common SmartStore issues and their possible causes.

Troubleshoot with log files

Several log files can provide insight into the state of SmartStore operations.

splunkd.log. Examine these log channels:

  • S3Client. Communication with S3.
  • StorageInterface. External storage activity (at a higher level than S3Client).
  • CacheManager. Activity of the cache manager component.
  • CacheManagerHandler. Cache manager REST endpoint activity (both server and client side).

search.log. Examine these log channels:

  • CacheManagerHandler. Bucket operations that involve cache manager REST endpoint activity.
  • S2BucketCache. Search-time bucket management (open, close, and so on).
  • BatchSearch, CursoredSearch, IndexScopedSearch, ISearchOperator. Search activity related to buckets.

audit.log

  • Contains information on bucket operations, such as upload, download, evict, and so on.

metrics.log

  • Contains metrics concerning operations on external storage.

splunkd_access.log

  • Contains a trail of the search process activity against the cache manager REST endpoint.
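
When reviewing splunkd.log, it can help to isolate the lines written by the SmartStore-related channels listed above. The following Python sketch shows one way to do that. It assumes each log line has the layout "date time timezone LEVEL Channel - message"; if your lines differ, adjust the pattern. The function and variable names are illustrative and are not part of Splunk Enterprise.

```python
import re

# Channels this topic lists for splunkd.log.
SMARTSTORE_CHANNELS = {
    "S3Client",
    "StorageInterface",
    "CacheManager",
    "CacheManagerHandler",
}

# Assumed line layout: "<date> <time> <tz> <LEVEL> <Channel> - <message>".
LINE_PATTERN = re.compile(
    r"^\S+ \S+ \S+ (?P<level>[A-Z]+)\s+(?P<channel>\w+) - (?P<message>.*)$"
)


def filter_smartstore_lines(lines, channels=SMARTSTORE_CHANNELS):
    """Return (level, channel, message) tuples for lines from the given channels."""
    matches = []
    for line in lines:
        m = LINE_PATTERN.match(line.rstrip("\n"))
        if m and m.group("channel") in channels:
            matches.append((m.group("level"), m.group("channel"), m.group("message")))
    return matches
```

To use the sketch, open the log file (by default, $SPLUNK_HOME/var/log/splunk/splunkd.log) and pass the file object to filter_smartstore_lines. Lines that do not match the assumed layout are skipped.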

Test connectivity with remote storage

One common problem is connectivity with the remote storage. Connectivity problems can result from network or permissions issues. Use the splunk cmd splunkd rfs command to test connectivity with remote storage. This section demonstrates some uses for the command.

List the contents of the "foobar" index on remote storage:

splunk cmd splunkd rfs ls index:foobar

List the contents of a given bucket on remote storage:

splunk cmd splunkd rfs ls bucket:foo~737~B1CE2AB0-CE4A-4697-83F2-1C5DBFB6485A

Test getting a file from remote storage:

splunk cmd splunkd rfs getF bucket:foo~737~B1CE2AB0-CE4A-4697-83F2-1C5DBFB6485A/guidSplunk-B1CE2AB0-CE4A-4697-83F2-1C5DBFB6485A/Hosts.data  /tmp/foo/
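
If you want to run these checks from a script, for example across several indexes, you can wrap the commands shown above. The following Python sketch builds the same "ls" command lines and runs them. It assumes that the splunk executable is on the PATH and that an exit status of zero indicates success; neither assumption comes from this topic, so confirm both in your environment. The helper names are illustrative.

```python
import subprocess


def rfs_ls_index(index_name):
    """Build the command that lists an index on remote storage."""
    return ["splunk", "cmd", "splunkd", "rfs", "ls", f"index:{index_name}"]


def rfs_ls_bucket(bucket_id):
    """Build the command that lists a bucket on remote storage."""
    return ["splunk", "cmd", "splunkd", "rfs", "ls", f"bucket:{bucket_id}"]


def run_check(argv, timeout=60):
    """Run one command and return (ok, output).

    Assumption: a zero exit status means the command succeeded.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The executable was not found, could not be started, or timed out.
        return False, str(exc)
    return proc.returncode == 0, proc.stdout + proc.stderr
```

For example, run_check(rfs_ls_index("foobar")) runs the first command in this section and returns its combined output, which you can inspect for network or permissions errors.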

Troubleshoot with REST searches

Use the following search to get a list of buckets that are actively being searched:

| rest /services/admin/cacheman search=cm:bucket.ref_count>0

The ref_count value increments by 1 when the search opens the bucket and decrements by 1 when the search closes the bucket.

Use the following search to get a list of buckets that have not been uploaded to the remote store (that is, are not "stable" on the remote store):

| rest /services/admin/cacheman search=cm:bucket.stable=0
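
The same filters might also be usable from outside the search bar by calling the endpoint directly on the management port. The following Python sketch only builds the request URL. It assumes that the endpoint accepts the same "search" parameter over HTTP that the rest command passes, and that the management port is the default 8089; verify both before relying on it. The function name is illustrative.

```python
from urllib.parse import urlencode


def cacheman_url(base_url, search_filter):
    """Build a URL for the cache manager endpoint with a search filter.

    Assumption: the endpoint accepts the "search" parameter over HTTP
    in the same way as the "| rest" searches in this section.
    """
    query = urlencode({"search": search_filter, "output_mode": "json"})
    return f"{base_url.rstrip('/')}/services/admin/cacheman?{query}"


# Filters taken from the searches in this section.
ACTIVE_BUCKETS = "cm:bucket.ref_count>0"
UNSTABLE_BUCKETS = "cm:bucket.stable=0"
```

Pass the resulting URL to an HTTP client of your choice, along with credentials for a user permitted to access the endpoint.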

Common issues

Searches are running slowly or appear stuck

Slow or stuck searches are often due to these issues:

  • Performance issues with remote storage.
  • The cache manager is evicting buckets too aggressively.
  • Cold cache issues. A cold cache occurs when a peer participating in a search does not have a local copy of some needed buckets and therefore must download the buckets from remote storage. A cold cache can result from the master reassigning primary bucket copies to different peers.

Searches erroring out

The search-related error message "Failed to localize fileSet='....' for bid='...'. Results will be incomplete." indicates an error condition while downloading the specified bucket.

For more details, examine splunkd.log on the indexer issuing the error.
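
To see which buckets are affected, you can extract the bucket ID (bid) from each occurrence of the message. The following Python sketch counts failures per bucket. It assumes the message text matches the wording quoted above exactly; the names are illustrative and are not part of Splunk Enterprise.

```python
import re
from collections import Counter

# Pattern based on the error message quoted in this topic.
LOCALIZE_ERROR = re.compile(
    r"Failed to localize fileSet='(?P<file_set>[^']*)' for bid='(?P<bid>[^']*)'"
)


def failed_buckets(lines):
    """Count localization failures per bucket ID across the given lines."""
    counts = Counter()
    for line in lines:
        m = LOCALIZE_ERROR.search(line)
        if m:
            counts[m.group("bid")] += 1
    return counts
```

A bucket ID that appears repeatedly is a reasonable candidate for testing with the rfs commands described earlier in this topic.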

Disk full issues

A disk-full message indicates that the cache manager is unable to evict enough buckets. These are some possible causes:

  • Search load overwhelming local storage. For example, the entire cache might be consumed by buckets opened by at least one search process. When the search ends, this problem should go away.
  • Cache manager issues. If the problem persists beyond a search, the cause could be related to the cache manager. Examine splunkd.log on the indexer issuing the error.
Last modified on 23 January, 2019

This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3

