The following are the spec and example files for checklist.conf.


# This file contains the set of attributes and values you can use to 
# configure checklist.conf in Monitoring Console.


[<uniq-check-item-name>]
* A unique string for the name of this health check.

title = <ASCII string>
* (required) Displayed title for this health check.

category = <ASCII string>
* (required) Category for overarching groups of health check items.

tags = <ASCII string>
* (optional) Comma-separated list of tags that apply to this health check.
* If omitted, the user cannot run this health check as part of a subset of health checks.

description = <ASCII string>
* (optional) A description of what this health check is checking.
* If omitted no description will be displayed.

failure_text = <ASCII string>
* (optional) If this health check did not pass, this tells what could have gone wrong.
* If omitted nothing will be displayed to help the user identify why this check is failing.

suggested_action = <ASCII string>
* (optional) Suggested actions for diagnosing and fixing your Splunk installation
  so this health check is no longer failing.
* If omitted, no suggested actions for fixing this health check will be displayed.

doc_link = <ASCII string>
* (optional) Location string for help documentation for this health check.
* If omitted no help link will be displayed to help the user fix this health check.
* Can be a comma-separated list if more than one documentation link is needed.

doc_title = <ASCII string>
* (optional) Title string for help documentation link for this health check.
* Must be included if doc_link exists.
* Will be inserted in the text for the help documentation link like so: "Learn more about $doc_title$"
* If doc_link is a comma-separated list, then doc_title must also be a
  comma-separated list with one title per item corresponding to doc_link.
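
As an illustration of the pairing rule above, the fragment below is a minimal sketch only. The location strings and titles are hypothetical placeholders, not real documentation locations. The first title corresponds to the first link and the second title to the second link.

```
# Hypothetical values: two documentation links with one title each, in the same order.
doc_link = example.location.one, example.location.two
doc_title = the first example topic, the second example topic
```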

applicable_to_groups = <ASCII string>
* (optional) Comma separated list of applicable groups that this check should be run against.
* If omitted this check item can be applied to all groups.

environments_to_exclude = <ASCII string>
* (optional) Comma-separated list of environments that the health check should not run in.
*   Possible environments are 'standalone' and 'distributed'.
* If omitted, this check can run in all environments.

disabled = [0|1]
* Disable this check item by setting to 1.
* Defaults to 0.
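
The scoping settings above can be combined. The fragment below is a sketch under assumptions: the group names are assumed to be groups that exist in your deployment and are not defined by this spec. It limits a check to two groups, skips it in standalone environments, and leaves it enabled.

```
# Assumed group names; substitute the groups configured in your deployment.
applicable_to_groups = dmc_group_indexer, dmc_group_search_head
# Do not run this check in a standalone environment.
environments_to_exclude = standalone
# 0 keeps the check enabled (the default); 1 disables it.
disabled = 0
```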

search = <ASCII string>
* (required) Search string to be run to perform the health check.
* Please separate lines by "\" if the search string has multiple lines.
* In single-instance mode, this search will be used to generate the final result.
* In multi-instance mode, this search will generate one row per instance in the result table.
* |---------------------------------------------------------------
* |     instance    |            metric         | severity_level |
* |---------------------------------------------------------------
* | <instance name> | <metric number or string> | <level number> |
* |---------------------------------------------------------------
* |       ...       |              ...          |      ...       |
* |---------------------------------------------------------------
* <instance name> (required, unique) is either the "host" field of events or the
  "splunk_server" field of "| rest" search.
*   In order to generate this field, please do things like:
*     ... | rename host as instance
*   or
*     ... | rename splunk_server as instance
* <metric number or string> (optional) one or more columns to "show your work"
*   This should be the data that severity_level is determined from.
*   The user should be able to look at this field to get some idea of what made the instance fail this check.
* <level number> (required) could be one of the following:
*   - -1 (N/A)     means: "Not Applicable"
*   -  0 (ok)      means: "all good"
*   -  1 (info)    means: "just ignore it if you don't understand"
*   -  2 (warning) means: "well, you'd better take a look"
*   -  3 (error)   means: "FIRE!"
* Please also note that the search string must contain one of the following
  tokens to properly scope the search to either a single instance or a group
  of instances, depending on the settings of checklistsettings.conf.
*   $rest_scope$ - used for "| rest" search
*   $hist_scope$ - used for historical search
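
To make the expected result shape concrete, the stanza below is a minimal sketch, not a shipped health check. The stanza name, title, category, and the 4-core threshold are hypothetical, and the search assumes that the server/info REST endpoint returns a numberOfCores field. It uses $rest_scope$, renames splunk_server to instance, keeps one metric column, and emits a numeric severity_level.

```
# Hypothetical stanza: the name, title, category, and threshold are illustrative only.
[example_cpu_core_count]
title = Example: CPU core count
category = Example
# One row per instance: instance, cpu_cores (the metric), severity_level.
# severity_level is 2 (warning) below the assumed threshold, otherwise 0 (ok).
search = | rest $rest_scope$ /services/server/info \
| fields splunk_server numberOfCores \
| eval severity_level = if(numberOfCores < 4, 2, 0) \
| rename splunk_server as instance, numberOfCores as cpu_cores \
| fields instance cpu_cores severity_level
```

The cpu_cores column plays the "show your work" role described above: a user can see the value that produced the severity level.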

drilldown = <ASCII string>
* (optional) Link to a search or Monitoring Console dashboard for additional information.
* Please note that the drilldown string must contain a $-delimited token.
  * This token must match one of the fields output by the search.
  * Most dashboards will need the name of the instance, e.g. $instance$
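
A drilldown value might look like the sketch below. The app name, dashboard name, and form parameter are hypothetical placeholders. The only part taken from the description above is the $instance$ token, which must match a field output by the search.

```
# Hypothetical path: replace <your_app> and <your_dashboard> with real names.
drilldown = /app/<your_app>/<your_dashboard>?form.splunk_server=$instance$
```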


No example

This documentation applies to the following versions of Splunk® Enterprise: 7.0.2


Thank you, Sideview. I'll pass this input along for inclusion in a future release.

Andrewb splunk, Splunker
February 7, 2018

Forgot to mention another typo in here - "checklistsettings.conf" This seems to be an older never-released name of checklist.conf. As written it's going to make everyone do what I just did, mildly panic looking to see if there's really a checklistsettings.conf that might further modify how checklist.conf behaves. =)

February 6, 2018

"applicable_to_groups" and "environments_to_exclude" do a confusing dance switching between the word "groups" and the word "environments", which seem to be synonyms here.

Although it may be too late to make the actual keys be consistent, at least the docs should use the same word.

February 6, 2018

Elsewhere the spec file refers to "single-instance mode" vs "multi-instance mode". It's not clear what this is referring to, if anything. It also seems to make a problematic assumption that if a healthcheck returns multiple failures, the failures will always map one-to-one with Splunk search peers (wildly untrue in our cases: some of ours are checking field configuration, so the "rows" are fields, and some are checking sourcetypes).

February 6, 2018

The note about $rest_scope$ vs $hist_scope$ is very strange and apparently wrong.

We've been adding checklist.conf searches and there seems to be no need to specify *either* of these. The docs here don't say whether a) the checklist UI tries to secretly put values into these $foo$ tokens, (and if so, what those values might be) or b) this was just a hack because a developer for some reason wasn't allowed to add an actual additional key to the checklist.conf file.

If there are bad consequences to ignoring the directive, the spec file should spell out what they are.

February 6, 2018

