Splunk® IT Service Intelligence

User Manual

Splunk IT Service Intelligence version 4.1.x reached its End of Life on January 19, 2021. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see Before you upgrade IT Service Intelligence.
This documentation does not apply to the most recent version of ITSI.

ITSI Health Check dashboard

The ITSI Health Check dashboard provides basic statistics about your ITSI environment.

Dashboard panels

Panel | Description
Splunk Server Information | Basic server information for each host.
ITSI Migration Status | The current versions of ITSI and the ITSI KV store. The two versions should be the same.
ITSI Basic Information | For each host, lists the number of services, searches, and entities, as well as KV store and HEC information.
Shared Base Search Usage Summary | The number of KPIs using each base search.
KV Store Collections | All ITSI KV store collections, the number of objects in each collection, acceleration information, and the collection size. For more information, see About the app key value store in the Splunk Admin Manual.
Concurrent Searches | All ITSI searches currently running.
Interesting Indexes | All ITSI indexes and their statistics.
Interesting Searches | Real-time searches that ITSI runs. itsi_event_grouping handles event grouping for notable event aggregation policies. itsi_mad_context and itsi_mad_cohesive_context handle metric anomaly detection. The searches are defined in savedsearches.conf. If any of these search jobs have failed or are not running, this could indicate a problem.
KPI Performance | Basic performance information for each KPI in your ITSI instance. Any failed or skipped searches indicate a problem. The runtime headroom percentage indicates how much of the search's scheduled interval remains unused after the search runs. A headroom percentage close to 100 is best, and a value closer to 0 indicates a problem.
Refresh Queue Runtimes | Statistics for the refresh queue. The refresh queue ensures data integrity and eventual consistency of your ITSI configuration. It runs as a single instance.
Refresh Queue Failed Jobs | The number of failed jobs in the refresh queue. Click a failed job to drill down to the logs.
Rules Engine Information | The Java version being used by the rules engine. ITSI version 4.1.x requires Java 7 or Java 8 to run notable event management features.
Skipped Events Count | A raw count of skipped events (events that are not included in any episode) over the past 7 days. Under normal conditions this number should be zero.
Skipped Events Percentage | The percentage of ungrouped events versus grouped events over the past 7 days. Under normal conditions this percentage should be zero.
Episode Processing Times | The amount of time it takes to convert tracked alerts (active raw notable events) to grouped alerts (active grouped notable events). Under normal conditions the processing time should be about 60 seconds.
ITSI Log Messages (deduplicated) | Warning and error messages in the ITSI logs. The messages are deduplicated so you won't see the same error multiple times.
Check for Duplicate Entity Aliases | Lists the alias field values that identify more than one entity. For more information, see Duplicate entity aliases in the ITSI Installation and Configuration Manual.
Event Analytics Real-Time Search Status | The current state of real-time searches, including how much disk space they've used so far and how long they've been running. The searches are defined in savedsearches.conf.
Event Analytics HEC Tokens | Shows which HEC tokens are available by host. If you create notable events using HEC tokens, this table shows which of your instances to send events to using the 'Auto Generated ITSI Event Management Token'. If any of these tokens is missing, event analytics does not work properly.
Event Analytics KV Store Lookups | Compares the KV store lookups that have been created with the lookups that event analytics requires, and lists any required lookups that are missing. If a lookup has not been created, you must add it to transforms.conf.
Event Analytics Action Queue Errors | A count of action queue errors over time. To search for the action queue errors, run the following search:

index=_internal sourcetype="itsi_internal_log" source="*itsi_notable_event_actions_queue_consumer*" ERROR

Notable Event Size Check | Notable event sizes over time. The maximum allowable event size is 10000 bytes. If your events exceed this limit, increase the TRUNCATE setting in props.conf.
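
Two of the thresholds in the table can be illustrated with a little arithmetic. The following Python sketch is not part of ITSI. The function names are hypothetical, and the headroom formula is an assumption: it treats headroom as the share of a search's scheduled interval left unused by its run time, which is the reading consistent with values near 100 being best. The size check compares the UTF-8 byte length of an event with the 10000-byte limit quoted for the Notable Event Size Check panel.

```python
"""Sketch of two threshold checks described in the panel table (assumed formulas)."""

DEFAULT_LIMIT_BYTES = 10000  # limit quoted for the Notable Event Size Check panel


def runtime_headroom_pct(runtime_seconds: float, interval_seconds: float) -> float:
    """Return the assumed headroom: percent of the search interval left unused."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must not be negative")
    unused_fraction = (interval_seconds - runtime_seconds) / interval_seconds
    # A search that runs longer than its interval has no headroom left.
    return max(0.0, unused_fraction * 100.0)


def exceeds_size_limit(raw_event: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if the UTF-8 encoded event is larger than the byte limit."""
    return len(raw_event.encode("utf-8")) > limit_bytes


if __name__ == "__main__":
    # Hypothetical KPI search: scheduled every 300 seconds, takes 15 seconds to run.
    print(f"headroom: {runtime_headroom_pct(15, 300):.1f}%")
    # Hypothetical event made of 12000 ASCII characters.
    print("oversized:", exceeds_size_limit("x" * 12000))
```

Under this assumed formula, a search whose run time approaches its interval drifts toward 0 percent headroom, which is the situation the KPI Performance panel flags as a problem. The dashboard itself may compute the value differently.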
Last modified on 20 June, 2019

This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.1.1, 4.1.2, 4.1.5
