Overview of Episode Review in ITSI
Use Episode Review to see a unified view of all your service-impacting alerts. Episode Review displays episodes (groups of notable events) and their current status.
A notable event represents an anomalous incident detected by an ITSI multi-KPI alert, a correlation search, or anomaly detection algorithms. For example, a notable event can represent:
- An alert that ITSI ingests from a third-party product into the
- A single KPI (such as cpu_load_percent) that exceeds a pre-defined threshold.
- The result of a multi-KPI alert that correlates the status of multiple KPIs based on multiple trigger conditions.
- The result of a correlation search that looks for relationships between data points.
- An anomaly that has been detected when anomaly detection is enabled.
An episode represents a group of events occurring as part of a larger sequence (an incident or period considered in isolation).
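The relationship between a KPI threshold breach, a notable event, and an episode can be sketched in a few lines of Python. This is only an illustration. The class names `NotableEvent` and `Episode`, the `check_kpi` helper, and the threshold of 80.0 are hypothetical and do not reflect the ITSI schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NotableEvent:
    """Hypothetical stand-in for a notable event; not the ITSI schema."""
    source: str   # what raised the event, e.g. "kpi_threshold"
    title: str
    value: float


@dataclass
class Episode:
    """Hypothetical stand-in for an episode: a named group of notable events."""
    title: str
    events: List[NotableEvent] = field(default_factory=list)


def check_kpi(name: str, value: float, threshold: float) -> Optional[NotableEvent]:
    """Return a notable event if the KPI value exceeds its threshold, else None."""
    if value > threshold:
        return NotableEvent("kpi_threshold", f"{name} exceeded {threshold}", value)
    return None


# Illustrative readings; the threshold of 80.0 is made up for this example.
episode = Episode("High CPU load on web tier")
for reading in (42.0, 91.5, 87.0):
    event = check_kpi("cpu_load_percent", reading, threshold=80.0)
    if event is not None:
        episode.events.append(event)

print(len(episode.events))  # 2
```

Only the two readings above the threshold produce notable events, and both land in the same episode. In ITSI itself, which events are grouped into which episode is determined by aggregation policies rather than by code like this.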
As an analyst, you can use Episode Review to gain insight into the severity of episodes occurring in your system or network. You can use the console to triage new episodes, assign episodes to analysts for review, and examine episode details for investigative leads.
You can perform actions on episodes, including running a script, sending an email, creating a ticket in ServiceNow or Remedy (if configured), adding a link to a ticket in an external system, and running any other custom actions that are configured. You can also perform actions on episodes automatically by using notable event aggregation policies. See Overview of aggregation policies in ITSI for more information.
Note: Monitor episodes and actions in Episode Review with the Event Analytics Audit dashboard. For more information, see Event Analytics Audit dashboard in this manual.
Episode management workflow
You can use this example workflow to triage and work on episodes in Episode Review:
- An IT operations analyst monitors Episode Review, sorting newly created episodes and performing high-level triage on them.
- When an episode warrants investigation, the analyst acknowledges the episode, which moves the status from New to In Progress.
- The analyst researches and collects information on the episode using the drilldowns and fields in the episode details. The analyst records the details of their research in the Comments section of the episode.
- If the analyst cannot immediately find the root cause of the episode, the analyst might open a ticket in Remedy or ServiceNow.
- After the analyst has addressed the cause of the episode and any remediation tasks have been escalated or solved, the analyst sets the episode status to Resolved.
- The analyst assigns the episode to a final analyst for verification.
- The final analyst reviews and validates the changes made to resolve the episode, and sets the status to Closed.
Closing an episode that was created by an aggregation policy breaks the episode, meaning that no more events can be added to it, even if the breaking criteria specified in the aggregation policy were not met.
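The status changes in the example workflow, together with the effect of closing an episode, can be modeled as a small state machine. The following Python sketch is an illustration under stated assumptions. The `WorkflowEpisode` class, its methods, and the strictly forward-only transition table are hypothetical; ITSI may permit other status changes, and this is not how the product is implemented.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    IN_PROGRESS = "In Progress"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumption: only the forward transitions described in the example workflow.
ALLOWED = {
    Status.NEW: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.RESOLVED},
    Status.RESOLVED: {Status.CLOSED},
    Status.CLOSED: set(),
}


class WorkflowEpisode:
    """Hypothetical model of an episode moving through the example workflow."""

    def __init__(self, title):
        self.title = title
        self.status = Status.NEW
        self.events = []
        self.comments = []
        self.broken = False  # a broken episode accepts no more events

    def set_status(self, new_status):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        if new_status is Status.CLOSED:
            # Closing breaks the episode regardless of the policy's breaking criteria.
            self.broken = True

    def add_event(self, event):
        """Add an event unless the episode is broken; return whether it was added."""
        if self.broken:
            return False
        self.events.append(event)
        return True


episode = WorkflowEpisode("Database latency")
episode.add_event("db_latency_ms above threshold")
episode.set_status(Status.IN_PROGRESS)   # analyst acknowledges the episode
episode.comments.append("Slow queries traced to a missing index.")
episode.set_status(Status.RESOLVED)      # cause addressed
episode.set_status(Status.CLOSED)        # final analyst validates the fix
print(episode.add_event("late event"))   # False
```

After the episode is closed, `add_event` rejects the late event, which mirrors the note that a closed episode cannot receive more events.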
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.5.0 Cloud only, 4.5.1 Cloud only, 4.6.0 Cloud only, 4.6.1 Cloud only, 4.6.2 Cloud only