Splunk® IT Service Intelligence

Service Insights Manual

Monitor KPI data drift in ITSI

Use drift detection to identify changes in KPI behavior that occur slowly over long periods of time, so that you can address issues before they arise. User configuration changes in code, data, workload, or infrastructure can cause an otherwise normal KPI to display an incorrect severity, such as high or critical. For example, a KPI that measures disk usage slowly increases over a period of weeks while adaptive thresholds continue to adjust daily to the new values, until an issue arises that might lead to data loss. Drift detection notifies you of these incremental changes so that you can proactively remediate the issue.

Drift detection provides additional context about your KPI behavior, which helps you troubleshoot the root cause and avoid inaccurate alerts or missed opportunities for proactive engagement.
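To make the disk usage example concrete, the following minimal sketch simulates a KPI that grows by a small fixed amount each day and compares the latest value against two reference points. It is an illustration only, not the algorithm ITSI uses: the function names, the 7-day rolling window, the 0.3-point daily growth, and the 10% tolerance are all assumptions chosen for the example.

```python
from statistics import mean


def simulate_disk_usage(days, start=50.0, daily_growth=0.3):
    """Return one hypothetical disk usage reading (percent) per day."""
    return [start + daily_growth * day for day in range(days)]


def deviation_pct(reference, value):
    """Signed percentage difference of value relative to reference."""
    return (value - reference) / reference * 100.0


def compare_baselines(values, window=7, tolerance_pct=10.0):
    """Compare the latest value against a rolling and a fixed baseline.

    The rolling baseline (mean of the previous `window` days) stands in for
    a threshold that keeps adapting to recent data. The fixed baseline is
    the mean of the first `window` days.
    """
    latest = values[-1]
    rolling = mean(values[-window - 1:-1])
    fixed = mean(values[:window])
    vs_rolling = deviation_pct(rolling, latest)
    vs_fixed = deviation_pct(fixed, latest)
    return {
        "vs_rolling_pct": vs_rolling,
        "vs_fixed_pct": vs_fixed,
        "rolling_flags": abs(vs_rolling) > tolerance_pct,
        "fixed_flags": abs(vs_fixed) > tolerance_pct,
    }


result = compare_baselines(simulate_disk_usage(days=60))
print(result)
```

In this simulated series, the latest reading stays within a few percent of the rolling baseline, so a reference that keeps adapting never flags it. Measured against the fixed baseline from the start of the series, the same reading has moved by roughly a third, which is the kind of slow change drift detection is meant to surface.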

Prerequisites

  • To configure drift detection, you must have the itoa_admin or itoa_team_admin role with the read_itsi_services and write_itsi_services capabilities.
  • To use this feature, install Python for Scientific Computing version 4.2.0 or later.
  • To produce meaningful results, your entities must have at least 3 months of backfilled data or display a historical pattern or trend. Drift detection analyzes KPI changes that occur over longer time periods, in contrast with adaptive thresholding, which covers more rapid changes. For more information, see When to use adaptive thresholds.

Configure drift detection

  1. From the ITSI navigation menu, select Configuration then Service and KPI Management. Alternatively, select a service from the Service Analyzer and go to the KPIs tab.
  2. From the KPIs tab, find the KPI, or select the multiple KPIs, to which you want to apply drift detection settings.
  3. For multiple KPIs, select Configure drift detection from the Bulk options dropdown. For an individual KPI, select the three-dot menu next to the KPI's name in the list.
  4. From the Drift detection configuration page, configure the following settings:
    Option | Description
    Data resolution | The time frame over which data is collected and summarized.
    Function | The statistical method used to analyze aggregated data: Max, Average, Min, or Sum.
    Look back period | The time frame over which data is analyzed to evaluate trends and patterns.
    Drift tolerance % | The percentage by which a KPI can deviate from the baseline and still be considered normal.
  5. Select Preview on chart to see when drift is detected based on your selected settings. There are two types of drift that can be detected:
    • Gradual drift: indicates gradual changes in KPI behavior patterns, marked by a beginning and end point. Gradual drift suggests potential issues caused by minute increases or decreases in the KPI value over several weeks.
    • Rapid drift: indicates that KPI behavior changes rapidly in a short period of time. Rapid drift suggests a sustained change in the KPI and that the configuration needs review.
  6. Select Save configuration.
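The following sketch shows one way the four settings could interact. It is a simplified illustration under stated assumptions, not the algorithm that ITSI uses. The names `summarize`, `detect_drift`, and `classify_drift` are hypothetical, as are the choice of the first bucket in the look back period as the baseline and the rule that separates rapid drift from gradual drift.

```python
from statistics import mean

# The four aggregation choices listed under the Function setting.
FUNCTIONS = {"max": max, "average": mean, "min": min, "sum": sum}


def summarize(samples, bucket_size, function):
    """Collapse raw samples into one value per full bucket (data resolution)."""
    aggregate = FUNCTIONS[function]
    last_start = len(samples) - bucket_size
    return [
        aggregate(samples[start:start + bucket_size])
        for start in range(0, last_start + 1, bucket_size)
    ]


def detect_drift(summary, look_back, tolerance_pct):
    """Return (drifted, drift_pct) for the last `look_back` buckets.

    Assumption: the baseline is the first bucket in the look back period,
    and drift is the signed percentage change of the latest bucket from it.
    """
    window = summary[-look_back:]
    baseline, latest = window[0], window[-1]
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percentage")
    drift_pct = (latest - baseline) / baseline * 100.0
    return abs(drift_pct) > tolerance_pct, drift_pct


def classify_drift(window, rapid_share=0.5):
    """Label a drifting window as 'rapid' or 'gradual' (hypothetical rule).

    Assumption: drift is 'rapid' when a single bucket-to-bucket step accounts
    for at least `rapid_share` of the total change, otherwise 'gradual'.
    """
    total = abs(window[-1] - window[0])
    if total == 0:
        return "none"
    largest_step = max(abs(b - a) for a, b in zip(window, window[1:]))
    return "rapid" if largest_step >= rapid_share * total else "gradual"


# Hypothetical KPI: 12 weeks of daily readings that rise 0.5 units per day.
daily = [100.0 + 0.5 * day for day in range(84)]
weekly = summarize(daily, bucket_size=7, function="average")
drifted, drift_pct = detect_drift(weekly, look_back=8, tolerance_pct=20.0)
kind = classify_drift(weekly[-8:]) if drifted else "none"
print(drifted, round(drift_pct, 1), kind)
```

With these made-up numbers, the weekly average rises by about 21% across the 8-week look back period, which exceeds the 20% tolerance, and the change is spread evenly across the weeks, so the sketch labels it gradual. Raising the tolerance or shortening the look back period in the sketch makes the same series pass as normal, which mirrors why previewing the settings on the chart is useful before saving.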

View drifting KPIs in a service

Follow these steps to view drifting KPIs from the Service Analyzer:

  1. Select a service from the Service Analyzer page.
  2. On the KPIs & Episodes tab, a special symbol next to the KPI name indicates that drift has been detected. For example, drift was detected in the 4xx Errors Count KPI:
    Screenshot of a list of KPI names and a label over the name stating that drift was detected.
  3. Hover over the KPI name to view the percentage by which the KPI has drifted from the original baseline value configured in your drift settings.
  4. Select the Drift Review tab to see a list of episodes associated with a specific drift alert.

View drifting KPIs in the Configuration assistant

Follow these steps to view all KPIs with drift detected from the Configuration assistant:

  1. From the navigation menu, select Configuration, then ITSI Configuration Assistant.
  2. On the Configuration assistant page, view the KPIs with drift detected in the Configuration issues detected section.
  3. In the sidebar panel, select Get recommendations to generate thresholding recommendations for KPIs with drift detected.
  4. Select Reconfigure drift to set up new drift configuration settings if the drift detected was not accurate.

KPI drift notable event aggregation policy

The KPI drift policy is a notable event aggregation policy that automatically groups alerts for drifting KPIs into episodes. You can find drift episodes by filtering on this policy, or by viewing the details of individual episodes.

To view drift episodes, follow these steps:

  1. Go to the Alerts and Episodes page.
  2. Select an episode. On the Impact tab, the episode details section indicates whether drift was detected and shows the impacted services and KPIs.
  3. (Optional) Provide feedback about whether the detected drift is accurate based on your KPI's expected behavior. Your feedback helps to refine the drift detection algorithm.
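
Conceptually, an aggregation policy groups related alerts so that you review one episode instead of many individual events. The sketch below illustrates only that general idea. The alert fields (`service`, `kpi`, `drift_pct`), the sample values, and the choice of the service as the grouping key are hypothetical and do not reflect the schema or logic of the KPI drift policy.

```python
from collections import defaultdict


def group_into_episodes(alerts):
    """Group drift alerts by service; each group stands in for one episode."""
    episodes = defaultdict(list)
    for alert in alerts:
        episodes[alert["service"]].append(alert)
    return dict(episodes)


# Hypothetical drift alerts; field names and values are made up.
alerts = [
    {"service": "web", "kpi": "4xx Errors Count", "drift_pct": 35.0},
    {"service": "web", "kpi": "Response Time", "drift_pct": 22.5},
    {"service": "db", "kpi": "Disk Usage", "drift_pct": 41.0},
]

episodes = group_into_episodes(alerts)
for service, grouped in episodes.items():
    kpis = ", ".join(alert["kpi"] for alert in grouped)
    print(f"{service}: {len(grouped)} alert(s) for {kpis}")
```
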
Last modified on 20 February, 2025

This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.20.0

