Splunk® IT Service Intelligence

Service Insights Manual

Splunk IT Service Intelligence (ITSI) version 4.11.x reached its End of Life on December 6, 2023. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see Before you upgrade IT Service Intelligence.
This documentation does not apply to the most recent version of Splunk® IT Service Intelligence. For documentation on the most recent version, go to the latest release.

Create time-based static KPI thresholds in ITSI

Time-based static thresholds let you define specific threshold values to be used at different times to account for changing workloads over time. Use time-based static thresholds if you know the workload schedule for a specific KPI. Time policies accommodate normal variations in usage across your services and improve the accuracy of KPI and service health scores.

For example, if your organization's peak activity is during the standard work week, you might create a KPI threshold time policy that accounts for higher levels of usage during work hours, and lower levels of usage during off-hours and weekends.
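As a rough illustration of this idea, the following Python sketch picks a set of static thresholds based on the day and hour of a KPI reading. The policy title, hour ranges, threshold values, and data layout are all hypothetical and chosen only for this example. They do not reflect how ITSI stores or evaluates time policies.

```python
from datetime import datetime

# Hypothetical time policies: each covers a set of weekdays (0 = Monday) and an
# hour range [start_hour, end_hour), and carries its own static thresholds.
POLICIES = [
    {
        "title": "Work hours",
        "days": {0, 1, 2, 3, 4},
        "start_hour": 9,
        "end_hour": 17,
        "thresholds": {"critical": 94, "high": 88, "medium": 75},
    },
]

# Hypothetical fallback thresholds for times no policy covers
# (off-hours and weekends in this example).
DEFAULT_THRESHOLDS = {"critical": 60, "high": 50, "medium": 40}


def thresholds_for(moment: datetime) -> dict:
    """Return the thresholds of the policy covering `moment`, else the default."""
    for policy in POLICIES:
        in_days = moment.weekday() in policy["days"]
        in_hours = policy["start_hour"] <= moment.hour < policy["end_hour"]
        if in_days and in_hours:
            return policy["thresholds"]
    return DEFAULT_THRESHOLDS


def severity(value: float, thresholds: dict) -> str:
    """Map a KPI value to the highest severity whose threshold it reaches."""
    for level in ("critical", "high", "medium"):
        if value >= thresholds[level]:
            return level
    return "normal"
```

In this sketch, a value of 80 on a Monday at 10:00 maps to medium under the work-hours thresholds, while the same value on a Saturday maps to critical under the lower off-hours thresholds.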

IT Service Intelligence (ITSI) stores thresholding information at the KPI level in the KV store. Any updates you make to a KPI threshold template are applied to all KPIs using that template, overriding any changes made to those KPIs. Updates are also applied to any services or service templates using those KPIs.

You can only have one active time policy at any given time. When you create a new time policy, the previous time policy is overwritten and cannot be recovered.

Available KPI threshold templates

ITSI provides default thresholding templates that you can use to build your time policies. You can select templates with different time block combinations, such as work hours, off-hours, weekends, AM/PM, 3-hour blocks, 2-hour blocks, and so on.

Thresholding templates are either static or adaptive. Use static templates to create time policies that do not change after you configure them. Use adaptive templates to create time policies that generate thresholds dynamically and update daily based on changes in your data. You can use adaptive thresholds with aggregate thresholds but not per-entity thresholds.

The following template types are available:

Quantile: Lets you put threshold bounds at various percentiles based on historic data. For example, you might set critical severity for data points falling below the 1st percentile (0.01) or above the 99th percentile (0.99).

Because you must choose percentile threshold values between 0 and 1, this is the only algorithm that never produces thresholds below your historical data minimum or above your historical data maximum. Therefore, you are likely to see your threshold bounds crossed repeatedly, which could influence your alerting strategy.

Standard deviation: Lets you specify thresholds based on multiples of the standard deviation from the mean. Negative values produce thresholds below the mean and positive values produce thresholds above the mean.

Standard deviation is most useful for cyclical data, such as customer traffic volumes. For example, customer traffic increases at 8:00 AM, peaks at 1:00 PM, starts to drop at 5:00 PM, and goes very low in the middle of the night.

This algorithm is sensitive to outliers in historic data, which can produce much larger threshold values than you want or expect. Additionally, if your data is skewed toward large values, with high outliers but no low outliers (as you might see with a response time KPI), you might struggle to generate meaningful lower bound thresholds.

If your data is well distributed around a mean, this algorithm could be a good choice for you. Even if your data is skewed, standard deviation is a good choice if you only care about meaningful thresholds in the direction of the skew.

Range: Focuses on the minimum and maximum data points from your historic data and the span between those values (max – min). Each threshold is the minimum plus a multiple of the span: threshold = min + (multiplier × span). A value of 0 sets the threshold to the historic data minimum, and a value of 1 sets it to the historic data maximum. A value of -1 sets the threshold to the minimum minus the span, and a value of 2 sets it to the maximum plus the span.

Outliers in historical data directly affect the span value, which in turn affects the thresholds. If you need to specify thresholds beyond the historic data minimum and maximum, and standard deviation doesn't work for you, this could be a good choice.
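The arithmetic behind the three algorithms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the quantile uses linear interpolation between sorted data points, and the standard deviation is the population form. ITSI's own calculations might differ in these details.

```python
import statistics


def quantile_threshold(history, q):
    """Threshold at quantile q (0 to 1), interpolating between sorted points.

    Assumption: linear interpolation. The result always lies between the
    minimum and maximum of the historic data.
    """
    data = sorted(history)
    position = q * (len(data) - 1)
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (position - lower)


def stdev_threshold(history, multiplier):
    """Threshold at mean + multiplier * standard deviation.

    Assumption: population standard deviation. Negative multipliers give
    thresholds below the mean, positive multipliers above it.
    """
    return statistics.mean(history) + multiplier * statistics.pstdev(history)


def range_threshold(history, multiplier):
    """Threshold at min + multiplier * span, where span = max - min."""
    low, high = min(history), max(history)
    return low + multiplier * (high - low)
```

For the historic values 10, 20, 30, 40, and 50, the span is 40, so `range_threshold` returns 10 for a multiplier of 0, 50 for 1, -30 for -1, and 90 for 2. Adding a single outlier such as 500 to the same data raises both the standard deviation and range thresholds sharply, while the quantile thresholds stay within the observed data.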

Time zones with threshold templates

Time blocks in threshold templates, including custom templates you create, are stored in the backend in UTC but presented in the UI in your own time zone. For example, if you're on PST (UTC-8), a time block of 11:00 AM - 12:00 PM on your system is stored in the backend as 7:00 PM - 8:00 PM. This doesn't affect the preview thresholds.

If another user logs in from a different time zone and views the exact same time policy, they'll see the time blocks in their own time zone. For example, a person on EST would see the exact same time block as above as 2:00 PM - 3:00 PM, but the name of the time policy would remain the same. If your organization has people using the same system in two different time zones, this behavior could be confusing for one set of users.

Time policies support time zone offsets of 15, 30, and 45 minutes for compatibility with non-hourly time zones (such as (GMT+05:30) Chennai, Kolkata, Mumbai, New Delhi). This allows users in non-hourly time zones to accurately apply time policies created by users in hourly time zones.
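The conversion between a user's local view and the stored UTC value can be sketched with Python's standard library. The helper name `block_in_zone` and the reference date are hypothetical, and the sketch assumes fixed standard-time offsets (PST as UTC-8, EST as UTC-5, IST as UTC+5:30) with no daylight saving adjustment.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets, assumed for this example only.
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def block_in_zone(start_hhmm, duration_hours, source_tz, target_tz,
                  day=(2023, 1, 9)):
    """Express a time block defined in source_tz as (start, end) in target_tz.

    `day` is an arbitrary reference date needed to build a full datetime.
    """
    hour, minute = map(int, start_hhmm.split(":"))
    start = datetime(*day, hour, minute, tzinfo=source_tz)
    end = start + timedelta(hours=duration_hours)
    fmt = "%H:%M"
    return (start.astimezone(target_tz).strftime(fmt),
            end.astimezone(target_tz).strftime(fmt))
```

Under these assumptions, `block_in_zone("11:00", 1, PST, timezone.utc)` returns `("19:00", "20:00")`, the same block viewed in EST is `("14:00", "15:00")`, and in IST it is `("00:30", "01:30")`, which shows why 30-minute offsets matter. When daylight saving time is in effect, the UTC values shift by one hour.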

Apply a threshold time policy to a KPI

Perform the following steps to apply a threshold time policy to a KPI.

  1. In the KPIs list, select the specific KPI for which you want to set a threshold time policy.
  2. Expand the Thresholding panel.
  3. Select Use Thresholding Template.
  4. Select a thresholding template such as 3-hour blocks every day (adaptive/stdev). Selecting an adaptive template automatically enables Adaptive Thresholding and Time Policies. ITSI backfills the threshold preview with aggregate data from the past 7 days. If the summary index contains data, ITSI loads it; otherwise, it loads raw data, which can take some time.
  5. Expand the Configure Thresholds for Time Policies panel.
  6. (Optional) Select a time policy block, then click Apply Adaptive Thresholding. For more information, see Apply adaptive thresholds to a KPI in ITSI.
    AdaptiveThresholds.png
    ITSI generates your new threshold time policy, which you can view in the Preview Aggregate Thresholds window. Adaptive thresholds update once daily at midnight.
  7. Click Save.

Time policies cannot overlap. If you attempt to create a policy that overlaps with another policy, a validation error appears.
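The kind of check behind this validation can be sketched as follows. This is a hypothetical illustration, not ITSI's actual validation logic: each policy is assumed to be a tuple of weekdays (0 = Monday), a start time, and a duration in hours, and two policies overlap if any of their weekly time blocks intersect.

```python
WEEK_MINUTES = 7 * 24 * 60


def weekly_intervals(days, start_hhmm, duration_hours):
    """Expand a policy into (start, end) minute offsets from Monday 00:00."""
    hours, minutes = map(int, start_hhmm.split(":"))
    intervals = []
    for day in days:
        start = day * 24 * 60 + hours * 60 + minutes
        intervals.append((start, start + duration_hours * 60))
    return intervals


def overlaps(policy_a, policy_b):
    """Return True if any block of policy_a intersects any block of policy_b."""
    for a_start, a_end in weekly_intervals(*policy_a):
        for b_start, b_end in weekly_intervals(*policy_b):
            # Also compare against b shifted by one week in each direction so
            # that blocks running past Sunday midnight are handled.
            for shift in (-WEEK_MINUTES, 0, WEEK_MINUTES):
                if a_start < b_end + shift and b_start + shift < a_end:
                    return True
    return False
```

Blocks that only touch, such as 09:00-17:00 followed by 17:00-24:00 on the same days, are not treated as overlapping in this sketch because each interval excludes its end time.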

Visualize the distribution of your data

A relatively simple Splunk search lets you visualize the distribution of your data and see outliers. First, make sure the KPI of interest has been built and backfilled. Once the KPI has been backfilled or has been running for several weeks, the itsi_summary index holds enough data to run the search.

Run the following Splunk search from within ITSI. Insert your KPI name as needed at the beginning.

index=itsi_summary is_service_aggregate=1 kpi="<YOUR KPI NAME HERE>" | bin alert_value as bin_field | stats max(alert_value) as sort_field count by bin_field | sort sort_field | fields - sort_field | makecontinuous bin_field

The above SPL produces the following visualization of your data:

Visualizedist.png

The search consolidates the many alert_value results down to a smaller set of buckets using the bin command. It also leverages the makecontinuous command to ensure that any outliers are visually apparent. The results above show a fairly uniform data distribution with no outliers. If necessary, you can add the span option to your bin command to provide more granularity to your graph.
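Conceptually, the bucketing and gap-filling described above can be sketched in Python. This is an illustration of the idea only, not how Splunk implements the bin or makecontinuous commands. The function name is hypothetical, and the sketch assumes an explicit integer bucket width rather than an automatically chosen one.

```python
from collections import Counter


def binned_distribution(values, span):
    """Count values per fixed-width bucket and fill empty buckets with 0.

    `span` is assumed to be a positive integer bucket width. Each value falls
    into the bucket starting at the nearest multiple of `span` at or below it.
    """
    counts = Counter(int(value // span) * span for value in values)
    first, last = min(counts), max(counts)
    # Emit every bucket between the first and last, including empty ones,
    # so that an isolated outlier shows up as a visible gap.
    return [(start, counts.get(start, 0))
            for start in range(first, last + span, span)]
```

For the values 12, 15, 18, 22, 25, and 95 with a span of 10, the result has buckets at 10 and 20 with counts of 3 and 2, six empty buckets from 30 through 80, and a single value in the bucket at 90. The run of empty buckets is what makes the outlier stand out in a chart.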

Create custom threshold time policies

You can create custom threshold time policies tailored to your specific monitoring needs and variations in the usage of your services. You can also enable adaptive thresholding for custom time policies, and set a training window over which historical KPI data is analyzed for adaptive threshold adjustments.

Prerequisite

You must have the write_itsi_kpi_threshold_template capability to create custom threshold time policies. The itoa_admin role is assigned this capability by default.

Steps

  1. In the KPIs list, select the KPI for which you want to set a threshold time policy.
  2. Expand the Thresholding panel.
  3. Select Set Custom Thresholds.
  4. For Enable Time Policies, select Yes.
  5. (Optional) Click Yes to enable adaptive thresholding. Set the Training Window, which is the time window over which historical KPI data is analyzed for adaptive threshold updates.
  6. Expand the Configure Thresholds for Time Policies panel.
  7. Click Add Time Policy.
  8. Configure your custom time policy:
    Title: The name of the threshold time policy.
    Start Time (HH:MM): The specific hour and minute at which the threshold time policy begins. The start time supports 15, 30, and 45 minute offsets for compatibility with non-hourly time zones.
    Duration: The number of hours to which the threshold applies.
    Repeat: The specific days of the week to which the time policy applies.
    Apply the threshold values from an existing time policy?: (Optional) Choose whether to copy the threshold values from an existing policy over to this policy. This option saves you time and effort if your policies have identical or similar thresholds.
  9. Click Add.
  10. (Optional) To apply adaptive thresholding to this specific time policy, select the policy and click Apply Adaptive Thresholding. For more information, see Apply adaptive thresholds to a KPI in ITSI.

    Policy types that support adaptive thresholds include Standard deviation, Quantile, and Range.

Copy threshold values between policies

If you're creating multiple time policies that require the same threshold values, you can save time by copying the threshold levels and their corresponding values from one policy to another. For example, if you configure Policy 1 with the threshold values Critical=94, High=88, and Medium=75, and you want to use those same levels and values in Policy 2 and Policy 3, you can copy them over rather than inputting the values manually.

  1. Click the More Options icon on a policy and choose Copy threshold values.
    CopyThresholds.png
  2. Select the policies to apply the current threshold values to. You can copy either the aggregate or the per-entity thresholds of the current policy to the aggregate thresholds, the per-entity thresholds, or both, in the same policy or in other policies.
  3. Click Save. ITSI copies the threshold levels, values, and base severity from the original policy over to the policies you selected.
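The effect of the copy can be pictured with a small Python sketch. The dictionary layout and the function name are hypothetical and chosen only for illustration. The sketch assumes that each target receives its own independent copy, so later edits to one policy leave the others unchanged.

```python
import copy


def copy_threshold_values(policies, source, targets):
    """Copy the source policy's levels, values, and base severity to targets."""
    for title in targets:
        # Deep copy so each target gets its own independent set of values.
        policies[title]["thresholds"] = copy.deepcopy(
            policies[source]["thresholds"]
        )


# Hypothetical policies keyed by title.
policies = {
    "Policy 1": {"thresholds": {
        "base_severity": "normal",
        "levels": {"Critical": 94, "High": 88, "Medium": 75},
    }},
    "Policy 2": {"thresholds": {"base_severity": "normal", "levels": {}}},
    "Policy 3": {"thresholds": {"base_severity": "normal", "levels": {}}},
}

copy_threshold_values(policies, "Policy 1", ["Policy 2", "Policy 3"])
```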

Create a new KPI threshold template

Create a custom KPI threshold template with specific time block combinations that are tailored to your business case.

Prerequisite

You must have the write_itsi_kpi_threshold_template capability to create KPI threshold templates. The itoa_admin role has this capability by default.

Steps

  1. In the ITSI main menu, click Configuration > KPI Threshold Templates.
  2. Click Create Threshold Template. Your role must have write access to the Global team to see this option.
  3. (Optional) Enter a title and description. KPI threshold templates can only exist in the Global team.
  4. Click Create.
  5. Select the template you just created in the KPI Threshold Templates lister page.
  6. Select a Preview Service and Preview KPI. For example, Database Service and CPU Utilization %, respectively.
  7. Select Aggregate Thresholds for the preview to display the single aggregate search value. Or, select Per Entity Thresholds for the preview to display KPI values for each individual entity against which the KPI runs.
  8. Expand the Configure Thresholds for Time Policies panel.
  9. Define a time policy for this template. Click Save.

You can now apply your custom thresholding template to any number of KPIs.

Edit thresholding templates

ITSI provides an editing page where you can modify any existing thresholding templates. For example, you might want to change the days and hours for a specific time policy, change the policy type, or apply adaptive thresholding.

  1. Select a KPI in the service definition.
  2. Expand the Thresholding panel.
  3. Select Use Thresholding Template and select the template you want to edit.
  4. Click Edit Template.
  5. Make your modifications to the thresholding template.
  6. Click Save.
Last modified on 15 January, 2022

This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.11.0, 4.11.1, 4.11.2, 4.11.3, 4.11.4, 4.11.5, 4.11.6

