Splunk® IT Service Intelligence

Administration Manual



Splunk IT Service Intelligence version 4.0.x reached its End of Life on January 19, 2021. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see Plan an upgrade of IT Service Intelligence.
This documentation does not apply to the most recent version of ITSI.

Add a KPI to a service in ITSI

ITSI uses KPI searches to monitor the performance of your IT services. For more information on KPIs, see ITSI concepts and features in this manual.

You must add at least one KPI search to a service to use ITSI. For information on how the number of KPIs can impact performance, see Performance considerations in this manual.

About KPI search properties

When you create a KPI search, you configure a set of search properties. KPI search properties include:

Source search
A search string that you define as the basis for your KPI using a data model search, ad hoc search, metrics search, or base search.
Threshold field
The specific field in your data that the KPI search monitors and aggregates. For example, CPU_load_percent or mem_free.
Entity Split Field (optional)
A field in your data that you can use to break down the KPI. This lets you apply a KPI search to multiple entities and compare search results on a per-entity basis. This field can be different from the Entity Filter Field.
Entity Filter Field and Entity Alias Filtering (optional)
You can filter entities in or out of a KPI search using Entity Filter Field. You can map entity aliases to fields in your search data to determine the specific entities to which a KPI search applies.
Monitoring calculations
The recurring KPI search schedule and statistical operations on search results, including service health score calculations.
Severity-level thresholds
Thresholds that you apply to KPI search results. Severity-level thresholds let you monitor KPI status (normal, low, medium, high, and critical) and set trigger conditions for alerts.

For example, to monitor the CPU load percentage of an entity (machine) in a service, you can create a KPI using an ad hoc search that returns the value of the field cpu_load_percent at 5-minute intervals over a 5-minute time range, then set a range of severity-level thresholds between 0% and 100%.
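
The calculation in this example can be sketched in Python. This is only an illustration of averaging a field over a 5-minute window, not how ITSI itself runs the search. The sample events, the timestamps, and the function name average_in_window are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean


def average_in_window(events, field, now, window=timedelta(minutes=5)):
    """Average `field` over events whose _time falls in (now - window, now]."""
    values = [
        e[field]
        for e in events
        if field in e and now - window < e["_time"] <= now
    ]
    # No events in the window means there is no KPI value (a data gap).
    return mean(values) if values else None


# Hypothetical sample data: one event is older than the 5-minute window.
now = datetime(2018, 12, 19, 12, 0, 0)
events = [
    {"_time": now - timedelta(minutes=7), "cpu_load_percent": 90.0},
    {"_time": now - timedelta(minutes=4), "cpu_load_percent": 40.0},
    {"_time": now - timedelta(minutes=1), "cpu_load_percent": 60.0},
]

kpi_value = average_in_window(events, "cpu_load_percent", now)
print(kpi_value)
```

Only the two events inside the window contribute to the result. The resulting value is what the severity-level thresholds are then compared against.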

Create a KPI search

This section shows you how to create a basic KPI search. These instructions assume that you have already created a service. If not, see Overview of creating services in ITSI in this manual. You must create a service with at least one KPI to run ITSI.

Step 1: Add a new KPI

  1. Click Configure > Services.
  2. Select an existing service.
  3. In the New dropdown menu in the KPI tab, choose one of the following two options:
    • Select Generic KPI.
    • Select a KPI template. For example, Application Server: CPU and Memory > Memory Used. KPI templates provide pre-configured KPI source searches, including ad hoc searches and base searches, based on ITSI modules. KPI templates are tailored for specific service monitoring use cases, such as operating systems, databases, web servers, load balancers, virtual machines, and so on.
  4. In Step 1 of the KPI creation dialog, enter the KPI Title and Description (optional). Click Next.

Step 2: Define a source search

When you create a KPI, you must define a source search on which to build the KPI. You can choose from four source search types: data model, metrics search, ad hoc search, and base search.

Note: Before you define your source search, consider the performance implications for your particular deployment. While data models are suitable for smaller test environments, base searches generally provide the best performance in larger production settings. See Create KPI base searches in ITSI.

Define a source search from a data model

  1. Configure your data model search.
    Field | Description
    KPI source | Data Model
    Data Model | The data model object, child, and attribute fields. For example, Host Operating System > Memory > mem_used_percent. When you create a KPI search from a data model, the data model object field becomes the threshold field. When you create a KPI search from an ad hoc search, you must manually enter the threshold field.
    Filters (optional) | Click Add Filter to add data model filter conditions. Data model filters let you include or exclude search result data based on the filter conditions. For example, the filter condition host Equals ipaddress filters out all values for the data model search field host, except for values that equal ipaddress. Data model filtering can help improve the speed and accuracy of your searches by excluding extraneous data from search results.
  2. Click Generated Search to preview your KPI search string.
    Use the Generated Search box to view changes that ITSI makes to your search string as you build your KPI. Click anywhere on the Generated Search itself to run the search.
  3. Click Next.

Define a source search from a metrics search

  1. Configure your metrics search.
    Field | Description
    KPI source | Metrics Search. If there are no metrics indexes configured in your Splunk deployment, you receive the message: "No metrics found." For more information about metrics, see Get started with Metrics in the Splunk Enterprise Metrics Manual.
    Metrics Search | Select the metrics index from which to choose a metric.
    Metric | Select the metric to use for the KPI. For example, memory.used.
  2. Click Generated Search to preview your KPI search string. Metrics searches begin with the mstats command.
  3. Click Next.

Define a source search from an ad hoc search

  1. Configure your ad hoc search.
    Field | Description
    KPI source | Ad hoc Search
    Search | The ad hoc search string that you create. This is the event-gathering search for the KPI. Note: Avoid transforming commands, the mstats command, the `gettime` macro, and time modifiers in your KPI search. They can cause issues with KPI backfill, with the KPI threshold preview, and with the display of raw data in ITSI views, such as glass tables and deep dives, that let you run KPI searches against raw data.
    Threshold Field | The field in your data that the KPI aggregates and monitors. For pure counts, use _time.
  2. Click Generated Search to preview your KPI search string.
  3. Click Next.

Define a source search from a base search

  1. Configure your base search.
    Field | Description
    KPI source | Base Search
    Base Search | The base search that you want to associate with the KPI. For example, DA-ITSI-OS: Performance.Memory. Base searches provide pre-configured KPI templates built on ITSI modules.
    Metric | The metric that you want to associate with the KPI. For example, mem_free_percent.
  2. Click Generated Search to preview your KPI search string.
  3. Click Next.

Note: Most fields in the next window (steps 3 through 6) are pre-populated for the base search by the KPI template. For more information on how to create and configure KPI base searches, see Create KPI base searches.

Step 3: Filter entities

Filter entities to have more granular control of your KPI at the entity level.

Split by Entity

The Split by Entity option lets you maintain a breakdown of KPI values at the entity level. Use Split by Entity to enable monitoring of KPI values for each individual entity against which a KPI is running.

You must split KPIs by entity to use certain ITSI features, such as per-entity thresholds. See "Step 6: Set thresholds" on this page.

Configure the following fields:

Field | Description
Split by Entity | Enable/disable a breakdown of KPI values at the entity level. The KPI must be running against two or more entities.
Entity Split Field | Specify the field in your data to use to look up the corresponding split by entities. The default lookup field for data model searches and ad hoc searches is host. For metrics searches, select a dimension associated with the metric. This field is case sensitive.

When filtering a KPI down to entities, you can split by a field other than the field you are using for filtering the entities (specified in the Entity Filter Field). This allows you to filter to the hosts that affect your service, but split out your data by a different field. For example, you might want to filter down to all of your database hosts but split the metric by the processes running on the hosts.
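
The difference between filtering and splitting can be sketched in Python. This is a simplified illustration, not ITSI's implementation. The host names, process names, the cpu_pct field, and the function name filter_and_split are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def filter_and_split(events, service_hosts, filter_field, split_field, value_field):
    """Keep events whose filter_field matches a service entity, then
    average value_field separately for each value of split_field."""
    groups = defaultdict(list)
    for event in events:
        if event.get(filter_field) not in service_hosts:
            continue  # filtered out: not an entity in the service
        if split_field not in event or value_field not in event:
            continue  # skip events that lack the fields we need
        groups[event[split_field]].append(event[value_field])
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical events: filter to the database hosts, split by process.
events = [
    {"host": "db01", "process": "mysqld", "cpu_pct": 70.0},
    {"host": "db02", "process": "mysqld", "cpu_pct": 50.0},
    {"host": "db01", "process": "backup", "cpu_pct": 10.0},
    {"host": "web01", "process": "nginx", "cpu_pct": 30.0},
]

per_process = filter_and_split(events, {"db01", "db02"}, "host", "process", "cpu_pct")
print(per_process)
```

The web01 event is dropped by the filter field (host), while the remaining events are grouped by the split field (process).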

Entity filtering

Entity filtering lets you specify the entities against which a KPI search will run. Provide the entity filter field and apply entity alias filters to reduce collection of extraneous data.

Configure the following fields:

Field | Description
Filter to Entities in Service | Enable/disable entity filtering.
Entity Filter Field | Specify the field in your data to use to look up the corresponding entities by which to filter the KPI. For metrics searches, select a dimension for the metric. The default field for data model searches, ad hoc searches, and metrics searches is host. This field can be different from the field used for the Entity Split Field.
Entity Alias Filtering | The specific entity alias that you want to use as a filter. For example, host. This filters out all entity alias values from the KPI search, except those that belong to the alias named host. For example, if your service has these two entity aliases:
host=sr-centos2.sv.splunk.com
IP=10.141.20.37

your KPI search will include an OR clause, such as:

| search Performance.dest=sr-centos2.sv.splunk.com OR dest=10.141.20.37

When you specify host as an entity alias filter, this limits the KPI search clause to:

| search Performance.dest=sr-centos2.sv.splunk.com

For more information on entity aliases, see Define Entities in ITSI in this manual.
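
The effect of an entity alias filter on the search clause can be sketched in Python. This is a simplified illustration under stated assumptions, not the code ITSI uses to generate searches: the function name build_entity_filter is hypothetical, and the sketch applies the same field name to every alias value, which may differ from the exact clause ITSI produces.

```python
def build_entity_filter(filter_field, aliases, alias_filter=None):
    """Build a '| search' clause from (alias_name, value) pairs.

    If alias_filter is given, only values of that alias are included.
    """
    values = [
        value
        for name, value in aliases
        if alias_filter is None or name == alias_filter
    ]
    if not values:
        return ""  # nothing to filter on
    return "| search " + " OR ".join(f"{filter_field}={value}" for value in values)


# The two entity aliases from the example above.
aliases = [("host", "sr-centos2.sv.splunk.com"), ("IP", "10.141.20.37")]

all_clause = build_entity_filter("Performance.dest", aliases)
host_clause = build_entity_filter("Performance.dest", aliases, alias_filter="host")
print(all_clause)
print(host_clause)
```

Without an alias filter, both alias values end up in the OR clause. With host as the alias filter, only the host value remains.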

Step 4: Add monitoring calculations

Configure the following KPI monitoring calculations:

Field | Description
KPI Search Schedule | Determines the frequency of the KPI search. Avoid scheduling searches at one-minute intervals. Running multiple concurrent KPI searches at short intervals can produce lengthy search queues and is not necessary to monitor most KPIs.
Service/Aggregate Calculation | The statistical operation that ITSI performs on KPI search results. The correct aggregate calculation to use depends on the type of KPI search. For example, if your search returns results for CPU load percentage, you could use Average. If your search returns a count, such as the number of errors, use Count.
Calculation Window | The time period over which the calculation applies. For example, Last 5 Minutes.
Fill Data Gaps with | How to treat gaps in your data. This affects how KPI data gaps are displayed in service analyzers, deep dive KPI lanes, glass table KPI widgets, and other dashboards in ITSI populated by the summary index.
  • Select Null values to fill gaps in data with N/A values. Also select the severity level to use for Null values.
  • Select Last available value to use the last reported value in the ITSI summary index. For aggregate level KPIs, service aggregate data gaps are filled with the last reported aggregate KPI value. For entity level KPIs, entity data gaps are filled with the corresponding entity's last available value. After the entity gaps have been filled, the service aggregate result is calculated for the KPI.
  • Select Custom value to specify a value to use when there is a gap in data. Enter a positive integer.

Filled gap values are not used in the calculations performed for Anomaly Detection and Adaptive Thresholding.
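
The three gap-filling options can be sketched in Python. This is an illustration of the behavior described above, not ITSI's implementation. The function name fill_gaps, the mode names, and the sample series are hypothetical, and the sketch does not model the time limit on Last available value.

```python
def fill_gaps(series, mode, custom_value=None):
    """Fill gaps (None) in a list of KPI values.

    mode is one of "null", "last_available", or "custom".
    """
    if mode not in ("null", "last_available", "custom"):
        raise ValueError(f"unknown mode: {mode}")
    filled = []
    last = None  # last value actually reported, if any
    for value in series:
        if value is not None:
            last = value
            filled.append(value)
        elif mode == "null":
            filled.append("N/A")
        elif mode == "last_available":
            # Nothing reported yet means there is nothing to carry forward.
            filled.append(last if last is not None else "N/A")
        else:  # custom
            filled.append(custom_value)
    return filled


# Hypothetical KPI series where None marks a data gap.
series = [12, None, None, 15, None]

print(fill_gaps(series, "null"))
print(fill_gaps(series, "last_available"))
print(fill_gaps(series, "custom", custom_value=1))
```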

Click Next.

How filling data gaps with last reported value works

Each time the saved search runs for a KPI with the Fill Data Gaps with option set to Last available value, the alert value for the KPI is cached in a KV store collection called itsi_kpi_summary_cache. ITSI uses a lookup named itsi_kpi_alert_value_cache in the KPI saved search to fill entity-level and service-aggregate gaps for the KPI using the cached alert value.

To keep entity and service-aggregate KPI results from bloating the collection, a retention policy runs on the itsi_kpi_summary_cache collection using a Splunk modular input. By default, the modular input runs every 15 minutes and removes cache entries that have not been updated for more than 30 minutes. You can change the frequency and retention time in the [itsi_age_kpi_alert_value_cache://age_kpi_alert_value_cache] stanza of the SA-ITOA/local/inputs.conf file.

The filling of data gaps with the last reported value occurs for at most 45 minutes, in accordance with the modular input interval and retention time (15 minutes + 30 minutes by default). If data gaps for a KPI continue to occur for more than 30 to 45 minutes, the KPI will stop getting filled with the last reported value and data gaps will start displaying as N/A values.
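
The timing described above can be sketched in Python. This is a minimal model of the retention policy under the default values, not the actual modular input. The function names purge_stale and lookup_fill_value, the dictionary layout, and the KPI key are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(minutes=30)       # default retention time
PURGE_INTERVAL = timedelta(minutes=15)  # default modular input interval


def purge_stale(cache, now, retention=RETENTION):
    """Return the cache without entries not updated for more than `retention`."""
    return {
        key: entry
        for key, entry in cache.items()
        if now - entry["updated"] <= retention
    }


def lookup_fill_value(cache, key):
    """Value used to fill a gap: the cached alert value, or N/A if purged."""
    entry = cache.get(key)
    return entry["alert_value"] if entry else "N/A"


# Hypothetical cache entry, last updated at t0; no data arrives afterwards.
t0 = datetime(2018, 12, 19, 12, 0, 0)
cache = {"kpi1": {"alert_value": 42.0, "updated": t0}}

# Purge runs happen every 15 minutes. An entry that is exactly 30 minutes
# old survives a run, so it is only removed by the run at 45 minutes.
after_30 = purge_stale(cache, t0 + 2 * PURGE_INTERVAL)
after_45 = purge_stale(cache, t0 + 3 * PURGE_INTERVAL)

print(lookup_fill_value(after_30, "kpi1"))
print(lookup_fill_value(after_45, "kpi1"))
print(RETENTION + PURGE_INTERVAL)
```

Depending on how the last update lines up with the purge schedule, the entry is removed somewhere between 30 and 45 minutes after it was last updated, which matches the range given above.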

Step 5: Optional setup

Configure the following optional settings:

Field | Description
Unit | The unit of measurement that you want to appear in KPI visualizations. For example, GB, Mbps, secs, and so on.
Monitoring lag | The monitoring lag time (in seconds) to offset the indexing lag. When indexing large quantities of data, an indexing lag can occur, which can cause performance issues.
Enable backfill | Enables backfill of the KPI summary index (itsi_summary). Requires you to have indexed adequate raw data for the backfill period. You must have 7 days of summary data in the summary index for Adaptive Thresholding to work properly. If backfill is performed for a KPI using Last available value to fill data gaps, then data gaps are backfilled with filled-in alert values (using the last reported value for the KPI) instead of N/A alert values. If backfill is performed for a KPI using a Custom value to fill data gaps, then data gaps are backfilled with filled-in alert values (using the custom value provided), instead of N/A alert values.
Backfill period | The time range of data that is available after backfill is complete. Selecting a backfill period initiates the backfill. A message appears in Splunk Web that informs you when the backfill is complete.

Click Next.
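
The idea behind the monitoring lag setting can be sketched in Python. This sketch assumes that the lag shifts both ends of the search time window back by the same amount, so that recently arrived events have time to be indexed. That is an assumption made for illustration, not a description of ITSI internals, and the function name search_window is hypothetical.

```python
from datetime import datetime, timedelta


def search_window(run_time, window_minutes, lag_seconds):
    """Return (earliest, latest) for a search run at run_time,
    with the whole window shifted back by the monitoring lag."""
    latest = run_time - timedelta(seconds=lag_seconds)
    earliest = latest - timedelta(minutes=window_minutes)
    return earliest, latest


# Hypothetical values: a 5-minute calculation window and a 30-second lag.
run_time = datetime(2018, 12, 19, 12, 0, 0)
earliest, latest = search_window(run_time, window_minutes=5, lag_seconds=30)
print(earliest.time(), latest.time())
```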

Step 6: Set thresholds

Severity-level thresholds determine the current status of your KPI. When KPI values meet threshold conditions, the KPI status changes, for example, from high (yellow) to critical (red). The current status of the KPI is reflected in all views across the product, including Service Analyzers, Glass Tables, and Deep Dives.

You can manually add threshold values for your KPIs one at a time using the threshold preview window. Alternatively, you can apply threshold time policies, which automatically adapt threshold values based on day and time. See Create KPI threshold time policies on this page.

ITSI supports two types of KPI severity-level thresholds: aggregate thresholds and per-entity thresholds. Adaptive thresholds can be used with aggregate thresholds but not with per-entity thresholds.
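
How a KPI value maps to a severity level can be sketched in Python. This is a minimal illustration, assuming that each threshold is the lower bound of a severity range and that higher values are worse. The threshold values, host names, and the function name severity are hypothetical.

```python
from bisect import bisect_right


def severity(value, thresholds, base="normal"):
    """Map a KPI value to a severity level.

    thresholds is a list of (lower_bound, level) pairs sorted by lower_bound.
    Values below the first bound get the base severity.
    """
    bounds = [bound for bound, _ in thresholds]
    index = bisect_right(bounds, value)
    return base if index == 0 else thresholds[index - 1][1]


# Hypothetical thresholds for a percentage-based KPI.
thresholds = [(50, "low"), (70, "medium"), (85, "high"), (95, "critical")]

# Aggregate threshold: one status for the aggregated KPI value.
aggregate_status = severity(72.5, thresholds)

# Per-entity thresholds: one status for each entity's value.
entity_values = {"web01": 40, "web02": 96}
entity_status = {host: severity(v, thresholds) for host, v in entity_values.items()}

print(aggregate_status)
print(entity_status)
```

The same threshold ranges are applied in both cases. The difference is whether they are evaluated once against the aggregate value or once per entity.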

Set aggregate threshold values

Aggregate thresholds are useful for monitoring the status of aggregated KPI values. For example, you might apply aggregate thresholds to monitor the status of KPIs that return the total number of service requests or service errors, based on a calculation that uses the stats count function.

  1. Click Aggregate Thresholds.
  2. Click Add threshold to add a range of severity-level thresholds to the threshold preview graph.
  3. Click Finish.

Set per-entity threshold values

Per-entity thresholds are useful for monitoring multiple separate entities against which a single KPI is running. For example, you might have a KPI, such as Free Memory %, that is running against three separate servers. Using per-entity thresholds, you can monitor the status of Free Memory % on each individual server.

Adaptive thresholding cannot be used on a per-entity basis.

Prerequisites

To use per-entity thresholds, a KPI must be split by entity. See "Step 3: Filter entities" above.

Steps

  1. Click Per Entity Thresholds.
  2. Click Add threshold to add a range of severity-level thresholds to the threshold preview graph.
    The threshold preview shows a separate search results graph for each entity that the KPI is running against.
  3. Click Finish.
Last modified on 19 December, 2018

This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4