
Add a KPI to a service in ITSI
ITSI uses KPI searches to monitor the performance of your IT services. For more information on KPIs, see ITSI concepts and features in this manual.
You must add at least one KPI search to a service to use ITSI. For information on how the number of KPIs can impact performance, see Performance considerations in this manual.
About KPI search properties
When you create a KPI search, you configure a set of search properties. KPI search properties include:
- Source search
- A search string that you define as the basis for your KPI using a data model search, ad hoc search, metrics search, or base search.
- Threshold field
- The specific field in your data that the KPI search monitors and aggregates. For example, CPU_load_percent or mem_free.
- Entity Split Field (optional)
- Field in your data that can be used to break down the KPI. This allows you to apply a KPI search to multiple entities, enabling comparative analysis of search results on a per entity basis. This field can be different from the Entity Filter Field.
- Entity Filter Field and Entity Alias Filtering (optional)
- You can filter entities in or out of a KPI search using Entity Filter Field. You can map entity aliases to fields in your search data to determine the specific entities to which a KPI search applies.
- Monitoring calculations
- The recurring KPI search schedule and statistical operations on search results, including service health score calculations.
- Severity-level thresholds
- Thresholds that you apply to KPI search results. Severity-level thresholds let you monitor KPI status (normal, low, medium, high, and critical) and set trigger conditions for alerts.
For example, to monitor the CPU load percentage of an entity (machine) in a service, you can create a KPI using an ad hoc search that returns the value of the field cpu_load_percent at 5-minute intervals over a 5-minute time range, then set a range of severity-level thresholds between 0% and 100%.
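A minimal sketch of such an ad hoc source search follows. The index, sourcetype, and field names are illustrative assumptions, not values that ITSI supplies:

```spl
index=os_perf sourcetype=vmstat
| table _time, host, cpu_load_percent
```

With a search like this, cpu_load_percent would be the threshold field, Average a typical aggregate calculation, and Last 5 Minutes the calculation window.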
Create a KPI search
This section shows you how to create a basic KPI search. These instructions assume that you have already created a service. If not, see Overview of creating services in ITSI in this manual. You must create a service with at least one KPI to run ITSI.
Step 1: Add a new KPI
- Click Configure > Services.
- Select an existing service.
- In the New dropdown menu in the KPI tab, choose one of the following two options:
- Select Generic KPI.
- Select a KPI template. For example, Application Server: CPU and Memory > Memory Used. KPI templates provide pre-configured KPI source searches, including ad hoc searches and base searches, based on ITSI modules. KPI templates are tailored for specific service monitoring use cases, such as operating systems, databases, web servers, load balancers, virtual machines, and so on.
- In Step 1 of the KPI creation dialog, enter the KPI Title and an optional Description. Click Next.
Step 2: Define a source search
When you create a KPI, you must define a source search on which to build the KPI. You can choose from four source search types: data model, metrics search, ad hoc search, and base search.
Note: Before you define your source search, consider the performance implications for your particular deployment. While data models are suitable for smaller test environments, base searches generally provide the best performance in larger production settings. See Create KPI base searches in ITSI.
Define a source search from a data model
- Configure your data model search.
Field | Description |
---|---|
KPI source | Data Model |
Data Model | The data model object, child, and attribute fields. For example, Host Operating System > Memory > mem_used_percent. When you create a KPI search from a data model, the data model object field becomes the threshold field. (When you create a KPI search from an ad hoc search, you must enter the threshold field manually.) |
Filters (optional) | Click Add Filter to add data model filter conditions. Data model filters include or exclude search result data based on the filter conditions. For example, the filter condition host Equals ipaddress filters out all values of the data model search field host, except those that equal ipaddress. Data model filtering can help improve the speed and accuracy of your searches by excluding extraneous data from search results. |
- Click Generated Search to preview your KPI search string.
Use the Generated Search box to view changes that ITSI makes to your search string as you build your KPI. Click anywhere on the Generated Search itself to run the search.
- Click Next.
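ITSI generates the data model search string for you, but a hand-written search over a comparable data model object might look like the following sketch (the model and object names here are hypothetical, not the exact names ITSI uses):

```spl
| datamodel Host_Operating_System Memory search
| table _time, host, Memory.mem_used_percent
```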
Define a source search from a metrics search
- Configure your metrics search.
Field | Description |
---|---|
KPI source | Metrics Search |
Metrics Search | Select the metrics index from which to choose a metric. If there are no metrics indexes configured in your Splunk deployment, you receive the message "No metrics found." For more information about metrics, see Get started with metrics in the Splunk Enterprise Metrics manual. |
Metric | Select the metric to use for the KPI. For example, memory.used. |
- Click Generated Search to preview your KPI search string. Metrics searches begin with the mstats command.
- Click Next.
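For example, a generated metrics search for the memory.used metric could resemble the following (the index name is a placeholder, and mstats syntax varies slightly between Splunk versions):

```spl
| mstats avg(memory.used) WHERE index=itsi_metrics span=5m BY host
```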
Define a source search from an ad hoc search
- Configure your ad hoc search.
Field | Description |
---|---|
KPI source | Ad hoc Search |
Search | The ad hoc search string that you create. This is the event-gathering search for the KPI. Note: Avoid using transforming commands, the mstats command, the `gettime` macro, or time modifiers in your KPI search. These can cause issues with KPI backfill, the KPI threshold preview, and the display of raw data on ITSI views that run KPI searches against raw data, such as glass tables and deep dives. |
Threshold Field | The field in your data that the KPI aggregates and monitors. For pure event counts, use _time. |
- Click Generated Search to preview your KPI search string.
- Click Next.
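For instance, a pure-count KPI tracking HTTP server errors needs only an event-gathering search; you would set Threshold Field to _time and use Count as the aggregate calculation. The index and sourcetype below are hypothetical:

```spl
index=web sourcetype=access_combined status>=500
```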
Define a source search from a base search
- Configure your base search.
Field | Description |
---|---|
KPI source | Base Search |
Base Search | The base search that you want to associate with the KPI. For example, DA-ITSI-OS: Performance.Memory. Base searches provide pre-configured KPI templates built on ITSI modules. |
Metric | The metric that you want to associate with the KPI. For example, mem_free_percent. |
- Click Generated Search to preview your KPI search string.
- Click Next.
Note: Most fields in the next window (steps 3 through 6) are pre-populated for the base search by the KPI template. For more information on how to create and configure KPI base searches, see Create KPI base searches.
Step 3: Filter entities
Filter entities to have more granular control of your KPI at the entity level.
Split by Entity
The Split by Entity option lets you maintain a breakdown of KPI values at the entity level. Use Split by Entity to enable monitoring of KPI values for each individual entity against which a KPI is running.
You must split KPIs by entity to use the following ITSI features:
- Per entity thresholds. See Set thresholds below.
- Entity overlays. See Add overlays to a deep dive in ITSI in the ITSI User Manual.
- Maximum severity view in the Service Analyzer. See Aggregate vs. maximum severity KPI values in ITSI in the ITSI User Manual.
- Cohesive anomaly detection. See Apply anomaly detection in ITSI in this manual.
Configure the following fields:
Field | Description |
---|---|
Split by Entity | Enable/disable a breakdown of KPI values at the entity level. The KPI must be running against two or more entities. |
Entity Split Field | Specify the field in your data to use to look up the corresponding split by entities. The default lookup field for data model searches and ad hoc searches is host . For metrics searches, select a dimension associated with the metric. This field is case sensitive. When filtering a KPI down to entities, you can split by a field other than the field you are using for filtering the entities (specified in the Entity Filter Field). This allows you to filter to the hosts that affect your service, but split out your data by a different field. For example, you might want to filter down to all of your database hosts but split the metric by the processes running on the hosts. |
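Conceptually, enabling Split by Entity amounts to adding a BY clause on the Entity Split Field to the KPI's aggregate calculation. A hand-written equivalent (with illustrative index, sourcetype, and field names) would be:

```spl
index=os_perf sourcetype=vmstat
| stats avg(cpu_load_percent) AS aggregate_value BY host
```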
Entity filtering
Entity filtering lets you specify the entities against which a KPI search will run. Provide the entity filter field and apply entity alias filters to reduce collection of extraneous data.
Configure the following fields:
Field | Description |
---|---|
Filter to Entities in Service | Enable/disable entity filtering. |
Entity Filter Field | Specify the field in your data to use to look up the corresponding entities by which to filter the KPI. For metrics searches, select a dimension for the metric. The default field for data model searches, ad hoc searches, and metrics searches is host. This field can be different from the field used for the Entity Split Field. |
Entity Alias Filtering | The specific entity alias that you want to use as a filter. For example, host. This filters out all entity alias values from the KPI search, except those that match the alias host. For example, if your service has the two entity aliases host=sr-centos2.sv.splunk.com and IP=10.141.20.37, the generated KPI search includes an OR clause that matches either value. |
For more information on entity aliases, see Define Entities in ITSI in this manual.
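For two entity aliases such as host=sr-centos2.sv.splunk.com and IP=10.141.20.37, the filter clause that entity filtering appends to the KPI search would resemble the following sketch (illustrative; the exact generated string can differ):

```spl
index=os_perf sourcetype=vmstat (host="sr-centos2.sv.splunk.com" OR host="10.141.20.37")
```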
Step 4: Add monitoring calculations
Configure the following KPI monitoring calculations:
Field | Description |
---|---|
KPI Search Schedule | Determines the frequency of the KPI search. Avoid scheduling searches at one minute intervals. Running multiple concurrent KPI searches at short intervals can produce lengthy search queues and is not necessary to monitor most KPIs. |
Service/Aggregate Calculation | The statistical operation that ITSI performs on KPI search results. The correct aggregate calculation to use depends on the type of KPI search. For example, if your search returns results for CPU Load percentage, you could use Average . If your search returns a count, such as number of errors, then you would want to use Count .
|
Calculation Window | The time period over which the calculation applies. For example, Last 5 Minutes .
|
Fill Data Gaps with | How to treat gaps in your data. This affects how KPI data gaps are displayed in service analyzers, deep dive KPI lanes, glass table KPI widgets, and other dashboards in ITSI populated by the summary index.
Filled gap values are not used in the calculations performed for Anomaly Detection and Adaptive Thresholding. |
Click Next.
How filling data gaps with last reported value works
Each time the saved search runs for a KPI with the Fill Data Gaps with option set to Last available value, the alert value for the KPI is cached in a KV store collection called itsi_kpi_summary_cache. ITSI uses a lookup named itsi_kpi_alert_value_cache in the KPI saved search to fill entity-level and service-aggregate gaps for the KPI using the cached alert value.
To prevent bloating of the collection with entity/service-aggregate KPI results, a retention policy runs on the itsi_kpi_summary_cache collection using a Splunk modular input. The modular input runs every 15 minutes and removes the entries from cache that have not been updated for more than 30 minutes. 15 minutes is the default frequency and 30 minutes is the default retention time for entries in cache. You can change the frequency and retention time in the [itsi_age_kpi_alert_value_cache://age_kpi_alert_value_cache]
stanza of the SA-ITOA/local/inputs.conf
file.
The filling of data gaps with the last reported value occurs for at most 45 minutes, in accordance with the modular input interval and retention time (15 minutes + 30 minutes by default). If data gaps for a KPI continue to occur for more than 30 to 45 minutes, the KPI will stop getting filled with the last reported value and data gaps will start displaying as N/A values.
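A local override that ages the cache more aggressively might look like the following sketch. The interval attribute is standard for Splunk modular inputs, but the retention attribute name shown here is an assumption; verify it against SA-ITOA/default/inputs.conf before editing:

```
# SA-ITOA/local/inputs.conf
[itsi_age_kpi_alert_value_cache://age_kpi_alert_value_cache]
# run the cache-aging input every 5 minutes (in seconds)
interval = 300
# hypothetical attribute name: remove cache entries not updated for 15 minutes
retention_time = 900
```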
Step 5: Optional setup
Configure the following optional settings:
Field | Description |
---|---|
Unit | The unit of measurement that you want to appear in KPI visualizations. For example, GB, Mbps, secs, and so on. |
Monitoring lag | The monitoring lag time (in seconds) used to offset the indexing lag. When indexing large quantities of data, an indexing lag can occur, which can cause KPI searches to run before all matching events are indexed and therefore return incomplete results. |
Enable backfill | Enables backfill of the KPI summary index (itsi_summary). Requires you to have indexed adequate raw data for the backfill period. You must have 7 days of summary data in the summary index for Adaptive Thresholding to work properly. If backfill is performed for a KPI using Last available value to fill data gaps, then data gaps are backfilled with filled-in alert values (using the last reported value for the KPI) instead of N/A alert values. If backfill is performed for a KPI using a Custom value to fill data gaps, then data gaps are backfilled with filled-in alert values (using the custom value provided), instead of N/A alert values. |
Backfill period | The time range of data that is available after backfill is complete.
Selecting a backfill period initiates the backfill. A message appears in Splunk Web that informs you when the backfill is complete. |
Click Next.
Step 6: Set thresholds
Severity-level thresholds determine the current status of your KPI. When KPI values meet threshold conditions, the KPI status changes, for example, from high (yellow) to critical (red). The current status of the KPI is reflected in all views across the product, including Service Analyzers, Glass Tables, and Deep Dives.
You can manually add threshold values for your KPIs one at a time using the threshold preview window, or apply threshold time policies, which automatically adapt threshold values based on day and time. See Create KPI threshold time policies on this page.
ITSI supports two types of KPI severity-level thresholds: Aggregate thresholds and per entity thresholds. Adaptive thresholds can be used with aggregate thresholds but not entity thresholds.
Set aggregate threshold values
Aggregate thresholds are useful for monitoring the status of aggregated KPI values. For example, you might apply aggregate thresholds to monitor the status of KPIs that return the total number of service requests or service errors, based on a calculation that uses the stats count function.
- Click Aggregate Thresholds.
- Click Add threshold to add a range of severity-level thresholds to the threshold preview graph.
- Click Finish.
Set per-entity threshold values
Per-entity thresholds are useful for monitoring multiple separate entities against which a single KPI is running. For example, you might have a KPI, such as Free Memory %, that is running against three separate servers. Using per-entity thresholds, you can monitor the status of Free Memory % on each individual server.
Adaptive thresholding cannot be used on a per-entity basis.
Prerequisites
To use per-entity thresholds, a KPI must be split by entity. See "Step 3: Filter entities" above.
Steps
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4