IT Service Intelligence concepts and features
Familiarize yourself with the following concepts and features before deploying Splunk IT Service Intelligence (ITSI).
ITSI workflow overview
The following diagram gives an overview of the basic workflows involved in configuring and using ITSI:
Key ITSI concepts
A service is a logical mapping of IT objects that applies to your business goals. Splunk ITSI provides for a broad definition of a service. For example, a service can be:
- An application or group of applications
- An infrastructure tier (such as web, database, or network tier)
- A business service, such as an online store, with multiple tiers
- A single process, such as one instance of an application running on a host
ITSI lets you create services that model your IT infrastructure. ITSI services contain KPIs (Key Performance Indicators), which make it possible to monitor service health, perform root cause analysis, receive alerts, and ensure that your IT operations are in compliance with business SLAs (service-level agreements).
An entity is an IT infrastructure component, such as:
- A physical or virtual server
- A network device (switch, router)
- A user (AD/LDAP)
- A storage system or volume
- An operating system process
- A software application (database, web server, business app)
- An application process instance (for example, two instances of the same web server application are two separate entities)
Each entity has specific attributes and relationships to other IT processes that uniquely identify it. For example, a server that you define as an entity can have multiple IP addresses, MAC addresses, DNS names, and so on.
Data comes from applications on the entities, from log files, and from API data collection processes. You can map data to entities in ITSI through aliases (field/value pairs) that you extract from Splunk searches, for example host=10.141.24.63. In this way, entities link services to Splunk search results.
Entities are optional.
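The alias mapping described above can be illustrated with a minimal Python sketch. The Entity class, the match_entity function, and the sample values other than host=10.141.24.63 are hypothetical names chosen for illustration. They are not part of the ITSI data model or API. The sketch only shows the idea of matching a field/value pair in a search result against an entity's aliases.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """Illustrative stand-in for an entity; not the ITSI data model."""
    title: str
    # Alias field name -> set of values that identify this entity.
    aliases: dict[str, set[str]] = field(default_factory=dict)


def match_entity(result: dict[str, str], entities: list[Entity]) -> Entity | None:
    """Return the first entity that has an alias field/value pair found in the result."""
    for entity in entities:
        for field_name, values in entity.aliases.items():
            if result.get(field_name) in values:
                return entity
    return None


# Hypothetical entity with two values for the "host" alias.
web01 = Entity(
    title="web01",
    aliases={"host": {"10.141.24.63", "web01.example.com"}},
)

# A search result row, represented as a plain dictionary.
row = {"host": "10.141.24.63", "cpu_load_percent": "72.5"}
matched = match_entity(row, [web01])
print(matched.title if matched else "no matching entity")
```

Because one entity can carry several alias values, such as multiple IP addresses or DNS names, results that identify the same server in different ways can still resolve to a single entity.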
A KPI (Key Performance Indicator) is a recurring saved search that returns the value of an IT performance metric, such as CPU load percentage, memory used percentage, response time, and so on.
ITSI lets you create KPIs and add them to your services. You can then use KPI search result values inside ITSI to monitor service health, check the status of IT components, and troubleshoot trends that might indicate an issue with your IT systems.
For example, cpu_load_percent is a KPI that measures the CPU load percentage on a server. If your organization has a site uptime guarantee of 99.9% per month, you need to monitor the status of this KPI (and others) to ensure that CPU performance remains within acceptable parameters.
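To make "within acceptable parameters" concrete, the following Python sketch maps a KPI value to a severity label using threshold levels. The function name kpi_severity, the threshold values, and the severity labels are assumptions made for illustration. They do not come from ITSI, where you configure KPI behavior in the product itself.

```python
def kpi_severity(value, thresholds, default="normal"):
    """Return the label of the highest threshold that the value meets or exceeds."""
    severity = default
    # Walk the thresholds in ascending order so the last match wins.
    for lower_bound, label in sorted(thresholds):
        if value >= lower_bound:
            severity = label
    return severity


# Hypothetical thresholds for a cpu_load_percent KPI.
CPU_THRESHOLDS = [(70.0, "medium"), (85.0, "high"), (95.0, "critical")]

for sample in (42.0, 88.5, 97.1):
    print(sample, kpi_severity(sample, CPU_THRESHOLDS))
```

A recurring search would supply a new value on each run, and the resulting label is what you would track over time.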
Key ITSI features
The Service Analyzer provides an overview of ITSI service health scores and KPI search results that are currently trending at the highest severity levels. Use the Service Analyzer to quickly view the status of IT operations and to identify services and KPIs running outside expected norms. Click on any tile in the Service Analyzer to drill down to the deep dives for further analysis and comparison of search results over time.
For more information, see Monitor the health of your services with the ITSI Service Analyzer in the IT Service Intelligence User Manual.
Glass tables are custom visualizations that let you monitor KPI and service health scores. You can use glass tables to create dynamic contextual views of your IT topology or business processes and monitor them in real time. Glass tables feature a drawing canvas where you can draw custom images, upload existing images, and add icons from the Splunk icon library.
For more information, see Create a glass table in ITSI in the IT Service Intelligence User Manual.
Deep dives are an investigative tool that lets you quickly identify and troubleshoot issues in your IT environment. Deep dives provide swimlane views that let you stack KPI search results over time and create contextual views showing all KPIs in a service. You can use deep dives to quickly zoom in on metric and log events and visually correlate them to find the root cause.
For more information, see Overview of deep dives in ITSI in the IT Service Intelligence User Manual.
A multi-KPI alert is an alert that is based on multiple KPI trigger conditions. When trigger conditions occur simultaneously, a correlation search generates a notable event. Multi-KPI alerts let you correlate the status of multiple KPIs, which can give you insight into system behavior, and help you to identify causal relationships that might negatively impact system performance.
For more information, see Create multi-KPI alerts in ITSI in the IT Service Intelligence User Manual.
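The idea of trigger conditions occurring simultaneously can be sketched in Python as follows. In ITSI a correlation search does this work. The Trigger class, the multi_kpi_alert function, the KPI names other than cpu_load_percent, and the shape of the returned dictionary are all hypothetical and serve only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """One trigger condition: a KPI and the severity it must currently report."""
    kpi: str
    severity: str


def multi_kpi_alert(current, triggers):
    """Return a notable-event dict when every trigger condition holds at once, else None."""
    if triggers and all(current.get(t.kpi) == t.severity for t in triggers):
        return {"type": "notable_event", "kpis": [t.kpi for t in triggers]}
    return None


# Hypothetical conditions: alert only when both KPIs are degraded together.
triggers = [
    Trigger("cpu_load_percent", "critical"),
    Trigger("response_time", "high"),
]

# Current severity of each KPI, keyed by KPI name.
print(multi_kpi_alert({"cpu_load_percent": "critical", "response_time": "high"}, triggers))
print(multi_kpi_alert({"cpu_load_percent": "critical", "response_time": "normal"}, triggers))
```

The second call returns None because only one of the two conditions is met, which is the difference between a multi-KPI alert and an alert on a single KPI.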
ITSI provides a notable events management framework that lets you triage and analyze groups of notable events (episodes). ITSI generates notable events when a correlation search or multi-KPI alert meets specific conditions that you define. An episode is a group of events occurring as part of a larger sequence (an incident or period considered in isolation). Use Episode Review to view episode details and identify issues that might impact the performance and availability of your IT services.
Other ITSI notable events management features include a Python-based notable event action SDK, which lets you define secondary actions to run after a notable event is created, such as adding tags, adding comments, viewing notable event activities, and changing the owner, status, or severity.
For more information, see Overview of Episode Review in ITSI in the IT Service Intelligence User Manual.
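As a rough illustration of how notable events relate to episodes, the following Python sketch groups events that share a field into one episode each. Grouping by a "service" field, the function name group_into_episodes, and the sample events are assumptions made for this example. They do not describe how ITSI itself decides which events belong to an episode.

```python
from collections import defaultdict


def group_into_episodes(events, group_field="service"):
    """Group notable events that share group_field, ordered by time within each episode."""
    episodes = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        episodes[event[group_field]].append(event)
    return dict(episodes)


# Hypothetical notable events; "time" is a simple sequence number here.
notable_events = [
    {"time": 3, "service": "online-store", "title": "response_time critical"},
    {"time": 1, "service": "online-store", "title": "cpu_load_percent high"},
    {"time": 2, "service": "database", "title": "memory_used_percent high"},
]

for service, events in group_into_episodes(notable_events).items():
    print(service, [e["title"] for e in events])
```

Reviewing the two related online-store events together, rather than as separate alerts, is the kind of triage that working with episodes is meant to support.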
ITSI modules provide pre-built KPIs, entity definitions, and dashboard visualizations. ITSI modules are tailored to specific IT use cases, such as monitoring and troubleshooting operating system hosts, load balancers, databases, app servers, and so on. Modules are optimized to process data that you collect using Splunk add-ons.
For more information, see ITSI Modules overview in the IT Service Intelligence Modules Manual.
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.1.0, 4.1.1, 4.1.2, 4.1.5, 4.2.0, 4.2.1, 4.2.2, 4.3.0