Schedule maintenance downtime in ITSI
Splunk IT Service Intelligence (ITSI) lets you put services and entities into maintenance mode for a specific time period. Use maintenance mode to prevent ITSI from triggering alerts from machines and other devices that are undergoing maintenance operations or don't require active monitoring.
For example, if your ticketing system experiences a failure and you start to receive a cascade of identical notable events, you can put the service that is monitoring the ticketing system into maintenance mode to stop ITSI from generating alerts until the issue is resolved.
Create a new maintenance window
Maintenance windows apply to services and entities.
By default, only users assigned the write-maintenance_calendar capability can create a maintenance window. The itoa_team_admin role has this capability by default.
- From the ITSI top menu bar, click Configure > Maintenance Windows.
- Click Create Maintenance Window.
- Provide a title for the maintenance window. For example, "DB entity maintenance window."
- Set the start time, duration, and end time for the maintenance window.
- Select the Objects for which you want to create a maintenance window: Entities or Services.
If you don't have write access to the Global team, you can't put entities into maintenance mode.
- Click Next.
- Select the specific services or entities that you want to place in maintenance mode for the duration of the maintenance window. You can only select services or entities for which you have write access.
- Click Create. The selected entities or services enter maintenance mode according to the defined schedule.
When viewing a scheduled maintenance window, you can only see the services included in the maintenance window for which you have read access. If a maintenance window only contains services for which you don't have read access, you can't view the maintenance window. All users can view a maintenance window that contains only entities.
If you're bulk deleting maintenance windows, you can only delete the maintenance windows that contain services or entities for which you have write access.
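The viewing rules above can be sketched as a small model. This is only an illustration of the access logic as described, not ITSI code: the MaintenanceWindow class, its field names, and the can_view helper are hypothetical, and the sketch assumes a user's read access is available as a set of service IDs.

```python
from dataclasses import dataclass, field


@dataclass
class MaintenanceWindow:
    """Hypothetical stand-in for a scheduled maintenance window."""
    title: str
    service_ids: set = field(default_factory=set)
    entity_ids: set = field(default_factory=set)


def visible_services(window, readable_services):
    """Return only the services in the window that the user can read."""
    return window.service_ids & readable_services


def can_view(window, readable_services):
    """Apply the viewing rules described above.

    A window that contains only entities is visible to all users.
    A window that contains services is visible only if the user can
    read at least one of those services.
    """
    if not window.service_ids:
        return True
    return bool(visible_services(window, readable_services))


db_window = MaintenanceWindow("DB maintenance", service_ids={"svc-db", "svc-web"})
host_window = MaintenanceWindow("Host patching", entity_ids={"host-a"})
```

With these definitions, a user who can read only svc-web can open db_window but sees only svc-web listed in it, while a user with no service read access can still open host_window.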
Impact of maintenance windows
Maintenance windows can have an impact on associated KPIs, service health score calculations, and other ITSI features.
Consider the following when you put a service into maintenance mode:
- All KPIs associated with that service are automatically put into maintenance mode.
- ITSI ignores search results from KPIs in maintenance mode for the purpose of service health score calculation for the duration of the maintenance window.
- Maintenance windows don't distort adaptive threshold calculations. ITSI excludes search results from KPIs in maintenance mode when it looks back at past data to calculate threshold values.
Consider the following when you put an entity into maintenance mode:
- If the entity has no KPIs running searches against it, there is no impact on service health scores.
- If the entity has one or more KPIs running searches against it, all search results from all KPIs running against that entity are ignored for the purpose of service health score calculation.
- If a KPI is split by entity, for example if the same KPI is running against two different entities, and one entity is in maintenance mode and one is not, search results generated by the KPI running against the entity in maintenance mode are ignored for the purpose of health score calculation. Search results generated by the same KPI running against the entity that's not in maintenance mode are included as usual in the service health score calculation.
- An entity can enter full or partial maintenance mode without being explicitly placed in it, if a service that contains the entity is put into maintenance mode.
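To make the split-by-entity behavior concrete, the following sketch filters KPI search results before they would feed a health score. It is a simplified model under stated assumptions: the KpiResult tuple and the results_for_health_score function are hypothetical names, and the actual health score calculation that ITSI performs is not reproduced here.

```python
from typing import NamedTuple


class KpiResult(NamedTuple):
    """Hypothetical record for one KPI search result against one entity."""
    kpi: str
    entity: str
    value: float


def results_for_health_score(results, entities_in_maintenance):
    """Drop results for entities in maintenance mode.

    Results from the same KPI running against entities that are not in
    maintenance mode are kept, as described for KPIs split by entity.
    """
    return [r for r in results if r.entity not in entities_in_maintenance]


results = [
    KpiResult("cpu_load", "host-a", 97.0),  # host-a is under maintenance
    KpiResult("cpu_load", "host-b", 35.0),
    KpiResult("disk_used", "host-b", 61.0),
]
kept = results_for_health_score(results, entities_in_maintenance={"host-a"})
```

Here only the two host-b results remain in kept, so the high cpu_load reading from host-a would not influence the service health score during the window.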
Impact on ITSI features
Services, entities, and KPIs that are fully or partially impacted by a maintenance window appear in a dark gray color on pages that display health scores, including service analyzers, service and entity details pages, glass tables, multi-KPI alerts, and deep dives.
View impacted KPIs
You can view the impact of a maintenance window on associated KPIs.
- Select a maintenance window from the Maintenance Windows lister page.
The maintenance window details page opens, showing the specific services or entities impacted by the maintenance window.
- Click Impacted KPIs to see a list of KPIs impacted by the maintenance window. KPIs that are split by entity, and thus are currently running searches against other entities that are not in maintenance mode, are listed as "Partially" impacted. KPIs that are not split by entity are listed as "Fully" impacted.
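The "Fully" and "Partially" labels can be illustrated with a short classifier. This is a sketch of the rule as described, not the logic ITSI actually runs: the kpi_impact function and its parameters are hypothetical. It also assumes that a split-by-entity KPI whose entities are all in maintenance mode counts as "Fully" impacted, which the text above does not state.

```python
def kpi_impact(kpi_entities, split_by_entity, entities_in_maintenance):
    """Classify how an entity maintenance window impacts one KPI.

    Returns "Fully", "Partially", or None when the KPI runs against no
    entity that is in maintenance mode.
    """
    if not kpi_entities & entities_in_maintenance:
        return None
    still_active = kpi_entities - entities_in_maintenance
    if split_by_entity and still_active:
        # The KPI keeps running searches against entities outside the window.
        return "Partially"
    # Not split by entity, or (assumption) every entity is in maintenance.
    return "Fully"


in_maintenance = {"host-a"}
split_kpi = kpi_impact({"host-a", "host-b"}, True, in_maintenance)
aggregate_kpi = kpi_impact({"host-a", "host-b"}, False, in_maintenance)
unrelated_kpi = kpi_impact({"host-c"}, True, in_maintenance)
```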
When to schedule maintenance windows
As a best practice, schedule each maintenance window with a 15- to 30-minute buffer before you start and after you stop your maintenance work. The buffer gives the system time to catch up with the maintenance state and reduces the chance that ITSI generates false positives during maintenance operations.
For example, if a server will be shut down for maintenance at 1:00PM and restarted at 5:00PM, the ideal maintenance window is 12:30PM to 5:30PM.
The 15- to 30-minute time buffer is a rough estimate based on 15 minutes being the time period over which most KPIs are configured to search data and identify alert triggers.
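The buffer arithmetic is simple enough to sketch with the standard datetime module. The buffered_window helper below is a hypothetical convenience function, not part of ITSI. It pads the planned work period by the same buffer on both sides and defaults to 30 minutes, the upper end of the recommended range.

```python
from datetime import datetime, timedelta


def buffered_window(work_start, work_end, buffer=timedelta(minutes=30)):
    """Return (window_start, window_end) padded by the buffer on both sides."""
    if work_end <= work_start:
        raise ValueError("work_end must be later than work_start")
    return work_start - buffer, work_end + buffer


# Server shut down at 1:00 PM and restarted at 5:00 PM (the date is arbitrary).
start, end = buffered_window(
    datetime(2024, 1, 15, 13, 0),
    datetime(2024, 1, 15, 17, 0),
)
print(start.strftime("%I:%M %p"), "to", end.strftime("%I:%M %p"))
```

For the 1:00PM to 5:00PM example above, this yields a maintenance window of 12:30PM to 5:30PM.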
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.1.0, 4.1.1, 4.1.2, 4.1.5, 4.2.0, 4.2.1, 4.2.2, 4.2.3, 4.3.0, 4.3.1, 4.4.0, 4.4.1