Troubleshoot an upgrade of IT Service Intelligence
Use this information to troubleshoot post-upgrade issues.
The ITSI upgrade page is stuck
The migration process is interrupted and the ITSI upgrade page remains stuck even after a restart.
Cause
Interruptions to the migration process, such as a Splunk restart, might cause the migration page to become stuck.
Resolution
First, check the upgrade status by running the following command:
curl -k -u admin:changeme -X GET https://localhost:8089/servicesNS/nobody/SA-ITOA/migration/info
Sample response:
{ "is_running": true, "start_time": { "since_unix_epoch": 1593203210.6703181, "utc": "2020-06-26T20:26:50Z" }, "skip_local_failure": true }
If is_running is true and the migration has been stuck for a long time, you can clear the itsi_migration_status KV store collection and then go to the ITSI app upgrade page to trigger another migration. The following command clears the upgrade KV store collection:
curl -k -u admin:changeme -X DELETE https://localhost:8089/servicesNS/nobody/SA-ITOA/storage/collections/data/itsi_migration_status
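The decision described above can be sketched as a small check on the migration/info response. This is a minimal sketch, not part of ITSI: the function name and the two-hour threshold are assumptions, because the documentation only says "a long time". It uses the field names from the sample response.

```python
import json

# Hypothetical threshold; the documentation does not define "a long time".
STUCK_AFTER_SECONDS = 2 * 60 * 60


def migration_looks_stuck(info_json, now_epoch, threshold=STUCK_AFTER_SECONDS):
    """Return True if a migration is running and started more than
    `threshold` seconds before `now_epoch`."""
    info = json.loads(info_json)
    if not info.get("is_running"):
        return False
    started = info["start_time"]["since_unix_epoch"]
    return (now_epoch - started) > threshold


# The sample response shown earlier in this section.
sample = (
    '{"is_running": true, "start_time": {"since_unix_epoch": 1593203210.6703181, '
    '"utc": "2020-06-26T20:26:50Z"}, "skip_local_failure": true}'
)

# One hour after the start: not flagged with a two-hour threshold.
print(migration_looks_stuck(sample, 1593203210 + 3600))
# Three hours after the start: flagged.
print(migration_looks_stuck(sample, 1593203210 + 3 * 3600))
```

In practice you would pass the body returned by the GET request and the current time, for example time.time().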
Teams validation checks, UI loading, and team creation script fail
The ITSI teams validation checks, UI loading, and the team creation script fail when your Splunk Enterprise instance has a role issue. Role issues often happen on deployments where a role is missing. For example, role_A inherits from role_B, but at some point the app where role_B is defined was removed.
First, run the following search to determine whether you're experiencing this issue:
index=_internal source=*splunkd.log* ( ERROR "Error retrieving info for role" ) OR ( WARN "Unknown role" )
If there's a role issue, the following errors appear every minute for each broken role:
11-22-2019 09:22:13.260 -0800 ERROR AdminHandler:AuthenticationHandler - Error retrieving info for role: role_B
If this is the case, identify all the roles that are trying to link to the missing roles with the following btool command:
./splunk btool authorize list | grep role_B
For more information, see Use btool to troubleshoot configurations in the Splunk Enterprise Troubleshooting Manual.
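If the btool output is long, a short script can narrow it to the roles that import the missing role. The following is a minimal sketch: the function name is hypothetical, and it assumes the output uses [stanza] headers followed by key = value lines, with importRoles holding a semicolon-separated list. The sample text is made up for illustration.

```python
def roles_importing(btool_output, missing_role):
    """Return the stanza names whose importRoles list includes missing_role."""
    importers = []
    current_stanza = None
    for raw_line in btool_output.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; remember its name.
            current_stanza = line[1:-1]
        elif current_stanza and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "importRoles":
                imported = [name.strip() for name in value.split(";")]
                if missing_role in imported:
                    importers.append(current_stanza)
    return importers


# Made-up output for illustration only.
sample_output = """\
[role_A]
importRoles = role_B;user
[role_C]
importRoles = user
"""

print(roles_importing(sample_output, "role_B"))
```

Unlike grep, this matches whole role names only, so a role such as role_B2 is not reported when you look for role_B.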
To fix the issue, perform one of the following steps:
- Create a local version of authorize.conf at $SPLUNK_HOME/etc/apps/SA-ITOA/local/ and modify the import list.
- Use the UI to edit the role.
- Recreate the missing role.
Knowledge objects are missing after upgrade
If some objects, such as service analyzers, glass tables, or deep dives, are missing from the UI or inaccessible after you upgrade, the corresponding ACL objects might be missing or corrupted.
- See if the object exists in the KV store. Even if it does exist, there could be duplicates, which you'll address in the next step. Check the list of knowledge objects by name at the following endpoints:
- curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-ITOA/itoa_interface/deep_dive
- curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-ITOA/itoa_interface/glass_table
- curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-ITOA/itoa_interface/home_view
- curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-ITOA/itoa_interface/event_management_state
- curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-ITOA/event_management_interface/notable_event_aggregation_policy
- curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-ITOA/event_management_interface/correlation_search
The value of the _key attribute is called obj_id or object ID in the next steps.
- Check if a corresponding ACL object exists with the ID of the object you're looking for at the following endpoint:
curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-UserAccess/storage/collections/data/app_acl
- If one ACL object exists with the corresponding object ID, and the object is still missing from the UI, contact Splunk Support.
- If two ACL objects exist with the corresponding object ID, delete one of them by running the following command:
curl -k -u admin:password -X DELETE https://<host>:<admin_port>/servicesNS/nobody/SA-UserAccess/storage/collections/data/app_acl/<ACL_ID>
- If no ACL object exists with the corresponding object ID, manually create an ACL object with the following command:
curl -k -u admin:password https://<host>:<admin_port>/servicesNS/nobody/SA-UserAccess/storage/collections/data/app_acl -H "Content-Type: application/json" -X POST -d '{"obj_type":"<OBJ_TYPE>","acl_owner":"nobody","acl_id":"<ACL_ID>","obj_id":"<OBJ_ID>","_user":"nobody","obj_shared_by_inclusion":true,"obj_acl":{"delete":["*"],"write":["*"],"obj_owner":"nobody","read":["*"]},"_key":"<ACL_ID>","obj_storename":"<OBJ_STORENAME>","obj_app":"itsi"}'
Replace the tokens with the following values:
| Object name | OBJ_TYPE | OBJ_STORENAME | OBJ_ID | ACL_ID |
| --- | --- | --- | --- | --- |
| Service analyzer | home_view | itsi_service_analyzer | ID of the missing object | unique ID |
| Deep dive | deep_dive | itsi_pages | ID of the missing object | unique ID |
| Glass table | glass_table | itsi_pages | ID of the missing object | unique ID |
| Episode review | event_management_state | itsi_event_management | ID of the missing object | unique ID |
| Notable event aggregation policy | notable_aggregation_policy | itsi_notable_event_aggregation_policy | ID of the missing object | unique ID |
| Correlation search | correlation_search | itsi_correlation_search | ID of the missing object | unique ID |

ACL_ID must be a unique value.
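The ACL checks above can be sketched as two small helpers: one counts the ACL objects that match an object ID and names the next step, and one builds the JSON body for a new ACL object. This is a minimal sketch. The function names and action labels are hypothetical, generating ACL_ID with uuid4 is an assumption (the documentation only requires a unique value), and treating more than two matches like the two-match case is also an assumption.

```python
import json
import uuid


def acl_action(acl_objects, obj_id):
    """Map the number of ACL objects for obj_id to the documented next step."""
    matches = [acl for acl in acl_objects if acl.get("obj_id") == obj_id]
    if not matches:
        return "create"            # no ACL object: create one manually
    if len(matches) == 1:
        return "contact_support"   # ACL exists but the object is still missing
    return "delete_duplicate"      # two ACL objects (assumed: or more)


def build_acl_payload(obj_type, obj_storename, obj_id, acl_id=None):
    """Build the POST body from the documented command for a new ACL object."""
    acl_id = acl_id or uuid.uuid4().hex  # assumption: any unique string works
    return {
        "obj_type": obj_type,
        "acl_owner": "nobody",
        "acl_id": acl_id,
        "obj_id": obj_id,
        "_user": "nobody",
        "obj_shared_by_inclusion": True,
        "obj_acl": {"delete": ["*"], "write": ["*"], "obj_owner": "nobody", "read": ["*"]},
        "_key": acl_id,
        "obj_storename": obj_storename,
        "obj_app": "itsi",
    }


# Example: a deep dive with no ACL object in the app_acl collection.
existing_acls = [{"obj_id": "other-object", "_key": "acl-1"}]
if acl_action(existing_acls, "my-deep-dive-id") == "create":
    print(json.dumps(build_acl_payload("deep_dive", "itsi_pages", "my-deep-dive-id")))
```

The printed JSON can be passed to the -d option of the POST command shown above.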
The Global team is missing after upgrade
All services in ITSI must be assigned to a team. If migration fails with the error Failed to import Team settings, you can manually run the Python script called itsi_reset_default_team.py. The script creates the Global team in the KV store, which completes the migration.
To run the script, perform the following steps:
- Run the following commands on any search head in your ITSI deployment:
cd $SPLUNK_HOME/etc/apps/SA-ITOA/bin
$SPLUNK_HOME/bin/splunk cmd python itsi_reset_default_team.py
- Provide the splunkd port number and your Splunk username and password when prompted. After the script finishes successfully, the Global team is created in the KV store.
- Restart your Splunk software.
Duplicate Windows or VMware entities after entity import
Cause
The ITSI Import Objects - VMware VM saved search fails to merge entities with the host field and might create duplicate entities.
Resolution
Update the saved search.
- Disable the ITSI Import Objects - VMware VM saved search.
- Copy the ITSI Import Objects - VMware VM saved search and change the entity_merge_field attribute to host.
- Enable the updated ITSI Import Objects - VMware VM search.
Duplicate ITSI license error
Cause
Two ITSI licenses are being flagged as duplicates on the system.
Resolution
Enable AllowDuplicateKeys in the license XML.
- Go to the node where search peers are configured.
- Identify the Splunk licenses (Enterprise, ITSI, non ITSI) currently installed. Ignore licenses under IT Service Intelligence Internals DO NOT COPY.
- Navigate to http://LM_IP/en-US/manager/system/licensing/licenses and check if the AllowDuplicateKeys capability is enabled for each of the licenses identified in the previous step.
- If not enabled, procure a new license from Splunk Support and replace it.
- Make sure all licenses in the stack have the capability enabled.
- Restart Splunk.
Here is a sample license with AllowDuplicateKeys enabled:
<?xml version="1.0" encoding="UTF-8"?>
<license>
  <signature>UktliszY9Qpn3FiNwRqNHpTyYLfPW4ehn0LZOyamhD8Iuj6jhULWKRkuRq5dSE9Q67pc8NoLpyHRTU5s1cDXL+1vSWzfwooWszTvnh3pFxxQExnniRveifUqq7Xc15lVoab6WMxq4DmggAoco39e6UeNPGS2l+b6ASZ8jVm8xj7kzsmBTPQF0+nH1eAX0EE6Y9rC8/B4k9cTzZKeWPlfDU7OvoZT2rmirLdURUXaaRE9khwH68iMsID8ODqSzH2+bboAaaFXAbh/PU2HqYUzumzxzqf4s7fTlGmwCY+lMAUQHXaZV7eaCY35A762XWbYZ90k9BS+lboiI2MLOYVPOQ==</signature>
  <payload>
    <type>enterprise</type>
    <group_id>Enterprise</group_id>
    <quota>1</quota>
    <max_violations>5</max_violations>
    <window_period>30</window_period>
    <creation_time>1618383600</creation_time>
    <label>Splunk IT Service Intelligence Internal License DO NOT DISTRIBUTE</label>
    <expiration_time>1659205961</expiration_time>
    <features>
      <feature>Auth</feature>
      <feature>FwdData</feature>
      <feature>LocalSearch</feature>
      <feature>ScheduledSearch</feature>
      <feature>AllowDuplicateKeys</feature>
      <feature>Alerting</feature>
      <feature>SplunkWeb</feature>
    </features>
    <add_ons>
      <add_on name="itsi" type="app">
        <parameter key="size" value="1000000"/>
      </add_on>
    </add_ons>
    <sourcetypes/>
    <guid>F4C8DBB2-84F2-4A82-AA43-CA7CA786B360</guid>
  </payload>
</license>
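If you have the license XML files on disk, you can check for the feature with a short script instead of reading each file by eye. The following is a minimal sketch that assumes every license follows the layout of the sample above, with feature elements under payload/features. The function name is hypothetical, and the shortened XML is made up for illustration.

```python
import xml.etree.ElementTree as ET


def allows_duplicate_keys(license_xml):
    """Return True if the license lists the AllowDuplicateKeys feature."""
    root = ET.fromstring(license_xml)  # root is the <license> element
    features = [f.text for f in root.findall("./payload/features/feature")]
    return "AllowDuplicateKeys" in features


# Shortened, made-up licenses that follow the sample layout.
with_feature = (
    "<license><payload><features>"
    "<feature>Auth</feature><feature>AllowDuplicateKeys</feature>"
    "</features></payload></license>"
)
without_feature = (
    "<license><payload><features>"
    "<feature>Auth</feature>"
    "</features></payload></license>"
)

print(allows_duplicate_keys(with_feature))
print(allows_duplicate_keys(without_feature))
```

Run the check against every license in the stack, because all of them need the capability enabled.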
Prechecks fail during the upgrade
One of the migration jobs that run during the ITSI upgrade process displays a Failed status.
Cause
Errors with specific ITSI objects, such as services or KPIs, are causing issues with the upgrade and need to be addressed.
Resolution
When one of the checks fails, you can select either Proceed anyway or Restart upgrade:
- When you select Proceed anyway, the precheck job runs again but ignores any failed prechecks and continues with the upgrade. You can choose to fix the errors identified by the prechecks at a later time.
- When you select Restart upgrade, the prechecks run again. If there are still failed prechecks, contact Splunk Support.
Insufficient permission to perform upgrade
The ITSI upgrade can't complete because a user has insufficient permissions.
Cause
A user can't perform an upgrade because they are missing the write_itsi_backup_restore or delete_itsi_backup_restore capability.
Resolution
Add the write_itsi_backup_restore and delete_itsi_backup_restore capabilities to the user's role.
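To confirm which capability a user lacks before you edit the role, you can compare the user's capability list against the two required capabilities. This is a minimal sketch: the function name is hypothetical, and the input list stands in for whatever capability list you retrieve for the user.

```python
# The two capabilities named in this section.
REQUIRED_CAPABILITIES = {"write_itsi_backup_restore", "delete_itsi_backup_restore"}


def missing_upgrade_capabilities(user_capabilities):
    """Return the required capabilities absent from user_capabilities, sorted."""
    return sorted(REQUIRED_CAPABILITIES - set(user_capabilities))


# Made-up capability list for illustration.
print(missing_upgrade_capabilities(["search", "write_itsi_backup_restore"]))
```

An empty result means the user's role already has both capabilities.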
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.16.0 Cloud only, 4.17.0, 4.17.1, 4.18.0, 4.18.1, 4.19.0, 4.19.1, 4.19.2