
Scenario: Kai investigates the root cause of an error with the Splunk APM service map

Kai, a site reliability engineer at Buttercup Games, receives tickets from multiple customers getting "Invalid request" errors when purchasing games on the Buttercup Games website.

To troubleshoot the invalid request error reports, Kai takes the following steps:

  1. Kai opens the service map

  2. Kai looks for services that have root-cause errors

  3. Kai selects the service to gather more details

  4. Kai adds a link to Tag Spotlight for the offending endpoint to the customer ticket

Kai opens the service map

To investigate the downstream service causing the error, Kai searches for "service map" and selects the navigation item in the search results to go directly to the Service Map in APM. Kai looks through the real-time service map, which contains nodes and dependencies of services instrumented in Splunk APM.

This animation shows Kai searching for "service map" and selecting the navigation item in the search results.

Kai looks for services that have root-cause errors

The service map uses red to indicate the root-cause error rate. Kai finds that the paymentservice node has a red dot, and the dependency arrow from the checkoutservice node to the paymentservice node is red.

This screenshot shows the service map view of the Buttercup Games website where nodes with root-cause errors are highlighted in red.
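
To make the idea of a root-cause error concrete, the following is a minimal Python sketch, not Splunk APM's implementation. It assumes that a root-cause error is an error span with no child span that also failed, meaning the error originates in that service rather than being passed up from a downstream one. The span records, the field layout, and the `root_cause_error_services` function are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical spans from one trace: (span_id, parent_id, service, is_error).
# The service names match the scenario; the data itself is made up.
SPANS = [
    ("a", None, "frontend", True),
    ("b", "a", "checkoutservice", True),
    ("c", "b", "paymentservice", True),
    ("d", "b", "cartservice", False),
]


def root_cause_error_services(spans):
    """Count, per service, the error spans that have no erroring child span."""
    # IDs of spans that have at least one child span in error.
    parents_of_errors = {
        parent_id
        for _, parent_id, _, is_error in spans
        if is_error and parent_id is not None
    }
    counts = Counter()
    for span_id, _, service, is_error in spans:
        # An error with no failing child is treated as originating here.
        if is_error and span_id not in parents_of_errors:
            counts[service] += 1
    return dict(counts)


print(root_cause_error_services(SPANS))
```

In this made-up trace, frontend and checkoutservice both report errors, but only paymentservice is counted, because the errors in the other two services are inherited from a failing child span.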


Kai selects the service to gather more details

Kai selects the paymentservice node to discover the endpoint with the top error rate in the Tag Spotlight sidebar. Kai finds that all of the errors occur in one endpoint, as shown in the following screenshot:

This screenshot shows the Tag Spotlight card with endpoint data showing the top error rate and the top latency.
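
The ranking that Kai reads off the Tag Spotlight card can be approximated with a short Python sketch. This is an illustration only: the endpoint names, the counts, and the `top_error_endpoint` function are hypothetical, and the sketch assumes that error rate means errors divided by requests for each endpoint.

```python
# Hypothetical per-endpoint request and error counts for one service.
ENDPOINT_STATS = {
    "/endpoint-a": {"requests": 200, "errors": 50},
    "/endpoint-b": {"requests": 100, "errors": 0},
}


def top_error_endpoint(stats):
    """Return (endpoint, error_rate) for the highest error rate, or None."""
    # Skip endpoints with no traffic to avoid dividing by zero.
    rates = {
        name: counts["errors"] / counts["requests"]
        for name, counts in stats.items()
        if counts["requests"] > 0
    }
    if not rates:
        return None
    endpoint = max(rates, key=rates.get)
    return endpoint, rates[endpoint]


print(top_error_endpoint(ENDPOINT_STATS))
```

When every error falls on a single endpoint, as in Kai's case, that endpoint is the only one with a nonzero rate and is returned first.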

Summary

Kai used the service map to quickly isolate a service with a high root-cause error rate and identified it as the likely source of the invalid request errors that customers were reporting. Kai shares this information with the service owner for further troubleshooting.

Learn more

To learn more about the service map in Splunk APM, see View dependencies among your services in the service map.

For information about how to instrument your applications to send application metrics and traces to Splunk Observability Cloud, see Instrument back-end applications to send spans to Splunk APM.

This page was last updated on Mar 19, 2024.