
Troubleshoot Go instrumentation for Splunk Observability Cloud

When you instrument a Go application using the Splunk Distribution of OpenTelemetry Go and you don’t see your data in Splunk Observability Cloud, follow these troubleshooting steps.

Steps for troubleshooting Go OpenTelemetry issues

The following steps can help you troubleshoot Go instrumentation issues:

  1. Activate debug logging.

  2. Check for missing spans.

  3. Make sure the endpoint is correct.

Activate debug logging

Debug logging increases the verbosity of the Go instrumentation. This can help you troubleshoot issues. To activate debug logging, set the OTEL_LOG_LEVEL environment variable to debug.

export OTEL_LOG_LEVEL="debug"

Make sure to unset the environment variable after the issue is resolved, as its output might overload systems if left on indefinitely.
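
If you prefer to turn on debug logging from the application itself while troubleshooting locally, you can set the variable programmatically before starting the instrumentation. The following is a minimal sketch that assumes the Splunk Distribution of OpenTelemetry Go reads OTEL_LOG_LEVEL when distro.Run() is called; setting the variable in the environment, as shown above, works the same way:

package main

import (
	"context"
	"log"
	"os"

	"github.com/signalfx/splunk-otel-go/distro"
)

func main() {
	// For local troubleshooting only: force debug logging before the
	// distro initializes. Remove this once the issue is resolved.
	os.Setenv("OTEL_LOG_LEVEL", "debug")

	sdk, err := distro.Run()
	if err != nil {
		log.Fatal(err)
	}
	defer func() {
		if err := sdk.Shutdown(context.Background()); err != nil {
			log.Fatal(err)
		}
	}()

	// ... application code ...
}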

Check for missing spans

Go instrumentation might drop spans for several reasons. Follow these steps to make sure that the instrumentation isn’t dropping valid spans.

All spans from a service are missing

If you don’t see spans in Splunk Observability Cloud for your service, do the following:

  1. Wait a few minutes and check again. There might be delays in the telemetry pipeline.

  2. Check whether the service names appear in Splunk Observability Cloud with the unknown_service prefix. For example, unknown_service:go. If that’s the case, set the OTEL_SERVICE_NAME environment variable to the name of your service and restart your application.

  3. Check your debug logs for messages like the following:

    exporting spans {"count": 154, "total_dropped": 0}
    

    The value of count in the log message is the number of spans exported for a given batch:

    • If count is higher than 0, the instrumentation is exporting spans. In that case, check the Collector configuration. See Troubleshoot the Collector.

    • If count is equal to 0, the instrumentation is not exporting spans. Make sure that all spans end by calling the span.End() method, as shown in the example after this list.
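
For example, the following sketch shows the usual pattern for ending spans with the OpenTelemetry API for Go. The tracer and span names are placeholders for illustration; the point is that every started span reaches span.End(), typically through defer:

package main

import (
	"context"

	"go.opentelemetry.io/otel"
)

// doWork starts a span and guarantees that it ends, so the batch span
// processor can export it. A span that never reaches span.End() is never
// exported and looks like a missing span.
func doWork(ctx context.Context) {
	tracer := otel.Tracer("example/troubleshooting") // placeholder instrumentation name
	ctx, span := tracer.Start(ctx, "doWork")         // placeholder span name
	defer span.End()

	// ... do the work, passing ctx to downstream calls ...
	_ = ctx
}

func main() {
	doWork(context.Background())
}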

Missing some spans from a service

After activating debug logging, check the logs for messages like the following:

exporting spans {"count": 364, "total_dropped": 1320}

The total_dropped value is the cumulative number of spans dropped by the instrumentation. If this value is higher than zero, the batch span processor is dropping new spans when the queue is full.

The batch span processor might drop spans in the following cases:

  • If the value of count in the log messages is consistently equal to the maximum batch size, the instrumentation might be creating spans faster than they can be exported. If your system has enough resources, increase the batch size and queue size. For example:

    export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=1024
    export OTEL_BSP_MAX_QUEUE_SIZE=20480
    # Don't increase the queue size if the system has limited memory
    

    If the network has limited bandwidth available, reduce your export batch size. For example:

    export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=128
    

    This might increase the export frequency and drain the queue faster.

  • If the value of count is not consistently equal to the maximum batch size, make sure you have a stable connection to the target and that you have adequate bandwidth. You can also reduce export timeouts, decrease the export size and frequency, and increase the queue size. For example:

    # 5s export timeout.
    export OTEL_BSP_EXPORT_TIMEOUT=5000
    # 30s maximum time between exports.
    export OTEL_BSP_SCHEDULE_DELAY=30000
    export OTEL_BSP_MAX_QUEUE_SIZE=5120
    export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=128
    

    Make sure to allocate enough memory on your system to accommodate the increase in queue size. Changes in the export configuration might result in the instrumentation dropping entire export batches that take longer than the export timeout.
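
The OTEL_BSP_* variables map to batch span processor options in the OpenTelemetry SDK for Go. If you build your own tracer provider instead of relying on the environment variables, you can set the same limits in code. The following sketch assumes the OTLP gRPC exporter, and the values are illustrative only; if you use the Splunk Distribution of OpenTelemetry Go entry point, setting the environment variables is usually simpler:

package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// OTLP gRPC exporter. The endpoint comes from the
	// OTEL_EXPORTER_OTLP_ENDPOINT environment variable or exporter options.
	exporter, err := otlptracegrpc.New(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Batch span processor limits that mirror the OTEL_BSP_* variables above.
	// Tune these values for your workload.
	bsp := sdktrace.NewBatchSpanProcessor(
		exporter,
		sdktrace.WithMaxExportBatchSize(128),      // OTEL_BSP_MAX_EXPORT_BATCH_SIZE
		sdktrace.WithMaxQueueSize(5120),           // OTEL_BSP_MAX_QUEUE_SIZE
		sdktrace.WithBatchTimeout(30*time.Second), // OTEL_BSP_SCHEDULE_DELAY
		sdktrace.WithExportTimeout(5*time.Second), // OTEL_BSP_EXPORT_TIMEOUT
	)

	tp := sdktrace.NewTracerProvider(sdktrace.WithSpanProcessor(bsp))
	defer func() {
		if err := tp.Shutdown(ctx); err != nil {
			log.Fatal(err)
		}
	}()
	otel.SetTracerProvider(tp)

	// ... application code ...
}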

Make sure the endpoint is correct

If the following error message appears in your logs, the exporter might not be able to connect to the endpoint:

2022/03/02 20:29:29 context deadline exceeded
2022/03/02 20:29:29 max retry time elapsed: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: missing address"

To solve this issue, make sure the following conditions are true:

  1. The target endpoint is up and receiving connections.

  2. The target endpoint is reachable from the connecting service.

  3. If you’ve overridden the default endpoint, the value you provided is correct.
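
To quickly check the first two conditions, test the connection from the host where the instrumented service runs. The following sketch assumes an OTLP gRPC endpoint at localhost:4317, a common default when exporting to a local Collector; replace the address with the endpoint your exporter actually targets:

package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

func main() {
	// Placeholder endpoint; replace it with the host:port your exporter uses.
	endpoint := "localhost:4317"
	if v := os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT"); v != "" {
		fmt.Println("OTEL_EXPORTER_OTLP_ENDPOINT is set to:", v)
	}

	// A plain TCP dial only confirms that the target is up and reachable.
	// It does not validate the OTLP protocol, TLS, or authentication.
	conn, err := net.DialTimeout("tcp", endpoint, 5*time.Second)
	if err != nil {
		fmt.Println("endpoint not reachable:", err)
		return
	}
	defer conn.Close()
	fmt.Println("endpoint is reachable:", endpoint)
}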

If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.


  • Ask a question and get answers through community support at Splunk Answers.

  • Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups in the Get Started with Splunk Community manual.
