Troubleshoot Java instrumentation for Splunk Observability Cloud

When you instrument a Java application using the Splunk Distribution of OpenTelemetry Java and you don’t see your data in Observability Cloud, review these solutions:

Steps for troubleshooting Java OpenTelemetry issues

The following steps can help you troubleshoot Java agent issues:

  1. Enable debug logging

  2. Check the status of the runtime

Enable debug logging

Debug logging is a special execution mode that outputs more information about the Java agent of the Splunk Distribution of OpenTelemetry Java. This can help you troubleshoot Java instrumentation issues.

To turn on debug logging for the agent, pass the following argument when running your application:

-Dotel.javaagent.debug=true

When you run the agent with debug logging enabled, debug information is written to the console through stderr. Debug log entries look like the following example:

...
[opentelemetry.auto.trace 2021-10-10 10:57:05:814 +0200] [main] DEBUG io.opencensus.tags.Tags - <Could not load lite implementation for TagsComponent, now using default implementation for TagsComponent.3>
[opentelemetry.auto.trace 2021-10-10 10:57:05:722 +0200] [main] DEBUG io.grpc.netty.shaded.io.netty.util.internal.PlatformDependent0 - direct buffer constructor: unavailable
...

While not all debug entries are relevant to the issue affecting your Java instrumentation, the root cause is likely to appear in your debug log.
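Because debug output goes to stderr, it can help to capture it in a file and scan it afterwards. The following is a minimal sketch, not an official procedure: the JAR paths, the agent-debug.log file name, and both function names are assumptions made for illustration.

```shell
# Hypothetical paths: adjust them to your agent JAR and application JAR.
AGENT_JAR=./splunk-otel-javaagent.jar
APP_JAR=target/spring-petclinic-2.4.5.jar
DEBUG_LOG=agent-debug.log

# Start the application with agent debug logging enabled.
# Wrapped in a function so that nothing starts until you call it.
run_with_agent_debug() {
  # Debug entries go to stderr, so redirect stderr to a file for later review.
  java -javaagent:"$AGENT_JAR" -Dotel.javaagent.debug=true -jar "$APP_JAR" 2> "$DEBUG_LOG"
}

# Print the ERROR and WARN entries of a captured log, with line numbers.
find_agent_problems() {
  grep -nE ' (ERROR|WARN) ' "$1"
}
```

Call run_with_agent_debug to reproduce the issue, then run find_agent_problems agent-debug.log to narrow the log down to the entries most likely to matter.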

Note

Enable debug logging only when needed. Debug mode requires more resources.

Check the status of the runtime

Run the jps -lvm command to verify that the Java runtime has started. The output is a list of all the Java Virtual Machines (JVM) currently running. Make sure the JVM you instrumented appears among them.

In the following example, the first entry shows a JVM running the agent with -javaagent:

37602 target/spring-petclinic-2.4.5.jar -javaagent:./splunk-otel-javaagent.jar -Dotel.resource.attributes=service.name=pet-store-demo,deployment.environment=prod,service.version=1.2.0 -Dotel.javaagent.debug=true
38262 jdk.jcmd/sun.tools.jps.Jps -lvm -Dapplication.home=/usr/lib/jvm/java-16-openjdk-amd64 -Xms8m -Djdk.module.main=jdk.jcmd

If the instrumented JVM doesn’t appear in the list, check the JVM or application logs to find the cause of the problem. Also check that the additional startup parameters are correctly passed to the runtime. See Instrument a Java application for Splunk Observability Cloud to learn more about startup parameters.
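On hosts that run many JVMs, filtering the jps output can make the check quicker. The sketch below assumes a POSIX shell with grep available; the function name filter_instrumented is hypothetical.

```shell
# Keep only the lines of `jps -lvm` output that contain a -javaagent flag.
# Reads from stdin so it works on live output and on saved output alike.
filter_instrumented() {
  # "--" stops grep from treating the pattern as one of its own options.
  grep -- '-javaagent:'
}
```

For example, jps -lvm | filter_instrumented lists only the JVMs started with an agent. Empty output suggests that no running JVM received the -javaagent argument.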

Library instrumentation issues

If you find an issue with a specific instrumentation of a library, or suspect there might be an issue affecting that instrumentation, disabling it can help you troubleshoot the Java agent.

To disable a specific library instrumentation, add the following argument:

-Dotel.instrumentation.<name>.enabled=false

Replace <name> with the corresponding instrumentation from the OpenTelemetry Java instrumentation on GitHub at https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/suppressing-instrumentation.md#suppressing-specific-agent-instrumentation.
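As an illustration, the following sketch builds the argument for one instrumentation. The name jdbc is only an example value, and the JAR names in the printed command are placeholders; check the linked list for the name that matches the library you suspect.

```shell
# Example value only: replace with the instrumentation you want to turn off.
INSTRUMENTATION=jdbc

# Build the system property described above.
DISABLE_FLAG="-Dotel.instrumentation.${INSTRUMENTATION}.enabled=false"

# Print the resulting command line instead of running it.
echo "java -javaagent:./splunk-otel-javaagent.jar $DISABLE_FLAG -jar <your-app>.jar"
```

If the problem disappears with the instrumentation disabled, that instrumentation is a likely cause.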

Class instrumentation issues

You can prevent specific classes from being instrumented. Excluded classes don’t send spans, which is useful for muting specific classes or packages.

To disable instrumentation for a class, set the otel.javaagent.exclude-classes system property or the OTEL_JAVAAGENT_EXCLUDE_CLASSES environment variable to the name of the class or classes.

You can enter multiple classes. For example, my.package.MyClass,my.package2.*.
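A short sketch of both forms follows, using the example class names from this section. The variable EXCLUDE_FLAG is introduced here only for illustration.

```shell
# Environment variable form. Single quotes keep the shell from expanding
# the * wildcard before the agent sees it.
export OTEL_JAVAAGENT_EXCLUDE_CLASSES='my.package.MyClass,my.package2.*'

# Equivalent system property form, to pass on the java command line.
EXCLUDE_FLAG="-Dotel.javaagent.exclude-classes=${OTEL_JAVAAGENT_EXCLUDE_CLASSES}"
echo "$EXCLUDE_FLAG"
```

Use one form or the other. Keep the value quoted when you pass it on, so that the wildcard reaches the agent unchanged.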

Warning

Disabling instrumentation for specific classes can have unintended side effects. Use this feature with caution.

Trace exporter issues

By default, the Splunk Distribution of OpenTelemetry Java uses the OTLP exporter. Any issue affecting the export of traces produces an error in the debug logs.

OTLP can’t export spans

If you see the following error in the logs, it means that the agent can’t send trace data to the OpenTelemetry Collector:

[BatchSpanProcessor_WorkerThread-1] ERROR io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter - Failed to export spans. Server is UNAVAILABLE. Make sure your collector is running and reachable from this network. Full error message:UNAVAILABLE: io exception

To troubleshoot the lack of connectivity between the OTLP exporter and the OTel Collector, try the following steps:

  1. Make sure that otel.exporter.otlp.endpoint points to the correct OpenTelemetry Collector instance host.

  2. Check that your OTel Collector instance is up and configured. See Troubleshoot the OpenTelemetry Collector.

  3. Check that the OTLP gRPC receiver is enabled in the OTel Collector and plugged into the traces pipeline.

  4. Check that the endpoint uses the OTel Collector’s OTLP gRPC address, http://<host>:4317, and verify that the URL is correct.
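The first and last steps can be partly automated with a reachability probe. This is a rough sketch under several assumptions: the nc (netcat) utility is installed and supports the -z and -w options, the endpoint URL includes an explicit port, and the function names are hypothetical. A successful TCP connection shows only that something is listening, not that the OTLP receiver is configured correctly.

```shell
# Reduce an endpoint URL to host:port, for example
# http://collector.example:4317/path -> collector.example:4317
endpoint_host_port() {
  hostport=${1#*://}       # drop the scheme
  hostport=${hostport%%/*} # drop any path
  echo "$hostport"
}

# Try to open a TCP connection to the endpoint (assumes nc is available).
check_reachable() {
  hp=$(endpoint_host_port "$1")
  host=${hp%%:*}
  port=${hp##*:}
  if nc -z -w 3 "$host" "$port"; then
    echo "reachable: $host:$port"
  else
    echo "NOT reachable: $host:$port"
    return 1
  fi
}
```

For example, check_reachable http://localhost:4317 probes a Collector on the same host. Run the probe from the machine or container where the Java application runs, because network rules can differ between hosts.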

Channel pipeline error

If you’re seeing the following error in your logs, it might mean that the Java agent is trying to send trace data to the Splunk ingest API endpoint, which doesn’t support OTLP yet:

[grpc-default-executor-1] ERROR io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter - Failed to export spans. Server is UNAVAILABLE. Make sure your collector is running and reachable from this network. Full error message:UNAVAILABLE: io exception
Channel Pipeline: [SslHandler#0, ProtocolNegotiators$ClientTlsHandler#0, WriteBufferingAndExceptionHandler#0, DefaultChannelPipeline$TailContext#0]

To solve this issue, use the Jaeger exporter instead. See Exporters configuration.

Jaeger can’t export spans

If you’re seeing the following warnings in your logs, it means that the Java agent can’t send trace data to the Smart Agent, the OTel Collector, or Splunk Observability Cloud using the Jaeger exporter:

[BatchSpanProcessor_WorkerThread-1] WARN io.opentelemetry.exporter.jaeger.thrift.JaegerThriftSpanExporter - Failed to export spans
io.jaegertracing.internal.exceptions.SenderException: Could not send 8 spans
   at io.jaegertracing.thrift.internal.senders.HttpSender.send(HttpSender.java:69)
   ...
Caused by: java.net.ConnectException: Failed to connect to localhost/0:0:0:0:0:0:0:1:9080
   at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:265)
   ...
Caused by: java.net.ConnectException: Connection refused (Connection refused)
   ...

To troubleshoot the lack of connectivity between Jaeger and Splunk Observability Cloud, try the following steps:

  1. Make sure that otel.exporter.jaeger.endpoint points to a Smart Agent or OpenTelemetry Collector instance, or to the Splunk Ingest URL. See the Splunk Ingest URL summary in Summary of Splunk Observability Cloud API Endpoints.

  2. Check that the Smart Agent or the OTel Collector instance is up and configured.

  3. Check that the Jaeger Thrift HTTP receiver is enabled and plugged into the traces pipeline. See Check exposed ports.

  4. Check that the endpoint is correct. The Smart Agent and OpenTelemetry Collector use different ports and paths by default. For the Jaeger receiver, the Smart Agent uses http://<host>:9080/v1/trace, while the OTel Collector uses http://<host>:14268/api/traces.
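Because the two receivers differ in both port and path, a small helper can reduce copy-and-paste mistakes. The sketch below encodes only the default addresses listed in the last step. The function name jaeger_endpoint and its target labels are hypothetical, and your deployment might use non-default ports.

```shell
# Print the default Jaeger receiver URL for a target type and host.
jaeger_endpoint() {
  case "$1" in
    smart-agent) echo "http://$2:9080/v1/trace" ;;
    collector)   echo "http://$2:14268/api/traces" ;;
    *)           echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Example: point the Jaeger exporter at a Collector on the same host.
OTEL_EXPORTER_JAEGER_ENDPOINT=$(jaeger_endpoint collector localhost)
export OTEL_EXPORTER_JAEGER_ENDPOINT
```

Compare the printed value with the address in the ConnectException message to see which receiver the agent is actually trying to reach.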

401 error when sending spans

If you send traces directly to Observability Cloud and receive a 401 error code, the authentication token specified in SPLUNK_ACCESS_TOKEN is invalid. The following are possible reasons:

  • The value is null.

  • The value is not a well-formed token.

  • The token is not an access token that has authScope set to ingest.

Make sure that you’re using a valid Splunk access token when sending data directly to Observability Cloud. See Create and manage user API access tokens.
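A quick local check can rule out the first cause before you start the application. The following sketch tests only whether SPLUNK_ACCESS_TOKEN is empty or contains whitespace, which is an assumed symptom of a copy-and-paste error. It can’t confirm that the token is well formed or that it has the ingest authScope. The function name is hypothetical.

```shell
# Report obvious problems with SPLUNK_ACCESS_TOKEN without printing its value.
check_access_token() {
  if [ -z "${SPLUNK_ACCESS_TOKEN:-}" ]; then
    echo "SPLUNK_ACCESS_TOKEN is empty or unset"
    return 1
  fi
  case "$SPLUNK_ACCESS_TOKEN" in
    *[[:space:]]*)
      echo "SPLUNK_ACCESS_TOKEN contains whitespace"
      return 1
      ;;
  esac
  echo "SPLUNK_ACCESS_TOKEN is set"
}
```

Run check_access_token in the same shell or service environment that starts the JVM, because a token exported in another session isn’t visible to the agent.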

Metrics exporter issues

If you see the following warning in your logs, it means that the Java agent can’t send metrics to your OTel Collector, Smart Agent, or to the Observability Cloud endpoints:

[signalfx-metrics-publisher] WARN com.splunk.javaagent.shaded.io.micrometer.signalfx.SignalFxMeterRegistry - failed to send metrics: Unable to send data points

To troubleshoot connectivity issues affecting application metrics, try the following steps:

  1. Make sure that splunk.metrics.endpoint points to the correct host.

  2. Check that the Smart Agent or the OTel Collector instance is up and configured.

  3. Check that the Smart Agent and the OpenTelemetry Collector are using the correct ports for the SignalFx receiver. The Collector uses http://<host>:9943, and the Agent uses http://<host>:9080/v2/datapoint.

  4. Make sure that you’re using a valid Splunk access token when sending data directly to Observability Cloud. See Create and manage user API access tokens.
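To apply the first and third steps, you can build the splunk.metrics.endpoint argument from the default SignalFx receiver addresses listed above. This is a sketch only: the function name metrics_endpoint and its target labels are hypothetical, and the ports reflect the defaults in this section rather than your actual configuration.

```shell
# Print the default SignalFx receiver URL for a target type and host.
metrics_endpoint() {
  case "$1" in
    collector)   echo "http://$2:9943" ;;
    smart-agent) echo "http://$2:9080/v2/datapoint" ;;
    *)           echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Example: system property for a Collector running on the same host.
METRICS_FLAG="-Dsplunk.metrics.endpoint=$(metrics_endpoint collector localhost)"
echo "$METRICS_FLAG"
```

Add the printed argument to the java command line, then check whether the SignalFxMeterRegistry warning still appears in the logs.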

Note

Metric collection for Java using OpenTelemetry instrumentation is still experimental.