
OpenTelemetry: Race condition between connection shutdown and export can result in duplicate spans #44894

Open
arn-ivu opened this issue Dec 3, 2024 · 4 comments
Labels: area/tracing, kind/bug


arn-ivu commented Dec 3, 2024

Describe the bug

When there are ~10 seconds between two traces/spans being exported, it can happen that the exporter closes the connection after the next spans have been sent but before the otel-collector responds with OK.
This is what it looks like at the TCP level:

//traces are exported, everything OK
 1.  12:55:22 app->otel POST /v1/traces (protobuf-content a)
 2.  12:55:22 otel->app OK
//a trace is exported, directly followed by connection termination
 3.  12:55:32 app->otel POST /v1/traces (protobuf-content b)
 4.  12:55:32 otel->app ACK
 5.  12:55:32 app->otel FIN
 6.  12:55:32 otel->app OK
 7.  12:55:32 otel->app FIN, ACK
//connection re-opened
 8.  12:55:32 app->otel SYN
 9.  12:55:32 otel->app SYN, ACK
10.  12:55:32 app->otel ACK
//traces are exported again
11.  12:55:32 app->otel POST /v1/traces (protobuf-content b)
12.  12:55:32 otel->app OK

Here is the actual TCP dump containing the example: /example/tcp_traffic_http.pcap

The server's OK in step 6 is presumably dropped because the client has already closed its side of the connection (which conforms to the HTTP standard). This results in a timeout followed by a retry of the export.
Since the otel-collector apparently does not deduplicate spans, the retry produces duplicate spans in the trace.
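
To make the failure mode concrete, here is an illustrative sketch of the kind of retry loop that produces this behavior (hypothetical class and helper names, not the OpenTelemetry SDK's actual code): because the client cannot distinguish a lost request from a lost response, a timeout forces it to re-send the same payload.

import java.util.concurrent.TimeoutException;

// Illustrative only: a simplified exporter retry loop with hypothetical
// helpers; not the SDK's actual implementation.
class NaiveRetryingExporter {

    private static final int MAX_ATTEMPTS = 3;

    void export(byte[] payload) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                post(payload);  // steps 3/11: POST /v1/traces
                awaitOk();      // step 6: the OK is lost because the connection was closed
                return;         // success, stop retrying
            } catch (TimeoutException e) {
                // "Request lost" and "response lost" are indistinguishable here,
                // so the same payload is re-sent (steps 8-12); the collector does
                // not deduplicate and stores the spans twice.
            }
        }
    }

    private void post(byte[] payload) { /* send the HTTP request */ }

    private void awaitOk() throws TimeoutException { /* read the response */ }
}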

This is how the trace should look: trace_normal.json
This is how the trace looks when the issue occurs: trace_with_duplicates.json

From what I can see in the code, I assume this is a race condition: the connection shutdown is triggered but does not set the isShutdown flag before the next spans are exported. I imagine something like this happening:

thread 1:
    // we didn't export anything for 10s. Let's close the connection
    VertxHttpSpanExport.shutdown()
thread 2:
    // receives new spans to export
    VertxHttpSpanExport.export()
        HttpExporter.export()
            if (isShutdown.get()) -> false
            // proceed exporting
thread 1:
    HttpExporter.shutdown()
        if (!isShutdown.compareAndSet(false, true))
        // proceed shutting down
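
If that interleaving is the cause, one way to close the window is to make shutdown wait for in-flight exports before tearing down the connection. Below is a minimal sketch of that idea (a hypothetical GuardedExporter, not the actual Quarkus/SDK code; the real Vert.x sender is asynchronous, so a production fix would have to track request completion rather than block):

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: exports hold the read lock, shutdown takes the write lock, so the
// connection can only be closed once no export is in flight.
class GuardedExporter {

    private final AtomicBoolean isShutdown = new AtomicBoolean(false);
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    boolean export(byte[] payload) {
        lock.readLock().lock();            // concurrent exports are still allowed
        try {
            if (isShutdown.get()) {
                return false;              // shutdown won the race: refuse cleanly
            }
            doExport(payload);             // the actual HTTP/gRPC call
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    void shutdown() {
        if (!isShutdown.compareAndSet(false, true)) {
            return;                        // already shut down
        }
        lock.writeLock().lock();           // blocks until in-flight exports finish
        try {
            closeConnection();             // safe: no request is cut off mid-flight
        } finally {
            lock.writeLock().unlock();
        }
    }

    private void doExport(byte[] payload) { /* send spans and await the OK */ }

    private void closeConnection() { /* close the underlying client */ }
}

With a guard like this, shutdown() can still win the race against the isShutdown check, but it can no longer close the connection while a POST is waiting for its OK.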

Expected behavior

The exporter should not export the same traces twice to the otel-collector.

Actual behavior

The exporter sometimes (probably due to a race condition) closes the connection to the otel-collector right after spans were exported, but before the otel-collector could respond with OK. It then reopens the connection and sends the spans a second time.

How to Reproduce?

Reproducer: https://github.com/arn-ivu/reproducer-duplicate-spans

Requirements:

  • Docker
  • optional: a tool to capture TCP traffic, e.g. Wireshark

Steps to reproduce:

  • Start the reproducer: ./mvnw compile quarkus:dev
  • Since this is about a race condition, you may have to wait some time until the issue occurs (on my machine it often took ~10 minutes)

Find duplicate spans in Grafana:

  • In the dev-ui: http://localhost:8080/q/dev-ui/extensions
    • Click on Grafana-ui
  • Navigate to "Explore", choose "Tempo" as the source
  • Paste this query into the TraceQL input: {resource.service.name="reproducer-duplicate-spans"} | count() > 2

Alternatively, monitor the TCP traffic with Wireshark.

Output of uname -a or ver

Linux e5808115ef08 5.15.167.4-microsoft-standard-WSL2 #1 SMP Tue Nov 5 00:21:55 UTC 2024 x86_64 Linux

Output of java -version

openjdk version "22.0.1" 2024-04-16
OpenJDK Runtime Environment Temurin-22.0.1+8 (build 22.0.1+8)
OpenJDK 64-Bit Server VM Temurin-22.0.1+8 (build 22.0.1+8, mixed mode, sharing)

Quarkus version or git rev

3.17.2

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.9.9

Additional information

While I have analyzed the problem mostly for quarkus.otel.exporter.otlp.protocol=http/protobuf,
I observed the same problem with quarkus.otel.exporter.otlp.protocol=grpc.
See https://github.com/arn-ivu/reproducer-duplicate-spans/tree/main/examples for the TCP traffic of both cases.


quarkus-bot bot commented Dec 3, 2024

/cc @brunobat (opentelemetry), @radcortez (opentelemetry)


arn-ivu commented Dec 3, 2024

Here is the original Zulip chat for this issue: https://quarkusio.zulipchat.com/#narrow/channel/187030-users/topic/Quarkus-Opentelemetry.3A.20Sporadic.20occurence.20of.20duplicate.20spans
We did some poking in the dark originally, so it might not be super helpful for this issue.


brunobat commented Dec 4, 2024

From the chat:
"I guess we are sending the FIN before the span b ACK from the collector, therefore marking it as not exported and trying again later.
Looks like the 1st ACK for span b is lost."
I guess there shouldn't be a FIN before the ACK for span b has timed out.
@geoand, I wonder if you have some spare cycles to look at this one?


geoand commented Dec 4, 2024

I might, but don't count on it :)
