Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't panic on unknown token_id in gRPC callbacks. #210

Conversation

erikness-doordash
Copy link
Contributor

@erikness-doordash erikness-doordash commented Oct 18, 2023

Sometimes Dispatcher.on_grpc_receive_initial_metadata() will get an invalid stream ID, and then panic. Dispatcher.on_grpc_receive_trailing_metadata() as well.

This is a quick fix that will turn a panic!() call into a trace!() call. The broader issue/bug is out of scope here, though as far as I know it's benign other than this panic.

NOTE: there may be a more concise way to write this Rust code. (thanks Piotr)
NOTE: adding this change to on_grpc_receive() as well, just to be safe, even though I haven't seen this panic get triggered myself.

See:

  1. Issue with more details: Panic when there are no remote hosts #206
  2. Issue tracking why these two methods get an invalid token number at all: gRPC callbacks are reentrant, defer them proxy-wasm-cpp-host#373

src/dispatcher.rs Show resolved Hide resolved
src/dispatcher.rs Show resolved Hide resolved
@erikness-doordash erikness-doordash changed the title Change panic to trace log on_grpc_receive_initial_metadata() Change panic to trace log during on_grpc_receive_initial_metadata() Oct 18, 2023
@erikness-doordash
Copy link
Contributor Author

Just tested this, no more crash after scaling a remote service to 0

@erikness-doordash erikness-doordash force-pushed the patch-initial-metadata-tokenid-bug branch 2 times, most recently from fc6da7c to 81585d2 Compare October 19, 2023 23:10
@erikness-doordash erikness-doordash changed the title Change panic to trace log during on_grpc_receive_initial_metadata() Change panic to trace log during on_grpc_receive_[initial/trailing]_metadata() Oct 19, 2023
.expect("invalid token_id");
let grpc_streams_ref = self.grpc_streams.borrow_mut();
let context_id_hash_slot = grpc_streams_ref.get(&token_id);
let context_id = match context_id_hash_slot {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be written as:

let context_id = match self.grpc_streams.borrow_mut().get(&token_id) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason I thought the rust compiler would stop me. Checked again though, it works fine, thanks for the tip

src/dispatcher.rs Show resolved Hide resolved
src/dispatcher.rs Show resolved Hide resolved
@PiotrSikora
Copy link
Member

Could you fix DCO?

@erikness-doordash
Copy link
Contributor Author

Code search to support patching on_grpc_receive() as well

I went looking through the host code; I couldn't find where exactly Envoy calls these methods in the unhappy path (can't init stream), it may be pretty low level.

But looking at this call stack, I suspect on_grpc_receive() needs this too:

  1. (proxy-wasm-cpp-host) ContextBase::onGrpcReceiveInitialMetadata
  2. (envoy) Context::onGrpcReceiveInitialMetadataWrapper()
  3. (envoy) GrpcStreamClientHandler::onReceiveInitialMetadata()
    1. AsyncStreamImpl::onHeaders()
      1. AsyncStreamImpl::encodeHeaders()
      2. ???
    2. GoogleAsyncStreamImpl::handleOpCompletion()
      • This has a branch for 2 methods on the "callbacks_" object: onReceiveInitialMetadata() and onReceiveMessageRaw()

(similar for trailing metadata)

Though I have not seen on_grpc_receive() calls triggered in our own code in this circumstance (see: #206).

@erikness-doordash erikness-doordash changed the title Change panic to trace log during on_grpc_receive_[initial/trailing]_metadata() Change panic to trace log during 3 on_grpc_receive*() messages Oct 26, 2023
Copy link
Member

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, sans the nit.

src/dispatcher.rs Outdated Show resolved Hide resolved
Signed-off-by: erikness-doordash <[email protected]>
@PiotrSikora PiotrSikora changed the title Change panic to trace log during 3 on_grpc_receive*() messages Don't panic on unknown token_id in gRPC callbacks. Nov 8, 2023
@erikness-doordash
Copy link
Contributor Author

LGTM, sans the nit.

@PiotrSikora can you merge it?

Copy link
Member

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@PiotrSikora PiotrSikora merged commit 9b4b4a5 into proxy-wasm:master Nov 9, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants