introduce HeaderValue type to replace use of String #27
6004488 to db44a7c
Signed-off-by: Yaroslav Skopets <[email protected]>
db44a7c to 2d818a3
@PiotrSikora PTAL. While it's not encouraged, we do have certain traffic that contains non-UTF-8 header values. Since the spec has historically allowed it, we shouldn't block that.
This is actually a very nuanced issue, since it exposes users to various attacks... but I agree that we should allow opaque data, and I had @yskopets the (Note: I need to wrap up a few things over the next 2 days, so I'll get back to this later in the week, since I don't want to make this breaking change in a rush.)
I was thinking about returning A typical developer journey starts from
    for (name, value) in &request_headers() {
        println!("{}={}", name, value);
    }
Returning a type like For comparison, And if you decide to return Regarding BTW, have you looked into https://crates.io/crates/http ? I started from there, but since it doesn't allow for pseudo headers like
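To make the "developer journey" point concrete, here is a minimal sketch of how an opaque header value could still work in that `println!` loop. It assumes a hypothetical `HeaderValue` newtype over bytes with a lossy `Display` impl; the names `from_bytes`, `as_bytes`, `to_str` and `render` are illustrative only and are not taken from this PR or from the SDK.

```rust
use std::fmt;

/// Hypothetical stand-in for the `HeaderValue` type discussed here:
/// an opaque byte container. Not the SDK's actual definition.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct HeaderValue(Vec<u8>);

impl HeaderValue {
    /// Wraps raw header bytes without validating them.
    pub fn from_bytes(bytes: &[u8]) -> Self {
        HeaderValue(bytes.to_vec())
    }

    /// Gives access to the bytes exactly as they were received.
    pub fn as_bytes(&self) -> &[u8] {
        &self.0
    }

    /// Returns `Some` only when the bytes are valid UTF-8.
    pub fn to_str(&self) -> Option<&str> {
        std::str::from_utf8(&self.0).ok()
    }
}

impl fmt::Display for HeaderValue {
    /// Lossy rendering: invalid sequences become U+FFFD, so printing never panics.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&String::from_utf8_lossy(&self.0))
    }
}

/// Hypothetical helper mirroring the loop above, formatting each pair as `name=value`.
pub fn render(headers: &[(String, HeaderValue)]) -> Vec<String> {
    headers
        .iter()
        .map(|(name, value)| format!("{}={}", name, value))
        .collect()
}
```

With a `Display` impl like this the loop compiles unchanged, while callers who care about exact bytes use `as_bytes` and callers who want to reject invalid input use `to_str`.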
@PiotrSikora After looking into other use cases, I think, it makes sense to rename In particular,
Users who need operations such as Another reason to have |
Wondering if this is a blocking change for moving off our fork, considering that all the libraries discussed are not API-stable (even http is 0.2.4). @lizan, there's an implication that there is customer traffic that not only exists, but is specifically using rust-wasm-sdk. If that's so, can you have them comment on which non-UTF-8 headers they are using? I mean directly, because sometimes engineers make mistakes, libraries change to suit them, and then they stop using that feature. For example, many core libraries lack support for non-string values and instead deal with it on the decoding side (e.g. substituting with ? when bytes are messed up). I would argue this isn't a blocking change if there is no current user of this, especially in such a young library ecosystem.
here's a suggested way out.
PS: on the point of the actual change here, I would be really cautious about any binary API, especially looking at the broader picture and the impact of the change. I've had some experience making fast parsers, and binary is a double-edged sword. In the worst case, people end up converting to strings in tight loops, and you can't cache that anymore because they only have a ref to the binary form. Even if there is a utility to cache the string, it can be awkward to have someone do non-allocating comparisons with pre-existing strings, leading to the same problem. Bottom line: there are probably many reasons why nearly all HTTP APIs across languages choose not to allow arbitrary binary representations at the HTTP logical abstraction. While it would primarily be about usability, performance would likely come a quick second, especially in code that runs in a proxy. Even if the customer were able to convince this particular codebase to have an API that harms others, they won't be able to change the rest of the ecosystem, other languages, etc. IOW, I would backfill any missing UTF-8 test cases, but stop there. If there is code here that converts binary headers into strings, just backfill tests that prove it doesn't panic on malicious or poorly chosen header values.
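As an illustration of the "backfill tests that prove it doesn't panic" suggestion, the sketch below contrasts a strict and a lossy conversion using only the standard library. The function names `decode_strict` and `decode_lossy` are hypothetical and do not correspond to anything in this codebase; the point is that neither path can panic, unlike calling `unwrap()` on a failed `String::from_utf8`.

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

/// Strict decoding: reports invalid UTF-8 as an error instead of panicking.
pub fn decode_strict(raw: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(raw)
}

/// Lossy decoding: replaces invalid sequences with U+FFFD.
/// Valid input is borrowed, so the common case does not allocate.
pub fn decode_lossy(raw: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(raw)
}
```

A backfilled test could feed both functions a mix of well-formed values and byte sequences such as a stray `0xFF` or a truncated multi-byte sequence, and assert only on the returned value.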
Last note, mainly to brain dump: if non-interference is what really matters, I would classify that differently than an API. For example, if headers come in as bytes and need to stay exactly the same bytes, malicious or not, we could have a "normal" API that is a lazily decoded view of them. At that point, you only have to make a decision when reading back a weird value, and even if you decode into a string with ? substitution, the raw bytes stay as they are unless overwritten. This limits the impact to the ability to intentionally add poorly chosen binary values: doing that intentionally is not supported by the API, but pre-existing malicious or poorly chosen values remain supported.
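One way to read the "lazily decoded view" idea is sketched below. Everything here is an assumption for illustration: the type `LazyHeaderValue` and its methods are hypothetical, and the sketch substitutes U+FFFD (what `String::from_utf8_lossy` produces) rather than a literal `?`. It relies on `std::cell::OnceCell`, which needs a reasonably recent Rust toolchain.

```rust
use std::cell::OnceCell;

/// Hypothetical lazily decoded view over a header value.
/// The raw bytes are kept untouched; the string form is computed on first read.
pub struct LazyHeaderValue {
    raw: Vec<u8>,
    decoded: OnceCell<String>,
}

impl LazyHeaderValue {
    /// Wraps bytes as they arrived, without decoding or validating them.
    pub fn from_wire(raw: Vec<u8>) -> Self {
        Self { raw, decoded: OnceCell::new() }
    }

    /// String view, decoded lossily on first access and cached afterwards.
    pub fn as_str(&self) -> &str {
        self.decoded
            .get_or_init(|| String::from_utf8_lossy(&self.raw).into_owned())
    }

    /// The original bytes, unchanged unless `set` was called.
    pub fn raw(&self) -> &[u8] {
        &self.raw
    }

    /// Overwriting accepts only `&str`, so new values are always valid UTF-8.
    pub fn set(&mut self, value: &str) {
        self.raw = value.as_bytes().to_vec();
        self.decoded = OnceCell::new(); // drop the stale cached string
    }
}
```

Reading a value never alters what would be forwarded, and the only write path takes a `&str`, which matches the split described above between pre-existing odd values and intentionally added ones.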
Just pointing out we were dealing with a similar issue in deno, see here: denoland/deno#7208 This concerns me both from a
If there is a concern about breaking the |
We've actually just run into this as well with Having a |
Signed-off-by: spacewander <[email protected]>
Context
hostcalls::get_map crashes Envoy when an HTTP header value is not a UTF-8 encoded string #25