-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pre-proposal: websocket token authentication with subprotocols #119
Comments
@minrk thanks for this detailed, cross-referenced write-up. This all makes sense but I was wondering if you could elaborate a bit more on why we're rejecting alternative approaches. The reason I ask is that right now if someone looks at this proposal as written, it kinda looks like the justification for sending the token via WebSocket subprotocols is: "Kubernetes is doing it." But what Kubernetes is doing also looks like a hack and misuse of the subprotocol header. So I think it would be nice if we could expand our justification for adopting this approach, perhaps by going through the alternatives and explaining in detail why they are worse than the proposed approach. For example, I guess we can't use cookies because we must support scenarios where the browser app is served from a server on one hostname but the WebSocket connection needs to go to a server at a different hostname. I think it would be good to get all of this down in writing somewhere. |
Fair points! I was planning to do that if/when this becomes an actual JEP, but I can do it here, too. I think the main reason is that the transition to message-based handshaking is quite a bit more complex to do in a graceful way.
Correct, we can't use just cookies (we can and do use cookies in many cases). We have a requirement that delivering a token should be accepted in cases like:
for which the only channels we have available to us to pass credentials are:
An Authorization header would be standard and ideal and already accepted by Jupyter Server, but browser implementers have made the dubious decision to exclude this option from the javascript API. Right now, we are doing 1. in these cases, which is frankly okay as long endpoints are over https or localhost, but not ideal. The only downside is that query parameters may appear in logs, which is why it's identified as a CWE. Option 2. can also work, and does not require stuffing credentials in the subprotocols header. The downsides are mainly that it would be a more complex transition for clients and more complex implementation for servers. For servers:
For clients:
The subprotocols strategy is easier for both clients and servers because:
Both approaches are definitely doable. Another question to resolve is whether/how servers signal that they support the new handshake. For the kernels subprotocol, they simply don't and clients have to try connecting with and reconnecting without, which has worked fine, to my knowledge. If we declare a mechanism to unambiguously signal support on the server side, clients can be simplified slightly in that the discovery of which handshake to use can happen explicitly first, but both implementations must still be present for compatibility. |
A shorter summary of the comparison is that I thought for years that the transition to message handshakes was the only way and it made me too tired to think about implementing properly, so I never proposed it. Only when I saw the kubernetes approach did I think it was sufficiently simple to actually write down this proposal for something that's been a known issue since before this project was called Jupyter. |
Thank you! That's exactly what I was asking for and very helpful. Your last message about feasibility really captures for me why your proposed way forward is the only one that makes sense given the constraints. One thing that came up in conversation at the SSC meeting today is a concern about auth token size and any potential size limitation on the header. Do we have constraints on token size? For example, is it possible for a Jupyter deployment to have tokens that somehow change based on the user's username and if the user has a very long username, could that token get long enough to hit up against some size limitation on the headers? Should we save that question and others for a JEP? |
we don't have our own constraints, and there will likely be deployments where the token is a JWT, which can technically get arbitrarily long, but generally not anywhere close to the general header limit, which is configurable but tends to be around 8kB. But I don't think this introduces any related issues, because we are already passing this same token in the Authorization header for all non-websocket API requests. So if token size is an issue, it would be for all API requests, not unique to websockets. Passing this via Sec-WebSocket-Protocol is only a few more bytes than passing it via Authorization. |
Sorry for this late reply, thanks @minrk for this very detailed pre-proposal. Even if it is backward compatible and does not break anything, I think it would be worth having a JEP, at least for documentation purpose for maintainers of the affected projects. I'll let other people confirm it, but it was the general feeling during the last SSC meeting. |
JEP is open: #121 |
Background
Combining these facts:
As a result, websocket requests must pass tokens in a URL parameter.
motivates having a new mechanism by which to pass the auth token for websocket requests that's not in the URL.
Proposal
Add support for passing authentication tokens in the websocket subprotocol for Jupyter Server, to avoid tokens in URL parameters.
This is a scheme devised by Kubernetes, where the subprotocols API allows specifying the Sec-Websocket-Protocol header, and we can put the token in there.
Example implementation
(draft implementation of this at jupyter-server/jupyter_server#1407)
To authenticate websockets via new mechanism, clients should:
which sets the header:
The response will have the header:
The reason for the double subprotocol is that if any subprotocol is requested, the response must include one of the requested subprotocols for the connection to be accepted by all browsers. Not all browser require this, but Chrome does. The
v1.token.websocket.jupyter.org
serves no purpose if there is already a subprotocol defined and required, and should be optional.Remaining questions
The main implementation question is how should frontends negotiate/discover this feature. We have already done it once in jupyterlab/jupyterlab#11841 adding the kernel subprotocol, and the solution was to make no assumptions and try once with subprotocols, and then reconnect without them if it fails. This becomes a little bit more complicated when there are two subprotocols that may not be implemented, but the same strategy is valid and requires no prior knowledge of the server's support for the new scheme.
Why a JEP
This is a JEP because it affects a relatively low-level feature in the Jupyer Server that clients (JupyterLab) should implement. It is fully backward-compatible, so perhaps it doesn't need to be, but it does involve implementation in both jupyter-server and jupyterlab. I have a proposed implementation already. I'd be happy for this to just be a new feature in Jupyter Server that future frontends can implement as needed. There is no plan to deprecate the existing behavior, so there is relatively little pressure to implement by frontends, as nothing can be broken by this.
There are also multiple possible ways to go about this, such as a handshake message, which has a lot of disadvantages, as I see it (backward-compatiblity being much harder, for one).
Affected subprojects:
Recommended reviewers:
The text was updated successfully, but these errors were encountered: