Fix the query hashing algorithm #6205

Geal · 2024-10-29T09:48:17Z

This splits part of the work from #5255 to make it easier to merge. This PR only covers the fixes for the query hashing algorithm, which is currently used in entity caching, without integrating the changes to the query planner cache key.

IvanGoncharov · 2024-10-29T22:49:50Z

apollo-router/src/spec/query/change.rs

+        Ok(visitor)
+    }
+
+    pub(crate) fn hash_schema(&mut self) -> Result<(), BoxError> {


As an idea for a follow-up PR: what if we used struct destructuring (without `..`) to access the fields on the schema?
That way, we can ensure that every field is considered for hashing: if new fields are added in apollo-rs, they will cause compiler errors here.
This protects us against additions to the GraphQL spec in the future, e.g., support for directives in a new location.
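
A minimal sketch of the destructuring idea, for illustration only. `SchemaDefinition` and its fields are hypothetical stand-ins, not the actual apollo-rs types, and `DefaultHasher` stands in for the visitor's own hasher. Because the pattern has no `..`, adding a field to the struct makes `hash_schema_definition` fail to compile until the new field is either hashed or explicitly ignored.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for a schema type; the real apollo-rs type differs.
struct SchemaDefinition {
    description: Option<String>,
    directives: Vec<String>,
    query: Option<String>,
}

fn hash_schema_definition<H: Hasher>(def: &SchemaDefinition, state: &mut H) {
    // No `..` in this pattern: every field must be named, so a new field
    // added to SchemaDefinition is a compile error here.
    let SchemaDefinition {
        description: _, // deliberately ignored in this sketch
        directives,
        query,
    } = def;

    "^SCHEMA".hash(state);
    directives.hash(state);
    query.hash(state);
}

// Helper for the demo: hash one definition and return the digest.
fn digest(def: &SchemaDefinition) -> u64 {
    let mut hasher = DefaultHasher::new();
    hash_schema_definition(def, &mut hasher);
    hasher.finish()
}

fn main() {
    let a = SchemaDefinition {
        description: Some("first".to_string()),
        directives: vec!["@link".to_string()],
        query: Some("Query".to_string()),
    };
    let b = SchemaDefinition {
        description: Some("second".to_string()),
        directives: vec!["@link".to_string()],
        query: Some("Query".to_string()),
    };
    // Only the ignored field differs, so the digests match.
    println!("same digest: {}", digest(&a) == digest(&b));
}
```

Writing `description: _` rather than `..` keeps the omission visible in review: a reader can see which fields were left out on purpose.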

Geal · 2024-10-31

It could be useful for the other structures we use elsewhere in the visitor. Here, the schema also holds a list of types, but the point of this algorithm is to ignore irrelevant types.

IvanGoncharov · 2024-11-01T00:48:55Z

apollo-router/src/spec/query/change.rs

+
+        "^ARGUMENT_DEF_LIST".hash(self);
+        for argument in &field_def.arguments {
+            self.hash_input_value_definition(argument)?;


I'm not 100% sure, but it looks like this should also hash @fromContext:
https://www.apollographql.com/docs/graphos/reference/federation/directives#fromcontext
I looked into apollo-federation, and it seems that the Rust QP doesn't support it, but it is implemented in the JS QP.
I don't think this directive will affect the entity cache, because you also hash the selection set of the subgraph query (I'm just assuming here).

However, this directive should affect QP (at least for the JS planner), so leaving it out of the hash would break QP caching.

IvanGoncharov

@Geal I triple checked this PR and it looks great 👍
The only thing I found is @fromContext, otherwise it is ready to be merged.

If @fromContext is not an issue, then just ping me on Slack and I will approve this one ASAP.

Geal · 2024-11-05T17:09:41Z

@IvanGoncharov I added initial support for the context directives. I will take another look tomorrow, but I think the approach is sound:

- As we go through the query, whenever we encounter a @context directive, we record the type in a map: context name -> [Type].
- When we encounter a field with the join__field directive and its contextArguments argument, then for each argument we take the context name and selection.
- We look up the types linked to that context name. Given the way we go through the query and relevant types, this might return types that do not appear in this query, so it might be a bit stricter than necessary.
- For each of those types, we then take the selection and hash it in the same way we handle @key and @requires.

There are some assumptions in there that we might want to check, like what happens when a contextArgument refers to a context that we did not find in the query (maybe that would have been prevented at composition?).
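
The bookkeeping described in this comment could be sketched roughly as follows. `ContextTracker`, `record_context`, and `selections_to_hash` are hypothetical names, not the router's code, and the type and context names in `main` are made up. Returning nothing for an unknown context name is one assumed answer to the open question above, not confirmed behavior.

```rust
use std::collections::HashMap;

// Hypothetical tracker: context name -> types carrying that @context.
#[derive(Default)]
struct ContextTracker {
    contexts: HashMap<String, Vec<String>>,
}

impl ContextTracker {
    // Called whenever a @context directive is found on a type.
    fn record_context(&mut self, context_name: &str, type_name: &str) {
        let types = self.contexts.entry(context_name.to_string()).or_default();
        if !types.iter().any(|t| t.as_str() == type_name) {
            types.push(type_name.to_string());
        }
    }

    // Called for each contextArguments entry on a join__field directive.
    // Returns the (type, selection) pairs that would then be hashed like
    // @key and @requires selections. An unknown context yields nothing
    // (an assumption made for this sketch).
    fn selections_to_hash(&self, context_name: &str, selection: &str) -> Vec<(String, String)> {
        self.contexts
            .get(context_name)
            .map(|types| {
                types
                    .iter()
                    .map(|t| (t.clone(), selection.to_string()))
                    .collect()
            })
            .unwrap_or_default()
    }
}

fn main() {
    let mut tracker = ContextTracker::default();
    tracker.record_context("userContext", "User");
    tracker.record_context("userContext", "Admin");

    for (type_name, selection) in tracker.selections_to_hash("userContext", "{ locale }") {
        println!("hash selection {selection} against type {type_name}");
    }
}
```

Every type registered under a context name is returned, whether or not the query reaches it, which matches the "stricter than necessary" caveat in the comment.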

IvanGoncharov · 2024-11-26T19:54:40Z

Converting to draft as a result of internal discussion regarding the risks introduced by the parent PR; see details in #5255.

IvanGoncharov and others added 30 commits May 27, 2024 15:57

add test cases

f61d243

add a separator between each part of the hash

12658ac

this makes sure there will be no possible collision by extension (example: hashing `ab` then `cd` VS hashing `a` then `bcd`)
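
A small sketch of the collision this commit guards against. FNV-1a stands in for the real hasher so the byte stream is easy to follow, and the `0xff` separator is an assumption chosen because that byte never occurs in valid UTF-8; the separator values the PR actually uses are not shown in this conversation.

```rust
// FNV-1a over a byte slice, continuing from an existing state.
fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, b| (h ^ u64::from(*b)).wrapping_mul(0x100000001b3))
}

// Hash a sequence of parts, optionally writing a separator after each one.
fn hash_parts(parts: &[&str], separator: Option<u8>) -> u64 {
    let mut state: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for part in parts {
        state = fnv1a(state, part.as_bytes());
        if let Some(sep) = separator {
            state = fnv1a(state, &[sep]);
        }
    }
    state
}

fn main() {
    // Without a separator both inputs produce the byte stream "abcd".
    let collide = hash_parts(&["ab", "cd"], None) == hash_parts(&["a", "bcd"], None);
    // With a separator the streams are "ab\xffcd\xff" and "a\xffbcd\xff".
    let distinct =
        hash_parts(&["ab", "cd"], Some(0xff)) != hash_parts(&["a", "bcd"], Some(0xff));
    println!("collide without separator: {collide}, distinct with separator: {distinct}");
}
```

Note that Rust's `Hash` implementation for `str` already appends a terminator byte, so this kind of collision mainly arises when raw bytes are written to the hasher directly.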

deactivate the query string hashing for now

b768d00

fix some tests

78ca936

reactivate all tests

2511864

fixes for query variables

d783eee

fix some field hashing tests

459681d

add separators

842dce9

hash some directives

9bb31df

hash the schema

d632e8a

hash directive definitions

9da105e

cleanup

6e6d517

hash interface implementers

85d93fe

update hashes

7968c22

Merge branch 'dev' into geal/fix-hashing-algorithm

47dc6fe

lint

4d7ceea

add a test for directives applied to interface definitions

9c34952

cleanup

fcb3f3f

add a metric tracking how many query plans could be reused

0b17980

simplify the caching key

bd7e5eb

update hashes

22359ed

Merge branch 'dev' into geal/fix-hashing-algorithm

dfd8819

fix integration tests

4d47696

Merge branch 'dev' into geal/fix-hashing-algorithm

d1e2f34

lint

adb9d11

Merge branch 'dev' into geal/fix-hashing-algorithm

2263dcb

Merge branch 'dev' into geal/fix-hashing-algorithm

98ec603

Merge branch 'dev' into geal/fix-hashing-algorithm

bf53aa2

update snapshots

b8361b9

update snapshots

ccf9fe2

IvanGoncharov reviewed Oct 29, 2024

apollo-router/src/spec/query/change.rs (outdated)

IvanGoncharov reviewed Oct 29, 2024

apollo-router/src/spec/query/change.rs

IvanGoncharov reviewed Oct 30, 2024

apollo-router/src/spec/query/change.rs

IvanGoncharov reviewed Oct 30, 2024

apollo-router/src/spec/query/change.rs (outdated)

IvanGoncharov reviewed Oct 30, 2024

apollo-router/src/spec/query/change.rs

IvanGoncharov reviewed Oct 30, 2024

apollo-router/src/spec/query/change.rs (outdated)

Geal and others added 3 commits October 31, 2024 15:16

Apply suggestions from code review

9394ee7

Co-authored-by: Ivan Goncharov <[email protected]>

review feedback

350fde0

separate field definition hashing from query field hashing and cache it

1598fbf

We do not need to hash the same field's definition multiple times, but the query's field usage needs to always be done
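
One way to picture this split, as a rough sketch only: `QueryHasher`, `seen_fields`, and the `^FIELD_DEF` / `^FIELD` tags are hypothetical, and a log of strings stands in for the bytes that would be fed to the real hasher. The definition of a field is recorded the first time the field is seen, while each usage in the query is recorded every time.

```rust
use std::collections::HashSet;

// Hypothetical visitor state; `log` stands in for the real hasher input.
#[derive(Default)]
struct QueryHasher {
    seen_fields: HashSet<(String, String)>,
    log: Vec<String>,
}

impl QueryHasher {
    fn hash_field(&mut self, parent_type: &str, field_name: &str, alias: Option<&str>) {
        // `insert` returns true only the first time this (type, field) pair
        // is seen, so the definition is hashed once.
        let key = (parent_type.to_string(), field_name.to_string());
        if self.seen_fields.insert(key) {
            self.log
                .push(format!("^FIELD_DEF {}.{}", parent_type, field_name));
        }
        // The usage in the query (alias here; arguments and directives in
        // practice) is always hashed.
        self.log
            .push(format!("^FIELD {} {}", alias.unwrap_or(""), field_name));
    }
}

fn main() {
    let mut hasher = QueryHasher::default();
    hasher.hash_field("Query", "me", None);
    hasher.hash_field("Query", "me", Some("viewer"));
    for entry in &hasher.log {
        println!("{entry}");
    }
}
```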

IvanGoncharov reviewed Nov 1, 2024

IvanGoncharov requested changes Nov 1, 2024

Geal added 3 commits November 5, 2024 17:09

Merge branch 'dev' into geal/fix-query-hashing

4c444ca

add a test for the context and fromContext directives

fb25e9d

add support for the context directives

56e81af

Geal added 7 commits November 5, 2024 18:10

lint

8302c45

Merge branch 'dev' into geal/fix-query-hashing

71209e3

add comment for fromContext

55161ef

Merge branch 'dev' into geal/fix-query-hashing

57c3096

Merge branch 'dev' into geal/fix-query-hashing

f618d2c

fix test

4b0d42a

fix tests

e9b8997

bnjjj requested a review from IvanGoncharov November 18, 2024 15:38

IvanGoncharov self-assigned this Nov 26, 2024

IvanGoncharov marked this pull request as draft November 26, 2024 19:41

IvanGoncharov marked this pull request as ready for review November 27, 2024 15:28

IvanGoncharov approved these changes Nov 27, 2024

Add changelog

b61497d
