Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache handling error with cache-and-network policy in production #366

Open
romain-hasseveldt opened this issue Sep 30, 2024 · 6 comments
Open

Comments

@romain-hasseveldt
Copy link

I am encountering a strange issue with Apollo Client when using the cache-and-network fetch policy.

When utilizing useSuspenseQuery with keywords filters to retrieve posts, the query sometimes returns an empty array. However, despite this empty array, the frontend continues to display posts that should not be present.

What's strange is that this behavior occurs only in production or when querying the production backend from my local frontend. In my local environment, everything functions as expected without this issue.

I'm looking for insights into why this cache handling error might be happening and how to address it.

@jerelmiller
Copy link
Member

Hey @romain-hasseveldt 👋

Without seeing your setup and assuming you're using the app router, I'll provide you with a few things to keep in mind. The following is assuming you're running this in a client component, which is first rendered on the server in an SSR pass.

React determines when to keep the stream open based on Suspense boundaries, which are triggered by pending promises. Once the promises resolve and the suspense fallback is no longer shown, there is no guarantee that React will keep the stream open to provide additional updates from the server to the browser.

With a cache-and-network fetch policy, useSuspenseQuery will only suspend if it can't provide the full query result from the cache. If it can, then the component won't suspend and the cached result will be streamed to the browser. There is a chance that your query is reading an empty array from somewhere in your cache, and this is being streamed down to the browser. Unfortunately React doesn't provide any additional primitives for us that allows us to keep the stream open. Its possible that the stream may close before we can provide the "network" part of the fetch policy. The only way you'd see that result from the network request would be a race condition where the request completes before React closes the stream. This could be very hit or miss though so I wouldn't rely on it.

Because you're using a cache-and-network fetch policy, you're also guaranteed a network request, even if you get a cache hit for your query. I believe this is true once your client components are hydrated in the browser. This network request may actually be returning results, so this would explain why you see them show up later.

So in summary, you might be streaming an empty array from a server, then the browser executes the network request to fulfill the fetch policy. The result of the network request is what gives you the set of posts. This is pure speculation though since I don't have any code to go off of or understand your setup.

I'll challenge the idea though that cache-and-network will be useful for you here. As a best practice (and the way we setup the library), we require that you create a new client and cache instance for every request (this is done for you as long as you're creating the client in the makeClient function provided to ApolloNextAppProvider). This ensures that you don't accidentally leak sensitive user data between users by storing the result of multiple user's requests in the same cache. Since you're working with a new cache for each request, the server cache only helps if you're executing multiple queries in the same request with some overlapping data. For that reason, I'd highly recommend switching to a cache-first fetch policy instead since you're unlikely to take advantage of the additional network request.

Let me know if this helps!

@romain-hasseveldt
Copy link
Author

romain-hasseveldt commented Oct 1, 2024

Thank you for your thorough response, @jerelmiller, especially considering my issue, which I wrote quickly because it was late in France!

To give you more context: I am indeed using the app router, and my API is served by Strapi.

Here are the components in question:

First, the forum page:

<Suspense fallback={<PostListSkeleton />}>
  <PostList forumDocumentId={params.forumDocumentId} />
</Suspense>

Second, the PostList component:

'use client';

import { useSearchParams } from 'next/navigation';
import { useTranslations } from 'next-intl';
import { useCallback, useMemo, useTransition } from 'react';

import { PostFilterEnum } from '@/enums';
import {
  useGetForumPostsSuspenseQuery,
  useGetMeSuspenseQuery,
} from '@/gql/generated/graphql';
import { InfiniteScroll } from '@/lib/infinite-scroll';

import { PostCard } from './PostCard';
import { GenericList } from '../elements/GenericList';
import { PostListSkeleton } from '../skeletons/PostListSkeleton';

interface Props {
  forumDocumentId: string;
}

export const PostList = ({ forumDocumentId }: Props) => {
  const t = useTranslations('PostList');
  const searchParams = useSearchParams();
  const keywords = searchParams.get('keywords');
  const filter = searchParams.get('filter');
  const [isPending, startTransition] = useTransition();

  const {
    data: { me },
  } = useGetMeSuspenseQuery();

  const {
    data: { forumPosts },
    fetchMore,
  } = useGetForumPostsSuspenseQuery({
    fetchPolicy: 'cache-and-network',
    variables: {
      forumDocumentId,
      pagination: { start: 0, limit: 10 },
      sort: ['createdAt:desc'],
      ...((keywords || filter) && {
        filters: {
          ...(keywords && {
            or: [
              { text: { containsi: keywords } },
              { user: { firstName: { containsi: keywords } },
            ],
          }),
          ...(filter === PostFilterEnum.MyPosts && {
            user: { documentId: { eq: me?.documentId } },
          }),
          ...(filter === PostFilterEnum.MyCommentedPosts && {
            comments: { user: { documentId: { eq: me?.documentId } },
          }),
        },
      }),
    },
  });

  const pageInfo = useMemo(() => forumPosts?.pageInfo, [forumPosts]);

  const handleLoadMore = useCallback(() => {
    startTransition(() => {
      if (pageInfo) {
        fetchMore({
          variables: {
            pagination: {
              start: pageInfo.page * pageInfo.pageSize,
              limit: 10,
            },
          },
        });
      }
    });
  }, [pageInfo, fetchMore]);


  return (
    <InfiniteScroll
      isLoading={isPending}
      loader={<PostListSkeleton />}
      hasMore={(pageInfo?.page || 1) < (pageInfo?.pageCount || 1)}
      onLoadMore={handleLoadMore}
    >
      <GenericList
        isFullWidth
        items={forumPosts?.nodes}
        emptyMessage={t('empty')}
        emptyStyle='card'
        renderItem={(post) => (
          <PostCard post={post} forumDocumentId={forumDocumentId} />
        )}
      />
    </InfiniteScroll>
  );
};

I switched to a cache-and-network fetch policy to handle the fact that the only keyArgs for forumPosts is the forumDocumentId, as you can see here:

return new ApolloClient({
  cache: new InMemoryCache({
    typePolicies: {
      Query: {
        forumPosts: {
          keyArgs: ['forumDocumentId'],
          merge,
        },
      },
    },
  }),
});

As such, when I filter by keywords, by my posts, or by my commented posts, everything feeds into the same cache. This results, for example, in filtering by my posts, leaving the page, and returning to it, and Apollo hits the cache and always returns my posts, whereas I would prefer it to return all posts.

I hope this clarifies things. Maybe I shouldn't be using Suspense, or I should manage keyArgs better, or perhaps refetch?

Thanks again!

@jerelmiller
Copy link
Member

@romain-hasseveldt that certainly helps! Can I ask you to clarify this part for me?

Apollo hits the cache and always returns my posts, whereas I would prefer it to return all posts

I think I'm missing what the difference is between "posts" and "all posts" here. Can you perhaps give me a quick example of data you're expecting to see vs the data you getting back from the cache?

Maybe I shouldn't be using Suspense

Keep in mind that Suspense is what allows you to start the fetch on the server and stream the results to the browser. We disable fetches in useQuery on the server and only run them in the browser, so if you switch away from useSuspenseQuery here, your fetch won't happen until the client has hydrated. That changes the performance characteristics. Perhaps that is ok, but just wanted you to be aware of what that change would do.

I should manage keyArgs better

This is certainly possible, but I'd like to understand that first point a bit first. More than likely a combination of keyArgs + a read function might help here, but knowing what you're trying to return vs what is actually returning would help.

That said, I do think there is opportunity for some additional keyArgs here. For example, you include sorting as part of your variables. If that sortBy is not part of keyArgs you may have data on your server at the end of the list that you haven't fetched yet. Apollo Client doesn't know that though and because your only keyArg is that forumDocumentId, you may get a cache hit and return an incomplete list only containing the data you've fetched thus far (which may or may not be the full result set). Take a look through your variables and see which ones might have a similar effect.

@romain-hasseveldt
Copy link
Author

To give you an example @jerelmiller

I land on a page that lists all posts in a forum. All posts (within the pagination limit) are displayed.
I then filter the posts to show only the ones I created. At that point, only my posts are displayed.
I leave the page.
When I return to the page, the filter is no longer active, but I still only see my posts instead of all posts as expected.

Does that make it clearer?

@jerelmiller
Copy link
Member

@romain-hasseveldt yes that helps a lot!

What I suspect is happening is that because your keyArgs only contains the forumDocumentId, your merge function is overwriting your original result set when you apply a filter to your query. If the data written to the cache only contains the filtered results and overwrites the original list, then the next time you revisit the page and the cache tries to read from that field, the only data it has is the filtered result set so it gives you back whats stored in that field. This is speculation though since I don't know what your merge function looks like.

If you're set on storing everything in the same list, you're going to need to do that filtering yourself client-side with a read function. Since you don't have one currently, the default behavior is to return whatever data was last stored as a result of running your merge function (hopefully you can start to see the problem).

You can perform your own client-side filtering in a read function using the args option passed to the 2nd argument of the read function. You can use this to apply filtering to the full result set which your merge function should be returning. If your read function returns undefined at any point, this is a signal to Apollo Client that it has an incomplete result and it should instead fetch the data from the server.

If you haven't already, I'd highly recommend taking a look at the pagination core API docs which give a great overview of how the read and merge functions work together to return the result you're looking for. This has some great examples on how these work together.

That said, one big downside here is knowing whether or not your merged set of results contains the full result set from the server. The solution above can only perform client-side filtering. Unless you have a way to determine whether you've fetched the entire list of results from the server, your read function is only ever going to be able to return a subset of the data you've fetched thus far.

For this reason, I'd still encourage you to consider adding those filters as keyArgs. Yes this would store that data in separate lists, but its easier to determine what has been fetched vs what hasn't. You'll be doing MUCH less work client-side to make this work since you can let the complexity of filtering be handled by your server. If you're worried about memory, keep in mind that the cache is normalized, so records that exist in more than one of these lists are deduped and replaced with references.

Does this help?

@romain-hasseveldt
Copy link
Author

romain-hasseveldt commented Oct 1, 2024

Very helpful @jerelmiller , thank you for these precise answers.

I was testing at the same time passing $filters as key arguments.

The problem, correct me if I'm wrong, is that this way I will create as many references in the cache as there are filters. Considering I have a filter on keywords, I will end up with a cache ref for 'm', 'my', 'mys', 'mysu', 'mysup', 'mysupe', 'mysuper', 'mysuperf', 'mysuperfi', 'mysuperfil', 'mysuperfilt', 'mysuperfilte', 'mysuperfilter'...

Plus, when I'm creating a new post, should I be rewriting or refetching queries with all combinations of arguments?

That's the reason why I wanted to use cache-and-network, to ensure I'm as up-to-date as possible with the server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants