Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft][Please review] Make Paginator use simpler queries when there are no sql joins used (#8278) #11595

Draft
wants to merge 10 commits into
base: 3.3.x
Choose a base branch
from

Conversation

d-ph
Copy link
Contributor

@d-ph d-ph commented Sep 5, 2024

May I ask for an initial review of the code in this PR before I move forward with it, please?

This PR implements the scope of work discussed in #8278 (comment). It only implements the "no joins used" case. It doesn't implement the "only ToOne joins used" case. I plan do implement that other case in a future PR, after this one gets reviewed and hopefully merged.

A few comments:

  1. This PR adds quite a few instance variables to the Paginator class, and also adds a few big docblocks that explain things. Please let me know whether this is acceptable.
  2. As discussed in the github issue mentioned above, the feature introduced by this PR is only ever effective if the users explicitly use a certain argument value (namely: $fetchJoinCollection = null). This means that there could be a long period of time before the "auto-detection of better defaults" gets enabled by default (if ever).
  3. The PR lacks unit tests. I plan to add them after I get an initial "ok" for this PR.
  4. I see that this PR says that there are failing unit tests. They don't seem related to this PR's code change. I would look into reproducing them in my machine after the initial code review is done.

Thank you.

@d-ph d-ph changed the title [Draft] Make Paginator use simpler queries when there are no sql joins used (#8278) [Draft][Please review] Make Paginator use simpler queries when there are no sql joins used (#8278) Sep 5, 2024
@greg0ire greg0ire closed this Sep 5, 2024
@greg0ire greg0ire reopened this Sep 5, 2024
@greg0ire
Copy link
Member

greg0ire commented Sep 5, 2024

Triggering a new build to address 4. , #11596 should fix it.

src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
@d-ph
Copy link
Contributor Author

d-ph commented Oct 1, 2024

@greg0ire

Hi,

While I wait for some final answers from stof to my (regrettably) wall of text, may I know your initial thoughts on my proposition to rename some instance variables in the Paginator class, please?

I explained it in more details in the discussion above, but the bottom line is:

  1. Rename Paginator::$fetchJoinCollection to something along the following line: $query(Potentially)ProducesDuplicateRows(ForTheRootAlias). It's just a change of a variable name, because the "semantics" of that variable would remain the same (i.e. it being a boolean, defaulted to "true"). The potentially undesired consequence is that the Paginator::getFetchJoinCollection() would have to remain in the class, and be deprecated (a new getter would also be added as a replacement).

  2. Split Paginator::$useOutputWalkers into two variables: $useResultQueryOutputWalker and $useCountQueryOutputWalker. The potentially undesired consequence is that, again, the current getter and setter for that instance variable would be deprecated, and 4 new instance functions would be added: getters and setters for the 2 new variables. I'm not totally crazy about it myself, but it is what it is.


Would you say that you would initially allow for the changes described above to be made to the Paginator by me?

Thanks.

@greg0ire
Copy link
Member

greg0ire commented Oct 1, 2024

  1. If fetchJoinCollection is truly inaccurate, then yes, I'd be open to renaming it, but maybe to something shorter like $queryProducesDuplicates. The ensuing deprecation is no big deal IMO.
  2. Nothing to comment here, I think it's fine.

@d-ph
Copy link
Contributor Author

d-ph commented Oct 31, 2024

Hi all,

I pushed another iteration of this PR, taking into account the feedback so far. Could you re-evaluate, please?

Major points:

  1. This PR now goes farther than "agreed to". I.e. it analyses the query's AST to see what kind of JOINs are used, and what is being selected in the SELECT clause. If this is too far, then I may comment out some part of it, to "stagger" the release of this PR across multiple releases.
  2. The routine that analyses the AST is about 150 lines long. If this is too cluttering, then the "auto detection" could perhaps be extracted to a separate "factory class". Or I could put that function to the bottom of that file, so that it's less distracting to the causal reader.
  3. The "simple" count query is now used regardless of whether there are ToMany JOINs or not. I used the "list of conditions" from stof to determine whether the simple count query can be used, instead of the "complex" one. I personally consider this to be a bold move, so if it's too bold, then I can restore the original "proposition" of not allowing the "simple" count query to be used, when there are ToMany JOINs present.
  4. Tests are still missing. I will put them in when I get a confirmation that the PR is going in the acceptable direction.

Thank you.

Copy link
Member

@greg0ire greg0ire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to @stof regarding your question about whether this is going in the right direction.

public function __construct(
Query|QueryBuilder $query,
private readonly bool $fetchJoinCollection = true,
private readonly bool $queryCouldProduceDuplicates = true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs need to be updated at the very least because of this. Can you also read it and see if anything else needs to be modified, and whether any new behavior you are adding is worth documenting?

try {
$rootClassMetadata = $entityManager->getClassMetadata($fromRoot->rangeVariableDeclaration->abstractSchemaName);
$queryFeatures['rootEntityHasSingleIdentifierFieldName'] = (bool) $rootClassMetadata->getSingleIdentifierFieldName();
} catch (MappingException) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the catching of the MappingException?

It's primarily needed because this is how ClassMetadata::getSingleIdentifierFieldName() communicates that "the identifier is not single column". One way to avoid the try..catch is to duplicate the if-conditions from ClassMetadata::getSingleIdentifierFieldName() in here, but having a code duplication here would introduce a maintenance cost, which I personally consider a bigger nuisance than having a try..catch block.

Also, EntityManager::getClassMetadata() throws that exception class when the $className is unrecognised. Instead of making an assumption that "there must be metadata for every class detected at this point of the code", I wanted to be more graceful and defensive, and chose to just catch the exception and assume that the "auto-detection" cannot be carried out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually avoid using exception for flow control, and I think that's because they are costly.

I think I would add a dedicated method, so as to remove the need of casting the result to a boolean as well, even if it means a little duplication.

Also, EntityManager::getClassMetadata() throws that exception class when the $className is unrecognised. Instead of making an assumption that "there must be metadata for every class detected at this point of the code", I wanted to be more graceful and defensive, and chose to just catch the exception and assume that the "auto-detection" cannot be carried out.

In what case will $className not be recognized and we wouldn't want to let the user know about that?

src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
@@ -60,31 +268,80 @@ public function getQuery(): Query
}

/**
* Returns whether the query joins a collection.
* @deprecated Use ::getQueryCouldProduceDuplicates() instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UPGRADE.md needs to document the deprecations.

src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
src/Tools/Pagination/Paginator.php Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants