Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Negative impact on configuration phase performance #182

Closed
tmszdmsk opened this issue May 29, 2017 · 34 comments
Closed

Negative impact on configuration phase performance #182

tmszdmsk opened this issue May 29, 2017 · 34 comments

Comments

@tmszdmsk
Copy link

We use axion-release-plugin in our git monorepo which consists of over 60 deployables. Profiling our build process revealed that axion-release-plugin is responsible for most of execution time in configuration phase. With axion-release-plugin enabled it takes about 700ms, while w/o it is ~15ms. Of course if we multiply it by 60 it turns out that plugin adds ~45seconds to total build time of the whole repository.

  1. Is there any way to speed it up?
  2. If it is ran multiple times in the context of one repo(as in our case) - can any optimisation be made to make it effectively run once (and cache some data between executions)?
@tmszdmsk tmszdmsk changed the title negative impact on configuration phase performance Negative impact on configuration phase performance May 29, 2017
@adamdubiel
Copy link
Collaborator

adamdubiel commented May 29, 2017

I never ran any performance tests against axion-release, since we use repo-per-project, so I think there is plenty of room for improvement. Could you generate sample monorepo project for test purposes?

@tmszdmsk
Copy link
Author

tmszdmsk commented May 29, 2017

@adamdubiel
So... I think monorepo itself isn't an issue. I believe that JGit has some performance issues with big repositories (our has ~20k commits in master). I did an example repository based on axion-release-plugin and added a couple of commits on top of it (20k).

My totally unprofessional benchmark shows that ./gradlew build --dry-run --profile configuration phase finishes in:

I ran it multiple times for each repo and picked lowest values I could get.

@adamdubiel
Copy link
Collaborator

adamdubiel commented May 29, 2017

Interesting.. Are you using tag with highest version mode or the default?

EDIT: nevermind, i read build.gradle in the repo you provided. I will take a look at that, might be able to optimize something.

@tmszdmsk
Copy link
Author

this is the whole config we use (so the answer is "default")

scmVersion{
    tag {
        prefix = 'appname'
    }
}

@tmszdmsk
Copy link
Author

tmszdmsk commented May 31, 2017

I've dug into the code a little bit and I have an impression that retrieving scmVersion.version (which is all you need in configuration phase) does few more things "by the way" that enormously affect the current version resolution time. IMHO there should be some fast lane to retrieve that property as fast as possible.

I can work on that issue in my spare time but I need some guidance at the beginning. What do you think?

@adamdubiel
Copy link
Collaborator

I started writing JMH test to profile GitRepository class. Simple test shows that it takes ~100ms to fetch stuff from Git in empty repository. I didn't yet test it in huge (20000k commits) repo.

What are the "by the way" things that you found?

@tmszdmsk
Copy link
Author

tmszdmsk commented Jun 5, 2017

I was in particular concerned about code starting from this point https://github.com/allegro/axion-release-plugin/blob/master/src/main/groovy/pl/allegro/tech/build/axion/release/infrastructure/git/GitRepository.groovy#L204 but after a 2nd look I get that order of tags returned isn't necessarily in the order we want.

Anyway, 🤞

@adamdubiel
Copy link
Collaborator

@tmszdmsk i think i just nailed the issue. I created very basic JMH test on performance_tests branch. After running it on artificially created repos with different commit count i noticed that execution time is bound to number of commits, not how many commits need to be read before arriving at tag.

Baseline version calculation time on repo with 1000 commits and tag located 100 commits from HEAD was 1.2 seconds.

First fix was removing unnecessary call to repository in VersionResolver. This helped to get to 800ms for version calculation.

I suspected that reading and resolving all tags from the repo could be the culprit, however replacing it with dummy code did almost nothing for execution time.

The only thing left was RevWalk in GitRepository. However this is pure JGit code. Digging deep into JGIt and analysing StartGenerator i found that for Axion configuration (topological sorting) a new TopoSortGenerator is created.

Reading Java docs of TopoSortGenerator says it all:

When the constructor completes the supplied generator will have no
commits remaining, as all of the commits will be held inside of this
generator's internal buffer.

The reason why execution time was bound to number of commits was actually reading & buffering all commits in memory. After removing topological sort and relying on default kind of "best effort", commit time based sort, execution time dropped to 300ms.

Docs for default sorting state:

Applications should not rely upon the ordering produced by this strategy.
Any ordering in the output is caused by low level implementation details
and may change without notice.

However all axion-release tests pass. Seeing the profits in execution time, i think it is worth giving a shot. Maybe i should add configuration flag which would force the old sorting method.

Could you grab the code from performance_tests branch, build it locally (./gradlew install), use it your project and report the difference (if any)?

@tmszdmsk
Copy link
Author

and the winner is... performance_tests.
It is slightly faster, instead of ~0.5s for each versioned module, it takes ~0.3s now.

@adamdubiel
Copy link
Collaborator

Hmm i expected it to be a little faster, but thats a good sign anyways. I will merge all PRs, but have no idea where to look for more ms. Maybe the creation of object graph/DI?

@adamdubiel
Copy link
Collaborator

adamdubiel commented Jul 17, 2017

Also i think of adding some kind of cache on GitRepository results (as you suggested) - it would speed up things a lot and is quite easy to implement. I just need to know if you apply the same plugin multiple times with different tag prefixes? Or is it always the same configuration?

@tmszdmsk
Copy link
Author

I remember I thought about that initially. Sounds great.

Each versioned module has its own configuration with custom tag prefix.

@adamdubiel
Copy link
Collaborator

Hmm okay, so it will probably be bit more complicated, but sounds like a challenge :) I will be out for two weeks now, so i will get back to it in August. Should i release 1.8.0 before i disappear or it doesn't matter that much?

@tmszdmsk
Copy link
Author

it is not uber important right now ;) have a great vacations!

@tmszdmsk
Copy link
Author

another thing is that I can spend some spare time on it. Just write few words how you imagine adding cache. Some starting point would be useful!

@adamdubiel
Copy link
Collaborator

adamdubiel commented Jul 17, 2017

Okay, so on first thought i would embed all the logic somewhere in GitRepository, since caching logic will be very ScmRepository dependent.

There are two things we can cache.

One is calling GitRepository#tagsMatching. The cache can keep all tags and resolved commit ids (everything until .grep which does the pattern matching).

Second one would be kind of incremental cache for RevWalk. The assumption is that repo will not change during build time (pretty sane one :)). So each time we call GitReposiotry#taggedCommitsInternal :

  • lookup cache for any TagsOnCommit objects that would match tag pattern and maybeSinceCommit (cache has to persevere the order of commits)
  • if none: start RevWalk from last cache commit id (or maybeSinceCommit if cache empty)
  • on each RevWalk#next(), see if there are any tags on current commit (not just the ones matching); if so, cache TagsOnCommit
  • when arriving at tag that matches, return and mark it as last commit in cache

This way we will:

  • add very little overhead for "normal" repos
  • for monorepos: the cost of calculating versions will not be the function of number of projects, it will be capped by reading the whole commits from the repo once (the worst case) - it already was like this for each project before TopoSortGenerator fix

I'm not sure how this cache should be shared between task executions, but if there are no hidden problems static field looks good enough. Of course just to be safe there should be an off switch for the caching mechanism.

These are just quick thoughts on how to implement this kind of cache. Some things might not make sense when implementing, so treat it only as an inspiration, not the definitive guide :) And it would be great to add another simple jmh test case to see the difference easily.

@adamdubiel
Copy link
Collaborator

@tmszdmsk i have a first working version of cache. However i need to know one thing - is your monorepo a set of subprojects under single parent project? Or are these totally separate gradle projects?

I'm wondering if cache should be kept in static field (might be risky due to Grade Demon caching whole JVMs).

@tmszdmsk
Copy link
Author

@adamdubiel we have one project with big list of includes in settings.gradle.

@adamdubiel
Copy link
Collaborator

Okay, so the basic version works, but the results are not as great as i would expect. For hermes repository reconfigured to recalculate version for root project and each of the subprojects (15 times in total) cache shaves off ~50ms per run (0.7 seconds gained).

However i would expect that gains in your monorepo will be much bigger. In Hermes distance from HEAD to tag is just a few commits. In repo with different versioning schemes, this distribution will vary a lot.

During testing i noticed one more bottleneck: checking for uncommitted changes in repo. This takes between 100-200ms per version calculation! For Hermes repo this is the dominating factor.

So, to wrap up (time is rounded to indicate order of magnitude):

Phase Old version (ms) Performance version (ms)
First tags traversal (cache warmup) 150 150
Subsequent tags traversal 60 0
Checking for uncommitted changes 200 200

Would you mind checking out the new version from performance_tests branch? This contains only cache. In the meantime i will try to optimize uncommitted changes check.

@tmszdmsk
Copy link
Author

Unfortunately, I've got exception:
https://gist.github.com/tmszdmsk/a1b42fdad63ff2fa413f8a31b756a527

I don't have time right now to dig deeper.

@adamdubiel
Copy link
Collaborator

Looks like you repo contains commit with prefix matching configured tag prefix, which is not SemVer compatible (i.e. release-me). No hurry though :) Just let me know when/if you have time.

@tmszdmsk
Copy link
Author

scmVersion{
    tag {
        prefix = 'alerting-tester'
    }
}

project.version = scmVersion.version

,

$ git tag | grep alerting-tester
alerting-tester-0.1.0
alerting-tester-0.1.1
alerting-tester-0.1.2
alerting-tester-0.1.3
alerting-tester-0.1.4
alerting-tester-0.1.5

and works with axion-release-plugin from master.

@adamdubiel
Copy link
Collaborator

Damn ;) Okay, i will dig into this as well.

@adamdubiel
Copy link
Collaborator

I released 1.7.2 with first perf improvement (dropping TOPO sort), and here are the effects on one of my projects.

1.7.1 configuration phase time: 1.913s
1.7.2 configuration phase time: 0.251s

It's blazing fast now compared to the older one :)

@tmszdmsk
Copy link
Author

A brief test shows it's twice as fast for our monorepo. 👍

@adamdubiel adamdubiel modified the milestones: 1.8.0, 1.8.1 Oct 7, 2017
@adamdubiel adamdubiel modified the milestones: 1.8.1, 1.8.2 Oct 7, 2017
@adamdubiel adamdubiel removed this from the 1.8.2 milestone Dec 29, 2017
@adamdubiel
Copy link
Collaborator

I'm sorry for pushing this even further away. I have some experimental code, but no time to actually test it and make it production-grade. You can either contribute or rely on my best-effort approach :)

@schoeneu
Copy link

Hi @adamdubiel , do you have a PR/branch with your experimental code for Git caching? I can't promise that I'd be able to work on it, but I'd like to give it at try.

Background: We have the same issue, but in addition to speed, we're running into heap memory issues on the Gradle daemon as the current way of doing things keeps a copy of the git-related objects around for every Project in our multiproject build, even though they all share the same underlying repo. Would the level of caching that you're suggesting also remedy this?

@adamdubiel
Copy link
Collaborator

Yes, i do have this branch somewhere and while i'm not 100% sure, i think it would help. I can dig it up, the cache layer was pretty much finished, but i failed to integrate it with the whole project (lot of tests and constructors to change to pass it down the chain).

@schoeneu
Copy link

If you could dig that branch up and push it, that would be greatly appreciated :) .

@adamdubiel
Copy link
Collaborator

@JozoVilcek
Copy link

JozoVilcek commented Jun 28, 2020

I believe I am also hit by this ... experiencing long configuration phase if doing project.version = scmVersion.version. It is 4 minutes (with scm version) vs 5 seconds (without). It is a repo with ~21k commits and 200 gradle modules.

Did anyone manage to get some speedup somehow or it still needs the cache to be implemented. I am using plugin version 1.10.3

@JozoVilcek
Copy link

JozoVilcek commented Jul 18, 2020

@adamdubiel I did get an inspiration from your performance branch and created a PR #346
It does improve configuration phase considerably for my case.

@JozoVilcek
Copy link

@adamdubiel will you get some time to look at the open PR for this issue?
#346

I would like to know if this approach is acceptable and can get to master any time soon.

@bgalek
Copy link
Member

bgalek commented Mar 27, 2021

@JozoVilcek you've got some comments ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants