This repository has been archived by the owner on Jun 14, 2022. It is now read-only.

Monorepo for Ibis? #1

Open
wesm opened this issue Sep 21, 2020 · 4 comments

@wesm
Member

wesm commented Sep 21, 2020

I'm curious about the many-repo approach versus a monorepo approach (similar to what Vaex does: https://github.com/vaexio/vaex/tree/master/packages). It seems like having many repositories will make CI and integration testing much more difficult: PRs in the core that break the dependent packages won't be detected, whereas they could be detected in a monorepo.

@datapythonista
Contributor

That's a good point. Nothing prevents us from installing git+https://github.com/ibis-project/ibis-dask in the main repo CI and testing the backends against changes to the core of Ibis in the main repo PR. We could also consider doing that only before releasing new versions, to make sure a new Ibis version won't break known backends.
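
As a rough sketch of the CI idea above (not an existing Ibis script), the main repo's CI could build the install and test commands for each known backend along these lines. Only `ibis-dask` is named in this thread; the `ibis_dask` import name derived from it, the helper names, and the optional `ref` argument are assumptions for illustration.

```python
import sys

# Backend repositories to check against the core. Only ibis-dask is named
# in this thread; extend the list as other backends are split out.
BACKEND_REPOS = ["ibis-dask"]


def install_command(repo, org="ibis-project", ref=None):
    """Build the pip command that installs a backend straight from GitHub."""
    url = f"git+https://github.com/{org}/{repo}"
    if ref is not None:
        url += f"@{ref}"  # pip accepts a branch, tag, or commit after '@'
    return [sys.executable, "-m", "pip", "install", url]


def backend_test_command(repo):
    """Build the pytest command for an installed backend.

    Assumption: the importable package name is the repo name with '-'
    replaced by '_' (e.g. ibis-dask -> ibis_dask).
    """
    return [sys.executable, "-m", "pytest", "--pyargs", repo.replace("-", "_")]


def plan_backend_checks(repos=BACKEND_REPOS):
    """Return the commands CI would run, in order, without executing them."""
    plan = []
    for repo in repos:
        plan.append(install_command(repo))
        plan.append(backend_test_command(repo))
    return plan
```

A CI job could then pass each command to `subprocess.run(cmd, check=True)`, either on every core PR or only ahead of a release.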

The main advantages of having separate repos, in my opinion, are that we can have separate maintainers for the core of Ibis and for the individual backends, and that having backends in separate repos and packages should help us implement proper modularity and software architecture practices. Examples include using entrypoints for the backends instead of the current implementation (see ibis-project/ibis#2379) and giving each backend hard dependencies that are not core dependencies (instead of the current soft dependency approach).
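
To make the entrypoints idea concrete, here is a minimal sketch of how a core package could discover separately installed backends with the standard library's `importlib.metadata`. The group name `ibis.backends` and both function names are assumptions for illustration, not what ibis-project/ibis#2379 specifies.

```python
from importlib.metadata import entry_points

# Hypothetical entry point group that each backend package would register
# in its own packaging metadata.
BACKEND_GROUP = "ibis.backends"


def discover_backends(group=BACKEND_GROUP):
    """Map backend names to entry points from all installed packages."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def load_backend(name, group=BACKEND_GROUP):
    """Import and return the object a backend registered under `name`."""
    backends = discover_backends(group)
    if name not in backends:
        raise LookupError(f"no backend named {name!r} is installed")
    # The backend (and its hard dependencies) is imported only here.
    return backends[name].load()
```

With this shape the core never imports a backend unless it is asked for, so each backend package can declare its own hard dependencies without adding them to the core.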

The good thing is that if we implement backends as separate modules, it's trivial to move them back to the main repo if we later think that's a good idea. Doing the opposite is considerably trickier, because things are not properly decoupled. So I'm happy to reevaluate this decision in the future, but I think it's a good next step, even if we later change our minds.

@wesm
Member Author

wesm commented Sep 21, 2020

You can achieve one-directional integration testing for changes in the core repo only, but bidirectional integration testing isn't possible on GitHub without using a monorepo. It may not become an issue, but I just wanted to emphasize that having multiple Python packages doesn't require multiple git repositories (which have more downsides than upsides IMHO).

@jreback

jreback commented Sep 21, 2020

yeah i agree with @datapythonista's rationale here

core maintainers often don't have the expertise to properly maintain a lot of these different backends

the maintenance burden of additional backends can also be mitigated by having maintainers of the backend specific repos

@wesm
Member Author

wesm commented Sep 21, 2020

I guess I don't see a problem with having a lot of committers in a single repository as long as everyone knows what their role is (e.g. in Arrow we have > 50 committers and have never had issues). Robust change validation and shared CI infrastructure are worth a lot.
