Migration Race Condition #16
Comments
Hi @voxdolo - thanks for raising this issue! This is actually a really interesting problem - not one that we've run into yet, as our team and project sizes are small.
What we basically need is a distributed consensus algorithm for deciding on a partial ordering of migrations (some migrations can be run out of order, some can't). Distributed systems are hard, which makes me think this needs careful thought before we jump into a solution. But because the problem is generic enough, there will be plenty of existing solutions in other domains that we can weigh up and pick the best from. I also think that the specific form of database migrations might let us make several assumptions that make the problem a lot easier than the generic one.
The only immediate possible issues with the idea of using a set for
With regards to 1), maybe this is fine? In most cases, if two developers are simultaneously working on things that both require migrations, you'd hope they'd be independent enough that the order in which they're run doesn't matter. But I still feel like I need some convincing that there aren't any cases where this could cause problems, and that when problems do arise, it would fail in a sensible way. To use a contrived example, if developer A created a migration to alter a
It's possible there's a reasonably easy way for us to work out which migrations are safe to run out of order - migrations that affect different tables should be safe, for example - but I think this could become quite difficult once you start considering things like which tables and columns views depend on.
With regards to 2), my concern is that with the current mechanism, as long as the most recent entry in
With regards to 3) - currently, if you run
I'm interested in what your thoughts are on the above 3 things, and if there's anything else you can think of. I'm going to continue pondering it too.
Obviously the current scenario you describe, in which migrations could be missed, is far from ideal. If this is a problem you're facing, I wonder if it's worth getting a partial solution in place quickly while we try to come up with a better solution for the long run? What I'm thinking is a fairly simple approach: build a set from
Due to the way `tern` manages its `schema_versions` table, it's relatively easy to create a situation where a slightly older migration will never get run on a developer or CI machine in a distributed team setting.
The scenario:
If two developers make separate migrations at roughly the same time and the first developer to push is also the one with the more recent migration timestamp, the second developer's migration will not be run on the first developer's machine. The problem can propagate to deployed environments where there is continuous deployment (we use Circle CI to push all passing commits to our staging env).
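A minimal sketch of this scenario in Python, for illustration only. It is not tern's code: it assumes a migration counts as pending only when its timestamp is newer than the most recent recorded version, and the function name `pending_by_latest` and the timestamps are made up.

```python
def pending_by_latest(available, applied):
    """Hypothetical model: a migration is pending only if its timestamp
    is newer than the most recent applied version."""
    latest = max(applied, default=None)
    return [v for v in sorted(available) if latest is None or v > latest]


# First developer's machine: their own (later-stamped) migration is applied.
applied_on_first_dev = ["20150301120000", "20150310093000"]

# After pulling the second developer's commit, an older-stamped migration
# (20150310091500) shows up among the available migrations.
available = ["20150301120000", "20150310091500", "20150310093000"]

print(pending_by_latest(available, applied_on_first_dev))
# prints [] : the second developer's migration is never selected
```

On a fresh database the same function returns all three migrations in timestamp order, which is why the gap only shows up on machines that already ran the newer migration.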
Potential solution:
Rather than using the current mechanism for determining if a migration has run or not:
https://github.com/bugsbio/lein-tern/blob/master/src/tern/migrate.clj#L27-L31
`schema_versions` could be treated as a set, and when a migration's timestamp is not present in it, that migration is run.
Any thoughts?
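To make the proposal concrete, here is a rough Python sketch of the set-based check. It is only an illustration under assumptions: `pending_by_set` and the timestamps are hypothetical, and it says nothing about whether running the older migration late is actually safe.

```python
def pending_by_set(available, applied):
    """Hypothetical model: a migration is pending whenever its timestamp
    is missing from the set of applied versions."""
    applied_set = set(applied)
    # Still run in timestamp order, so independent migrations stay deterministic.
    return [v for v in sorted(available) if v not in applied_set]


# Versions already recorded on a machine that ran the newer migration first.
applied = ["20150301120000", "20150310093000"]

# Migrations on disk, including one with an older timestamp pushed later.
available = ["20150301120000", "20150310091500", "20150310093000"]

print(pending_by_set(available, applied))
# prints ['20150310091500'] : the older, unapplied migration is picked up
```

The trade-off is that a migration can now run after one with a later timestamp, so the two need to be independent of each other.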