7 years ago, I wrote a blog
post
about how we at Jane Street were using our distributed version control system
(hg
, though the story would be the same for git
) in a partially centralized
way. Essentially, we built a centralized repo and a continuous integration
system whose job was to merge in new changesets. The key responsibility of this
system was to make sure that a change was rejected unless it merged, compiled
and tested
cleanly.
This half-distributed, half-centralized approach let us enjoy the benefits of a DVCS, while still getting the coherence and easy of sharing that comes from having a central authoritative source.
Since then, our development tools have changed a lot, including the arrival of a new code review and release management system called Iron. In writing Iron we discovered that centralization was valuable in ways we hadn’t considered before. In particular, despite the fact that good support for merging is central to a DVCS, centralization is actually a critical ingredient to making merges work better.
To understand how centralization can help, let’s talk about one reason why merging is a fraught process to begin with.
The criss-cross merge
The basic approach to merging in a DVCS like hg
or git
is pretty simple.
Here are the basic steps that are taken to merge two heads A
and B
.
- Find the greatest common ancestor (
GCA(A,B)
) of the heads to be merged. - Compute the patch from that base point to one of the two heads, say,
A
. - Take the patch you just computed, and apply it to
B
. Conflicts appear when the patch, which was actually based onGCA(A,B)
, doesn’t apply cleanly toB
. The result of this process is the merge.
The above discussion oversimplifies the story by assuming there’s a well defined
GCA, but this just isn’t always true. To see why, consider a repository staring
with a root revision R
, and two revisions made independently on top of R
.
A
/
R
\
B
Now, imagine that two different developers each concurrently decide to merge the
heads A
and B
and do some further development. Note that in both of the
cases shown below, the GCA for the merge between A
and B
is R
.
Developer 1 Developer 2
A---C--D A
/ / / \
R / R \
\ / \ \
B B---E--F
Now, if we bring these two separate histories together into one repo, we have something like this.
A---C--D
/ \ /
R \
\ / \
B---E--F
Now, what happens if we want to merge D and F? In particular, what is
GCA(D,F)
? Both A
and B
are common ancestors, but neither one is greater
than the other. In this case, there are in some sense two different GCAs, or,
more precisely, there are multiple maximal common ancestors, or MCAs. This
case is often described as a criss-cross merge, and is the source of much
wailing and gnashing of teeth among developers and users of DVCSs.
git
and hg
have different ways of dealing with the case of multiple MCAs. By
default, hg
just picks one of the MCAs arbitrarily and does the merge based on
that. Given that different choices of the merge base will lead to different
results, making that choice arbitrarily is pretty disturbing.
git
, on the other hand, has a strategy called recursive merge that repeatedly
merges together the MCAs, and then uses that merged MCA as the basis for
computing the diffs to A
and B
on which the final merge will be based. And
hg
has a new strategy called bid
merge that
is willing to make different choices as to the GCA to use on a file by file
basis.
None of these approaches amount to principled solutions, and while they work better in some cases and worse in others, they all sometimes lead to bad results. It’s tempting to look for a way out of this conundrum altogether, by avoiding the possibility of criss cross merges in the first place.
Avoiding the criss-cross merge
For those who haven’t read my previous posts about how Iron approaches
merges,
I’ll describe it briefly here. Iron organizes its branches into a hierarchy:
every repository has a root feature, and that feature can have children, and
those can have children as well. Thus, our main repository, called Jane, has a
root feature called jane
, and one can develop changes to jane
in child
features, such as jane/Core.Applicative
or jane/quickcheck
.
Critically, the merging of features is constrained. Note that in Iron, every feature is defined by its base and tip revision, where the diff between those two revisions is effectively the contents of the feature. Here are some of the key operations allowed on features.
fe release
, moves changes from a child feature into a parent. This can only be done once the child feature is fully merged with its parent, and has the effect of setting the tip of parent to be the tip of the child, and typically deleting the child.
As an example, if the jane/quickcheck
feature is based at the current tip of
jane
(and is fully reviewed, and all its tests pass), then calling
fe release jane/quickcheck
will move the tip of jane
forward to be equal to
the tip of jane/quickcheck
, and will delete jane/quickcheck
.
fe rebase
lets you merge a feature with its parents, effectively pulling changes from a parent feature into a child. This has the effect of changing the base of the feature to be the tip of its parent, and the tip of the feature to be the result of the merge.
So, if other features have been released into jane
since the
jane/Core.Applicative
feature was created, then the base of
jane/Core.Applicative
will no longer be the tip of jane
. Calling
fe rebase jane/Core.Applicative
will merge the tip of jane/Core.Applicative
with the tip of jane
, and will set the base of jane/Core.Applicative
to the
tip of jane
.
fe rename
, which in addition to allowing you to simply change the name of a feature, also lets you introduce a parent-child relationship between features that didn’t previously have one. e.g., callingfe rename jane/Core.Applicative jane/quickcheck/Core.Applicative
causes theCore.Applicative
feature to become a child of, and so be able to depend on the changes in, thequickcheck
feature.
All of these operations are implemented against a single, centralized server which keeps track of the state of all our features. This centralization lets Iron enforce some useful invariants along the way, critically, that the GCA of a feature and its parent is well defined, and is equal to the base of the feature. This simple property turns out to outlaw criss-cross merges, which avoids all of the mess we described earlier.
The happy outcome turns out to depend critically on the fact that we built a central server that could enforce the invariants in question, or, more precisely, that we built a consistent service
discovered by chance, the existence of the central server is key to enforcing the necessary invariant. In particular, the scenario of two different users concurrently releasing into the same feature or rebasing the same feature simply isn’t possible when there’s a centralized monitor determining who goes first.
In retrospect, this shouldn’t be too surprising. The criss-cross merge is really a result of concurrency, and the idea that introducing a lock (which is what a centralized server does for you) can be used to exclude unwanted concurrent executions in a distributed systems should surprise no one.
In the end, you can trace it all back to the CAP theorem: If you want progress while partitioned, you need to give up on consistency in some way. And criss cross merges are caused by a kind of inconsistency.
Centralization obviously has downsides, but I think Iron picks a nice point along the spectrum here: writing code is totally doable while disconnected, but operations like rebase and release that affect how information is shared between features requires you to be connected. I think it’s a small price to pay to never have to deal with a criss-cross merge.