At Jane Street, we care a lot about code review. We think that high quality code, and in particular, readable code, helps us maintain the safety of our systems and keeps things simple and clean enough for us to stay nimble.
But code review is hard to do without some kind of automation. We have an
existing tool called
cr that we’ve used for a long time. I posted a
description of the initial design (here
and here) back in
2009, and posted an update last year
on how we’d revised our approach to review since then.
cr is showing its age. It has significant problems, both in terms of the
workflow it implements, and in terms of performance.
cr itself is, ironically,
a fairly messy piece of code, so rather than fix it, we’ve been working for the
last few months on a brand new replacement called Iron, or
fe for short. We’ve
just started rolling Iron out, and it’s a big improvement, and represents a
significant step up in our understanding of how to organize review. It isn’t
released publicly yet, but we will open source it after it stabilizes a bit
This is the first in what I expect to be a series of posts about Iron. In this one, I’d going to try to explain why there’s a problem to be solved in the first place, and I’ll do that by walking through some examples that should underline why this is a tricky problem.
Scenario 1: Simple patch review
One simple approach to code review is patch review. Here’s how it might work in a simple case.
- Bob develops a new feature
- He creates a patch, and shows the patch to Alice.
- Alice reviews the patch, and if it looks good, she approves it.
- The patch is accepted, and applied to the tree, making a new revision.
That’s simple enough, but what happens if Bob makes some changes in response to Alices’ review? How does Alice become confident in those changes? One common answer is that she should simply read the new patch from scratch.
But reading the new patch in its entirety violates one of our design principles for Iron: don’t be boring. Humans are bad at carefully thinking about boring things, and rereading a patch that hasn’t changed much since you saw it last is incredibly boring.
There’s a simple workaround here. Rather than reread the whole patch, Alice can just reread the bits that changed. Here’s a simple picture to help illustrate:
B is the state of the code when Bob created his feature, or the base
revision of his feature.
F1 is the state of the feature when he handed the
B->F1, to Alice. And
F2 is the state of the feature after he
responded to Alice’s request.
With this in mind, it’s easy to see that Alice has the option of reading
F1->F2, which just contains Bob’s fixes rather than the
B->F2, which has
both Bob’s original changes plus his fixes.
Scenario 2: Concurrent patch review
Patch review gets a bit more complicated when more people get involved.
Consider what happens if Carla has her own feature, and let’s say she managed to get it released before Bob did. That means that Bob will need to release his patch applied to a different version of the code than the one he based his feature on. Here’s a picture to illustrate the situation.
Fc is Carla’s feature, and
Fb is Bob’s. Given that
Fc has been
released, Bob needs to make his feature a descendent of
Fc in order to release
If the diff
B1->Fb applies cleanly to
Fc, then that’s easy enough: we can
just apply Bob’s patch to Carla’s released feature and call it a day. This is
effectively a way for Bob to merge his patch with Carla’s.
This raises some natural concerns. For one thing, Alice didn’t read Bob’s patch
in a vacuum: she presumably read it in the context of the state of the codebase
as it was when the patch
B1->Fb was created. Now that the patch is being
Fc, how does Alice know that it still makes sense? The fact that
there are no textual conflicts is no guarantee.
The answer is that Alice doesn’t know for sure. The only principled way around this would be for Alice to take a lock on the repository as she reads Bob’s patch, and not let anything (including Carla’s patch) jump ahead.
But this way madness lies. Development at scale require parallelism, and so you need to allow concurrent review. In the end, all we can do is to try to cover for this in other ways, in particular by using tests and types to make accidental breaking of the code from such a sitution less likely.
Scenario 3: Conflicted patch review
What if Bob’s patch doesn’t apply cleanly to Carla’s release? In this case, Bob
needs to create a new revision that merges together Carla’s changes with his
own, say, by applying the patch
Fc, and resolving the conflicts by
hand. Similarly, Bob could use the merge algorithm from his version control
system to do the merge, but again conflicts must be resolved by hand.
Given that Alice has already read
B->Fb, how does she become comfortable with
Should could read the diff
Fc->Fb', but that would violates our boredom
Fc->Fb' will generally look a lot like
B->Fb. Reading the
Fb->Fb' is no better, since this will largely be a recapitulation of the
changes made by Carla in
One way out of this boredom trap is to read the diff-of-diffs, or ddiffs. In
other words, Alice can read
(B->Fb)->(Fc->Fb'). By reading this, she can get a
view on how the feature changed.
Now, diffs-of-diffs can be hard to read, and you only want to read them when you
absolutely have to. That’s why we wrote a tool called
patdiff4 for use with
patdiff4 does a hunk-by-hunk analysis of the code in the four revisions
of the merge diamond, and tries to present each hunk in the simplest way
possible, sometimes using ddiffs, but using other simpler visualizations where
Dealing with review of conflicted merges is at the heart of Iron. In Iron, we
refer to this as a rebase, but not because it’s a rebase of the kind you see
in git or mercurial. After all, from the point of view of the version control
system, you might be doing a merge rather than a rebase. But from Iron’s
perspective, the goal is to rebase the review. Thus, after a rebase, Iron thinks
of Alice as having approved of
Fc->Fb' even though she read
B->Fb and then
later read the ddiff
Rebasing isn’t the end of the Iron story. In future posts, I’ll discuss the two other main ideas that animate Iron: Iron’s management of scrutiny, which is how it keeps track of the level of scrutiny that a user is supposed to apply to a given review; and hierarchical features, a tool for handling complex workflows, including dealing with features that depend on each other.