At Jane Street we have a number of systems that are vital to the operation of the firm. As the company has grown, so have these systems, and in the process they have acquired a few oddities. Below is a post I wrote a while ago for our internal blog to give people outside of tech an idea of why it often takes a long time to add features to these systems or straighten out some of their weird behaviors.
Imagine you’re driving your car. It’s an okay car. It drives. It may be a little rusty, and the engine makes some rattling sounds, but you’re not too concerned. Actually, upon closer inspection you notice one of the tires is flat. Funny, you must’ve been driving like this for years. Come to think of it, you did always have to pull pretty hard on the steering wheel to avoid veering off the road… You should probably do something about this.
Easy, just stop and change the tire, should only take a few minutes. But here’s the catch: you can never stop. Well. That’s annoying. You can probably still change the tire if you spend a lot of time preparing, make a careful plan, and then risk your life.
Assuming you survive the tire change, there’s still that rattling noise in the engine. That’s obviously going to be an issue sooner or later, and no amount of acrobatics is going to help you change the engine whilst driving. Hm. So I lied before. You can stop the car, but only for a few minutes at a time, and you’d better be damn sure the car will start again when you need to go on. So you can’t just stop at a garage and change the engine – that’ll take too long and who knows whether they’ll connect all the hoses and gears and stuff[1] the right way on the first try. You adopt the obvious solution:
- At a quick stop, have a new engine strapped to the roof of your car
- Connect things up so that you can switch between the old and the new engine[2]
- At another quick stop remove the old engine
- Yet another quick stop, move the new engine from the roof into the engine compartment
- Spend the next 6 months shortening excess hose, smoothing out dents in your roof, and scrubbing away oil stains in the upholstery.
The bottom line is that when you want to make any kind of change to a system that can never be shut down, or can only be down for very short periods, you probably can’t (or at least shouldn’t) just make that change in one go. You have to break it down into very small, well-understood steps, not all of which may directly contribute to what you actually want to achieve.
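For the more technically inclined, the engine swap can be sketched in a few lines of Python. This is only an illustration of the analogy, not a description of how any real system here is built: the `Car` class, its methods, and the two engine functions are all made up for the example. The point it tries to show is that each step is small, the car can still drive after every one of them, and you can switch back to the old engine until the moment you remove it.

```python
from typing import Callable, Dict

# An "engine" is just a function from throttle to speed (hypothetical).
Engine = Callable[[int], int]


def old_engine(throttle: int) -> int:
    """Hypothetical legacy engine: it works, but it rattles."""
    return throttle * 2


def new_engine(throttle: int) -> int:
    """Hypothetical replacement that is meant to behave identically."""
    return throttle + throttle


class Car:
    """A car that can carry several engines but is driven by exactly one."""

    def __init__(self, engine: Engine) -> None:
        self.engines: Dict[str, Engine] = {"old": engine}
        self.active = "old"

    def install(self, name: str, engine: Engine) -> None:
        # Quick stop: strap the engine to the roof. It drives nothing yet.
        self.engines[name] = engine

    def switch_to(self, name: str) -> None:
        # Flip which engine drives the car; the other one stays installed,
        # so switching back is always possible.
        if name not in self.engines:
            raise KeyError(f"no engine named {name!r} installed")
        self.active = name

    def remove(self, name: str) -> None:
        # Quick stop: remove an engine, but never the one in use.
        if name == self.active:
            raise ValueError("cannot remove the engine that is driving the car")
        del self.engines[name]

    def drive(self, throttle: int) -> int:
        return self.engines[self.active](throttle)


if __name__ == "__main__":
    car = Car(old_engine)
    baseline = car.drive(10)

    car.install("new", new_engine)     # step 1: new engine on the roof
    assert car.drive(10) == baseline   # still driving on the old engine

    car.switch_to("new")               # step 2: switch over
    assert car.drive(10) == baseline   # same behavior, new engine

    car.remove("old")                  # step 3: only now remove the old one
    assert car.drive(10) == baseline
```

None of these steps is the change you actually wanted (a car with one good engine in the right place), but each one leaves the car in a state you understand and can back out of.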