Core Principles: uniformity of interface

This is intended to be the first in a series of posts talking about the design principles behind core, Jane Street’s alternative to OCaml’s standard library.

It’s worth noting that we haven’t fully achieved any of our design goals. Core is at the center of a complicated and evolving software infrastructure, and it takes longer to force changes through that infrastructure than it does to figure out what changes should be made. So these principles serve both as a guide to how the library is currently laid out and as an indication of what kinds of changes are likely to come over the next year or so.

The principle I’m going to talk about in this post is uniformity of interface. There are a few basic reasons for keeping interfaces uniform: first, to make it easier for people to learn and remember a module’s interface; second, to make it easier to use functors to extend a module’s functionality; and third, to avoid wasting time on making essentially trivial design decisions over and over. The last one is a bit surprising but is nonetheless real. When you have a significant number of people collaborating on a code base, having standards for how that code is to be written eliminates a lot of pointless decision-making about how things should be done.

Here are a few of the design ideas we’ve had that we try to apply uniformly:

Types and modules

In core, almost all types have dedicated modules, with the type associated with a module called t. This is not an uncommon pattern in OCaml code in general and in the standard library in particular, but in core, the approach is taken more consistently. Thus, core has modules for float, int, option and bool. This is convenient both because it provides a natural place to put functions and values that otherwise just swim around in Pervasives, and because it makes the naming easier to remember. For instance, the modules Bool, Float and Int all have to_string and of_string functions. Similarly, the Int module has the same basic interface as the Int64, Int32 and Nativeint modules.
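As a rough sketch of the pattern, here is how a few such modules might look when written against the plain standard library. The names Stringable, My_bool, My_int, My_float and round_trip are hypothetical and chosen only for illustration; they are not core's actual definitions.

```ocaml
(* Hypothetical shared signature: every module exposes its main type as t. *)
module type Stringable = sig
  type t
  val to_string : t -> string
  val of_string : string -> t
end

module My_bool : Stringable with type t = bool = struct
  type t = bool
  let to_string = string_of_bool
  let of_string = bool_of_string
end

module My_int : Stringable with type t = int = struct
  type t = int
  let to_string = string_of_int
  let of_string = int_of_string
end

module My_float : Stringable with type t = float = struct
  type t = float
  let to_string = string_of_float
  let of_string = float_of_string
end

(* Because names and types line up, one generic helper covers all three modules. *)
let round_trip (type a) (module M : Stringable with type t = a) (x : a) : a =
  M.of_string (M.to_string x)

let forty_two = round_trip (module My_int) 42
```

The payoff is that once you know one module's conversion functions, you can guess the others without looking them up.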

`t` comes first

One choice that you have to make over and over again in any library is the order in which arguments are listed. One thing you could optimize for when making this decision is the ease of use for partial application. This is not a crazy approach, but it’s often hard to guess in advance which order will be most useful. There are other things to consider as well: putting a function argument (e.g., the function that you pass to List.map) at the end often increases readability, since the function argument can be quite large and is often awkward sitting in the middle of the argument list. Sadly, this often conflicts with the most useful order for partial application.

Rather than make idiosyncratic choices on a function-by-function basis, we prefer to have clear and unambiguous rules where possible. One such rule we’ve (mostly) adopted is, within a module whose primary type is t, to put the argument of type t first. Thus, Map.find, Hashtbl.find and Queue.enqueue all take the container as their first argument. This rule doesn’t lead to an optimal choice for every function, but it is very convenient, and is simple and easy to apply consistently.
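To make the rule concrete, here is a minimal sketch of a toy association-list module in which every function takes the value of type t as its first argument. The module Assoc and its functions are hypothetical, written against the standard library purely for illustration.

```ocaml
(* Hypothetical toy container: every operation takes the container t first. *)
module Assoc : sig
  type ('k, 'v) t
  val empty : ('k, 'v) t
  val add : ('k, 'v) t -> 'k -> 'v -> ('k, 'v) t
  val find : ('k, 'v) t -> 'k -> 'v option
  val length : ('k, 'v) t -> int
end = struct
  type ('k, 'v) t = ('k * 'v) list
  let empty = []
  (* Replace any existing binding for key, then add the new one. *)
  let add t key data = (key, data) :: List.remove_assoc key t
  let find t key = List.assoc_opt key t
  let length t = List.length t
end

let ages = Assoc.add (Assoc.add Assoc.empty "alice" 30) "bob" 25
let alice_age = Assoc.find ages "alice"
```

Compare this with the standard library's List.assoc, which takes the key first and the list second: with a single rule there is nothing to remember per function.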

Exceptions, options and function names

In core, the default functions only throw exceptions in truly exceptional circumstances. Thus, List.find returns an option rather than throwing Not_found. That said, there are cases where the exception-throwing version of the function is useful as well. The convention we now use is to mark the exception-throwing version of a function with _exn. So, we have Map.find and Map.find_exn, Queue.peek and Queue.peek_exn, and List.nth and List.nth_exn.
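The convention can be sketched in a few lines over plain lists. The module My_list below is a hypothetical illustration built on the standard library's List.find_opt, not core's implementation, and the choice of Not_found as the raised exception is an assumption.

```ocaml
(* Hypothetical module illustrating the option / _exn naming convention. *)
module My_list = struct
  (* Default version: absence is an ordinary result, reported as None. *)
  let find t ~f = List.find_opt f t

  (* _exn version: same search, but absence raises (Not_found is assumed here). *)
  let find_exn t ~f =
    match find t ~f with
    | Some x -> x
    | None -> raise Not_found
end

let first_big = My_list.find [1; 2; 3] ~f:(fun x -> x > 1)
let nothing = My_list.find [1; 2; 3] ~f:(fun x -> x > 5)
```

The suffix makes the possibility of an exception visible at every call site, so a reader never has to check the documentation to know which variant is in use.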

Standardized interface includes

There are a number of standardized interfaces that we use as components of lots of different signatures. Thus, if you had a module representing a type that could be converted back and forth to floats, supported comparison and had its own hash function, you could write the interface as follows:

module M : sig
  type t
  include Floatable with type floatable = t
  include Comparable with type comparable = t
  include Hashable with type hashable = t
end

Making some of core’s conventions explicit makes them easier to enforce, so that parallel functions are forced to have the same names and type signatures across many different modules. It also makes it easier to design functors on top of these modules. So, for example, the Piecewise_linear module contains a functor that takes as its input any module which is both floatable and sexpable.
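Here is a minimal sketch of how a functor can demand several standardized interfaces at once. The signatures Floatable and Comparable below are simplified stand-ins that use a type named t and destructive substitution (with type t := t) rather than the floatable and comparable type names shown above, and Make_range_ops is a hypothetical functor, not the one in Piecewise_linear.

```ocaml
(* Simplified, hypothetical stand-ins for the standardized interfaces. *)
module type Floatable = sig
  type t
  val to_float : t -> float
  val of_float : float -> t
end

module type Comparable = sig
  type t
  val compare : t -> t -> int
end

(* The functor accepts any module that satisfies both interfaces. *)
module Make_range_ops (X : sig
    type t
    include Floatable with type t := t
    include Comparable with type t := t
  end) = struct
  (* Restrict x to the range [lo, hi], using only X.compare. *)
  let clamp x ~lo ~hi =
    if X.compare x lo < 0 then lo
    else if X.compare x hi > 0 then hi
    else x

  (* Midpoint computed by converting to floats and back. *)
  let midpoint a b = X.of_float ((X.to_float a +. X.to_float b) /. 2.0)
end

module Int_arg = struct
  type t = int
  let to_float = float_of_int
  let of_float = int_of_float
  let compare : int -> int -> int = compare
end

module Int_ops = Make_range_ops (Int_arg)

let clamped = Int_ops.clamp 15 ~lo:0 ~hi:10
let mid = Int_ops.midpoint 2 6
```

Any other module with the same function names and types could be passed to the functor unchanged, which is exactly what uniform interfaces buy you.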