This is intended to be the first in a series of posts talking about the design
principles behind core, Jane Street’s alternative to OCaml’s standard library.
It’s worth noting that we haven’t quite fully achieved any of our design goals.
Core is at the center of a complicated and evolving software infrastructure,
and it takes longer to force changes through that infrastructure than it does to
figure out what changes should be made. So these principles serve both as a
guide to how the library is currently laid out and as an indication of what
kinds of changes are likely to come over the next year or so. The principle I’m
going to talk about in this post is the idea of uniformity of interface. There
are a few basic reasons for keeping interfaces uniform: first, to make it easier
for people to learn and remember a module’s interface; second, to make it easier
to use functors to extend a module’s functionality; and third, to avoid wasting
time on making essentially trivial design decisions over and over. The last one
is a bit surprising but is nonetheless real. When you have a significant number
of people collaborating on a code base, having standards for how that code is to
be written eliminates a lot of pointless decision-making about how things should
be done.
Here are a few of the design ideas we’ve had that we try to apply uniformly:
Types and modules
In core, almost all types have dedicated modules, with the type associated
with a module called t. This is not an uncommon pattern in OCaml code in
general and in the standard library in particular, but in core, the approach is
taken more consistently. Thus, core has modules for float, int, option and
bool. This is convenient both because it provides a natural place to put
functions and values that otherwise just swim around in Pervasives, and
because it makes the naming easier to remember. For instance, the modules
Bool, Float and Int all have to_string and of_string functions.
Similarly, the Int module has the same basic interface as the Int64, Int32
and Nativeint modules.
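As a rough illustration of this pattern, here is a sketch built only on OCaml's standard library. The names My_int, My_bool, Stringable and Round_trip are hypothetical stand-ins, not core's actual definitions. Each type gets its own module, the type is called t, and the conversion functions share names, so a single signature and a single functor cover both modules:

```ocaml
(* Hypothetical stand-ins for core's modules, built only on the
   standard library; the real definitions are richer. *)
module My_int = struct
  type t = int
  let to_string (t : t) : string = string_of_int t
  let of_string (s : string) : t = int_of_string s
end

module My_bool = struct
  type t = bool
  let to_string (t : t) : string = string_of_bool t
  let of_string (s : string) : t = bool_of_string s
end

(* Because the names line up, one signature describes both modules. *)
module type Stringable = sig
  type t
  val to_string : t -> string
  val of_string : string -> t
end

(* A functor written once against the shared signature: it checks that
   converting to a string and back yields the original value. *)
module Round_trip (M : Stringable) = struct
  let check (x : M.t) : bool = M.of_string (M.to_string x) = x
end

module Int_round_trip = Round_trip (My_int)
module Bool_round_trip = Round_trip (My_bool)

let () =
  assert (Int_round_trip.check 42);
  assert (Bool_round_trip.check true)
```

The functor never needs to know which module it was given, which is the payoff of keeping the names uniform.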
t comes first
One choice that you have to make over and over again in any library is the order
in which arguments are listed. One thing you could optimize for when making this
decision is the ease of use for partial application. This is not a crazy
approach, but it’s often hard to guess in advance which order will be most
useful. There are other things to consider as well: putting a function argument
(e.g., the function that you pass to List.map) at the end often increases
readability, since the function argument can be quite large and is often awkward
sitting in the middle of the argument list. Sadly, this often conflicts with the
most useful order for partial application.
Rather than make idiosyncratic choices on a function-by-function basis, we
prefer to have clear and unambiguous rules where possible. One such rule we’ve
(mostly) adopted is, within a module whose primary type is t, to put the
argument of type t first. Thus, Map.find, Hashtbl.find and Queue.enqueue
all take the container as their first argument. This rule doesn’t lead to an optimal choice
for every function, but it is very convenient, and is simple and easy to apply
consistently.
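A minimal sketch of the rule follows, using a hypothetical association-list module named Assoc. It is not core's Map or Hashtbl and exists only to show the argument order. Every function takes the value of type t first, and a function argument, when there is one, ends up last:

```ocaml
(* A hypothetical association-list module, used only to illustrate
   argument order; it is not core's Map or Hashtbl. *)
module Assoc = struct
  type ('k, 'v) t = ('k * 'v) list

  let empty : ('k, 'v) t = []

  (* The container of type t is always the first argument. *)
  let add (t : ('k, 'v) t) (key : 'k) (data : 'v) : ('k, 'v) t =
    (key, data) :: List.remove_assoc key t

  let find (t : ('k, 'v) t) (key : 'k) : 'v option = List.assoc_opt key t

  (* With t first, the function argument naturally lands at the end. *)
  let iter (t : ('k, 'v) t) (f : 'k -> 'v -> unit) : unit =
    List.iter (fun (k, v) -> f k v) t
end

let ports = Assoc.add (Assoc.add Assoc.empty "http" 80) "ssh" 22

let () =
  assert (Assoc.find ports "ssh" = Some 22);
  assert (Assoc.find ports "ftp" = None)
```

For contrast, the standard library's List.assoc_opt takes the key first and the list second, which is handier for partial application over a fixed key but is one more ordering to remember.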
Exceptions, options and function names
In core, the default functions only throw exceptions in truly exceptional
circumstances. Thus, List.find returns an option rather than throwing
Not_found. That said, there are cases where the exception-throwing version of
the function is useful as well. The convention we now use is to mark the
exception-throwing version of a function with _exn. So, we have Map.find and
Map.find_exn, Queue.peek and Queue.peek_exn, and List.nth and List.nth_exn.
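The naming convention can be sketched with a hypothetical wrapper, My_list, over the standard library's List. This is an illustration under that assumption rather than core's own implementation, and the unlabeled argument style is chosen for brevity:

```ocaml
(* Hypothetical wrapper over the standard library's List, shown only to
   illustrate the naming convention; core's own List differs. *)
module My_list = struct
  type 'a t = 'a list

  (* Default version: absence is an ordinary result, so return an option. *)
  let find (t : 'a t) (f : 'a -> bool) : 'a option = List.find_opt f t

  (* _exn version: raises Not_found when no element satisfies f. *)
  let find_exn (t : 'a t) (f : 'a -> bool) : 'a =
    match find t f with
    | Some x -> x
    | None -> raise Not_found
end

let is_even n = n mod 2 = 0

let () =
  assert (My_list.find [1; 3; 4] is_even = Some 4);
  assert (My_list.find [1; 3; 5] is_even = None);
  assert (My_list.find_exn [1; 3; 4] is_even = 4)
```

A caller who reads find_exn knows from the name alone that a missing element raises, while the plain find forces the missing case to be handled in the types.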
Standardized interface includes
There are a number of standardized interfaces that we use as components of many different signatures. Thus, if you had a module representing a type that could be converted to and from floats, supported comparison and had its own hash function, you could write the interface as follows:
module M : sig
  type t
  include Floatable with type floatable = t
  include Comparable with type comparable = t
  include Hashable with type hashable = t
end
Making some of core’s conventions explicit makes it easier to enforce
them, so that parallel functions are forced to have the same names
and type signatures across many different modules. It also makes it easier to
design functors on top of these modules. So, for example, the Piecewise_linear
module contains a functor that takes as its input any module which is both
floatable and sexpable.
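To make the idea concrete, here is a self-contained sketch in present-day OCaml. The Floatable and Comparable signatures below are simplified, hypothetical definitions, and they use destructive substitution (with type t := t) instead of the floatable and comparable type names shown in the interface above. The Cents module and the Mean functor are likewise invented for illustration and are unrelated to Piecewise_linear:

```ocaml
(* Simplified, hypothetical versions of two standardized interfaces. *)
module type Floatable = sig
  type t
  val to_float : t -> float
  val of_float : float -> t
end

module type Comparable = sig
  type t
  val compare : t -> t -> int
end

(* A module whose interface is assembled from the shared pieces. *)
module Cents : sig
  type t
  include Floatable with type t := t
  include Comparable with type t := t
end = struct
  type t = int
  let to_float t = float_of_int t /. 100.
  let of_float f = int_of_float (Float.round (f *. 100.))
  let compare = Int.compare
end

(* A functor that works for any Floatable module. *)
module Mean (M : Floatable) = struct
  let of_pair (a : M.t) (b : M.t) : M.t =
    M.of_float ((M.to_float a +. M.to_float b) /. 2.)
end

module Cents_mean = Mean (Cents)

let () =
  let a = Cents.of_float 1.00 and b = Cents.of_float 3.00 in
  assert (Cents.compare (Cents_mean.of_pair a b) (Cents.of_float 2.00) = 0)
```

Because Cents is forced to match the shared signatures, the functor can rely on to_float and of_float having exactly the expected names and types.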