OCaml with Jane Street extensions is available from our public opam repo. Only a slice of the features described in this series are currently implemented.
In part one, we discussed how OCaml’s locality mode enables safe stack allocation. In this post, we will explore additional modes for representing ownership.
Ownership
OCaml isn’t a pure language—in fact, records containing mutable fields are relatively common. The compiler also doesn’t track where values may be referenced, so must assume all values are potentially shared. At runtime, this means several parts of the program may hold pointers to any particular piece of data.
Shared mutable data is notoriously difficult to reason about, and doubly so with the addition of concurrency. Conversely, pure functions provide useful guarantees, so OCaml encourages a side-effect free style. For example, records support functional updates, which preserve immutability:
type 'a box = { x : 'a option }
let clear box =
{ box with x = None }
;;
val clear : 'a box -> 'b box
Unfortunately, immutability comes at a cost. A functional update allocates a new record, since other code may hold references to the original record. We illustrate the result here—purple blocks are live values, which become red when unreferenced:
Writing to an explicitly mutable field would not incur this allocation, but instead exposes us to shared mutability.
Ultimately, immutable updates only approach this problem from one direction. We could instead imagine eliminating the shared portion of “shared mutability.” If we knew that our code holds the only reference to the original record, the functional update immediately makes the original value unreferenced—i.e. available for collection and re-use. Therefore, the runtime could equivalently perform an in-place update, even though the type exposes an immutable interface.
One of Rust’s primary design goals is to entirely forbid shared mutability, which it achieves by only allowing mutable values to be referenced uniquely. In OCaml, uniqueness isn’t a strict necessity, since the garbage collector can handle cyclic lifetimes arising from unrestricted mutability. However, we still stand to benefit from the optimizations and expressivity enabled by unique references. In fact, in part three of this series, we’ll see how uniqueness enables us to statically prohibit data races in multicore OCaml.
The Uniqueness Mode
A variable has the unique
mode when it is the only reference to a
particular value in the entire program. Variables without this
restriction have the default shared
mode.
Previously, we saw that the compiler tracks which variables may escape their region; these were global. Uniqueness can be handled in a similar fashion—tracking which variables may have aliases tells us which are shared. Unique variables are hence those used at most once.
For example, given a function that takes a unique argument:
val consume : 'a @ unique -> unit
We’re no longer allowed to use this function in a way that violates uniqueness.
let foo (unique x) =
consume x;
x
;;
3 | x ^ Error: x is used uniquely so cannot be used twice.
OCaml’s uniqueness mode mirrors a similar concept in Rust, where values are unique by default. When passed to a function “by value,” a Rust variable is moved. Ownership is transferred to the callee, which has free rein to mutate or destroy the value.
Sub-Moding
A unique variable may be safely used as if it were shared, so unique
is a sub-mode of shared
. This hierarchy is reversed compared to
locality: the non-default mode, unique
, is the sub-mode.
In this sense, unique
behaves like global
and shared
behaves
like local
. For example, we can construct a shared variable by
referencing a unique value, but not vice versa.
let bar (unique x) =
x, x
;;
val bar : 'a @ unique -> 'a * 'a
Uniqueness’ submoding relation makes it more intuitive to interpret a unique function parameter as a promise by the caller.
-
A
unique
parameter means the caller promises to provide uniquely referenced values. The callee is hence allowed to overwrite the value. -
A
shared
parameter does not encode a promise from the caller; the callee must assume the parameter is aliased.
Strong Updates
Using the following syntax, we can perform an in-place functional update on a unique record.
let clear (unique box) =
{ overwrite box with x = None }
;;
val clear : 'a box @ unique -> 'b box @ unique
It may alarm you to notice that clear
returns a 'b
box
—does this let us cast the box to a new type in-place?
The answer is yes, and it’s perfectly safe to do so. No other code
maintains a reference to the box, so it cannot be referred to as a
value of the stale type.
This result demonstrates that unique variables provide another
important capability: updating a value’s type in-place. Such an
operation is known as a strong update, and lets us write (for
example) an in-place map
function:
let map (f : 'a @ unique -> 'b @ unique) (unique box) =
match box.x with
| None -> { overwrite box with x = None }
| Some x -> { overwrite box with x = Some (f x) }
;;
val map : ('a @ unique -> 'b @ unique) -> 'a box @ unique -> 'b box @ unique
Borrowing
Unique variables also empower users to write safer API contracts. Procedural resource management is one obvious use case, so let’s attempt to define a simple set of functions for manipulating files.
type file
val open_ : path:string -> file @ unique
val read : file -> string
val close : file @ unique -> unit
Since close
requires a unique parameter, we can be sure that files
are never used after being closed.
What happens when we try to use this API?
let print_and_close (unique file) =
print_endline (read file);
close file
;;
3 | close file ^^^^ Error: file is used uniquely so cannot be used twice.
Even though read
does not require a unique parameter, we’re no
longer allowed to close the file. Using a unique value at the shared
mode is a one-way transformation: once we create a single shared
reference to file
, we’re no longer allowed to use file
uniquely.
This behavior seems overly restrictive. We know that read
itself doesn’t use file
uniquely—so if it also doesn’t leak
any references to file
, we could keep using file
uniquely after
read
returns. Fortunately, we’ve already discussed
a mode that expresses this constraint: locality!
val read : file @ local -> string
let print_and_close (unique file) =
print_endline (read &file);
close file
;;
val print_and_close : file @ unique -> unit
The syntax &file
denotes borrowing, which weakens the mode of
file
to local shared
. The resulting reference cannot be leaked by
read
, so after read
returns, we still have the only reference to
file
.
In Rust, borrowing is less restrictive, as the callee is paramaterized over the true lifetime of the borrowed value.
Exclusivity
Let’s expand our API to include a write
function. A file represents
access to mutable state—and we’re avoiding shared
mutability—so write
should require a unique file.
val write : file @ unique -> string -> unit
But when we try to use this function…
let write_and_close (unique file) =
write file "data";
close file
;;
3 | close file ^^^^ Error: file is used uniquely so cannot be used twice.
If we pass file
uniquely, it cannot be used again. While we could
return file
from write
, piping unique values through every
function call isn’t a good solution. Ideally, we want to express that
although our function mutates file
, it does not consume it.
To better handle this case, we add a third mode to the uniqueness
axis: exclusive
, which sits in between the initial options. A
variable at the exclusive mode is the only active reference to a
particular value.
Specifically, exclusivity expresses that while other references to this value may exist, none are accessible until the conclusion of the exclusive reference’s region. Therefore, we may freely mutate an exclusive value, but we may not use it uniquely.
val write : file @ local exclusive -> string -> unit
let write_and_close (unique file) =
write &file "data";
close file
;;
val write_and_close : file @ unique -> unit
Because the call to write
borrows file
exactly once, &file
has
mode local exclusive
. If file
was borrowed multiple times, its
mode would instead be weakened to local shared
.
val write2 : file @ local exclusive -> file @ local exclusive -> string -> unit
let write_twice (unique x) =
write2 &x &x "data"
;;
2 | write2 &x &x "data" ^^ Error: found a shared value where an exclusive value was expected.
Exclusively Mutable Fields
The exclusive mode is not just useful for resource management. It also expresses that a function may mutate a record, but never strongly updates it. Given an exclusive variable, we know our code has the only active reference to the value, but later code may inspect it at the current type.
Therefore, we also introduce exclusively mutable fields:
type counter = { exclusively mutable i : int }
An exclusively mutable field behaves as you might expect. It may be written to via an exclusive reference, but appears immutable when accessed via a shared reference. With exclusively mutable fields, we can enjoy the performance benefits of mutability without the strict move semantics of uniqueness.
let increment (local exclusive counter) =
counter.i <- counter.i + 1
;;
val increment : counter @ local exclusive -> unit
Marking individual fields as exclusively mutable is a natural extension of OCaml’s current mutability story, but notably diverges from Rust’s approach. In Rust, variable bindings are marked as mutable upon declaration—that is, mutability becomes another deep property. (Perhaps a mode!) Despite this difference, OCaml’s uniqueness mode axis now mirrors Rust’s hierarchy of references:
-
A unique variable is akin to a value in Rust. Passing a parameter by value allows the callee to destroy it.
-
An exclusive variable is akin to a mutable reference in Rust. Passing a parameter by mutable reference allows the callee to overwrite, but not destroy, the value.
-
A shared variable is akin to an immutable reference in Rust. Passing a parameter by immutable reference restricts the callee to reading or copying the value.
There is one important difference: OCaml’s references are not paramaterized over lifetime variables. Like we saw in part one, this means uniqueness doesn’t lead to higher-order polymorphism and doesn’t interfere with type inference.
Closures & Linearity
So far, we’ve seen how the uniqueness and locality axes interact to represent ownership. Unfortunately, we can now write the following code…
type 'a box = { x : 'a option }
let wrap (unique box) =
fun () -> box
;;
let unsound (unique box) =
let get = wrap box in
let a = get () in
let b = get () in
{ overwrite a with x = Some 0 }, { overwrite b with x = Some "string" }
;;
Naively, applying get
simply returns the wrapped box. But that means
invoking get
twice creates two supposedly unique references to the
same box. Violating uniqueness breaks type safety, as we can use
strong updates to refer to the same value at multiple types.
val wrap : 'a @ unique -> unit -> 'a @ unique
val unsound : 'a box @ unique -> int box @ unique * string box @ unique
Therefore, we must not allow wrap
to return a closure with signature
unit -> 'a box @ unique
. We know the resulting function may be
executed once, but is unsafe to invoke a second time. More
generally, any closure capturing a unique
value must be run at most
once, as its first invocation may consume the value.
The Linearity Mode
To encode this restriction, we will define a third mode axis: linearity. In practice, linearity modes will almost exclusively apply to variables of a function type (i.e. closures), but this is not a requirement.
-
A variable with mode
once
carries the restriction of being used at most once. It is allowed to close overunique
andonce
variables. -
A variable with the default mode
many
does not have this restriction; it may be used freely and may only close overshared
/many
variables.
A many
variable can be used as if it were once
, so many
is a
sub-mode of once
. Like locality, the salient mode (once
) is the
super-mode. From a parameter perspective, once
represents a promise
by the callee:
-
A
once
parameter means the callee promises not to use the value more than once. -
A
many
parameter encodes no promise; the callee may use the value any number of times.
let baz (once x) =
x, x
;;
2 | x, x ^ Error: found a once value where a many value was expected.
In some sense, linearity is dual to the uniqueness axis. Both
analyses are concerned with aliasing: a once
variable prohibits the
creation of new references to its value, whereas a unique
variable
states that no such references exist.
Consuming Variables
Returning to our unsound example, wrap
now returns a closure at mode
once
. When we try to invoke the result twice, the compiler raises
an error:
val wrap : 'a @ unique -> (unit -> 'a @ unique) @ once
let unsound (unique box) =
let get = wrap box in
let a = get () in
let b = get () in
{ overwrite a with x = Some 0 }, { overwrite b with x = Some "string" }
;;
4 | let b = get () in ^^^ Error: found a once value where a many value was expected.
Like we saw with locality, a modal return places a requirement on the
caller. In this case, whoever calls wrap
promises not to use its
result multiple times, as desired.
Rust offers a slightly less expressive solution: closures that consume
values implicitly implement the FnOnce
trait, which only allows one
invocation. Unlike OCaml’s linearity modes, this trait specifically
bounds function types.
Borrowing Variables
So far, we’ve conspicuously omitted closures that capture exclusive
references. When a closure borrows a variable, it cannot use the
value uniquely, so the resulting function is safe to run multiple
times—that is, it needn’t be once
.
However, borrows still come with restrictions. Borrowing always
creates local
references, so a closure that borrows a variable must
itself be local:
type 'a box = { exclusively mutable x : 'a option }
let wrap (unique box) =
let clear () = &box.x <- None in
clear
;;
5 | clear ^^^^^ Error: this value escapes its region.
In Rust, closures that borrow variables exhibit similar behavior—the function’s lifetime cannot exceed that of the referenced value.
However, we already know clear
can never outlive box
, because
box
is global. We may instead move box
into the closure and
allow clear
to escape. To do so, we add an explicit global
annotation:
let wrap (unique box) =
let global clear () = &box.x <- None in
clear
;;
val wrap : 'a box @ unique -> (unit -> unit) @ separate
Ownership is transferred to the closure, so box
may not be used
outside of clear
. In Rust, this behavior can be requested by adding
the move
keyword to a closure.
Separate Functions
It was slightly misleading to say that closures capturing exclusive references can be run multiple times—this is true, but they’re not re-entrant. Intuitively, when a closure stores an exclusive reference, invocations must use the reference one at a time; in other words, they cannot overlap.
This limitation is encoded by a third mode in the linearity axis,
separate
. The separate
mode prohibits the creation of new
references to a value during its execution. Therefore, a separate
closure may be invoked repeatedly, but only one invocation can be
active at any point in time. For example:
let separate run f = f () in
run (fun _ -> ());
run (fun _ -> run (fun _ -> ()))
5 | run (fun _ -> run (fun _ -> ())) ^^^ Error: found a separate value where a many value was expected.
Luckily, most higher-order functions do not require re-entrancy. For
example, List.iter
doesn’t overlap applications of its callback, so
it can take a separate parameter.
let rec iter list f =
match list with
| [] -> ()
| x :: xs -> f x; iter xs f
;;
val iter : 'a list -> ('a -> unit) @ separate -> unit
Now, callers are able to provide callbacks that close over exclusive references.
let unique x = () in
iter [] (fun _ -> write &x);
iter [] (fun _ -> consume x)
5 | iter [] (fun _ -> consume x) ^^^^^^^^^^^^^^^^^^^^ Error: found a once value where a separate value was expected.
The separate
mode also has a corresponding Rust trait. In Rust,
closures that capture mutable (i.e. exclusive) references implicitly
implement the FnMut
trait, which makes invocation require a mutable
reference. Because OCaml does not rely on type bounds to encode
linearity constraints, type inference is again unaffected, whereas
Rust’s function traits further complicate the type system.
Ownership in Practice
Jane Street’s compilers team is currently implementing the uniqueness and linearity axes. Based on a design by Leo White and Stephen Dolan, Anton Lorenzen’s summer 2022 intern project produced an initial uniqueness analysis. Zesen Qian is now extending the project to support borrowing and linearity.
Despite locality’s success, one might be skeptical that the uniqueness and linearity modes will provide more than minor optimizations. After all, OCaml already supports explicitly mutable fields that may be written to without incurring allocations.
However, with the release of OCaml 5, mutable fields are now subject to data races. If multiple domains access a field and at least one performs a write, results observed by readers are unspecified—though the program remains type safe. Fortunately, the addition of exclusively mutable fields gives us a powerful new tool to combat this class of concurrency bugs. In part three, we will use modes to define a statically data-race-free API in multi-core OCaml.