OCaml with Jane Street extensions is available from our public opam repo. Only a slice of the features described in this series are currently implemented.

In part one, we discussed how OCaml’s locality mode enables safe stack allocation. In this post, we will explore additional modes for representing ownership.

Ownership

OCaml isn’t a pure language—in fact, records containing mutable fields are relatively common. The compiler also doesn’t track where values may be referenced, so must assume all values are potentially shared. At runtime, this means several parts of the program may hold pointers to any particular piece of data.

Shared mutable data is notoriously difficult to reason about, and doubly so with the addition of concurrency. Conversely, pure functions provide useful guarantees, so OCaml encourages a side-effect free style. For example, records support functional updates, which preserve immutability:

type 'a box = { x : 'a option }

let clear box =
    { box with x = None }
;;

val clear : 'a box -> 'b box

Unfortunately, immutability comes at a cost. A functional update allocates a new record, since other code may hold references to the original record. We illustrate the result here—purple blocks are live values, which become red when unreferenced:

Writing to an explicitly mutable field would not incur this allocation, but instead exposes us to shared mutability.

Ultimately, immutable updates only approach this problem from one direction. We could instead imagine eliminating the shared portion of “shared mutability.” If we knew that our code holds the only reference to the original record, the functional update immediately makes the original value unreferenced—i.e. available for collection and re-use. Therefore, the runtime could equivalently perform an in-place update, even though the type exposes an immutable interface.

Diagram of a unique immutable box update.

One of Rust’s primary design goals is to entirely forbid shared mutability, which it achieves by only allowing mutable values to be referenced uniquely. In OCaml, uniqueness isn’t a strict necessity, since the garbage collector can handle cyclic lifetimes arising from unrestricted mutability. However, we still stand to benefit from the optimizations and expressivity enabled by unique references. In fact, in part three of this series, we’ll see how uniqueness enables us to statically prohibit data races in multicore OCaml.

The Uniqueness Mode

A variable has the unique mode when it is the only reference to a particular value in the entire program. Variables without this restriction have the default shared mode.

Previously, we saw that the compiler tracks which variables may escape their region; these were global. Uniqueness can be handled in a similar fashion—tracking which variables may have aliases tells us which are shared. Unique variables are hence those used at most once.

For example, given a function that takes a unique argument:

val consume : 'a @ unique -> unit

We’re no longer allowed to use this function in a way that violates uniqueness.

let foo (unique x) =
    consume x;
    x
;;

3 | x
    ^
Error: x is used uniquely so cannot be used twice.

OCaml’s uniqueness mode mirrors a similar concept in Rust, where values are unique by default. When passed to a function “by value,” a Rust variable is moved. Ownership is transferred to the callee, which has free rein to mutate or destroy the value.

Sub-Moding

A unique variable may be safely used as if it were shared, so unique is a sub-mode of shared. This hierarchy is reversed compared to locality: the non-default mode, unique, is the sub-mode.

In this sense, unique behaves like global and shared behaves like local. For example, we can construct a shared variable by referencing a unique value, but not vice versa.

let bar (unique x) =
    x, x
;;

val bar : 'a @ unique -> 'a * 'a

Uniqueness’ submoding relation makes it more intuitive to interpret a unique function parameter as a promise by the caller.

A unique parameter means the caller promises to provide uniquely referenced values. The callee is hence allowed to overwrite the value.
A shared parameter does not encode a promise from the caller; the callee must assume the parameter is aliased.

Strong Updates

Using the following syntax, we can perform an in-place functional update on a unique record.

let clear (unique box) =
    { overwrite box with x = None }
;;

val clear : 'a box @ unique -> 'b box @ unique

It may alarm you to notice that clear returns a 'b box—does this let us cast the box to a new type in-place? The answer is yes, and it’s perfectly safe to do so. No other code maintains a reference to the box, so it cannot be referred to as a value of the stale type.

This result demonstrates that unique variables provide another important capability: updating a value’s type in-place. Such an operation is known as a strong update, and lets us write (for example) an in-place map function:

let map (f : 'a @ unique -> 'b @ unique) (unique box) =
    match box.x with
    | None -> { overwrite box with x = None }
    | Some x -> { overwrite box with x = Some (f x) }
;;

val map : ('a @ unique -> 'b @ unique) -> 'a box @ unique -> 'b box @ unique

Borrowing

Unique variables also empower users to write safer API contracts. Procedural resource management is one obvious use case, so let’s attempt to define a simple set of functions for manipulating files.

type file

val open_ : path:string -> file @ unique
val read  : file -> string
val close : file @ unique -> unit

Since close requires a unique parameter, we can be sure that files are never used after being closed.

What happens when we try to use this API?

let print_and_close (unique file) =
    print_endline (read file);
    close file
;;

3 | close file
          ^^^^
Error: file is used uniquely so cannot be used twice.

Even though read does not require a unique parameter, we’re no longer allowed to close the file. Using a unique value at the shared mode is a one-way transformation: once we create a single shared reference to file, we’re no longer allowed to use file uniquely.

This behavior seems overly restrictive. We know that read itself doesn’t use file uniquely—so if it also doesn’t leak any references to file, we could keep using file uniquely after read returns. Fortunately, we’ve already discussed a mode that expresses this constraint: locality!

val read : file @ local -> string

let print_and_close (unique file) =
    print_endline (read &file);
    close file
;;

val print_and_close : file @ unique -> unit

The syntax &file denotes borrowing, which weakens the mode of file to local shared. The resulting reference cannot be leaked by read, so after read returns, we still have the only reference to file.

In Rust, borrowing is less restrictive, as the callee is paramaterized over the true lifetime of the borrowed value.

Exclusivity

Let’s expand our API to include a write function. A file represents access to mutable state—and we’re avoiding shared mutability—so write should require a unique file.

val write : file @ unique -> string -> unit

But when we try to use this function…

let write_and_close (unique file) =
    write file "data";
    close file
;;

3 | close file
          ^^^^
Error: file is used uniquely so cannot be used twice.

If we pass file uniquely, it cannot be used again. While we could return file from write, piping unique values through every function call isn’t a good solution. Ideally, we want to express that although our function mutates file, it does not consume it.

To better handle this case, we add a third mode to the uniqueness axis: exclusive, which sits in between the initial options. A variable at the exclusive mode is the only active reference to a particular value.

Specifically, exclusivity expresses that while other references to this value may exist, none are accessible until the conclusion of the exclusive reference’s region. Therefore, we may freely mutate an exclusive value, but we may not use it uniquely.

val write : file @ local exclusive -> string -> unit

let write_and_close (unique file) =
    write &file "data";
    close file
;;

val write_and_close : file @ unique -> unit

Because the call to write borrows file exactly once, &file has mode local exclusive. If file was borrowed multiple times, its mode would instead be weakened to local shared.

val write2 : file @ local exclusive -> file @ local exclusive -> string -> unit

let write_twice (unique x) =
    write2 &x &x "data"
;;

2 | write2 &x &x "data"
              ^^
Error: found a shared value where an exclusive value was expected.

Exclusively Mutable Fields

The exclusive mode is not just useful for resource management. It also expresses that a function may mutate a record, but never strongly updates it. Given an exclusive variable, we know our code has the only active reference to the value, but later code may inspect it at the current type.

Therefore, we also introduce exclusively mutable fields:

type counter = { exclusively mutable i : int }

An exclusively mutable field behaves as you might expect. It may be written to via an exclusive reference, but appears immutable when accessed via a shared reference. With exclusively mutable fields, we can enjoy the performance benefits of mutability without the strict move semantics of uniqueness.

let increment (local exclusive counter) =
    counter.i <- counter.i + 1
;;

val increment : counter @ local exclusive -> unit

Marking individual fields as exclusively mutable is a natural extension of OCaml’s current mutability story, but notably diverges from Rust’s approach. In Rust, variable bindings are marked as mutable upon declaration—that is, mutability becomes another deep property. (Perhaps a mode!) Despite this difference, OCaml’s uniqueness mode axis now mirrors Rust’s hierarchy of references:

A unique variable is akin to a value in Rust. Passing a parameter by value allows the callee to destroy it.
An exclusive variable is akin to a mutable reference in Rust. Passing a parameter by mutable reference allows the callee to overwrite, but not destroy, the value.
A shared variable is akin to an immutable reference in Rust. Passing a parameter by immutable reference restricts the callee to reading or copying the value.

There is one important difference: OCaml’s references are not paramaterized over lifetime variables. Like we saw in part one, this means uniqueness doesn’t lead to higher-order polymorphism and doesn’t interfere with type inference.

Closures & Linearity

So far, we’ve seen how the uniqueness and locality axes interact to represent ownership. Unfortunately, we can now write the following code…

type 'a box = { x : 'a option }

let wrap (unique box) =
    fun () -> box
;;

let unsound (unique box) =
    let get = wrap box in
    let a = get () in
    let b = get () in
    { overwrite a with x = Some 0 }, { overwrite b with x = Some "string" }
;;

Naively, applying get simply returns the wrapped box. But that means invoking get twice creates two supposedly unique references to the same box. Violating uniqueness breaks type safety, as we can use strong updates to refer to the same value at multiple types.

val wrap : 'a @ unique -> unit -> 'a @ unique
val unsound : 'a box @ unique -> int box @ unique * string box @ unique

Therefore, we must not allow wrap to return a closure with signature unit -> 'a box @ unique. We know the resulting function may be executed once, but is unsafe to invoke a second time. More generally, any closure capturing a unique value must be run at most once, as its first invocation may consume the value.

The Linearity Mode

To encode this restriction, we will define a third mode axis: linearity. In practice, linearity modes will almost exclusively apply to variables of a function type (i.e. closures), but this is not a requirement.

A variable with mode once carries the restriction of being used at most once. It is allowed to close over unique and once variables.
A variable with the default mode many does not have this restriction; it may be used freely and may only close over shared/many variables.

A many variable can be used as if it were once, so many is a sub-mode of once. Like locality, the salient mode (once) is the super-mode. From a parameter perspective, once represents a promise by the callee:

A once parameter means the callee promises not to use the value more than once.
A many parameter encodes no promise; the callee may use the value any number of times.

let baz (once x) =
    x, x
;;

2 | x, x
       ^
Error: found a once value where a many value was expected.

In some sense, linearity is dual to the uniqueness axis. Both analyses are concerned with aliasing: a once variable prohibits the creation of new references to its value, whereas a unique variable states that no such references exist.

Consuming Variables

Returning to our unsound example, wrap now returns a closure at mode once. When we try to invoke the result twice, the compiler raises an error:

val wrap : 'a @ unique -> (unit -> 'a @ unique) @ once

let unsound (unique box) =
    let get = wrap box in
    let a = get () in
    let b = get () in
    { overwrite a with x = Some 0 }, { overwrite b with x = Some "string" }
;;

4 | let b = get () in
            ^^^
Error: found a once value where a many value was expected.

Like we saw with locality, a modal return places a requirement on the caller. In this case, whoever calls wrap promises not to use its result multiple times, as desired.

Rust offers a slightly less expressive solution: closures that consume values implicitly implement the FnOnce trait, which only allows one invocation. Unlike OCaml’s linearity modes, this trait specifically bounds function types.

Borrowing Variables

So far, we’ve conspicuously omitted closures that capture exclusive references. When a closure borrows a variable, it cannot use the value uniquely, so the resulting function is safe to run multiple times—that is, it needn’t be once.

However, borrows still come with restrictions. Borrowing always creates local references, so a closure that borrows a variable must itself be local:

type 'a box = { exclusively mutable x : 'a option }

let wrap (unique box) =
    let clear () = &box.x <- None in
    clear
;;

5 | clear
    ^^^^^
Error: this value escapes its region.

In Rust, closures that borrow variables exhibit similar behavior—the function’s lifetime cannot exceed that of the referenced value.

However, we already know clear can never outlive box, because box is global. We may instead move box into the closure and allow clear to escape. To do so, we add an explicit global annotation:

let wrap (unique box) =
    let global clear () = &box.x <- None in
    clear
;;

val wrap : 'a box @ unique -> (unit -> unit) @ separate

Ownership is transferred to the closure, so box may not be used outside of clear. In Rust, this behavior can be requested by adding the move keyword to a closure.

Separate Functions

It was slightly misleading to say that closures capturing exclusive references can be run multiple times—this is true, but they’re not re-entrant. Intuitively, when a closure stores an exclusive reference, invocations must use the reference one at a time; in other words, they cannot overlap.

This limitation is encoded by a third mode in the linearity axis, separate. The separate mode prohibits the creation of new references to a value during its execution. Therefore, a separate closure may be invoked repeatedly, but only one invocation can be active at any point in time. For example:

let separate run f = f () in

run (fun _ -> ());

run (fun _ -> run (fun _ -> ()))

5 | run (fun _ -> run (fun _ -> ()))
                  ^^^
Error: found a separate value where a many value was expected.

Luckily, most higher-order functions do not require re-entrancy. For example, List.iter doesn’t overlap applications of its callback, so it can take a separate parameter.

let rec iter list f =
    match list with
    | [] -> ()
    | x :: xs -> f x; iter xs f
;;

val iter : 'a list -> ('a -> unit) @ separate -> unit

Now, callers are able to provide callbacks that close over exclusive references.

let unique x = () in

iter [] (fun _ -> write &x);

iter [] (fun _ -> consume x)

5 | iter [] (fun _ -> consume x)
            ^^^^^^^^^^^^^^^^^^^^
Error: found a once value where a separate value was expected.

The separate mode also has a corresponding Rust trait. In Rust, closures that capture mutable (i.e. exclusive) references implicitly implement the FnMut trait, which makes invocation require a mutable reference. Because OCaml does not rely on type bounds to encode linearity constraints, type inference is again unaffected, whereas Rust’s function traits further complicate the type system.

Ownership in Practice

Jane Street’s compilers team is currently implementing the uniqueness and linearity axes. Based on a design by Leo White and Stephen Dolan, Anton Lorenzen’s summer 2022 intern project produced an initial uniqueness analysis. Zesen Qian is now extending the project to support borrowing and linearity.

Despite locality’s success, one might be skeptical that the uniqueness and linearity modes will provide more than minor optimizations. After all, OCaml already supports explicitly mutable fields that may be written to without incurring allocations.

However, with the release of OCaml 5, mutable fields are now subject to data races. If multiple domains access a field and at least one performs a write, results observed by readers are unspecified—though the program remains type safe. Fortunately, the addition of exclusively mutable fields gives us a powerful new tool to combat this class of concurrency bugs. In part three, we will use modes to define a statically data-race-free API in multi-core OCaml.

Oxidizing OCaml: Rust-Style Ownership

Ownership

The Uniqueness Mode

Sub-Moding

Strong Updates

Borrowing

Exclusivity

Exclusively Mutable Fields

Closures & Linearity

The Linearity Mode

Consuming Variables

Borrowing Variables

Separate Functions

Ownership in Practice

Signals & Threads Podcast

Featured Tech Talk

Featured Reads

Tags

RSS

Jane Street Open Source

Join Our Team

Oxidizing OCaml: Rust-Style Ownership

Ownership

The Uniqueness Mode

Sub-Moding

Strong Updates

Borrowing

Exclusivity

Exclusively Mutable Fields

Closures & Linearity

The Linearity Mode

Consuming Variables

Borrowing Variables

Separate Functions

Ownership in Practice

Subscribe to Email Updates

Signals & Threads Podcast

Featured Tech Talk

Featured Reads

Tags

RSS

Jane Street Open Source

Join Our Team