I recently ported the Hardcaml_step_testbench
library, one of the libraries that
we use at Jane Street for Hardcaml simulations, from using monads to using algebraic
effects, a new OCaml 5 feature. This blog post walks through what algebraic effects are,
why you should consider using them in lieu of monads, and how to actually work with them
using the Handled_effect library. One
thing I’ve come to believe is that most of what can be done with monads can be done with
algebraic effects in a much more elegant way.
Algebraic effects were originally added in OCaml 5 to support general-purpose concurrent execution of programs, alongside its new support for thread-level parallelism. The fact that they can be repurposed for Hardcaml simulations speaks to how well-thought-out and general a language feature this is.
I am writing this post as someone who is not a type-theory expert. The fact that I can use algebraic effects without fully understanding the underlying mechanics is one nice feature of their design.
(The library is still named Oxcaml_effect on GitHub. We are in the process of
renaming it to Handled_effect.)
What’s wrong with monads?
Monads have been used by OCaml programmers for a long
time to model computation. Jane Street’s own monadic Async
library, which is used for
concurrent programming, powers a lot of our infrastructure. Why would we want to replace
it?
Reason 1: Monads infect your code and make everything harder to read
Recall that monads have the following type signature:
(* This is part of the [Monad.S] module type *)
type 'a t
val return : 'a -> 'a t
val bind : 'a t -> f:('a -> 'b t) -> 'b t
val map : 'a t -> f:('a -> 'b) -> 'b t
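For concreteness, here is a minimal sketch of a module satisfying that signature, with option as the underlying type. (This has nothing to do with Async; Option_monad is just a made-up name for about the smallest instance there is.)

module Option_monad : sig
  type 'a t = 'a option

  val return : 'a -> 'a t
  val bind : 'a t -> f:('a -> 'b t) -> 'b t
  val map : 'a t -> f:('a -> 'b) -> 'b t
end = struct
  type 'a t = 'a option

  (* [return] wraps a plain value in the monad. *)
  let return x = Some x

  (* [bind] sequences a computation with a function producing the next one. *)
  let bind t ~f =
    match t with
    | None -> None
    | Some x -> f x
  ;;

  let map t ~f = bind t ~f:(fun x -> return (f x))
end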
Suppose we are interacting with a server that helps us accumulate numbers. Using the
Deferred.t monad from Async as an example, an interface might look something like this:
type txn
val start_transaction : unit -> txn Deferred.t
val send_number : txn -> int -> unit Deferred.t
val wait_for_completion : txn -> int Deferred.t
You could use these in a bit of code like so:
let send_numbers_to_server n =
let%bind txn = start_transaction () in
let%bind () =
Deferred.for_ 0 ~to_:(n - 1) ~do_:(fun i ->
let%bind () = send_number txn i in
printf "Sent number %d\n" i;
return ())
in
let%bind result = wait_for_completion txn in
printf "Transaction done: %d!" result;
return ()
;;
For those not familiar, let%bind x = foo () in bar () is some syntactic sugar that is
transformed into bind (foo ()) ~f:(fun x -> bar ()).
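For instance, with open Async in effect and a stand-in fetch : unit -> int Deferred.t defined only for this illustration, the two definitions below are equivalent; the second is roughly what the ppx expands the first into:

open Async

(* A stand-in asynchronous operation, just for this illustration. *)
let fetch () : int Deferred.t = return 41

(* What you write with the ppx... *)
let doubled () : int Deferred.t =
  let%bind x = fetch () in
  return (x * 2)
;;

(* ...and roughly what it desugars to. *)
let doubled_no_ppx () : int Deferred.t =
  Deferred.bind (fetch ()) ~f:(fun x -> return (x * 2))
;;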
The Async monad is all over the send_numbers_to_server code. let%bind () = surrounds every
asynchronous operation; you need the noisy return () in a bunch of places to ensure
you finish on a Deferred.t type; and you have to use special versions of core library
functions like Deferred.List.iter—or here, Deferred.for_—instead of the normal versions,
because you must return a Deferred.t. Once you start using a monad, you’re effectively
trapped in it. All code that interacts with your usage must also consider the monad. It’s
annoying.
If we had a hypothetical Async library built with OxCaml algebraic effects instead, it might look something like this:
let send_numbers_to_server (h : Deferred.Handler.t) n =
  let txn = start_transaction h in
  for i = 0 to n - 1 do
    send_number h txn i;
    printf "Sent number %d\n" i
  done;
  let result = wait_for_completion h txn in
  printf "Transaction done: %d!" result
;;
No special let% ppx; no returns; no monad-flavored standard library functions; and the
overall function returns a normal value that callers don’t have to treat specially.
Reason 2: You can’t use unboxed types and the local mode
Monads make it tricky to use some valuable OxCaml features, notably unboxed types and the local mode.
Consider the following piece of code:
open Core
open Unboxed
(* Assume [Bar] is some module that implements Monad.S and some utility functions *)
module Bar : sig
include Monad.S
val do_stuff : unit -> unit t
end
type foo =
{ a : int
; b : int
}
let do_thing () =
let foo @ local = { a = 1; b = 2 } in
let%map () = Bar.do_stuff () in
F64.of_int (foo.a + foo.b)
;;
The above code will not compile! To understand why, let’s consider what the code looks like when we run it through the PPX preprocessor:
let do_thing () =
let foo @ local = { a = 1; b = 2 } in
Bar.map (Bar.do_stuff ()) ~f:(fun () ->
F64.of_int (foo.a + foo.b)
)
;;
OxCaml introduces the notion of
layouts for types and
modes for values. There are many kinds of
layouts, corresponding to different memory representations of types; the details are out of
scope here. In this case, we’re using the layout that’s backwards compatible with OCaml’s
default memory representation, namely the value layout. Modes, on the other hand, track
properties of values. Here, the mode simply tracks whether a value is allocated on the
caller’s stack (it only lives in the caller’s region) or on the global heap (it is
managed by the garbage collector).
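As a small illustration of the local mode, reusing the foo record from above (a sketch of my own, with made-up function names; the exact error wording varies), a locally allocated value must not escape the region it was allocated in:

(* Fine: the stack-allocated record never escapes this function, and the
   [int] result crosses back out without issue. *)
let squared_norm () =
  let p @ local = { a = 3; b = 4 } in
  (p.a * p.a) + (p.b * p.b)
;;

(* Rejected with a mode error: returning [p] would let it outlive the
   region it was allocated in. *)
let escape () =
  let p @ local = { a = 3; b = 4 } in
  p
;;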
Once you add in the layout and mode annotations, our Bar.map has the following
signature:
val map
: ('a : value) ('b : value) .
'a t @@ global -> f:('a -> 'b) @@ global -> 'b t @@ global
The two problems here are:
- The foo record is locally allocated. This means that f cannot capture it in its closure environment, since f has to be globally allocated.
- The return value of the closure, f, has the layout value. F64.t, which is the return type of F64.of_int, unfortunately has the type float#, which is not compatible with the value layout.
The Tools & Compilers group at Jane Street has done some work in
ppx_let that gets around this somewhat by
allowing f to be a locally allocated closure. This is a good enough solution for a lot
of cases, but not where you simply cannot have a locally allocated closure. One such
example is Async, which requires the closure to be
globally allocated, as the closure itself is passed to the Async scheduler as a task to be
scheduled. Furthermore, it only solves the first of the two problems above.
Effects are a neat solution to this. They circumvent the need for globally allocated closures, as they shift the task of managing the “environment” of a particular context into the language runtime. This conveniently allows us to use various new fancy OxCaml features.
Effects have other upsides, like better stack traces and more seamless composability. (If you’ve ever tried to compose two monads together you will know how painful it can be.)
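To give a taste of the composability point, here is a sketch using the Handled_effect-style API that the rest of this post introduces; the Log and Counter operation types (and create_user) are made up purely for illustration. A computation that uses two different effects simply takes two handlers; there is no monad-transformer-style plumbing needed to combine them.

(* Two independent effects, each created with its own functor application. *)
module Log_ops = struct
  type 'a t = Log : string -> unit t
end

module Counter_ops = struct
  type 'a t = Next_id : int t
end

module Log = Handled_effect.Make (Log_ops)
module Counter = Handled_effect.Make (Counter_ops)

(* A computation that uses both effects just takes both handlers. *)
let create_user
  (log : Log.Handler.t @ local)
  (counter : Counter.Handler.t @ local)
  name
  =
  let id = Counter.perform counter Next_id in
  Log.perform log (Log (Printf.sprintf "created user %s with id %d" name id));
  id
;;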
A layman’s introduction to Algebraic Effects
Effects can be thought of as primitives built into the language that allow you to suspend
execution in the control flow of a piece of computation and yield control to a scheduler /
runtime. When you “perform” an effect (more on that later), execution pauses and the
handler receives a continuation k—a first-class value representing “everything that
was going to happen after the perform.” From there you can continue with the underlying
value (the way you might after a let%bind in the Async monad), throw, or hold onto the
continuation for later.
We will walk through some simple examples before moving on to more involved ones for Hardcaml simulations.
If you are vaguely familiar with effects from the OCaml manual, you will notice that things look quite a bit different here: the type-safe OxCaml effects API differs from the stock OCaml effects API.
Let’s start with a simple example to understand the API provided by Handled_effect.
Let’s say we want to build a library that allows us to perform a trivial computation that either increments or decrements a value. We want to end up with code like this:
let computation (handler : E.Handler.t @ local) =
let x = 1 in
let y = E.perform handler (Plus_one x) in (* suspends here *)
let z = E.perform handler (Subtract_one y) in (* suspends here *)
print_s [%message (x : int) (y : int) (z : int)]
;;
let%expect_test "" =
run_computation computation;
[%expect {| ((x 1) (y 2) (z 1)) |}]
;;
This is the core “business logic”: it doesn’t know how Plus_one or Subtract_one are
implemented. We’re just setting up the flow of the computation. In monadic Async terms,
the code above is like the code the user writes that’s peppered with let%binds. The
actual performing of the computation (which we’ll get to below) is like what the Async
scheduler does behind the scenes with that user code.
The E.Handler.t value, called an “effect handler”, can be viewed as an object that you
pass around to access the implementation for the effect E. (Notice that we have
annotated the handler argument with @ local. For now, just take my word for it that it
needs the local annotation for type-safety purposes. I will elaborate on why at the end of
the blog post.)
Let’s break down the other parts. You start by defining the possible effect operations. This is a GADT that specifies the possible operations that your computation can perform that will suspend execution. It can look something like this:
open Core
module Effect_ops = struct
(* One of the type arguments must specify the return
type of performing the effect; here, an int *)
type 'a t =
| Plus_one : int -> int t
(** An operation that, when given [x], returns [x + 1] *)
| Subtract_one : int -> int t
(** An operation that, when given [x], returns [x - 1] *)
end
(* We invoke the Handled_effect `Make` functor on our
module to Effect-ify it. *)
module E = Handled_effect.Make (Effect_ops)
The module E is the primary module users will interact with when working with effects.
There are many functions and types in there, which we will walk through in the example.
But below is a heavily simplified interface of the E module that we just constructed:
module E : sig
module Handler : sig
type t
end
(** Performs an operation in the context of running a computation. *)
val perform : Handler.t @ local -> 'a Effect_ops.t -> 'a @ once unique
(** Evaluates a computation *)
val run : (Handler.t @ local -> 'a) -> 'a Result.t
module Result : sig
type ('a, 'e, 'es) t =
| Value : 'a -> ('a, 'e, 'es) t
(** This is returned when the computation is finished. *)
| Exception : exn -> ('a, 'e, 'es) t
(** The computation raises an unhandled exception. We'll
ignore exceptions for the purposes of this blog post. *)
| Operation :
('o, 'e) op * ('o, 'a, 'e, 'es) continuation
-> ('a, 'e, 'es) t
(** This is returned when the computation calls
[E.perform operation]. The first argument is the operation
in question, and the second argument is a continuation object
that can be used to resume execution of the computation with
the result.
*)
end
end
So when you use Effects you’re really thinking about two pieces:
- The computation: this is your business logic. Computations have a type signature E.Handler.t @ local -> 'a, where 'a is the return value of the overall computation.
- The operation handlers: this is the code that interprets what the operations mean. When the computation calls E.perform, the execution jumps to the operations handler. From the operations handler, you resume execution of the computation using the continuation by calling Handled_effect.continue k.
Writing the computation is usually straightforward and not much different from writing regular code without Effects. Most of the complexity lies in writing the operation handlers. Here’s what it looks like in this example:
let rec handle_computation_result (result : (_, _) E.Result.t) =
match result with
| Value result ->
(* The computation has reached the end and returned the result *)
result
| Operation (op, k) ->
(* If we're here, the effect has suspended. The [op] type is the set of
operations the computation can perform as expressed by our [Effect_ops]
type. [k] is a continuation that the user can use to resume the
computation execution with [Handled_effect.continue]
*)
(match op with
| Plus_one x ->
handle_computation_result (Handled_effect.continue k (x + 1) [])
| Subtract_one x ->
handle_computation_result (Handled_effect.continue k (x - 1) []))
| Exception exn ->
(* In real examples, we would do smarter things with exceptions, but to keep
these examples easy to follow, we simply reraise them.
*)
raise exn
;;
let run_computation (type a) (computation : E.Handler.t @ local -> a) : a =
handle_computation_result (E.run computation)
;;
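One thing worth noting before moving on: the computation is completely decoupled from how its operations are interpreted. Purely as an illustration (a variation of my own on the runner above, not something the library provides), here is a second runner that logs every operation before handling it:

let rec handle_with_logging (result : (_, _) E.Result.t) =
  match result with
  | Value result -> result
  | Exception exn -> raise exn
  | Operation (op, k) ->
    (match op with
     | Plus_one x ->
       printf "handling: Plus_one %d\n" x;
       handle_with_logging (Handled_effect.continue k (x + 1) [])
     | Subtract_one x ->
       printf "handling: Subtract_one %d\n" x;
       handle_with_logging (Handled_effect.continue k (x - 1) []))
;;

let run_computation_with_logging (type a) (computation : E.Handler.t @ local -> a) : a =
  handle_with_logging (E.run computation)
;;

Swapping runners like this, without touching the computation itself, is how you might, say, mock out operations in a test.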
Algebraic Effects for Hardcaml simulations
The Hardcaml_step_testbench
library is a library we have used for years to simulate FPGA
circuits written in Hardcaml. Recently, we added support for an effect-based API. It’s a nice
demonstration of what effects can get us in a domain they weren’t originally designed for.
We will walk through some example code that emulates the core behaviour of this library to showcase the effectful API. The actual library is much more featureful, but the core idea can be illustrated with a toy implementation.
A digital circuit can be abstractly thought of as a stateful component that consumes inputs and produces outputs atomically at every time step. Hardware designers call this time step a clock cycle, as physically it corresponds to a clock signal that is supplied to the circuit. An important distinction from software programs is that inputs and outputs are consumed at every clock cycle, whether or not the circuit is performing any higher-level application functionality.
(To the digital design engineers—I’m restricting this to simulations of synchronous, single clock-domain circuits. There are tricks we use to deal with multiple clock domains, but they’re not important for what we’re talking about here.)
Compiling these digital circuits into FPGAs can take on the order of hours (or closer to months for ASICs). We like getting faster feedback on the changes we make to our testbenches and hardware, so we write simulations for these circuits.
In running digital circuit simulations, we are trying to synchronize two interacting components:
- The circuit itself
- Testbench threads that synchronize and interact with the circuit
The execution flow of the testbench and simulated circuit looks something like this:
[Figure: execution flow of the testbench and the simulated circuit, synchronizing at each clock cycle]
In practice, we oftentimes have multiple threads of execution in our circuit testbenches. This is useful, as digital circuits can sometimes have complicated and (mostly) disjoint parts that different testbench threads might want to interact with individually.
[Figure: multiple testbench threads interacting with the same simulated circuit]
For the purpose of this blog post, we are going to make some simplifications:
- When the user runs the computation, all the concurrent tasks must be known at that point
- We require all computations to have no return values (i.e., they return
unit)
So what makes this tricky in a pre-effects world? Imagine we want to interleave the
following two computations, with cycle () representing the synchronization points:
[ (fun () ->
for i = 0 to 2 do
cycle (); (* Synchronization point *)
printf "foo %d\n" i;
done)
; (fun () ->
for i = 0 to 2 do
cycle (); (* Synchronization point *)
printf "bar %d\n" i;
done);
]
In the pre-Effects world, the only way we can do this is by using closures. We need
closures because we don’t have a practical way of representing the part of the execution
that comes after the call to cycle.
The most ergonomic way to do this happens to be with monads. You could imagine having a computation monad that looks something like this:
type 'a t =
| Return : 'a -> 'a t
| Bind : ('a t * ('a -> 'b t)) -> 'b t
| Cycle : unit t
val cycle : unit -> unit t
val run_computations : (unit -> unit t) list -> unit
The run_computations bit of code is quite involved so I won’t include it here. (The fact
that it’s so complex is one motivation for using Effects.) The rough intuition is:
- Start at the head of the first computation. Walk it up to a Bind (Cycle, f) point, then set aside and store the f closure. If the computation evaluation ends up at Return (), just mark it as done (there is no closure to store in this case).
- Move on to the next computation and do the same thing, until you get to the end of the list.
- Repeat the process until all the computations have finished. (A minimal version of such a runner is sketched below.)
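Here is that sketch, for the toy type above. It assumes Core is open, only supports unit-returning computations, and ignores everything the real run_computations has to handle (exceptions, return values, and actually advancing the simulated circuit); the walk / walk_bind split is just one way to write it.

(* Walk one computation until it finishes or reaches a [Cycle].
   [`Suspended rest] holds the part of the computation still to run. *)
let rec walk (t : unit t) : [ `Done | `Suspended of unit t ] =
  match t with
  | Return () -> `Done
  | Cycle -> `Suspended (Return ())
  | Bind (m, f) -> walk_bind m f

and walk_bind : type a. a t -> (a -> unit t) -> [ `Done | `Suspended of unit t ] =
  fun m f ->
  match m with
  | Return x -> walk (f x)
  | Cycle -> `Suspended (f ())
  | Bind (inner, g) -> walk_bind inner (fun x -> Bind (g x, f))
;;

(* Round-robin scheduling: advance every computation to its next [Cycle],
   drop the finished ones, and repeat until none are left. *)
let run_computations (computations : (unit -> unit t) list) =
  let pending = ref (List.map computations ~f:(fun f -> f ())) in
  while not (List.is_empty !pending) do
    pending := List.filter_map !pending ~f:(fun computation ->
      match walk computation with
      | `Done -> None
      | `Suspended rest -> Some rest)
  done
;;

Even in this toy form, all of the closure bookkeeping lives in walk_bind, and that bookkeeping is exactly what effects let us delete.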
Putting that all together, the monadic version of our testbench runner might look something like this:
run_computations
[ (fun () ->
for_ 0 ~to_:2 ~do_:(fun i ->
let%bind () = Step.step () in
printf "foo %d\n" i;
Step.return ()))
; (fun () ->
for_ 0 ~to_:2 ~do_:(fun i ->
let%bind () = Step.step () in
printf "bar %d\n" i;
Step.return ()))
]
[%expect
{|
foo 0
bar 0
foo 1
bar 1
foo 2
bar 2
|}]
This of course has all the problems of monads that we talked about earlier. But in a pre-Effects world there isn’t another way to support the notion of a synchronization point.
With OxCaml Effects, we do have a first-class way of representing the “computation to come”. We can have an API that looks something like the following:
module Handler : sig
type t
end
val run_computations : (Handler.t @ local -> unit) list -> unit
val step : Handler.t @ local -> unit
The same code above can be written in a much cleaner style:
let%expect_test "" =
run_computations
[ (fun h ->
for i = 0 to 2 do
step h;
printf "foo %d\n" i
done)
; (fun h ->
for i = 0 to 2 do
step h;
printf "bar %d\n" i
done)
];
[%expect
{|
foo 0
bar 0
foo 1
bar 1
foo 2
bar 2
|}]
;;
Note the interesting bit in this API: whenever a computation calls step, it yields
control to the “step runtime” at a synchronization point and stays suspended until all the
threads reach their respective synchronization points. This is a very powerful feature that allows each
of the testbench computations to interact independently with the circuit in the same
state, without having to coordinate their work explicitly. At the synchronization point,
the underlying circuit simulator will advance the circuit by one step, before resuming the
execution of the computations. As high-level programmers interacting with the
Handled_effect library, we don’t need to understand how this all works under the hood.
So, how does one actually go about implementing the above code with effects?
Similar to before, we first define the effect operations. In this case, we need to perform
an operation to yield control to the runtime, which we do with a Step operation. We
also define a step function that the users will interact with.
module Effect_ops = struct
type 'a t = Step : unit t
end
module E = Handled_effect.Make (Effect_ops)
module Handler = E.Handler
(* val step : Handler.t @ local -> unit *)
let step (h : Handler.t @ local) = E.perform h Step
Then, we define some kind of state tracker for the execution of a computation. We encapsulate
this in a Thread_state.t type. The trick here is that when the computation performs
an effect, we don’t actually need to call the continuation immediately. We can have several
concurrent computations in flight, and several continuations.
module Thread_state = struct
type t =
| Unstarted of (Handler.t @ local -> unit)
| Running of (unit, unit, unit) E.Continuation.t Unique.Once.t
(** When the computation state is [Running], it means the computation called
[step] and has suspended its execution. It will synchronize with all other
computations at their respective calls to [step] before advancing.
The continuation object has the unique mode, which means the compiler verifies
that it can only be used exactly once. [Unique.Once.t] is used here
to cross the continuation into the aliased mode (it can be used multiple
times, which is the default in OCaml) to defer this check into runtime
rather than compile time. The exact details of how it works are not too
important for the purposes of this blog post.
*)
| Finished
let handle_result (result : (unit, unit) E.Result.t @ once unique) =
match result with
| Value () ->
Finished
| Exception e ->
Exn.reraise e "Step raised exn"
| Operation (op, k) ->
(match op with
| Step -> Running (Unique.Once.make k))
;;
(* Advance the computation until it calls [step] *)
let run_until_step t =
match t with
| Unstarted computation ->
handle_result (E.run computation)
| Running k_uniq ->
handle_result (Handled_effect.continue (Unique.Once.get_exn k_uniq) () [])
| Finished -> Finished
;;
end
With all these pieces together, we can implement the run_computations function.
let run_computations (computations : (E.Handler.t @ local -> unit) list) =
let states =
Array.map (Array.of_list computations) ~f:(fun computation ->
Thread_state.Unstarted computation)
in
while
Array.exists states ~f:(function
| Finished -> false
| Unstarted | Running _ -> true)
do
Array.map_inplace states ~f:Thread_state.run_until_step
(* In practice, we will advance time on the underlying circuit being simulated. *)
done
;;
This is a simplification of what happens in practice, but not by much! The only primitive
we are missing here is spawn, which allows a computation to spawn further
computations. This is something we can also achieve with Algebraic Effects.
OxCaml’s type safety guarantees, or, Why does a handler have a local mode?
The paper on Locality and Effect Reflection goes through the formal argument for why our system uses a local mode for type-safe effects. Here I will attempt to give a brief, intuitive explanation.
Recall that we define a computation with an effect E as E.Handler.t @ local -> 'a.
When we call E.run on such a computation, it evaluates the function in an environment that can
intercept calls to E.perform to yield control to the effect operations handler. In
other words, we can’t arbitrarily call E.perform outside the context of having
registered effect handlers for E.
Suppose we could run computations of type E.Handler.t -> 'a, without the local
annotation on the handler. One can see how you could express ill-defined logic in this case:
let global_handler = E.run (fun h -> h) in
E.perform global_handler operation
Inside E.run, we are in the context with an appropriate effects operation handler.
But what happens when we call E.perform global_handler operation? E is no longer handled!
The @ local annotation stops us from doing this. The above code snippet fails to compile
with a mode error.
(* File "fail_compilation_example.ml", line 82, characters 39-40: This value is local but is
expected to be global.
|
|
| *)
let global_handler = E.run (fun h -> h) in
E.perform global_handler operation
Wrapping up
I hope this post helps convince you that effects > monads in many cases, and that it’s
worth giving the Handled_effect library a
look.