Introducing Incremental

I’m pleased to announce the release of Incremental (well commented mli here), a powerful library for building self-adjusting computations, i.e., computations that can be updated efficiently when their inputs change.

At its simplest, you can think of a self-adjusting computation as a fancy spreadsheet. In a spreadsheet, each cell contains either simple data, or an equation that describes how the value in this cell should be derived from values in other cells. Collectively, this amounts to a graph-structured computation, and one of the critical optimizations in Excel is that when some of the cells change, Excel only recomputes the parts of the graph that depend on those changed cells.

What makes self-adjusting computation (or SAC) different from a spreadsheet is its dynamism. The structure of the computational graph in an SAC can change at runtime, in response to the changing input data.

This dynamism gives you quite a bit of flexibility, which can be used in different ways. Here are a few.

On-line combinatorial algorithms. Incremental is based on work by Umut Acar et. al.., on self adjusting computations (that’s where the term comes from), and there, they were mostly interested in building efficient on-line versions of various combinatoral algorithms. In many cases, they could match the asymptotic complexity of custom on-line algorithms by fairly simple incrementalizations of all-at-once algorithms.

Incremental GUI construction. One simple and natural way to model a GUI application is to structure your display as a function that generates a view from some more abstract data model.

Having a function that constructs a view from scratch at every iteration is simple, but prohibitively expensive. But if you can write this function so that it produces an incrementalized computation, then you have a solution that is both simple and efficient. We’ve used this technique in a number of our UIs, to good effect.

This might remind you of how functional reactive programming (FRP) is used for construction of GUIs in languages like Elm. SAC and FRP have different semantics – FRP is mostly concerned with time-like computations, and SAC is mostly about optimizing DAG-structured computations – but they are nonetheless closely related, especially at the implementation level. You can see my post here for a description of the broader conceptual landscape that includes both FRP and SAC.

Configurable computations. An example that comes from our own work is risk calculations. Calculating measures of risk of a portfolio involves combining data from a complex collection of interdependent models. Each of these models is dependent both on live data from the markets, and on configurations determined by users.

A config change could merely tweak a coefficient, or it could change the overall structure of the computation, say by changing the list of factors used by a given model. Incremental allows you to build a computation that can update efficiently in response to both simple data changes as well as more structural config changes, in one unified framework.

A taste of Incremental

It’s hard to give a compelling example of Incremental in action in just a few lines of code, because what makes Incremental really useful is how it helps you build large and complex computations. Nonetheless, small examples can give you a sense of how the library works.

To that end, let’s walk through a few small examples. To begin, we need to instantiate the Incremental functor.

open Core.Std
module Inc = Incremental_lib.Incremental.Make ()

Each instance thus generated is its own independent computational world. The Incremental functor is generative, meaning it mints fresh types each time it’s applied, which prevents values from different incremental worlds from being mixed accidentally.

An Incremental computation always starts at its variables. Modifications to variables are how updates to input data are communicated to Incremental.

Let’s write down a few variables corresponding to the dimensions of a rectangular prism.

module Var = Inc.Var

(* dimensions of a rectangular prism *)
let width_v  = Var.create 3.
let depth_v  = Var.create 5.
let height_v = Var.create 4.

We can use Var.watch to get the (trivial) incremental computions associated with each variable.

let width  = Var.watch width_v
let depth  = Var.watch depth_v
let height = Var.watch height_v

The following is an incremental computation of the base area of the prism, and of the volume.

let base_area =
  Inc.map2 width depth ~f:( *. )
let volume =
  Inc.map2 base_area height ~f:( *. )

In order to get information out of an incremental computation, we need to explicitly mark which nodes we want data from by creating observer nodes. Because it knows which nodes are observed, the framework can track what parts of the computation are still necessary to the results.

let base_area_obs = Inc.observe base_area
let volume_obs    = Inc.observe volume

In order to force the computation to run, we need to explicitly call Inc.stabilize. Here’s some code that uses stabilize to run the computation and then gets the information from the observers.

let () =
  let v = Inc.Observer.value_exn in
  let display s =
    printf "%20s: base area: %F; volume: %F\n"
      s (v base_area_obs) (v volume_obs)
  in
  Inc.stabilize ();
  display "1st stabilize";
  Var.set height_v 10.;
  display "after set height";
  Inc.stabilize ();
  display "2nd stabilize"

If we run this, we’ll se the following output:

1st stabilize: base area: 25.; volume: 125.
    after set height: base area: 25.; volume: 125.
       2nd stabilize: base area: 25.; volume: 250.

Note that setting the height isn’t enough to change the observed values; we need a stabilization to make that happen.

That’s a fairly trivial computation, and there certainly isn’t much to incrementalize. Let’s try something a little more complicated: a function for merging together an array of incrementals, using some commutative and associative operator like addition or max.

let rec merge ar ~f =
    if Array.length ar <= 1 then ar.(0)
    else
      let len = Array.length ar in
      let len' = len / 2 + len % 2 in
      let ar' =
        Array.init len' ~f:(fun i ->
          if i * 2 + 1 >= len then ar.(i*2)
          else Inc.map2 ar.(i*2) ar.(i*2+1) ~f)
      in
      merge ar' ~f;;

Because this is done using a binary tree as the dependency graph, the complexity of updating an element is log(n), where n is the size of the array. We can use this for, computing an average:

let average ar =
  let sum = merge ar ~f:(+.) in
  Inc.map sum ~f:(fun s -> s /. float (Array.length ar))

This works, but we can do better performance-wise, at least, if our merge operation has an inverse. In that case, maintaining the sum can in principle be done on constant time, by first, removing the old value before adding in the new. Incremental has a function for taking advantage of this structure.

let sum ar =
  Inc.unordered_array_fold ~f:(+.) ~f_inverse:(-.) ar;;

Now, let’s say we want to do something a little more dynamic: in particular, what if we want to compute the average of a prefix of the given array? For that, we need to use the bind function, which allows us to produce new incremental nodes within an incremental computation.

let average_of_prefix ar length =
  Inc.bind length (fun length ->
    average (Array.init length ~f:(fun i -> ar.(i))))

The type of this function is float Inc.t array -> int Inc.t -> float Inc.t, so the length of the prefix is a fully fledged part of the incremental computation. As a result, the dependency structure of this computation changes dynamically, e.g., if the value of length is 7, then the computation only depends on length and the first 7 elements of the array.

Hopefully this gives you enough of a sense of what Incremental is about to start thinking about where it might be useful for you. Note that the overhead of incremental is not inconsiderable – on my laptop, firing a single node takes on the order of 30ns, which is far more than, say, summing numbers together. Incremental tends to be useful when the computation that is put into a single node is large relative to that overhead, or when the computational graph is large relative to the sub-graph that needs to be recomputed. Our experience has been that there are plenty of applications in this space that can benefit from Incremental.