We thought that phantom types would be an appropriate topic for our first real post because they are a good example of a powerful and useful feature of OCaml that is little used in practice.
In this post, I’ll cover a fairly simple use of phantom types: enforcing a
capability-style access-control policy. In particular, I’ll describe how you can
create easy to use read-only handles to a mutable data structure. We’ll explore
this using the example of an int ref
. The int ref
is a toy example, but the
same approach can be used for more realistic cases, such as a string library or
a database interface.
We’ll start by implementing an int ref
module on top of OCaml’s built-in ref.
module Ref : sig
type t
val create : int -> t
val set : t -> int -> unit
val get : t -> int
end
=
struct
type t = int ref
let create x = ref x
let set t x = t := x
let get t = !t
end
The simplest way of getting a read-only handle is to create another module with a different, more constrained signature.
module RORef : sig
type t
val import : Ref.t -> t
val get : t-> int
end
=
struct
type t = Ref.t
let import x = x
let get = Ref.get
end
An RORef.t
is just a Ref.t
underneath, but the signature hides that fact by
making the RORef.t
abstract. Note that there is a function for converting
Ref.t
’s to RORef.t
’s (import
), but not the other way around. This gives
you a way to create the read-only handle, but prevents someone with such a
handle from recovering the underlying read-write handle. The downside to this
approach is that it is impossible to write code that is polymorphic over
Ref.t
’s and RORef.t
’s, even if that code only uses the functionality common
to both, i.e., if it only reads.
A better solution is to use a phantom type to encode the access control rights
associated with a particular value. But what is a phantom type? The definition
unfortunately makes it sound more complicated than it is. A phantom type is a
type that is used as a parameter to another type (like the int
in int list
),
but which is unused in the actual definition (as in type 'a t = int
). The fact
that the phantom parameter is unused gives you the freedom to use it to encode
additional information about your types, which you can then convince the type
checker to keep track of for you. Since the phantom type isn’t really part of
the definition of the type, it has no effect on code-generation and so is
completely free at runtime. The way in which you convince the type-checker to
track the information you’re interested in is by constraining the appearance of
the phantom types using a signature.
It’s easier to understand once you look at an example.
type readonly
type readwrite
module PRef : sig
type 'a t
val create : int -> readwrite t
val set : readwrite t -> int -> unit
val get : 'a t -> int
val readonly : 'a t -> readonly t
end
=
struct
type 'a t = Ref.t
let create = Ref.create
let set = Ref.set
let get = Ref.get
let readonly x = x
end
In the above code, the phantom type tells you what your permissions are. A
readwrite PRef.t
can read and write, and a readonly PRef.t
can only read.
Note that the get
function doesn’t pay any attention to the phantom type,
which is why get
can be used with both readwrite
and readonly
PRef.t
’s.
The only function that can modify a ref is set
, and that requires a
readwrite PRef.t
.
Note that the types readonly
and readwrite
have no definitions. They look
like the declaration of an abstract type, except that these definitions are not
in a signature. They’re actually examples of uninhabited types, i.e., types
without associated values. The lack of values presents no problems here, since
we’re using the types only as tags.
The great thing about this approach is how seamlessly it works in practice. The user of the library can write things in a natural style, and the type system propagates the access-control constraints as you would expect. For example, the following definitions
let sumrefs reflist =
List.fold_left (+) 0 (List.map PRef.get reflist)
let increfs reflist =
List.iter (fun r -> PRef.set r (PRef.get r + 1)) reflist
will be given the following inferred types
val sumrefs : 'a PRef.t list -> int
val increfs : readwrite PRef.t list -> unit
In other words, the first function, which only reads, can operate on any kind of
ref, and the second, which mutates the refs, requires a readwrite
ref.
There is one problem with the access control policy we implemented above, which
is that there is no clean way of guaranteeing that a given value is immutable.
In particular, even if a given value is readonly
, it doesn’t preclude the
existence of another readwrite
handle to the same object somewhere else in the
program. (Obviously, immutable int refs are not a particularly compelling
application, but having both mutable and immutable versions makes sense for more
complicated data types, such as string or arrays).
But we can get immutable values as well by making the phantom types just slightly more complicated.
type readonly
type readwrite
type immutable
module IRef : sig
type 'a t
val create_immutable : int -> immutable t
val create_readwrite : int -> readwrite t
val readonly : 'a t -> readonly t
val set : readwrite t -> int -> unit
val get : 'a t -> int
end
=
struct
type 'a t = Ref.t
let create_immutable = Ref.create
let create_readwrite = Ref.create
let readonly x = x
let set = Ref.set
let get = Ref.get
end
Importantly, there’s no way for an IRef.t
to become immutable
. It must be
immutable from birth.
Extra credit: Making it more polymorphic
One thing that’s notable about the IRef
signature is that there is no way of
creating an actual polymorphic IRef.t
. The two creation functions both create
values with specific tags, immutable
or readwrite
. These specialized create
functions aren’t strictly necessary, though. We could have instead written
IRef
with the following signature.
sig
type 'a t
val create : int -> 'a t
val set : readwrite t -> int -> unit
val get : 'a t -> int
val readonly : 'a t -> readonly t
end
The user can force the creation of an immutable
or readwrite
Ref by adding a
constraint. So, you could get the effect of
let r = IRef.create_immutable 3
by instead writing
let r = (IRef.create 3 : immutable IRef.t)
The advantage of the polymorphic create function is straightforward: it allows you to write functions that are more polymorphic, and therefore more flexible. For instance, you could write a single function that could create, depending on context, an array of readwrite refs, an array of readonly refs, or an array of immutable refs.
The downside is that it may require more type annotations when you do want to be
explicit about the permissions, and it also allows some weird types to come up.
In particular, you can create an IRef.t
with any phantom parameter you want!
Nothing stops you from creating a string IRef.t
, even though string
doesn’t
make any sense as an access-control tag. Interestingly, the signature doesn’t
actually make any reference to the immutable
type, and in fact, using any
phantom parameter other than readonly
and readwrite
makes the ref immutable.
The access control restrictions still work in roughly the way you would expect,
but it is still a bit harder to think about than the original signature.