Parametric polymorphism is a basic mechanism in ML for writing code that is generic, i.e., that can be used on multiple different types. To get the basic idea of how what parametric polymorphism is, think about the following simple example.
module M : sig
(* Takes a list and returns a stuttered version, e.g., [1;2;3] is mapped to [1;1;2;2;3;3] *)
val double : 'a list -> 'a list
end = struct
let rec double = function
| [] -> []
| hd :: tl -> hd :: hd :: double tl
end
In the type signature for double
, the expression 'a
is a type variable,
meaning that this function can be used with an arbitrary type plugged in for
'a
. The reason that the type variable shows up is that the code of double
doesn’t depend in any way on the properties of the elements of the list.
At first glance, parametric polymorphism doesn’t seem all that
powerful. After all, how useful is it to write functions that can only be
generalized over all possible types? More often than not you want to write
functions that can be used over some narrow universe of types with particular
properties. Object oriented languages provide this functionality with subtyping,
and Haskell lets you get at this with type-classes. What do you do in ML?
It turns out that ML does allow you to write functions like this in a quite straightforward and ordinary way. A simple example can be found be examining the signature of a standard sort function:
val sort : cmp : ('a -> 'a -> int) -> 'a list -> 'a list
The signature above can be used on a value of any type, provided that you can also provide a comparison function for that type. In other words, a polymorphic function can take advantage of the idiosyncratic capabilities of the types it deals with, but the ability to take advantage of those capabilities must be passed in along with the values in question.
This is used in small ways throughout any non-trivial ML codebase. But we can
use this in a more structured way by creating what are sometimes called
type-indexed values. A type-indexed value is a value used to represent a set
of capabilities associated with a type. Here’s an example of a simple
type-indexed value for capturing number-ness. In what follows, the type-indexed
value is Num.Type.t
, and the rest of the Num
module is just utility
functions to make the interface pretty.
module Num : sig
module Type : sig
type 'a t
val int : int t
val float : float t
end
val (+) : 'a Type.t -> 'a -> 'a -> 'a
val (-) : 'a Type.t -> 'a -> 'a -> 'a
val ( * ) : 'a Type.t -> 'a -> 'a -> 'a
val neg : 'a Type.t -> 'a -> 'a
val zero : 'a Type.t -> 'a
val sum : 'a Type.t -> 'a list -> 'a
val sum_product 'a Type.t -> 'a list -> 'a list -> 'a
end = struct
module Type = struct
module T = struct
type 'a t = {
plus : 'a -> 'a -> 'a;
mul : 'a -> 'a -> 'a;
neg : 'a -> 'a;
zero : 'a;
}
end
open T
let int = { plus = Int.(+);
neg = Int.(-);
zero = Int.zero;
mul = Int.mul; }
let float = { plus = Float.(+);
neg = Float.(-);
zero = Float.zero;
mul = Float.mul; }
end
open Type.T
let (+) typ x y = typ.plus x y
let neg typ x = typ.neg x
let zero typ = typ.zero
let ( * ) typ x y = typ.mul x y
(* Some derived operations *)
let (-) typ x y = typ.plus x (typ.neg y)
let sum typ l = List.fold_left ~init:typ.zero ~f:typ.plus l
let sum_product typ l1 l2 = sum typ (List.map2 ~f:typ.mul l1 l2)
end
You’ll note that the definition above of Type.int
and Type.float
are
basically boilerplate. Because the modules in question themselves have a fairly
standardized interface, we could instead use a functor to create these
type-indexed values without the boilerplate:
module type Arith = sig
type t
val (+) : t -> t -> t
val neg : t -> t
val zero : t
end
module Build_type(M:Arith) = struct
let typ x = { Type.
plus = M.(+);
neg = M.(-);
zero = M.zero;
}
end
let int = let module Z = Build_type(Int) in Z.typ
let int64 = let module Z = Build_type(Int64) in Z.typ
let int32 = let module Z = Build_type(Int32) in Z.typ
let native = let module Z = Build_type(Native_int) in Z.typ
let float = let module Z = Build_type(Float) in Z.typ
let complex = let module Z = Build_type(Complex) in Z.typ
This is yet another advantage one gets from having standardized interfaces.
If type indexed-values look similar to Haskell’s type-classes, it’s because they
are. In my limited understanding of Haskell, the implementation is similar as
well, in that under the cover, Haskell passes around dictionaries of functions
which play the same role that the Type.t
s play here.
The number typeclass described above is just an example, and not something I’ve felt the need for in practice. But here are some places where we’ve used type-indexed values to good effect:
Serialization
The latest (unreleased) version of the bin_prot
macros that we use for binary
serialization and deserialization now come with a type-indexed value that ties
together all the little bits that you need to use the library. Before we did
that, one could only instantiate useful bin-prot functionality using the module
language. Now, we can do it using ordinary polymorphic functions.
Little languages
Sometimes we design domain-specific languages embedded in the type system. It is often useful to have values representing the different types that can be generated in the language. For example, we use this as part of a set of SQL bindings to represent types that we know how to convert to and from SQL.
Containers
We’ve started experimenting with type-indexed values representing the container-hood of a given object. This is a little trickier than the previous examples, since the type-indexed value has two type parameters, one for the type of the container, and one for the type of the elements of the container. In the end, this let’s you write functions with signatures like
max: ('a,'b) Container.t -> cmp:('a -> 'a -> int) -> 'b -> 'a
and use it to find the maximum element of a list (where the type-indexed value
has type ('a, 'a list) Container.t
) or an array (('a, 'a array) Container.t
)
or a string ((char,string) Container.t
).
Type-indexed values obviously have their downsides: they can be somewhat inconvenient syntactically, since you need to explicitly pass them along; and they sacrifice some performance because it leads you to call closures where you could otherwise call static functions that could be inlined instead. But overall, they are a flexible and elegant way of writing generic code in ML.