The beta version of Ocaml 3.12 has a couple of new features that relate to a post Stephen wrote a while back on how to ensure that a function definition is polymorphic. In this follow up post I will describe how one of those new mechanisms is essentially what you want for this purpose and the other is perhaps not due to a subtle interaction with how recursive definitions are type-checked.
Polymorphic type annotations
The first new feature is simply the ability to directly annotate definitions with polymorphic types. Consider
let const : 'a 'b. 'a -> 'b -> 'a =
fun x y -> x
The function const
is explicitly declared to have the polymorphic type
'a -> 'b -> 'a
.
As Stephen noted, this is not what is meant by free type variables in type annotations. For example, the following example with seemingly analogous type annotations (but a change in behavior)
let wrong_const (x : 'a) (y : 'b) : 'a = y
is accepted, but assigned a less general type than const
, namely,
val wrong_const : 'a -> 'a -> 'a
because the type variables in wrong_const
indicate merely unspecified types
whereas the universally quantified type variables in const
indicate arbitrary
types. The former are used to ensure that multiple occurrences of the same type
variable must refer to the same type within a particular definition. The latter
are used to ensure that a type is considered equal only to itself within a
particular definition.
Note that a polymorphic type annotation holds inside the body of a recursive definition as well as outside, allowing what is known as polymorphic recursion, where a recursive call is made at some non-trivial instantiation of the polymorphic type.
type 'a perfect_tree = Leaf of 'a | Node of 'a * ('a * 'a) perfect_tree
let rec flatten : 'a. 'a perfect_tree -> 'a list = function
| Leaf x -> [x]
| Node (x, t) ->
let pairs = flatten t in
let xs =
List.fold_right pairs ~init:[]
~f:(fun (x1, x2) xs -> x1 :: x2 :: xs)
in x :: xs
The recursion in the definition of flatten
is polymorphic because the
recursive call to flatten
is only well-typed if we instantiate its declared
polymorphic type to ('a * 'a) perfect_tree -> ('a * 'a) list
.
Several examples in Chris Okasaki’s classic book Purely Functional Data
Structures
involve so-called non-regular datatypes like perfect_tree
. In order to
define any useful functions on such types, one needs polymorphic recursion.
Until 3.12 one had to resort to various tricks involving recursively defined
modules or records in order to get Ocaml to accept a polymorphically recursive
function definition. In 3.12 we can now express such definitions much more
directly.
Explicit type parameters
The second new feature in 3.12 relating to polymorphism is explicit type parameters. For non-recursive definitions, this feature may be used to accomplish the same thing as polymorphic type annotations.
let const' (type a) (type b) (x : a) (y : b) : a = x
yields
val const' : 'a -> 'b -> 'a
If we make a mistake in the definition
let wrong_const' (type a) (type b) (x : a) (y : b) : a = y
the type checker lets us know
Error: This expression has type b but an expression was expected of type a
Contrast this with the wrong_const
example in the previous section.
However, this simple understanding of explicit type parameters in terms of polymorphism is, sadly, not quite true when one considers recursive functions. For instance, I was surprised to find that the following function type checks.
let rec f (type a) (x : a) : unit = f 42
The assigned type is
val f : int -> unit
What is going on here is quite subtle. The typing rule for fun (type t) -> E
considers t
abstract while typing E
, but elsewhere it is considered to be
just another unifiable type variable. The typing rule for let rec f = E
deals
with two types for the recursively defined value,
- the type bound to
f
while checking the bodyE
(which is constrained according to howf
is used inE
), and - the type inferred for its body
E
.
The last step after checking E
is to unify these two types together and
generalize the one resulting type to obtain the (possibly polymorphic) type of
f
used henceforth.
In the example above, the type inferred for the variable f
is int -> unit
.
The body of the recursive definition is
fun (type a) (x : a) -> (f 42 : unit)
The type inferred for this expression is '_a -> unit
(the type variable is no
longer held abstract since it is outside the scope of the explicit type
parameter). These two happly unify and the resulting type is int -> unit
.
It seems to me that a simple syntactic trick suffices to force the otherwise faulty polymorphic interpretation of explicit type parameters to hold. Simply push the recursion into the scope of the type parameter by transforming
let rec f (type t) = E
into
let f (type t) = let rec f = E in f
In our case, we transform
let rec f (type a) = fun (x:a) -> (f 42 : unit)
into
let f (type a) =
let rec f = fun (x:a) -> (f 42 : unit) in
f
which, as hoped, fails with the message
Error: This expression has type int but an expression was expected of type a
Conclusion
The upshot is that polymorphic type annotations are the preferred way to ensure polymorphism outright. They play well with recursive definitions, even going so far as to support polymorphic recursion.
Explicit type parameters, on the other hand, very nearly allow one to state requirements about polymorphic aspects of a function, but this only works straightforwardly for non-recursive definitions, and requires some small rewriting to work for recursive definitions. Even then, polymorphic recursion is beyond the scope of this mechanism.
To be fair, it seems that explicit type parameters were not primarily intended to indicate polymorphism. Their chief purpose, rather, is to give one a way to refer to the type parameters of a function when defining type components of first-class modules. The polymorphism intuitions are strong, however, and it seemed worthwhile to explore their limits.
Do you want to work in a place that understands why functional programming matters? Then join us.