With the external release of OCaml 4.07.0 imminent, we in Jane Street’s Tools & Compilers group have been planning what we want to work on for inclusion in OCaml 4.08. These days OCaml uses (or at least attempts) a time-based release process with releases scheduled every 6 months. We’re trying to avoid rushing in changes at the last minute – as we’ve been prone to do in the past – so this list is restricted to things we could conceivably finish in the next 4-5 months.
This blog post is part of a drive towards making OCaml compiler development – both inside and outside of Jane Street – more transparent to the wider OCaml community. By its nature this post is technical and very OCaml-specific. It has no narrative or moral, no beginning, middle and end, it is just a list of language features. Hopefully however, for those of you interested in OCaml’s development, it is at least an interesting list.
Support for GADTs in or-patterns
Currently, you cannot use GADT constructors with or-patterns. For example:
# type 'a ty = Int : int ty | Bool : bool ty | String : string ty;; type 'a ty = Int : int ty | Bool : bool ty | String : string ty # let is_string : type a. a ty -> bool = function | Int | Bool -> false | String -> true;; Characters 54-57: | Int | Bool -> false ^^^ Error: This pattern matches values of type int ty but a pattern was expected which matches values of type a ty Type int is not compatible with type a
which forces you to duplicate the match case for the different constructors:
# let is_string : type a. a ty -> bool = function | Int -> false | Bool -> false | String -> true;; val is_string : 'a ty -> bool = <fun>
Thomas Refis has a patch to allow GADTs to be matched inside of or-patterns, allowing the above code to type-check, and we would like to get that merged for 4.08.
Note that this patch does not allow type equations introduced by the constructors to be used in the match case. For example, one could imagine allowing the following code:
type ('a, 'b) ty_pair = | Int_int : (int, int) ty_pair | Int_bool : (int, bool) ty_pair | Bool_int : (bool, int) ty_pair | Bool_bool : (bool, bool) ty_pair let fst_default : type a b. (a, b) ty_pair -> a = function | Int_int | Int_bool -> 0 | Bool_int | Bool_bool -> false
Int_bool add the same equation on
int) which is sufficient to type-check the match case. However
supporting this in the general case is very tricky. We have some ideas
about how to do it but they won’t be ready for 4.08.
Improve type propagation in lets
Currently code like this fails to compile:
type t1 = T type t2 = T let foo () = (T : t1) let T = foo ()
because the pattern
T is type-checked before the expression
I think this was originally done this way so that code like:
type t1 = T type t2 = T let x : t1 = T
would work because the
: t1 annotation was considered part of the
Recently the handling of
: t1 changed so that it was copied onto both
the pattern and the expression during. This means we don’t need to
type-check the pattern first anymore, but also creates its own
difficulties, especially for ppx writers. There are also some odd corner
cases because polymorphic annotations on lets are treated differently.
We plan to change the AST to have an explicit representation for:
let pat : typ = exp
and change the order of type checking so that it goes:
typ => exp =>
pat, which seems to be the most natural order.
Shadowing of items from “include”
Currently, to extend a module you must do something like:
include (Foo : module type of struct include Foo end with module Bar := Foo.Bar) module Bar = struct include Foo.Bar let baz = 42 end
which is a bit of a mouthful. This is because you cannot define two
modules with the same name within a single module. Whilst there are
various reasons not to allow module shadowing, most of them don’t apply
to modules that come from an
include statement. So we would like to
add the ability to shadow modules, types, etc. that have come from an
include statement. This will allow the above to be written as:
include Foo module Bar = struct include Foo.Bar let baz = 42 end
Private structure items
It would be convenient to be able to define type aliases or modules for use within a module, without adding them to the module itself (and thus requiring a signature to remove them). We would like to add support for declarations such as:
private type tmp = t list private module M = F(X)
that will not appear in the surrounding module.
Note that this isn’t intended as any kind of replacement for mli files – mli files should always be used. It is really intended for use in sub-modules where the benefit of a full signature might be minimal.
It will also be supported in signatures, allowing things like:
module type S = sig type t module Sub : sig private type outer = t type t val to_outer : t -> outer end end
which avoids forcing modules of type S to define an outer type.
One could also imagine using this feature to fix “ocamlc -i” which, currently, can sometimes print incorrect interfaces. For example, running “ocamlc -i” on the following module:
type t = T module Sub = struct type t = S let to_outer S = T end
type t = T module Sub : sig type t = S val to_outer : t -> t end
which would be incorrect if put in an .mli.
Strengthening the module system
The most subtle parts of OCaml’s module system all revolve around the equalities between different items in modules. For example, a common beginner mistake is to write something like:
module Foo : S = struct type t = foo ... end
when they really wanted:
module Foo : S with type t = foo = struct type t = foo ... end
The most confusing type errors from the module system come from when a module has unexpectedly lost an equality. In the example above it is probably the programmer’s fault, but in many corner cases OCaml will unhelpfully remove an equality when it really shouldn’t.
We’re planning to include a number of features that involve OCaml keeping more equalities on module types. A module type with more equalities is sometimes called a “stronger” module types – hence the above title.
Unfortunately, strengthening module types is not a backwards compatible change. For example, changing the type of a functor from:
module F (X : sig type t end) : ...
module F (X : sig type t = int end) : ...
would clearly break some uses of
F. We feel that the short-term pain
of any breakages caused by these changes is worth it for the long-term
gain of removing a number of confusing corner cases from the language.
Last year, as part of his internship at Jane Street, Maciej Debski implemented transparent ascription. Transparent ascription is an operation:
module M = (N <: S)
M to the elements of the module type
S, but it is
still known to be equal to
N. For example,
M.t would be known to be
Map.Make(M).t would be known to be equal to
Map.Make(N).t. This is in contrast to ordinary ascription:
module M = (N : S)
M.t is not equal to
This feature is pretty useful on its own, but its mainly needed as a prerequisite for the other features in this list.
Aliases to functor parameters
Currently you cannot create a module alias for a functor parameter. For example:
module F (X : S) = struct module M = X end
currently has a type like:
module F (X : S) : sig module M : sig type t = X.t ... end end
The addition of transparent module ascription will make it possible to have module aliases for functor parameters. So the above example would have type:
module F (X : S) : sig module M = (X <: S) end
Fix “with module”
Currently, the semantics of
with module N = M are not to add an alias
N = M, but to give
N the strengthened module type of
module M = struct type t module Inner : sig type t end end module type S = sig module N : sig module Inner : sig type t end end end module type T = S with module N = M
module type T = sig module N : sig type t = M.t module Inner : sig type t = M.Inner.t end end end
We would like to strengthen this behaviour to give the alias instead:
module type T = sig module N = M end
In addition we would support the syntax
module type T = S with module N : R
to extend the signature of sub-module
N, which will allow the old
with module to be obtained using:
module type T = S with module N : module type of M
We would also allow the syntax:
module type T = S with module N = (M <: R)
to extend the signature with a transparent ascription. Note that this
last syntax, in combination with the support for aliasing functor
parameters should finally allow
with module to correctly specify the
result of a functor application.
Fix “module type of”
module type of gives the unstrengthened module type. For example,
module M = struct type t end module type S = module type of M
module type S = sig type t end
This causes all kinds of problems. We will change it to give the
strengthened module type instead, and support a
[@weak] attribute to go
back to the existing semantics. So the above example would give:
module type S = sig type t = M.t end
This change somewhat depends on the other strengthening changes to be tractable because it makes the requirements for a signature like:
module N : module type of M
stricter, and without the other changes many more
[@weak] attributes are
needed to get things compiling again.
Flambda is the name of a series of optimisation passes provided by the native code compilers, which are not enabled by default.
Much of our recent efforts around flambda have been focused around a major rewrite of some of its core components, imaginatively dubbed flambda 2.0. However, we do have some changes planned for the existing version – mostly because they’re orthogonal to the flambda 2.0 changes and should be easy to port between the two versions.
Work towards making flambda classic mode the default compilation mode
There were a number of improvements made to the compile-time performance
-Oclassic mode in OCaml 4.07.0. We’re going to be
benchmarking the performance, both run-time and compile-time, of classic
mode over the next few months. If the comparison with the current
default (non-flambda) compilation mode is good then we would like to
make classic mode the default upstream. If it still needs more work then
we’ll be trying to get that done for 4.08 so that the default might be
changed in 4.09.
Improved inlining heuristics for recursion
As part of his internship at Jane Street, Luke Maurer did some work to
give a proper semantics to how flambda handles recursion. In particular,
it will give command-line arguments like
[@unroll 7] a well defined meaning.
Improved DWARF and GDB support
Mark Shinwell’s patches to dramatically improve DWARF output and GDB support have been waiting on review for a long time. We aim to get this review done and the patches merged upstream for this release.
Move the parser to Menhir
Menhir is a parser generator for OCaml that is vastly superior to ocamlyacc. In particular, it has much better support for producing useful error messages.
A pull-request (#292) was created years ago by Gabriel Scherer and Frédéric Bour, to switch OCaml’s parser to Menhir. This work got resurrected recently and we are trying to lend a hand with finishing and testing it.
The days of
Error: Syntax error. may be numbered.
Add unsigned integer operations
There are open pull requests for both adding unsigned integer types and adding unsigned operations to the existing integer types. There is no consensus at this point around adding actual unsigned integer types, but there seems to be general agreement for at least providing the unsigned operations. We’ve had some demand for this internally, so we intend to help push this work along so it can be merged for 4.08.