From now and then, I found myself having to write some mechanical and repetitive code. The usual solution for this is to write a code generator; for instance in the form of a ppx rewriter in the case of OCaml code. This however comes with a cost: code generators are harder to review than plain code and it is a new syntax to learn for other developers. So when the repetitive pattern is local to a specific library or not widely used, it is often not worth the effort. Especially if the code in question is meant to be reviewed and maintained by several people.
Then there is the possibility of using a macro pre-processor such as cpp or cppo which is the equivalent of cpp but for OCaml. This can help in some cases but this has a cost as well:
- macros generally make the code harder to read
- errors tends to be harder to understand since they don’t point where you’d expect
- you can say goodbye to merlin
In fact, when the repetitive pattern is specific to one particular case and of reasonable size, committing and reviewing the generated code is acceptable. That’s the problem Cinaps tries to solve.
What is cinaps?
Cinaps is an application that reads input files and recognize special syntactic forms. Such forms are expected to embed some OCaml code printing something to stdout. What they print is compared against what follow these special forms. The rest works exactly the same as expectation tests.
The special form is (*$ <ocaml-code> *)
for ml source files,
/*$ <ocaml-code> */
for C source files and #|$ <ocaml-code> |#
for
S-expression files.
For instance:
$ cat file.ml
let x = 1
(*$ print_newline ();
List.iter (fun s -> Printf.printf "let ( %s ) = Pervasives.( %s )\n" s s)
["+"; "-"; "*"; "/"] *)
(*$*)
let y = 2
$ cinaps file.ml
---file.ml
+++file.ml.corrected
File "file.ml", line 5, characters 0-1:
let x = 1
(*$ print_newline ();
List.iter (fun s -> Printf.printf "let ( %s ) = Pervasives.( %s )\n" s s)
["+"; "-"; "*"; "/"] *)
+|let ( + ) = Pervasives.( + )
+|let ( - ) = Pervasives.( - )
+|let ( * ) = Pervasives.( * )
+|let ( / ) = Pervasives.( / )
(*$*)
let y = 2
$ echo $?
1
$ cp file.ml.corrected file.ml
$ cinaps file.ml
$ echo $?
0
Real example
What follows is a real example where using Cinaps made the code much easier to write and maintain. However, I changed the names for this blog post since this code is not released publicly. Note also that this example shows one way we usually write C bindings at Jane Street. It is not meant as a model of how to write C bindings, and the excellent ctypes library should be the default choice in most cases. However, this code pre-dates ctypes and migrating it would be quite a lot of work.
The example itself is part of a C binding that I wrote a few years ago. While
doing so I used Core.Flags
in order to represent a few C enumerations on the
OCaml side. Core.Flags
is a module providing a nice abstraction for
representing C
flags.
The OCaml code looks like what you’d expect from code using Core.Flags
:
module Open_flags = struct
external get_rdonly : unit -> Int63.t = "mylib_O_RDONLY" [@@noalloc]
external get_wronly : unit -> Int63.t = "mylib_O_WRONLY" [@@noalloc]
external get_rdwr : unit -> Int63.t = "mylib_O_RDWR" [@@noalloc]
external get_nonblock : unit -> Int63.t = "mylib_O_NONBLOCK" [@@noalloc]
external get_append : unit -> Int63.t = "mylib_O_APPEND" [@@noalloc]
external get_creat : unit -> Int63.t = "mylib_O_CREAT" [@@noalloc]
external get_trunc : unit -> Int63.t = "mylib_O_TRUNC" [@@noalloc]
external get_excl : unit -> Int63.t = "mylib_O_EXCL" [@@noalloc]
external get_noctty : unit -> Int63.t = "mylib_O_NOCTTY" [@@noalloc]
external get_dsync : unit -> Int63.t = "mylib_O_DSYNC" [@@noalloc]
external get_sync : unit -> Int63.t = "mylib_O_SYNC" [@@noalloc]
external get_rsync : unit -> Int63.t = "mylib_O_RSYNC" [@@noalloc]
let rdonly = get_rdonly ()
let wronly = get_wronly ()
let rdwr = get_rdwr ()
let nonblock = get_nonblock ()
let append = get_append ()
let creat = get_creat ()
let trunc = get_trunc ()
let excl = get_excl ()
let noctty = get_noctty ()
let dsync = get_dsync ()
let sync = get_sync ()
let rsync = get_rsync ()
include Flags.Make(struct
let known =
[ rdonly , "rdonly"
; wronly , "wronly"
; rdwr , "rdwr"
; nonblock , "nonblock"
; append , "append"
; creat , "creat"
; trunc , "trunc"
; excl , "excl"
; noctty , "noctty"
; dsync , "dsync"
; sync , "sync"
; rsync , "rsync"
]
let remove_zero_flags = false
let allow_intersecting = false
let should_print_error = true
end)
end
And there are about 3 modules like this in this file, plus the corresponding stubs in the C file. Writing this code initially was no fun, and adding new flags now that the C library has evolved is still no fun.
The rest of this section explains how to make it more fun with cinaps.
Setting up and using cinaps
First I add a rule in the build system to call cinaps
appropriately. I use a
few settings specific to our jenga based builds and it is currently not possible
to replicate this outside of Jane Street, but assuming you have a Makefile
,
you can write:
.PHONY: cinaps
cinaps:
cinaps -i src/*.ml src/*.c
Now whenever you call make cinaps
, all the files will be updated in place. You
can then do git diff
to see what changed.
Then I write a file src/cinaps_helpers
. It is plain OCaml source file, however
it is not suffixed with .ml so that it is not confused with a regular module of
the library. It contains the various bits that are common between the ml/C files
in the library:
(* -*- tuareg -*- *)
let stub_prefix = "mylib_"
let stub name = stub_prefix ^ name
let open_flags =
[ "O_RDONLY"
; "O_WRONLY"
; "O_RDWR"
; "O_NONBLOCK"
; "O_APPEND"
; "O_CREAT"
; "O_TRUNC"
; "O_EXCL"
; "O_NOCTTY"
; "O_DSYNC"
; "O_SYNC"
; "O_RSYNC"
]
let other_flags =
[ ...
]
let yet_other_flags =
[ ...
]
let all_flags = open_flags @ other_flags @ yet_other_flags
open StdLabels
open Printf
let pr fmt = printf (fmt ^^ "\n")
let flags_module module_name flags ~prefix ~allow_intersection =
<code to print an Open_flags like module>
Now, in my original .ml file, I can write:
(*$ #use "cinaps_helpers" $*)
(*$ flags_module "Open_flags" open_flags ~prefix:"O_" ~allow_intersecting:false *)
module Open_flags = struct
external get_rdonly : unit -> Int63.t = "mylib_O_RDONLY" [@@noalloc]
external get_wronly : unit -> Int63.t = "mylib_O_WRONLY" [@@noalloc]
external get_rdwr : unit -> Int63.t = "mylib_O_RDWR" [@@noalloc]
external get_nonblock : unit -> Int63.t = "mylib_O_NONBLOCK" [@@noalloc]
external get_append : unit -> Int63.t = "mylib_O_APPEND" [@@noalloc]
external get_creat : unit -> Int63.t = "mylib_O_CREAT" [@@noalloc]
external get_trunc : unit -> Int63.t = "mylib_O_TRUNC" [@@noalloc]
external get_excl : unit -> Int63.t = "mylib_O_EXCL" [@@noalloc]
external get_noctty : unit -> Int63.t = "mylib_O_NOCTTY" [@@noalloc]
external get_dsync : unit -> Int63.t = "mylib_O_DSYNC" [@@noalloc]
external get_sync : unit -> Int63.t = "mylib_O_SYNC" [@@noalloc]
external get_rsync : unit -> Int63.t = "mylib_O_RSYNC" [@@noalloc]
let rdonly = get_rdonly ()
let wronly = get_wronly ()
let rdwr = get_rdwr ()
let nonblock = get_nonblock ()
let append = get_append ()
let creat = get_creat ()
let trunc = get_trunc ()
let excl = get_excl ()
let noctty = get_noctty ()
let dsync = get_dsync ()
let sync = get_sync ()
let rsync = get_rsync ()
include Flags.Make(struct
let known =
[ rdonly , "rdonly"
; wronly , "wronly"
; rdwr , "rdwr"
; nonblock , "nonblock"
; append , "append"
; creat , "creat"
; trunc , "trunc"
; excl , "excl"
; noctty , "noctty"
; dsync , "dsync"
; sync , "sync"
; rsync , "rsync"
]
let remove_zero_flags = false
let allow_intersecting = false
let should_print_error = true
end)
end
(*$*)
And cinaps will check that the text between the (*$ ... *)
and (*$*)
forms
is what is printed by flags_module "Open_flags" ...
. I write something similar
in the .c file. Note the initial (*$ ... $*)
form, which is not expected to
print anything and is only used for its other side effects.
Adding new flags become trivial: add it to the list in src/cinaps_helper
and
execute make cinaps
.
Pushing the system
Now I decide that I don’t like the fact that all my constant flags are initialized at runtime and I want them to be static constant on the ml side. A simple way to do this is to write a C program that include the right headers and output a .ml file defining these constants. I use cynaps to write this C file as well:
/*$ #use "cinaps_helpers" $*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main()
{
printf("open Core\n");
printf("let mk = Int63.of_int_exn\n");
/*$
printf "\n";
let len = longest all_flags in
List.iter all_flags ~f:(fun f ->
pr {| printf("let _%-*s = mk %%d\n", %-*s);|} len f len f );
printf " " */
printf("let _O_RDONLY = mk %d\n", O_RDONLY );
printf("let _O_WRONLY = mk %d\n", O_WRONLY );
printf("let _O_RDWR = mk %d\n", O_RDWR );
printf("let _O_NONBLOCK = mk %d\n", O_NONBLOCK);
printf("let _O_APPEND = mk %d\n", O_APPEND );
printf("let _O_CREAT = mk %d\n", O_CREAT );
printf("let _O_TRUNC = mk %d\n", O_TRUNC );
printf("let _O_EXCL = mk %d\n", O_EXCL );
printf("let _O_NOCTTY = mk %d\n", O_NOCTTY );
printf("let _O_DSYNC = mk %d\n", O_DSYNC );
printf("let _O_SYNC = mk %d\n", O_SYNC );
printf("let _O_RSYNC = mk %d\n", O_RSYNC );
/*$*/
return 0;
}
Updating the various flag modules in the the ml code is as simple as editing
src/cinaps_helpers
and doing make cinaps
:
(*$ flags_module "Open_flags" open_flags ~prefix:"O_" ~allow_intersecting:false *)
module Open_flags = struct
let rdonly = Consts._O_RDONLY
let wronly = Consts._O_WRONLY
let rdwr = Consts._O_RDWR
let nonblock = Consts._O_NONBLOCK
let append = Consts._O_APPEND
let creat = Consts._O_CREAT
let trunc = Consts._O_TRUNC
let excl = Consts._O_EXCL
let noctty = Consts._O_NOCTTY
let dsync = Consts._O_DSYNC
let sync = Consts._O_SYNC
let rsync = Consts._O_RSYNC
include Flags.Make(struct
let known =
[ Consts._O_RDONLY , "rdonly"
; Consts._O_WRONLY , "wronly"
; Consts._O_RDWR , "rdwr"
; Consts._O_NONBLOCK , "nonblock"
; Consts._O_APPEND , "append"
; Consts._O_CREAT , "creat"
; Consts._O_TRUNC , "trunc"
; Consts._O_EXCL , "excl"
; Consts._O_NOCTTY , "noctty"
; Consts._O_DSYNC , "dsync"
; Consts._O_SYNC , "sync"
; Consts._O_RSYNC , "rsync"
]
let remove_zero_flags = false
let allow_intersecting = false
let should_print_error = true
end)
end
(*$*)
Tweak: indenting the generated code
You can either write cinaps code that produce properly indented code, or you can use the styler option:
.PHONY: cinaps
cinaps:
cinaps -styler ocp-indent -i src/*.ml src/*.c
History behind the name
I initially wrote this tool while I did some work on the ocaml-migrate-parsetree project. ocaml-migrate-parsetree was started by Alain Frisch and continued by Frederic Bour and aims at providing a solid and stable base for authors of ppx rewriters or other tools using the OCaml frontend. I helped a bit during development and did some testing on a large scale while rebasing our ppx infrastructure on top it.
Due to its nature, this project contains a lot of repetitive code that cannot be factorized other than by using some kind of meta-programming. Initially we had a small pre-preprocessor that was interpreting a made-up syntax and was working like cpp does. The syntax was yet another DSL and the generated code was generated on the fly. This made the .ml and .mli files harder to understand since you had to decode this DSL in order to understand what the code was.
Cinaps replaced this tool and the name was chosen to emphasize that it is not a preprocessor. It means “Cinaps Is Not A Preprocessing System”.
Status
Cinaps is published on github and is part of the upcoming v0.9 Jane Street release. The version that is published doesn’t yet support the C/S-expression syntaxes but once the stable release has gone through, an updated version of Cinaps supporting these syntaxes will be released.