People often think of formal methods and theorem provers as forbidding tools, cool in theory but with a steep learning curve that makes them hard to use in real life. In this post, we’re going to describe a case we ran into recently where we were able to leverage theorem proving technology, Z3 in particular, to validate some real world engineering we were doing on the OCaml compiler. This post is aimed at readers interested in compilers, but assumes no familiarity with actual compiler development.
We’ll start by discussing the kind of compiler optimizations we’re interested in and why they might be difficult to get right, in particular when applied to floating-point computations. We’ll then show how we used Z3 to review some optimizations we were considering adding to the OCaml compiler, and how doing so uncovered a subtle bug in one of them.
Constant Folding
Compilers usually perform a lot of optimizations to ensure that the programs written by developers run as fast as possible. Among these optimizations, a common one is constant folding, which allows the compiler to rewrite an expression such as 2 * 3 to 6, performing the computation at compile time rather than at run time. Of course, developers don’t usually write expressions like 2 * 3, but such examples show up anyway because other compiler optimizations can create them, e.g. by replacing a symbolic constant with its actual value. Similarly, constant folding can in turn trigger other optimizations, like determining that a code fragment is unreachable, which leads to it being deleted.
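As a purely illustrative OCaml sketch (the names are invented for the example), here is how a multiplication such as 2 * 3 can appear without anyone writing it: once the compiler substitutes the value of a symbolic constant, the resulting expression becomes foldable.

let scale = 3              (* symbolic constant *)
let doubled = 2 * scale    (* becomes 2 * 3 once scale is substituted,
                              then 6 by constant folding *)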
A natural question is, of course, when such optimizations can be applied. In the previous example, a multiplication can be simplified at compile time not only when the values of both operands are known, but also by applying basic mathematical knowledge such as:
- the presence of a neutral element: 1 * expr can be rewritten to expr;
- the presence of an absorbing element: 0 * expr can be rewritten to 0;
- the equivalence with another operation: -1 * expr can be rewritten to -expr (the latter being both shorter and faster on most CPUs).
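A minimal, single-pass OCaml sketch of these rewrites (not the compiler’s actual code; the expression type is invented for the example) could look like this:

type expr =
  | Const of int
  | Var of string
  | Mul of expr * expr
  | Neg of expr

(* Apply the three rules above; a real simplifier would also iterate
   until no rewrite applies. *)
let rec simplify = function
  | Mul (Const 1, e) | Mul (e, Const 1) -> simplify e             (* neutral element *)
  | Mul (Const 0, _) | Mul (_, Const 0) -> Const 0                (* absorbing element *)
  | Mul (Const (-1), e) | Mul (e, Const (-1)) -> Neg (simplify e) (* -1 * e ⇝ -e *)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Neg e -> Neg (simplify e)
  | (Const _ | Var _) as e -> e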
A common pitfall for compiler developers is to apply mathematical knowledge to objects having slightly different properties. Indeed, while the rewrites above are trivially correct for mathematical integers, one has to keep in mind that what we call integers on computers is typically quite different from what we call integers in mathematics. For instance, in most programming languages, integer values are represented using a fixed number of bits which leads to a bounded range of possible integers.
As an example, if a signed integer is represented using 8 bits, then the range of possible values is usually -128..127. In this setting, is it correct to rewrite -1 * expr to -expr? It turns out that it is, but only because the usual way overflow is handled (at least when integer values are encoded using the two’s complement representation) is such that -(-128) = -128. The mathematical intuition was correct, but for not-so-obvious reasons. Floating-point numbers are unfortunately even trickier to manipulate.
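The same corner case is easy to observe in the OCaml toplevel with 64-bit integers: negating the smallest representable value overflows back to that very value, so the -1 * expr ⇝ -expr rewrite still produces the same bit pattern.

Int64.min_int;;                             (* -9223372036854775808L *)
Int64.neg Int64.min_int;;                   (* -9223372036854775808L as well *)
Int64.neg Int64.min_int = Int64.min_int;;   (* true *)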
Floating-point Computations
Floating-point numbers are an approximate representation of real numbers. Floating-point computations are notoriously error-prone, mostly because some basic mathematical properties over real numbers do not hold for floating-point values. For instance, floating-point addition is not associative, i.e. given three floating-point values x, y and z, the following computations do not always yield the same result:
- x + (y + z);
- (x + y) + z.
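A classic instance, assuming 64-bit floats as in OCaml:

let a = (0.1 +. 0.2) +. 0.3   (* approximately 0.6000000000000001 *)
let b = 0.1 +. (0.2 +. 0.3)   (* 0.6 *)
let _ = (a = b)               (* false *)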
To complicate things even further, some floating-point subtleties have no direct equivalent in mathematics, such as the fact that there are two different values for zero: a positive one and a negative one. The two values are considered equal when compared, but they may yield different results when used in computations. For instance, 1.0 / +0.0 is equal to positive infinity while 1.0 / -0.0 is equal to negative infinity. (Which, incidentally, means it is not safe to replace a floating-point value with another one that is equal according to floating-point comparison.)
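This is easy to observe in the OCaml toplevel:

0.0 = -0.0;;       (* true: the two zeros compare as equal *)
1.0 /. 0.0;;       (* infinity *)
1.0 /. (-0.0);;    (* neg_infinity *)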
Compiler optimizations are expected to transform programs while retaining the same behavior. As a consequence, a number of compilers either do not perform any optimization on floating-point computations, or provide means to specifically disable such optimizations. The solution we discuss in this post is the restriction to optimizations that always rewrite to computations producing the same result.
While the previous sentence seems to be perfectly reasonable, it is missing practical details: what do we mean by “always” and “same”? The latter is easy to define: we regard two floating-point values as identical if and only if they have the very same bit pattern according to the IEEE 754 specification (this ensures that they could not be discriminated by any operation). The former is considerably trickier to enforce because all possible combinations of values have to be considered, and ideally all possible rounding modes. A rounding mode determines which value should be chosen when the exact result of an operation cannot be represented (because floating-point values are only approximations).
Automated Theorem Proving to the Rescue
The traditional way to demonstrate that a property is always true is to write a mathematical proof. However, writing a formal proof is hard work, so we are going to take an alternate approach, which is to use a tool in order to automate the process of doing the proof.
A SAT solver is a tool able to solve a boolean SATisfiability problem, i.e. to determine whether, given a logical formula such as

a ∧ (b ∨ c) ∧ ¬ c ∧ ¬ (b ∧ d),

there exists a mapping from the formula variables to truth values that evaluates the formula to true.
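Just to make the notion concrete (an actual SAT solver is of course far smarter than brute force), one can enumerate the sixteen possible assignments of this formula in OCaml and print the satisfying ones; the only one is a = true, b = true, c = false, d = false.

let formula a b c d = a && (b || c) && not c && not (b && d)

let () =
  let bools = [ false; true ] in
  List.iter (fun a ->
      List.iter (fun b ->
          List.iter (fun c ->
              List.iter (fun d ->
                  if formula a b c d then
                    Printf.printf "a=%b b=%b c=%b d=%b\n" a b c d)
                bools)
            bools)
        bools)
    bools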
SMT stands for Satisfiability Modulo Theories, where the theories act as built-in knowledge about various kinds of structures such as integers or arrays, thus allowing the solver to accept formulas such as

(a > 10) ∧ (b ∨ (c = 2)) ∧ ¬ (b ∧ (d[c] < 0)).
The built-in knowledge of SMT solvers makes them much more efficient than SAT
solvers because in a number of situations they can leverage properties from a
given theory in order to reduce the search space.
The basic use of an SMT solver consists of declaring various entities and expressing constraints over those entities, then asking the solver whether the constraints can be satisfied (i.e. can hold simultaneously). If the constraints can be satisfied, the solver can usually produce a model, that is, a mapping from entities to values that satisfies the constraints.
Z3 is an SMT solver that comes with support for both integer values (through bit vectors of custom sizes) and floating-point values (using the IEEE 754 representation). Assuming we want to check whether the positive zero is a right neutral element for addition, we would submit the following query to Z3:
(set-logic QF_FP)
(declare-const x (_ FloatingPoint 11 53))
(define-fun z () (_ FloatingPoint 11 53) (_ +zero 11 53))
(declare-const r RoundingMode)
(assert (not (= (fp.add r x z) x)))
(check-sat)
(get-model)
Besides the first line, which sets the theory to be used, the file is relatively straightforward:
- the declare-const/define-fun lines declare the various entities to be used later on: an unknown floating-point value, an alias for the positive zero, and an unknown rounding mode (the values 11 and 53 are simply the numbers of bits used for, respectively, the exponent and the mantissa of a floating-point value);
- the assert line encodes the formula we want to check, i.e. x + +0 = x;
- the last two lines ask Z3 to check the satisfiability of the formula and to output the model, if any.
It is noteworthy that we actually encode the negation of the property we want to check. As a consequence, satisfiability means that the property does not hold, and the model is actually a counterexample. If we get a counterexample, then we can double-check that Z3 is right. Otherwise, we basically have to trust both Z3 itself and our use of the tool.
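To try it, the query can be saved to a file (say, neutral_add.smt2, a name chosen purely for illustration) and passed to the solver on the command line with z3 neutral_add.smt2.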
Z3 answers the previous query with the following:
sat
(model
(define-fun x () (_ FloatingPoint 11 53)
(_ -zero 11 53))
(define-fun r () RoundingMode
roundNearestTiesToEven)
)
On the first line, sat means that the assertion is satisfiable. The other lines simply present a model (i.e. a counterexample for us) where x is bound to negative zero. The suspicious Z3 user can then feed the values to an OCaml toplevel:
let x = -0.
and z = +0.
and bits = Int64.bits_of_float in
bits (x +. z), bits x;;
whose output indeed shows two different values (0 and -9223372036854775808). We should hence not rewrite x + +0 to x.
Should Z3 have answered that there is no model for a given request, we would have been able to ask it to produce a proof. However, the proof establishing that there is no model for (assert (not (= (fp.mul r x one) x))) (i.e. that the x * 1 ⇝ x rewrite is correct) has more than 100,000 lines. This means that, in practice, such a proof would only be passed to another tool and thus only move the trust issue to another piece of code…
Upcoming flambda 2.0
The previous example has not been chosen at random. The x + +0 ⇝ x rewrite was for a (very) short time part of the development version of the compiler, and was rapidly removed once Z3 proved it invalid, possibly saving us a painful debugging session later on.
Using Z3, we have checked that all the simplifications over integer and floating-point values (involving arithmetic, bitwise and shifting operators) in the upcoming flambda 2.0 variant of the OCaml compiler are correct. Moreover, these simplifications are correct for all integer sizes (OCaml defines 4 kinds of integers) and all rounding modes (OCaml offers no means to tweak the rounding mode, but it could be modified through an external function call).
Eventually, we plan to use a ppx extension combined with a simple DSL in order to express the properties underlying the various rewrites as mere attributes of the OCaml code actually performing the rewrites (hence ensuring that the checks and the code are kept synchronized). The simplifications for floating-point multiplication when only one operand is known would look like:
...
| Mul ->
if F.equal this_side F.one then
The_other_side
[@z3 check_float_binary_neutral `Mul (+1.0) `Right]
[@z3 check_float_binary_neutral `Mul (+1.0) `Left]
else if F.equal this_side F.minus_one then
Float_negation_of_the_other_side
[@z3 check_float_binary_opposite `Mul (-1.0) `Left]
[@z3 check_float_binary_opposite `Mul (-1.0) `Right]
else
Cannot_simplify
...
Relying on a tool such as Z3 to check whether candidate rewrites are correct is invaluable. Not only does it ensure that we do not introduce (possibly subtle and hard-to-reproduce) bugs, it also paves the way to more complex rewrites. Indeed, the safety net acts as an encouragement to experiment with ideas that could otherwise be deemed too risky.
Further reading
Verifying Optimizations using SMT Solvers (by Nuno Lopes) shows how Z3 can be used to ensure that transforms from LLVM IR to assembly code are correct.
What Every Computer Scientist Should Know About Floating-Point Arithmetic (by David Goldberg) is an in-depth treatment of floating-point representation and operations, addressing a number of common misconceptions.