Null is a pervasive concept in computing. Virtually all programming languages have a way of expressing nothing, nullity, no answer. But handling nulls correctly turns out to be tricky, and many of the contexts in which you find nulls, you’ll also find confusing and error-prone semantics surrounding them.
The heart of the problem is that, in an attempt to make programming with null easier, nulls are often propagated implicitly through computations, allowing the programmer to write code that deals with nulls without explicitly contemplating how nulls should be dealt with.
My experience has been that this is a mistake; that if you want robust, easy-to-reason-about code, the programmer must think explicitly about how to handle null cases, and that programming languages would do well to provide programmers with good support for the requisite case analysis
The point can be illustrated by considering some of the contexts in which null arises.
IEEE Floating point arithmetic
This is an oft-forgotten case, but an important one. The null value in this case
is NaN
, which stands for not-a-number. NaN
is the value you get when a basic
arithmetic operation has no reasonable answer, e.g. zero times infinity.
In the IEEE standard, NaN
behaves quite differently from other floats. The
standard requires that applying the primitive arithmetic operations to NaN
always produces NaN
. At first glance, this seems sensible. What’s 3
times
I-don't-know
? Clearly it’s I-don't-know
!
But what about primitive operations that don’t return floats, like comparisons?
According to the standard, comparison functions return false when one of their
arguments is NaN
. Now this should make you nervous, since it’s not at all
obvious that the value of I-don't-know
> 4
should be false.
Indeed, this behavior violates a bunch of fundamental invariants. For example,
it’s simply not the case that x > y is equivalent to not (x <= y) when x
or y
are NaN
. Also, comparison functions don’t give a total order on floats
– see what happens to your favorite binary tree when you throw in some NaN
s.
Reflexivity doesn’t even hold, since NaN = NaN
is false!
To see how weird the results of this can be, consider the following function for computing the max of two numbers (the example is in OCaml, but the same behavior will show up in any language that provides IEEE compliant floating point semantics):
let max x y =
if x > y then x else y
If either x
or y
is NaN
, the test will return false, and y
will be
chosen. Thus, max NaN 3
is 3
, and max 3 NaN
is NaN
. Note that max
doesn’t have the null-propagation property that the IEEE standard wanted for
primitive float-producing operations. max
doesn’t even have the symmetry
property you would expect: max x y
is not the same as max y x
.
Here’s another example. Consider the following two functions, which are intended to validate whether the provided float is between 1 and 10.
type result = Fail | Pass
let check1 x =
if x < 1. || x > 10. then Fail else Pass
let check2 x =
if x >= 1. && x <= 10. then Pass else Fail
The two functions certainly look equivalent, and if you’re not clued in to the
strange semantics of NaN
, you might be surprised to discover that check1
returns Pass
for NaN
, whereas check2
returns Fail
.
SQL’s NULL
Much ink has been spilled about the problems associated with SQL’s NULL
. The
wikipedia page on NULL
is a
pretty good summary, so instead of trying to give a complete picture here, I’ll
just give a taste of how NULL
works in SQL, and what the associated problems
are.
At first glance, SQL’s NULL
looks a lot like NaN
. If you call a simple
function and give it NULL
as one of its arguments, the result will be NULL
as well e.g., 3 + NULL
is NULL
. But unlike NaN
, SQL’s NULL
is not
restricted to floating point numbers; You can have a NULL
in a column of
arbitrary type.
Let’s see what happens when we start doing comparison functions in SQL. It turns
out that SQL propagates a kind of NULL
into boolean expressions as well. This
seems more consistent than the behavior with NaN
. Indeed, the odd way that
NaN
propagates into comparison functions is what sunk our implementation of
max
, and so you might expect max
implemented in SQL will work properly.
Let’s try. Assume we have a table t
which contains some NULL
s:
| i | j |
|------|------|
| 1 | 2 |
| NULL | 2 |
| 1 | NULL |
If we compute the max
of these two columns, we should expect the result to be:
| i | j | max |
|------|------|------|
| 1 | 2 | 2 |
| NULL | 2 | NULL |
| 1 | NULL | NULL |
Here’s some SQL code code for computing the max from two columns.
SELECT
i, j,
CASE WHEN i > j THEN i ELSE j END
AS max
FROM t
Sadly, we see the exact same bizarre behavior we got out of our floating-point
max
function.
| i | j | max |
|------|------|------|
| 1 | 2 | 2 |
| NULL | 1 | 1 |
| 1 | NULL | NULL |
Why is this? Because SQL doesn’t hold its ground when it comes to the way it
handles NULL
in conditional expressions. A NULL
condition is treated as
equivalent to false, where the more consistent behavior would be for the entire
CASE
expression to evaluate to NULL
.
The root of the problem here is the attempt to pick a reasonable default
behavior in the presence of NULL
in a way that doesn’t just scuttle the entire
computation. Saving such a computation isn’t hopeless – these null-handling
heuristics often produce reasonable answers. But they can produce unreasonable
answers as well, and the end result is that the behavior of SQL in the presence
of NULL
is inconsistent and confusing.
The above example is really just the tip of the iceberg. You can see examples of strange behavior with aggregate functions, selects and joins as well.
Fundamentally, both SQL’s NULL
and the floating-point NaN
fail because the
choice of how to salvage a calculation that encounters null depends on things
that can not be known by the heuristic.
One good aspect of SQL’s NULL
handling is that SQL provides ways of enforcing
constraints that given columns contain no NULL
s. Given the odd behavior of
NULL
s, it’s an essential feature.
Null references
Another common form of null is the null reference (or null pointer). Null references appear in most mainstream statically typed programming languages, including C, C++, C#, Java. In the following, I’ll talk about Java’s null references, but the same basic issues show up in many languages.
Unlike IEEE floating point arithmetic and SQL, Java does not try to automatically salvage computations that encounter nulls. Instead, any computation that tries to make a method call on a null reference will result in an exception. In some sense, exceptions are their own kind of null, a special value that a function can return when it can’t return anything else. (non-termination is yet another way for a computation to refrain from returning an ordinary value.)
The problem with null references is that while their handling is explicit at the
level of values, it’s implicit at the level of types. Java’s type system allows
for all object references to potentially be null references, even though for
most variables most of the time, null will never show up. This is unlike SQL,
where some values can be declared as non-NULL
.
Because the type system gives you no clue as to the presence of nulls, null reference exceptions are rather ubiquitous in Java programs. Indeed, elaborate systems like ESC Java (ESC stands for “Extended Static Checking”) have been developed to enforce various static checks, prominent among them the elimination of runtime null reference exceptions.
As a side note, it’s interesting that Java has such a loosey-goosey approach to null object references, when it has a fairly strict approach to exception handling. Indeed, Java requires one to be quite strict and explicit when dealing with so-called checked exceptions.
ML’s option type
The problem with Java’s null references is different from the problem with SQL and IEEE floating point arithmetic. In those two cases, the system tries to “do the right thing” when null comes up, and that right thing turns out to sometimes be wrong; the problem with Java is that it doesn’t provide sufficient tools in the language to ease the programmer’s life in dealing with nulls.
Languages like ML and Haskell that are based on the Hindley-Milner type system,
on the other hand, provide quite powerful tools for null handling. Rather than
talk about such languages collectively, I’ll talk in terms of the instance I
know best, OCaml. In OCaml, null is not lurking in every type. If a variable has
type int array
, then that variable is always populated with an int array
,
and never with null.
When you need a variable whose value might or might not be there, it is modeled
explicitly in the type system. One common way of doing so is using what’s called
an option
. The option
type is not a primitive of the language, but an
ordinary data type, which can be declared as follows:
type 'a option =
| None
| Some of 'a
This is a tagged union with two cases: None
, indicating the lack of a value;
and Some
, indicating the presence of a specified value. To see how this might
be used in practice, consider the following simple hashtable interface:
module Table : sig
type ('key,'data) t
val create : unit -> ('key,'data) t
val find : ('key,'data) t -> 'key -> 'data option
val replace : ('key,'data) t -> 'key -> 'data -> unit
end
Notice that find
returns an optional value. The reason is that find
may fail
to find an entry corresponding to the given key. This fact is modeled in the
type by the fact that the return value is of type 'data option
rather than
simply 'data
.
In addition to allowing one to express in the type system that a value may be
missing, OCaml also provides powerful tools for doing the corresponding case
analysis. Here’s a simple example deals with options explicitly using match
statements.
let increment_count table key =
let current_count =
match Table.find table key with
| None -> 0
| Some x -> x
in
Table.replace table key (current_count + 1)
Here we explicitly state that we want the null case (where the key can’t be found in the table of counts) to be treated as if the table had an entry with value equal to zero.
Explicit null handling like the kind shown above can be done in, say, SQL, using
the COALESCE
statement. But in OCaml, the case analysis is obligatory – you
can’t silently use an optional value without contemplating what will happen in
the None
case. Consider the following implementation of increment_count
:
let increment_count table key =
Table.replace table key (Table.find table key + 1)
While the analogous code would compile in SQL, the OCaml compiler will reject
it, noting that the expression Table.find table key
was used as if it were an
int
, but was in fact an int option
. This may seem cumbersome, but in
reality, the explicit separation of cases where null is and is not possible is a
great relief, since it frees the programmer from worrying about the cases not
flagged by the compiler.
The requirement to handle nulls explicitly doesn’t mean that we can’t reduce the amount of boilerplate required. If we have a particular null-handling policy in mind, we can write helper functions to automate it. For example, we can write a function:
let option_get ~if_none option =
match option with
| Some x -> x
| None -> if_none
This allows us to rewrite our increment_count
function as follows
let increment_count table key =
let current_count = option_get ~if_none:0 (Table.find table key) in
Table.replace table key (current_count + 1)
Indeed, in Jane Street’s Base
library there is an
Option
module devoted to useful helpful helper functions of this sort,
including an option monad which is a quite general and powerful
technique for dealing with option-generating computations.
OCaml’s approach is far from perfect. While OCaml is quite explicit about the use of options, it is quite the opposite when it comes to exceptions. Exceptions, like null references in Java, lurk in every function, and this is a very real problem. Indeed the inability to track exceptions in the type system has lead us to try to avoid exceptions except for truly exceptional conditions (and to use options instead).
If you’re interested in working at a place where functional programming meets the real world, you should think about applying to Jane Street. We’re always looking to hire great programmers with an interest in functional programming.