null object

a behavioral pattern
Represent a value without actually having a value.
The null object Design Pattern

I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

Tony Hoare (25 August 2009):
"Null References: The Billion Dollar Mistake". InfoQ.com.

Whether null references indeed generated damages worth 1 billion dollars may be the subject of heated debates. One thing, however, is for sure: Almost every software developer has encountered a null reference error at some point. But as long as the problem has existed, developers have been trying to find ways to fix it.

The Null Object pattern is one of these.

It used to be quite popular in the heyday of OOP but has lost a bit of its prominence due to the advent of newer programming concepts and languages (we will discuss some of them later).

While it might not be that common anymore, there's still a chance to find it in legacy systems or older codebases. So it might be worth taking a look.

The Problem: Null References

Let's start with a recap of the problem of null references. In most programming languages, a variable pointing to a null reference represents the absence of a concrete value - its value is literally nothing.

There are many cases where this can happen:

  1. You use an invalid id to request an object from your database. Since the database can't find an entry matching your id, it returns a null value, indicating that the search produced no result.

  2. In your action-based RPG video game, the sword of your mighty warrior can break during a battle. Since your game has no concept of broken weapons, the warrior's weapon variable will instead point to a null value.

  3. An even more straightforward example: You want to convert the string "73e4" into a number. Since your conversion function cannot parse the input, it returns a null value (unless it throws an exception instead).

Null References are problematic: Accessing the members of a null value is likely to result in a NullPointerException, which, if unhandled, will crash your app. After all, it would not make sense to call any method on nothing.

This behavior is acceptable as long as your program performs additional checks before accessing variables that might be null. But since most programmers (including me) are lazy people, we commonly forget these extra checks, which just as commonly leads to unexpected crashes - and, of course, angry users...

So, a safe way to avoid these exceptions is by adding a check every time we access an object that could be null.
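To make this concrete, here is a minimal TypeScript sketch of what such guards could look like for the warrior example. The names Warrior, Weapon and attackDamage are made up for illustration, and falling back to a damage of 0 is just one possible choice.

```typescript
// Hypothetical types, for illustration only.
interface Weapon {
  damage: number;
}

interface Warrior {
  weapon: Weapon | null; // null once the sword has broken
}

// Every access to a value that might be null needs its own guard.
function attackDamage(warrior: Warrior | null): number {
  if (warrior === null) {
    return 0;
  }
  if (warrior.weapon === null) {
    return 0;
  }
  return warrior.weapon.damage;
}
```

Two possibly-null values already require two guards, and every additional one adds another.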

Sounds quite cumbersome?

Indeed, and that's why we have the Null Object pattern.

The Null Object

Instead of checking for null values everywhere, we take a different approach: Whenever our program logic would return null, we return an object instead. This object has the same interface as our real object but does not contain any actual implementation. The caller notices no difference: All methods can be invoked as usual, except that the invocations do nothing.

Let's illustrate this with an example: We have developed a customer management system that allows our users to manage their customer data. One function allows users to update a customer's address.

The implementation is straightforward: We request a Customer entity from the database through its id. When the instance is returned, we call its updateAddress method to set the new address data.

A Customer Repository with a method to retrieve Customer entities.
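A rough TypeScript sketch of such a repository might look as follows. Only Customer, updateAddress and getCustomerById come from the description above; the Address fields, the add method and the in-memory storage are assumptions made to keep the example self-contained.

```typescript
// Assumed shape of an address; the fields are illustrative.
class Address {
  street: string;
  city: string;
  constructor(street: string, city: string) {
    this.street = street;
    this.city = city;
  }
}

class Customer {
  id: string;
  address: Address;
  constructor(id: string, address: Address) {
    this.id = id;
    this.address = address;
  }

  updateAddress(newAddress: Address): void {
    this.address = newAddress;
  }
}

class CustomerRepository {
  // A plain object stands in for the database in this sketch.
  private customers: { [id: string]: Customer | undefined } = {};

  add(customer: Customer): void {
    this.customers[customer.id] = customer;
  }

  // Returns null when no customer matches the given id.
  getCustomerById(id: string): Customer | null {
    const found = this.customers[id];
    return found === undefined ? null : found;
  }
}

const repository = new CustomerRepository();
repository.add(new Customer("42", new Address("1 Old Road", "Springfield")));

// The lookup may yield null, so the caller guards before using the result.
const customer = repository.getCustomerById("42");
if (customer !== null) {
  customer.updateAddress(new Address("2 New Street", "Springfield"));
}
```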

As the instance returned by the database could be null - e.g., if our id did not match any valid entity - we usually would have to add an additional check to ensure we're not accessing a null reference.

To avoid this and later checks, we instead create a new subclass of Customer, which we call NullCustomer, and let our repository method return that instead of a null value.

Since our NullCustomer is derived from Customer, they both have the same public interface, so there is no need to change any client code.

A Customer Repository returning either a real customer or a null object.
If there is no customer for the given id, the repository returns a NullObject.
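The following TypeScript sketch shows one way this could be wired up. The empty id and empty street used for the NullCustomer are placeholder values chosen for this example, as are the add method and the in-memory storage.

```typescript
class Address {
  street: string;
  constructor(street: string) {
    this.street = street;
  }
}

class Customer {
  id: string;
  address: Address;
  constructor(id: string, address: Address) {
    this.id = id;
    this.address = address;
  }

  updateAddress(newAddress: Address): void {
    this.address = newAddress;
  }
}

// The null object: same public interface, but the command does nothing.
class NullCustomer extends Customer {
  constructor() {
    super("", new Address("")); // placeholder values, assumed for this sketch
  }

  updateAddress(_newAddress: Address): void {
    // Intentionally empty.
  }
}

class CustomerRepository {
  private customers: { [id: string]: Customer | undefined } = {};

  add(customer: Customer): void {
    this.customers[customer.id] = customer;
  }

  // Never returns null: unknown ids yield a NullCustomer instead.
  getCustomerById(id: string): Customer {
    const found = this.customers[id];
    return found === undefined ? new NullCustomer() : found;
  }
}

const repository = new CustomerRepository();
const missing = repository.getCustomerById("does-not-exist");

// No null check and no exception, but also no effect.
missing.updateAddress(new Address("2 New Street"));
```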

We could even go one step further and apply this strategy across our whole app. As a result, we could skip any null checks without the fear of running into NullPointerExceptions - doesn't sound too bad, does it?

Well, generally, yes... but there are some things to consider...

Pitfalls

Commands and Queries

To better understand the implications of this pattern, we take a brief look at how methods can affect an object's state.

Basically, there are two types of methods:

  1. Methods that change the application or system state, sometimes also called Commands. All write operations fall into this category, whether it's a setter updating an object's attribute or a log function writing to the console. Since executing them affects our application's state, we say they have side effects.
A logger with a print method.
A method with a side effect changes the program state but does not return a value.
  2. Methods that only return data are often referred to as Queries. Since they don't alter any state, they are free of side effects. The getCustomerById method we used earlier is a good example of a query: It only returns a Customer entity without modifying any state.
A repository method returning a customer entity.
Pure methods return data without altering any state.
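The two categories can be illustrated with a small TypeScript sketch. This Logger keeps its entries in an array instead of writing to the console, and the entryCount query is a hypothetical addition; both are assumptions made so that the effect is easy to observe.

```typescript
class Logger {
  private entries: string[] = [];

  // Command: has a side effect (the log grows) and returns nothing.
  print(message: string): void {
    this.entries.push(message);
  }

  // Query: returns data and leaves the state untouched.
  entryCount(): number {
    return this.entries.length;
  }
}

const logger = new Logger();
logger.print("customer updated");

// Calling a query repeatedly does not change the result.
const countBefore = logger.entryCount();
const countAfter = logger.entryCount();
```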

It is recommended not to mix both concepts in the same method. Queries should return values but not change any state, whereas Commands should generally have no return values.

Okay, but why is this separation necessary? It plays an important role when deciding how to implement our NullObject: Since the pattern represents a nonexistent value or nothing, calling any of its methods should never change the application's state.

As we know, the only operations that can change state are Commands. Hence, our NullObject must override all Commands with an empty implementation to avoid state modifications.

Calling such an "empty" command is always safe since it neither returns nor changes anything. A clever compiler may even be able to optimize the call away.

For queries, the situation is more complex: Here, the caller expects a return value, but what should our NullObject provide? Returning null would make little sense, as it only moves the null reference problem further down the call chain. The additional checks we wanted to avoid would still be required.

The only reasonable solution would be returning another NullObject. But what to do if we again have to call a query? We would have to continue returning NullObjects until we reach the end of our call chain.

A NullObject returning another NullObject instance.
If we start with a NullObject, we have to follow this approach throughout our complete object hierarchy to avoid any null values.
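A small TypeScript sketch of such a chain could look like this. The getAddress and getCity queries are hypothetical, and returning an empty string at the end of the chain is an assumption; any neutral default value would serve the same purpose.

```typescript
interface Address {
  getCity(): string;
}

interface Customer {
  getAddress(): Address;
}

class NullAddress implements Address {
  getCity(): string {
    return ""; // neutral default at the end of the chain (assumed)
  }
}

class NullCustomer implements Customer {
  // A query on a null object has to return another null object.
  getAddress(): Address {
    return new NullAddress();
  }
}

// The whole chain can be traversed without a single null check.
const city = new NullCustomer().getAddress().getCity();
```

Every type reachable through a query needs its own null variant, which is where the extra effort comes from.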

Creating such NullObject structures parallel to existing class hierarchies soon becomes cumbersome, which limits the pattern's usefulness in scenarios where many queries have to be overridden.

Hiding Bugs

There is another, more serious issue one has to consider when using the NullObject: The pattern can postpone or hide bugs in your application.

If a method returns null, this often indicates that something went wrong: For example, you requested an object from a DB using an incorrect id, or you tried a conversion between two incompatible data types. Your application should handle these cases explicitly, as it's unlikely that your program logic can continue as if nothing happened.

Accordingly, our previous getCustomerById method returning a NullObject is a rather bad example of this pattern. Maybe the customer has already been deleted, or we mixed up the IDs somewhere before. Either way, a request for a customer with an invalid ID is most likely due to a bug in our application. Returning a null reference would be a clear way to signal to the client that something went wrong.

If our method returned a NullObject instead, the caller would assume that everything worked as expected, since the return value behaves like an ordinary object. The actual error - using an invalid id for a customer request - gets obfuscated by our NullObject. Sooner or later, this will lead to our app behaving strangely, as any changes to the Customer entity won't be permanent.

Imagine what your users would say if they spent ten minutes editing a customer's dataset through an input form just to receive a message at the end: Sorry, your data cannot be saved because this customer does not exist.

You'd better hope your users don't know where you live...

Implementation

Implementing the NullObject pattern has some implications for the existing code base: Since the NullObject must replace all operations of the object it stands in for, all base methods must be overridable. This is best ensured by using a common interface for the NullObject and the type it replaces.

NullObject and real implementation use the same interface.

Since all operations of the NullObject must be side-effect-free, the object does not have to maintain any state. This is convenient because it also means that all NullObject instances can be treated equally. In most cases, using a single instance - maybe even a static one? - would therefore be enough.
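Both ideas, the common interface and the single shared instance, can be sketched in TypeScript as follows. The Logger interface, MemoryLogger and selectLogger are illustrative names, not part of any particular library.

```typescript
// Common interface for the real implementation and the null object.
interface Logger {
  print(message: string): void;
}

// A "real" implementation that keeps its messages in memory.
class MemoryLogger implements Logger {
  messages: string[] = [];

  print(message: string): void {
    this.messages.push(message);
  }
}

class NullLogger implements Logger {
  // The null object holds no state, so one shared instance is enough.
  static readonly instance: NullLogger = new NullLogger();

  private constructor() {}

  print(_message: string): void {
    // Intentionally empty.
  }
}

// Callers only ever see the Logger interface.
function selectLogger(enabled: boolean): Logger {
  return enabled ? new MemoryLogger() : NullLogger.instance;
}
```

The private constructor is one way to make sure nobody creates additional instances by accident.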

Alternatives

As we've seen, despite being useful, NullObjects can also have some drawbacks, like masking potential errors. But since null reference errors are a widespread problem, many of today's languages offer additional strategies to cope with it.

Let's have a look at some of them.

Optional Chaining

Some languages provide an Optional Chaining operator (called the Null Conditional operator in C#), often written as a question mark followed by a dot: ?. This operator works like the regular member access operator (a plain dot), but first checks whether the accessed object is a valid reference. If it isn't, a Command simply does nothing, and a Query evaluates to a null value (undefined in JavaScript).

const customer = repository.getCustomerById(id);

// the following two snippets are equivalent:

// 1. using an explicit null check
if (customer !== null) {
  customer.updateAddress(address);
}
// 2. using the optional chaining operator
customer?.updateAddress(address);

This behavior largely resembles the logic of the NullObject pattern. Still, it avoids the problem of implementing nested NullObject call hierarchies, as the operator can be chained, making it easier to traverse deeper structures. A call like

thesis?.student?.supervisor?.scheduleMeeting()

is totally safe against NullPointerExceptions - whether it's a good programming style is a different matter...

Applying this operator can be very handy, but as with the NullObject pattern, there is always the risk of hiding the root cause of a potential error, so please use it with caution.
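To see how silently this works, consider the following TypeScript sketch of the thesis example. The types, the meetings counter and the scheduleIfPossible helper are assumptions made for illustration.

```typescript
class Supervisor {
  meetings = 0;

  scheduleMeeting(): void {
    this.meetings += 1;
  }
}

interface Student {
  supervisor?: Supervisor;
}

interface Thesis {
  student?: Student;
}

// Schedules a meeting only if every link of the chain exists.
function scheduleIfPossible(thesis: Thesis | null): void {
  thesis?.student?.supervisor?.scheduleMeeting();
}

const supervisor = new Supervisor();
scheduleIfPossible({ student: { supervisor: supervisor } }); // schedules a meeting
scheduleIfPossible({ student: {} }); // no supervisor: nothing happens
scheduleIfPossible({}); // no student: nothing happens
scheduleIfPossible(null); // no thesis: nothing happens
```

Three of the four calls do nothing at all, and the caller has no way of telling which ones.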

Non-Nullable types

Some languages allow declaring a variable explicitly as non-nullable and letting the compiler enforce that it never gets a null value assigned to it. This allows the compiler to identify any access to a variable containing a null value at compile time. Because you still need to implement additional checks - otherwise, your code wouldn't compile - this is more of an addition to the NullObject pattern than a replacement.

C#, for instance, provides a programming mode where all references are implicitly declared non-nullable, and potential null value access must always be handled explicitly.

// enable the nullable context
#nullable enable

// minimal placeholder type so the example compiles
public class Address { }

public class Customer {
  public void UpdateAddress(Address newAddress) {
    // contains code with side effect
    // (implementation omitted)
  }
}

public class Repository {
  // note the "?" postfix indicating that
  // this method can return a null reference.
  public Customer? GetCustomerById(string id) {
    // (implementation omitted)
    return null; // placeholder so the example compiles
  }
}

public static class Program {
  public static void Main() {
    var repository = new Repository();
    var customer = repository.GetCustomerById("123");
    // the following line will generate a warning (CS8602)
    // because we have to check customer explicitly
    // for null before invoking any methods
    customer.UpdateAddress(new Address());
  }
}
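TypeScript offers a comparable mode through its strictNullChecks compiler option. The sketch below assumes that option is enabled; the updated flag and the hard-coded lookup are placeholders for illustration.

```typescript
class Address {}

class Customer {
  updated = false;

  updateAddress(_newAddress: Address): void {
    this.updated = true; // stands in for the real side effect
  }
}

// The return type states explicitly that the result may be null.
function getCustomerById(id: string): Customer | null {
  return id === "123" ? new Customer() : null; // placeholder lookup
}

const customer = getCustomerById("123");

// With strictNullChecks enabled, the compiler rejects this line,
// because customer may be null:
// customer.updateAddress(new Address());

if (customer !== null) {
  // Inside the guard, the type is narrowed to Customer.
  customer.updateAddress(new Address());
}
```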

Languages without Null Values

Another way is to banish the concept of references being null from your language altogether. Rust, for example, follows this approach by providing an enum called Option that can be used to express the absence of a value.

enum Option<T> { 
    None, 
    Some(T), 
}

When calling a function that returns an Option value, the caller has to check whether the option's state is None or Some. The latter is the only case where the underlying data can safely be accessed.
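The same idea can be approximated in TypeScript with a discriminated union. Note that this is only an emulation for illustration and not Rust's actual Option type; the kind tag, findCustomerName and summarize are made-up names.

```typescript
// Either no value at all, or a value of type T.
type Option<T> = { kind: "none" } | { kind: "some"; value: T };

// Hypothetical lookup that only knows the customer with id "42".
function findCustomerName(id: string): Option<string> {
  return id === "42" ? { kind: "some", value: "Ada" } : { kind: "none" };
}

function summarize(result: Option<string>): string {
  // The value is only accessible after checking which case we have.
  if (result.kind === "some") {
    return `found ${result.value}`;
  }
  return "no customer";
}

const hit = summarize(findCustomerName("42"));
const miss = summarize(findCustomerName("7"));
```

Accessing result.value without the check does not compile, because the none case has no such property.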

This concept is similar to the Optional type in Java or Nullable in C#. However, while these languages still support null values, safe Rust has no null references at all.