Null Properties

Let's say that I have an object that is used to represent what's in my database. It has many properties, along with getters for these properties.

class Customer
{
    private ?string $firstName;
    private ?string $lastName;
    private ?string $phone1;
    private ?string $phone2;
    private ?string $email;
    private ?Address $address;
}

All the properties are nullable. When I start using this object and encounter a null, it can mean many things:

I decided not to load this data to save bandwidth or memory.
I accidentally failed to load the data because of a mistake in my code.
The value in the DB is null.
I assigned it to null so that it becomes null in the DB when I save the object later.
I assigned it to null so that I could serialize a more compact version of the object (think smaller JSON).
I accidentally assigned it to null because some method returned null due to a bug.
Etc.

Null can mean so many things. It's like when you ask a friend to do something, and they say no. No, you don't have time right now but will have later? No, you're not capable of doing it? No, you won't do it for free?

In larger applications, it can take hours or even days to debug a problem related to an unexpected null. What I need is to have a more precise way of communicating information.

I Didn't Load All Data

I avoid partial objects. Instead of creating classes that mirror my database structure and leaving most of their properties empty, I create separate classes for more specific purposes. For example, if I know that I need to display a list of customers with just their names, I could load CustomerName objects:

class CustomerName
{
    private string $firstName;
    private string $lastName;
}

CustomerName forces the properties to be non-null. Designing these objects takes more effort, but they are easier to use. As a bonus, it's now impossible to accidentally not load the data. If I attempt to read $firstName before it has been initialized, PHP throws an Error, which is fatal if uncaught. This is great, because it lets me spot the problem much closer to the root cause. I can push it even further and create a value object, so that it wouldn't even be possible to instantiate it with nulls. Example:

final class CustomerName
{
    private string $firstName;
    private string $lastName;

    public function __construct(string $firstName, string $lastName)
    {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }

    // Only getters, no setters
}
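To make the two failure modes concrete, here is a minimal sketch. UninitializedCustomerName, the getters, and the $row array are hypothetical names added only for the demonstration, and it assumes PHP 7.4 or later for typed properties.

```php
<?php

// Hypothetical stand-in for a class with typed properties but no constructor.
final class UninitializedCustomerName
{
    private string $firstName;

    public function getFirstName(): string
    {
        // Reading a typed property that was never assigned throws an Error.
        return $this->firstName;
    }
}

// Same shape as the value object above, with getters added for the demo.
final class CustomerName
{
    private string $firstName;
    private string $lastName;

    public function __construct(string $firstName, string $lastName)
    {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }

    public function getFirstName(): string
    {
        return $this->firstName;
    }

    public function getLastName(): string
    {
        return $this->lastName;
    }
}

$failures = [];

// Case 1: the data was never loaded, so the first read fails immediately.
try {
    (new UninitializedCustomerName())->getFirstName();
} catch (Error $error) {
    $failures[] = get_class($error);
}

// Case 2: a hypothetical database row with a null first name
// cannot be turned into a CustomerName at all.
$row = ['first_name' => null, 'last_name' => 'Doe'];

try {
    new CustomerName($row['first_name'], $row['last_name']);
} catch (TypeError $error) {
    $failures[] = get_class($error);
}
```

In both cases the failure happens at the line that caused it, not many calls later when something finally reads the null.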

The Database Value is Null

If I use value objects and the $firstName is null in the database, then I will not be able to create an instance of CustomerName. This is a reasonable constraint. The database value in this case should never be null. I should ensure that the first name is never saved incorrectly, because my application can't function otherwise.

For some database columns, it makes sense to have null values, but I won't translate that into an object with a null property. For example, if my $address is optional, then I won't create a Customer object with a nullable $address property. I will take the time to understand the use cases where I need each piece of data and design appropriate objects. There is no rule of thumb, because it depends on the domain.

Nevertheless, here's an example. Let's say that the reason I store an address for customers is so that I could mail them a monthly invoice. Some customers opted to receive their invoices electronically and don't have an address on file. In that case, I might have two invoice classes. Each will have common invoicing data, but different contact information:

class ElectronicInvoice
{
    private Invoice $invoice;
    private Email $email;
}

class MailInvoice
{
    private Invoice $invoice;
    private Address $address;
}

I would instantiate one or the other based on which invoicing method the customer selected. If the customer selected mail invoices, but I don't have an address in the database, then I won't be able to create a MailInvoice object. In this case, I'll have to fix the code where the user selects the invoicing method, to ensure that they can't do that without saving an address. At no stage should my database be inconsistent.
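The choice between the two invoice classes could be sketched as a small factory. Everything here beyond the ElectronicInvoice and MailInvoice names is an assumption made for illustration: the row keys, the createInvoiceFromRow function, the InconsistentCustomerData exception, and the bare-bones Invoice, Email, and Address classes.

```php
<?php

// Hypothetical minimal versions of the types referenced above.
final class Invoice
{
    private int $amountCents;

    public function __construct(int $amountCents)
    {
        $this->amountCents = $amountCents;
    }
}

final class Email
{
    private string $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }
}

final class Address
{
    private string $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }
}

final class ElectronicInvoice
{
    private Invoice $invoice;
    private Email $email;

    public function __construct(Invoice $invoice, Email $email)
    {
        $this->invoice = $invoice;
        $this->email = $email;
    }
}

final class MailInvoice
{
    private Invoice $invoice;
    private Address $address;

    public function __construct(Invoice $invoice, Address $address)
    {
        $this->invoice = $invoice;
        $this->address = $address;
    }
}

// Hypothetical exception for data that contradicts the chosen invoicing method.
final class InconsistentCustomerData extends DomainException
{
}

/**
 * Builds the invoice type matching the customer's choice.
 * Nulls from the (hypothetical) database row stop here.
 *
 * @param array{amount_cents: int, invoicing_method: string, email: ?string, address: ?string} $row
 * @return ElectronicInvoice|MailInvoice
 */
function createInvoiceFromRow(array $row): object
{
    $invoice = new Invoice($row['amount_cents']);

    if ($row['invoicing_method'] === 'mail') {
        if ($row['address'] === null) {
            throw new InconsistentCustomerData('Mail invoicing selected, but no address is on file.');
        }

        return new MailInvoice($invoice, new Address($row['address']));
    }

    if ($row['email'] === null) {
        throw new InconsistentCustomerData('Electronic invoicing selected, but no email is on file.');
    }

    return new ElectronicInvoice($invoice, new Email($row['email']));
}
```

The only place that ever sees a null is the factory, and it either produces a fully populated object or refuses loudly.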

The key here is to think about what I need in my application, as opposed to worrying about how data is stored. I worry about my repositories and database schema at the very end.

I Want a Lighter Serialized Object

I never modify the state of my object for such use cases. If the object is not immediately destroyed, I risk either making an incorrect decision based on its state later on or, worse, saving a null to the database and losing data.

Instead, I prefer to create my own explicit serializers. For example:

class Contact
{
    private string $phone1;
    private string $phone2;
    private string $email;
    private Address $address;

    /**
     * @return array{phone1: string, phone2: string}
     */
    public function toPhonesArray(): array
    {
        return [
            'phone1' => $this->phone1,
            'phone2' => $this->phone2,
        ];
    }
}

The array shape in the @return annotation keeps the output explicit, and static analysis tools like PHPStan and Psalm can check it.

I can then json_encode the result of this method. I prefer this approach over guessing what an all-purpose serializer will produce, or having to configure the serializer for all the edge cases.
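As a usage sketch, here is the Contact class with a constructor added so that it can be instantiated. The constructor, the one-field Address class, and the sample values are assumptions for the demonstration, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php

// Hypothetical minimal Address, just enough to construct a Contact.
final class Address
{
    private string $street;

    public function __construct(string $street)
    {
        $this->street = $street;
    }
}

final class Contact
{
    private string $phone1;
    private string $phone2;
    private string $email;
    private Address $address;

    public function __construct(string $phone1, string $phone2, string $email, Address $address)
    {
        $this->phone1 = $phone1;
        $this->phone2 = $phone2;
        $this->email = $email;
        $this->address = $address;
    }

    /**
     * @return array{phone1: string, phone2: string}
     */
    public function toPhonesArray(): array
    {
        return [
            'phone1' => $this->phone1,
            'phone2' => $this->phone2,
        ];
    }
}

$contact = new Contact('555-0100', '555-0101', 'jane@example.com', new Address('1 Example Street'));

// Only the phones are serialized; the object itself is left untouched.
$json = json_encode($contact->toPhonesArray(), JSON_THROW_ON_ERROR);

echo $json, PHP_EOL; // {"phone1":"555-0100","phone2":"555-0101"}
```

The email and address are still intact on $contact after serialization, so nothing downstream can mistake a trimmed payload for missing data.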

I Assigned the Output of a Buggy Method

I typically prevent these bugs in the first place by throwing an exception instead of returning a null. For example:

try {
    $customer->address = $this->db->findAddress($addressId);
} catch (RecordNotFound $exception) {
    // Do whatever makes sense in this scenario
}

Now there is no way to continue execution without doing something about the missing address. This avoids assigning nulls and then guessing what it means 20 method calls later.
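The other side of that try block might look like the sketch below. The in-memory AddressRepository, its $rows array, and the one-field Address class are hypothetical stand-ins for a real database layer. Only the findAddress and RecordNotFound names come from the example above.

```php
<?php

final class RecordNotFound extends RuntimeException
{
}

// Hypothetical minimal Address for the demonstration.
final class Address
{
    private string $street;

    public function __construct(string $street)
    {
        $this->street = $street;
    }

    public function getStreet(): string
    {
        return $this->street;
    }
}

// Hypothetical in-memory repository standing in for a database.
final class AddressRepository
{
    /** @var array<int, string> */
    private array $rows;

    /**
     * @param array<int, string> $rows Street names keyed by address ID
     */
    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // The return type is Address, not ?Address: a missing record is an exception.
    public function findAddress(int $addressId): Address
    {
        if (!isset($this->rows[$addressId])) {
            throw new RecordNotFound("No address with ID {$addressId}.");
        }

        return new Address($this->rows[$addressId]);
    }
}

$db = new AddressRepository([1 => '1 Example Street']);

$found = $db->findAddress(1);

try {
    $db->findAddress(2);
    $message = 'found';
} catch (RecordNotFound $exception) {
    $message = $exception->getMessage();
}
```

Because findAddress declares a non-nullable return type, the caller never has to wonder whether a null slipped through.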

Conclusion

These are not the only reasons to use a null, but I always try to identify why I need a null and look for a better, more explicit alternative. I want to be able to quickly build a mental model of what's happening in the code, have fewer places where things go wrong, and fail early for easier debugging.