Preface

We all use Null a lot. Admittedly, it is convenient, efficient, and fast, yet we have run into countless problems using it. We already recognize the problem, so which cognitive biases keep us from addressing it?

What does Null mean?

Null is just a flag. It can mean different things depending on the context in which it is used and invoked.

This leads to the worst mistake in software development: hiding a decision inside the contract between an object and whoever uses it, which couples the two.
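To make the ambiguity concrete, here is a minimal sketch. The function `findDiscount` and the discount codes are hypothetical names invented for this illustration. Three different situations all come back as the same Null, and the caller cannot tell them apart.

```php
<?php
// Hypothetical lookup: three branches return null, each for a different reason.
function findDiscount(array $discounts, ?string $code): ?float
{
    if ($code === null) {
        return null; // the customer entered no code
    }
    if (!array_key_exists($code, $discounts)) {
        return null; // the code is unknown
    }
    return $discounts[$code]; // may itself be null: discount not configured yet
}

$discounts = ['SPRING' => 0.10, 'SUMMER' => null];

var_dump(findDiscount($discounts, null));     // NULL
var_dump(findDiscount($discounts, 'WINTER')); // NULL
var_dump(findDiscount($discounts, 'SUMMER')); // NULL
var_dump(findDiscount($discounts, 'SPRING')); // float(0.1)
```

The meaning of each Null lives only in the comments, that is, in a hidden agreement between the function and its callers.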

Coupling: The only software design problem

As if this flaw were not enough, Null also breaks our only design rule: it uses the same entity to represent multiple elements of the domain, so it has to be interpreted differently in each context.

The only software design principle

Good software design calls for high cohesion. Every object should be as specific as possible and have a single responsibility (the S in SOLID).

The least cohesive object in any system is the wildcard: Null

NULL can map to several different concepts in the real world

Catastrophic failure

Null is not polymorphic with any object, so any method invoked on it fails and breaks the chain of subsequent calls.
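A short sketch of that failure, assuming PHP 7 or later. The class `Interaction` and the function `interactionOrNull` are hypothetical names used only here.

```php
<?php
final class Interaction
{
    public function propagate(string $virus): string
    {
        return "propagating $virus";
    }
}

// Hypothetical factory that sometimes has nothing to return.
function interactionOrNull(bool $peopleMeet): ?Interaction
{
    return $peopleMeet ? new Interaction() : null;
}

// The object answers the message.
echo interactionOrNull(true)->propagate('covid19'), PHP_EOL;

try {
    // Null does not understand propagate(), so the call chain stops here.
    echo interactionOrNull(false)->propagate('covid19'), PHP_EOL;
} catch (Error $e) {
    echo 'Failed: ', $e->getMessage(), PHP_EOL;
}
```

The second call raises an `Error` because Null does not answer the same protocol as `Interaction`.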

Example 1: Let’s use a model to simulate the interactions between people during the current COVID-19 pandemic.

    // Virus, Person, random() and meetingProbability() are assumed to be defined elsewhere.
    final class City {
        public function interactionBetween($somePerson, $anotherPerson) {
            if ($this->meetingProbability() < random()) {
                return null; // No interaction
            } else {
                return new PersonToPersonInteraction($somePerson, $anotherPerson);
            }
        }
    }

    final class PersonToPersonInteraction {
        private $somePerson;
        private $anotherPerson;

        public function __construct($somePerson, $anotherPerson) {
            $this->somePerson = $somePerson;
            $this->anotherPerson = $anotherPerson;
        }

        public function propagate($aVirus) {
            if ($this->somePerson->isInfectedWith($aVirus) && $aVirus->infectionProbability() > random()) {
                $this->anotherPerson->getInfectedWith($aVirus);
            }
        }
    }

    $covid19 = new Virus();
    $janeDoe = new Person();
    $johnSmith = new Person();
    $wuhan = new City();

    $interaction = $wuhan->interactionBetween($johnSmith, $janeDoe);
    if ($interaction !== null) {
        $interaction->propagate($covid19);
    }

    /* In this example, we simulate the interaction between an infected person and a healthy person.
       Jane is healthy, but she could become infected depending on the virus's R0. */

We see two Null flags and the corresponding if clause. It looks like Null propagation is under control, but that’s just an illusion.

Historical background

Null was created by accident in 1965.

Tony Hoare is the creator of the quicksort algorithm and a winner of the Turing Award, often called the Nobel Prize of computing. He added Null to the ALGOL W language because it seemed practical and easy to implement. Decades later, he regretted it.

This story is recounted in great detail in the following article:

Null References: The Billion Dollar Mistake

I call it my billion-dollar mistake… At that time, I was designing the first comprehensive type system for references in an object-oriented language. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. – Tony Hoare, designer of ALGOL W

Readers can also watch the full video.

Excuses

As developers, we use Null because it is easy to use (easy to write) and we believe it increases the efficiency of our software.

This is a misconception: it ignores the fact that a piece of code is written once but read ten times or more.

Reading code with Null is particularly difficult. So we’re just kicking the can down the road.

Then there is efficiency, the most common excuse for introducing coupling. Except in very specific and critical cases, the performance penalty of avoiding Null is negligible. Using Null for speed only makes sense in systems that prioritize efficiency over readability, adaptability, and maintainability (there are always trade-offs between quality attributes).

This cognitive bias persists over time, even though modern virtual machines now optimize code for us.

To rely on evidence rather than intuition, we just need to start benchmarking rather than insisting on the false claim that efficiency is more important than readability.
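A benchmark along these lines could look like the sketch below. The pricing classes (`Discount`, `TenPercentOff`, `NoDiscount`) and the helper `elapsedMs` are hypothetical and exist only for this comparison. Timings depend on the PHP version and the machine, so no expected numbers are given.

```php
<?php
interface Discount
{
    public function apply(float $price): float;
}

final class TenPercentOff implements Discount
{
    public function apply(float $price): float
    {
        return $price * 0.9;
    }
}

final class NoDiscount implements Discount
{
    public function apply(float $price): float
    {
        return $price;
    }
}

// Style 1: absence is null, so the loop needs a check.
function totalWithNullChecks(array $discounts, float $price): float
{
    $total = 0.0;
    foreach ($discounts as $discount) {
        $total += $discount !== null ? $discount->apply($price) : $price;
    }
    return $total;
}

// Style 2: absence is a NoDiscount object, so the loop has no branch.
function totalWithNullObjects(array $discounts, float $price): float
{
    $total = 0.0;
    foreach ($discounts as $discount) {
        $total += $discount->apply($price);
    }
    return $total;
}

// Returns the elapsed milliseconds for one call of $work.
function elapsedMs(callable $work): float
{
    $start = hrtime(true); // nanoseconds, available since PHP 7.3
    $work();
    return (hrtime(true) - $start) / 1e6;
}

$withNulls = [];
$withObjects = [];
for ($i = 0; $i < 200000; $i++) {
    $hasDiscount = $i % 2 === 0;
    $withNulls[] = $hasDiscount ? new TenPercentOff() : null;
    $withObjects[] = $hasDiscount ? new TenPercentOff() : new NoDiscount();
}

printf("null checks:  %.2f ms\n", elapsedMs(function () use ($withNulls) {
    totalWithNullChecks($withNulls, 100.0);
}));
printf("null objects: %.2f ms\n", elapsedMs(function () use ($withObjects) {
    totalWithNullObjects($withObjects, 100.0);
}));
```

A single run like this is noisy. Repeating the measurement several times and comparing medians gives a fairer picture before drawing any conclusion.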

Fail fast

Null is used (or abused) to cover up unexpected situations, allowing errors in code to spread too far, creating frightening ripple effects.

One of the principles of good design is to fail fast.

Let’s look at another example: Suppose we have a patient data sheet, and we want to fill in the date of birth of the patient.

If there is an error in the visual component or in object creation, the patient record may be built with a null birth date.

When a nightly batch job collects every patient’s birth date to calculate the average age, the record with the null birth date raises an error.

The stack trace that gives the developer useful information will be far from where the defect was introduced. Happy debugging!

To make matters worse, the data may have travelled across systems written in different programming languages, through APIs, files, and so on.

A developer’s worst nightmare is having to debug this bug late at night and try to figure out the root cause of the problem.
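For the case where a birth date is mandatory, failing fast could look like the following sketch. The `Patient` class, its `fromForm` factory, and the field names are assumptions made for this illustration, not part of the original example.

```php
<?php
final class Patient
{
    private string $name;
    private DateTimeImmutable $birthDate;

    public function __construct(string $name, DateTimeImmutable $birthDate)
    {
        $this->name = $name;
        $this->birthDate = $birthDate;
    }

    // Hypothetical factory for raw form input: rejects a missing date immediately.
    public static function fromForm(array $fields): self
    {
        $name = $fields['name'] ?? 'unknown patient';
        if (empty($fields['birthDate'])) {
            throw new InvalidArgumentException("Birth date is required for $name");
        }
        return new self($name, new DateTimeImmutable($fields['birthDate']));
    }

    // Completed years between the birth date and the given day.
    public function ageInYearsOn(DateTimeImmutable $day): int
    {
        return $this->birthDate->diff($day)->y;
    }
}

try {
    Patient::fromForm(['name' => 'Jane Doe', 'birthDate' => '']);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL; // Birth date is required for Jane Doe
}
```

The error now surfaces where the form is processed, not hours later inside the batch job.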

Typed languages that do not handle optional content

Most typed languages prevent errors in the same way: they ensure that an object passed as a parameter (or returned) conforms to a given protocol.

Unfortunately, some of these languages have started to backslide, allowing objects to be declared of a certain type and optionally Null.

This breaks the chain of calls and forces developers to write ifs to handle absent objects, which violates the open/closed principle.
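In PHP, for instance, a type can be declared nullable with a leading `?`. The sketch below uses hypothetical `Customer` and `Address` classes to show how that declaration pushes an if onto every caller.

```php
<?php
final class Address
{
    private string $city;

    public function __construct(string $city)
    {
        $this->city = $city;
    }

    public function city(): string
    {
        return $this->city;
    }
}

final class Customer
{
    private ?Address $address;

    // The declared type is "Address or Null".
    public function __construct(?Address $address = null)
    {
        $this->address = $address;
    }

    public function address(): ?Address
    {
        return $this->address;
    }
}

function shippingCity(Customer $customer): string
{
    $address = $customer->address();
    if ($address === null) { // every caller has to remember this branch
        return 'unknown';
    }
    return $address->city();
}

echo shippingCity(new Customer(new Address('Wuhan'))), PHP_EOL; // Wuhan
echo shippingCity(new Customer()), PHP_EOL;                     // unknown
```

Each new function that touches `address()` needs the same check, so adding behavior means adding more ifs.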

More importantly, Null breaks type control. If we use a typed language and trust the compiler as our line of defense, Null penetrates it like a virus and spreads to the other types, as the article below describes.

Worst mistake in the history of computer science. – Lucidchart

The solution

Don’t use it.

Alternatives

As always, in order to solve all our problems, we should stick to what we believe to be the only axiomatic design rule.

Search for solutions in the problem domain and bring them into our model.

Model absence polymorphically

In cases like the ones above, where an object must declare a type, there are more elegant ways to model optional content than using ifs.

In class-based languages, it is enough to apply the Null Object design pattern in a sibling of the concrete class and declare the superclass as the type of the collaborator, following the Liskov substitution principle (the L in SOLID).

However, if we implement this solution, we violate another design principle: we should subclass only for essential reasons, not to reuse code or adjust the class hierarchy.

The best solution in a class-based language is to declare an interface that both the real class and the null object class must implement.

Applied to the first example:

    // Virus, Person and random() are assumed to be defined elsewhere.
    interface SocialInteraction {
        public function propagate($aVirus);
    }

    final class SocialDistancing implements SocialInteraction {
        public function propagate($aVirus) {
            // Do nothing!
        }
    }

    final class PersonToPersonInteraction implements SocialInteraction {
        private $somePerson;
        private $anotherPerson;

        public function __construct($somePerson, $anotherPerson) {
            $this->somePerson = $somePerson;
            $this->anotherPerson = $anotherPerson;
        }

        public function propagate($aVirus) {
            if ($this->somePerson->isInfectedWith($aVirus) && $aVirus->infectionProbability() > random()) {
                $this->anotherPerson->getInfectedWith($aVirus);
            }
        }
    }

    final class City {
        public function interactionBetween($aPerson, $anotherPerson) {
            // This city is smart enough to implement social distancing
            return new SocialDistancing();
        }
    }

    $covid19 = new Virus();
    $janeDoe = new Person();
    $johnSmith = new Person();
    $wuhan = new City();

    $interaction = $wuhan->interactionBetween($johnSmith, $janeDoe);
    $interaction->propagate($covid19);

    /* Jane will not be infected because this interaction does not propagate the virus. */

The virus does not spread, and there are no ifs and no Nulls!

In this case, we replace Null with a concept that, unlike Null, exists in the problem domain.

Let’s look at the birth date example

Let’s go back to the patient form example. Even if the corresponding fields in the table are missing, we still have to calculate the average.

    interface Visitable {
        public function accept($aVisitor);
    }

    final class Date implements Visitable {
        public function accept($aVisitor) {
            $aVisitor->visitDate($this);
        }
    }

    final class DateNotPresent implements Visitable {
        public function accept($aVisitor) {
            $aVisitor->visitDateNotPresent($this);
        }
    }

    final class AverageCalculator {
        private $count = 0;
        private $ageSum = 0;

        public function visitDate($aDate) {
            $this->count++;
            // today() and the subtraction stand in for real date arithmetic
            $this->ageSum += today() - $aDate;
        }

        public function visitDateNotPresent($aDate) {
            // A missing date does not contribute to the average
        }

        public function average() {
            if ($this->count == 0) {
                return 0;
            } else {
                return $this->ageSum / $this->count;
            }
        }
    }

    function averagePatientsAge($patients) {
        $calculator = new AverageCalculator();
        foreach ($patients as $patient) {
            $patient->birthDate()->accept($calculator);
        }
        return $calculator->average();
    }

We use the Visitor pattern to traverse objects, some of which behave as null objects.

Not Null

In addition, we use polymorphism to remove non-essential ifs, and thanks to the open/closed principle the solution stays open to calculations other than the average.

The result is a solution that is less algorithmic and more declarative, maintainable, and extensible.
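As an illustration of that extensibility, the sketch below adds a second calculation without touching the date classes. `MissingDateCounter` is a hypothetical visitor invented for this example. `Visitable`, `Date`, and `DateNotPresent` are repeated from the example above so the snippet runs on its own.

```php
<?php
interface Visitable
{
    public function accept($aVisitor);
}

final class Date implements Visitable
{
    public function accept($aVisitor)
    {
        $aVisitor->visitDate($this);
    }
}

final class DateNotPresent implements Visitable
{
    public function accept($aVisitor)
    {
        $aVisitor->visitDateNotPresent($this);
    }
}

// Hypothetical new visitor: counts how many records lack a birth date.
final class MissingDateCounter
{
    private $missing = 0;

    public function visitDate($aDate)
    {
        // A date that is present is not counted.
    }

    public function visitDateNotPresent($aDate)
    {
        $this->missing++;
    }

    public function missing()
    {
        return $this->missing;
    }
}

$birthDates = [new Date(), new DateNotPresent(), new DateNotPresent()];

$counter = new MissingDateCounter();
foreach ($birthDates as $birthDate) {
    $birthDate->accept($counter);
}

echo $counter->missing(), PHP_EOL; // 2
```

The new behavior lives entirely in the new class, and neither `Date` nor `DateNotPresent` had to change.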

Use languages that explicitly model absence

Some languages support optional values through a Maybe/Optional type, which is a special case of the solution above implemented at the language level.
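PHP has no such built-in type, so the following is only a minimal hand-written sketch of the idea. The names `Maybe`, `Just`, `Nothing`, `map`, `getOrElse`, and `findBirthYear` are assumptions chosen for the illustration, loosely following the vocabulary other languages use.

```php
<?php
// Hypothetical minimal Maybe type: a value that is either present or absent.
interface Maybe
{
    public function map(callable $transform): Maybe;
    public function getOrElse($default);
}

final class Just implements Maybe
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function map(callable $transform): Maybe
    {
        return new Just($transform($this->value));
    }

    public function getOrElse($default)
    {
        return $this->value;
    }
}

final class Nothing implements Maybe
{
    public function map(callable $transform): Maybe
    {
        return $this; // nothing to transform
    }

    public function getOrElse($default)
    {
        return $default;
    }
}

// Hypothetical lookup that returns a Maybe instead of Null.
function findBirthYear(array $years, string $name): Maybe
{
    return array_key_exists($name, $years) ? new Just($years[$name]) : new Nothing();
}

$years = ['Jane Doe' => 1990];
$ageIn2020 = function ($year) {
    return 2020 - $year;
};

echo findBirthYear($years, 'Jane Doe')->map($ageIn2020)->getOrElse('unknown'), PHP_EOL;   // 30
echo findBirthYear($years, 'John Smith')->map($ageIn2020)->getOrElse('unknown'), PHP_EOL; // unknown
```

`Just` and `Nothing` answer the same protocol, so the call chain never breaks and the caller decides explicitly what absence means.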

Conclusion

Long experience in our industry has taught us that using Null should be discouraged. Nevertheless, almost all commercial languages support it and developers keep using it.

We should at least start questioning its use and be more mature and responsible in developing software.

One goal of this series of articles is to open up the space for debate and discussion about software design. We look forward to reading the comments on this article.

By Maximiliano Contieri

Translator: Step

The original address: https://www.infoq.cn/article/UYYOS0VgETwcGmO1pH07