Archive for the ‘Software Engineering’ Category

Null

One of my favorite videos is Null Island from Minute Earth. I frequently link to it when I get into a discussion about whether null is an acceptable value for a certain use case.

What I really like is the definition around null and the focus of null having a concrete definition, that is: “we don’t know”. When used in this way, we have a clear understanding of what null is and the context in which it’s used (whether some field in a relational table, JSON object, value object, etc.) inherits this definition (i.e. “it’s either this value or we don’t know”), yielding something that’s fairly easy to reason about.

When nulls are ill-defined or have multiple definitions, complexity and confusion grow. Null is not:

  • Zero
  • Empty set
  • Empty string
  • Invalid value
  • A flag value for an error

Equating null to any of the above means that if you come across a null, you need to dig deeper into your code or database to figure out what that null actually means.

The flip side of this is avoiding nulls altogether, and there are really 2 cases here:

  • There is no need for null (i.e. we do know what the value is, in every use case)
  • Architect the system such that a null isn’t surfaced

In the first case, null doesn’t fit the use case, so there’s no need for it. When possible, this is ideal, and you avoid the necessity for null checks.

For the second case, architecting this way always seems to involve adding more complexity, to the point where it’s questionable if there’s a net benefit.

The road never built

On reliability, Robert Glass notes the following in Frequently Forgotten Fundamental Facts about Software Engineering:

Roughly 35 percent of software defects emerge from missing logic paths, and another 40 percent are from the execution of a unique combination of logic paths. They will not be caught by 100-percent coverage (100-percent coverage can, therefore, potentially detect only about 25 percent of the errors!).

John Cook dubs these “sins of omission”:

If these figures are correct, three out of four software bugs are sins of omission, errors due to things left undone. These are bugs due to contingencies the developers did not think to handle.

I can’t validate the statistics, but they do ring true to my experiences as well; simply forgetting to handle an edge case, or even a not-so-edge case, is something I’ve done more often than I’d like. I’ve introduced my fair share of “true bugs”, code paths that fails to do what’s intended, but with far less frequency.

The statistics also hint at the limitations of unit testing, as you can can’t test a logic path that doesn’t exist. While you can push for (and maybe even attain) 100% code coverage, it’s a fuzzy metric and by no means guarantees error-free functionality.

Code Coverage