NASA’s recent DNSSEC snafu and the checklist

February 16th, 2012 by kc

Reading about NASA’s recent DNSSEC snafu, and especially Comcast’s impressively cogent description of what went wrong (i.e., a mishap that seems way too easy to ‘hap’), I’m reminded of the page I found most interesting in The Checklist Manifesto:

We’re obsessed in medicine with having great components — the best drugs, the best devices, the best specialists — but pay little attention to how to make them fit together well. Berwick (president of the Institute for Healthcare Improvement in Boston) notes, “Anyone who understands systems will know immediately that optimizing parts is not a good route to system excellence”… He gives the example of a famous thought experiment of trying to build the world’s greatest car by assembling the world’s greatest car parts. We connect the engine of a Ferrari, the brakes of a Porsche, the suspension of a BMW, the body of a Volvo. “What we get, of course, is nothing close to a great car; we get a pile of very expensive junk.”

Nonetheless, in medicine that’s exactly what we have done. We have a $30B/year National Institutes of Health, which has been a remarkable powerhouse of medical discoveries. But we have no National Institute of Health Systems Innovation alongside it studying how best to incorporate these discoveries into daily practice — no NTSB equivalent swooping in to study failures the way crash investigators do, no Boeing mapping out the checklists, no agency tracking the month-to-month results.

The same can be said in numerous other fields. We don’t study routine failures in teaching, in law, in government programs, in the financial industry, or elsewhere. We don’t look for the patterns of our recurrent mistakes or devise and refine potential solutions for them.

But we could, and that is the ultimate point. We are all plagued by failures — by missed subtleties, overlooked knowledge, and outright errors. For the most part, we have imagined that little can be done beyond working harder to catch the problems and clean up after them. We are not in the habit of thinking the way army pilots did as they looked upon their shiny new Model 299 bomber — a machine so complex no one was sure human beings could fly it. They too could have decided just to “try harder” or to dismiss a crash as the failings of a “weak” pilot.

Instead they chose to accept their fallibilities. They recognized the simplicity and power of using a checklist.

And so can we. Indeed, against the complexity of the world, we must. There is no other choice. When we look closely, we recognize the same balls being dropped over and over, even by those of great ability and determination. We know the patterns. We see the costs. It’s time to try something else.

Try a checklist.

The Checklist Manifesto, Atul Gawande.

More on this topic the next time it happens, which I reckon won’t be too far into the future. (Cringe.)

