"The potential for catastrophe is actually found in the normal functioning of complex systems."
- Malcolm Gladwell
I'm going to go out on a limb and predict that the ultimate culprit in the sudden-acceleration Toyota cases making the news is-you guessed it-the electronics. I say this for three reasons. First, what's with that story that the gas pedal sticks to the floor? Unless you're in a race or in extremis, who pushes the pedal to the proverbial metal? And even if you do, what's it made of, Velcro?
Second, my car was made by Toyota under a more expensive name, and it suddenly decelerates, usually during a turn. The service manager blamed it on "drive by wire" which basically means "computer controlled." He didn't ask me if I wanted to have it fixed or report the problem to the manufacturer. I guess I am lucky that-so far-it hasn't suddenly accelerated instead. As it is, I never cut left turns too close.
The third reason is that a different news story also caught my eye. At first, I thought they were dredging up the old story about patients killed in the '90s by a radiation machine, when a combination of user actions and software flaws accidentally ramped up the dosage. But, it turns out, this news is current, and more patients have been injured or killed recently for a different but similar reason. So much for once burned, twice shy.
The insinuation of technology into every aspect of our lives means that its inherent risks are here to stay. As pointed out by Malcolm Gladwell in his recent anthology, What the Dog Saw: And Other Adventures, accidents may be the inescapable result of the complexity of technological systems.
I've already come to the conclusion that large enterprise IT environments contain so many variables-especially the possible combinations between systems and users-that you could not test them all in your lifetime and that of your children. But where does that leave us? We can't just give up, and we certainly shouldn't where lives are involved. What do we do?
First, we need to lose some basic attitudes:
- Forget quality. In fact, lose the moniker "quality assurance" altogether, because it is too nebulous and creates unrealistic expectations. The word "quality" encompasses too many attributes to be a valid measure, and, frankly, testers can't "assure" quality or much else for that matter. Customers love and use flawed systems all the time because their benefits outweigh their issues. After all, I'm still driving my car.
- Forget defects. I've never believed that the goal of testing is to remove defects, because that requires us to prove a negative-i.e., that there are no more defects. And there are defects that no one cares about, because they never encounter them or, when they do, they are easy to overcome. Almost every shop I know of has a long, long list of reported defects that will never be fixed because other things are more important.
- Expect disaster. How many times has there been a blowup of major proportions despite our best efforts to test all that we could? I'm sure the makers of the Toyotas and the radiation machines exercised extreme care in the testing of their systems, yet people still died. In the more common case, the best-designed and best-executed test plans can still leave behind the potential for financial or operational catastrophe. We often hope for the best but fail to plan for the worst.
Then, we need to adopt new attitudes:
- Focus on risk reduction. Instead of trying to make sure systems work, identify the greatest