your bills years ago and have written you off as a deadbeat. You decide to watch the Rose Bowl game on TV, but there is some kind of glitch with a communications satellite.
At least you can still receive postal mail. Your mail today includes a letter from your employer, saying that the office will be closed indefinitely until the organization resolves unexpected, emergency computer problems. Your employer will contact you when the offices are open again. No word about how (or if) you will be paid in the meantime.
The TV news has coverage of a nuclear power plant that has exploded. Something about problems with an automated system that controls routine maintenance at the nuclear plant: it became confused about the date and shut off the flow of coolant to the reactor. There is another story on the news about senior citizens begging for food in freezing weather, because their government pension checks have not arrived.
What nightmarish day is this anyway? Saturday, January 1, 2000.
The Real Problem
The Y2K fixes, while numerous, were in themselves straightforward and low risk, perhaps 1 or 2 on a scale of 1 to 10 of the difficulty of software fixes.
Various crackpots devised magic solutions: automated tools that they claimed could scan through hundreds of thousands of lines of existing source code per hour, identify Y2K problems and automatically repair them for a cost measured in pennies.
Why do I call the tool developers crackpots? Not because the tools failed: several of these solutions actually worked, though the results still required laborious manual double-checking. It's because making the repairs, while essential, was far from being the biggest problem. The more important issue was the unseen, unintended side effects of those repairs. In many organizations, the regression testing consumed the largest part of the budget.
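To make the idea of an unintended side effect concrete, here is a minimal sketch of one well-known style of Y2K repair, date windowing, in which a two-digit year is expanded to four digits around a pivot. The pivot value of 50 and the names `expand_year` and `age_in` are assumptions chosen for illustration; they are not taken from any particular tool or project discussed here.

```python
PIVOT = 50  # assumed pivot: 00-49 means 20xx, 50-99 means 19xx


def expand_year(yy: int, pivot: int = PIVOT) -> int:
    """Expand a two-digit year to four digits using a fixed window."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 2000 + yy if yy < pivot else 1900 + yy


def age_in(current_year: int, birth_yy: int) -> int:
    """Hypothetical downstream routine that relies on the repaired dates."""
    return current_year - expand_year(birth_yy)


# The repair behaves as intended for recent transaction dates.
print(expand_year(99))   # 1999
print(expand_year(0))    # 2000

# Side effect: a pensioner born in 1930 is now treated as born in 2030.
print(age_in(2000, 30))  # -30
```

The repair itself is a few lines long and easy to verify in isolation. The defect only shows up in a routine that nobody touched, which is why the regression testing, not the fix, tended to dominate the effort.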
The real problem was (and still is) the maintainability of the code. Code may have originally been written with little thought given to its maintainability; for example, the structure may not have been particularly modular and logical.
Old code often runs in an obsolete technical infrastructure (so that very few if any people can be found who understand it); has documentation that is obsolete, missing or incomprehensible; or has been patched over the years (with patches on top of patches). The code no longer has an architecture, but seems more like a murky blob of entangled spaghetti, where everything connects to everything else, so that it's highly likely that a simple change will cause a defect to propagate to unrelated parts of the system. And with ongoing turnover of the people who use the system and who are assigned to maintain it, nobody really understands how the old code works.
Software entropy says that the reliability of software degrades over time: after code reaches a certain age, each modification is likely to insert more new defects than it removes. If the original design was informal and maintenance practices have been casual, it usually takes software only a few years to reach this point.
Testing does not catch everything, especially when performed under hectic deadlines. Based on audits of the efficacy of Y2K fix projects, Capers Jones of Software Productivity Research estimates that for every one hundred Y2K fixes, seven new defects (7%) were introduced that still remain: they have not yet been found and removed, despite all the regression testing. Many of these Y2K-introduced defects will not be uncovered for years.
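As a rough sketch of what that ratio implies at scale, the arithmetic can be written out as follows. The function name and the fix count of 10,000 are hypothetical, chosen only to illustrate the calculation; the rate of seven per hundred is the estimate quoted above.

```python
def estimated_residual_defects(fixes: int, rate_per_hundred: int = 7) -> float:
    """Estimate latent defects left behind by a given number of fixes."""
    if fixes < 0:
        raise ValueError("fixes must be non-negative")
    return fixes * rate_per_hundred / 100


# Hypothetical organization that made 10,000 Y2K fixes.
print(estimated_residual_defects(10_000))  # 700.0
```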
By 1998 or 1999, testers could not do much about the real Y2K problem: mediocre to poor software maintainability. But we may be able to influence future system design and maintenance practices. We know how