Every experienced tester has a painful story about the infrastructure upgrade that blew up a development project—or worse, a critical production system. Remember the minor operating system upgrade that was going to be so “transparent” that it wouldn’t need testing? Or that “trivial” compiler upgrade?
I’ve lived through a few of these, not one of which took less than three times its planned duration. Hard experience teaches testers to be wary of any infrastructure upgrade. Nevertheless, we don’t have to test everything in the affected applications. Instead, we need to identify and address the actual risks in our testing.
One of my consulting clients urgently needed to upgrade the unsupported operating system under its mission-critical merchandising system. That, in turn, would force the client to upgrade its database engine, compiler, and report generator for mutual compatibility.
The retail business depended daily on the merchandising system and required frequent enhancements to it. Taking time for a platform upgrade that would add no business value worried IT. How could they ensure that the merchandising system would function identically after the upgrade? How could they complete the essential upgrade work before the business resumed clamoring for critical enhancements? The proposed testing alone was going to require more time than the business could afford.
IT management asked me to review the upgrade test strategy with two goals: ensure that it adequately addressed the risks, and find any possible ways to minimize the required testing time.
Everyone agreed that good testing would be critical for success, but the test team was not well positioned to tackle the upgrade. There were no predesigned regression tests for the merchandising system and no automated scripts for any of the applications. Nine of the ten team members, including the test manager, had worked for the company less than a year, and only one had any previous retail domain experience. The team was still learning the applications.
Understandably, the test manager had approached the infrastructure upgrade conservatively, with a test strategy to develop and execute a comprehensive black-box regression test suite. The primary focus would be on the merchandising system, and the tests would also cover major flows through the entire enterprise integration. As in many Y2K projects, the team would capture “before” results for comparison after the upgrades.
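The before-and-after comparison can be sketched in a few lines of Python. This is only an illustration of the general idea, not the client's actual tooling: the function names (`fingerprint`, `capture_baseline`, `compare_to_baseline`) and the sample report names and contents are all hypothetical, and real outputs would usually need normalizing (timestamps, run IDs) before they could be compared this way.

```python
import hashlib
from typing import Dict, List


def fingerprint(output: str) -> str:
    """Return a stable digest of one captured output, such as a report's text."""
    return hashlib.sha256(output.encode("utf-8")).hexdigest()


def capture_baseline(outputs: Dict[str, str]) -> Dict[str, str]:
    """Record one digest per named output (hypothetical names) before the upgrade."""
    return {name: fingerprint(text) for name, text in outputs.items()}


def compare_to_baseline(
    baseline: Dict[str, str], outputs: Dict[str, str]
) -> Dict[str, List[str]]:
    """Compare post-upgrade outputs against the pre-upgrade baseline."""
    after = capture_baseline(outputs)
    return {
        # Present in both runs, but the content differs.
        "changed": sorted(
            n for n in baseline if n in after and after[n] != baseline[n]
        ),
        # Produced before the upgrade, absent afterward.
        "missing": sorted(n for n in baseline if n not in after),
        # Produced only after the upgrade.
        "new": sorted(n for n in after if n not in baseline),
    }


# Hypothetical sample data standing in for real captured reports.
before = {"price_report": "SKU1 9.99\nSKU2 4.50", "stock_report": "SKU1 12"}
after = {"price_report": "SKU1 9.99\nSKU2 4.50", "stock_report": "SKU1 13"}

baseline = capture_baseline(before)
diff = compare_to_baseline(baseline, after)
print(diff)  # {'changed': ['stock_report'], 'missing': [], 'new': []}
```

Anything listed under `changed` or `missing` is a candidate regression to investigate. The sketch also hints at why the approach is expensive: someone still has to decide which outputs to capture and build the inputs that produce them.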
I saw long-term advantages in this strategy. It would give the team a good basis for regression testing future enhancements and infrastructure upgrades impacting the merchandising system. It would also be a tremendous learning opportunity, allowing the testers to build deep knowledge of their company’s most critical systems. But it would be costly in the short term. The test development work alone would be a mammoth effort and—even less acceptable in a fast-moving business—testing would take a long time, during which no major application upgrades could be applied. Only a few testers would be available for much of the work; most were fully committed on other business-critical projects.
How could all that work and time be justified for this project alone? There had to be faster and cheaper ways to address the risks the multiple upgrades posed to the merchandising system.
As I explored the project with the project manager, test manager, DBA, and other members of the upgrade team, it became clear that everyone was assuming waterfall sequencing and project responsibilities sharply divided by specialty. The test team felt responsible for all the testing. The test strategy was based on employing only the testers’ existing skills and waiting until late in the process for access to systems to test.