Taming the Torrent

I've been in the software industry long enough to remember when one release a year was the norm. Except for emergency fixes, which were clustered shortly after the release, intended changes proceeded at a stately annual pace. But recently, the QA director of a major electronics manufacturer calculated that they promoted software into production multiple times per day.

This fire hose effect is attributable partly to the overall decrease in cycle time brought about by new techniques and technology, partly to the sheer complexity of the operating environment, and partly to the frequent emergencies caused by unexpected change impact. As each day goes by, there are more applications, more functionality, and, especially, more risk.

What makes this extra scary is that every tester I know is struggling just to skim the surface of the latest additions; comprehensive regression testing is a distant dream. Even if they knew what all the accumulated requirements were—and they don't—they have only a fraction of the time and resources needed to test against them, and no realistic prospect of getting more.

The situation screams for automation, but historically automation has not been able to keep pace with rapid change. Changes are rarely communicated far enough in advance (if they are communicated at all) to update test scripts, and the maintenance effort itself is typically too time-consuming to keep up.

It seems hopeless—but it doesn't have to be.

The key to managing constant change on a daily basis is to streamline the process by targeting your efforts based on potential risk and then to automate everything you can. This allows you to devote your limited time and resources to removing as much risk as efficiently as possible. Of course, it sounds easier than it is, but it is achievable if you follow these steps:

1. Identify critical functionality: Granted, the odds are slim to none that you have anything approaching comprehensive, current requirements, but that is no excuse for not taking steps to start collecting the most critical ones. The best place to start is by analyzing production: What business processes drive operations? Which are the most common, which are the most important, and which carry the highest exposure? Use system logs to glean what processes are running, when, and how often (the first sketch after this list shows one way to mine them). Talk to the business and gain an understanding of which functionality—or lack thereof—keeps them up at night. Don't let the scale of the problem deter you from doing the best you can—there is no time like the present to get started. The problem is only going to get worse as time passes. Set yourself an achievable goal—say, the top five or ten most critical processes—and get a handle on those first. Then, systematically work your way down the list. You'll probably never reach the bottom, but removing some risk is better than removing none.

2. Implement change detection: Software is promoted into production because something changed. The critical question is, what changed and where? Was it a business process, the code, the data, an external interface, security settings, the operating environment, or all of these? Automate the detection wherever you can, because relying on human intervention is tricky at best: developers may not think the code they tweaked poses any risk, or users may not realize the master data they changed has far-ranging impact. Depending on your application and technology landscape, there may be tools that can compare before-and-after environments for you. Generally available commercial database tools can be used to look for changes to schemas or content, and source-control systems can flag code modifications. Search for these tools; they are worth finding and using (a simple sketch of this kind of detection feeding impact analysis follows this list). If tools don't exist for some or all of your landscape, then institute manual processes and enforce them without mercy. Allow no changes to be introduced that are not reported. In the best of worlds, a change-control board would monitor and approve all requests for changes first, but in the real world you will have to rely on individual contributors to color inside the lines.

3. Perform impact analysis: Once you know what changed, you need to know what is affected. A critical business process can be adversely impacted by any number of changes, not all of them obvious but any single one potentially problematic. For instance, database changes can trip up any related application, an interface can have upstream and downstream implications, and so forth. Again, an automated approach is ideal. Storing your test assets in a relational repository enables impact analysis because the relationships can be queried; the sketch after this list shows the idea. If you know where components or data are referenced in a test or process, you can make the updates rapidly enough to keep pace. The goal is to know which tests target the areas most likely to be affected by changes so they can be selected for execution.

4. Automate test execution: Automation is vital to keeping up, simply because the frequency and timing of changes rarely permit the overhead of manual testing. Without a beefy team of testers standing by 24-7, it is a practical impossibility to execute even the most streamlined test cycle by hand. The beauty of automation is that it can be executed on demand—even overnight—and quickly enough to catch errors so that an extra fix-test cycle can be performed when needed. The downside, as previously noted, is that most automated tests are too cumbersome to maintain quickly enough; making changes to complex script code introduces as much risk as application changes. The solution is to adopt an automation strategy that implements tests as data instead of code and, whenever possible, stores that data in a database instead of flat files; the last sketch after this list shows the idea in miniature. This allows you to use the power of a relational repository to automate impact analysis and updates.
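
To make step 1 concrete, here is a minimal sketch of mining system logs for the most frequently executed business processes. The log format and the process-name pattern are assumptions made purely for illustration; substitute whatever your own systems actually record.

```python
# A minimal sketch of ranking business processes by how often they appear
# in a system log. The log format assumed below is hypothetical:
#   2024-01-15 02:10:44 INFO process=CreateSalesOrder user=jdoe status=OK
import re
from collections import Counter

PROCESS_PATTERN = re.compile(r"process=(\w+)")

def rank_processes(log_path: str, top_n: int = 10) -> list[tuple[str, int]]:
    """Count how often each business process appears and return the top N."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            match = PROCESS_PATTERN.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts.most_common(top_n)

if __name__ == "__main__":
    for name, hits in rank_processes("app_server.log"):
        print(f"{name:30} {hits:>8}")
```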
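
Steps 2 and 3 reinforce each other: whatever change detection reports becomes the input to impact analysis. The sketch below assumes a hypothetical setup in which database schema snapshots are exported as JSON files and test assets live in a SQLite repository that links tests to the components they touch; every file, table, and component name here is an illustrative assumption, not a reference to any particular tool.

```python
# A minimal sketch of steps 2 and 3 together: detect which tables changed
# by comparing two schema snapshots, then ask the test-asset repository
# which tests reference those tables. All names are illustrative assumptions.
import json
import sqlite3

def changed_tables(before_path: str, after_path: str) -> list[str]:
    """Compare two schema snapshots saved as JSON (table name -> column list)."""
    with open(before_path, encoding="utf-8") as f:
        before = json.load(f)
    with open(after_path, encoding="utf-8") as f:
        after = json.load(f)
    return sorted(t for t in set(before) | set(after) if before.get(t) != after.get(t))

def tests_affected_by(db_path: str, components: list[str]) -> list[str]:
    """Return the tests that reference any of the changed components."""
    if not components:
        return []
    placeholders = ",".join("?" for _ in components)
    query = f"""
        SELECT DISTINCT t.name
        FROM tests t
        JOIN test_components tc ON tc.test_id = t.id
        JOIN components c ON c.id = tc.component_id
        WHERE c.name IN ({placeholders})
        ORDER BY t.name
    """
    with sqlite3.connect(db_path) as conn:
        return [row[0] for row in conn.execute(query, components)]

if __name__ == "__main__":
    changes = changed_tables("schema_before.json", "schema_after.json")
    print("changed tables:", ", ".join(changes))
    for test in tests_affected_by("test_repository.db", changes):
        print("run:", test)
```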

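Finally, here is what step 4's tests-as-data idea can look like in miniature: each test step is a row of data that a small driver dispatches to a keyword handler, so renaming a field or screen becomes a data update rather than a script change. The keywords and handlers below are purely illustrative stand-ins for real UI or API automation.

```python
# A minimal sketch of data-driven ("tests as data") execution. In practice
# the step rows would come from the relational repository rather than being
# hard-coded, and the handlers would drive a real UI or API.
from typing import Callable

def open_screen(name: str) -> None:
    print(f"opening screen {name}")

def enter_field(field: str, value: str) -> None:
    print(f"entering {value!r} into {field}")

def verify_message(expected: str) -> None:
    print(f"verifying message {expected!r}")

KEYWORDS: dict[str, Callable[..., None]] = {
    "open_screen": open_screen,
    "enter_field": enter_field,
    "verify_message": verify_message,
}

def run_test(steps: list[tuple[str, ...]]) -> None:
    """Execute one test: each step is a keyword name followed by its arguments."""
    for keyword, *args in steps:
        KEYWORDS[keyword](*args)

if __name__ == "__main__":
    # An illustrative "create order" test expressed entirely as data.
    create_order = [
        ("open_screen", "OrderEntry"),
        ("enter_field", "customer", "ACME Corp"),
        ("enter_field", "quantity", "10"),
        ("verify_message", "Order saved"),
    ]
    run_test(create_order)
```
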
If you follow these steps and apply the tenacity necessary to make progress every day, even just a little, you can achieve a measure of control and confidence over the torrent of changes raining down on you.

There are companies that have achieved rapid, comprehensive regression testing, and they report an astonishing acceleration in delivery, not a delay. The reason? Because risks are removed before they reach production, significantly less time and money is spent on emergencies, and those resources are freed to deliver faster and reduce the backlog. It's a beautiful thing.
