Big Bang, Large Crater


At a recent pre-bid meeting I sat in my chair with a growing sense of disbelief as the customer described how the new system would be installed. He had written a "big bang" installation into the contract, which would require the complete demolition of the existing computer system the same night the new system was installed. The system was mission critical--any disruption in service would paralyze operations at one facility and interrupt operations nationwide.

Big bang installations rely on the big red switch methodology: the transition supposedly requires nothing more than throwing a big red switch, and then a miracle happens. Miracles seldom happen in the software industry, and going for a big bang often leaves you with nothing more than a large crater.

The hallmark of the big bang is that it requires you to stack several risky steps on top of each other. For example, the plan may require you to successfully install two or three new sub-systems in a single time window, while simultaneously performing a data migration from one schema to another.

If any step is not completed properly, the entire installation fails. The level of planning and coordination required to pull this off is often beyond the managerial maturity of the organizations involved, which makes the plan riskier still. Organizations often don't recognize the risk involved in what they are attempting. They honestly can't imagine another way of doing things.
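To put rough numbers on that stacking, here is a minimal sketch. It assumes each step succeeds or fails independently of the others, and the 90 percent figure is invented purely for illustration; real steps are rarely independent, and real odds are rarely known this precisely.

```python
from math import prod


def chance_all_steps_succeed(step_success_probabilities):
    """Chance that every stacked step succeeds.

    Assumption: steps are independent, so the individual
    probabilities simply multiply.
    """
    return prod(step_success_probabilities)


# Hypothetical plan: four stacked steps (say, three sub-system installs
# plus a data migration), each given an invented 90% chance of success.
stacked_steps = [0.9, 0.9, 0.9, 0.9]

overall = chance_all_steps_succeed(stacked_steps)
print(f"Chance the whole big bang succeeds: {overall:.0%}")
```

Under these assumptions, four steps that each look safe on their own leave the installation as a whole with only about a two-in-three chance of going cleanly.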

Tight schedules are often cited as the reason to use the big bang approach; it appears to be the only way to wedge an upgrade into the schedule. For example, only a single window of opportunity may be available for the cutover: the system is so critical that the only time to make the change is between midnight and four a.m. on Christmas morning.

Sometimes an organization is unwilling to commit the resources, in both personnel and material, to spread the installation and integration over a more reasonable period of time. There is a perception that the big bang approach is cheaper. I have had managers tell me that a big bang is like removing a Band-Aid--a quick jerk is better. This attitude ignores the impact on schedule and cost if the installation goes poorly. Cleaning up the resulting mess is costly and can take a long time.

When practical, an incremental approach is always a better solution than the big bang. Convert a big bang installation into an incremental one by unstacking steps. This breaks up the large risk into a series of smaller risks, which are easier to manage.

Look for hidden assumptions requiring a big bang. I was recently involved in a project that required the removal of more than 200 field devices over a single weekend. This would have required managing a small army of electricians. We had assumed that the new field devices would be incompatible with the old software, and therefore that the computer system had to be replaced at the same time. Then we realized we could design a new field device that was backward compatible with the existing devices, so it would work with the old software. We were able to field test a single one of these devices for months and didn't need to install the new computer system until much later.

Sometimes, the only practical approach is to use a big bang. If you are faced with a big bang installation, use these tips to help you survive the blast: 

  • Over plan. Big bang installations should be planned in minute, excruciating detail. Every member of the installation team should know exactly what they should be doing for every minute of the installation. Prepare detailed checklists of the steps that must be followed.
  • Do a dry run. If possible, set up a test system that mimics the target platform. Use it to execute your installation plan. Identify task durations and missing steps, and then re-plan.
  • Over staff. There should be resources standing by who have no assigned tasks during the installation. They are then available to deal with any problems that arise during the installation.
  • Prepare a fallback plan. Wherever possible, have a mechanism that reverts to the previous system. Practice executing this fallback plan prior to performing the installation so you know it works and how long it takes to fall back.
  • Identify triggers. Identify trigger conditions that will cause you to execute the fallback plan. For example, if the installation has to be complete by four a.m. and your fallback plan takes two hours to execute, you need to make a go/no-go decision by two a.m.
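The go/no-go arithmetic in the last tip can be sketched in a few lines. This is only an illustration: the function name, the optional safety margin, and the calendar date are assumptions of mine, and the fallback duration should come from a rehearsed fallback, not a guess.

```python
from datetime import datetime, timedelta


def go_no_go_deadline(window_end, fallback_duration, safety_margin=timedelta(0)):
    """Latest moment at which starting the fallback still finishes in time.

    window_end: when the old or new system must be back in service.
    fallback_duration: measured time to revert, taken from a practice run.
    safety_margin: optional extra buffer (an assumption, not from the tip).
    """
    return window_end - fallback_duration - safety_margin


# The example from the tip: done by 4 a.m., fallback takes two hours.
# The date itself is hypothetical.
window_end = datetime(2024, 12, 25, 4, 0)
deadline = go_no_go_deadline(window_end, timedelta(hours=2))

print("Go/no-go decision due by", deadline.strftime("%H:%M"))
```

Working the deadline out ahead of time, and writing it on the checklist, means nobody has to do subtraction at two in the morning.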

Big bang installations are some of the most exciting events in software development. A beautiful fireworks display is a joy to watch. Just be careful that one of them doesn't go off in your hand!

AgileConnection is a TechWell community.