# Risk Analysis Basics


Setting: Our tester, Tim, is verifying the load performance of a server. He has been waiting for his chance to use the server to run his tests. While he's waiting for the developers to finish, he realizes that if the server dies, he can't verify the load performance of the application. Tim makes a beeline for the office of Pam, the project manager.

Tim: "Hey, did you know this server is critical to our ability to load test?"

Pam: "Hmm, no, I didn't realize that." (Pam goes back to reviewing the schedule.)

Tim: "Well, I want to get another one, okay?"

Pam: "What?! No, you can't have another server. If you get another server, other people will want more servers, and then our budget will be shot."

Tim: "But if we don't have the server at all, I won't be able to test."

Pam: "Hmm, then our bug counts will go down. That's not bad."

(Tim glares at Pam.)

Pam: "Okay, then it's your job to tell me how likely the equipment is to break and how much it will cost to fix."

Ever had a conversation like this with a project manager? I hope not. But if you have, you probably walked away furious and disgusted. You knew that the project manager didn't really care what your answer was. Still, you somehow have to bring this information to the project manager's attention, so that she can take a more responsible approach to managing the potential issue.

Potential issues are risks. Formal risk analysis is what happens when you consider the likelihood that a potential issue will occur and the severity of the problem if it does; together, these give you the exposure. Then you create a mitigation plan to deal with the problem. Testing is one form of risk mitigation: it looks for defects before the customers find them. But that's not the only form of risk mitigation you're likely to need.

Sometimes mapping out the risk can be helpful. I use a table like this one to explain risks:

| Risk | Probability of occurrence | Risk severity | Exposure | Trigger date | Mitigation plan |
| --- | --- | --- | --- | --- | --- |
| Define this in words | How likely is this risk to occur? Use high, medium, low | How severe a problem is this risk if it occurs? Use high, medium, low | Multiply probability and severity together, to derive a joint value | The date by which you will set the mitigation plan in place | What are you going to do about this risk? |
| Load server may not be available in time to test | High (it was in use by other groups for the last release) | High (we can't test performance under load without the server) | (High, High) | 2/1 | Buy a new server, install it by 2/15, up and running by 3/1, in time to start load testing |

First, I define the risk in words people can understand. Here, we're talking about a particular server's availability. Then, I define the probability that this problem could occur. If you have historical information, use it. If you don't have any previous knowledge, then guess. In my example, we know that in the previous release, other groups also needed to use the load server.

Then, define the severity of the risk. How bad a problem is this, if it occurs? In this case, the potential problem is very bad, assuming we need to test the product under load. Once you've defined the probability and severity, you can multiply them together. Some people use numbers to quantify risk, so they can easily multiply and get a number. I find that having managers see (High, High) in bright red is enough information. In my experience, managers or other people with organizational power manipulate the numbers to give the answer other managers or other powerful people want to see. It's much harder to manipulate the highs, mediums, and lows.
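If you keep your risks in a script or spreadsheet export, the pairing of probability and severity might be sketched in Python as follows. This is only an illustration under stated assumptions: the `Risk` class, the `LEVEL_ORDER` weights, and the second example risk are hypothetical and do not come from any particular tool. The numeric weights exist only to sort the list; what people see is still the pair of words.

```python
from dataclasses import dataclass

# Assumption: ordinal weights used only to sort risks internally.
# They are never reported; the report shows the words.
LEVEL_ORDER = {"low": 1, "medium": 2, "high": 3}


@dataclass(frozen=True)
class Risk:
    description: str
    probability: str  # "high", "medium", or "low"
    severity: str     # "high", "medium", or "low"

    def __post_init__(self):
        # Reject anything other than the three agreed-upon words.
        for level in (self.probability, self.severity):
            if level not in LEVEL_ORDER:
                raise ValueError(f"unknown level: {level!r}")

    def exposure(self) -> str:
        """Joint value reported as a pair of words, e.g. '(High, High)'."""
        return f"({self.probability.capitalize()}, {self.severity.capitalize()})"

    def sort_key(self) -> int:
        """Internal ordering only: product of the ordinal weights."""
        return LEVEL_ORDER[self.probability] * LEVEL_ORDER[self.severity]


load_server = Risk("Load server may not be available in time to test", "high", "high")
# Hypothetical second risk, included only to show the ordering.
spare_disk = Risk("Hypothetical: spare disk arrives late", "low", "medium")

# Highest exposure first.
for risk in sorted([spare_disk, load_server], key=Risk.sort_key, reverse=True):
    print(risk.exposure(), risk.description)
```

Keeping the levels as words and validating them on input makes it harder to nudge a value by a fraction of a point to get a more comfortable answer.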

For all risks, define the date by which you need a mitigation plan, a plan to manage the risk if it does come true. For High exposure risks, define the mitigation plan. (In your organization, you may also need a plan for medium exposure risks. Some organizations require plans for even lower exposure risks. It depends on the risk tolerance of your organization.)
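The rule in the previous paragraph can be checked mechanically. Here is a minimal Python sketch, assuming each row of the risk table is stored as a dictionary. The field names, the `missing_items` helper, and the `PLAN_REQUIRED` and `STRICT` sets are hypothetical; in particular, treating only the (high, high) pair as "High exposure" is my assumption for the example, and your organization's risk tolerance decides which pairs belong in that set.

```python
# Assumption: the set of exposure pairs that require a plan is an
# organizational choice; here only (high, high) does.
PLAN_REQUIRED = {("high", "high")}


def missing_items(risk, plan_required=PLAN_REQUIRED):
    """Return what a risk-table row still lacks, as a list of labels."""
    problems = []
    # Every risk needs a trigger date, whatever its exposure.
    if not risk.get("trigger_date"):
        problems.append("trigger date")
    # Only risks above the organization's tolerance need a written plan.
    exposure = (risk["probability"], risk["severity"])
    if exposure in plan_required and not risk.get("mitigation_plan"):
        problems.append("mitigation plan")
    return problems


load_server = {
    "risk": "Load server may not be available in time to test",
    "probability": "high",
    "severity": "high",
    "trigger_date": "2/1",
    "mitigation_plan": None,  # not written yet
}
print(missing_items(load_server))  # ['mitigation plan']

# A less risk-tolerant organization also plans for mixed high/medium pairs.
STRICT = PLAN_REQUIRED | {("high", "medium"), ("medium", "high")}
medium_risk = {
    "risk": "Hypothetical medium-severity risk",
    "probability": "high",
    "severity": "medium",
    "trigger_date": None,
    "mitigation_plan": None,
}
print(missing_items(medium_risk))          # ['trigger date']
print(missing_items(medium_risk, STRICT))  # ['trigger date', 'mitigation plan']
```

Passing the tolerance in as a set of pairs, rather than as a numeric threshold, keeps the decision about which exposures need plans visible and easy to discuss.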

Now, when you go to your project manager to explain that there's a potential problem with testing, it's easier for the project manager to see the potential problems and how they impact the whole project.

Of course, risk management doesn't give you a magic wand and a crystal ball. Rather, risk management is about looking at likely scenarios (and even at some unlikely scenarios) and taking some action to reduce their potential effects.

## Why Do Risk Analysis?

You do risk analysis for only one reason: Would you manage the project differently if any of your risks happened? I especially look for risks that could put us out of business, or prevent us from shipping product.

When I work with people on generating risk scenarios in the project, I ask them to look at risks in these areas:

• Risks to getting the project completed: The machine availability problem above is a great example of that risk.
• Risks to using the product in the field: I find use cases or other forms of customer-scenario generation work well here.
• Risks to the business from using the product in the field: If your customers find this problem, could their reaction impair your ability to do business?

Then you make plans to deal with the risks you never want to come true. In Tim's project, for example, the lack of the machine when it was needed would prevent them from doing load testing. How bad a problem is that, really? Would Tim's company have shipped anyway? If so, then there was no risk to shipping. I'm not saying this is good business, but if Tim's company has already decided that the potential downside, the severity, is not high enough, then there is only limited business risk if a server is not available and the testing is not done.

Risk analysis can't be exact. If it were exact, you'd be predicting the future (and by the way, if you figure out how to predict the future, please do let me know). But having a place to start the discussion about what the problem is, and how it affects the project, is much better than the frustrating and inadequate dialog we saw at the beginning of this column.