How to Give an Accurate Answer

Summary:

Scott Ames explains the Test Requirements Agile Metric and offers a real-world example of its use in software estimation.

"How long’s it gonna take?" My response: "Six weeks, plus or minus two days. No more." In this article, I'll give a rebuttal to Daryl Kulak's article, "Let’s Stop the Wishful Thinking." I will show why his beliefs about software estimating, while understandable, are questionable because of the advent of the Test Requirements Agile Metric (TRAM).

We Are Such Good Estimators!

The question posed to me, along with my reply, comes from a real-world experience. The head of development at NEC was asking on behalf of a customer how long it would be until the final release of the customer’s software. The customer wanted it within three weeks. Although everyone felt it would be possible to release within that timeframe, I knew a six-week estimate would be more realistic. The use of TRAM helped me justify my statement, not just to the rest of the development staff but also to the client.

Storypoints

This is where Scrum’s “storypoints” would have failed us. Equating storypoints to an amount of time is ludicrous for a development team. As Mr. Kulak says, “It’s ridiculous. The power of the storypoint in estimating user stories is that it is vague. Keep that power.” Since we are trying to eliminate the vagueness, we will eliminate the storypoints.

You might ask yourself, “Oh, no! How will we estimate?” You do so by using the TRAM’s method of verification points. Verification points are a defined, rather than a fuzzy, metric. Each requirement is given a score based on the type of defect it would cause if it failed, as follows:

Catastrophic: The defect could cause disasters like loss of life, mass destruction, economic collapse, etc. This severity level should only be used in mission- or life-critical systems and must have at least exciter (see below) priority.
Showstopper: The defect makes the product or a major component entirely unusable. In mission- or life-critical systems, failure could be hazardous. This severity level must have at least recommended priority.
High: The defect makes the product or a major component difficult to use, and the workaround, if one exists, is difficult or cumbersome.
Medium: The defect makes the product or a major component difficult to use, but a simple workaround exists.
Low: The defect causes user inconvenience or annoyance but does not affect any required functionality.

These scores are then modified by the priority level:

Mandatory: The defect is highly visible or critically impacts the customer. It must be repaired prior to general availability release.
Exciter: The defect has significant impact upon the customer and inclusion of this functionality would greatly increase customer satisfaction with the product.
Moderate: The defect moderately impacts the customer and should be repaired before a general availability release, but it is not necessary unless at least medium severity. This level is also used for requirements that have not been prioritized.
Recommended: The defect has some impact upon the customer and should be repaired before a general availability release, but it is not necessary unless the defect is scored at least high severity.
Desired: The defect has a low impact upon the customer and should be repaired before a general availability release, but it is not necessary.
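
The rules stated in the two lists above can be sketched in code. This is only an illustration, not the TRAM's actual scoring: the article gives no numeric scores, so the integer values below exist solely to order the levels, and the priority ordering is assumed from the order in which the levels are listed. The function names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    # Values only order the levels; they are NOT TRAM scores (assumption).
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    SHOWSTOPPER = 4
    CATASTROPHIC = 5


class Priority(IntEnum):
    # Ordering assumed from the order of the list in the article.
    DESIRED = 1
    RECOMMENDED = 2
    MODERATE = 3
    EXCITER = 4
    MANDATORY = 5


def is_valid_pairing(severity: Severity, priority: Priority) -> bool:
    """Check the minimum-priority rules attached to the top two severities."""
    if severity is Severity.CATASTROPHIC:
        return priority >= Priority.EXCITER
    if severity is Severity.SHOWSTOPPER:
        return priority >= Priority.RECOMMENDED
    return True


def must_fix_before_release(severity: Severity, priority: Priority) -> Optional[bool]:
    """Apply the release rules from the priority descriptions.

    Returns None for Exciter, because the article states no release rule for it.
    """
    if priority is Priority.MANDATORY:
        return True
    if priority is Priority.MODERATE:
        return severity >= Severity.MEDIUM
    if priority is Priority.RECOMMENDED:
        return severity >= Severity.HIGH
    if priority is Priority.DESIRED:
        return False
    return None


if __name__ == "__main__":
    print(is_valid_pairing(Severity.CATASTROPHIC, Priority.MODERATE))
    print(must_fix_before_release(Severity.HIGH, Priority.RECOMMENDED))
```

A catastrophic defect paired with moderate priority is rejected here because the severity description requires at least exciter priority.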

As you can see, while defects are still prioritized, some requirements will have the same priority. This is not a problem: when it happens, either the developer chooses which defect to fix next or the requirement is rescored.

The team decides on the requirement’s severity, and the product owner on the requirement’s priority. Since the estimation does not handle time in “man-hours,” but rather in “team-weeks,” the estimation has more value because all those little bumps in time are smoothed out. You don’t even need to show an employee’s time off, as other members of the team will be able to pick up his tasks.

Verification points, being a defined metric, are the same no matter who calculates them. Whether in New York, Denver, or New Delhi, a verification point is a verification point and means the same thing. This is not true of storypoints. With verification points, you’ll be able to determine which of two teams should develop faster by using simple velocity. The total verification points for a project are a good metric for determining the development effort’s overall size. By using a form of iterative development, you can, over time, accurately determine how many verification points can be cleared during each iteration. Cleared verification points represent deliverable software. The number of verification points cleared by the team per day, week, iteration, or sprint is a valuable metric that can be used to show how much effort was required per verification point, determine how rapidly a project can be completed, and estimate a project’s duration and cost.

At NEC, I implemented the TRAM on the mobility project to aid in determining how much functionality to attempt during each sprint. Fourteen weeks into the project, our customer asked us to make a final delivery of the project within three weeks. Management hoped that we could do that for them. However, we had only cleared 280 verification points of product during those fourteen weeks, giving us a velocity of twenty verification points per week. As there were 120 verification points of product still in the backlog, we told them that our best estimate for completion would be six weeks. It is worth noting that the TRAM analysis estimate was 100 percent accurate. We made the final delivery of the project in exactly six weeks.

One thing that I was asked by management at NEC was, “What if we made more overtime mandatory to attempt to get the project out in three weeks?” We already had mandatory overtime on Saturdays for the prior three weeks, and the effects were not helpful. During the first week, the team worked an additional day and produced an additional day’s worth of product. In the second week, however, production started to fall. The team only produced 90 percent of what it had accomplished during a normal workweek. The third week, production slipped to 80 percent. Obviously the team was burning out.

Rather than slip further to 60 percent, which is where the team was heading, I recommended cessation of mandatory overtime, which was then implemented. Velocity then returned to pre-overtime levels. This saved the company two weeks of development time and associated costs. For a team of sixty, this was a significant monetary savings for NEC, proving the power of verification points.
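
The arithmetic behind the six-week estimate can be written out as a short sketch. The function names are hypothetical, not part of the TRAM; the figures are the ones quoted above. The overtime tally at the end additionally assumes a five-day normal workweek, which the article does not state, so treat that part as a rough illustration only.

```python
def velocity(points_cleared: float, weeks_elapsed: float) -> float:
    """Verification points cleared per team-week."""
    if weeks_elapsed <= 0:
        raise ValueError("weeks_elapsed must be positive")
    return points_cleared / weeks_elapsed


def weeks_remaining(backlog_points: float, points_per_week: float) -> float:
    """Team-weeks needed to clear the backlog at the given velocity."""
    if points_per_week <= 0:
        raise ValueError("points_per_week must be positive")
    return backlog_points / points_per_week


# Figures from the NEC mobility project described above.
nec_velocity = velocity(points_cleared=280, weeks_elapsed=14)
nec_estimate = weeks_remaining(backlog_points=120, points_per_week=nec_velocity)

# Overtime weeks as a percentage of a normal week's output.
# ASSUMPTION: a normal workweek is five days, so week one (six days worked,
# six days' worth produced) counts as 120 percent.
NORMAL_WEEK_DAYS = 5
week_one_percent = 100 * (NORMAL_WEEK_DAYS + 1) // NORMAL_WEEK_DAYS
overtime_percent = [week_one_percent, 90, 80]
overtime_total_percent = sum(overtime_percent)

if __name__ == "__main__":
    print(nec_velocity)            # 20.0 verification points per week
    print(nec_estimate)            # 6.0 weeks
    print(overtime_total_percent)  # 290, versus 300 for three normal weeks
```

Under the five-day assumption, the three overtime weeks together yielded slightly less than three ordinary weeks would have, which is consistent with the observation that the overtime was not helping.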

Verification points from the TRAM are a very good way to save your company time and money while producing very accurate estimates that will be useful to business people.

User Comments

Interesting sample. The only difference between the two articles is the timing of when the estimation was needed. In this article, they were well on their way. In the previous article, no analysis or history was available to make the estimate.

About the author

Scott Ames

Scott G. Ames has fifteen years in software quality, is a Certified ScrumMaster, and is the chief TRAM engineer at Good-To-Go!