4 Balanced Metrics for Tracking Agile Teams

Summary:
Whatever your feelings on metrics, organizations will expect them for your team. You don't want to measure only one aspect to the detriment of other information, but you also don't want to measure too many things and scatter your team's focus. Here are four metrics that balance each other out and help gauge an agile team's productivity, work quality, predictability, and health.

There are as many ways to measure a project as there are to build it. Unfortunately, many of these metrics are useless. Eric Ries calls them "vanity metrics" because they look good and make you feel good but offer little in the way of actionable value.

Whatever your feelings on metrics, at the end of the day, organizations will expect and want them. With the yardstick of "helping the team to self-reflect and improve" and the caveat "your mileage may vary," here are my four go-to metrics for an agile team, along with some experiences on their effectiveness.

Four Interlocking Team Measures

Why are there four? If you only measure one key metric, it is easy to get tunnel vision. Be it the teams focusing on just making the metric better (often through gaming the system) or management using the measure to drive all decisions, you can end up with a product or organization that looks good but is really driving off a cliff.

Likewise, with as many as ten metrics it is more likely that different parts of the organization will focus on different metrics, driving a wedge into the efforts to align the organization. Humans best handle three to five concepts at a time, so four main metrics seemed like the optimal dashboard. 

Cycle Time
Cycle time is your direct connection to productivity. The shorter the cycle time, the more things are getting done in a given timebox.

You measure this from when work starts to when the feature is done. In software terms, I tend to think of this as "hands on keyboard" time. Measuring cycle time is best done automatically via your agile lifecycle tool of choice, though even measuring with a physical task board will give you useful data.
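
As a minimal sketch of the calculation, assume each story records a timestamp for when work started and one for when the feature was done. The function name and the sample dates are illustrative and do not come from any particular lifecycle tool, and the sketch uses elapsed calendar time, which is only one possible reading of "hands on keyboard" time.

```python
from datetime import datetime
from statistics import mean


def cycle_time_days(started: datetime, done: datetime) -> float:
    """Elapsed days between work starting and the feature being done."""
    if done < started:
        raise ValueError("done must not be earlier than started")
    return (done - started).total_seconds() / 86400


# Hypothetical stories: (work started, feature done)
stories = [
    (datetime(2017, 3, 1, 9), datetime(2017, 3, 3, 9)),  # 2 days
    (datetime(2017, 3, 2, 9), datetime(2017, 3, 6, 9)),  # 4 days
    (datetime(2017, 3, 6, 9), datetime(2017, 3, 9, 9)),  # 3 days
]

cycle_times = [cycle_time_days(start, end) for start, end in stories]
average_cycle_time = mean(cycle_times)
print(f"Average cycle time: {average_cycle_time:.1f} days")
```

Averaging per sprint or per month gives the trend line; a median may be a better choice if a few long-running stories skew the numbers.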

Escaped Defects
This measure is the connection between customer satisfaction and the team. The lower the defect rate, the more satisfied the customer is likely to be with the product. With a high escaped defect rate, even the most awesome product is going to have a lot of unsatisfied customers.

You measure this by the number of problems (bugs, defects, etc.) found in the product once it has been delivered to the user. Until a story is done, it is still in process, so focusing on the story's execution is preferable to tracking in-progress defects.
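
One way to sketch the count, assuming each defect record notes which release it belongs to and whether it was found after delivery. The `Defect` record and its field names are hypothetical; a real tracker will have its own fields.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Defect:
    release: str                # release the defect was found in (assumed field)
    found_after_delivery: bool  # True if a user saw it after delivery


def escaped_defects_per_release(defects):
    """Count only the defects found after the product was delivered."""
    return Counter(d.release for d in defects if d.found_after_delivery)


# Hypothetical defect log
defects = [
    Defect("1.0", True),
    Defect("1.0", False),  # caught before the story was done: not escaped
    Defect("1.0", True),
    Defect("1.1", True),
]

escaped = escaped_defects_per_release(defects)
for release, count in sorted(escaped.items()):
    print(f"Release {release}: {count} escaped defect(s)")
```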

Planned-to-Done Ratio
This metric is a way to measure predictability. If a team commits to thirty stories and only delivers nine, the product owner has about a 30 percent chance of getting what they want. If, on the other hand, the team commits to ten stories and delivers nine, the PO has roughly a 90 percent chance of getting what they want.

Measuring is a simple exercise of documenting how much work the team commits to doing at the start of the sprint versus how much they have completed at the end of the sprint.
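
The arithmetic from the two examples above can be sketched as follows. The function name is illustrative, and the sketch counts stories, though the same ratio would work with story points.

```python
def planned_to_done_ratio(planned: int, done: int) -> float:
    """Share of the work committed to at sprint start that was completed."""
    if planned <= 0:
        raise ValueError("planned must be greater than zero")
    return done / planned


# The two examples from the text
overcommitted = planned_to_done_ratio(planned=30, done=9)
realistic = planned_to_done_ratio(planned=10, done=9)

print(f"Committed to 30, delivered 9: {overcommitted:.0%}")
print(f"Committed to 10, delivered 9: {realistic:.0%}")
```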

Happiness
This is the team "health" metric. It creates awareness that puts the other three metrics into better context. If all the other metrics are perfect and happiness is low, then the team is probably getting burned out, fast.

Build this into your sprint retrospectives. Open every retrospective with the team writing their happiness scores on whatever scale you choose. Track these numbers from sprint to sprint to see the trends.  
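
A rough sketch of the tracking, assuming a 1-to-5 scale and anonymous scores collected at each retrospective. The scale, the sample scores, and the function name are all assumptions for illustration.

```python
from statistics import mean

# Hypothetical scores (1 = miserable, 5 = great), one list per retrospective
happiness_by_sprint = {
    "Sprint 1": [4, 3, 4, 5],
    "Sprint 2": [3, 3, 4, 4],
    "Sprint 3": [2, 3, 3, 2],
}


def sprint_to_sprint_change(averages):
    """Difference between each sprint's average and the previous sprint's."""
    return [later - earlier for earlier, later in zip(averages, averages[1:])]


averages = {sprint: mean(scores) for sprint, scores in happiness_by_sprint.items()}
trend = sprint_to_sprint_change(list(averages.values()))

for sprint, average in averages.items():
    print(f"{sprint}: {average:.1f}")
print("Change between sprints:", trend)
```

A run of negative changes is the signal worth raising with the team, more than any single low score.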

Why These Metrics?

Cycle time and escaped defects are highly quantifiable and well understood across industries. Smaller numbers mean you are delivering a higher quality product, faster. I originally added the planned-to-done ratio primarily because it was something the teams could have an immediate and real impact on, so this fulfilled the "early wins" idea. It becomes useful long term in mapping predictability, which helps in forecasting. The happiness metric is the “human factor,” which lets us gauge the overall team health.

The first three measures form a self-supporting triangle that prevents gaming the system. If you crash your cycle time, then defects will almost certainly go up. A high planned-to-done ratio can be great, unless cycle time is through the roof, showing the team is getting very little done per sprint. Finally, by layering happiness over the rest, you can see the human side of the equation. A low happiness score is nearly always a sign of underlying problems and can be a leading indicator of something else. 

You may be wondering about velocity. I track velocity also, but I think it has a very specific place. The four team metrics are for the team to reflect upon during a retrospective, with an eye toward getting better.

Velocity, on the other hand, is a measure the team uses during sprint planning. Its only use is as a rough gauge of how much work to take on in the next sprint. It can also be horribly misused if shared up the management chain; there are better ways to predict when a team will be done or how effective it is.

When measuring velocity, I measure both the story point and story count velocity. By doing this, I find the team has built-in checks and balances to their workload. For example, let's say the team has a three-sprint average of 50 story points and ten stories. If their next sprint is 48 points and nine stories, then they are probably going to finish all the work. If they exceed one of the numbers—say, doing 48 points but twenty stories (a bunch of small ones)—then the sprint might be at risk, as that's a lot of context switching. And if they exceed both numbers—say, committing to 70 points and fifteen stories—then this is a clear warning flag, and a good coach might want to touch base with the team to make sure they are confident that they can do better than their rolling average.
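
The check described above can be sketched as a comparison of the planned sprint against the three-sprint rolling averages. The function names, the sprint history, and the three result labels are illustrative assumptions; the thresholds are simply "above the average or not."

```python
from statistics import mean


def rolling_average(history, window=3):
    """Average of the most recent `window` sprints."""
    return mean(history[-window:])


def sprint_load_check(avg_points, avg_stories, planned_points, planned_stories):
    """Compare a planned sprint against both rolling averages."""
    over_points = planned_points > avg_points
    over_stories = planned_stories > avg_stories
    if over_points and over_stories:
        return "warning flag"  # exceeds both averages
    if over_points or over_stories:
        return "at risk"       # exceeds one average
    return "likely fine"       # at or below both averages


# Hypothetical history that averages to 50 points and 10 stories
avg_points = rolling_average([52, 47, 51])
avg_stories = rolling_average([11, 9, 10])

print(sprint_load_check(avg_points, avg_stories, 48, 9))   # likely fine
print(sprint_load_check(avg_points, avg_stories, 48, 20))  # at risk
print(sprint_load_check(avg_points, avg_stories, 70, 15))  # warning flag
```

The output is a prompt for a conversation with the team, not a verdict on the sprint.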

Metrics in Action

These charts are based on real data and are a snapshot taken about eighteen months into an agile transformation. I tend to stick with a six-month rolling window because if you go much beyond that, things have changed so much that the older data is irrelevant to what the team is working on now.

Cycle Time
The spike represents the team moving to a new project and the ramp-up time as they got used to the work on the new project while going through a series of organizational changes.

[Figure: Cycle Time Chart Sample]

Escaped Defects
This graph shows a fairly typical curve for teams that have moved to cross-functional roles and automated testing. With everyone in the team jointly responsible for the story and the quality and a greater focus on test automation, we see a dramatic drop in defects found in the product after release.

[Figure: Escaped Defect Chart Sample]

Planned-to-Done Ratio
This team lost its ScrumMaster, which impacted its overall performance, as reflected in the first sprint's data. In the second sprint, an experienced ScrumMaster came in to help. The early dips represent the team getting used to a new set of norms, and the later dips were a result of changes in the program that reduced the clarity of the team's backlog.

[Figure: Planned to Done Chart Sample]

Happiness
These data show how the support of a ScrumMaster improved the team’s overall health. The graph also reflects that the churn in the product and organization impacted the team’s happiness later on.

[Figure: Happiness Chart Sample]

Based on these graphs, the first thing I'd plan is to engage with the team and listen to what's been going on in the last couple of sprints. The dips in the planned-to-done ratio and in the happiness metric are enough to tell me something might be going on. The low cycle time and escaped defects would lead me to suspect the problems were external to the team.

The real challenges were coming from a chaotic product strategy that had the team bouncing around among priorities. The volatility in the backlog led to lower-quality stories. The team was mature enough to stop when they dug into a story they didn't understand and shift to work they did understand. This lowered the planned-to-done ratio because not all the work committed to could be finished, while cycle time stayed low because they worked on things they understood well.

Try These Team Metrics

These are the team metrics I've had the most luck with. Their interrelationship prevents gaming one measure without impacting others. They provide useful data to the team for retrospective improvement, and they are meaningful to leadership and help with forecasting.

If you’re interested in trying these metrics out, you can use the Team Dashboard pack I’ve created in Google Docs by downloading it here.

User Comments

5 comments
Todd Scorza

Hey Joel,

Great article, your metrics will be put into use immediately. I have one question regarding the expected trend and prediction information on the team metrics. I see the expected trend averaging the last two completed story points, and I saw the note to average the worst 3 completed story points for the worst-case prediction. However, the expected prediction and worst-case trend seem to be non-functioning equations. Are these useful, and if so, how can they be put to use?

Todd 

May 6, 2017 - 10:51am
Joel Bancroft-Connors

Todd,

 

My apologies for totally missing this question. 

 

You've found a place where I'm in the process of trying a new formula. Previously I used a rolling average of the last three sprints. I'm trying to move to a mean average using the best and worst. Right now the formula is not working in the Google Doc. My apologies for the confusion.

July 19, 2017 - 2:49pm
Bill Donaldson

Great article! Hopefully all teams will have a set of metrics but they must be visible to be useful.  See my post on Creating a Culture Change with Visual Management

 

I’ve found the Planned to Done metric can be helpful for teams that aren't meeting their commitments. However, this metric can introduce a lot of anxiety and unnecessary introspection, especially when the problem is outside of the team’s control. I’d recommend an alternative: the SAFe Program Predictability Measure. The benefit of this measure is that it is a ratio of the business value delivered, not a measure of the team. To get this, the Business/PO is involved at the start to set value and during the demo to assess value. Now the larger team can have discussions about the internal and external reasons for not meeting the expectations.

July 19, 2017 - 2:18pm
Joel Bancroft-Connors

Bill,

Interesting idea, I can definitely see value in this once the organization has moved into being able to apply business value to their stories. Do you see it being valuable in early stages when the product owner may not even be fully engaged and is still doing just rank order prioritization?

 

 

July 19, 2017 - 2:54pm
Tim Thompson

Great article. I really like the planned to done ratio metric. I wonder how "escaped defects" are measured. Is this supposed to be only based on customer complaints? Plenty of studies show that dissatisfied customers often do not complain. They either find a workaround, stop using a feature, or abandon the product altogether. Should any issues found before release and not fixed be part of the "escaped defects"? I am sure that in any software application there is a growing number of such issues because adding features always trumps fixing existing flaws.

August 3, 2017 - 10:21am


AgileConnection is a TechWell community.

Through conferences, training, consulting, and online resources, TechWell helps you develop and deliver great software every day.