You're thinking that now is a good time to start a software process measurement program in your organization. To prepare, you've read one or more books on measurement. You have a good idea about what data you'll need to collect and how you'll analyze it. But you've also heard stories about how similar efforts have failed in other organizations—stories that don't jibe with what you've read, suggesting that there may be more going on than what can be covered in a book.
What can you do to fill in the gaps? Consider running a trial measurement program on yourself before launching into a bigger program. By doing a practice program "in the small," and applying introspection and imagination, you can learn about the value and limits of measurement, gain insights into why many software measurement efforts fail, and prepare yourself for launching a successful program. You'll need only a spreadsheet that can produce graphs—Microsoft Excel or the equivalent is more than sufficient. A small stack of sticky notes is also useful.
On a new spreadsheet, label three columns: Date, Arrival, and Departure. For the next few weeks, record your arrival and departure times at work. (For the first few weeks at a new job, I also track commute times.) For bonus points, make a chart on a second spreadsheet page, and link it to the live data so that the chart updates when you add new data. This is a good excuse for experimenting with the graphing capabilities of your spreadsheet—a skill you’ll need later. Try out various chart types to see what they reveal about the data.
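If you'd rather script the bookkeeping than point and click, a few lines of Python can stand in for the spreadsheet. The following is only a minimal sketch, assuming your data lives in a file I've called worklog.csv with the three columns above; pandas and matplotlib take the place of Excel's grid and chart gallery.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Assumes worklog.csv rows like: 2024-03-01,08:45,17:30
    log = pd.read_csv("worklog.csv", parse_dates=["Date"])

    def to_hours(t):
        # Convert an "HH:MM" string to fractional hours; pass gaps through.
        if pd.isna(t):
            return float("nan")
        h, m = str(t).split(":")
        return int(h) + int(m) / 60

    log["ArrivalH"] = log["Arrival"].map(to_hours)
    log["DepartureH"] = log["Departure"].map(to_hours)

    # Plot both series against the date; rerunning the script after each
    # new entry gives the same effect as a chart linked to live data.
    log.plot(x="Date", y=["ArrivalH", "DepartureH"], marker="o")
    plt.ylabel("Hour of day")
    plt.show()

Swapping the marker or chart type here is the scripted equivalent of experimenting with your spreadsheet's chart gallery.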
After a few weeks of recording times, you might notice a few things:
There will be missing data. Unless you're very well organized or leave yourself good reminders, you'll occasionally forget to record an arrival or departure time, or a crisis will crowd out everything else. One lesson from this is that any measurements you gather, and any metrics you calculate, must be able to survive missing data; the sketch after this list shows one way to arrange that.
What gets measured will get improved. You might observe that the mere act of recording your arrival time affects your motivation to get in to work on time, or stay a bit later (or leave at a saner hour). By measuring something, you focus your attention on it, albeit briefly. In a larger metrics program, focusing a team's attention on the right things, even if only for a few moments a day, can have a strong, positive effect on the team's behavior.
Numbers don't tell the whole story. From arrival and departure times, you can calculate and chart time on the job. Or can you? What about time you put in away from the office? You need to extend your data collection to account for non-office time. Can you figure out a way to do this that doesn't complicate data collection or charting? Getting precise measurements on some things can be complicated. The harder data collection becomes, the greater the temptation to avoid the complication and stay with simple, yet imprecise, measurements.
Charts are compelling. Trends are easier to spot when looking at a good chart than when reading raw numbers. Charts are also effective ways to present data to managers who have limited attention spans but still need to make informed decisions.
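To see how the time-on-the-job calculation can survive forgotten entries, here's a continuation of the hypothetical worklog.csv sketch. A missing entry becomes a NaN and quietly drops out of the averages instead of breaking them.

    import pandas as pd

    log = pd.read_csv("worklog.csv", parse_dates=["Date"])

    def to_hours(t):
        if pd.isna(t):
            return float("nan")  # a forgotten entry, not an error
        h, m = str(t).split(":")
        return int(h) + int(m) / 60

    # Hours on site per day; a row missing either time yields NaN.
    hours = log["Departure"].map(to_hours) - log["Arrival"].map(to_hours)

    # mean() and count() skip NaN by default, so the metric survives gaps.
    print(f"Average: {hours.mean():.1f} hours/day "
          f"({hours.count()} of {len(hours)} days recorded)")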
Add Other Measures
You might reasonably object, "But what does time on the job have to do with anything? I could sleep at my desk and the numbers would still look good. Shouldn't I be measuring how much effort I'm putting in, or how effective I'm being?" You’ve just touched on a truth: Quantity is easy to measure, but quantity shouldn’t be mistaken for quality. Let's see if we can find additional things to measure that might be more representative of effort.
The volume of email that you receive and the number of emails in your inbox can be a crude measure of your workload. Add columns labeled Inflow and Inbox to your spreadsheet. In these, record the number of emails you receive each day, and the number that remain in your inbox at the end of the day. The volume of incoming email is something you can influence (do you really need to be on that mailing list?) but not control. To capture what you can control, add a column to track the number of emails you send. Try keeping separate tallies for the number of messages you originate, versus the number that are replies. I find that the ratio of replies to total outgoing messages is a good indicator of when I've gone into a reactive mode.
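As a sketch of how that indicator might be computed, suppose you record the two outgoing tallies in columns named Originated and Replies (my names, not a requirement):

    import pandas as pd

    log = pd.read_csv("worklog.csv", parse_dates=["Date"])

    # Ratio of replies to total outgoing mail for each day. A day with no
    # outgoing mail comes out as NaN rather than raising an error.
    outgoing = log["Originated"] + log["Replies"]
    log["ReplyRatio"] = log["Replies"] / outgoing

    # Values creeping toward 1.0 suggest days spent reacting, not initiating.
    print(log[["Date", "ReplyRatio"]].tail())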
You might notice that measuring your inbox size has an immediate one-time effect on your behavior. It might trigger an "Oh wow, I've got to clean out my inbox" response, followed by a flurry of responding, reorganizing, archiving, and outright deleting. Some measures you make in a larger measurement program are like that: They have a big one-time effect. And sometimes that's enough.
Monitoring your email counts might have a longer-term effect on your behavior. By drawing your attention to the counts at least once a day, you might find that you change your email handling habits. Good measures can focus attention in ways that affect long-term behavior.
Note that none of these counts addresses the quality of the email you're receiving or sending, or the complexity of the issues that the emails in your inbox represent. Quality is difficult to measure in the absence of rigorous specifications. Often the best we can do is approximate quality through quantitative measures.
With a few weeks of data in hand, try overlaying a chart of these new counts on your arrival/departure chart. Quite often, two or more charts viewed together will reveal patterns that aren't evident in any one chart. Some patterns are obvious: Miss a day or two of work, and it takes several days to get your inbox back under control. Some patterns are less obvious until you’ve collected lots of data. On a project where I was providing support to the sales team, I noticed quarterly cycles in email flow, with the volume picking up dramatically near the end of a quarter as they worked to close big deals. Knowing this pattern helped me plan for end-of-quarter distractions.
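If you script your charts, a second y-axis is one way to overlay the counts without letting the two scales fight each other. A minimal sketch, assuming a derived HoursWorked column alongside the Inbox counts:

    import pandas as pd
    import matplotlib.pyplot as plt

    log = pd.read_csv("worklog.csv", parse_dates=["Date"])

    fig, ax_hours = plt.subplots()
    ax_hours.plot(log["Date"], log["HoursWorked"], color="tab:blue")
    ax_hours.set_ylabel("Hours at work", color="tab:blue")

    # A twin axis shares the dates but gets its own scale for the counts.
    ax_inbox = ax_hours.twinx()
    ax_inbox.plot(log["Date"], log["Inbox"], color="tab:red")
    ax_inbox.set_ylabel("Inbox size", color="tab:red")

    plt.show()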
Knowing that your email backlog follows a quarterly pattern can give you some comfort that things aren't getting out of control near the end of a quarter. In the same way, knowing that past projects have seen defect rates spike before leveling off in early testing can help you avoid panic when the defect count on your current project makes a dramatic, early rise. Don't relax and take your eyes off the defect reports, but don't lose sleep, either.
As you look through your charts, you may find yourself looking back at some period of instability and wondering what happened. Though you might remember now that two weeks ago there was a short-notice, two-day customer meeting, your memory will fade over time. Yet the blip on the chart remains. To help answer future "what happened there" questions, add a Notes column to your spreadsheet, and record whatever events seem significant. Just as notes like "out 2 days with the flu" are useful for explaining personal performance aberrations, recording “power failure, database server down 4 days” or "new requirement for custom widgets" can help you defend a schedule slip.
Sometimes a rumor or announcement about a future event is, in itself, significant and will cause measurable changes in behavior. Recording "BigCo visit announced" or "layoff rumors" might help answer future "what happened" questions.
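The Notes column can go straight onto your charts, so a blip and its explanation travel together. One way to sketch it, again assuming the hypothetical worklog.csv:

    import pandas as pd
    import matplotlib.pyplot as plt

    log = pd.read_csv("worklog.csv", parse_dates=["Date"])

    fig, ax = plt.subplots()
    ax.plot(log["Date"], log["Inbox"])
    ax.set_ylabel("Inbox size")

    # Pin each note (e.g., "BigCo visit announced") to its day on the chart.
    for _, row in log.dropna(subset=["Notes"]).iterrows():
        ax.annotate(row["Notes"], xy=(row["Date"], row["Inbox"]),
                    rotation=45, fontsize=8)

    plt.show()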
By keeping your eyes open for events that might be significant, and watching what happens, you can learn a lot about organizational dynamics. Ask colleagues about the types of events that they find significant when monitoring projects, and grow the set of things you watch for.
Reflect on Measurements
As you reflect on the measurements you are taking, you'll find that they are useful in different ways:
Sometimes you have to dig deeper. Some measurements serve as important indicators that more intrusive measures are warranted. If you notice that your body temperature has been several degrees high or low for a few days, you shouldn’t immediately reach for antibiotics. Temperature alone is an insufficient diagnostic tool. Instead, you might want to get yourself to a doctor for a more intrusive examination. Similarly, it’s normal for a project to have some number of open defect reports, but if the number takes a sudden spike upward or a large unexpected drop, then it's time for someone to take a deeper look at what's going on.
Sometimes just having the measure is enough. Some measurements are useful primarily to keep important things in present attention. Diet and exercise can easily go by the wayside during an extended project crunch, as the team consumes massive quantities of soda and pizza. Charting your weight on a daily basis reminds you to watch your diet and get regular exercise, and it helps avoid nasty "I gained five pounds! What happened!?!" surprises. Similarly, defect reports can often stack up if you're not paying attention to them, leading to nasty "We missed a milestone! What happened!?!" surprises when the team gets overwhelmed by rework.
Sometimes you'll notice a trend. Some measurements can help you spot trends, allowing you to predict and adjust. If I see that my hours at work have been steadily increasing over the past few weeks, and that my email backlog has been growing steadily, I might choose to negotiate adjustments to my workload. Or, I might let my wife know not to expect me for dinner for a while. In the same way, when you're chugging along at full speed on a project, you can avoid an ugly "Murphy's Law" moment by tracking available disk space on the source code repository server. By using the disk consumption trend to predict when the server will run out of space (sketched after this list), you can arrange for repository maintenance well before the project gets stopped dead in its tracks.
Sometimes you'll discover a hidden benefit. Ask your colleagues what measurements they find useful. You may be surprised at the power of some nonobvious measurements. Early in my career, a co-worker confided that he expected a layoff soon. He explained that he had been tracking our "burn rate" (cash outflow), and he had predicted when our venture capital funding would be exhausted. Paying attention to this in later companies has saved me from some unpleasant surprises.
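Here is the disk-space extrapolation promised above, a rough sketch with made-up readings standing in for real ones: fit a line to recent free-space measurements and see where it crosses zero.

    import numpy as np

    # Hypothetical daily free-space readings for the repository server, in GB.
    days = np.array([0, 1, 2, 3, 4, 5, 6])
    free_gb = np.array([120, 113, 108, 99, 94, 86, 80])

    slope, intercept = np.polyfit(days, free_gb, 1)  # least-squares line

    if slope < 0:
        zero_day = -intercept / slope  # day the fitted line reaches zero
        remaining = zero_day - days[-1]
        print(f"Losing {-slope:.1f} GB/day; "
              f"full in roughly {remaining:.0f} days")
    else:
        print("Free space isn't shrinking; nothing to schedule yet.")

A straight-line fit is crude, but for scheduling maintenance it only needs to be roughly right, and roughly right beats surprised.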
Imagine Your Reaction
Imagine that it's a quiet part of the day. You have your personal chart laid out on your desk and you're looking it over for patterns. Suddenly, your boss appears over your shoulder and says, "That looks interesting. What is it? Can I have a copy?"
Now, notice your reaction. Did you get a good twist in your gut? Good. Hold on to it. By examining your reaction to this scenario, you can gain some insight into why many measurement programs fail.
When I imagine this scenario, I get a feeling of dread. Dread that I might find myself in a situation where I'm being judged by numbers that I know don't reflect real achievements and effectiveness. I worry that my manager is going to waste my time by asking for detailed explanations about why the chart looks the way it does, and that I'll have to repeat this waste of time every week. I fear that when review time comes, I'll get dinged for some nonsense like "doesn’t keep inbox under control." (To be fair to past managers, most would immediately understand the significance, or lack thereof, of the type of personal data I've suggested collecting. But it only takes one bad manager to leave scars.)
From my reactions to this imagined situation, I drew my own lessons; you might draw others. Compare notes with colleagues, since people react in different ways. Having some knowledge about the range of reactions this scenario can bring up will help you plan how best to introduce your measurement program.
Now imagine that you are sitting in your manager’s weekly staff meeting. Toward the end of the meeting, she announces her concern that your team isn't being responsive to requests from other groups, and that you have had a great idea that will help her. She pulls out your graph and shows it around the room, telling the team that it is the kind of visual presentation that will help her keep on top of the situation. She asks team members to record their daily inbox size and message counts, and to turn in a graph of it along with their weekly status report.
Again, notice your reaction. Imagine also the reactions of others on your team. How do you think they'll feel about the announcement? How do you think they'll feel about you? Then, if she reminds the team that yearly performance reviews will be done the following month, how might these feelings change?
Feelings Matter
We've just stepped into one of the great unspoken truths of software process measurement: Feelings matter, and fear matters a lot. When people learn that they will be the subject (or target) of a software measurement program, particularly if they're surprised by the announcement, you can expect strong reactions, some of which will be based on fear.
Of these, the fear-based reactions are the ones you want to avoid. To understand why, let's see how these reactions play out.
Consider Others' Perceptions
After your manager's imaginary meeting, there would probably be a flurry of email shuffling and deleting, and a sudden increase in the amount of outgoing email. Faced with the choice of spending an hour addressing an important email, or using that same time to "process" a dozen less important emails, some team members will choose to optimize their performance as measured. Some incoming emails might be seen as hot potatoes, to be turned around as quickly as possible to keep them from appearing in the inbox count, and to keep the outgoing email count high. Overall productivity suffers, but the numbers look good.
Eventually, though, complaints would filter back to your manager: it now takes your team multiple email exchanges to resolve issues where a single message used to suffice, and the people in your group seem to be responding to emails without taking the time to read them completely.
What went wrong?
When your manager mentioned measurement in close proximity to performance reviews, measurement became linked in people's minds to reward and punishment, and fear took hold. And because the measurements weren’t carefully chosen, your manager's request for data had the opposite of her intended effect. By optimizing for quantity rather than quality, some members of the team threw the system out of balance. Your manager started with a concern that her team was not being responsive to email, and ended up with a team that wasn't responsive to email. "What gets measured gets improved" can be a double-edged sword when improvement comes at the expense of balance.
In this scenario, the link between measurement and rewards was accidental. But often the link is intentional. Managers are often on the lookout for ways to measure individual performance, but measuring the effectiveness of software developers can be surprisingly difficult. It's tempting and easy to latch on to a measurement program as a way to rank people. Appraisals are a tricky subject on their own; linking them to measurement complicates both. Once people become aware of how they're being measured, they’ll tend to conform to the reward structure, either by working to optimize the things they're getting measured on at the expense of what is truly important, or by "gaming the system" to boost their numbers. In either case, the system you're trying to measure gets destabilized, and the measurement program may lose viability and fail.
This scenario parallels many real-life situations that my colleagues and I have observed. The common theme is that a measurement program gets launched too quickly, and gets co-opted by management to measure individual performance. People respond to the new reward system by optimizing their measured performance, with disastrous, or at best unfortunate, systemic effects. Where quantity is rewarded, you’ll get lots of quantity. Reward developers based on the number of lines of code they produce, and you'll get a system with lots of lines of code. This is rarely what you truly want. Reward (or withhold punishment) based on quick-task turnaround, and people might rush into tasks without taking the time to completely understand them. By choosing your measures without thinking through the consequences, you may indirectly penalize behaviors that you want to encourage.
Apply What You've Learned
A good software measurement program is essential for an organization that wants to improve its software development processes. As you plan your measurement program, remember first and foremost that software is produced by people, and those people have their own aspirations, concerns, and fears. Many otherwise good books on measurement seem to pretend that human beings aren't involved in the process of developing software. Yet the human element is the primary determinant of whether a measurement program will succeed or fail.
When you launch a large-scale measurement effort, apply the skills you developed during your self-measurement program: choosing measures that survive missing data, charting the results so that trends stand out, and recording the events that explain the blips.
Don't forget to apply the insights you've gained through self-measurement as well. Keep people in mind as you plan and launch your measurement program. By explaining what you intend to do before you start doing it, you'll avoid some of the fear that is invoked when people get caught by surprise. If team members understand what will be measured and why, and what will be done with the data, more fear will be driven out. If you have management support, at least in the form of an agreement that none of your data will be used to evaluate individuals, people will feel safer. Maintain the team's trust by presenting results in terms of process, and not in terms of people. Resist all attempts by management to use the program for personnel evaluation.
And don't forget yourself. By keeping a self-measurement program going, you'll make a good start toward keeping your own workload under control. Try it!