Lean Metrics for Agile Software Configuration Management

[article]
  • Throughputof the request-queue is the [average] number of requests per unit of time. So the number of requests that arrive, are approved, or closed all relate to different aspects of throughput:
    • the arrival rate tells us the demand, the approval rate tells us our how frequently we are supplying (“feeding”) actionable requests to development,
    • and the closure rate tells us how frequently we are delivering realized value to the customer.

Note that the "value" associated with each request is not part of the equation, just the rate at which requests flow through these critical value-transformation points in the system: when it is first conceptualized, when it is understood & authorized, and when it is finally realized. 

  • Process Batch-size is all the requests that are decided upon in a single CCB or governance meeting, ... 
  • Transfer Batch-size would be the number of requests we allow to be queued-up (submitted) prior to approving them (possibly for a CCB meeting, or maybe for a particular release, iteration, or build/milestone). It is similar to a backlog (but not quite the same thing).
  • Processing-time is average approval-time of a request from the time it arrives up until it is accepted and authorized. And ...
  • Transfer-time is the time [average] time between approval of a request and when development work commences to implement the request. Sometimes a request is approved, but still waits a long time before it is prioritized high-enough to be selected for and targeted to a release, iteration or build/milestone. Transfer-time can also be the time between when a change-task is allocated until the time it is actively being worked on.
  • Takt time in this case would relate to response time and would be the [average] number of request the team can approve during a given day/hour if they didn't have to wait-around for information from other stakeholders. 
  • System outage would occur if the approval pipeline is broken, and we are somehow unable to respond to customer demand by supplying development with approved requests to implement.

If any of our readers have other ideas about what the above would correspond to for request/change management (or disagree with any of the above), we’d like to hear from you. We don’t claim to have perfect knowledge or perfect understanding on how to translate these concepts into request-management terms and we appreciate any feedback that helps us to learn & improve.

Integration Codelines as "Streams" of Development Change-Flow
A codeline that is used to integrate changes and baseline the results can be viewed as a value-delivery stream of sorts. The codeline is the media through which software development changes flow in a value-stream. The efficiency and throughput of the system is not based so much on the customer assigned business-value for the functionality being developed: that can change at any time, along with any other business condition or priority.

So the efficiency and throughput of a codeline have more to do with the rate at which changes flow through the codeline. The value of those changes is assured or promoted to the extent that the quality of the changes can be assured, but ensuring the changes implemented are valued is a function more of the request/order management pipeline than of the development and integration pipeline. This view of a codeline as a stream through which integrated changes flow is supported by Laura Wingerd's book Practical Perforce: Channeling the Flow of Change in Software Development Collaboration (indeed, it is supported by the very title of the book).

If we regard a codeline as a production system in this manner, its availability to the team is a critical resource. If the codeline is unavailable, it represents a "network outage" and critical block/bottleneck of the flow of value through the system. This relates to the above as follows:

  • Throughput of the codeline is the [average] number of change "transactions" per unit of time. In this case we'll use hours or days. So the number of change-tasks committed per day or per hour is the throughput (note that the "value" associated with each change is not part of the equation, just the rate at which changes flow through the system).
  • Process Batch-size is all the changes made for a single change-task to "commit" and ... 
  • Transfer Batch-size would be the number of change-tasks we allow to be queued-up (submitted) prior to integrating (merging, building & testing) the result. Note that if we target only one change-task per integration attempt, then we are basically attempting single-piece flow.
  • Processing-time is average duration of a development-task from the time it begins up until it is ready-to-commit. And ..
  • Transfer-time is the time it takes to transfer (merge) and then verify (build & test) the result. We could also call this the overall integration time (or the integration-request cycle-time)
  • Takt time in this case would regard the development as the "customers" and would be the [average] number of changes the team can complete during a given day/hour if they didn't have to wait-around for someone else's changes to be committed. 
  • System outage would occur if the codeline/build is broken. It could also be unavailable for other reasons, like if corresponding network or hardware of version-control tool was "down", but for now let's just assume that outages are due to failure of the codeline to build and/or pass its tests (we can call these "breakages" rather than "outages")
  • MTTR (Mean-time to repair) is the average time to fix codeline "breakage," and ... 
  • MTBF (Mean-time before failure) is the average time between "breakages" of the codeline

Note that if full builds (rather than incremental builds) are used for verifying commits, then build-time is independent of the number of changes. Also note that it might be useful to capture the [average] number of people blocked by a "breakage," as well as the number of people recruited (and total effort expended) to fix it. That will helps us determine the severity (cost) of the breakage, and whether or not we're better off trying to have the whole team try to fix it, or just one person (ostensibly the person who broke it), or somewhere in between (maybe just the set of folks who are blocked).

 Nested Synchronization and Harmonic Cadences

Lean "guru" Don Reinertsen (author of Managing the Design Factory), has written about using nested synchronization with harmonic cadences:

"The practical economics of different processes may demand different batch sizes and different cadences. Whenever we operate coupled processes using different cadences it is best to synchronize these cadences as harmonic multiples of the slowest cadence. You can see this if you consider how you would synchronize the arrival of frequent commuter flights with less frequent long haul flights at an airline hub."

In their latest book Implementing Lean Software Development: From Concept to Cash, Mary and Tom Poppendieck advise using continuous integration (with test-driven development) and nested synchronization instead of infrequent, big-bang integration. Both this and Reinertsen's statement above apply directly to Agile CM environments and how we might measure key indicators related to these factors:

  • Harmonic cadences address nested synchronization of integration/build frequencies, both in the case of
  1. different types of builds (private build, integration build, release build), and ... 
  2. different levels of builds (component builds, product-builds)
  3. and also in the case of connected supplier/consumer "queues" where builds or components are received from an (internal or external) supplier and incorporated into our own product/components builds.

About the author

Brad Appleton's picture Brad Appleton

Brad Appleton is a software CM/ALM solution architect and lean/agile development champion at a large telecommunications company. Currently he helps projects and teams adopt and apply lean/agile development and CM/ALM practices and tools. He is coauthor of the book Software Configuration Management Patterns, a columnist for the CMCrossroads and AgileConnection communities at Techwell.com,  and a former section editor for The C++ Report. You can read Brad's blog at blog.bradapp.net.

About the author

Steve Berczuk's picture Steve Berczuk

Steve Berczuk is a Principal Engineer and Scrum Master at Fitbit. The author of Software Configuration Management Patterns: Effective Teamwork, Practical Integration, he is a recognized expert in software configuration management and agile software development. Steve is passionate about helping teams work effectively to produce quality software. He has an M.S. in operations research from Stanford University and an S.B. in Electrical Engineering from MIT, and is a certified, practicing ScrumMaster. Contact Steve at steve@berczuk.com or visit berczuk.com and follow his blog at blog.berczuk.com.

About the author

Robert Cowham's picture Robert Cowham

Robert Cowham has long been interested in software configuration management while retaining the attitude of a generalist with experience and skills in many aspects of software development. A regular presenter at conferences, he authored the Agile SCM column within the CM Journal together with Brad Appleton and Steve Berczuk. His day job is as Services Director for Square Mile Systems whose main focus is on skills and techniques for infrastructure configuration management and DCIM (Data Center Infrastructure Management) - applying configuration management principles to hardware documentation and implementation as well as mapping ITIL services to the underlying layers.

AgileConnection is one of the growing communities of the TechWell network.

Featuring fresh, insightful stories, TechWell.com is the place to go for what is happening in software development and delivery.  Join the conversation now!

Upcoming Events

Sep 22
Sep 24
Oct 12
Nov 09