common language for the team. The base VM template is effectively a gold standard that is reliably consistent between projects and activities, as commonplace as a PDF file or a TCP socket. A customer team that adopted this approach experienced a transformative clarity best described as “a kind of reverse Tower of Babel”; for the first time, there was a single way to talk about both infrastructure and product issues.
Jettisoning everything except “one unchanging compute environment” may sound like the doctor just prescribed amputation at the neck for your headache. “Not gonna work for us,” you might be saying (if you’re still reading!). “Our product ships on eleven platforms times at least four active branches per platform. The whole point of RelEng is to actually build those bits, from those branches, for those platforms. Just skipping them is not a solution.” Naturally: the One True Machine pattern offers nothing to the supremely irritating Solaris 8 patch stream your biggest customer has bribed you to keep on life support since 1995. Nor does it do anything to relieve the burden of porting critical infrastructure to a shaky new platform when the product takes on a new row of cells in the OS support matrix. So really, what’s the point?
The point is best illustrated with a Pareto chart, a trusty graphic long employed by quality managers from all walks of life for visualizing the impact of the lowest-hanging fruit. Make a list of all your problems and sort them into buckets by platform. What percent of the issues could you have caught on the single most popular host platform/configuration? If your picture looks anything like our customers’, the answer is “a lot,” often a clear majority. Declaring that RelEng will only regularly test on one platform, only allow developer-scheduled builds on that platform, and only run initial regressions on that platform is another way of saying that the team is optimizing its ability to attack its most common problems.
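The bucketing exercise is simple enough to sketch in a few lines of Python. What follows is only an illustration under stated assumptions: the platform names and issue counts are invented, and the helper `pareto_by_platform` is a hypothetical name, not part of any tool mentioned here. It assumes each issue has already been tagged with the platform on which it could first have been caught.

```python
from collections import Counter


def pareto_by_platform(issue_platforms):
    """Return (platform, count, cumulative_percent) rows, largest bucket first."""
    counts = Counter(issue_platforms)
    total = sum(counts.values())
    rows = []
    running = 0
    # most_common() yields buckets in descending order of size,
    # which is the left-to-right order of bars on a Pareto chart.
    for platform, count in counts.most_common():
        running += count
        rows.append((platform, count, round(100.0 * running / total, 1)))
    return rows


# Hypothetical defect log (invented data): one entry per issue, naming the
# platform on which the issue could first have been caught.
issues = (
    ["linux-x64"] * 13
    + ["windows-x64"] * 4
    + ["solaris-8"] * 2
    + ["aix"] * 1
)

for platform, count, cumulative in pareto_by_platform(issues):
    print(f"{platform:12} {count:3d} {cumulative:6.1f}%")
```

With this made-up data, the first row accounts for 13 of 20 issues, or 65 percent, which is the kind of “clear majority” the chart is meant to expose. Real numbers will differ; the value of the exercise is in seeing how steeply the cumulative column climbs after the first bucket.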
There’s a secondary, more subtle, but more important effect of using One True Machine: it’s so much faster, more efficient, and generally more fun to use that the rest of the organization will slowly align behind it. New features are written with testing on the OTM in mind. Cross-compilers and emulators for oddball platforms materialize. The default ‘OS:’ field in the defect tracker is set to it. Productive teams relish momentum, and the presence of an express train is reason enough to redefine goals as its destination.
At our customer, the One True Machine became the only way to build and test the product. By petrifying this part of the process, they opened up countless other possibilities that were not only easy to implement but also easy to sustain. The result was a release team that was at once nimble, scalable, and, perhaps most importantly, transparent. When the rest of the software development organization saw how easily they could request builds, schedule test runs, and get fast feedback, they were directly motivated to continue to allow RelEng to take the one great shortcut (“we’re only going to make this work on one box”) that fueled a huge, self-perpetuating productivity gain across the company.
About the Author

Usman Muzaffar is Vice President of Product Management at Electric Cloud, the leading provider of software production management solutions. He was part of the team that founded the company and served as one of the original developers on both the ElectricAccelerator and ElectricCommander products. Prior to Electric Cloud, he worked as a Software Engineer at Scriptics, Inc. and Interwoven (acquired by Autonomy), designing and developing content management, syndication, and distribution systems.