Descrambling Parallel Build Logs

[article]
Summary:
One of GNU make's many features allows you to shorten build times by running more than one command at a time. If your dependencies are all correct, or nearly correct, this can give you a significant improvement, and since it's built into the tool you get it "for free." But GNU make's parallel build feature—often called "dash j mode," after the command-line option that is used to enable it—is not without drawbacks. The worst of these is that GNU make parallel builds can produce incorrect results if the build dependencies are not sufficiently correct.

One of GNU make's many features allows you to shorten build times by running more than one command at a time. If your dependencies are all correct, or nearly correct, this can give you a significant improvement, and since it's built into the tool you get it "for free." But GNU make's parallel build feature—often called "dash j mode," after the command-line option that is used to enable it—is not without drawbacks. The worst of these is that GNU make parallel builds can produce incorrect results if the build dependencies are not sufficiently correct. Fixing that is way outside the scope of this column. But we can do something about the second biggest problem: the scrambled build log. It's not a perfect solution, as you'll see, but it addresses a large portion of the problem, and it's simple to implement.

A typical scrambled build log
If you haven't used GNU make parallel builds, you may not be familiar with the problem we're trying to solve here. The issue is that with multiple commands running simultaneously, all writing output and error text to the same location, the build log becomes a jumbled mess. The output from each command gets interleaved with the output from other commands. Worse, the error messages get mixed up too, so it becomes difficult or impossible to tell which commands are producing the errors. Consider this trivial makefile:

all: a b c
a b c: ; @for n in 1 2 3 4 ; do echo $@-$$n && sleep 1 ; done
Run serially, the output looks like this:

a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4

But run in parallel, the output is all mixed up:

a-1
b-1
c-1
b-2
a-2
c-2
a-3
b-3
c-3
b-4
a-4
c-4

If you run it in parallel again, the output is all mixed up again, but in a different way:

a-1
c-1
b-1
a-2
c-2
b-2
a-3
c-3
b-3
a-4
c-4
b-4

In fact, you might get a different permutation of the output every time you run the build. That's no problem if you never have any errors in your build, but for those of us without that kind of luck, this situation is a complete disaster. Unfortunately, you can't completely solve this problem without help from GNU make itself, but we can make things a little better by doing a bit of programming, and leveraging another, often overlooked, feature of GNU make.

Locker
The first piece of our solution is a simple utility that uses file locking to ensure only one command can write to the build log at a time. We'll look at a Linux version of the utility here to get a better idea of how it works.

// locker
//
//      A command invocation wrapper that captures stdout/stderr from the
//      command, then grabs an exclusive lock on a specified file and copies
//      the command stdout/stderr to the real stdout.  Intended for use with
//      GNU make parallel builds to keep stderr/stdout from each command
//      distinct.
//
//      Usage:  locker

About the author

AgileConnection is one of the growing communities of the TechWell network.

Featuring fresh, insightful stories, TechWell.com is the place to go for what is happening in software development and delivery.  Join the conversation now!