This past week our team was asked to make some modifications to an e-mail sent to customers after they made a purchase. The change was fairly routine: add some text that should only appear under a certain condition. The e-mails were fairly easy to generate because the team had chosen the Velocity library as the template system; in fact, there was a way to generate sample e-mails without starting up the application server.
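For readers who haven't used Velocity, the kind of conditional text involved looks roughly like this (the variable names here are invented for illustration, not taken from our templates):

```velocity
Thank you for your order, ${customerName}!
#if( $order.hasBackorderedItem )
One or more items in your order are backordered and will ship separately.
#end
```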
As we dug into the task, we found that the simple change wasn't as simple as we expected - go figure. To make things even more complicated, every change to the e-mail required the team to first save a copy of all the e-mails, make the change, generate fresh copies, and then use some form of 'diff' comparison tool to check each one to make sure nothing changed unexpectedly. This inevitably meant looking up the syntax for diffing two lists of files, and it required ignoring expected differences such as changed dates or a different order of items. There were enough e-mails to check that it became mind-numbing work, and thus far too easy to mistake a real error for an expected difference.
Faced with this task, we did what we felt any appropriately lazy programmer would do: we automated the tests.
We used the "Gold Master" pattern as J.B. Rainsberger describes in his book "JUnit Recipes" in Chapter 12: "Testing web components". First we saved the expected copy of the e-mails to text files and checked them into version control. Next, we converted the code that generated the e-mails into JUnit tests that generated fresh copies and compared them against the saved versions. This required making sure the e-mails were all generated deterministically: no e-mail could directly use the current time (since the tests would fail tomorrow, or even a minute later), and all lists of items had to be sorted in a predictable order.
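Making the generation deterministic can be sketched like this (a simplified illustration, not our actual code - the class and method names are invented): the generator takes a `java.time.Clock` so tests can pin the timestamp, and line items are sorted before rendering so iteration order can't vary.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DeterministicEmailData {
    // Production code would pass Clock.systemUTC(); the Gold Master
    // tests pass a fixed clock so the rendered date never changes.
    static Instant orderTimestamp(Clock clock) {
        return clock.instant();
    }

    // Sort item names so the rendered list always appears in the same
    // order, regardless of how the source collection happens to iterate.
    static List<String> sortedItems(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        Clock fixed = Clock.fixed(Instant.parse("2008-01-01T00:00:00Z"), ZoneOffset.UTC);
        System.out.println(orderTimestamp(fixed));   // always the same instant
        System.out.println(sortedItems(Arrays.asList("widget", "anvil", "gadget")));
    }
}
```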
Sometimes Gold Master tests cause more problems than they're worth. These tests are essentially character-by-character comparisons that can catch irrelevant changes in whitespace or formatting. However, in our case, the e-mails changed infrequently so we were willing to accept these potential false positives. (To prevent too many annoying failures, we stripped excess whitespace in our tests and then compared the results.) And since these e-mails were sent to the customers, we wanted a step in our process to confirm any changes to these e-mails - expected or not.
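The whitespace stripping amounted to something like the following (a simplified sketch; in the real tests this normalization fed a JUnit assertEquals() against the saved file):

```java
public class GoldMasterCompare {
    // Trim each line and collapse runs of spaces and tabs, so that
    // reformatting a template doesn't fail the comparison while any
    // change to the actual words still does.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder();
        for (String line : text.split("\r?\n")) {
            out.append(line.trim().replaceAll("[ \t]+", " ")).append('\n');
        }
        return out.toString();
    }

    static boolean matchesGoldMaster(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        String saved = "Dear   customer,\n\tThanks for your order.";
        String fresh = "Dear customer,\nThanks for your order.";
        System.out.println(matchesGoldMaster(saved, fresh)); // prints true
    }
}
```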
In the end, automating the tests took much longer than it would have taken to do the diffs by hand. But by working through this process, we knew that the computer would be much more accurate about reporting differences than we would. Now there was no manual effort required to verify the generated e-mails, which meant we would have more time to improve the e-mail templates with every future change. Since our tests were acting as a mercilessly honest judge, we felt safe to make much more aggressive refactorings to simplify the e-mail templates.
But perhaps more importantly - it just felt good. Once we finished this e-mail change, we then had to decide if we wanted to automate the comparisons of the remaining half-a-dozen e-mails. We had enough time left in our iteration to try it out, and it turned out to be simple work now that we'd paid off the initial expense of creating the first few tests.
And boy is it ever a great feeling when doing the right thing is actually easy.
IntelliJ IDEA has a great visual comparison tool that shows the highlighted differences between two multiline Strings when a comparison using JUnit's assertEquals() method fails. While the failure message from the text-based JUnit runner does summarize the differences succinctly, we found the visual diff offered by IntelliJ much easier to read for these Gold Master tests.