There are two types of code that developers write. First, there is the code that gets the job done. We'll call this type of code functional code because it supplies the functionality that satisfies user requirements. Second, there is code that keeps the functional code from failing because of erroneous input (or some other unexpected environmental condition). We'll call this type of code error code because it handles errors. For many programmers, this is the code that they are forced to write out of necessity, not because it is particularly enjoyable.
Writing both types of code simultaneously is problematic because the developer must switch mental context between the two. Each context switch requires the developer to stop thinking about one type of code and start thinking about the other.
Consider Johnny, a hardworking hypothetical developer, writing a new application. Johnny begins by writing the functional code, maybe even going so far as using something like UML to fully understand the various user scenarios that he must code. Good, Johnny. Indeed, good programmers like Johnny can find a wealth of information out there to help them write good functional code. The books all address it, the tutorials address it, and there are many useful published examples to work from.
But, what happens when Johnny realizes the need for error code? Perhaps he is in the middle of writing or specifying some code object when he decides that, say, an input needs to be bounds-checked. What does Johnny do? One choice for Johnny is to stop writing the functional code and write the error code instead. This requires a context shift inside Johnny's developer-head. He must stop thinking about the user scenarios and the functional code that he is implementing, and start thinking about how to handle the error. Since handling errors can be complicated, this may take him some time.
Now, when Johnny returns to the task of writing the functional code, he has to recall what he was thinking about when he last put it down. This context shift is harder than the first, given the myriad design decisions and minute technical details that go into writing any nontrivial program. You see the problem: poor Johnny has had to endure two context switches to handle a single error. Imagine how many such context switches happen while writing even a small application.
Another choice for Johnny would be to postpone writing the error code in order to avoid the context shift. Assuming Johnny remembers to eventually get around to writing the error code, he's probably going to have to spend some time recalling the nature of the error event he's trying to write a handler for. So now Johnny is writing the error code without the benefit of context. Writing error code is problematic no matter how you approach it, and it is therefore a ripe place for guys like me to look for bugs. So now let's look at the testing perspective: how do we approach testing error code?
Forcing error messages to occur is the best way to get error code to execute. Software should either appropriately respond to bad input or it should successfully prevent the input from ever getting to the software in the first place. The only way to know for sure is to test the application with a battery of bad inputs. There are many factors to consider when testing error code. Perhaps the most important is to understand how the application responds to erroneous input. I try to identify three different types of error handlers:
Input filters can be used to prevent bad input from ever getting to the software under test. In effect, bad inputs are filtered by, for example, a graphical user interface, and only legal inputs are allowed past the interface.
Input checking can be performed to ensure that the software will not execute using bad input. The simplest case is that every time an input enters the system, the developer inserts an IF statement to ensure that the input is legal before it is processed; that is, IF the input is legal, THEN process it, ELSE display an error message. During this first attack, it is our goal to ensure that we see all such error messages.
Exception handlers are a last resort and are used to clean up after the software has failed as a result of processing bad input. In other words, bad inputs are allowed into the system, used in processing, and the system is allowed to fail. The exception handler is a routine that is called when the software fails. It usually contains code that resets internal variables, closes files, and restores the ability of the software to interact with its users. In general, some error message is also displayed.
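The three kinds of error handler can be sketched side by side. This is only an illustration in Python under assumed requirements: the age field, its 0 to 150 range, the three-character limit, and every function name are hypothetical and do not come from any particular application.

```python
# Hypothetical sketch: the field, its limits, and all names are assumptions.
MIN_AGE, MAX_AGE = 0, 150  # assumed legal range for an "age" input
MAX_LEN = 3                # assumed maximum number of characters


def filter_input(raw: str) -> str:
    """Input filter: illegal characters never get past the interface."""
    return "".join(ch for ch in raw if ch in "0123456789")


def check_input(raw: str) -> str:
    """Input checking: IF the input is legal THEN process it, ELSE show an error."""
    legal = (
        raw.isascii()
        and raw.isdigit()
        and len(raw) <= MAX_LEN
        and MIN_AGE <= int(raw) <= MAX_AGE
    )
    if legal:
        return f"Age recorded: {int(raw)}"
    return f"Error: age must be a whole number from {MIN_AGE} to {MAX_AGE}."


def process_with_handler(raw: str, state: dict) -> str:
    """Exception handler: bad input is let in, and the handler cleans up afterward."""
    state["busy"] = True  # stands in for internal variables set during processing
    try:
        age = int(raw)  # raises ValueError when raw is not an integer
        return f"Age recorded: {age}"
    except ValueError:
        return "Error: the age you entered could not be read."
    finally:
        state["busy"] = False  # reset internal state whether or not processing failed
```

Note that the exception handler in this sketch only reacts to inputs that make the conversion fail. A value such as "-5" converts cleanly and is accepted, which is exactly the kind of overlooked case a tester should probe for.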
Testers must consider each input that the software under test accepts and focus on erroneous values. The idea here is to enter values that are too big, too small, too long, or too short: values that are out of the acceptable range or of the wrong data type. The major defect one will find with this approach is missing error cases, meaning input data that the developer did not know was erroneous or individual cases that were overlooked. Missing cases almost always cause the software to hang or crash. One should also be on the lookout for misplaced error messages. Sometimes the developer gets the error message right but assigns it to the wrong input values. Thus, the message seems like nonsense for the particular input values submitted.
Finally, of pure nuisance value are uninformative error messages. Although such messages cause no direct harm to the user, they are sloppy and will cast doubt in a user's mind on the credibility of the software producer. "Error 5-Unknown Data" might have seemed a good idea to some developer, but will cause frustration in the mind of the user who will have no idea what they did wrong.
Whether one is testing an input field in a GUI panel or a parameter in an API call, one must consider properties of an input when conducting this attack. Some general properties to consider are:
Input type: Entering invalid types will often cause an error message. For example, if the input in question is an integer, then enter a real number or a character.
Input length: For character (alphanumeric) inputs, entering a few too many characters will often elicit an error message.
Boundary values: Every numeric data type has boundary values, and sometimes these values represent special cases. The integer zero, for example, is the boundary between positive and negative numbers.
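One way to apply these properties is to generate a small battery of bad inputs and record how the code under test responds to each. The following is a minimal sketch in Python, not a definitive harness: erroneous_values, attack, and parse_percentage are hypothetical names, and parse_percentage is a deliberately flawed stand-in for real code under test.

```python
from typing import Callable


def erroneous_values(low: int, high: int, max_len: int) -> list[str]:
    """Bad inputs for an integer field accepting low..high, at most max_len characters."""
    return [
        # Input type: a real number, characters, and nothing at all
        "3.14", "abc", "",
        # Input length: a few too many characters
        "1" * (max_len + 3),
        # Boundary values: just outside the legal range
        str(low - 1), str(high + 1),
    ]


def attack(code_under_test: Callable[[str], str], values: list[str]) -> dict[str, str]:
    """Feed each value to the code under test and record its response."""
    results = {}
    for value in values:
        try:
            results[value] = code_under_test(value)
        except Exception as exc:
            # An unhandled failure suggests a missing error case.
            results[value] = f"CRASH: {type(exc).__name__}"
    return results


def parse_percentage(raw: str) -> str:
    """Hypothetical code under test; it never checks that raw is numeric."""
    value = int(raw)
    if 0 <= value <= 100:
        return f"OK: {value}"
    return "Error: percentage must be between 0 and 100."


if __name__ == "__main__":
    for value, response in attack(parse_percentage, erroneous_values(0, 100, 3)).items():
        print(repr(value), "->", response)
```

In this sketch the out-of-range and overlong values draw an error message, while the wrong-type and empty values are reported as crashes because the stand-in function has no handling for them. Reading each recorded message against the input that produced it is also a quick way to spot misplaced or uninformative error messages.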
Be prepared to find some spectacular bugs!
For more information and examples, see Chapter 2, How to Break Software (Addison-Wesley, 2002).