The Trouble with Derivation

Better Software Magazine
Volume-Issue: 2009-03
Summary:

This article discusses the dark underbelly of derivation: the fragile base class. It's possible to modify a base class in such a way that, even though you've improved its implementation and all your tests work just fine, you've nonetheless damaged the derived classes, perhaps fatally.

The notion of polymorphism, or method overriding, is fundamental to object-oriented thinking. That said, the practice of derivation, or subclassing, is vastly overused in most program designs. Subclassing can actually introduce more problems than it solves. I'm talking specifically about implementation inheritance (the "extends" relationship in Java). There's almost no downside to interface inheritance ("implements" in Java), where you only override abstract methods and inherit no implementation. In fact, you could look at the Gang of Four design patterns [1] as ways to replace implementation inheritance with interface inheritance. Once you exclude the patterns that are hardly ever used, like Interpreter, only one pattern, Class Adapter, requires an "extends." The rest all use interfaces. This meta-pattern isn't an accident.
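The difference between the two kinds of inheritance can be sketched in a few lines. The example below is not from the article: Stack, InheritedStack, and DelegatingStack are hypothetical names chosen only to contrast "extends" with "implements" plus delegation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the only contract callers are meant to rely on.
interface Stack<T> {
    void push(T item);
    T pop();
    boolean isEmpty();
}

// Implementation inheritance: every public ArrayList method leaks into the stack.
class InheritedStack<T> extends ArrayList<T> {
    public void push(T item) { add(item); }
    public T pop() { return remove(size() - 1); }
}

// Interface inheritance plus delegation: the list is a private detail.
class DelegatingStack<T> implements Stack<T> {
    private final List<T> items = new ArrayList<>();

    @Override public void push(T item) { items.add(item); }
    @Override public T pop() { return items.remove(items.size() - 1); }
    @Override public boolean isEmpty() { return items.isEmpty(); }
}

public class InheritanceKindsDemo {
    public static void main(String[] args) {
        InheritedStack<String> leaky = new InheritedStack<>();
        leaky.push("a");
        leaky.add(0, "x");                 // legal: bypasses stack discipline entirely
        System.out.println(leaky.pop());   // prints a
        System.out.println(leaky.size());  // prints 1 ("x" is still sitting at the bottom)

        Stack<String> safe = new DelegatingStack<>();
        safe.push("a");
        safe.push("b");
        System.out.println(safe.pop());    // prints b
        // safe.add(0, "x") would not compile: Stack exposes no such method.
    }
}
```

In the delegating version, the ArrayList could be swapped for some other container without any caller noticing, which is the isolation that the "extends" version gives up.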

From a theory point of view, the problem with derivation is coupling. The derived classes are very tightly coupled to their base classes, and a change in either could affect the other.

That's bad, of course. A class should be isolated from the rest of the program so that any internal changes to the class won't impact anybody that uses objects of that class. This same reasoning applies to the base-class/derived-class relationship.

There are a lot of places where that too-tight coupling relationship causes problems. Consider the Java example of the Template Method design pattern shown in listing 1.
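The listing itself is not reproduced in this text. As a rough stand-in, the sketch below shows the general shape of a Template Method class with the two features the discussion depends on: a protected field named someValue and an overridable prepare() step. Everything else (the class names Report and SloppyReport, the run() method, the initial value 10) is a hypothetical choice made for illustration.

```java
// Hypothetical stand-in: only someValue and prepare() are named in the article.
abstract class Report {
    protected int someValue = 10;   // protected, so any subclass can modify it

    // The template method: fixes the overall algorithm and hands one step
    // to the subclass.
    public final int run() {
        int before = someValue;
        prepare();                  // subclass code runs here with full access
        // The base class assumes someValue is unchanged; nothing enforces that.
        return someValue - before;  // 0 only if the subclass left the field alone
    }

    protected abstract void prepare();
}

class SloppyReport extends Report {
    @Override
    protected void prepare() {
        someValue = -1;             // silently rewrites base-class state
    }
}

public class TemplateMethodDemo {
    public static void main(String[] args) {
        System.out.println(new SloppyReport().run()); // prints -11
    }
}
```

Making someValue private and exposing it through a read-only accessor would take this particular way of breaking the base class away from the subclass.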

I've opened the door for the subclass method to wreak real havoc here. First, since I was foolish enough to define someValue as protected, the derived class can (and does) modify it. I'll be really surprised, for example, when someValue has magically changed its value after the prepare() call. Moreover, I now can't touch someValue as I maintain the code or the derived-class code will break. A minor change to the superclass could potentially impact every subclass. Fields should always be private, unless they're actual constants.

Even worse, the prepare() call, or any method that's structured like prepare(), can do pretty much anything it wants, even things that are dangerous. What if prepare() throws an unexpected exception, for example?

I see this problem constantly. Consider Java's lowly toString() method. In Java, the Object class, which is every other class's superclass, defines a toString() that's supposed to return a String representation of the object. The toString() method lets you write code like that in listing 2 to help with your debugging.

I've often seen people override toString() to do something other than return a simple representation of the object, however. This flaw could be annoying but benign. If toString() returned an elaborate HTML or SQL representation of the object, for example, logging that representation would be annoyingly useless, but it wouldn't break the program. However, a toString() override can be actively evil. For example, toString() could throw an exception that you weren't expecting and didn't catch, thereby bringing down the server with what you thought of as a simple logging request.
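A minimal sketch of the "actively evil" case follows. It is an illustration under stated assumptions, not the article's listing: Account, ToStringDemo, and logLine() are hypothetical names, and logLine() merely stands in for whatever logging call the program makes.

```java
// Hypothetical classes; the article names only toString().
class Account {
    private final String owner;     // null models an account with no owner yet

    Account(String owner) {
        this.owner = owner;
    }

    @Override
    public String toString() {
        // Does real work on a field that may be null, so it can throw
        // NullPointerException instead of just describing the object.
        return "Account[" + owner.trim() + "]";
    }
}

public class ToStringDemo {
    // Stand-in for a logging call: string concatenation invokes toString().
    static String logLine(Object o) {
        return "DEBUG: " + o;
    }

    public static void main(String[] args) {
        System.out.println(logLine(new Account("alice"))); // prints DEBUG: Account[alice]
        try {
            logLine(new Account(null));
        } catch (NullPointerException e) {
            // Without this catch, a "harmless" log statement ends the thread.
            System.out.println("logging call threw NullPointerException");
        }
    }
}
```

The caller of logLine() has no reason to expect an exception, which is exactly why an override like this is dangerous.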

The main practical problem with derivation is something called the "fragile-base-class problem." It's possible for someone to modify a base class in a way that looks perfectly safe but nonetheless breaks the derived classes. Since the base-class modifications will pass all the old regression tests, this sort of problem is particularly difficult to find. Moreover, the derived classes might not be available to the person who modified the base class, so they can't be tested. They may be in another application, or they may be using a library that includes the base class.
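The mechanism can be shown with a small, hypothetical example (none of these classes appear in the article). BagV1 and BagV2 stand for two releases of the same base class; the only change is that addAll() stops routing through add(). A test of the base class alone, such as checking size(), passes for both releases, yet a subclass that counts insertions by overriding add() quietly stops working.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Release 1 of a hypothetical base class: addAll() is built on add().
class BagV1 {
    private final List<String> items = new ArrayList<>();
    public void add(String s) { items.add(s); }
    public void addAll(List<String> all) { for (String s : all) add(s); }
    public int size() { return items.size(); }
}

// Release 2: an "optimization" that no longer calls add(). Same public behavior.
class BagV2 {
    private final List<String> items = new ArrayList<>();
    public void add(String s) { items.add(s); }
    public void addAll(List<String> all) { items.addAll(all); }
    public int size() { return items.size(); }
}

// The same derived class, compiled against each release. It relies on the
// undocumented fact that addAll() calls add().
class CountingBagV1 extends BagV1 {
    int added = 0;
    @Override public void add(String s) { added++; super.add(s); }
}

class CountingBagV2 extends BagV2 {
    int added = 0;
    @Override public void add(String s) { added++; super.add(s); }
}

public class FragileBaseDemo {
    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");

        CountingBagV1 before = new CountingBagV1();
        before.addAll(input);
        System.out.println(before.size() + " " + before.added); // prints 3 3

        CountingBagV2 after = new CountingBagV2();
        after.addAll(input);
        System.out.println(after.size() + " " + after.added);   // prints 3 0
    }
}
```

Nothing in the base class's own regression tests would flag the second release, because the breakage exists only in code its maintainer may never see.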

Java provides a classic example of a fragile base class in its InputStream class. InputStream has three methods for getting input: An abstract read() method reads a single byte


About the author

Allen I. Holub

Allen I. Holub has worked in the computer industry since 1979. Focusing on object-oriented technology and agile process, he excels at helping companies fix broken software development processes and adopt new ones by providing CEO/CTO coaching, staff training, project management, and programming services. Allen has authored nine books, most recently Holub on Patterns, and more than 150 magazine articles, writing as contributing editor for Dr. Dobb's Journal, JavaWorld, and SD Times among others. Allen has worked on projects ranging from operating systems and compilers to Ajax Web applications, and he teaches for the University of California, Berkeley, Extension. Allen is the security-track chair for the Software Development conference and sits on the board of advisors of Ascenium Corp. and Ontometrics Corp. Contact Allen at www.holub.com/allen.html.
