Code reuse, and especially object reuse, is not a new topic. Software professionals have been talking about reuse for decades, but somehow true reuse still eludes virtually every organization engaged in software development. Scanning through the literature on reuse, one can find plenty of articles on its benefits and even quite a few sources on the object-oriented design principles that best lead to reuse. What is missing are success stories and practical advice on how to make reuse a reality within your organization. That’s where this article differs.
This article is based on first-hand experience implementing reuse practices and strategies at companies such as Alamo Local Market Division and FedEx Custom Critical. We will start by exploring what we actually mean by reuse. Next, we will present a reuse maturity model that you can use to gauge progress within your own organization. We will conclude with an overview of the specific practices and tools that we’ve used to put reuse into practice.
What is Reuse?
Almost everyone I talk to has a different idea about what constitutes reuse. For example, if a developer copies a source file from one project into another project, is that reuse? How about if the developer copies a compiled class instead of a source file? My point is not to state that one view is correct and the other isn’t, but merely to point out that there are many kinds of reuse. If we are going to make any true headway in the practice of reuse, we have to understand these different kinds of reuse and the repercussions of each. To that end, I would like to suggest the following Reuse Maturity Model.
Reuse Maturity Model
The following Reuse Maturity Model is based on experience within my software development organization as well as experiences with and observations of other development organizations.
Level 1 - Single Project Source Based Reuse
Organizations at the first maturity level place all their source code within a single project. Very often this single pool of source code will hold multiple applications. For example, a single project A may be home to two applications, A and B. This practice totally sidesteps the above question of whether copying a source file from one project to another project constitutes reuse. There is simply no need to copy source files or compiled files from one project to another because there is only one project.
Of course, there is a limit to how well this practice will scale. Once the number of applications or the number of developers increases, maintaining a common pool of source code will become increasingly difficult.
Level 2 - Multi Project Source Based Reuse
The next stage on the reuse maturity roadmap is to separate the source base into multiple projects (usually the project boundaries will correspond with the application boundaries) and practice source based reuse between the projects. Source based reuse in this scenario entails copying the source developed in one project into another project. The prime targets of such reuse are utilities, which were originally developed in one project but can be used in the next project. For example, project A is home to application A and project B is home to application B, and the code common to both is copied into each project.
The problem with multiple project source based reuse is that after being copied, the reused source code has no link back to the original. This causes all sorts of maintenance problems, since bug fixes will have to be applied to every project that reuses the copied code (something that will never happen in practice). Also, as the two host projects evolve, the copies of the reused code will evolve with them and drift apart, thus making maintenance even more difficult.
Level 3 - Ad hoc Binary Reuse
Ad hoc binary reuse is the next step along this maturity roadmap. Organizations advance to this level after trying multi-project source based reuse and realizing the drawbacks of that approach. Under ad hoc binary reuse, project boundaries realign and no longer mirror application boundaries. Projects at this maturity level can correspond to applications or to reusable components. The utilities source code that was copied from project to project at the previous maturity level is now placed in its own project. This utilities project has its own lifecycle that is independent of the application projects. The application projects include the binary artifacts of the utilities project and a dependency relationship between the projects is established. Maintenance of the utilities project is greatly simplified because rather than maintaining multiple diverging copies of the utilities, only a single version needs to be maintained now. As an application requires additional features of the utilities, those features can be added to the utilities project. This way the additional features become available to all applications using the utilities.
But at this maturity level there are no release procedures or dependency management procedures. Whenever new features are required, they are added to the utilities project, the project is rebuilt, and the resulting artifacts are placed in the application project. This makes it impossible to know exactly which version of the utilities project is being used by any application project. That, in turn, makes it impossible to know whether a new feature made the library incompatible with the other application projects already using the utilities. Also, when a bug is discovered, it is difficult to know which version of the code base contained the bug. After a fix is implemented, it is equally difficult to know whether the newest version (the one that contains the fix) will be compatible with all the application projects that need the utilities library.
Level 4 - Controlled Binary Reuse and the Reuse/Release Equivalence Principle (REP)
Controlled binary reuse builds on ad hoc binary reuse. The project boundaries remain the same, with application projects and component/library projects. Where this maturity level differs from the one before it is in the process for versioning and tracking the releases of projects. At this maturity level, each release of a project is controlled and tracked with a version number.
At this level, when a bug is discovered within a reused component, the exact version of the component with the bug can be identified. The dependencies between projects can now be tracked explicitly as well. For example, we may know that application A requires version 1.0 or higher of project C, whereas application B requires version 1.1 or higher.
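To make this kind of dependency tracking concrete, here is a minimal sketch in Python (chosen only for brevity; the projects described later in this article were written in Java). It assumes dotted numeric version numbers such as 1.0 and 1.1. The names parse_version, MINIMUM_REQUIRED and compatible_applications are hypothetical, invented for this illustration rather than taken from any tool mentioned in this article.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '1.10' into (1, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical registry: the minimum version of project C that each
# application requires (the example figures used in the text above).
MINIMUM_REQUIRED: Dict[str, str] = {
    "application A": "1.0",
    "application B": "1.1",
}


def compatible_applications(released: str, requirements: Dict[str, str]) -> List[str]:
    """Return the applications whose minimum requirement is met by a release."""
    released_version = parse_version(released)
    return sorted(
        app
        for app, minimum in requirements.items()
        if released_version >= parse_version(minimum)
    )
```

With these requirements, release 1.0 of project C satisfies application A only, while release 1.1 satisfies both applications. A check like this is only possible once every release carries a version number, which is exactly what level 4 adds.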
Finally, this reuse maturity level embodies the Reuse/Release Equivalence Principle first made public by Robert C. Martin in C++ Report in 1996. The REP states:
The granule of reuse is the granule of release. Only components that are released through a tracking system can be effectively reused.
Experiences with Reuse
The reuse practices that I present here have evolved over the last six years and have been put to the test on several medium to large scale projects. One of those projects is the MPOWERENT project for Alamo Local Market Division, and another is the i2o pilot project for FedEx Custom Critical.
The MPOWERENT project resulted in a library of ten reusable components totaling hundreds of classes and hundreds of KLOCs. These reusable components have already been reused in three applications. The FedEx i2o pilot project resulted in a library of 20 reusable components (hundreds of classes and approximately 100 KLOCs) that are being used by two applications right now. Both of these projects were developed in Java (J2EE), and both used practices consistent with level 4 of the Reuse Maturity Model. I believe that the two types of tools most instrumental in allowing both of the above-mentioned projects to be developed at level 4 are the build system and the centralized build management server. Tools that can automate parts of the process are very important because, although level 4 is attainable without automation in theory, in practice that is never the case.
Build System
The build system is a tool like ANT or Make, or some other similar tool (even shell or Perl scripts), that can be used to build your project independently of the development environment. The above-mentioned projects used ANT as their build system since it was a natural fit for these Java based projects. What matters, though, is not the particular tool but that the build system be independent of the development environment. In other words, you should not have to rely on your developers to build the software in their IDE.
Once your build system is independent of the IDE, then you can automate it. Automation is the key to achieving level 4 and putting into practice the Reuse/Release Equivalence Principle. Plus, automation will allow you to do some pretty useful things like setting up nightly builds so that every night your project is built and tested and the results are automatically emailed to interested parties.
The Centralized Build Management Server
A centralized build management server (BMS) is a tool that automates a controlled build process, allowing builds to be tracked and published. It is the tool that comes as close as a tool can to embodying the Reuse/Release Equivalence Principle. The build management server that we used on the Alamo and FedEx projects is an open source tool called Anthill (http://www.urbancode.com/projects/anthill/).
Anthill began as an internal tool we developed to help us with the issue of controlling binary releases. As a development organization we moved from Level 1 through Level 2 to Level 3 of the Reuse Maturity Model and were looking for a way to improve on that. We realized that one of the biggest problems we were having was knowing which version of a library was being used by a particular project. Although our build system was independent of the IDE, each developer was still building the projects they needed on their own. So if a developer needed to use project C in his application, he would get project C from the source repository, build the project and place the resulting artifacts (jar files) in his application project. There was no central location where developers could go and look for the binary artifacts of a project.
Also, even though our process required that developers tag the source repository every time they did a release, that rarely happened in practice. We clearly needed something that would tag the source repository automatically.
Another problem that we were finding is that despite best intentions, developers were making builds that included source code that was not committed to the source repository. This wreaked havoc during the maintenance phase.
All of these problems were solved by a Centralized Build Management Server like Anthill. Anthill automatically creates a project intranet site for every project. This project site contains project documentation, javadocs, browsable source code, build logs, revision logs, as well as downloadable artifacts for every build of the project. The Anthill BMS contains a list of projects that it manages. Each project can be assigned a build schedule, for example a nightly build schedule or a 2-hour build schedule. Anthill initiates a build for each project according to its schedule. So a project on the nightly schedule would be built by Anthill every night.
When Anthill initiates a build it does the following:
- Gets the latest version of the project source code from the source repository.
- Obtains from the source repository a log of revisions since the last build. If there were no revisions then the build ends and Anthill cleans up any temporary files.
- If there were revisions, then Anthill increments the build number.
- Optionally tags the repository with the build number. Tagging is recommended for any production or QA build, but for development builds (which may happen several times per day) a tag may not be necessary.
- Builds the project by calling ANT with a specified build script. The build script is part of the project that Anthill got from the repository. It is the same build script that developers can use to build the project locally in their development environment for testing.
- Sends out emails to all users interested in this project with the status of the build (did it succeed or fail) along with links to the build log and revision log for this build.
- Cleans up any temporary files.
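The steps above can be sketched roughly as follows. This is an illustrative outline in Python, not Anthill's actual code or API: the Project and BuildResult types and the run_build function are hypothetical, and the repository, build, and email operations are passed in as plain functions so that the sketch stays independent of any particular source control or build tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Project:
    name: str
    build_number: int = 0    # number of the last build that actually ran
    tag_builds: bool = True  # tagging is optional, e.g. off for development builds


@dataclass
class BuildResult:
    build_number: Optional[int]  # None when the build was skipped
    status: str                  # "skipped", "success" or "failure"


def run_build(
    project: Project,
    checkout: Callable[[Project], str],
    revisions_since_last_build: Callable[[Project], List[str]],
    tag: Callable[[Project, int], None],
    build: Callable[[str], bool],
    notify: Callable[[Project, int, str, List[str]], None],
    cleanup: Callable[[str], None],
) -> BuildResult:
    """Run one scheduled build, following the listed steps in order."""
    workdir = checkout(project)  # get the latest source
    try:
        revisions = revisions_since_last_build(project)
        if not revisions:
            # Nothing changed since the last build, so end the build here.
            return BuildResult(None, "skipped")
        project.build_number += 1
        if project.tag_builds:
            tag(project, project.build_number)
        # The build script comes from the checked-out project itself.
        status = "success" if build(workdir) else "failure"
        notify(project, project.build_number, status, revisions)
        return BuildResult(project.build_number, status)
    finally:
        cleanup(workdir)  # temporary files are removed on every path
```

Passing the operations in as functions is a design choice made for this sketch: it keeps the ordering of the steps visible and lets the cycle be exercised with stand-in functions before it is wired to a real repository or build script.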
Because the artifacts on the project intranet sites are authoritative, we no longer have the problem of developers forgetting to tag the repository or to commit changes to files. The repository gets tagged automatically by the BMS. And if a developer forgets to commit code changes, then those changes are not part of the authoritative build. If that is a problem, then the problem will be discovered and the relevant files will be committed for the next authoritative build.
Practicing reuse is not a simple or easy task. Even at level 4, with tools like ANT and Anthill, there are many difficulties, such as dealing with large project dependency graphs, cyclical dependencies, and concurrent development of multiple projects that need to add features to the same dependency, and I’m sure there are many other challenges that I have not yet encountered. I would like this article to stand as a small advance on a battlefield that our industry will have to face eventually. Hopefully this article will stimulate some thoughts and discussions as well as future articles sharing other success stories and practices. Please feel free to email me with your comments.