High performance of integration builds is one of the keys to risk-free software development. This article discusses simple and advanced approaches to improving the performance of integration builds.
Notion Of An Integration Build
A software build is the process of transforming a project code base into usable applications. An integration build is a build that is performed to ensure that new changes integrate well into the existing code base. Integration builds provide feedback on the quality of new changes. This feedback is used to deliver timely fixes if the changes do not integrate and break the project code base. Integration builds are often run by a dedicated build management server and are triggered when new changes are detected in a version control system. Running integration builds continuously is also known as continuous integration.
Performance Requirements For Integration Builds
The main value of an integration build is feedback on the quality of new changes. With fast builds, breakage can be identified and addressed in less time, reducing the risks caused by delays in project delivery. That is why build speed is the most important characteristic of an integration build.
Approaches For Addressing Build Speed Concerns
Software build processes are highly I/O-, memory- and computation-intensive. The simplest way to address build speed concerns is to increase the computational resources available to a build server. Build performance can be improved further by applying advanced approaches, such as:
- Partitioning build server load through build remoting
- Parallelizing build process through build clustering
- Partitioning of test execution
Adding Computational Resources
Build speed can be improved by adding computational resources to a build management server. Such resources include CPU speed, number of CPUs, RAM, and disk and network I/O. Yet vertical scaling of a single build machine is limited, because such resources cannot be added indefinitely. More advanced options should be considered to scale build performance further.
Picture 1. Increasing build speed by adding computational resources
Partitioning Build Server Load
A single-box build machine soon becomes either I/O bound or CPU bound. Financial concerns may also limit growth: SMP systems with a high number of processors can be expensive. Once the limit of vertical scalability is reached, the load on the build server can be distributed evenly by moving some or all of the build processes to relatively inexpensive build machines that run in a remote builder mode and are controlled by a central build manager. While this approach requires advanced build tools such as Parabuild, it allows the build infrastructure to scale further as the load grows.
Picture 2. Moving load from a build management server to remote builders
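As a rough illustration of distributing load evenly, the sketch below assigns each build to the remote builder with the least accumulated work. It is not how Parabuild or any particular tool schedules builds; the function name assign_builds, the builder names and the duration estimates are all hypothetical.

```python
import heapq


def assign_builds(builds, builders):
    """Assign each build to the currently least-loaded builder.

    builds:   dict of build name -> estimated duration in minutes
              (hypothetical estimates, e.g. taken from earlier runs)
    builders: list of remote builder names (hypothetical)
    Returns a dict of builder name -> list of assigned build names.
    """
    # Min-heap of (accumulated minutes, builder name).
    heap = [(0, name) for name in builders]
    heapq.heapify(heap)
    assignment = {name: [] for name in builders}

    # Place the longest builds first so the totals even out.
    ordered = sorted(builds.items(), key=lambda item: (-item[1], item[0]))
    for build, minutes in ordered:
        load, name = heapq.heappop(heap)
        assignment[name].append(build)
        heapq.heappush(heap, (load + minutes, name))
    return assignment
```

A real build manager would also have to account for builder capabilities, such as operating system and installed tool chains, which this sketch ignores.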
Partitioning the load relieves an overloaded build management server, but the time of each build is still limited by the performance of the machine that runs it. Build time can be further decreased by running parts of a single build process in parallel.
Parallelizing Build Process
A typical build process is made of a sequence of steps that normally execute one after another. Depending on how the build steps depend on each other, some of them can be run in parallel by a set of remote machines. Certain steps, such as compilation, may be parallelized internally as well. The approach in which a set of remote machines executes parts of a particular build step in parallel is known as build clustering and is available in a new generation of build tools such as ElectricAccelerator and Incredibuild. It helps to scale build performance beyond the capabilities of a single build box and to decrease build time further.
Picture 3. Parallelizing build execution using build clustering
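To make the role of step dependencies concrete, the sketch below groups build steps into "waves" in which every step depends only on steps from earlier waves, so the steps within one wave could run in parallel. It uses graphlib from the Python standard library (Python 3.9 or later). The function name parallel_waves and the step names in the usage note are hypothetical, and the sketch does not reflect how ElectricAccelerator or Incredibuild work internally.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies):
    """Group build steps into waves that can run in parallel.

    dependencies: dict of step name -> iterable of steps it depends on.
    Returns a list of waves; each wave is a sorted list of step names
    whose dependencies were all completed in earlier waves.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # Mark the whole wave as finished to unlock dependent steps.
        sorter.done(*ready)
    return waves
```

For a build in which "compile_core" and "compile_ui" both depend on "checkout" and "link" depends on both compile steps, the two compile steps end up in the same wave and could be handed to different remote machines.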
Partitioning Test Load
Unit tests running in a batch mode are an important part of a successful integration build because they validate that new changes do not introduce regressions in the quality of the code base. Over time, the number of tests grows, and so does the time needed to run them. Running tests is subject to the same performance concerns seen in software builds. Test performance can be improved by breaking a test set into groups that are deployed to remote builders. Each remote builder runs its own test group in parallel with the other remote builders. Once all test groups have run, the test results are consolidated back at the build management server and presented for failure analysis or for archival.
Picture 4. Partitioning test execution
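The partition-run-consolidate cycle described above can be sketched as follows. Threads stand in for remote builders here, which is an assumption made only to keep the example self-contained; the names partition, run_group and run_partitioned are hypothetical, and tests are modeled as plain callables that raise AssertionError on failure.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(tests, group_count):
    """Split a list of tests into group_count groups, round-robin."""
    return [tests[i::group_count] for i in range(group_count)]


def run_group(group):
    """Stand-in for one remote builder: run its tests, record outcomes."""
    results = {}
    for name, test in group:
        try:
            test()
            results[name] = "passed"
        except AssertionError:
            results[name] = "failed"
    return results


def run_partitioned(tests, group_count):
    """Run test groups in parallel and consolidate their results.

    tests: list of (name, callable) pairs.
    Returns a dict of test name -> "passed" or "failed".
    """
    consolidated = {}
    with ThreadPoolExecutor(max_workers=group_count) as pool:
        # Each group runs on its own worker; results are merged centrally.
        for group_results in pool.map(run_group, partition(tests, group_count)):
            consolidated.update(group_results)
    return consolidated
```

Round-robin partitioning is the simplest choice; when test durations vary widely, grouping by measured run time would likely balance the groups better.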
Performance concerns of integration builds can be addressed by dedicating adequate hardware resources to the software build management server and by applying advanced techniques such as build load partitioning, parallel (clustered) builds and partitioning of test load.
1. Slava Imeshev; Capacity Planning For Software Build Management Servers; CM Journal, September 2005; http://www.cmcrossroads.com/ubbthreads/showflat.php?Number=51496