Scoring and Evaluating Software Methods, Practices, and Results


Software engineering and software project management are complex activities. Both software development and software management have dozens of methodologies and scores of tools available, many of which are beneficial. In addition, quite a few methods and practices have been shown to be harmful, based on depositions and court documents from litigation over software project failures. To evaluate the effectiveness or harm of these numerous and disparate factors, we have developed a simple scoring method that runs from +10 for maximum benefit to -10 for maximum harm.

Software development and software project management have dozens of methods, hundreds of tools, and scores of practices. Many of these are beneficial, but many others are harmful. There is a need to evaluate and rank these disparate topics on a consistent scale.

To deal with this situation, a scoring method has been developed that allows disparate topics to be ranked on a common scale. Methods, practices, and results are scored on a scale that runs from +10 to -10, using the criteria shown in table 1.

Both the approximate impact on productivity and the approximate impact on quality are included. The scoring method can be applied to specific size ranges, such as 1,000 function points or 10,000 function points. It can also be applied to specific types of software, such as information technology, web applications, commercial software, military software, and several others.



Table 1: Scoring Ranges for Software Methodologies and Practices

    Score     Productivity Improvement     Quality Improvement
      10               25%                         35%
       9               20%                         30%
       8               17%                         25%
       7               15%                         20%
       6               12%                         17%
       5               10%                         15%
       4                7%                         10%
       3                3%                          5%
       2                1%                          2%
       1                0%                          0%
       0                0%                          0%
      -1                0%                          0%
      -2               -1%                         -2%
      -3               -3%                         -5%
      -4               -7%                        -10%
      -5              -10%                        -15%
      -6              -12%                        -17%
      -7              -15%                        -20%
      -8              -17%                        -25%
      -9              -20%                        -30%
     -10              -25%                        -35%

The midpoint or “average” against which improvements are measured is traditional application development: waterfall development performed by organizations that either don’t use the Software Engineering Institute’s capability maturity model or are at level 1. Low-level programming languages are also assumed. This fairly primitive combination remains more or less the most widely used development approach even in 2008.
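
To make the scale concrete, the ranges in table 1 can be encoded as a simple lookup. The sketch below (in Python) is only an illustration added here, not part of the author’s scoring spreadsheet; it simply restates the values from table 1.

```python
# Approximate improvements for each score, restated from table 1.
# Values are percentages relative to the baseline described above
# (waterfall, CMM level 1 or no CMM, low-level languages).
SCORE_IMPACT = {
    10: (25, 35),   9: (20, 30),   8: (17, 25),   7: (15, 20),
     6: (12, 17),   5: (10, 15),   4: (7, 10),    3: (3, 5),
     2: (1, 2),     1: (0, 0),     0: (0, 0),    -1: (0, 0),
    -2: (-1, -2),  -3: (-3, -5),  -4: (-7, -10), -5: (-10, -15),
    -6: (-12, -17), -7: (-15, -20), -8: (-17, -25),
    -9: (-20, -30), -10: (-25, -35),
}

def impact(score: float) -> str:
    """Return the approximate productivity and quality impact for a score."""
    productivity, quality = SCORE_IMPACT[int(round(score))]
    return f"productivity {productivity:+d}%, quality {quality:+d}%"

print(impact(9.15))   # code inspections, scored 9.15 in table 2
print(impact(-9.18))  # defect removal efficiency below 85%
```

Feeding a practice’s rounded table 2 score into this lookup gives its approximate stand-alone impact on productivity and quality.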

One important topic needs to be understood. Quality needs to be improved faster and to a higher level than productivity in order for productivity to improve at all. The reason for this is that finding and fixing bugs is overall the most expensive activity in software development. Quality leads and productivity follows. Attempts to improve productivity without improving quality first are not effective.
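
The economics behind that claim can be illustrated with a small worked example. All of the figures below (project size, build effort, defect potentials, removal efficiency, and repair hours) are hypothetical assumptions chosen only to show the mechanism; they are not measurements from the author’s benchmark data.

```python
# Hypothetical 1,000 function point project, illustrating why lowering the
# defect potential reduces total effort even when removal efficiency rises.
SIZE_FP = 1000
BUILD_HOURS_PER_FP = 8.0        # assumed effort to specify, design, and code

def total_hours(defects_per_fp: float, removal_efficiency: float,
                hours_per_defect: float = 5.0) -> float:
    """Build effort plus the effort to find and fix pre-release defects."""
    build = SIZE_FP * BUILD_HOURS_PER_FP
    defects_found = SIZE_FP * defects_per_fp * removal_efficiency
    return build + defects_found * hours_per_defect

low_quality = total_hours(defects_per_fp=6.0, removal_efficiency=0.85)
high_quality = total_hours(defects_per_fp=3.0, removal_efficiency=0.95)

print(f"low-quality project:  {low_quality:,.0f} hours")   # 33,500
print(f"high-quality project: {high_quality:,.0f} hours")  # 22,250
print(f"effort saved: {1 - high_quality / low_quality:.0%}")
```

Even in this simplified sketch, which ignores post-release repairs, the higher-quality project finishes with roughly one-third less effort; the gain comes entirely from having fewer defects to find and fix.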

For software engineering, a serious historical problem has been that measurement practices are so poor that quantified results are scarce. There are many claims that specific tools, languages, and methodologies should each be viewed as a “best practice,” but empirical data on their actual effectiveness in terms of quality or productivity has been scarce. Three points need to be considered.

The first point is that software applications vary in size by many orders of magnitude. Methods that might be ranked as “best practices” for small programs of 1,000 function points in size may not be equally effective for large systems of 100,000 function points in size.

The second point is that software engineering is not a “one size fits all” kind of occupation. There are many different forms of software such as embedded applications, commercial software packages, information technology projects, games, military applications, outsourced applications, open-source applications, and several others. These various kinds of software applications do not necessarily use the same languages, tools, or development methods.

The third point is that tools, languages, and methods are not equally effective or important for all activities. For example, a powerful programming language such as Objective C will obviously have beneficial effects on coding speed and code quality. But which programming language is used has no effect on requirements creep, user documentation, or project management. Therefore, the phrase “best practice” also has to identify which specific activities are improved. This is complicated because activities include development, deployment, and post-deployment maintenance and enhancements. Indeed, for large applications, development can take up to five years, installation can take up to one year, and usage can last as long as twenty-five years before the application is finally retired. Over the course of more than thirty years, there will be hundreds of activities.

The result of these various factors is that selecting a set of “best practices for software engineering” is a fairly complicated undertaking. Each method, tool, or language needs to be evaluated in terms of its effectiveness by size, by application type, and by activity.

Overall Rankings of Methods, Practices, and Sociological Factors
In order to be considered a “best practice,” a method or tool has to have some quantitative proof that it actually provides value in terms of quality improvement, productivity improvement, maintainability improvement, or some other tangible factors.

Looking at the situation from the other end, there are also methods, practices, and social issues that have demonstrated that they are harmful and should always be avoided. For the most part, the data on harmful factors comes from depositions and court documents in litigation.

In between the “good” and “bad” ends of this spectrum are practices that might be termed “neutral.” They are sometimes marginally helpful and sometimes not. But in neither case do they seem to have much impact.

Although the author’s book Software Engineering Best Practices dealt with methods and practices by size and by type, it might be of interest to show the complete range of factors ranked in descending order, with the ones having the widest and most convincing proof of usefulness at the top of the list. Table 2 lists a total of 200 methodologies, practices, and social issues that have an impact on software applications and projects.

The average scores shown in table 2 are actually based on the average of six separate evaluations:

1. Small applications < 1,000 function points

2. Medium applications between 1,000 and 10,000 function points

3. Large applications > 10,000 function points

4. Information technology and web applications

5. Commercial, systems, and embedded applications

6. Government and military applications

The data for the scoring comes from observations among about 150 Fortune 500 companies, some fifty smaller companies, and thirty government organizations. Negative scores also include data from fifteen lawsuits. The scoring method does not have high precision and the placement is somewhat subjective. However, the scoring method does have the advantage of showing the range of impact of a great many variable factors. This article is based on the author’s two recent books: Software Engineering Best Practices published by McGraw Hill in 2009 and The Economics of Software Quality published by Addison Wesley in 2011.
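
As an illustration of how each value in table 2 is derived, the sketch below averages six sub-scores for a single practice. The practice name and sub-scores are invented for the example; the real sub-scores come from the author’s underlying spreadsheet.

```python
# Each practice is scored separately in the six contexts listed above and
# the published value is the arithmetic mean of those six sub-scores.
# The sub-scores here are invented purely to illustrate the calculation.
contexts = [
    "small applications (< 1,000 FP)",
    "medium applications (1,000-10,000 FP)",
    "large applications (> 10,000 FP)",
    "information technology and web",
    "commercial, systems, and embedded",
    "government and military",
]

sub_scores = {"hypothetical practice X": [9.5, 9.0, 6.5, 9.0, 8.0, 8.5]}

for practice, scores in sub_scores.items():
    average = sum(scores) / len(scores)
    print(f"{practice}: average score {average:.2f}")
    for context, score in zip(contexts, scores):
        print(f"  {context:40s} {score:4.1f}")
```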

However, the resulting spreadsheet is quite large and complex, so only the overall average results are shown here:



Table 2: Evaluation of Software Methods, Practices, and Results
(each entry is followed by its average score)

Best Practices
1. Reusability (> 85% zero-defect materials)   9.65
2. Defect potentials < 3.00 per function point   9.35
3. Defect removal efficiency > 95%   9.32
4. Personal software process (PSP)   9.25
5. Team software process (TSP)   9.18
6. Automated static analysis   9.17
7. Inspections (code)   9.15
8. Measurement of defect removal efficiency   9.08
9. Hybrid (CMM+TSP/PSP+others)   9.06
10. Reusable feature certification   9.00
11. Reusable feature change controls   9.00
12. Reusable feature recall method   9.00
13. Reusable feature warranties   9.00
14. Reusable source code (zero defect)   9.00

Very Good Practices
15. Early estimates of defect potentials   8.83
16. Object-oriented development (OO)   8.83
17. Automated security testing   8.58
18. Measurement of bad-fix injections   8.50
19. Reusable test cases (zero defects)   8.50
20. Formal security analysis   8.43
21. Agile development   8.41
22. Inspections (requirements)   8.40
23. Timeboxing   8.38
24. Activity-based productivity measures   8.33
25. Reusable designs (scalable)   8.33
26. Formal risk management   8.27
27. Automated defect tracking tools   8.17
28. Measurement of defect origins   8.17
29. Benchmarks against industry data   8.15
30. Function point analysis (high-speed)   8.15
31. Formal progress reports (weekly)   8.06
32. Formal measurement programs   8.00
33. Reusable architecture (scalable)   8.00
34. Inspections (design)   7.94
35. Lean Six-Sigma   7.94
36. Six-sigma for software   7.94
37. Automated cost estimating tools   7.92
38. Automated maintenance work benches   7.90
39. Formal cost tracking reports   7.89
40. Formal test plans   7.81
41. Automated unit testing   7.75
42. Automated sizing tools (function points)   7.73
43. Scrum session (daily)   7.70
44. Automated configuration control   7.69
45. Reusable requirements (scalable)   7.67
46. Automated project management tools   7.63
47. Formal requirements analysis   7.63
48. Data mining for business rule extraction   7.60
49. Function point analysis (pattern matches)   7.58
50. High-level languages (current)   7.53
51. Automated quality and risk prediction   7.53
52. Reusable tutorial materials   7.50
53. Function point analysis (IFPUG)   7.37
54. Measurement of requirements changes   7.37
55. Formal architecture for large applications   7.36
56. Best-practice analysis before start   7.33
57. Reusable feature catalog   7.33
58. Quality function deployment (QFD)   7.32
59. Specialists for key skills   7.29
60. Joint application design (JAD)   7.27
61. Automated test coverage analysis   7.23
62. Re-estimating for requirements changes   7.17
63. Measurement of defect severity levels   7.13
64. Formal SQA team   7.10
65. Inspections (test materials)   7.04
66. Automated requirements analysis   7.00
67. DMAIC   7.00
68. Reusable construction plans   7.00
69. Reusable HELP information   7.00
70. Reusable test scripts   7.00

Good Practices
71. Rational Unified Process (RUP)   6.98
72. Automated deployment support   6.87
73. Automated cyclomatic complexity analysis   6.83
74. Forensic analysis of cancelled projects   6.83
75. Reusable reference manuals   6.83
76. Automated documentation tools   6.79
77. Capability Maturity Model (CMMI Level 5)   6.79
78. Annual training (technical staff)   6.67
79. Metrics conversion (automated)   6.67
80. Change review boards   6.62
81. Formal Governance   6.58
82. Automated test library control   6.50
83. Formal scope management   6.50
84. Annual training (managers)   6.33
85. Dashboard-style status reports   6.33
86. Extreme programming (XP)   6.28
87. Service-oriented architecture (SOA)   6.26
88. Automated requirements tracing   6.25
89. Total cost of ownership (TCO) measures   6.18
90. Automated performance analysis   6.17
91. Baselines for process improvement   6.17
92. Use cases   6.17
93. Automated test case generation   6.00
94. User satisfaction surveys   6.00
95. Formal project office   5.88
96. Automated modeling/simulation   5.83
97. Certification (six sigma)   5.83
98. Outsourcing (maintenance => CMMI 3)   5.83
99. Capability Maturity Model (CMMI Level 4)   5.79
100. Certification (software quality assurance)   5.67
101. Outsourcing (development => CMM 3)   5.67
102. Value analysis (intangible value)   5.67
103. Root-cause analysis   5.50
104. Total cost of learning (TOL) measures   5.50
105. Cost of quality (COQ)   5.42
106. Embedded users in team   5.33
107. Normal structured design   5.17
108. Capability Maturity Model (CMMI Level 3)   5.06
109. Earned-value measures   5.00
110. Unified Modeling Language (UML)   5.00
111. Value analysis (tangible value)   5.00

Fair Practices
112. Normal maintenance activities   4.54
113. Rapid application development (RAD)   4.54
114. Certification (function points)   4.50
115. Function point analysis (Finnish)   4.50
116. Function point analysis (Netherlands)   4.50
117. Partial code reviews   4.42
118. Automated restructuring   4.33
119. Function point analysis (COSMIC)   4.33
120. Partial design reviews   4.33
121. Team Wiki communications   4.33
122. Function point analysis (unadjusted)   4.33
123. Function points (micro .001 to 10)   4.17
124. Automated daily progress reports   4.08
125. User stories   3.83
126. Outsourcing (offshore => CMM 3)   3.67
127. Goal-question metrics   3.50
128. Certification (project managers)   3.33
129. Refactoring   3.33
130. Manual document production   3.17
131. Capability Maturity Model (CMMI Level 2)   3.00
132. Certification (test personnel)   2.83
133. Pair programming   2.83
134. Clean-room development   2.50
135. Formal design languages   2.50
136. ISO Quality standards   2.00

Neutral Practices
137. Function point analysis (backfiring)   1.83
138. Use Case points   1.67
139. Normal customer support   1.50
140. Partial governance (low risk projects)   1.00
141. Object-oriented metrics   0.33
142. Manual testing   0.17
143. Outsourcing (development < CMM 3)   0.17
144. Story points   0.17
145. Low-level languages (current)   0.00
146. Outsourcing (maintenance < CMM 3)   0.00
147. Waterfall development   -0.33
148. Manual change control   -0.50
149. Manual test library control   -0.50
150. Reusability (average quality materials)   -0.67
151. Capability Maturity Model (CMMI Level 1)   -1.50
152. Informal progress tracking   -1.50
153. Outsourcing (offshore < CMM 3)   -1.67

Unsafe Practices
154. Inadequate test library control   -2.00
155. Generalists instead of specialists   -2.50
156. Manual cost estimating methods   -2.50
157. Inadequate measurement of productivity   -2.67
158. Cost per defect metrics   -2.83
159. Inadequate customer support   -2.83
160. Friction between stakeholders and team   -3.50
161. Informal requirements gathering   -3.67
162. Lines of code metrics (logical LOC)   -4.00
163. Inadequate governance   -4.17
164. Lines of code metrics (physical LOC)   -4.50
165. Partial productivity measures (coding)   -4.50
166. Inadequate sizing   -4.67
167. High-level languages (obsolete)   -5.00
168. Inadequate communications among team   -5.33
169. Inadequate change control   -5.42
170. Inadequate value analysis   -5.50

Worst Practices
171. Friction/antagonism among team members   -6.00
172. Inadequate cost estimating methods   -6.04
173. Inadequate risk analysis   -6.17
174. Low-level languages (obsolete)   -6.25
175. Government mandates (short lead times)   -6.33
176. Inadequate testing   -6.38
177. Friction/antagonism among management   -6.50
178. Inadequate communications with stakeholders   -6.50
179. Inadequate measurement of quality   -6.50
180. Inadequate problem reports   -6.67
181. Error-prone modules in applications   -6.83
182. Friction/antagonism among stakeholders   -6.83
183. Failure to estimate requirements changes   -6.85
184. Inadequate defect tracking methods   -7.17
185. Rejection of estimates for business reasons   -7.33
186. Layoffs/loss of key personnel   -7.33
187. Inadequate inspections   -7.42
188. Inadequate security controls   -7.48
189. Excessive schedule pressure   -7.50
190. Inadequate progress tracking   -7.50
191. Litigation (non-compete violation)   -7.50
192. Inadequate cost tracking   -7.75
193. Litigation (breach of contract)   -8.00
194. Defect potentials > 6.00 per function point   -9.00
195. Reusability (high defect volumes)   -9.17
196. Defect removal efficiency < 85%   -9.18
197. Litigation (poor quality/damages)   -9.50
198. Litigation (security flaw damages)   -9.50
199. Litigation (patent violation)   -10.00
200. Litigation (intellectual property theft)   -10.00

It should be realized that table 2 is a work in progress. Also, the value of table 2 is not in the precision of the rankings, which are somewhat subjective, but in the ability of the simple scoring method to show the overall sweep of many disparate topics using a single scale.

Table 2 is often used as a quick evaluation method for software organizations and software projects. From interviews with project teams and software managers, the methods actually deployed are checked off on table 2. Then the numeric scores from table 2 are summed and averaged.

A leading company will deploy methods that, when summed, total more than 250 and average more than 5.5. Lagging organizations and lagging projects will sum to less than 100 and average below 4.0. The worst average encountered so far was only 1.8, and that evaluation was performed as background to a lawsuit for breach of contract. The vendor, who was also the defendant, was severely behind in the use of effective methods and practices.
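
A minimal sketch of that evaluation procedure follows. The thresholds are the ones just described, and the scores are taken from table 2, but the short list of “deployed” practices is hypothetical; a real assessment checks off entries across the whole table.

```python
# Quick organizational evaluation: check off the practices actually deployed,
# then sum and average their table 2 scores.  A real assessment checks off
# dozens of entries; this deployed set is hypothetical.
deployed = {
    "Inspections (code)": 9.15,
    "Automated static analysis": 9.17,
    "Formal risk management": 8.27,
    "Waterfall development": -0.33,
    "Manual cost estimating methods": -2.50,
}

total = sum(deployed.values())
average = total / len(deployed)

if total > 250 and average > 5.5:
    profile = "leading"
elif total < 100 and average < 4.0:
    profile = "lagging"
else:
    profile = "in between"

print(f"sum = {total:.2f}, average = {average:.2f}, profile = {profile}")
```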

Note that the set of factors included is a mixture. They include full development methods such as team software process (TSP) and partial methods such as quality function deployment (QFD). They include specific practices such as “inspections” of various kinds, and also social issues such as friction between stakeholders and developers. They also include metrics such as “lines of code,” which is ranked as a harmful factor because this metric penalizes high-level languages and distorts both quality and productivity data. What all these things have in common is that they either improve or degrade quality and productivity.

Since programming languages are also significant, it might be asked why specific languages such as Java, Ruby, or Objective C are not included. This is because as of 2011 more than 2,500 programming languages exist, and new languages are being created at a rate of about one every calendar month.

In addition, a majority of large software applications use several languages at the same time, such as Java and HTML, or combinations that may top a dozen languages in the same application. There are too many languages, and they change far too rapidly, for an evaluation to be useful for more than a few months. Therefore, languages are covered only in a general way: whether they are high-level or low-level, and whether they are current languages or “dead” languages no longer used for new development.

Unfortunately, a single list of values averaged over three different size ranges and multiple types of applications does not illustrate the complexity of best-practice analysis. Table 3 shows examples of thirty best practices for small applications of 1,000 function points and for large systems of 10,000 function points. As can be seen, the two lists have very different patterns of best practices.

The flexibility of the agile methods is a good match for small applications, while the rigor of TSP and PSP is a good match for the difficulties of large-system development.

Table 3: Best Practice Differences between 1,000 and 10,000 Function Points


It is useful to discuss the polar opposites of this spectrum: best practices and worst practices.

The definition of a “worst practice” is a method or approach that has been proven to cause harm to a significant number of projects that used it. The word “harm” means either degradation of quality, reduction of productivity, or concealing the true status of projects. In addition, “harm” also includes data that is so inaccurate that it leads to false conclusions about economic value.

Each of these harmful methods and approaches has individually been proven to cause harm in a significant number of the applications that used it. This is not to say that they always fail. Sometimes, albeit rarely, they may even be useful. But in a majority of situations, they do more harm than good in repeated trials.

A distressing aspect of the software industry is that bad practices seldom occur in isolation. The depositions and court documents of lawsuits over projects that were cancelled or never operated effectively show that multiple worst practices are usually deployed concurrently.

From data and observations on the usage patterns of software methods and practices, it is distressing to note that practices in the harmful or worst set are found on about 65 percent of U.S. software projects examined during assessments. Conversely, best practices that score 9 or higher have been noted on only about 14 percent of U.S. software projects. It is no wonder that failures far outnumber successes for large software applications!

From working as an expert witness in a number of breach-of-contract lawsuits, the author has observed that many harmful practices tend to occur repeatedly. Collectively, these are viewed by the author as candidates for being deemed “professional malpractice.” The definition of professional malpractice is an approach that causes harm, that a trained practitioner should know is harmful, and that should therefore be avoided.

Following are thirty issues that have caused trouble so often that the author views them as professional malpractice, primarily when they occur in applications in the 10,000 function point size range. That is the range where failures outnumber successes and where litigation is distressingly common. Only one of the fifteen lawsuits in which the author worked as an expert witness involved an application smaller than 10,000 function points.

Table 4: Candidates for Classification as “Professional Malpractice”

1. Defect removal efficiency < 85%
2. Defect potentials > 6.00 per function point
3. Reusability (uncertified with high defect volumes)
4. Inadequate cost tracking with “leakage” > 40% of actual costs
5. Excessive schedule pressure by clients and executives
6. Inadequate or deceptive progress tracking that conceals problems
7. Inadequate security controls
8. Inadequate inspections of requirements, design, and code
9. Inadequate defect tracking methods that start late
10. Failure to estimate requirements changes
11. Error-prone modules in applications
12. Inadequate problem reports
13. Inadequate measurement of quality
14. Rejection of estimates for business reasons by clients or executives
15. Inadequate testing with low coverage
16. Inadequate risk analysis prior to funding
17. Inadequate cost estimating methods
18. Inadequate value analysis
19. Inadequate change control
20. Inadequate sizing prior to funding
21. Partial productivity measures that concentrate on coding
22. Lines of code metrics (LOC) for economic analysis
23. Inadequate governance by corporate executives
24. Inadequate requirements gathering
25. Cost per defect metrics
26. Inadequate customer support
27. Inadequate measurement of productivity
28. Generalists instead of specialists for large systems
29. Manual cost estimating methods for large systems
30. Inadequate test library control

It is unfortunate that several of these harmful practices, such as “cost per defect” and “lines of code,” are still used on hundreds of projects without the users even knowing that “cost per defect” penalizes quality and “lines of code” penalizes high-level languages.
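
A small, hedged calculation shows both distortions. The test effort, defect counts, and code volumes per function point below are assumed figures chosen only to make the arithmetic visible; they are not taken from the author’s benchmarks.

```python
# "Cost per defect" penalizes quality: fixed test preparation and execution
# costs are spread over fewer defects as quality improves, so the better
# project looks worse per defect even though its total cost is lower.
fixed_test_hours = 200.0
hours_per_fix = 4.0
for defects in (100, 10):
    total = fixed_test_hours + defects * hours_per_fix
    print(f"{defects:3d} defects: total {total:4.0f} h, {total / defects:5.1f} h per defect")

# "Lines of code" penalizes high-level languages: identical functionality and
# identical effort yield very different LOC-per-month figures simply because
# high-level languages need less code per function point (ratios assumed).
functionality_fp = 100
effort_months = 10.0
for language, loc_per_fp in (("assembly (low level)", 320), ("Java (high level)", 53)):
    loc = functionality_fp * loc_per_fp
    print(f"{language:22s} {loc / effort_months:6.0f} LOC per month")
```

In both cases the better outcome looks worse under the metric, which is exactly the distortion noted above.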

Collectively, many or most of these thirty harmful practices are noted in more than 75 percent of software applications of 10,000 function points or more in size. Below 1,000 function points, the significance of many of them declines, and they would drop out of the malpractice range.

Summary and Conclusions
The phrase “software engineering” is actually a misnomer. Software development is not a recognized engineering field. Worse, large software applications fail and run late more often than they succeed.

There are countless claims for tools and methods that are advertised as improving software, but a severe shortage of empirical data on what really works. There is also a shortage of empirical data on what causes harm.

The simple scoring method used in this article attempts to provide at least a rough correlation between methods and practices and their effects on quality and productivity. The current results are somewhat subjective and may change as new data becomes available. However, the scoring method does illustrate a wide range of results, from extremely valuable to extremely harmful.

References and Suggested Readings
Bundschuh, Manfred and Dekkers, Carol; The IT Measurement Compendium; Springer-Verlag, Heidelberg, Deutschland; ISBN 978-3-540-68187-8; 2008.

Charette, Bob; Software Engineering Risk Analysis and Management; McGraw Hill, New York, NY; 1989.

Ewusi-Mensah, Kweku; Software Development Failures; MIT Press, Cambridge, MA; 2003; ISBN 0-262-05072-2; 276 pages.

Galorath, Dan; Software Sizing, Estimating, and Risk Management: When Performance is Measured Performance Improves; Auerbach Publishing, Philadelphia; 2006; ISBN 10: 0849335930; 576 pages.

Garmus, David and Herron, David; Function Point Analysis – Measurement Practices for Successful Software Projects; Addison Wesley Longman, Boston, MA; 2001; ISBN 0-201-69944-3; 363 pages.

Gilb, Tom and Graham, Dorothy; Software Inspections; Addison Wesley, Reading, MA; 1993; ISBN 10: 0201631814.

Glass, R.L.; Software Runaways: Lessons Learned from Massive Software Project Failures; Prentice Hall, Englewood Cliffs; 1998.

International Function Point Users Group (IFPUG); IT Measurement – Practical Advice from the Experts; Addison Wesley Longman, Boston, MA; 2002; ISBN 0-201-74158-X; 759 pages.

Johnson, James et al; The Chaos Report; The Standish Group, West Yarmouth, MA; 2000.

Jones, Capers and Bonsignour, Olivier; The Economics of Software Quality; Addison Wesley, Boston, MA; 2012; ISBN 10: 0-13-258220-1; 587 pages.

Jones, Capers; Software Engineering Best Practices; McGraw Hill, New York, NY; 2010; ISBN 978-0-07-162161-8; 660 pages.

Jones, Capers; Applied Software Measurement, 3rd edition; McGraw Hill, New York, NY; 2008; ISBN 978-0-07-150244-3; 575 pages.

Jones, Capers; Assessment and Control of Software Risks; Prentice Hall, 1994; ISBN 0-13-741406-4; 711 pages.

Jones, Capers; Software Quality – Analysis and Guidelines for Success; International Thomson Computer Press, Boston, MA; ISBN 1-85032-876-6; 1997; 492 pages.

Jones, Capers; Estimating Software Costs; McGraw Hill, New York; 2007; ISBN 13-978-0-07-148300-1.

Jones, Capers; Software Assessments, Benchmarks, and Best Practices; Addison Wesley Longman, Boston, MA; ISBN 0-201-48542-7; 2000; 657 pages.

Jones, Capers: “Sizing Up Software;” Scientific American Magazine, Volume 279, No. 6, December 1998; pages 104-111.

Jones, Capers; Conflict and Litigation Between Software Clients and Developers; Software Productivity Research, Inc.; Narragansett, RI; 2008; 45 pages.

Jones, Capers; “Preventing Software Failure: Problems Noted in Breach of Contract Litigation”; Capers Jones & Associates, Narragansett, RI; 2008; 25 pages.

Kan, Stephen H.; Metrics and Models in Software Quality Engineering, 2nd edition; Addison Wesley Longman, Boston, MA; ISBN 0-201-72915-6; 2003; 528 pages.

McConnell, Steve; Software Estimation: Demystifying the Black Art; Microsoft Press, Redmond, WA; 2006.

McConnell, Steve; Code Complete; Microsoft Press, Redmond, WA; 1993; ISBN 978-1556154843; 886 pages.

Pressman, Roger; Software Engineering – A Practitioner’s Approach; McGraw Hill, NY; 6th edition, 2005; ISBN 0-07-285318-2.

Radice, Ronald A.; High Quality Low Cost Software Inspections; Paradoxicon Publishing, Andover, MA; ISBN 0-9645913-1-6; 2002; 479 pages.

Wiegers, Karl E.; Peer Reviews in Software – A Practical Guide; Addison Wesley Longman, Boston, MA; ISBN 0-201-73485-0; 2002; 232 pages.

Yourdon, Ed; Death March - The Complete Software Developer’s Guide to Surviving “Mission Impossible” Projects; Prentice Hall PTR, Upper Saddle River, NJ; ISBN 0-13-748310-4; 1997; 218 pages.

Yourdon, Ed; Outsource: Competing in the Global Productivity Race; Prentice Hall PTR, Upper Saddle River, NJ; ISBN 0-13-147571-1; 2005; 251 pages.

Web Sites
International Software Benchmarking Standards Group (ISBSG): www.ISBSG.org
International Function Point Users Group (IFPUG): www.IFPUG.org
Process Fusion: www.process-fusion.net
Project Management Institute (PMI): www.PMI.org
Software Engineering Institute (SEI): www.SEI.org
Software Productivity Research (SPR): www.SPR.com
