Offers a comprehensive treatment of the techniques and practices of systems reliability and failure prevention, without the use of advanced mathematics. Features real-world examples from communication networks, aircraft and missile systems, the process industry, and satellite missions. The book helps professionals set reliability requirements for a new product, monitor compliance with these requirements during development and later life-cycle phases, account for software failures in an integrated reliability assessment, and allocate a fixed reliability improvement budget to guide decisions by cost considerations and trade-offs.
Review By: P.L. Kosel
07/08/2010
I enjoyed Herbert Hecht's newest book "Systems Reliability and Failure Prevention." Hecht takes a complex topic and distills it into an easy-to-read reference. His examples, including those distorted by excess media attention (e.g., Mars mission failures), break down into straightforward examinations of "failures in action." This ability to tie real-world examples into the theoretical toolset lends an extra air of legitimacy to the information presented.
This book presents a toolbox of theoretical and practical skills useful to reliability engineers in many different disciplines. Unfortunately for the majority of StickyMinds.com readers, the book targets large-scale system implementations and focuses on hardware and hardware/software hybrids (e.g., electronics and flight control systems). One chapter focuses on software reliability, but the information it provides is high level and not revolutionary for anyone with a background in software quality assurance.
That said, if you are interested in hardware or hardware/software integration, you will find a lot of valuable information, ranging from the basics of reliability engineering to a detailed thesis on the organizational causes of system failure. Hecht supports the theories with chapters that teach practical application: analytics, testing, and redundancy in design. The descriptions of the theories and techniques are appropriately supported by mathematical, tabular, and graphical representations that enhance the scope of the toolset. In addition, a chapter on lifecycle implementation offers guidelines for establishing the appropriate role of reliability engineering in the process flow of a given project or process.
Finally, and perhaps most importantly, two chapters address one of the core issues surrounding reliability engineering–COST! Many of Hecht's examples are directly related to ill-fated cost-saving measures that ultimately resulted in massive failures. He provides techniques for estimating the cost of failure and building a cost of reliability model. The techniques are presented in simple terms with easy-to-understand examples.
I didn't always agree with Hecht's assessment of failure cause and effect. I spent four years on one of the projects he identifies as a cost-based failure, which I vehemently blame on organizational failure. Regardless, Hecht presents and defends his theories completely and objectively. He is well versed in and deeply committed to the topic.
It's unfortunate that the majority of the book is spent studying failures of large systems (spacecraft, airplanes, etc.) when so many of us deal more directly with smaller projects–more specifically, software projects. However, with some imaginative extrapolation, I think anyone with a strong enough interest will find useful information in this book. I know I did.