about six to eight appliances active sixteen hours a day, with test technicians and engineers running tests in two shifts, while the load generation programs ran twenty-four hours a day.
For both automated and manual testing, our techniques were primarily behavioral (black box). The development team performed extensive unit and component testing, applying structural (white box) tests to their code. In this way, our independent test team's efforts complemented the developers' own testing; the test team and the development team found bugs in a ratio of 30:70, respectively.
Figure two also illustrates our test team composition. The two in-house test engineers focused on developing and supervising the execution of test cases. The six test technicians ran the test cases, reported results, and handled “pushing” software updates to the devices. One usability test engineer focused on human factors and Internet harmonization issues. Two test toolsmiths developed and ran the automated load and performance tools.
We hired people who understood the customer’s outlook. Our test team was composed almost entirely of seasoned behavioral test engineers, non-technical usability professionals, and technical support professionals; only the test toolsmiths and the server test engineer were experienced software developers.
6. Bugs: Where They Arose, What They Affected
In terms of defects (bugs), it is interesting to look at where the faults (the actual code errors) occurred and contrast that with where the failures manifested themselves. These analyses are shown in figures three and four. Looking at defect root cause data for both structural and behavioral testing in terms of the culpable subsystem, that is, where the fault was found, we see that about two-thirds of the bugs resided in the intrinsic locus of quality, one-quarter in the systemic locus, and one-twelfth in the harmonic locus. While we do not have new-and-changed function-point or line-of-code metrics available, these proportions fit intuitively with the amount of development effort expended on the client software as compared to the less extensive customization of commercial off-the-shelf systems on the server side.
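The fault proportions quoted above can be checked with a little exact arithmetic. The following is a minimal sketch in Python; the dictionary and function names are illustrative assumptions and do not come from the project's tooling. It encodes only the three fractions stated in the text and confirms that they account for all of the faults.

```python
from fractions import Fraction

# Share of faults by culpable subsystem, as quoted in the text.
# The name and keys are hypothetical labels chosen for this sketch.
FAULT_SHARE_BY_LOCUS = {
    "intrinsic": Fraction(2, 3),
    "systemic": Fraction(1, 4),
    "harmonic": Fraction(1, 12),
}


def total_share(shares):
    """Return the exact sum of the proportions."""
    return sum(shares.values(), Fraction(0))


def as_percentages(shares):
    """Convert each exact fraction to a percentage rounded to one decimal."""
    return {locus: round(float(share) * 100, 1) for locus, share in shares.items()}


if __name__ == "__main__":
    print("Total share:", total_share(FAULT_SHARE_BY_LOCUS))
    for locus, pct in as_percentages(FAULT_SHARE_BY_LOCUS).items():
        print(f"{locus}: {pct}%")
```

Using exact fractions rather than floating-point values avoids rounding noise when checking that the three shares sum to one.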
For the behavioral testing alone, which more closely resembles customer usage, we see a different picture when we look at which locus of quality experienced the failure mode. Slightly less than half the defects found by the test team occurred during intrinsic tests, and only a little more than one-fifth of the defects surfaced during systemic tests. About three in ten defects were found as part of harmonic testing.
7. Considerations for Internet Appliance Software Quality
The subject of defect data brings us to the lessons learned about achieving quality for an Internet appliance. We present here some interesting and perhaps surprising discoveries about factors that support a seamless customer experience of Internet appliance quality:
- There is such a thing as “enough” CPU, memory, and local storage. In contrast to the typical PC, we found that limited-functionality applications both presented a simpler interface and responded with acceptable performance in tight quarters.
- There is, however, no such thing as “too much” bandwidth. Telephone lines, while ubiquitous and non-threatening to computer-averse customers, provide inconsistent speed and connection capabilities with analog modems, which makes data communication slow and fickle. Telephone line management features like call waiting, off-hook detection, and central office voice mail systems pose additional challenges.
- Getting the UI right is hard. Usability studies, Beta programs, frequent interface tweaks, and a few complete redesigns of some applications were needed to obtain a simple user interface. The hardware elements are critical also; the wrong keyboard or mouse can make the most intuitive workflow and screen presentation hard to use.
- Reliability and stability don’t happen by accident. An Internet appliance is hardly an "invisible computer" if the browser dies, the e-mail application goes dead, or updates of software or data