Test data has long been a challenge for testing; privacy legislation, identity theft, and the continued trend towards outsourcing have made it even worse. Just establishing and maintaining a comprehensive test environment can take half or more of all testing time and effort. In this week's column, Linda Hayes adds in the new and expanding privacy laws that inevitably limit your testing options. Yet from the quagmire of laws and company standards, better testing can emerge.
In the old days, production could provide a refresh from time to time for your test bed. Although this was not easy, it was a starting point. You still had to make sure you either had the storage to get a full copy or could extract a coherent subset, and of course you could not reuse data because it was a moving target. And even though production was arguably a good sample, there was no guarantee that all your potential test conditions existed. Finding the right conditions to satisfy a test was the proverbial needle in a haystack.
This was already hard, so what do you have to worry about now?
For starters, when you visit the doctor you have yet more paperwork to complete, authorizing (or not) the disclosure of your medical information, thanks to the Health Insurance Portability and Accountability Act of 1996 (HIPAA). The rules set by this act tightly restrict who can access patient data, as well as when and why. Since your medical provider has to have permission to disclose your data to other health professionals or family members, it can't be provided to perfect strangers who are testing the latest release of software for a health insurance company or hospital management firm.
As a tester, if your application touches this information in any way, you can't use unconditioned production data for testing. You can scrub names or social security numbers, which is not necessarily a new technique (although it's not always done). But now you have to worry about less obvious elements that can give the patient away. For example, a policy or claim number can tie back to an actual person through another system.
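To make the scrubbing idea concrete, here is a minimal sketch in Python, not a definitive implementation. It assumes records arrive as dictionaries and that the field names (`name`, `ssn`, `policy_number`, `claim_number`) and the secret key are hypothetical placeholders. Instead of blanking values, it replaces each identifier with a keyed hash, so the same claim number gets the same token in every extract and related records still line up, but the token cannot be looked up in another system.

```python
import hashlib
import hmac

# Hypothetical secret; it would need to be kept outside the test environment.
SECRET_KEY = b"replace-with-a-secret-kept-out-of-test"

# Hypothetical field names, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "ssn"}
INDIRECT_IDENTIFIERS = {"policy_number", "claim_number"}


def pseudonym(value: str, field: str) -> str:
    """Return a stable token for a value; the same input always maps to the same token."""
    digest = hmac.new(SECRET_KEY, f"{field}:{value}".encode("utf-8"), hashlib.sha256)
    return f"{field.upper()}-{digest.hexdigest()[:12]}"


def scrub(record: dict) -> dict:
    """Copy a record, replacing direct and indirect identifiers with tokens."""
    masked = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS or field in INDIRECT_IDENTIFIERS:
            masked[field] = pseudonym(str(value), field)
        else:
            masked[field] = value  # non-identifying data passes through unchanged
    return masked


# Two made-up records from two hypothetical systems that share a claim number.
claim = {"name": "Jane Doe", "ssn": "123-45-6789", "claim_number": "C-1001", "amount": 250.0}
billing = {"claim_number": "C-1001", "status": "paid"}

scrubbed_claim = scrub(claim)
scrubbed_billing = scrub(billing)
```

Because the tokens are deterministic, `scrubbed_claim` and `scrubbed_billing` still share a claim number, so tests that join the two continue to work. Deciding which fields count as indirect identifiers remains a judgment call for each application.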
That means any software that touches money or where it lives also has to be kept private. Again, not just your social security number, but other data that could be traced back to you even indirectly. Bank or credit card numbers are obvious, but what if an order number could be tied back to an invoice that could identify you?
All of this comes back in some way or another to your identity. Identity theft is a huge problem. Consumers and credit card companies are getting smarter and Web-exposed systems are becoming more secure, but all of the software that powers these relationships has to be tested, which means testers may have access to data that they otherwise would never be authorized to see.
I recall one project where we tested upgrades to a human resource system. We had to battle with the system administrator to give us supervisory password access to the test server so we could theoretically hire and fire employees, change their salaries, etc. The network support group resisted because the information was highly confidential, but we argued that the testing region was different from production. As a practical matter, we had to exercise every available function. As a compromise, we received regular refreshes from production, which we had to scrub.
Granted, legislation is catching up and law enforcement is waking up, so protection is becoming a priority...at least in some places. But what if testing