True Performance: Moving Beyond Basic Load Testing

Summary:
Basic load testing is valuable, but it's important to move past simplistic efforts. Here are some ways to gain more accurate metrics from your load tests.

Load testing takes a lot of time and effort to set up correctly. There are many factors to plan for and implement, and some of them can be quite subtle.

There’s a tremendous amount of value in getting even a rudimentary load testing scenario up and running quickly. Immediate feedback on the state of your system can pay off handsomely, but in many cases it’s extremely important to move quickly beyond that initial effort. Simplistic scenarios can present an incomplete (or worse, outright misleading) picture of how your system behaves under stress.

Taking the time and effort to move beyond basic load testing is a crucial step to ensure that stakeholders and customers receive accurate metrics in order to make correct decisions.

Baseline Data
One of the most critical, often overlooked aspects of getting an accurate picture of your system’s performance is the data you use when testing your system. Overlooking it is unfortunate, because the size, trend, and “shape” of the data have a tremendous impact on so many things across your entire application: UI rendering, processing in the business layer, and, of course, the data tier. (Note: “Shape of data” is a phrase I’ve come up with to describe things like trends and distribution of data. For example, a social networking platform may be extremely heavy in blog usage but have little wiki or media content.)
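One way to make the “shape” of a data set concrete is to measure how its records are distributed across content types. The following is a minimal sketch, not a prescribed method: the `content_mix` function and the `"type"` field on each record are hypothetical names chosen for illustration.

```python
from collections import Counter


def content_mix(records):
    """Return each content type's share of the data set as a fraction of 1.0.

    Assumes every record is a dict with a "type" key (a hypothetical schema).
    """
    counts = Counter(record["type"] for record in records)
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


if __name__ == "__main__":
    # A blog-heavy platform with little wiki or media content.
    sample = [{"type": "blog"}] * 8 + [{"type": "wiki"}, {"type": "media"}]
    print(content_mix(sample))
```

Comparing this mix between a production snapshot and a test data set gives a quick indication of whether the test data is skewed the same way real usage is.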

If you’re working with a business-intelligence or data-analytics system, it’s simply unrealistic to have only a clean or empty data set when checking your load. I’ll actually go a step further and say it’s downright irresponsible. You need your scenarios to validate against the realistic demands of processing months or years of data. Similarly, if you’re trying to profile an e-commerce site, you want a realistic set of products, reviews, customer records, etc., to comprise your working data set.

Getting ahold of or creating your data sets is an important task for which you’ve got to plan and dedicate time. Are you going to create your data, or are you going to use a real-world data set? Both options bring their own sets of challenges and constraints.

Real-World Data, Real-World Headaches
If you’re lucky, you may be able to get ahold of a real-world data set. There’s nothing better than using data that represents exactly how the system is used in the wild! In the past, I’ve reached out to customers with whom I’ve had great relationships. Using a “live” data set from a customer often means coming up with some scripts to sanitize the data. You want to ensure that you’re respecting and protecting the customers’ sensitive data, and sometimes this data may have potential legal liabilities attached to it.

If you’re sanitizing a real data set, you’ll need to ensure that you’re not changing the trend, size, or shape of the data. If you’re trying to eliminate potentially sensitive discussion threads from a company forum or mailing list, you’ll want to replace the discussions with text that’s similar in word count. The same goes for other types of data, like documents, media files, etc.
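As an illustration of replacing sensitive text without changing its size, here is a minimal sketch that swaps every ASCII letter and digit for a random one while leaving whitespace and punctuation alone, so word count, word lengths, and overall length stay the same. The function name `scramble_text` is hypothetical, and this is only a sketch of the idea: it is not a vetted anonymization technique, and a random character can occasionally match the original.

```python
import random
import re
import string


def scramble_text(text, seed=0):
    """Replace ASCII letters and digits with random ones of the same kind.

    Whitespace and punctuation are left untouched, so the word count and
    the character length of the text are preserved.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same output

    def swap(match):
        ch = match.group(0)
        if ch.isdigit():
            return rng.choice(string.digits)
        pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
        return rng.choice(pool)

    return re.sub(r"[A-Za-z0-9]", swap, text)


if __name__ == "__main__":
    original = "Q3 layoffs: 120 people\nin the Ohio office."
    print(scramble_text(original, seed=1))
```

Scrambled text will not compress or index like natural language, so if full-text search or storage compression is part of what you are measuring, replacing words with real filler words of similar length may be the better trade-off.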

You’ll also want to avoid changing the dates of data-creation events, because the effects can ripple across your system. For example, a trend-analysis routine might run blisteringly fast when every record shares one date but fall apart when pointed at data distributed across several years.
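A simple sanity check after sanitizing is to confirm that the creation dates are still spread out the way they were before. The sketch below compares per-month record counts between the two data sets; the names `monthly_histogram` and `same_date_shape` are hypothetical, and monthly buckets are an assumption that you may want to coarsen or refine.

```python
from collections import Counter
from datetime import date


def monthly_histogram(dates):
    """Count how many records were created in each (year, month) bucket."""
    return Counter((d.year, d.month) for d in dates)


def same_date_shape(original_dates, sanitized_dates):
    """Return True if both data sets have identical per-month record counts."""
    return monthly_histogram(original_dates) == monthly_histogram(sanitized_dates)


if __name__ == "__main__":
    original = [date(2010, 1, 5), date(2011, 6, 1), date(2011, 6, 20)]
    collapsed = [date(2012, 7, 6)] * 3  # every record stamped with one date
    print(same_date_shape(original, list(reversed(original))))  # order is ignored
    print(same_date_shape(original, collapsed))
```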

User Comments

Mihai Iuga

I agree but not all the way.

First: after reading this article, it is not clear to me whether the author sees a difference between Performance, Load and Stress testing.

Actually, looking back at his name, I am sure the author knows that difference very well, but the message in the article is confusing: the second paragraph talks about simplistic Load scenarios that do not give the correct picture of how the system is behaving under stress. Well, let's decide if we are talking about a Load test or a Stress test!

Another thing that I do not agree with is the general idea (in this article) that it is better to eliminate simple test scenarios.

I consider this approach not practical. And the article is misleading here.

From my seat this article seems to promote only a complex of concerted scenarios that is supposed to "imitate" life. Demiurgic task!

I consider this way of testing very useful indeed but not enough: I wanna know the maximum number of users that can be logged in at once. Or I wanna know the maximum number of products that can be in a shopping cart. Or I wanna know how many users can hit the login page in 1 second without errors. I can go on like that for minutes, but what I want to convey is: simple scenarios are important in defining performance. They give the tester the possibility to "divide and conquer" and the developer the possibility to evaluate improvements to specific functional tasks.

Your Complex of Scenarios gives none of these.

Think of a new model of car: how is it tested for Load? Is it driven hard for 1 hour on different roads, pavements, driving styles, weather conditions, traffic conditions, etc.? Of course that is done, but besides that there are a lot more tests on Load (performance eventually). I wrote "car" because the industry is very mature, competitive, dynamic and has a solid engineering foundation (which the Software Industry is missing).

In short: I consider the article useful because it emphasizes one side of the truth. I consider the article poisonous for the same reason.

Read it and be aware of its shortcomings.

July 6, 2012 - 11:22am

About the author

Jim Holmes

Jim Holmes is the test studio evangelist at Telerik. He has over twenty-five years in the IT field in a wide range of positions. Jim has worked in the US Air Force, Department of Defense, software consulting, and commercial software products. He’s a passionate advocate of test automation. He co-authored Windows Developer Power Tools and blogs at FrazzledDad.com. Jim is the president of the board of directors for the CodeMash conference held in the middle of winter at an indoor waterpark in Sandusky, Ohio. Email him at jim.holmes@telerik.com.

AgileConnection is one of the growing communities of the TechWell network.
