Back Up Your Hunches with Data
These questions will give you some clues as to the problem, but data will confirm your hunch (or point you in a different direction). I'm not talking about a measurement program here, but simple, "good enough" data gathered from observation or existing sources. Once you have data, look at it from several vantage points, or you may miss something important.
For example, at a large financial services company, operations analysts collected data on the number of minutes during which each application was unavailable during core business hours. They presented the data at a monthly management meeting using a pie chart. The chart clearly showed which application had the largest portion of "outage minutes" for the month. The pie chart made it easy to put the manager of the offending application on the hot seat at the management meeting. But it didn't help much with understanding the problem or knowing where to look for solutions.
By the time I arrived to help them solve their problems, they had many months of data, and looking at it differently provided useful information.
I used the questions above to guide how I viewed the data. I looked at the total number of outage minutes each month. I looked at the trends for each application. I looked at outages in time series. But the data didn't answer the question, "What is the impact on our organization?" I set out to answer that question, and found that not all outages were equal. A day-long outage in a specialized application only affected ten people. A five-minute outage in another system meant everyone in the department (several hundred people) couldn't access any applications, but it had only happened once. A short outage in an application used by one-third of the department that happened every day accounted for most of the outage minutes over time. Knowing this didn't solve the problem, but it told them where to focus their fixing.
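One way to look at the same outage records from the "impact" vantage point is to weight each outage by the number of people it stopped from working. The sketch below is only an illustration of that idea, not the analysis actually performed at the company. The record layout, the function names, and every number in the sample month are invented for the example (including the assumptions of an eight-hour business day and twenty business days in a month).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Outage:
    application: str      # hypothetical application name
    minutes: int          # minutes unavailable during core business hours
    people_affected: int  # people who could not work during the outage


def total_by_application(outages, weight):
    """Sum weight(outage) per application, largest total first."""
    totals = defaultdict(int)
    for outage in outages:
        totals[outage.application] += weight(outage)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def raw_minutes(outage):
    # The pie-chart view: every outage minute counts the same.
    return outage.minutes


def person_minutes(outage):
    # The impact view: a minute counts once per person who was blocked.
    return outage.minutes * outage.people_affected


# Invented sample month, NOT the company's data.
SAMPLE_OUTAGES = (
    # One day-long outage (assumed 8 hours) affecting ten people.
    [Outage("specialized_app", 480, 10)]
    # One five-minute outage affecting an assumed 300-person department.
    + [Outage("shared_system", 5, 300)]
    # An assumed 15-minute outage on each of 20 business days, 100 users.
    + [Outage("daily_app", 15, 100) for _ in range(20)]
)

if __name__ == "__main__":
    print(total_by_application(SAMPLE_OUTAGES, raw_minutes))
    print(total_by_application(SAMPLE_OUTAGES, person_minutes))
```

With these made-up figures, the raw-minutes view puts the specialized application on top, while the person-minutes view points at the short daily outage. The weighting itself is a judgment call; a different organization might weight by revenue or by missed deadlines instead.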
Quick-fixers may bemoan the amount of time I spent cleaning the data, looking at it this way and that way, asking more questions, and gathering more data. It took about a week. Of course, it took a bit more time to repeat this process at the application level. Still, taken in total, the time spent asking questions and gathering data was far less than the time that had passed while management tried to understand the situation with data presented in a way that hid useful information, and less than the outage minutes for all affected employees for one month.
Generate at Least Three Candidate Solutions
Now, it's time to look for possible solutions. Quick-fix thinking conditions us to choose the first idea that's even remotely plausible. Don't dismiss your first idea, but don't stop there either. Develop at least three candidate solutions. You may go back to your first idea, but by developing additional options, you'll understand the problem better. If you have difficulty thinking of more than one candidate solution, turn your thought process upside down and ask, "How could we make the situation worse?" That paradoxical question almost always jiggles more ideas loose.
As part of developing a solution, identify at least ten things that could go wrong with each candidate solution. Looking at the downsides might send you back to generating more options. Sometimes the best solution isn't the most elegant; it's the one with the fewest or least objectionable downsides.
Once you have several options to choose from, choose one and put it in motion. Chances are it won't work out quite as planned, and that's an opportunity for learning. Observe and gather data, re-adjust, and try again.
Every solution carries the seed of the next problem. That's a given. When you apply systematic thinking to the problem, it's less likely that the next problem will compound the problem you were trying to fix in the first place.