Translating code to blackbox experience and vice versa

Some people wonder how I was able to find so many critical bugs in the applications that I have tested.

1) The biggest thing is to know and understand the architecture of the application or feature that you are testing. Why? Because it’s important to understand how things technically work and to figure out where the weak points are. It’s very much like figuring out where the pipes run in your house so that you can tell where a leak is likely before it happens. You don’t want to just dig and break into places that have no pipes. It doesn’t make sense. Software testing is the same: it doesn’t make sense to create tests that do not fully test the feature.
It is easy to see where the pipes end, because of the faucets, toilets, drains, etc., but seeing how things flow through your house is more difficult without a blueprint. In the same manner, testing is about looking at the design to see how things are supposed to function and where they are supposed to work together. In companies that have design specifications, UI, and/or code, I take a long time to study and understand them. The longer I spend studying them, the bigger the payoff in the bugs that I will write. Talking to developers to understand where the design flaws might be can also help reduce the risk of introducing bugs in the first place.
In plumbing, where are things going to show weakness? Changes in water pressure due to heat differences, or flow? I’m not a plumber and have very little experience in fluid flow, so I can’t really answer those questions. But software QA raises the same kind of question: where in the design are things going to show weakness, and why? The biggest thing I can say is, if you understand code or know how to code, take a look at the code design, think about how you would program it, and then think about where you would mess up the most. This leads into the next point.

2) Critical weaknesses:
In most cases I’ve seen in my work experience, bugs occur because of unexpected situations, and more specifically where variables take on unexpected values. Crashes and hangs usually come down to memory issues or resource conflicts:
a) Out-of-memory issues
b) Null pointers
c) Bit shifting
d) Buffer overflow
e) Mutex locks
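
To make "variables take on unexpected values" concrete, here is a minimal sketch of how I might generate inputs around a field’s limit and record which ones blow up. Everything in it is an assumption for illustration: `store_username` is a hypothetical handler with a fixed 8-slot buffer, not code from any real product, and an uncaught exception stands in for a crash.

```python
def unexpected_inputs(max_len):
    """Build string inputs around a field's documented length limit."""
    return {
        "empty": "",
        "one_below_limit": "a" * (max_len - 1),
        "at_limit": "a" * max_len,
        "one_over_limit": "a" * (max_len + 1),
        "far_over_limit": "a" * (max_len * 100),
        "embedded_null": "abc\x00def",
        "non_ascii": "\u00e9" * max_len,
    }


def find_failures(field_handler, max_len):
    """Feed every unexpected input to the handler; collect what raised."""
    failures = {}
    for label, value in unexpected_inputs(max_len).items():
        try:
            field_handler(value)
        except Exception as exc:  # an uncaught error stands in for a crash
            failures[label] = type(exc).__name__
    return failures


def store_username(value, size=8):
    """Hypothetical handler: copies input into a fixed-size buffer."""
    buffer = [None] * size
    for i, ch in enumerate(value):
        buffer[i] = ch  # raises IndexError once i reaches size
    return buffer


print(find_failures(store_username, max_len=8))
```

Python raises an error where a C program might silently write past the buffer, so in a real blackbox test the signal is more likely a crash, a hang, or garbage showing up in a neighboring field.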

3) Real-world examples to help understand these issues better:
a) What does an out-of-memory issue look like? How about zombie compartments? Moreover, what about when a head pointer gets dereferenced before the rest of the pointers get dereferenced? How do we get there? This is where looking at the design helps. Looking at the design, is there any way you can create an unexpected chain of pointers and then somehow remove the head pointer? How about going to a website that does some funky Flash stuff and closing the window while the content is still being created?
b) Initialization of pointers may cause crashes. Where can you find initialization issues? How about a brand new profile? Either the end user’s profile at the OS level is new, or the browser’s profile is new. How about starting up the application for the first time after installing it?
Dereferencing at the wrong time can be bad as well. How about starting a video and then quitting the application, or closing the window? In some applications, quitting does not necessarily go through the same code path as closing a window first. Don’t assume that similar ways of doing things go through the same code.
c) Shifting bits the wrong way may end up corrupting the wrong portion of memory. How can you detect a corruption issue? Run an endurance test: run a feature over and over again and compare the first run (after rebooting the machine and starting up the app) with the next run, and the one after that. It takes patience to see this information.
d) Buffer overflow. Can you enter more characters than an input’s limit allows? How about fields that do not really specify a length? Does the input flow into the next variable?
e) Mutex lock: are there any files or processes that can somehow interact at the same time? I.e., is there one program that can access the same file as another program in the suite? When do those applications access the same files? How about processes? Especially when it comes to networking and multithreading, this can become challenging to figure out. Look at the design and try to think of ways that things might collide.
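
The endurance test from c) can be sketched roughly as follows: run the feature once for a baseline, then keep running it and note every run whose result differs. `FlakyFeature` is a made-up stand-in whose stored state gets damaged by a wrong shift on its fourth call. In a real test the "feature" would be whatever you can drive and measure from the outside.

```python
def endurance_check(feature, runs):
    """Run `feature` repeatedly; return the run numbers that differ from run 1."""
    baseline = feature()  # run 1, taken right after a fresh start
    drifted = []
    for run in range(2, runs + 1):
        if feature() != baseline:
            drifted.append(run)
    return drifted


class FlakyFeature:
    """Hypothetical feature whose state is corrupted late in its life."""

    def __init__(self):
        self.calls = 0
        self.data = [1, 2, 3, 4]

    def run(self):
        self.calls += 1
        if self.calls == 4:
            self.data[0] <<= 1  # simulated bug: a stray shift damages state
        return sum(self.data)


print(endurance_check(FlakyFeature().run, runs=6))
```

The first few runs look identical, which is why a single pass would miss this kind of bug.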

4) Putting two and two together:
a) If you do have a crash or a hang, take a look at the signature. Then take a look at the code. Given a crash stack, you have to trace the code in reverse. I recall when I was programming back in school and had a crash, it would tell me where in the program it crashed, but I didn’t have a crash stack. I had to start where it crashed and work backwards through the application to figure out why. It also helped that I knew the steps that led to the crash: having reproducible steps let me figure out which part of the code it was going through, and that got me to the crashing code faster.
From the examples I saw, I would then ask: is it possible to get a crash or a hang based on the issues in number 3 above?
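
As a small illustration of reading a crash stack in reverse, the sketch below uses Python’s standard `traceback` module to list the frames starting at the crash site and walking outward to the caller. The functions `load_profile` and `start_browser` are invented for the example. They mimic the brand new profile case from 3b, where nothing has been initialized yet.

```python
import traceback


def load_profile(path):
    # Hypothetical feature: a brand new profile has no settings yet.
    settings = None
    return settings["homepage"]  # crashes: None cannot be indexed


def start_browser(path):
    return load_profile(path)


def frames_from_crash(exc):
    """Return (function name, source line) pairs, crash site first."""
    frames = traceback.extract_tb(exc.__traceback__)
    return [(frame.name, frame.line) for frame in reversed(frames)]


try:
    start_browser("new-profile")
except TypeError as exc:
    for name, line in frames_from_crash(exc):
        print(f"{name}: {line}")
```

The first entry is where it crashed, and each following entry is one step closer to the action the user took, which is the order you want when matching the stack against your reproduction steps.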

5) Build your repertoire of knowledge. Every company has a bug database. Look to see if there are patterns within existing and fixed bugs, and map them to your product. How did the crash occur? Are there steps to reproduce? How was the crash fixed?
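
One simple way to look for those patterns, assuming you can export bugs from the database, is to tally them by signature or by trigger. The records and field names below are invented for illustration. A real bug database will have its own schema.

```python
from collections import Counter

# Hypothetical export of fixed crash bugs; the fields are illustrative only.
bugs = [
    {"id": 1, "signature": "null pointer", "trigger": "first run"},
    {"id": 2, "signature": "null pointer", "trigger": "new profile"},
    {"id": 3, "signature": "buffer overflow", "trigger": "long input"},
    {"id": 4, "signature": "null pointer", "trigger": "first run"},
    {"id": 5, "signature": "deadlock", "trigger": "quit during video"},
]


def top_patterns(records, field, n=3):
    """Count how often each value of `field` appears, most common first."""
    return Counter(record[field] for record in records).most_common(n)


print(top_patterns(bugs, "signature", n=1))
print(top_patterns(bugs, "trigger", n=1))
```

Whatever comes out on top is a good candidate for the first scenarios to try against a new feature.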

This is what’s sexy about doing QA. It’s like solving a puzzle, or being a detective tracking down evidence of problems. It takes time to solve the puzzle, but in the end it’s satisfying to resolve crashing issues so the end user doesn’t have to hit them when the product is finished.
