I still see many testers talking about the number of bugs found as some sort of barometer of success in terms of effective testing. But lately I’ve seen this framed around the “quality” of bugs found, rather than just their quantity. Still, you have to be a bit careful here. Let’s talk about this.
The common idea here is that testing is defined in relation to the bugs it finds.
I already talked about some of these ideas at length regarding a bug-hunting focus. I don’t want to repeat all that so I’m going to be quite a bit more concise here. I’m going to frame this entirely around how I think about and present these ideas with teams I work with.
What matters for me in testing is enabling people to make better decisions sooner by maintaining a tight feedback loop between when we make mistakes and when we find them. Those mistakes are areas where various qualities (behavior, performance, security, usability, and so on) are either not present or have degraded.
That isn’t measured by the number of bugs found but, arguably, the nature of the bugs certainly does come into play. I want to find the things that threaten value the most. That likely relates to the “quality” of bugs, meaning those that are more important, more value-threatening.
Key to this being effective is finding those bugs as quickly as possible, which means, ideally, as quickly as they are introduced. My point being that even the “quality” of a bug is not necessarily a single measure.
Given that focus on quality of bugs, does the quantity of bugs found not matter at all? Well, clearly it can matter, right?
In particular, the number of bugs found does matter if you think about things temporally. For example, let’s say there are ten critical bugs lurking in the system, and I find two of them really quickly. Well, that’s great! Two critical bugs found close to the time of introduction. But I don’t find the others before deployment. So that means eight critical bugs remain unfound. In that case, the number of bugs is a key measure; we might just not realize it yet.
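To make that temporal point concrete, here is a toy sketch using the hypothetical numbers from the example above. The bug identifiers and the split between found and escaped are invented purely for illustration; the only point is that the count of what escaped is still a real measure, even if we only learn it later.

```python
# Hypothetical critical bugs lurking in the system, keyed by whether
# some detection mechanism caught them before deployment. In this toy
# scenario, only the first two were caught in time.
critical_bugs = {f"BUG-{i}": (i <= 2) for i in range(1, 11)}

found_before_deploy = sum(1 for caught in critical_bugs.values() if caught)
escaped = len(critical_bugs) - found_before_deploy

print(f"Found before deployment: {found_before_deploy}")  # 2
print(f"Escaped to production:   {escaped}")              # 8
```

Finding two critical bugs quickly looks like a success in isolation; the escaped count only shows up once we account for the bugs that existed, not just the bugs we saw.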
So what this gets into is not the number of bugs or the quality of bugs necessarily but the mechanisms I have in place to recognize absence of qualities or the degradation of qualities. And there are a lot of possibilities here.
- Those mechanisms may be really good at finding the bugs — but take too long to do so.
- Or those mechanisms might be really bad at finding the bugs — but, hey, they finish quick at least.
- Or the mechanisms have certain “biases” built in such that we tend to notice certain quality issues at the expense of others.
- Or my mechanisms may be such that others don’t see the value and thus don’t utilize them, which means what we could have found is not found.
- Or my mechanisms may run well, may run quickly, may find lots of issues, but the feedback from them is such that people feel they can’t make decisions based on them.
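The tradeoffs in that list can be sketched as comparing detection mechanisms along more than one axis at once. This is a rough illustration only; the mechanism names, recall figures, and feedback times below are all hypothetical, not measurements from any real suite.

```python
from dataclasses import dataclass

@dataclass
class Mechanism:
    name: str
    recall: float          # rough fraction of real issues it tends to catch
    feedback_minutes: int  # how long until its results are available

# Hypothetical mechanisms embodying two of the tradeoffs above:
# good at finding bugs but slow, versus fast but shallow.
mechanisms = [
    Mechanism("thorough-but-slow suite", recall=0.9, feedback_minutes=480),
    Mechanism("fast-but-shallow checks", recall=0.3, feedback_minutes=5),
]

# Neither axis alone tells us which mechanism better supports decisions:
# high recall with late feedback can be as limiting as quick feedback
# that misses most of what matters.
for m in mechanisms:
    print(f"{m.name}: catches ~{m.recall:.0%} of issues, "
          f"feedback in {m.feedback_minutes} min")
```

And this sketch still leaves out the last two bullets entirely: a mechanism that nobody uses, or whose results nobody trusts, effectively has a recall of zero no matter what it could detect.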
Focusing on these aspects, as well as others, is important because it helps get around some of the traditional problems that crop up. Such as those situations where I run a bunch of tests and don’t find any bugs. That can be great … if there are no bugs. But it can be bad if there are bugs, but all those tests simply didn’t encounter them. Or those situations where I run a bunch of tests that are very good at finding relatively unimportant bugs but are missing more subtle bugs.
I, and my delivery team, have to make decisions based on the results of our tests. That means we have to trust what those tests are telling us. Thus we have to trust the detection methods that we have put in place around those tests.
Some of those detection methods will be inherent to the tests themselves: what they look for and how they look for it. Other detection methods will be inherent to the environment itself and the testability we put in place, which can help determine how likely or possible it is that tests will trigger a bug should one exist.
So in closing this brief post, I’ll say that I actually don’t want to define testing by any one thing. But if you are going to do that, define testing more by its detection abilities and not by the quantity, or even “quality”, of the things it detects.
I’ve found that relatively inexperienced (or just unthoughtful) testers can read that and see very little difference between the idea of the detection abilities and the results of those detection abilities (i.e., what is found by them). Specialist testers, however, should very much understand that distinction.
Being able to understand and articulate that distinction is why specialist testers can go beyond the unhelpful (or so I argue) distinction between “checking” and “testing” and instead frame the discussion around detection methods, both human and automated, that allow testing to help people make better decisions sooner.