I’ve already posted thoughts about “testing vs checking” and I made a very recent update to that post based on some comments by Michael Bolton. As part of that discussion, Michael said the following as a way of framing the distinction and argument:
“Checking is *inside* testing, distinct but not opposite, just as chewing is distinct from eating, but also a part of eating. The parallel is interesting to me, since people who think that chewing is the central issue in eating will tend to be very excited about knives and blenders and food processors (or Soylent!), but may be unconcerned about taste or nutrition or presentation.”
I had responded essentially as such: with that very example I would be curious how many people — whether they do think of “chewing as the central issue” — truly let their desire for knives/blenders overcome any thoughts about taste, nutrition, or presentation. And even then we could argue their focus is really on sustenance and thus really and truly on survival, regardless of all those other things. But, even granting all that, we don’t always eat just to survive or even because we technically require sustenance.
So what we often do is just call it what it is: eating. And we let context decide what we are doing or what particular thing matters more. And during that, I don’t know too many people who wrestle with whether or not “chewing is the central issue.” And those who do wrestle with that issue — well, yes, they very well may focus on knives and blenders while that is their focus. Which doesn’t mean they are focusing on that to the exclusion of everything else.
The contention seems to be that testers, when they focus on tools, are doing so to the exclusion of all or most other elements of testing. Are they? Some do, no doubt. We could tell those latter people: “Hey, you’re doing checking. And that’s valuable up to a point. But what we want you to focus on is testing.” If those people are the ones you’re trying to reach and if you think that’s truly the problem, then perhaps that distinction isn’t so bad.
The Purely Algorithmic
Michael Bolton also said this in response to me:
Our way is straightforward, I believe: algorithmic operation and observation of the product; algorithmic application of a decision rule; algorithmic reporting of the outcome of the application of the decision rule. The key idea here, obviously, is algorithm: a check is the part of a test that can be performed entirely algorithmically (by a human or by a machine).
Okay, I get all that. The check is basically just a part of a test; the part that can be performed entirely without a human but also by a human. But … I still don’t feel the need to introduce that term when, say, I’m running a test via automation or even when I’m running it as a human via a script. Yes, the tool is performing entirely algorithmically and, if following a script, so is the human. I have yet to find too many people who don’t actually know that. Just as people may treat “chewing” as the algorithmic part of eating, they don’t confuse “chewing” with “eating” in a way that has any practical or substantive difference.
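To make the three algorithmic steps in that description concrete, here is a minimal sketch in Python. Everything in it is an assumption of mine for illustration and nothing is taken from Bolton's own material: `run_check`, `CheckResult`, and the toy `add_to_cart_total` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    observed: object
    passed: bool
    report: str


def run_check(operate: Callable[[], object],
              decision_rule: Callable[[object], bool],
              name: str) -> CheckResult:
    """Run one check: operate/observe, apply a decision rule, report."""
    observed = operate()                      # 1. operate and observe the product
    passed = decision_rule(observed)          # 2. apply the decision rule
    verdict = "PASS" if passed else "FAIL"
    report = f"{name}: {verdict} (observed {observed!r})"  # 3. report the outcome
    return CheckResult(observed, passed, report)


# Hypothetical "product" code, present only so the sketch can run.
def add_to_cart_total(prices):
    return round(sum(prices), 2)


result = run_check(
    operate=lambda: add_to_cart_total([19.99, 5.01]),
    decision_rule=lambda total: total == 25.0,
    name="cart total",
)
print(result.report)
```

Note what the sketch leaves out: choosing which prices to try, deciding that `25.0` is the right expectation, and working out what a failure would mean. Those parts sit outside `run_check` and are where the human comes in, whatever name we give the algorithmic part.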
Continuing, Bolton said:
The value in this, it seems to me, is to emphasize the role of the human and the skills required for excellent design of both the test and the check within, and excellent interpretation and critical thinking about the outcome.
Yes, but I do — and have — emphasized the role of the human without introducing the word “check.” TDD has been emphasizing the role of the human as well and we still call it Test-Driven Development, even though “tests” and “checks”, to use the terminology, are interleaved in that approach.
So, really, it seems to come down to this: “a check is the part of a test that can be performed entirely algorithmically.” That’s it. If that’s truly all the “testing vs checking” people want to say — well, I guess I’m fine with that. So let’s say I say this:
“Parts of a test can be conducted entirely algorithmically, by which I mean without a human. I call that part a check.”
Great.
How have I done anything there to advance thinking about testing? Or let’s ask a different question. This testing / checking distinction has been floating around since 2009, promoted by some of our most vocal, intelligent, and well-known practitioners. What changes have we seen that show this distinction has mattered in ways that are substantive? Career-enhancing? Philosophically edifying? Whatever barometer you want to use.
Is that unfair to ask? Well, I feel confident asking that kind of question because 2009 to now is a long time in our industry. And developers have introduced terms and concepts that have caught fire (or flamed out and died) in much shorter time frames and have empirically and demonstrably changed entire ways of thinking.
What The Problem Really Is (or May Be)
Consider James Bach’s recent article Six Things That Go Wrong With Discussions About Testing, most of which I entirely agree with. He says this:
“A better way to speak about testing is the same way we speak about development: it’s something that people do, not tools.”
Okay, let’s go with that. He also says:
“Tools help, but tools do not do testing. There is no such thing as an automated test. The most a tool can do is operate a product according to a script and check for specific output according to a script. That would not be a test, but rather a fact check about the product. Tools can do fact checking very well. But testing is more than fact checking because testers must use technical judgment and ingenuity to create the checks and evaluate them and maintain and improve them.”
In my experience, it’s not that people (in aggregate) deny that testing, like development, is a human activity. They are simply able to hold in their mind that sometimes we speak about tools “doing” development or tools “doing” testing and not get confused that there is a human behind literally all of these activities, including most importantly those activities that don’t use a tool.
So people, in my experience, aren’t denying that testing is a human activity; rather, they are often denying that they need tester roles to do it. They feel developers can do it. And then, when there is tooling required, those developers can most likely build what’s needed whereas that’s not always guaranteed with testers.
Test pundits keep arguing as if tools were the issue and, in my experience, they often are not. It’s not a perception that testing is a purely mechanistic activity in all its aspects; rather, it’s the perception that a defined specialist tester role is not needed. That parallax is keeping our vocal pundits focused, I believe, on the wrong thing.
Make It Practical
The focus seems to be on how we shouldn’t conflate testing as human-level thinking with tool-level action. I get all that. But I rarely see testers having to worry about that when it comes to demonstrating their value.
What I do see — and I see this a lot — is testers who don’t know how to work with developers to put pressure on design. I see a lot of testers who struggle in microservice style architectures. I see testers struggle to become relevant in data science or machine learning contexts. I see testers struggling to fit in with a “DevOps context,” regardless of the use of tools. What I want to do is provide distinctions that help testers along these eminently practical routes. I want to provide distinctions that help foster better discussions with developers or business people about complex business domains. In other words, I’m looking at things in what I see as the most practical terms.
For me, the focus on “practical terms” means recognizing the parallax effect: while the problems people see seem to manifest during the use of tools, that’s not actually where the problem is.
The Testing Perception Problem, if we can call it such, is very real. But this does not mean that it leads to a Testing Can Be Replaced With Tools mindset. It can lead to a Testers Can Be Replaced With Tools mindset, which partly follows from a Developers Can (And Should) Be Doing Testing mindset.
Notice the focus shift that often happens in these discussions: we slide between talking about testing and talking about testers, and apply the same argument to each when that’s not always what we should be doing.
Any Better Ideas?
People who know me or read me know that I frame testing as such: testing is a design activity and testing is an execution activity. That execution component can be done by humans or tools. Okay, great. But why is that any better?
Well, one reason is because I’m not avoiding the use of the word “testing” or breaking it up when it doesn’t need to be. What I’m providing is nuance to testing and not asking someone to repurpose the term “checking”, which is a word that is often associated with “testing” as a synonym. “Hey, we should check that out in the lab” is often the same as “Hey, we should run a test of that in the lab.” Just as “Hey, did you check that?” often means “Hey, did you test that out?”
Also: those aren’t my terms. Testing as a design activity is something most, if not all, developers relate to because it’s how they are often taught test-driven development and the idea of tests putting pressure on code so that it is architected better. It is also used in microservice or distributed platform contexts to indicate putting pressure on the design of interfaces.
And testing as an execution activity really doesn’t need too much explanation. It is an execution activity. Tests can be executed. They can be executed by a human or by a machine. When that is done, it’s testing. But, of course, testing by a machine is absolutely going to lack elements of testing that could be done by a human. I don’t need “checking” to convey that thought. In fact, most people do get that, I think. They get that tools aren’t as smart as humans. They get that tools can’t think. The proliferation of automation — and the so-called “death” of manual testing — has little to do with people being confused about testing in that sense.
In most if not all of the discussions where I’ve found there to be confusion about what testing is, those discussions were furthered much more when I said testing was a design activity and an execution activity than if I had said there is testing and there is checking. The confusions about testing go much deeper than the use of tools, as I hope I indicated above.
Example With “Continuous” Testing
As an example of this, the talk about “continuous testing” routinely comes up in Agile and DevOps contexts. Now imagine if I just talked about a “testing vs checking” distinction in those contexts. It doesn’t promote any sort of discussion. So I don’t do that. Here’s what I do.
I tell people that a key part to realize is that if you believe testing is a design activity as well as an execution activity, and if you further believe most mistakes come in from design rather than implementation, then testing is continuous in a much wider sense. In fact, it never stops. Then it becomes about the level of abstraction that you are dealing with.
That’s often not talked about because people tend to frame testing only by the execution activity part and then ask where in the pipeline executing tests can be placed. And those are important. But also important is to understand that part of why Agile and DevOps can work is because testing is used to put pressure on design — before, during, and after implementation.
That, in fact, would make testing truly continuous, as opposed to continual. That’s not a semantic debate I necessarily advocate getting into (even though that is a better semantic debate than “testing vs checking”), but calling out that point can make it clear what we are truly doing.
This is one example that I’ve cherry-picked but I can apply the same logic to every conversation I have around testing, including those that have some focus on tools that support testing. I talk in terms of design activities and execution activities. The former is done only by humans, the latter is done by humans and by tools. But when we relegate that activity to tools, we are settling for a very reduced approximation of testing. If you want to call that “very reduced approximation of testing” by the name “checking” — well, maybe that’s helpful in some contexts.
Example By Comparison
I also like to consider other disciplines that have, as their basis, the concept of testing. Which is basically any discipline that acts in the context of science, by which I mean there is investigation, experimentation, and exploration.
When particle physicists use colliders for their experiments, they call that testing. Another thing they call testing is their very human-level analysis of the collision results or the decision of what to collide in the first place. When chemists use biochemistry analyzers, they call that testing. Another thing they call testing is their very human-level breakdown of the characteristics in biological samples that they want to discern. I could give other examples from literally every other discipline that uses testing as its basis: archaeology, paleontology, botany, geology, ecology, oceanography, meteorology, embryology, comparative anatomy, biogeography, and so forth.
It seems to me that only in the software technology world — and then only among testers — has there been this “need” to clarify testing or get into semantic debates about it. That should tell testers something. Sadly it doesn’t seem to. And that’s surprising because testing in the technology world is really the only discipline that often feels like it’s under assault and having to justify itself. And, at least at a glance, that trend has accelerated the more testers seem to retreat from what seems most relevant to the problems that companies are facing.
What seems to get left out is that maybe some of the ways testers promote their own discipline are a large part of the cause of this overall trend and its acceleration.
Our Language Matters
One thing I do agree with the “testing vs checking” pundits on is that words do matter. A discipline does require distinctions. But, along those lines, Liz Keogh published an article about acceptance criteria vs scenarios and, while the article as a whole resonated with me, a comment from Antony Marcano provided some good insight around the distinction:
“For a language to become ubiquitous, growing an understanding of it should be easy. In my experience, having to tear-down someone’s existing definitions (only a limited articulation of their mental model) in order to replace them with new ones adds friction to that process. Using alternative jargon-free terms that people can almost immediately relate to in a (mostly) common way increases the chances that we find our way to a common understanding, relatively quickly.”
I’ve found that “growing the understanding” of checking has not been easy. Thus it has not become ubiquitous. The term “checking” is not really jargon-free because many people, in many disciplines, do equate “check” with “test.”
As part of that same comment, Antony said:
“What I think is important is not how we define or re-define these terms but which terms or questions are most effective at yielding the class of answer we need.”
In my experience, debates about “testing vs checking” only matter to testers. And even then, testers don’t seem to be listening. They seem to be quoting and referencing the distinction; but, in the aggregate, it doesn’t seem to be changing behavior. Debates about “testing vs checking” matter not at all to most developers (in my experience) nor to most hiring managers who are ultimately making the decisions about whether they want or need testing done by specialist testers as part of their organization’s quest for value.
So … You’re Not Changing Your Mind?
This all may sound like I have zero desire to change my mind or my thinking. On the contrary, this is the second part of my opening salvo in confronting my own thinking.
This post was much longer than intended. For those who like, adhere to, and promote the “testing vs checking” distinction, I hope that’s seen as a sign that I do feel this is a problem worth investigating. I’m taking this seriously and not being outright dismissive of the ideas.
Being perfectly honest with my readers, I still feel like I’m being too reactionary to this distinction and I’m not quite sure why. I’m certain of my observations (which is not the same as saying that I’m certain I’m correct) and I’m certain of the experiences I’ve seen when testers have focused on this to the exclusion of dealing with the more deeply entrenched perceptions.
But, again, my goal is always to question my own thinking and my own conclusions. I want to be as adogmatic as possible, skeptical (but practically rather than ruthlessly so), and I don’t want to become fundamentalist in my style of thinking or communication. That said, until I start having very different experiences or someone can show me some empirical evidence contrary to what I observe, I’m sticking with my current stance and doing what I can in the industry to get people past this bit of parallax.
Nice article. I do agree with most of your points, but I would like to add a few words:
“Yes, the tool is performing entirely algorithmically and, if following a script, so is the human. I have yet to find too many people who don’t actually know that.” I had one experience where it actually made a difference: http://thebrokentest.com/test-vs-check/
Later I participated in a discussion which showed me that non-technical stakeholders who are quite new to the concept of testing tend to have this view. In that case, explaining testing vs checking is useful. But as far as I can tell, this is the only context where it is useful, and it is still a rare opportunity.
There is one thing I find somewhat ironic:
In my experience, followers of the Context-Driven school of testing, which in theory emphasises context, can somehow derail every discussion to make it about testing vs checking, no matter how unimportant it was to the original topic.
It is worth mentioning that this is my personal experience; I do believe that the majority of Context-Driven Testers don’t do that.
Thank you for the experience point you provide. I do agree that some people will treat testing as being purely algorithmic in many or all cases.
I guess I feel that (1) those people are in the minority and (2) I can better educate them on testing without reference to the word “check” or “checking.”
In your article you state: “There are places where that distinction is useful – but there are few and far between.” I suppose that might be a viewpoint I have to come to if I’m to not seem at total cross-purposes with those who do promote the testing vs checking viewpoint.
But to take your example, the person you met said “What they are hoping is that automated test will find them [bugs].”
That’s where I would have said something like this:
“Okay, let’s step back. First can we agree that there are two main points where bugs are introduced: when we talk about what to implement and then when we implement it?”
Hopefully the person agrees.
“Okay, so now let me put this thought in your head. Testing is an activity that is made of two parts: design and execution. When testing is a design activity, it puts pressure on design to find bugs before we implement them in code. Or, at the very least, quite quickly after we have implemented them in code. That’s a purely human process.”
Hopefully the person still agrees.
“Now, when testing is treated as an execution activity, that’s where the automation you are talking about comes in. Humans can, of course, execute tests. But so can tools. But you are asking about having automation find the issues.”
Here I would have to stop and clarify with the person: do they mean finding existing issues or finding new issues that were not previously known?
So here then we could talk about what automation can and cannot necessarily do. Can it find existing issues? Sure, if the proper test and data conditions are employed by a tool (just as they would be by a human being), then the automation could find those issues. Assuming, of course, there is an understanding of what the issue is. The tool has to be able to be told “Right” from “Wrong” just as a human would.
Could the tool find new issues? Well, that depends. Regression counts as a “new issue” in one sense. If, say, a form was working correctly but now started failing, automation could determine that. Could it find entirely new issues on its own? Well, trickier, right? I mean, you can employ techniques like inference engines, fuzz testing, and so on. But even then the automation is only going to find what it was programmed to find. Even if that programming was within a range of thresholds, it’s still limited to that whereas a human would not necessarily be.
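As a rough illustration of that limit, here is a small fuzz-style sketch in Python. It is only a sketch under my own assumptions: `format_price` is a hypothetical piece of product code with a deliberately planted bug, and `fuzz_check` is a hypothetical harness, not any particular fuzzing tool.

```python
import random


def format_price(cents: int) -> str:
    # Hypothetical product code with a deliberate bug: the cents part is not
    # zero-padded, so 5 cents renders as "$0.5" instead of "$0.05".
    return f"${cents // 100}.{cents % 100}"


def fuzz_check(runs: int = 1000, seed: int = 0) -> list:
    """Feed random inputs to format_price and apply a fixed decision rule."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    failures = []
    for _ in range(runs):
        cents = rng.randint(0, 100_000)
        text = format_price(cents)
        # The only thing this automation was told to look for:
        # a leading "$" and exactly one decimal point.
        if not (text.startswith("$") and text.count(".") == 1):
            failures.append(cents)
    return failures


failures = fuzz_check()
print(f"fuzz failures: {len(failures)}")
print(format_price(5))
```

The fuzz run reports no failures even though the padding bug is sitting right there, because the decision rule never asked about padding. A human glancing at `$0.5` would likely stop and ask questions; the automation only answers the question it was given.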
All of this is testing as an execution activity and we can look for quality problems. But even then there can be limits. I’m mainly describing functional issues here. But there are, for example, usability issues that we may not be able to automate for. There may be user experience issues that we can automate for (in terms of action) but cannot determine quality of (in terms of experience of workflow).
I might then finish off with: “Tools and humans can both do testing. But the use of a tool is necessarily an approximation of what a human can do, assuming they have good investigation skills. The reason for this is that the human can do something the tool cannot: the human can intuit, infer, feel emotion, and perhaps most importantly, use curiosity to drive further investigation.”
You’ll notice here I never brought up “checking.” What I did do is frame testing as a wider discipline: design and execution. And then within the latter — since that was the focus — I talked about what tools can do but also why the tool can’t replace a human and what, specifically, it could not replace.
It’s a longer discussion, admittedly, but I find people come away with a better view of testing being a specialized discipline that relies on tools in some cases but cannot be replaced entirely by them.
Hi Jeff, thanks for linking to my post.
In the interests of fair attribution, the quotes you’ve pulled out were actually made by Antony Marcano in the comments. Please could you reattribute?
Thanks,
Liz.
I appreciate you bringing the clarification to my attention. I have notes of excellent articles I’ve learned from (including the comments!) and here I conflated the two. My apologies on that; I have corrected this post accordingly.