Testing vs Checking – A Flawed Argument?

Written by Jeff Nyman 9 October 2017 13 Comments

Lately I’ve been seeing that the whole “testing” vs “checking” debate is now more used as a punchline than it is for any serious discussion around testing as an activity and tests as an artifact. Regardless of my perception, which may not be indicative, I believe that this distinction has not been very helpful. But let’s talk about it. Maybe someone will convince me I’m wrong.

Usually this semantic debate gets framed around whether “manual testers” are still needed or as a polemic against automation. A useless polemic at that, given that automation is a viable, leveragable strategy. Granted, most proponents of this distinction are not arguing against automation as a technique or as a form of test-supporting tooling. Yet that sometimes gets lost in their desire to engage in the semantic debate. And yet it is undeniably true that there is, in the industry, a trend towards some unreflective use of automation. You just have to look at how many companies interview to see this in action.

The Basis of My Stance

Perhaps the problem I have with how this argument is framed is that I come from a scientific background. I worked at Fermilab for a while during their time of finding the top quark. I have also published scientific papers based on experimentation and I have conducted experiments in artificial intelligence and machine learning contexts. I would argue I have conducted experiments in theological contexts as well, in terms of, for example, analyzing extant ancient manuscripts to better understand the context in which they were written. My only point here is that experimenters in these fields don’t feel the need to break up “testing” (experimenting) into “testing” and “checking.”

For example, when we were analyzing graphs for top quark interactions or running conduit tests on the pipes or figuring out what kind of experiment to run in the first place, we always said we were “testing.” We said that even when a particle accelerator was doing all the work of colliding particles together at very high speeds.

But we did have a framing mechanism for all this and it’s one I’ve certainly found useful in the software testing world.

My Personal Framing Attempt

I’ve found it much better to frame testing as two distinct activities: an execution activity and a design activity.

When that gets into automation discussions, in the latter case, you don’t automate the design activity. You can’t. It involves humans putting pressure on design. They may encode the results of that pressure as automation but it never starts out that way.

Further, when you treat testing as a design activity you can ask people this: “When do we make the most mistakes in software?” Answer: When we’re talking about what to build and then when we’re building it. That’s two levels of design abstraction — neither of which automation applies to at all.

This is how I’ve helped people see that human-based testing and tool-based testing do not have to fall victim to the Tyranny of the Or. There is a Genius of the And to be had there. (Heeding my own advice from many years ago.) It’s also how people can see that testing by humans will never go out of style. Automation is one particular technique we leverage within the context of testing as an activity.

Am I just saying the same thing a different way?

Now, someone could argue: “Well, ‘testing vs checking’ is a lot easier to say than ‘testing as a design activity vs testing as an execution activity.’” Perhaps it’s easier to say. But I’ve found it educates people less. Further, I’ve found it’s an argument many people aren’t as receptive to. I’ve found it’s more often a way to shut down discussion than it is to actually dig into the experimentalist aspects of testing.

This is why I question whether the argument, as framed, is a bit flawed. It has been in my experience. With the exception of some overseas companies — I write from the United States, where I primarily work — I often see a lot of eye-rolls or chuckles when a “that’s not testing, it’s checking” statement has come up. I’ve personally seen, but also heard about and from, testers who lost a lot of credibility with this argument.

Yet that credibility hit is a bit unfair. I still think there’s a good point there: there are different aspects to testing. “Checking” could be said to be one of those. It’s the putting it up as a thing distinct from testing that I find often doesn’t sit well with people. And that’s the case mainly because it’s not really illustrative of anything at all.

Consider, as just one example, this write-up of testing vs checking. Look partway down the page for the “Testing” / “Checking” table. Does that breakdown really help? I’ve never found it to.

Update to this Post (12/31/2017): Is that source I just quoted actually indicative? Well, some of the ideas it throws out there are those I have seen as the idea of “Testing vs Checking” has made its way from its core proponents to the wider community.

I will say that Michael Bolton in particular has specifically told me that this reference has “plagiarized and misrepresented” his work. As such, I want to make sure that’s known. But I also think it sort of reinforces at least part of my point. If what should be a simple distinction is so capable of being misrepresented, I’m not sure that argues against it being flawed. But in case people aren’t aware of the provenance of this idea, check out Bolton’s original Testing vs Checking and do note that it references an updated version as well. Also check out Testing and Checking Refined by James Bach.

Aren’t I painting with a broad brush?

It’s always easy to throw up straw-man arguments when you’re trying to make your own point. So, to be fair, one of the contentions is that by polishing up our terminology when talking about testing, we will make it clearer. That’s true; I agree with that. That, in fact, is why I make the distinction I made above.

But the corollary here apparently is that when many testers speak about testing, they oversimplify it and thus cause confusion about what testing is. Thus the “testing” and “checking” distinction. However, that “oversimplify” part is actually not what I tend to see.

If anything I’ve seen the opposite: testers who listen to people who own, run, or work for consulting companies who want to propagate their business and thus focus on semantics over and above getting things done. Said testers regurgitate what they hear from these companies, often to the detriment of their credibility. I feel somewhat comfortable making this admittedly charged statement because I ran a test consultancy as well as worked at some.

So that’s what I’ve seen hurting the industry. And this does tie in — perhaps peripherally; perhaps centrally — to my point about some testing fundamentalism.

Now, all that being said, it is absolutely possible — and useful — to sharpen our terminology and make distinctions such that we can provide nuance to our discipline. I just feel that “testing / checking” is the wrong one. I don’t think it aligns with the many other disciplines out there that use an empirical and scientific methodology for experimentation.

And that experimentation is always referred to as testing, pure and simple.

13 thoughts on “Testing vs Checking – A Flawed Argument?”

Kathy Stark says:

13 October 2017 at 9:17 am

I really like that you said this. I see a lot of reluctance to challenge some of the views that are spreading out there. I do think the way of saying “testing” compared to “checking” is a very easy way to describe things, which is probably why so many test managers seem to latch onto this. But just because it’s easier doesn’t mean it should be the way we describe it.

Bob Lane says:

13 October 2017 at 10:38 am

Just out of curiosity, how do you describe automation then? Do you say “automated testing?” Do you consider automation to be testing?

1. Jeff Nyman says:
  
  13 October 2017 at 11:45 am
  
  I consider automation to be a technique of testing. It is one of many such techniques that support testing such that we can get useful information out of something. A human can exercise an application, so can automation. Both can tell me something thus both support experimentation and investigation and discovery — all of which are core aspects of testing.
  
  Clearly, and hopefully obviously, a human can do much more than automation can. An automated tool cannot do investigation, exploration, and discovery that is adaptive to circumstance. Even if it’s programmed to do so in some way, it’s still only going to do so in the context of that specific programming. That’s something that isn’t true for a human. Humans will wonder and thus they will deviate. Humans will (hopefully!) harness emotion as an aspect of making value judgements, something that automation will not do.
  
  When I worked at Fermilab, which I mentioned in the article, we considered “testing” to be when we conceived of the experiments but we also considered it “testing” when the particle accelerator was doing all the work and carrying out our experiments. The latter was automation, the former was not. Both were testing.
John Wilson says:

16 October 2017 at 3:36 am

Testing provides information. Personally I don’t care about what tags/labels/names are given to what I do, but I do accept that when conveying information about what I’ve found it is sometimes easier to explain in terms of checking and testing to those with no/limited understanding of testing.

1. Jeff Nyman says:
  
  16 October 2017 at 4:47 am
  
  I agree: it can be easy. But I don’t think because it’s easy that’s the way we should do things. I would rather educate people on what testing is rather than vaguely adapt an English word (“checking”) and frame it as what testing is not.
  
  Since the distinction of “testing vs checking” has been introduced (roughly around 2009 or so), the trend in the industry has been an increased focus on SDETs and SETs, a further conflation of developer with technical tester, and a further and sustained focus on automation above humans. I’m not blaming the introduction of the distinction for that, of course. That said, if the distinction is helpful, its effects haven’t been seen even remotely in the wider industry. And I am concerned about the wider industry.
John Wilson says:

16 October 2017 at 6:08 am

I also would rather educate folk on what testing is, but sometimes a project sponsor has neither the time nor the motivation to be educated. In those circumstances sometimes the best resort is to “vaguely adapt an English word and frame what testing is not”. There is no ‘one size fits all’ solution.

Within the testing profession though I think we do need to be more careful with our terms, especially when we move outside our close circle of work colleagues, and how we use them. Do we avoid contentious terms or adopt an agreed ‘standard’? The problem then becomes which standard do we agree on? Herding cats might be easier than reaching consensus on use of terms.

1. Jeff Nyman says:
  
  16 October 2017 at 6:22 am
  
  Rather than even worry about standards — and I agree with you; that’s contentious — maybe we can just stick with describing what things actually are. Therefore our ontology becomes really simple to explain.
  
  Every scientific discipline that has, as its basis, some form of testing understands that testing is two things: a design activity and an execution activity. We design tests and then we execute the tests we designed. Those tests may be executed by a human or by a machine.
  
  Thus education becomes simpler, I would think. I don’t have to say something like “I know the words ‘checking’ and ‘testing’ can look entirely synonymous in regular English usage and I know just about no one else, including your developers, makes the distinction I’m about to tell you. But I want you to reframe the term ‘checking’ for this new usage I’m going to provide. And while I say ‘checking’ is mainly about the use of automation, do understand that humans can do ‘checking’ as well. But automation, which can only do ‘checking’, can’t do ‘testing’.” And so on.
  
  I realize I can sound like I’m caricaturing the position and I don’t intend that. But the above is not entirely inaccurate. Even if I go with the “checking is more about confirming and testing is more about exploring”, I’ve now got to deal with the fact that many understand ‘testing’ as a way of confirming and exploring and that, in their experience, ‘checking’ something can lead to further exploration.
Jeremias Rößler says:

16 October 2017 at 9:54 am

Honestly, I felt and thought the same for quite some time. It took me a while to understand where the desire to make that distinction comes from. I think it is not so much about the actual value of the distinction itself, but about how testers see themselves. I summarized my thoughts in my own blog post: Testing vs Checking: so what?

1. Jeff Nyman says:
  
  16 October 2017 at 10:36 am
  
  Right, I understand those thoughts. But if it is a way to sell and promote testing — and I feel it’s a flawed way; but let’s go with it for now — then I go back to what I said to a few others: this “selling” has been going on since 2009. And since that time the industry has, if anything, moved even more into conflating developers and testers. We have had even more of a focus on SDETs and SETs. We have had even more focus on automation-based testing to the exclusion of human-focused testing.
  
  And therein — in my last wording — is what I’ve found the problem to be. “Testing” vs “checking” has allowed people to insulate themselves from what’s really being talked about: automation-based testing and human-based testing. Humans can be “checkers” just as much as a machine can. So the distinction isn’t just about machines doing testing vs humans doing testing but rather about the types of thinking that occur. And the thinking that occurs — in ANY discipline that has testing at its core — is between design and execution on that design. This is analogous to what developers do as well: come up with a design and then implement the design. It’s what any experimentalist-based discipline does. It’s what a common vocabulary has been built up around, including in computer science.
  
  These are the contexts that testers operate within and yet many of them are choosing to adapt an English word for a purpose that they hope is illustrative. Is it illustrative? Has it helped the wider discipline? Well, consider that over the last decade or so we have seen developers “selling” and promoting concepts like TDD and BDD; not testers. We see developers “selling” and promoting the idea of putting pressure on design at different levels of abstraction; not testers. Many of the test-supporting tools we see out there are from developers; not testers.
  
  And then some testers wonder why an entire industry starts to question if those testers are really needed; after all, they don’t seem to be innovating. They do however seem to be coming up with distinctions like “testing” and “checking.” And so many places in my experience — personal, professional, and consultative — have found that to be a little lacking in terms of substance. And, in my opinion, this has furthered the rise of the test technocrat and furthered the conflation of testing with development. There are other factors that led to these things, of course. But I feel it doesn’t help when testers seem to entrench themselves on the sidelines having a semantic debate that most of the industry seems not to care about.
Andy Kelly says:

16 October 2017 at 11:53 pm

Some very valid points in the article so rather than countering I’ll highlight where I have found it of value.

Going back quite a bit I used to see an emphasis in testing to focus on the mission ‘verify/check against the requirements’ and it often relied on documented specifications, detailed test cases and scripts, high levels of micro management and subsequently automation of those scripts introducing efficiency and rapid feedback loops among other things.

Even then there was still a whole load of testing design involved but it tended to be done up front.

That mission is still very evident today so when many people talk about testing it’s often with a narrow single test mission in mind.

From a risk aspect that mission tended to overly focus on your known knowns, albeit over time with high levels of early collaboration unknown unknowns and known unknowns were naturally covered in deriving the known knowns.

These days though I generally work off a different testing mission with much more emphasis on risk, variations of this: ‘discover and investigate risks and learn things of value about the product’.

So in partly countering an older view of testing that focused in many ways on ‘checking’ known knowns, now we have a view of testing that not only encompasses this element but also shifts the focus of testing more towards the discovery of unknown unknowns and the investigation of known unknowns.

So my use likely differs to how many others are making the distinction, however I personally have found it useful when it comes to explaining testing to people, its relationship to risk and differing testing missions.

1. Jeff Nyman says:
  
  17 October 2017 at 6:42 am
  
  That mission is still very evident today so when many people talk about testing it’s often with a narrow single test mission in mind.
  
  I very much agree. Testing is narrowed too much. Hence my desire to broaden it (talk about what testing is) rather than narrow it further (talk about what it is not).
  
  These days though I generally work off a different testing mission with much more emphasis on risk,
  
  I don’t know that I place a lot of emphasis on risk, per se. A lot of thoughts from the book Antifragile have stuck with me and one of them is:
  
  It is far easier to figure out if something is fragile than to predict the occurrence of an event that may harm it. Fragility can be measured; risk is not measurable…
  
  While there’s nuance to this thought, of course, it has a nice parallel with how we do development: we try to become just defensive enough in our programming to remove certain sensitivities that lead to a fragile infrastructure. That fragility manifests as bugs on the one hand (external quality) but also aspects like “hard to maintain”, “hard to scale”, etc. And those are internal qualities. That also has parallels as we “shift left” even further and interleave testing with the specification of what the developers are building. So, again, from the book:
  
  Sensitivity to harm from volatility is tractable, more so than forecasting the event that would cause the harm.
  
  And all this dovetails (for me, at least) nicely into testing being a broad-angle, wide-lens activity. We test different aspects at various levels of abstraction, which I suppose you could count as your “differing test missions.” Meaning, each abstraction level is a “test mission.” More specifically, there are two times where we make the most mistakes: when we are talking about what to build and when we are building it. That fits in with the notion of “discovery of unknown unknowns and the investigation of known unknowns.” In each of those aspects of testing, I’m bringing different test techniques to bear that best suit the “mission” at that point (i.e., those that are most likely to discover gaps that will make us sensitive to problems in internal and/or external quality).
  
  None of this, of course, is to argue against your main point, which is that you have found “testing / checking” useful as a distinction. I’m hoping to provide people with other distinctions that I think are more accurate to what testing, as a discipline, is. That, of course, is an opinion statement and certainly not one of fact. And I appreciate you providing your own experiences of where you have seen the “Testing / Checking” distinction add value.
Claus Christophersen says:

18 October 2017 at 9:21 am

I like your angle of observation around the check vs test.

I still believe the “checking” distinction makes sense. At least as a step to understand what TESTING is. I very often come across managers AND testers who seriously believe that “testing” (checking!!) the requirements in the requirements document equals that the product is fully tested.

“Testing” is a meta word, where the “checking” part is pulled out. We could further resolve the “testing” into many more parts. I believe that most often, testing will contain a lot of checking.

So “checking” is just an integrated part of testing activities.

1. Jeff Nyman says:
  
  18 October 2017 at 9:38 am
  
  I see your point and I appreciate the counterpoint. I would argue in return that the fact that managers and testers believe in a limited view of testing is still not enough of a reason for me to introduce a common English term and then repurpose it to help explain what testing is by some form of contrast. I would rather just expand their definition of testing.
  
  …who seriously believe that “testing” (checking!!) the requirements in the requirements document equals that the product is fully tested.
  
  If that’s the case, a nomenclature issue is the least of their problems! But it is a good time to introduce that testing can be a design activity (looking at requirements for inconsistencies, ambiguities, contradictions, misunderstandings, gaps in knowledge) and an execution activity (carried out against an implementation). The distinction there is that a core value of testing, as an activity, is putting pressure on design. Sometimes that’s done with humans, other times that’s done with code, in the form of, say, TDD.
  
  “Testing” is a meta word…
  
  We could consider it that, sure, if we define ‘meta-word’ as “a word which has significance within a given context beyond its usual meaning.” Which is what a meta-word is when combined with linguistics and computer science. But, of course, in that case ‘checking’ itself is a reduction. So if testing is a meta-word it can’t be with “checking” pulled out.
  
  And if we do go by defining terms, I fall back to the fact that “check” is generally defined as “examine (something) in order to determine its accuracy, quality, or condition, or to detect the presence of something” and “test” is generally defined as “the means by which the presence, quality, or genuineness of anything is determined; a means of trial.” I don’t see enough daylight between those two definitions to appropriate the word “checking” for a specific purpose, when we could just frame testing as different parts or types of activities.
  
  In fact, you sort of said it yourself: “testing activities.” That can be scripted testing, unscripted (exploratory) testing, and so on. Personally, I would rather people think about what “scripted” and “unscripted” means rather than framing things as a dichotomy between “testing” and “checking.”
  
  And, again, I’ll just go with something I’ve been somewhat saying in a few places: if this distinction was helpful to the industry, having been around since 2009, I would expect to see more evidence of that. In fact, I see the exact opposite.
