Getting Lost in Test

I’ve already talked a bit about how testing is a discipline with a wide-angle lens. This means it’s very possible to get “lost in test.” Getting lost in this context means abandoning that wide-angle lens and abdicating responsibility for testing. So let’s talk about getting lost!

Michael Bolton recently asked a question on LinkedIn:

“I would like to survey how people (testers and non-testers alike) think and talk about the subject. So: what is “manual testing”? What is “automated testing”? How do you distinguish between the two?”

It’s a great question and can lead to a lot of interesting discussion. People who read me or who have talked to me know I have thoughts on the “manual testing” angst that testers often place themselves in, sometimes manifesting as manual testing denial.

There are actually two ways we get “lost in test” here, only one of which I’ll jump into a bit, but they are related.

One way we get lost in test is to sit there and fight a reality that does in fact exist (i.e., manual testing). The other is that we (might) fight a reality that doesn’t exist. For example, we assume that others have a mental block when it comes to understanding testing because of a single concept. In this case the concept is “automated testing” as a term, and the worry is that it denigrates testing just as much as the term “manual testing” does and thus means non-testers don’t get testing.

Yet there’s a polarity there and readers know I think testers have to shift between polarities.

Distinctions in Skill Set

On the issue of there being a distinction between “automated testing” and “manual testing”, I was led to a parallel in the physics community, which I used to be part of. I guess I still sort of am. This is a distinction that’s been operative pretty much since there has been “physics”, as opposed to “natural philosophy.”

What do you call a physicist? As in, someone who “does physics?”

Well, you call them a physicist, right?

But the discipline (as it evolved) and the industry (that it became) required a distinction: those who primarily work on experiments and/or building the tooling and those who primarily provide inputs into what the tooling should do but who also work on stuff that the tooling can’t do at all.

The former became experimental physicists. The latter became theoretical physicists. And, by and large, people get it. There’s not really the angst, for lack of a better term, that I see in the testing community over this. People understand that there are tool makers and tool users; those who propose experiments and those who conduct experiments. And they don’t spend a whole lot of time worrying about that “manual” and “automated” distinction.

But should they?

After all, theory informs experiment; experiment guides theory. So the boundaries can be fluid depending on the person. But, still, I would say the basic distinction is understood and science progresses.

What’s important to note is that both groups test (well, unless they’re into superstrings, I guess) since that’s the basis of experiment. But there’s still room for the distinction. I see the testing industry dealing with a similar situation. So it’s often interesting to me how testers seem to feel this need to wage a battle all the time about how automated testing isn’t really testing, and to argue that even saying the term “automated testing” slightly, if not entirely, denigrates the entire discipline.

And yet …

Yet there is a problem in the testing industry and it’s one that, justifiably to an extent, drives the angst.

We often turn testing into a programming problem, thus, as I argued, potentially forgetting how to test. This test-as-programming problem is certainly something that is reinforced by an interview process focused on a technocracy.

A physicist named Sabine Hossenfelder wrote a great book, Lost in Math: How Beauty Leads Physics Astray.

Very brief context: some areas of physics are in crisis mode. In science, theory and experiment have to move in tandem. Theory tells us what kind of experiments to try. Experiments provide data to refine our theories. When the two get too far out of sync, you end up with problems. That’s actually what Sabine’s book is about in the physical sciences.

Science, particularly physics, is getting lost in this concept of “beauty.” We have math, sometimes very beautiful math (outputs), but no real outcomes.

By that I mean we have theories right now that are untestable in any foreseeable future; perhaps in any possible future. And even if they are testable, the effort to test them is massively expensive and the tech stack needed to do it is prohibitive at best.

If I wrote my version of this book, it would be this:

Which might seem odd. How does testing lead quality astray? Isn’t testing how we determine quality?

Well, I would argue that no, actually, it’s not. But rather than go down the rabbit hole of what testing is (an entirely different subject!), let’s just consider what leads to the Lost in Test problem.

I already said that we turn testing into a programming problem. But what that means is very specific: more and more our tests are just another form of code. What this leads to is a technocracy. One aspect of this technocracy is that the story we want to tell gets hidden by the battles we have to wage with our technical stacks, or by the fights we have with our tooling just to get our tests to actually run consistently.

And certainly those tests, written as code, aren’t thinking as a human would. After all, they’re not thinking at all.

And therein lies the challenge. There is an entire industry out there with a focus on turning testing into programming, such that thought and nuance are replaced by the algorithm and the check.

Note to loyal readers: You might be thinking: “Wait a minute? A check? Did Jeff just say a check? But hasn’t he said he thinks that the checking distinction is a flawed argument? Didn’t he seem to reaffirm that to himself when he revisited the topic?”

Perhaps. But my thinking is evolving a little bit on this. And, to be fair, I did frame an extended game testing example as an exercise in testing and checking. And, in fact, a while back I was very much on the bandwagon of the checking distinction.

Now, to be sure, I do still think there is a bit of a false dichotomy that testers reinforce. But there’s a wider point here: I test my own thinking routinely. If I were a machine (say, an AI) I would essentially be an algorithm. I would not be testing my own thinking. I would not be exploring how I view my discipline and how I practice it. Like an AI, I would see the bits and pieces and not the whole picture (as Michael Bolton states in Bug of the Day: AI Sees Bits, Not Things).

Science Informs the Lost in Test Problem

In science, we consider how we should go about trying to understand the world. Or, at least, move closer and closer to the truth.

In testing, we do the same thing in the context of our applications and the quality we want to see achieved.

In both, we have to be willing to accept uncertainty and incomplete knowledge. We have to always be ready to update our beliefs as new evidence comes in.

In science, our best approach to describing the universe is not a single, unified story but rather an interconnected series of models that are appropriate at different levels of abstraction. Each model has a domain in which it is applicable.

In testing, the very same idea applies. We have models we create of our business, of the users of our applications and services, of our code, and so on. These all form various abstraction levels that can be served by different models.

Our task, in both, is to assemble an interlocking set of descriptions, a story. This story is based on some fundamental ideas that fit together to form a stable platform of belief. This belief is always provisional but it is a platform that we can establish confidence — not necessarily assurance — upon.

Let’s step in and think about this in terms of most of our day-to-day activities. Consider that we deliver short “sprints” of value over time and that doing so allows us to pivot our work to adapt to feedback and thus create better insights. Good storytelling is the bridge between what you learn and what you can tell others. Without that bridge, you won’t get the useful feedback you need to connect your team’s insights to real business value.

And that’s ultimately what the business wants: the release of value-adding features. That’s what provides companies with a competitive advantage and thus a viable enough revenue stream and that keeps people employed. Just like in science, we do have to reduce to some practicalities. And what I just said is about as practical as we get.

Yet notice that even as I led us to something that almost seems algorithmic (release value-adding feature, provide competitive advantage, generate viable enough revenue stream), we would actually have to employ a process that was very non-algorithmic in order to determine if we’re succeeding. Why? Because those terms “value-adding”, “competitive”, and “viable enough” can have both subjective and objective components.

Just like quality. Quality is a shifting perception of value over time and thus it necessarily has both subjective and objective components. Testing leads quality astray when testing is not able to account for that distinction and operate within it.

In short, tooling doesn’t help us tell our story.

Forgetting Our Story Creates a Lost in Test Problem

Yet, tooling is important, right? After all, it can help us scale our human efforts. It can allow us to free up humans from the mechanistic work so that those same humans can more appropriately focus on the exploratory work. So, yes, the tooling can support our efforts. But it cannot be allowed to dictate them. This is as true in testing as it is in the wider science community.

So let’s consider this as part of our collective story: “Revenue is king. Liability is queen.”

The idea there being that we serve both the king and the queen. So every decision we make, we need to say how it impacts both. Along with that, for any bit of tooling, process, or distinction we want to create, we ask this: does it make us more productive and less vulnerable and more understandable? Notice it’s an “and.” We want all of those things. But clearly there will be a balance.

Notice also that when framed as above it’s very clear that automation (of any sort) will be graded on a spectrum of productivity, vulnerability, and understandability. Forgetting that it is a spectrum — rather than an either-or — is why testers sometimes seem to be waging a battle against automation when, in fact, that’s not really the battle. Or at least not the way to frame it.

Aligning that story of revenue/liability with the practical realities we just talked about leads us right to what most of us have heard a thousand times as goals for everything we do as a business:

  • Protect revenue (“Don’t lose money!”)
  • Increase revenue (“Make more money!”)
  • Manage cost (“Keep our profits higher than our expenses!”)
  • Increase brand value (“Make us look better than everyone else!”)

So the story ultimately: “Help us make more money, help us spend less money, and help us protect the money we already have.”

All of that is very much speaking to quality. It’s the quality of what we do and produce. It’s about the things we do to make our lives easier and the things we do that make our lives harder. This is how we know if our platform of belief is truly something worth praising or if it’s a chimera that we use to fool ourselves.

And we often do use tests to fool ourselves, don’t we? “Hey look! The automation just ran and everything’s green. Deploy it!”
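To make that concrete, here is a minimal sketch of a green run that says very little. Everything in it is an assumption on my part for illustration: the format_price function and its single check are hypothetical and not taken from any real project.

```python
# Hypothetical example: format_price and its check are invented for illustration.

def format_price(amount: float) -> str:
    """Format a price for display (hypothetical function under test)."""
    return f"${amount:.2f}"


def check_format_price() -> bool:
    """An automated check: it confirms only the one case its author thought of."""
    return format_price(19.99) == "$19.99"


# The check is green...
green = check_format_price()

# ...yet inputs nobody encoded still produce output a human would question:
# a minus sign after the currency symbol, and no thousands separators.
negative = format_price(-5)
large = format_price(1234567.5)

print("check passed:", green)
print("negative amount renders as:", negative)
print("large amount renders as:", large)
```

The check passes, and nothing in that green result tells us whether either rendering is acceptable to the people who will actually read those prices. That judgment sits outside the algorithm.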

Broad Views Avoid Lost in Test Issues

Going back to what I said before about that platform of belief and confidence, just as in science, we want confidence in our results.

It’s important to realize, though, that gaining confidence may not be the goal, but it can be (and is) a goal.

To gain confidence, do we really want to turn over our confidence-making decisions to technology? Do we really want our confidence gates to be the operation of an algorithm combined with a check?

Testing is broad and how people use the discipline to reason about what they value is not an either-or. Scientists in all disciplines do use testing as a way to gain confidence. They also use testing as a way to show when confidence is not warranted. Testing can identify problems but it can also identify the lack of certain problems. Even more important than those aspects of confirmation or falsification, testing can help us implausify. Meaning, it can help us understand when something we are doing is becoming more and more harmful to quality and suggest to us that our arguments for what we are doing, and why it is good, are implausible.

Like any scientific endeavor, of course, we have to take results as provisional to greater and lesser degrees. Quality is a shifting perception of value over time, dependent on many factors. Confidence in value — which can manifest as perception of quality — is just as interesting for us in our industry as it is in other disciplines that use testing as a basis and care about the results of that testing.

But all of this only works when we don’t get lost in test.


This article was written by Jeff Nyman

Anything I put here is an approximation of the truth. You're getting a particular view of myself ... and it's the view I'm choosing to present to you. If you've never met me before in person, please realize I'm not the same in person as I am in writing. That's because I can only put part of myself down into words. If you have met me before in person then I'd ask you to consider that the view you've formed that way and the view you come to by reading what I say here may, in fact, both be true. I'd advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.
