The Use of Tradition and Dogma in Testing

It’s become tradition (with a bit of dogma) to point to triangles and quadrants to “explain” things about testing and development. A good case in point is presented in the article Agile Testing Automation. My goal is not to critique the article but rather to use it to highlight what I see as some of the problems. So let’s subject tradition to some rational inquiry and let’s subject dogma to a bit of scrutiny.

I’ll reproduce just one visual from that post here:

This was an arguably clever attempt to combine the two visuals so as to somehow make them more “understandable.” Fair enough.

The reason I harp on visuals like this is because they don’t necessarily guide action, though they are often presented as if they do. They appear to generate options when, in fact, they are really just a static categorization mechanism. Further, when I’ve seen them presented, the surrounding argumentation is usually weak. Maybe these visuals are good conversation openers? I don’t know. I personally avoid them. But having said that, I should probably explain why.

The Test Reality

Let’s talk about the reality of things for a bit and as I go through this I would ask you to see what aspects of this reality you feel are accurately reflected by the visuals. Since that puts an unfair burden on the visuals, I’ll also ask you to think how much of this reality is explained as part of the visuals.

The reality that lots of us run into is that ultimately tests can take a lot of effort to write and update. This is the case whether they are automated or manual but the problem is even worse with automated tests because there’s this inherent notion we have that the automation should be saving us effort, rather than adding to it.

And yet here comes reality, slapping us around. Specifically, we run into various situations where the tests fail even though the application works. Or the tests pass even though the application clearly doesn’t work. What people end up seeing is that a lot of apparent effort was devoted to this thing called “testing,” but without much benefit. And that effort-to-benefit ratio is important, right? Like most human activities, the effort of some action — writing and executing tests, in this case — must be worth the benefits it offers.

We’re always comparing the total amount of work accomplished to the overall effort we expended in doing the work. When the cost of doing that work is perceived (rightly or wrongly) to exceed its value, all that effort feels a bit wasted. This is then taken by some people — usually managers — as a sign that we could, and should, be doing more. Or, as many of us have seen, they say we should be doing something else entirely.

This accretion of untrustworthy tests or tests that provide too little (perceived or actual) benefit for the effort is exactly how testing — as an activity — starts being downplayed and how tests — as an artifact — start being dismissed.

And It Continues…

But wait! Reality isn’t quite done using us as a punching bag. Add on to what I just said that as the number of tests grows, they take longer and longer to run. Even with “fast” tests, it’s still true that the more of them there are, the longer they take. To make our tests run faster, we often devote some effort to “optimizing” their performance. Then we start to develop ways to run them more effectively, such as using multiple cores or various threading mechanisms. Then we find ourselves facing platform fragmentation and so we introduce more clever approaches, like running on a grid.

In the automated context, we can easily get to the point where our test code is vastly larger than our actual application code. That might not be so bad if it weren’t for the fact that we still have situations where bugs at the behavioral level creep in. And we end up with many important features that have few or even no tests at all. Maybe having no tests in given areas is due to the difficulty of testing some aspects of the actual application or service operation. For example, the tests might be too brittle from a GUI (“system”) perspective but too heavy from a unit perspective. So we don’t quite know where to write them.

And if we do some form of integration testing, we tend to mock quite a bit, which means we end up testing mocks more than we do anything actual. And, along the way, we utilize fixtures that end up not always being representative of the various data complexities that may exist in our applications. Further, these fixtures tend to lock us in to certain data and test conditions, precluding a bit of exploration.

And, by the way, that use of mocks and fixtures gets into a whole area of debate about how to write tests. Beyond the use of mocks, fixtures and whatever else, you get into the entire debate about abstraction levels and the appropriate framing elements for tests and you even start to get into debates, if such they can be called, about “assertions vs expectations.”

Well, that sucks

Okay, so now that I’ve rained on the entire parade, what does this mean?

Well, the problem is that I’ve stated nothing new here to anyone who has been in the industry for a while. But … and this is maybe, kind of (sorta) part of my point … these details get lost in all this talk of quadrants and triangles. So much so that I don’t even think those visuals are helpful. Further, they can actually be misleading.

The Complex Reality

And here we come to the self-serving and overly opinionated part of the post. Well, maybe. We’ll see. Let’s consider a few things that I consider to be realities. Specifically, consider that these quadrant and triangle concepts have been around a while and used to “explain” things but then consider the following points:

It’s only been in the last couple of years that testers are (finally!) coming to the realization that there is a distinction between “integration” and “integrated”, in terms of scopes of testing. Those triangles and quadrants did literally nothing to help clarify that issue and, I would argue, allowed it to be hidden for quite some time.

Yet it’s also been in the last couple of years that testers are making a flawed distinction between testing and checking, thereby putting them at odds with literally every other single discipline that has as its basis experimentation and exploration, and thus a systematized view of testing.

Testers still routinely talk about “negative tests” and “positive tests” and use this as some sort of indicator of how to express testing. I personally find that a harmful distinction but, regardless of where you come down on that, the triangles and quadrants do nothing to inform people of that quagmire.

Testers still routinely use the terminology “non-functional” or, at the very least, do not discourage its use in wider contexts. Again, whichever way you fall on that debate, our triangles and quadrants do nothing to help frame this issue.

Many testers are still blindly following an industry dogma that is leading to more “technocrat testers”. Test consulting services are a huge promoter of this. Triangles and quadrants have nothing to say on this and, in fact, are often used to bolster the technocrat focus.

Testers, particularly as they become technocrats, don’t practice building a lot of intuitions, nor do they focus on when intuitions fail. Triangles and quadrants have nothing to say about this.

Many testers still don’t practice the notion of “bumping the lamp”. Triangles and quadrants don’t indicate this necessity at all because it’s not something you can easily categorize.

Many in the testing field feel that if you just do enough work, quality becomes inherently objective. Yet while there are components of quality that are objective, there is an element of the subjective at all times. Triangles and quadrants don’t provide any insight into this.

Testers still fail to realize that sometimes you want to align your automation language with your production language but there are many times you want to do the opposite. Triangles and quadrants have little to say on this.

Testers, particularly (again) those technocrat ones, often still don’t think in terms of some form of credibility strategy. Triangles and quadrants won’t help with that.

Testers still don’t think like generalists with specialist tendencies. You can see the pattern, right? How does a triangle or quadrant help with that?

Yeah, but that’s all, like, just your opinion …

You would be forgiven at this point if you said, “Okay, Jeff is clearly a guy that has a vendetta against triangles and quadrants.” You could further argue — and you would be right — that these visuals are not meant to convey all these ideas. So why sit here and complain about it, like I’m doing?

The main reason is because I see these kinds of ideas and visuals passed around as some sort of tradition and dogma that often stops discussion rather than furthers it. People simply nod their head at the visuals, apparently understanding — or at least accepting — the supposed message they convey. If you’ll bear with me a bit, I promise I’ll speak to exactly what I mean in a few moments.

A Framework of Analysis

There’s often an assumption that we human beings tend to cluster around the same opinions. But, in reality, people more often tend to cluster around the same frameworks of analysis.

Those frameworks of thought are what provide a lot of the expectations, motivations, and then ultimately desires. So, in the United States, for example, this is one of the key reasons why we have such partisan style politics. It’s not that people are clustering around the opinions of, say, Republicans or Democrats. Rather, it’s that they buy into the same framework of analysis that their respective party of choice tends to cluster around. This then drives their expectations which in turn drives their motivations.

All of that is a long way of saying that people tend to assign the same importance to the same sets of circumstances and cut reality into the same categories. Certainly the act of categorizing is necessary for humans; it’s a large part of how we survived as a species. But it’s also (in our modern societies) what fragments us, usually along lines of race, gender, disability, sexual orientation, etc. And therein lies the problem: categorizing becomes pathological when the category is seen as definitive. This prevents people from seeing the fuzziness of various boundaries. And it certainly acts to prevent people from revising their categories.

Even if you don’t agree with me entirely, do you see how that relates to the categorization mechanisms of the triangle and the quadrant? But why does it matter? It matters because the act of categorizing always produces reduction in true complexity.

An interesting reinforcement of this idea came to me when I was reading the book RSpec Essentials where the author says:

Testability is not a binary quality. When looking at a given software system, we should ask, “How testable is this?”, rather than trying to categorize it as testable or not testable. This requires judgment and common sense. As our features and priorities evolve, so must our criteria for testability.

Bravo! I totally agree. And I maintain that quadrants and triangles do not hone the instincts for judgment and common sense, nor give ideas about how to evolve priorities.

I’m fine with the visuals being used as long as the instincts and judgments are in place. The visuals then simply provide a condensed means by which to remind ourselves of what we already know. That, however, is not the case in the industry, at least from what I’ve seen.

Operationalize the Triangle

Earlier I asked you to bear with me and here hopefully I’ll repay some of the effort of reading this post. So let’s take just one visual example: the triangle. Many of us are more than well aware that in the current industry, there needs to be more of a focus on the middle level of the pyramid, usually referred to as the “service level.” Yet you wouldn’t know that looking at it, right? The triangle seems to place a huge amount of emphasis on the bottom level, which is usually the “unit level.” We even enshrine phrases around this idea (like “push testing down the stack”).

The issue there is that all the unit tests in the world don’t necessarily tell you anything about the business value of the feature or the application. So showing a triangle with a huge unit component as the base is actually potentially very misleading. Because while, yes, the brunt of the tests can be at that level since they are quick feedback mechanisms, they are only putting pressure on one very limited area of design.

And that’s a huge oversimplification, and thus a reduction in complexity, because what matters is putting pressure on design at the appropriate levels of abstraction. So, to get a little operationally specific here, at the service level contract tests and collaboration tests are actually crucial. The triangle doesn’t indicate that and notice how the quadrant doesn’t show that either.

What this means is that rather than focusing on triangles and quadrants — effective as they may be for visualization, once understanding is in place — we need to be focusing on powerful operational questions. For example, here’s one that I think needs to be asked a lot more:

Is there an alternative for testing the interaction between components without resorting to larger-scale integrated testing?

Questions like that can then lead to experiments that are capable of being proven out in a given context. As an example of what I mean by that, consider this hypothesis:

Lower-level integration tests — consisting of collaboration tests (clients using test doubles in place of collaborating services) and contract tests (showing that service implementations correctly behave the way clients expect) — can provide the same level of confidence as higher-level integrated tests. Further, they can do so at a lower total cost of maintenance.

This is a hypothesis that can be tested in fail-fast, safe-to-fail experiments during iterations. What it can lead to is a distinct conclusion, assuming that the hypothesis proves itself out. And what is that conclusion? One possibility:

We should put most of our effort into consumer-driven contracts to provide focus points for conversations between teams.
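
To make the two kinds of test in that hypothesis a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Cart` client, the `InMemoryPriceService` implementation, and the "prices are integer cents, unknown SKUs raise `KeyError`" contract are all hypothetical names and rules, not something taken from the article or from any particular contract-testing tool. The only library used is the standard library's `unittest.mock`.

```python
from unittest.mock import Mock


class InMemoryPriceService:
    """Hypothetical service implementation, standing in for a real one."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_for(self, sku):
        # Assumed contract: integer cents for a known SKU, KeyError otherwise.
        return self._prices[sku]


class Cart:
    """Hypothetical client that collaborates with a price service."""

    def __init__(self, price_service):
        self._price_service = price_service
        self._skus = []

    def add(self, sku):
        self._skus.append(sku)

    def total(self):
        return sum(self._price_service.price_for(sku) for sku in self._skus)


def collaboration_test_cart_sums_prices():
    # Collaboration test: the client runs against a test double, not a service.
    double = Mock(spec=InMemoryPriceService)
    double.price_for.side_effect = lambda sku: {"apple": 50, "pear": 75}[sku]

    cart = Cart(double)
    cart.add("apple")
    cart.add("pear")

    assert cart.total() == 125
    # The client asked its collaborator exactly the questions we expected.
    asked = [call.args[0] for call in double.price_for.call_args_list]
    assert asked == ["apple", "pear"]


def contract_test_price_service(make_service):
    # Contract test: an implementation must behave the way the double assumed.
    service = make_service({"apple": 50})
    price = service.price_for("apple")
    assert isinstance(price, int) and price == 50
    try:
        service.price_for("missing")
    except KeyError:
        pass
    else:
        raise AssertionError("unknown SKU should raise KeyError")


collaboration_test_cart_sums_prices()
contract_test_price_service(InMemoryPriceService)
```

The point of the pairing is that the double in the collaboration test encodes assumptions about the service, and the contract test is where those same assumptions get checked against an implementation. Whether that pairing really buys the same confidence as an integrated test is exactly what the hypothesis asks you to find out in your own context.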

Understanding First, Then Tools, Then (Maybe) Visuals

In the post I referenced at the start of this article, the author says:

Let’s do some research into the problem we’re trying to solve and learn from others who have done work in this field. Allow yourself time to learn and experiment with tools to check their suitability.

I would agree with that. The trick is knowing the kind of strategy you want to employ well before you start thinking about tools. My concern with the quadrant / triangle approach is that it wouldn’t tell me much about my strategy. What I said above, regarding distinctions that lead to choices that generate options, is what I feel needs to be more of a focus.

If people use the triangles and quadrants in service to that focus, I’m all for it. The problem is I rarely see that being the case. What I more often see is people simply trotting out these diagrams as if they were in any way indicative. And they’re not. I’m not against the triangles and quadrants; I am, as per the title of this post, against how they are used.

Where to from here?

It’s easy to sit here and critique one post from one person or visuals that are in use in a variety of contexts. So let me close by saying I have no idea if I’ve done any better combatting the dogma and the traditions of our industry when it comes to testing.

I’ve been talking about modern testing a lot recently, but I know I have a lot of work to do in order to distill these ideas in a way that is more palatable. I don’t think the pure visual approaches are it and I don’t think my overly verbose ways are it either. So the trick is finding some sort of balance that helps us all talk about these ideas usefully and with enough operational specificity that we can deal with the challenges we are currently facing as an industry.
