Select Mode

Test Shapes

In my career I’ve found that the “shape” of testing tends to guide the level of abstraction that we put our emphasis at. But what does that mean? What “shape” am I talking about? Well, let’s dig in.

For the longest time, the so-called “test triangle” or “test pyramid” was considered gospel.

The test triangle has, among test specialists, long been recognized as an outdated concept. But why has the pyramid gone out of favor? Useful context for all of this stems from the discussion around “Integrated Tests Are A Scam”, which is a well-known discussion point among the intersections of testers and developers. Without going into all the theory and discussion, consider this:

  • Integration is about the basic correctness of the system (feedback about code design).
  • Integrated is about the value of the system (feedback about business design).

This leads to an interesting dynamic.

  • Test simplicity is best focused on in unit tests.
  • Production faithfulness is best sought with integrated tests.

That interesting dynamic suggests the following:

  • You should try to increase production faithfulness in unit tests.
  • You should try to increase test simplicity in integrated tests.

This leads us to yet another interesting dynamic that gets us beyond the traditional testing pyramid and gets us more into a diamond-style shape:

You have to imagine that diamond being able to morph and flex a bit, where the middle layer can expand or contract based on need. What the testing industry is slowing coming to realize is that we can utilize micro-test (unit) principles but scale them for macro-test (integrated) concerns.

The middle of the diamond — along with its ability to flex — is a way we can calibrate our scale. Notice too how this also broadens a bit the nature of end-to-end testing. That testing is crucial because it’s essentially the full realization of all parts integrated from a user’s point of view. The end-to-end is ultimately the only type of testing that is going to empirically and demonstrably show you what the user is going to experience.

Incidentally the test diamond is really just the test trophy, talked about most concisely in “Write tests. Not too many. Mostly integration.”

The “static” part of the has to do with making sure you use static elements of your language of choice in order to leverage type checking, as just one example. With a language like Java or C# this is easy and built-in. With a language like JavaScript — and assuming you’re not using TypeScript — you’re probably going to be relying more on linting tools. In the context of an ecosystem like Python, you might use something like mypy along with function signatures.

The static part often gets relegated solely to the developers … and that can be fine, as long as testers realize that there is an aspect of quality here as well, particularly in terms of testability.

Types (especially static) can be thought of as a tool for producing a partial proof that a given program is correct. At the very least they can help show us that the program isn’t broken in some extremely obvious (and easily detectable) way. As such, type systems express meaning and provide evidence of correctness. This can allow you to make the code easier to understand, putting on emphasis on the meaning of the code, rather than its mechanisms. Scaled up this means you can focus on business intent rather than technology implementation.

But what is the shape telling us?

It’s telling us where we spend most of our time with testing.

Each run of your test suite can be thought of as an experiment that you’ve designed in order to validate (or refute) a hypothesis about how the code you are dealing with behaves. So where should most of your experiments be situated? And why? The common answer in the industry is at the integration level, often for very good reasons.

One goal is to come up with tests that guide design, surface regressions and build confidence, all while minimizing the time lost to writing, running, and fixing those tests in the first place. What tests are often the best at guiding design? And at what level of design are we talking about? The design that a unit test will help us think about is very different than the design an end-to-end test will be helping us think about.

So all of what I talked about here regarding the “shape” can be thought of as a population of experiments that are designed to execute against code. As in any science, you want to have the broadest possible range of experiments that will tend to give you answers you trust in the quickest amount of time possible. This is all about dealing with a cost of mistake curve.

So code is verified by tests. But tests are not verified in the same sense; code-based tests certainly are not. So you have to make sure that tests continue to be valuable over time. This means you keep them honest, which means they provide value and minimize cost. As such, the construction of tests (as artifact) is not all about prevention of regressions or the verification of logic. Tests — again, as artifact — are about providing the best set of experiments possible. But the “best set of experiments” will differ from time to time; hence you need the portion of the “shape” where you spend the most time testing to be the most flexible.

Moving on from tests (as artifact), this takes us to testing (as process). Testing adds value in many ways but one of the key ways is if that testing encourages a focus on design that makes code change easier. (This somewhat goes to what I talked about in Metrics That Matter if you think around the idea of metrics whether than get trapped by the idea of them.)

So let’s break that down again: testing adds value when it encourages a focus on design. A focus on design encourages us to make code change easier. Making code change easier means dealing with code at the appropriate level of abstraction.

That’s where it’s important for us to have a “shape” to our testing that allows us to be the most flexible. This is another way of framing what I’ve often said in a variety of contexts: testing is not just an execution activity; it is also a design activity, which I believe is the basis of how a modern (as opposed to traditional) approach to testing focuses on design pressure.

Share

This article was written by Jeff Nyman

Anything I put here is an approximation of the truth. You're getting a particular view of myself ... and it's the view I'm choosing to present to you. If you've never met me before in person, please realize I'm not the same in person as I am in writing. That's because I can only put part of myself down into words. If you have met me before in person then I'd ask you to consider that the view you've formed that way and the view you come to by reading what I say here may, in fact, both be true. I'd advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.