One of the many debates testers have is whether it makes sense to write tests down. Sometimes this is framed, simplistically, as just writing down “test cases” and, even more simplistically, as a bit of orthodoxy: you don’t write tests, you perform tests. So let’s dig into this idea a little bit because I think this seemingly simple starting point for discussion leads into some interesting ideas about what the title of this post indicates.
I should note at the outset that much of what I talk about here is a work in progress in my head. The way I handle work in progress items like this is to put them out there, often incomplete and half-baked, and see what happens.
I’m sure many of you have seen something that shows this bit of contrived dialogue:
- We don’t write tests.
- Because we don’t have time for it.
- Because there is too much work and pressure.
- Because we don’t move fast enough.
- Because changing software has become difficult and risky.
- Because we don’t write tests.
This is usually framed as the idea that we don’t correctly consider testing enough of an investment. But that calls into question whether testing should be considered an investment in the first place.
First, on the topic of writing down tests, it’s certainly worth noting how the idea of written tests aligns well with written experiment steps in scientific disciplines. After all, scientists very much do write down their experimental ideas, including the procedures for how to conduct certain experiments. If we treat “tests” as “experiments” then any blanket dismissal around writing them is, a priori, at the very least a little suspect. But let’s dig a little deeper here because, on the face of it, the idea of not writing tests might make sense from some economic viewpoints.
The Economics of Writing Tests
In most contexts, written tests are never delivered. In manufacturing and economic terms, that means they could be considered eternal inventory. (The book Hands-On Inventory Management by Ed Mercado will tell you all you want to know about inventory, should you be curious.) In this view, written tests could be considered an investment but one that only has indirect returns, such as by surfacing risks. And, of course, that could be done without a written test.
On that point, some would argue that an investment with no direct return like this is not really an investment at all. It would be better characterized solely as an expense.
So what happens in this context?
Well, again, in manufacturing and/or economic terms, it could be said that we incur overprocessing waste. This term refers to anything that requires some amount of labor that doesn’t provide a direct value to any putative customer or user. I hasten to add that this isn’t to say testing doesn’t provide value. But rather that the testing can be done without having written tests.
But why specifically can this argument be made? Generally, it’s because the extra resources required to write the tests have to be considered in relation to doing something else. After all, time spent on those written tests was time not spent doing something else. Ah, but what? What else would have been done?
Well, as one answer, that would be writing the code, right? Yes, I know the testers don’t generally write the code but bear with me here.
Consider that the only thing we actually deliver is the code. Whatever working application we deliver, that application is essentially the code we wrote. Thus, purely from a code perspective, we incur inventory waste when we invest capital — potential coding time — in a product that has not yet derived value. “Inventory waste” is basically inventory that is waiting to be used.
That makes sense for the code since if we deliver it, it will be used. Yet since the tests will never derive value — since we don’t deliver them — they are inventory waste and possibly eternal inventory. And we incur overprocessing waste by spending the extra attention required to write the tests (which we never deliver) as compared to the actual production code (which we do deliver).
So you see what we’re circling around here, right? The idea of performing tests versus the idea of writing them. That’s the orthodoxy I spoke of earlier and here, it seems, that orthodoxy makes a bit of sense.
But let’s keep digging.
The Problem of Being Correct Initially
Clearly writing tests is additional to getting the code correct from the start, right? If we could somehow guarantee getting the code correct from the start, you could argue we wouldn’t need the written tests. But, then again, if we somehow had that kind of guarantee, performing tests wouldn’t matter much either.
The trick is that all testing is in response to imperfection.
We can’t get the code “correct” the first time. And even if we could, we wouldn’t actually know we did until that code was delivered to users for feedback. And even then we have to allow for the idea that notions of quality can be not just objective, but subjective.
Here’s a bit from the book Quality Code:
Lean manufacturing operates under a principle of building the quality into a product. Rework is a form of waste to be eliminated from the system. Writing software so you can rewrite it or patch it once the bugs are found is rework. Testing software that should be correct can be seen as wasteful. Given that we have not figured out how to create software without defects, some degree of post-coding testing is required.
So that makes sense, right? Until we’ve achieved some means of getting it right the first time — and I’m not holding my breath waiting for that — testing, as activity, is likely necessary. Also from the same book:
However, testing it more than once is clearly inefficient, which is what happens when a defect is found, fixed, and retested. Another form of waste is inventory, which can be seen as opportunity cost. Time spent fixing bugs is time during which the correctly written parts of the software have not been delivered and are therefore not delivering customer or business value. That time is also time in which a developer could have been working on other valuable activities, including professional development.
So this really amounts to an argument about people not being able to get things right in the first place and thus there is always some amount of testing we have to plan on doing. Whether or not that includes written tests is still potentially open for the economic part of the debate.
But let’s not lose sight of the perception issue.
Specifically, the perception is that the opportunity cost of working on tests (such as writing them, storing them, and so forth) is higher than that of working on new features or bug fixes, even with the understanding that “bug fixes” are rework. Per the argument above, we incur costs of eternal inventory and overprocessing waste in the context of tests (as artifact). Yet the same is not, arguably, to be said of testing (as activity). So there’s a crucial distinction here between artifact (tests) and activity (testing).
Hence we’re back to the idea of investment and expense.
Investment or Expense?
So is it the case that writing tests is an expense and testing is an investment? Or are both simply expenses? And is an expense, in either case, necessarily a bad thing?
Certainly an expense can be worth it, right? If I purchase insurance, that’s an expense. But (usually) it’s one I consider worthwhile. Why is that? Well, if a given expense provides utility, it can be worth it. In the context of software and application testing, that utility might be about enabling something that you can do (find bugs earlier), reducing the overall cost of what you already do (introducing bugs into software), or mitigating the impact of some problem that might hit you (users finding your bugs for you).
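To make the insurance analogy a little more concrete, here is a minimal sketch of how an expense can carry utility. Everything in it is an assumption for illustration: the function name `net_utility`, its parameters, and the numbers (expressed in hours) are made up and not drawn from any real project.

```python
def net_utility(expense, probability_of_problem,
                cost_if_unmitigated, cost_if_mitigated):
    """Expected saving from paying an expense up front (hypothetical model).

    Compares the expected cost of doing nothing with the expected cost of
    paying the expense and facing a smaller impact if the problem occurs.
    A positive result suggests the expense provides utility.
    """
    expected_without = probability_of_problem * cost_if_unmitigated
    expected_with = expense + probability_of_problem * cost_if_mitigated
    return expected_without - expected_with


# Made-up figures: 8 hours spent writing tests, a 25% chance of a serious
# bug, 40 hours of fallout if users find it, 4 hours if we find it first.
saving = net_utility(expense=8, probability_of_problem=0.25,
                     cost_if_unmitigated=40, cost_if_mitigated=4)
print(saving)  # 1.0 (expected hours saved under these assumed numbers)
```

Change the assumed probability or costs and the sign flips easily, which is rather the point: whether the expense is worth it depends on judgment about those inputs.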
Again, it’s a question of economics. Consider the following from Stephen Vance’s aside “Tests are Wasteful” in the book Developer Testing: Building Quality Into Software:
If we could achieve the results without software altogether at the same levels of speed and convenience, our entire discipline would be irrelevant.
Sure … but this doesn’t necessarily say much. You can also say that if we could generate the software we wanted for customers without having to do programming, then the entire discipline of developers would be academic, at best. And thus this style of argument can be used for anything. “Hey, if we just developed in production, we would have no need of deployment pipelines.” And, as we’ve been talking about here, “Hey, if we could just write all this code correctly in the first place, we would have no need of tests.”
Arguments from Economics
My main point is that these are all arguments around economics. Economics provides a framework for understanding the trade-offs underlying any situation where choices are available to us. Economists tend to see everything through a lens of competing elements and that’s often focused around prices or costs.
When people make arguments against written tests, they are making (in part) an economic argument. But so are those who are making a case for written tests.
When framed this way, people can have fruitful discussions about what is and isn’t economical, backed up by judgment. Judgment, in this context, is all about assigning value. Economists see judgment as abilities related to determining some payoff or reward or profit. Or, to use the term I used earlier, a utility. So, as an example, someone could say: “Yes, written tests offer no direct value to customers BUT they do allow us to record those tests that we want to remember to execute every time. They provide a historical record of how we conceptualized testing that anyone can reference and look at. Those tests can also help with some onboarding and training.”
What this shows is that perhaps “written tests” is too broad of a term to discuss unless we contextualize it. What needs to be focused on are the tradeoffs. One way to frame that in a “fun” way, I guess, is something I talked about in the Test Snap. But in the context of what I just said above, the utility of writing the tests, offsetting the expense of creating them, is that they shorten the immediate familiarization process for people being onboarded or for customers being trained. Or you could frame this as artifacts that exist in a case where auditing is operative.
The Idea of Value
Part of what the above intersects with is the idea of the “value” of testing. That sometimes gets conflated with the “value” of writing down tests or creating test cases. The issue here is that such a conflation risks ignoring that testing is an activity; test cases are an artifact.
By themselves, test cases provide no value to the product that we actually deliver. We don’t deliver the test cases to users. Internally, you could argue that the test cases provide value in that they encode how we perform assessments of risk. But even then, the assessments of risk are static if we just go by the test cases as they are written. There is more to the activity of testing than what we can ever hope to write down in tests.
Longer term, it could be argued, tests can add value if they encourage a design style that makes code easier to change and makes unwanted changes in behavior quicker to detect. This can be framed around an operational question: “Has a change in code triggered a change in behavior?” One of the quickest ways to determine that is to run some tests. But which tests? And to what extent? All of them? Some of them? And, if some, do we know which “some” to include and which we can leave out? And does everyone have that same understanding? How do we know?
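As a rough illustration of that operational question, here is a minimal sketch of written tests acting as a record of behavior. The functions `shipping_cost` and `shipping_cost_v2`, the recorded values, and the helper `behavior_changes` are all hypothetical names invented for this example.

```python
def shipping_cost(weight_kg):
    """Hypothetical production code, before a change."""
    return 5.0 + 2.0 * weight_kg


def shipping_cost_v2(weight_kg):
    """The same code after a hypothetical change to the base fee."""
    return 6.0 + 2.0 * weight_kg


# The "written tests": inputs paired with the behavior we chose to record.
RECORDED_BEHAVIOR = {0.0: 5.0, 1.0: 7.0, 2.5: 10.0}


def behavior_changes(func, recorded):
    """Return {input: (recorded, actual)} wherever behavior differs."""
    changes = {}
    for value, expected in recorded.items():
        actual = func(value)
        if actual != expected:
            changes[value] = (expected, actual)
    return changes


print(behavior_changes(shipping_cost, RECORDED_BEHAVIOR))     # {}
print(behavior_changes(shipping_cost_v2, RECORDED_BEHAVIOR))  # three changes
```

Note what the sketch cannot tell us: whether the change in behavior was wanted, and whether these three recorded inputs were the right “some” to keep.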
Testing as a Service
Admittedly a bit obliquely, this gets into the idea that testing needs to be a service within an organization. And that service can certainly provide value. But “testing,” by itself, does not necessarily provide value to the product. And written tests likewise don’t necessarily provide value to the product. What does provide value is the way that testing, acting as part of a service, allows us to surface risks and provide information so people can make decisions with a series of options that are workable within the constraints they’re dealing with.
Testing, as an activity, allows us to communicate and collaborate around demonstrable and empirical observations that can help people explore options, recognize various dilemmas, understand various tradeoffs, and consider people whose values we may be neglecting. Sometimes those people are ourselves! This gets into the idea of treating various types of qualities, both internal and external.
This is distinct from just relying on written tests or artifacts. Those are one thing that a testing service can provide, for sure, but exploration has a huge part to play in this, as it does in many discoveries or investigations in just about any discipline you care to name. That said, being in the right place at the right time — just having people who “perform testing” — is not enough. Having the preparation to be able to act on serendipity is also important and thus part of the goal of testing as a service is allowing our delivery teams to see how to get the right people in the right places to leverage opportunities.
Testing as a service, and thus as a social discipline, places great importance on getting timely information about genuine opportunities to people early enough that they can consider it and do something about it, if they so choose.
Thus we don’t have to worry about the perfect knowledge to get it all right the first time. Consider, perhaps, the ancient Greeks. They didn’t strive for certain knowledge. They knew that there would always be tension between what we knew, what we hoped to know, and even what we could know. Their goal, at least broadly speaking, was to take into account various amounts of information, some of it conflicting, as well as various types of evidence, all of which could be open to interpretation, and be willing to alter the course of action to take advantage of opportunities and avoid problems that the information and evidence were indicating to us.
Testing, as the service I envision, is all about the following:
- Discover vital information quickly.
- Deliver that information effectively.
- Characterize the extent and impact of testing compellingly.
- Do all of that economically.
These are the discussions I would much rather see testers having than the type of dialogue I started off this post with. In a technocratic industry that certainly can devalue testing, I would like to see test specialists start to frame more discussions around the “economics of testing” but with a healthy understanding that economics itself is a complicated discipline, often focused on framing uncertainty and enabling various types of decision-making. Facilitating variable decision-making under conditions of uncertainty is something that I personally believe test specialists should be very good at.
Yet, to be good at it, we have to be able to discuss it first and frame a narrative around it. This post is the start of my own attempt at that.