The Theseus of Testing

I was going to frame this post as “The Ontology of Testing” but, while writing it, the Ship of Theseus, a thought experiment around the metaphysics of identity, seemed apropos. This is particularly the case in an industry where testing, as a discipline, can struggle to find or retain its identity. I was also going to call this post “The Identity of Testing” but the subject was a little broader than just that. So let’s dig in!

Before looking at Theseus and his ship, let’s actually consider some gaming history. As some of my readers will know, I do believe in the idea of gaming like a tester.

The Initial Identity of Games

As the microcomputer revolution started up and progressed from, say, 1977 to 1984, there were certain categories of games that served to frame identities. Those games were ultimately growing out of what, at the time, was a tradition of text-based games that were finding their own identity. Those earlier text-based games started to be called things like:

simulation game
participation novel
electronic novel
compunovels
interactive fiction

The name that stuck the most, and still does, is “text adventures.” The idea of the textual adventure was interesting because it allowed for a distinct identity when technology improved and “graphical adventures” started to come on the scene. What we ended up with as the early microcomputer years wore on were certain categories.

There was the textual adventure.
There was the graphical adventure.
There was the computer role-playing game (CRPG) adventure.
There was the action adventure.

Notice how everything is an “adventure.” I suppose you could frame that as a core identity, but then I’m getting a bit ahead of myself.

So instead of focusing on “adventure”, let’s focus on that “graphical” part for a bit.

So we had something like Zork 1:

And you had a CRPG like Ultima 1:

And you had an action game like Karateka:

Notice something there? The CRPG and the action games also used graphics. But the CRPG tended to focus on strategy and tactics, usually around a team of characters where there is a heavy reliance on statistics-driven combat. The action adventure tended to focus more on reflex-oriented activities, such as jumping around platforms or using combinations of “moves” to fight enemies.

So the point is: why is “graphical adventure” slotted out on its own? That’s not really a distinct identity, right? In fact, certain textual adventures could show graphical pictures, such as Mystery House:

Or Oo-Topos:

Yet that didn’t make them graphical adventures in their identity. But why not? Well, the “true” graphical adventures replaced descriptive text with pictures and a textual interface (usually a parser) with a joystick or mouse. With the graphical adventures relying more on the mouse, they also came to be called “point-and-click adventures.” Something like Maniac Mansion is a good example:

Yet, then again, you had something like King’s Quest 1 which combined text and graphics and didn’t really rely on the “point-and-click” interface so much:

So what was the common element? What was the identity?

The Evolving Identity of Games

There was a common identity that could be framed around gameplay — not so much the game mechanics, but rather with the ways the games played out. Here this meant (usually) dealing with puzzle-solving and exploration. But more than that, the focus was on the emphasis on story. There was a coherent storyline or narrative arc that went beyond just a set of quests and certainly well beyond doing nothing more than learning a set of mechanics.

Okay, how is this related to testing?

This gets into definitions, and definitions can be tricky. A lot of testers should know this, since there are debates around what testing is or is not, around whether we should define anything via the term “automated testing,” and around whether the term “manual testing” should ever be used.

Jimmy Maher, also known as “The Digital Antiquarian”, tackled this subject a bit in his post Ludic Narrative née Storygame. That post started me thinking on this and I made a few comments there to that effect. So my post here certainly owes a debt of gratitude to Jimmy’s. More specifically, Jimmy brings up four framing points to make up the identity of what he considers a “ludic narrative”:

The work must be directly and obviously interactive.
A computational simulation (a “storyworld”) must enable that narrative.
The player must play the role of an individual (a distinct persona) in the storyworld.
There must be a coherent story arc and it must be possible to complete that story.

To tell you what games, or kinds of games, the above may or may not disqualify would turn this post into a history of gaming.

Consider that some of the above games had you taking on the role of a nameless person involved in some story, even if the story was a bit threadbare.

Zork had you essentially exploring the ruins of a kingdom with the sole aim being to find a certain number of treasures.

Mystery House had you exploring a more restricted domain — literally just a house — and rather than treasures, you found clues to determine the identity of a killer.

Oo-Topos ups the stakes a bit: you have to escape imprisonment on a planet so that you can deliver something that will literally save the human race from extinction.

Karateka has the anonymous hero ascending a mountain into the fortress of a villain named Akuma to rescue Princess Mariko.

Ultima 1 has you trying to stop the evil wizard Mondain after being hired by Lord British to find some way to circumvent the wizard’s immortality before he destroys and/or enslaves a continent.

All nameless adventurers undertaking some quest within the context of an overarching story. What you explore may be different — a house, a prison, a mountain fortress, an underground dungeon — but the basic idea is the same.

Some of the games focused on very specific characters or even multiple characters.

King’s Quest 1 has you play as Sir Graham who is sent on a quest by King Edward to save the Kingdom of Daventry by finding three legendary treasures.

Maniac Mansion has you initially play as Dave Miller who, along with friends that you can also play as, must rescue his girlfriend, Sandy Pantz, from a mad scientist named Fred Edison and, along the way, thwart an effort to take over the world.

Notice how those games all share a few themes and motifs (collecting certain treasures), usually saving something (world, continent) or someone (a captive). So, is that the identity? Is that what makes these “adventure games”?

Well, notice that the “plot” of Karateka had basically already been used a few years earlier by Donkey Kong, with the hero (at least given a name, “Jumpman”, only later to become Mario) trying to rescue the “damsel in distress”, who was known as Pauline. So does this mean Donkey Kong is just as much an “adventure game”?

Still wondering … how does this relate to testing?

Consider our “adventure game” concept as a spectrum (something that can be “more or less an adventure game”) as opposed to framing it as a dichotomy (something that can “either be an adventure game or not”). This is important thinking for a tester. Way back when dinosaurs roamed the Earth, I wrote about how testers are not either-or.

What this leads us to is that we can try to come up with a “scientific” definition of what “adventure games” are; this can serve as a checklist that includes or excludes. Or we can have a heuristic definition of what “adventure games” are, where we essentially have to allow the boundaries to blur a bit. That second one is a much harder realm to occupy because it leads to people having angst when they feel boundaries are being violated.

That happens with testing in our industry all the time. We have a spectrum or a gradient of activities that form the identity of testing. Rarely do we have a simple checklist. But a lot of testers these days act as if we do have such a checklist.
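The contrast between a checklist definition and a gradient definition can be sketched in a few lines of Python. This is only an illustration: the criterion names paraphrase Jimmy Maher's four framing points, while the `Work` class, the helper functions, the threshold, and every score are hypothetical values invented for the example, not measurements of anything.

```python
from dataclasses import dataclass

# Criterion names paraphrase the four framing points quoted earlier.
CRITERIA = (
    "interactive",
    "simulated_storyworld",
    "distinct_persona",
    "completable_story",
)


@dataclass
class Work:
    name: str
    # Hypothetical judgments from 0.0 (absent) to 1.0 (fully present).
    scores: dict


def checklist_verdict(work, threshold=0.5):
    """Dichotomy: every criterion must clear the threshold, or the work is out."""
    return all(work.scores[c] >= threshold for c in CRITERIA)


def gradient_score(work):
    """Spectrum: average the scores to say how much of an adventure game it is."""
    return sum(work.scores[c] for c in CRITERIA) / len(CRITERIA)


# Invented scores, purely to show how the two definitions diverge.
zork = Work("Zork", {
    "interactive": 1.0,
    "simulated_storyworld": 1.0,
    "distinct_persona": 0.25,
    "completable_story": 0.75,
})
donkey_kong = Work("Donkey Kong", {
    "interactive": 1.0,
    "simulated_storyworld": 0.5,
    "distinct_persona": 0.75,
    "completable_story": 0.25,
})

for work in (zork, donkey_kong):
    print(work.name, checklist_verdict(work), gradient_score(work))
```

With these made-up numbers the checklist excludes both works outright, while the gradient places them at different points along the spectrum. That difference is the one that matters when the same question is asked about an activity such as automation and whether it counts as testing.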

The Context of Identity

I started off the gaming discussion around a very specific context: that of the early microcomputer industry and specifically within the context of roughly 1977 to 1984. To be sure, the modern world of gaming is very different from that of that timeframe; it has evolved. And that helps us see that identity can be retained even over wide swathes of changes. As Jimmy Maher has said in his The Digital Antiquarian series:

“If some of the traditionally story-oriented forms of game have retreated from the mainstream, their absence is more than made up for by the piles of first-person shooters, real-time strategy games, and casual tycoon games that now also want to be narrative experiences to one degree or another.”

Testing has also evolved. There was a time when “computers” referred to human operators of machines. They computed. The original testing was basically: debugging. A cyclic activity involving code execution, observation and code correction. We can (and should) still consider debugging a form of testing. It is one aspect of the identity of testing.

It’s certainly the case that the “pure textual adventure” and the “pure graphical adventure” have somewhat retreated. The former is no longer commercially marketed to any great extent and the latter is in a similar situation. What has happened is that all of those above aspects of gaming identity have somewhat merged together to form a cohesive experience. All are still adventures; they simply combine the aspects that previously differentiated them.

I see “testing” as having done that same thing. Testing has retained its identity and continues to do so even if the word “automated” or “manual” is put in front of it.

Again: there can be a scientific definition and a heuristic definition. This gets into “feelings.” Does a certain game feel like a story-based “adventure game” even when it’s perhaps not billed that way? And if it provides the experience required, does it matter?

I would argue the same for testing: a context, like automation, can feel like testing and that means — contrary to what I said here about how automation is not testing — it is a form of testing. It isn’t all of testing; it isn’t close to being most of it. But, along the gradient or spectrum, automation does have an identity aligned with testing.

Theseus Emerges!

Now, finally, I get to the point of the title of this post.

There’s a thought experiment that is referred to by the name “Ship of Theseus” and it raises some of the questions about identity that I explored above.

The basic story goes like this: Theseus, the legendary founder of Athens, had an impressive ship in which he had fought numerous battles. The Athenians wanted to honor his contributions to the state. So they preserved his ship in their port. Now, the problem is that occasionally a plank or part of the mast would decay beyond repair. And at some point that piece would have to be replaced to keep the ship in good order. And this continued over the course of much time and, thus, many repairs.

Notice how we’re already at a question of identity: is it the same ship after we’ve replaced one of the planks? What about after we’ve replaced 25% of the planks? 50%? What about after we’ve replaced all of the planks? Thomas Hobbes took this to another level and asked: What if we then took all the old planks and built a ship out of them? Would that new ship, built entirely out of the old ship’s material, then suddenly become the Ship of Theseus?

Narrowly speaking, these are all questions about identity. This is effectively a bit of the ontogeny that I didn’t explore as much in my post on the basis of testing; it’s looking at the changes something undergoes over time while still retaining its core identity. Even more broadly, these are questions about ontology, our basic view of what exists in the world. What kinds of things are there at all? And how do we know it’s that thing (its identity) versus some other thing?
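For readers who think in code, the puzzle has a loose analogue in the difference between an object's identity and its contents. The sketch below is an analogy only, and it settles nothing philosophically; the `Ship` class and the plank names are hypothetical, made up for the illustration.

```python
class Ship:
    def __init__(self, name, planks):
        self.name = name
        self.planks = list(planks)

    def replace_plank(self, index, new_plank):
        """Swap out one plank and return the plank that was removed."""
        old = self.planks[index]
        self.planks[index] = new_plank
        return old


original_planks = [f"old-{i}" for i in range(4)]
ship = Ship("Ship of Theseus", original_planks)
identity_before = id(ship)

# Replace every plank, one repair at a time, keeping the discarded ones.
removed = [ship.replace_plank(i, f"new-{i}") for i in range(4)]

# Hobbes's twist: build a second ship out of the discarded planks.
rebuilt = Ship("Ship of Theseus", removed)

# The repaired ship is still the same object, yet shares no original material.
continuity_preserved = id(ship) == identity_before
shares_no_planks = not (set(ship.planks) & set(original_planks))

# The rebuilt ship has all the original material, yet is a different object.
material_preserved = rebuilt.planks == original_planks
is_different_object = rebuilt is not ship
```

All four flags come out true here, which is the puzzle in miniature: "same ship" can mean continuity of the object or sameness of the material, and the two answers come apart.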

Managing Identity and Threats to It

What this thought experiment shows us is that when we ask about the identity of something, a whole bundle of stated and unstated assumptions tend to come along for the ride. We are assuming that there are things called “ships” or “tests” and things called, say, “automation” and that these things have some persistence over time. And everything goes fine with that view of the world … until we come up against situations that put a strain on how we define these kinds of objects or these concepts. The question then becomes: what are we allowing to strain the definitions? And are we correct in perceiving that strain as a problem that’s worth having and thus trying to solve?

All of this frames a lot of my thinking over the career of my blog.

This is how we navigate the gradient of “semantics matter, but not all semantics matter equally” and thus avoid too much of what I described as “semantics dismissal.”

This is how we navigate the dangers of suitcase words that may, or may not, preserve the identity of something.

This is where I think the manual testing deniers are very much fighting a battle that need not be fought.

This is where I feel the people who think “DevOps killed testing” are feeling a strain where none exists.

This is where I feel those who feel the future of AI puts testing at risk are simply wrong and seeing AI as a straining element to testing when it’s not.

This is where I think those who continue to put an emphasis on checking as a viable term are doing so because of a strain that they perceive to the identity of testing; a strain that simply isn’t there to the extent that the identity of testing is truly threatened.

All this matters because our attempts to make sense of the big picture inevitably involve different kinds of overlapping ways of talking about the world. That, in fact, is part of how we frame the identity of something. We realize that aspects of it can change but that this does not compromise the thing itself. This too allows us to broaden our ontologies.

Sean Carroll, in his excellent The Big Picture: On the Origins of Life, Meaning, and the Universe Itself, says this:

“As knowledge generally, and science in particular, have progressed over the centuries, our corresponding ontologies have evolved from quite rich to relatively sparse.”

What I think is interesting is that in testing, the reverse has been true. Yet testing is very much based on science. However, the testing we are talking about is not about the actions of the universe, but rather the actions of human beings as they interface with technology and build complex things.

What the Ship of Theseus tells us is that the notion of a “ship” is a derived category in our ontology, not a fundamental one. To be sure, it’s a useful way of talking about certain subsets of the basic stuff of our world, such as when those subsets come together to form an outcome: the ability to move about on bodies of water. We invent the concept of a ship because it’s useful to us.

Testing is something we have invented because it, too, is useful to us. It’s useful to us when it incorporates more into its identity rather than less.

This is a challenge that I find modern testers have to, first, recognize and then, secondly, evangelize.

There is a wider topic here. Beyond the “Theseus of Testing”, I think we could get into the “Theseus of Quality.” When is the identity of quality compromised such that what we have no longer counts as “quality”? That’s a discussion we, in testing, deal with all the time.

But that’s an adventure for another post.

One thought on “The Theseus of Testing”

nilanjan says:

11 November 2020 at 5:20 pm

“By running your Rails tests you can ensure your code adheres to the desired functionality even after some major code refactoring.”

https://guides.rubyonrails.org/v5.2/testing.html#a-brief-note-about-test-cases

That simple statement explains all of developer testing, including agile and DevOps.

Now compare that with the ideas in this blog (testerstories). The identity of testing hasn’t really evolved. For most developers (and testers) they don’t even need to make a statement like the one in the rails tutorial. You just write confirmatory tests.

More on this later…
