Recovering Context by Test Thinking

This is one of those posts that I find the most fun to write but that many testers probably struggle with, in terms of seeing how (or even whether) I’m being relevant. I want to talk a little about an aspect of testing that I think is consistently underused and consistently undersold in the industry: recovering a context that has been buried under years of major and minor decisions.

I’m going to use The Oregon Trail and Dungeons & Dragons to make my point. I’ve used game examples in the past for context, such as talking about bumping the lamp and how exploratory attention to detail matters.

The Testing Relevance

I feel that this aspect of recovering context, which is also a way of recovering history, is very much a mandate of testing. I say that because part of testing is exploring, which means not just exploring what’s here now but how it got that way and how it changed along the way. That’s how we combine ontology (what things are) with ontogeny (the history of changes that preserve or degrade the integrity of something). It’s also how we bring in epistemology (the way we know things).

Testing is also about investigation, which often means looking at the current state of something and being able to tell how it differs from a previous state. That may be in the context of code, a working application, or in a series of requirements documents or user stories. Sometimes you are working to recover information based on work done by people who don’t remember or who are no longer around.

I’ve mentioned before that I think testing is like cartography and that testers often have to act as historians and archaeologists.

This involves a lot of test thinking which is predicated upon experiments that lead to discoveries. This provides for one of the ways in which testing acts as a framing activity. Regular readers have probably heard me use that term once or twice, generally while talking about testing as a design activity and testing as an execution activity.

The “framing activity” I refer to here is, in part, the “recovering context” of the title of this post. It’s also about providing context. All of which is another way of talking about providing a shared understanding of something, where that understanding is evidence-based but with the recognition that sometimes we have to make inferences from evidence that is of varying probative value.

The History of the Trail

So let’s talk about The Oregon Trail. This was actually a series of games but here I’ll be sticking with the first incarnation.

The game was first written in 1971 by three teachers at Carleton College, which is a small liberal arts college in Northfield, Minnesota. The three guys — Don Rawitsch, Bill Heinemann, and Paul Dillenberger — are shown here from the Carleton College 1971 yearbook:

They wrote the game in BASIC on an HP-2100 series minicomputer. The 2100 series was HP’s first computer line and it could be equipped with a number of possible operating systems. One of the most common was called HP Time-Shared BASIC. It’s this version that The Oregon Trail was originally written for.

It was Rawitsch who first conceived of and designed The Oregon Trail as a board game. It was Heinemann and Dillenberger, who had experience in programming, who suggested that it could be computerized. Thus they set about and wrote up a program based on Rawitsch’s idea.

The first people outside of this trio to play the game were the students in Rawitsch’s history class on 3 December 1971. This was back in the early seventies, keep in mind, so this meant they played it on a teletype machine which Rawitsch wheeled into his classroom. When Rawitsch left the district in 1972, however, he deleted the game from the system. The only part of it that still existed at that point was a printed out listing of the source code.

In 1973 the Minnesota state legislature founded an organization called the Minnesota Educational Computing Consortium (MECC). Their role was basically to increase the use of computers in education. A UNIVAC 1100 mainframe was installed at MECC’s Minneapolis headquarters and over one thousand terminals were connected to it from schools throughout the state. In 1974, MECC hired Rawitsch, who decided that one such use of “computers in education” could be his game.

The UNIVAC was a mainframe rather than a minicomputer like the HP-2100 but it did have a version of BASIC available. So Rawitsch set about creating The Oregon Trail on this machine. In fact, it was long thought that Rawitsch simply ported the game to the new system. Interviews with him, however, indicate that this wasn’t the case. Instead he typed the program out once again on another HP-2100 system, using that listing of source code that he had. In order for the game to eventually run on the UNIVAC implementation of BASIC, this new version that Rawitsch re-wrote had to be modified.

For historians looking into this, what wasn’t entirely clear is when all that happened or how much had to be modified. What is known is that The Oregon Trail was enhanced by Rawitsch to be more historically accurate, to provide a bit more consistency in terms of some game mechanics, and to be a little more entertaining. That was simply modifying the game concept itself. Then there are the modifications that someone else seems to have made to get the game to play on the UNIVAC.

That version (this modified version, or perhaps even a few versions) was played by thousands of schoolchildren all over the state during the next several years.

Now let’s jump ahead to 1977. MECC replaced its aging UNIVAC with a top-of-the-line CDC Cyber-73 system. And, sure enough, The Oregon Trail was modified once again to run on that system. Now, importantly, this version was the version that appeared in the July-August 1978 issue of Creative Computing. You can read a little about that in the article Oregon Trail Ver. 3 (BASIC 3.1, 1978).

So, for the longest time, this was the only version that anyone actually had. The prior history of implementation of The Oregon Trail, from 1971 to 1977, was thought to be entirely lost.

But then, in 2001, an old tape image from a school district of York County, Pennsylvania turned up and on it was a program that had the name “Oregon.” Sure enough, when loaded up, there was a version of The Oregon Trail that was dated 27 March 1975.

What’s evident is that Rawitsch, or MECC — or someone — continued to improve and refine the game for years before it made its way to the Apple II, which would have been sometime in 1980. By 1980 MECC had purchased around five hundred Apple II machines and installed them in classrooms all over Minnesota, where children used them to play the freshly ported Apple II version of The Oregon Trail.

The 1978 version found in Creative Computing has some features not present in the 1975 version. In both versions, the player has to enter a word quickly into the terminal at certain points, such as when hunting or being attacked. The 1978 version, however, has a difficulty setting. When the program starts, the player is asked how good a shot they think they are (“ace marksman” to “shaky knees”). This determines how much time the player is given to type the word. And unlike in the 1975 version (which only allowed players to type “BANG”), in 1978 that word is chosen randomly from four possibilities (“BANG,” “BLAM,” “POW,” and “WHAM”).

A peculiarity that The Oregon Trail shares with many other BASIC games of this era is that it seems to expect — even to depend upon — the player having a look at the code in order to fully understand what’s going on in the game. In fact, much early gaming was exactly like this. Code spelunking was considered part of the experience of exploration.

As one example, looking at the source code tells you that stopping at a fort for supplies dramatically reduces the miles you can cover in a single turn. That is in no way clear from the game itself.

Testing Point: Look At The Source!

Is there a corollary there with modern testing and looking at source code? Of course there is!

Looking at the source code is sometimes how we can find certain sensitivities or boundaries or kludges that would not otherwise be obvious to us just from using the actual program. It’s often how we learn to craft better boundary conditions or to question whether equivalence classes are appropriately established.

Making The History Tangible

I have the BASIC versions of both Oregon 1975 and Oregon 1978 if you’re curious to take a look at them.

So let’s use that article I referenced earlier and treat these as versions 2.0 (1975) and 3.0 (1978). What we’re missing then is version 1.0, the one written back in 1971 and the one that started everything off.

The challenge there is that this version apparently never made it beyond the system on which it was written. And we know Rawitsch deleted it off of that system in 1972. But we have a bit of history that says it survived. Remember that listing of source code I mentioned? Consider this bit of evidence:

That picture is from an Oregon Trail anniversary event at the Mall of America in 1995. Rawitsch is holding something that looks like a computer printout and it would be interesting if it was the one he took with him all those years ago. That said, the printout does appear to be lost given that Rawitsch has no idea what ultimately happened to it.

What this means is that it’s likely that the 1975 version is the best we’ll be able to do. Which is better than the 1978 version that was our previous best.

My main point here is to show you a little bit of the investigation context that ultimately led to those artifacts. This is how we uncover history. This is how we, using testing as a framing activity, can help developers and business determine what is the case now, what was the case then, and perhaps a little understanding of how we got from the one to the other.

Okay … But Who Cares?

Looking at the code of the 1975 version does tell some very interesting stories in itself.

For example, we can tell that — accounting for the time frame and the language used — the program was written by a careful programmer. But we can also recover the knowledge that it was maintained by someone who was less experienced. Or perhaps simply lacked the time to continue the discipline of the original programmer. This makes sense with what we know of the history. Heinemann and Dillenberger, remember, wrote the first code. They were programmers. Rawitsch was not a programmer but he was ultimately the one who maintained the code.

Just looking at the structure of the code tells us something. While most of the program is numbered in steps of 5, this pattern is occasionally broken. This is somewhat odd because HP-BASIC had a fairly decent renumbering facility. Rawitsch probably didn’t know about it or care one way or the other. But what’s interesting is that we can assemble a list of lines which break the numbering pattern.

And why does that matter? Because these probably indicate places where Rawitsch made changes or additions as the program evolved. Here are some examples:

Line 8-11 Added MECC name, maintainer, and version

Line 262-263 Caution user against using a dollar sign

Line xx99 added section names in remarks

Line 1332 require user answer to be an integer

Line 1537 added caution about spending

Line 1752 discovered that ‘7 was a bell, added note to that effect

Line 1902 made question two lines

Line 2392 fixed bug when riders don’t attack

Line 2672 added ammunition losses to heavy rains losses

Line 2792 added ammunition losses to fire losses

Line 2891 may have changed Indians to wolves and cause death

Line 3147 added ammunition losses to wagon damage

Line 3317 added ammunition losses to blizzard damage

Line 3650-3658 added next of kin and aunt Nellie

Line 4012 added another note about ‘7 bells

Line 4279 changed congratulatory message
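
Assembling a list like the one above is easy to automate. What follows is a minimal sketch in Python, not the process I actually used: it assumes that "on pattern" simply means a line number that is a multiple of 5, and the sample listing is a made-up fragment rather than lines from the real 1975 source. The function name and the sample are both hypothetical.

```python
import re


def off_pattern_lines(listing: str, step: int = 5) -> list[int]:
    """Return BASIC line numbers that are not multiples of `step`.

    Assumption: the original author numbered in multiples of `step`,
    so anything else is a candidate for a later insertion.
    """
    flagged = []
    for raw in listing.splitlines():
        match = re.match(r"\s*(\d+)\b", raw)
        if match is None:
            continue  # blank or unnumbered line, nothing to check
        number = int(match.group(1))
        if number % step != 0:
            flagged.append(number)
    return flagged


# Hypothetical fragment for illustration only; NOT the real 1975 listing.
SAMPLE = """\
1325 PRINT "HOW MUCH DO YOU WANT TO SPEND ON FOOD"
1330 INPUT F
1332 LET F=INT(F)
1335 IF F >= 0 THEN 1345
1340 PRINT "IMPOSSIBLE"
"""

print(off_pattern_lines(SAMPLE))
```

Note the blind spot in this heuristic: an inserted line that happens to land on a multiple of 5 would not be flagged, so the output is a list of candidates to investigate and not a list of confirmed changes.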

And why does that matter? Because — and this is key to thinking like a historian — working from these clues and the historical record, it might be possible to reconstruct the original 1971 version of The Oregon Trail.

Obviously the end result would inevitably be a bunch of conjecture and speculation. But it would possibly be informed conjecture and speculation. To give you an idea, I have a small diff that shows some of this process in action.

Yet there’s a cautionary tale here. While we can learn a lot from the line numbers, we can’t know what modifications Rawitsch might have made within certain lines.
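
To make that caution concrete, here is a small Python sketch of comparing two listings by their BASIC line numbers. Everything in it is an assumption for illustration: the function names are hypothetical and the two fragments are invented, not taken from any real version of the game. The point is that added and removed lines show up from the numbers alone, while a line edited in place only shows up when you have both texts to compare.

```python
def parse_listing(listing: str) -> dict[int, str]:
    """Map each BASIC line number to the statement text that follows it."""
    lines = {}
    for raw in listing.splitlines():
        number, _, text = raw.strip().partition(" ")
        if number.isdigit():
            lines[int(number)] = text.strip()
    return lines


def compare_listings(old: str, new: str) -> dict[str, list[int]]:
    """Classify line numbers as added, removed, or changed in place."""
    old_lines = parse_listing(old)
    new_lines = parse_listing(new)
    shared = old_lines.keys() & new_lines.keys()
    return {
        "added": sorted(new_lines.keys() - old_lines.keys()),
        "removed": sorted(old_lines.keys() - new_lines.keys()),
        "changed": sorted(n for n in shared if old_lines[n] != new_lines[n]),
    }


# Hypothetical fragments for illustration only.
OLDER = """\
100 PRINT "DO YOU WANT TO HUNT"
105 INPUT X
110 IF X=1 THEN 200
"""

NEWER = """\
100 PRINT "DO YOU WANT TO HUNT (1) OR CONTINUE (2)"
105 INPUT X
107 LET X=INT(X)
110 IF X=1 THEN 200
"""

print(compare_listings(OLDER, NEWER))
```

With only the newer listing in hand, line 107 stands out because it breaks the numbering, but the edit to line 100 is invisible. That is exactly the gap in the evidence when the older version no longer exists.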

Yeah, Okay … But Who Cares?

Consider this from the book The Half-Life of Facts:

“Knowing how facts change, how knowledge spreads, and how we adapt to new ideas is important because it helps us make sense of our world. It can also allow us to anticipate the shortcomings in what we each might know and help us plan for these flaws in our knowledge.”

Make sense of the world. Anticipate shortcomings. Plan for flaws. Doesn’t that sound a bit like the ambit of testing overall?

But key to this framing aspect of testing is a bit of what I showed you above. Ideally we don’t ask people to make one big leap over a very large pond of information. Instead we give them several stepping stones. We try to place those stones a reasonable distance apart from one another. Thus is crossing the pond made just that much easier for people. People don’t fall in and metaphorically drown in the information.

But, and this is a key point, we need to have no fewer, but no more, stepping stones than are useful, necessary, and helpful to get from the beginning to the end; from the “then” to the “now.” From the quality people think is in place to the quality that is probably actually in place.

One final point of interest before we take our leave of The Oregon Trail. At some point MECC started to realize it had a valuable property on its hands and it started to more actively control distributions and claim copyright protections. This actually stopped them from distributing the game on the TRS-80 systems, which were somewhat popular at the time.

So it’s of interest that in an October 1979 issue of SoftSide magazine, a game was published called Westward 1847, apparently written by Jon C. Sherman. This was made available for the TRS-80. No big deal, right? It was just someone cashing in on the success of The Oregon Trail. Yet SoftSide also published the code for Westward 1847 and, given the historical context we recovered above, looking at that code reveals something interesting.

What we find is that the game is pretty much The Oregon Trail with modifications to let it run on the TRS-80. (If you’re curious, I have a PDF version of the game’s intro and source code.)

Heading to the Dungeon

Here I’ll give a slightly shorter example about historical recovery. I’m not going to belabor this one with a lot of test corollaries. I’ll trust you can spot those on your own. First let’s set up some context for you.

In 1977 the holder of the non-literary rights to the works of J.R.R. Tolkien lodged a complaint against the publisher of Dungeons & Dragons which led to a number of hasty changes in the rulebook. But it also led to some questions about the originality of the game.

Even prior to that, late in 1976, the two co-creators of Dungeons & Dragons (Dave Arneson and Gary Gygax) parted ways quite acrimoniously. This eventually ended up in a 1979 lawsuit by Arneson against Tactical Studies Rules (TSR), which was Gary Gygax’s company, as well as against Gary Gygax personally.

The relevance here is that both of these legal actions have obscured the influences on Dungeons & Dragons. By which I mean how it came to be and who contributed to what.

Arneson’s lawsuit was effectively centered around the basis of inventorship and, from that basis, the royalties owed for derivative works.

What this led to is both Arneson and Gygax — understandably, perhaps — casting the history of Dungeons & Dragons in a light favorable to their respective claims. What this meant is that a bit of historical revisionism was taking place. But how much and to what extent? What had to be determined was who had more evidence on their side for their claims.

The practical upshot of this is that a great deal of actual history got buried. Further, after the legal aspects were over, a settlement by both parties largely made sure that everyone had to be silent on the particulars of that history, such as the specifics of their collaboration and what led to what.

This all led to some other interesting consequences. For example, many of those who were eyewitnesses, as it were, to the early formation of the game ended up engaging in a long-standing bout of factionalism, essentially “picking sides.” Whether you were on “Gygax’s side” or “Arneson’s side”, you probably had strong views about the whole situation and some of those were no doubt formed by individual and collective memory.

These battles were often waged in so-called “fanzines” (fan magazines) of the time. So you would think we could perhaps turn back to these sources that were close to the events in question, thus recovering some of the history as long as we accounted for the various biases that were likely prevalent.

The problem there is that this was the late 1970s. Duplicating technologies at the time were less than stellar and so these fanzines, even when circulated, did not do so broadly. And those that did were few in number. And those few have, by and large, been themselves lost to history. Even when we can find those fanzines, there are challenges. These were materials put out by fans in an “at the moment” kind of way. This means dating, attribution, and sequencing were rarely priorities. In other words, it’s hard to tell what happened when and who was responsible.

As historians — and as people testing the history to recover some of it — the intellectually honest approach leads us to realize that although many of the nuanced points within Dungeons & Dragons clearly originated with one or the other co-creator, some of the most lasting innovations emerged from another source entirely: the gaming communities that those co-creators frequented. (As the saying goes, “While the authors create the work, the fans create the phenomenon.”)

There’s a lot more to say about this. I’ve done a ton of exploration into the history of Dungeons & Dragons. Why, you might ask? Largely to flex the mental muscles required to make sense of a complicated subject that has evolved over time as the result of human thinking and action and subject to the whims of human memory.

And that latter point is exactly what a lot of testing is about! Speaking of that …

Was This Really About Testing?

One of the things testing focuses on is having discipline about inferences made from evidence. There’s a book called Fooled By Randomness by Nassim Taleb and in that, I read this:

“Memory in humans is a large machine to make inductive inferences. Think of memories: what’s easier to remember, a collection of random facts glued together, or a story, something that offers a series of logical links? Causality is easier to commit to memory. Our brain would have less work to do in order to retain the information. The size is smaller.”

That resonated with me quite a bit.

I talked about modeling testing awhile back, which focused on this idea of a story or a narrative. I also talked about how we have to, sometimes and perhaps paradoxically, not be such testers. In that I said:

“I think testers need to embrace the idea of narrative more. To understand the fundamental aspects of storytelling and why this mode of communication works so well.”

Testers help people see a complete picture; a map of causality, as it were. Yes, granted, we’re not always having to reach back decades in order to do so. But we are often dealing with source material that people have different perceptions and memories about, including sometimes very strong feelings about. Seeing a complete picture, or as complete a one as we can recover, matters for decision making in many cases.

Paraphrasing from another of Taleb’s books, Antifragile, I would argue that testing helps with variably-predictive decision making under conditions of uncertainty. When we have trouble reasoning about the artifacts that we are building (and thus testing), this leads to decision making that is often uncertain. It’s why we still have trouble getting estimation down correctly. Or why delivery can still be an issue for companies, regardless of the talent they have on board. Or why we still have software with numerous bugs, no matter the advances in our craft or our tooling.

Taleb talks about “anywhere the unknown preponderates, any situation in which there is randomness, unpredictability, opacity, or incomplete understanding of things.” That pretty much describes just about any project where humans are building complex things. It’s these places where we often have to make our decisions and understand what it is we’re talking about, building, deploying, etc.

Playing With History

I’ll throw one more thing at you here to show that I’ve practiced this a bit.

I have a Dialect series on my blog. This was a breakdown of an open source test tool I wrote. This ended up being a redesign of a previous tool of mine, Symbiont. But, perhaps confusingly, I also had a different version of Symbiont (that was basically the same thing; just kind of different) that came about after Dialect. That Symbiont tool eventually evolved into Tapestry. And even that has now evolved into my tool Testable, which I have written nothing about.

One of those tools, Dialect, is entirely unrecoverable at this point, and that’s on purpose. (Or, actually, is it?) One of the Symbionts became the other and tracing back the original would require some of the digging that I showed you with The Oregon Trail. Recovering the few bits of where Testable diverges from Tapestry would be much easier.

Recovering the thought process of how the tools were created, and what other tools I used as my basis, would require looking at the written material (blog posts), somewhat similar to digging into that Dungeons & Dragons history.

What’s my point? Is it that I’m a confused and confusing test tool developer?

No.

(Well, maybe.)

The point is that all of my open source work on test supporting automation has also had a meta purpose: the evolution of something through time. They are all just similar enough that you can probably make out the broad strokes of their lineage. What’s less clear is what the actual difference is in all cases or what led to those differences. Yet … this history is all (for the most part) recoverable through my blog posts, through the operations of the code itself (i.e., using the tools), and via looking at the code.

It’s not often that you get to claim the method to your madness was a long-running demonstration of what you believe is a fundamental principle of your discipline. I’ll close this post off being strangely (and probably unjustifiably) proud of myself.
