I’ve said before that testing games is hard and I’ve also shown that I created a game for interviewing test candidates. You wouldn’t be far off the mark to figure that I put a large focus on game-related thinking. This is because, in my experience, such thinking makes some of the best testers. But let’s talk about that a bit.
First, I’ll note that I’m not just talking about people who play games. I’m talking about people who play games but who also think about the playing of the game beyond just its mechanics.
Events, Mechanics, and Emotions
Let’s just start off with a little game design. Instead of authoring events directly, game authors design mechanics. It’s those mechanics that then generate events during play. That’s a fairly crucial point. The events emerge from the interaction between a player taking some action and the game mechanics, which respond to that action in a particular context. So: game designers don’t design events; they design mechanics that generate events. This layer of indirection is the fundamental difference between games and most other media.
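To make that indirection a bit more concrete, here is a minimal sketch in Python. Everything in it is my own illustration rather than anything from a real game: the `Context` fields, the two mechanics, and the event names are all hypothetical. The only point is that nobody writes the events directly; they fall out of the rules reacting to an action in a given context.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Context:
    """Hypothetical game state at the moment the player acts."""
    player_health: int
    enemy_health: int
    enemy_is_shielded: bool


def attack_mechanic(action: str, ctx: Context) -> List[str]:
    """A mechanic: a rule that turns an action plus context into events."""
    if action != "attack":
        return []
    if ctx.enemy_is_shielded:
        return ["attack_blocked"]
    events = ["enemy_hit"]
    if ctx.enemy_health <= 10:
        events.append("enemy_defeated")
    return events


def low_health_mechanic(action: str, ctx: Context) -> List[str]:
    """A second mechanic that responds to context, whatever the action was."""
    return ["low_health_warning"] if ctx.player_health < 20 else []


# The designer authors this list of rules, not the events themselves.
MECHANICS = [attack_mechanic, low_health_mechanic]


def play(action: str, ctx: Context) -> List[str]:
    """Collect whatever events emerge when every mechanic sees the action."""
    events: List[str] = []
    for mechanic in MECHANICS:
        events.extend(mechanic(action, ctx))
    return events
```

The same "attack" action produces different events depending on the context it lands in, which is roughly why a tester ends up looking at behavioral responses rather than at isolated functions.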
It’s that layer of indirection that makes all the difference to testers who are testing games because you are not just looking at a set of functionality, but rather a set of behavioral responses that are cross-functional in nature and generate an experience of some sort. It might be hard to see how this translates to non-game application testing but, I would argue, for the really good testers, this is largely intuitive.
Events in a game produce many small emotions in players. A minor setback creates a little pulse of frustration. A moment of indecision can easily worry a player, with both excitement and trepidation vying for supremacy. One player might acknowledge the achievements of another, so players can feel a bit of acceptance. People sometimes describe gaming experiences as “happy” or “sad” or even “bored.” Those words describe giant shifts in the most obvious feelings. The micro emotions, the ones that make up the tapestry of play, change constantly, every second.
Think that only applies to games? I assure you, major site applications like Facebook take this very aspect into account. The entire science and practice of gamification is predicated upon game-based thinking being applied to applications of all sorts. Knowing whether this works or not and knowing what kind of experience it will provide is a large part of what testers should be looking for.
Detecting and understanding subtle emotions is a designer skill and it’s also a tester skill. As was presented by Michael Bolton, emotions are important in software testing. Having to consider the emotional experience of a user is often quite important. If you’re frustrated by some aspect, it’s likely a user will be as well. If you find yourself appreciating a particularly nice feature that lets you get something done more efficiently, it’s likely your users will appreciate it as well.
I hope it’s clear that what I said here applies to any applications, but I do believe that games allow you to practice these skills quite a bit more.
A Game By Example
A couple of years ago, I commented on how I felt Star Wars: The Old Republic (SWTOR, for short) was a good experiment for testers. In reality, just about any RPG-style game, including MMOs, would make my point. But here I’ll use The Old Republic as a specific example mainly because I’m a huge Star Wars geek.
For those who haven’t played this particular game, or any RPG-style MMO, the basics are that you have classes you can choose to play. Those classes tend to be broken up into three categories: tank, damage, or heal. The play mechanics differ drastically based upon what category you choose to play. Further, every class of character has its own abilities. And, as you might imagine, those abilities complement the category your class is in. Beyond even that, each class (and category) gets its own set of utilities. These are essentially extra abilities or complements to existing abilities.
So, as just one example, let’s say you want to play the class Jedi Knight. You want to play that class in the category of a tank, which means you want to specialize into an advanced class called Jedi Guardian. By the way, “being a tank” just means that you wear very heavy armor and can take a whole lot of damage. You don’t (necessarily) deal out a lot of damage. Your goal, as a tank, is to generate sufficient threat so that enemies attack you rather than your friends. Meanwhile, your friends should be attacking the enemies, whittling them down. They take less damage because you, as the tank, are taking it for them. (Sounds like a very Jedi thing to do, right?)
I bet you can already see that you have to make a lot of decisions here about your strategy. Being a tank is taking a particular approach to solving problems. Taking that approach means you have to get very good in the techniques related to that approach. Sound just a little bit like testing? Wait! It gets even better.
Apply Situational Techniques
One particular challenge you’ll face in games like this is the “boss”. This is some enemy on a given level that is often really, really hard to beat. They have lots of ways to counter most of your abilities and attempt to disrupt your strategy. In the context of testing, I would call the “boss” the imposed deadline for getting something “out the door.” In the context of the game, there are variants of your utilities that you can apply to assist against particular boss mechanics.
Utilities come in three categories: Skillful, Masterful and Heroic. You can’t use all of these. You have to pick and choose. Also, in the Shadows of Revan expansion (call it a “feature release”), there are different boss types: Underlurkers, Revanite Commanders, and then Revan himself. So taking those things together, here is how you probably want to set up your utilities for those specific bosses:
The differences, as you can see, can be very slight. That’s a problem with some people’s perspective of testing: it all looks pretty much the same from a certain distance. But the small differences in how techniques (abilities) are being applied can make all the difference.
Can you see the correlation to testing? You can’t just rely on a single strategy to get you through any encounter. You have to pick and choose techniques (in this case, utilities) and then figure out how to apply them effectively given a specific scenario that you are facing. This kind of thinking is critical to being a tester. Playing games like this entirely reinforces those mental muscles.
Rotate Your Abilities
In these styles of games you will end up with lots of abilities. Some of these abilities are more powerful than others. Some have what’s called a “cooldown”, meaning once you use them, they are unavailable to you for a duration of time. Others can do what’s called “stacking”, which means you can use the ability a certain number of times and have its effects applied that number of times. Other abilities do what’s called “ticking”, which basically means that while the ability is active, it will “tick” and cause some effect, such as damage or healing. Some abilities are best when used together, others are best when used separately.
Lots to think about, right?
Games like this tend to have a “rotation”, which basically means that you use your abilities in a particular order and then rotate through them until the fight is won or lost. There is also a system that some games use called “circumstantial priority” and what this means is that you don’t necessarily follow a strict rotation, but rather prioritize your abilities and then use them when and if they are available. The “when and if” part reflects the fact that events will happen in the game that effectively render some abilities unavailable or useless. Thus applying an inflexible rotation will often not do you a lot of good.
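Here is a rough sketch of what “circumstantial priority” might look like if you wrote it down as code. To be clear, this is my own toy model and not how any particular game implements it: the ability names, the cooldown numbers, and the 1.5-second step between actions are all invented for illustration.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical abilities: name -> cooldown in seconds (numbers are invented).
COOLDOWNS: Dict[str, float] = {
    "big_hit": 12.0,
    "medium_hit": 6.0,
    "filler": 0.0,
}

# Highest priority first.
PRIORITY: List[str] = ["big_hit", "medium_hit", "filler"]


def pick_ability(now: float, ready_at: Dict[str, float],
                 disabled: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Return the highest-priority ability that is off cooldown and usable."""
    for name in PRIORITY:
        if name in disabled:
            continue  # the fight has made this ability unavailable right now
        if ready_at.get(name, 0.0) <= now:
            return name
    return None


def simulate(steps: int, step_seconds: float = 1.5,
             disabled_at: Optional[Dict[int, FrozenSet[str]]] = None) -> List[Optional[str]]:
    """Use one ability per step, always taking the best one available."""
    disabled_at = disabled_at or {}
    ready_at: Dict[str, float] = {}
    used: List[Optional[str]] = []
    for step in range(steps):
        now = step * step_seconds
        choice = pick_ability(now, ready_at, disabled_at.get(step, frozenset()))
        if choice is not None:
            # Using an ability puts it on cooldown.
            ready_at[choice] = now + COOLDOWNS[choice]
        used.append(choice)
    return used
```

Calling `simulate(6)` gives one ordering, while `simulate(6, disabled_at={0: frozenset({"big_hit"})})` gives a different one because the first choice was blocked at the start. A strict rotation would have stalled or wasted that first step; the priority list simply moves on to the next best thing.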
Certainly you can see the correlation with testing there, right? Certain test techniques stand a better chance of ferreting out certain kinds of bugs, but using too many techniques at once, or without recognition of the context, can make testing ineffective, inefficient, or both.
To put this in context and give you a visual, let’s consider the so-called “opener”. These are your opening moves in an attack.
Wow! That’s a lot for just the “opening” moves, no? But, just like testing, often you are applying a series of thinking and techniques at once, often not even aware you are doing so.
To give at least a little context, the opener shown above is used to generate the most threat in the short amount of time you have before the DPS threat catches up and overtakes you. In other words, you want your threat (as a tank) to be more than the damage-per-second (DPS) that other players are doing. If they do more damage than you do threat, the enemies will concentrate on them and not you.
Notice the first ability used: Force Leap. This will close the gap to your target. That’s important because this means you are essentially rushing to the enemy to begin the battle. However, what about when you want to open from range and let the enemy come to you? That’s covered by this opening:
Uh, wait. Isn’t that the exact same opener? Nope. Notice that first ability there. It’s the one that’s different; you substitute Saber Throw for Force Leap. In this case, you actually do a small bit more threat than the previous opener. That can be good, based on the situation. It can also be really bad. Force Leap is pretty much immediate in terms of when you generate threat whereas using Saber Throw means your threat generation is held off by 1.5 seconds. That 1.5 seconds may not seem like much but it’s plenty of time for a DPS player to generate enough threat.
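If you want to see why that 1.5 seconds matters, a little arithmetic helps. The sketch below is a deliberately simplified model of my own: the threat numbers are made up, it assumes the DPS player starts attacking at time zero, and it ignores whatever thresholds the real game applies before an enemy switches targets. The only figure taken from the discussion above is the 1.5-second delay on Saber Throw.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opener:
    name: str
    first_hit_threat: float  # invented number, not taken from the game
    delay_seconds: float     # time before the tank's first threat lands


def dps_head_start(opener: Opener, dps_threat_per_second: float) -> float:
    """Threat a DPS player can bank before the tank's first hit lands."""
    return opener.delay_seconds * dps_threat_per_second


def tank_leads_after_first_hit(opener: Opener, dps_threat_per_second: float) -> bool:
    """True if the tank's first hit outweighs the DPS player's head start."""
    return opener.first_hit_threat > dps_head_start(opener, dps_threat_per_second)


# Hypothetical values: Saber Throw hits slightly harder but lands later.
FORCE_LEAP = Opener("Force Leap", first_hit_threat=1000.0, delay_seconds=0.0)
SABER_THROW = Opener("Saber Throw", first_hit_threat=1100.0, delay_seconds=1.5)
```

With these made-up numbers, a DPS player generating 500 threat per second banks 750 during the delay, so the Saber Throw opener still comes out ahead. At 900 per second the head start is 1350 and the tank is already behind when the throw lands. Force Leap has no delay, so it never gives up a head start at all.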
My point isn’t to get into the mechanics of the game but rather to note here that you must pick a path: there’s only room for either your gap closer (Force Leap) or a Saber Throw, not both. So you have to make a call as to which technique best fits the situation.
This is like in testing where you apply your “hardest hitting” techniques first to find the most critical risks as quickly as possible (“generate threat”). Then you apply a relatively standard set of techniques after that.
It’s that intuition and/or skill to pick the right technique at the right time (often as your “opener”) that separates the good testers from the really good testers. And here “really good” simply means the ability to get the most important information in the most reasonable amount of time with the fewest number of tests.
The opener is one thing, but there are also the “mid-fight” techniques to consider. This is when you are in the thick of it.
The above rotation is an approximation of what you’ll do. Why approximation? Mainly because the mid-fight depends on how the opener is handled. Depending on the abilities that you use, you will notice abilities coming off cooldown in a certain order.
Incidentally, the “filler” above just indicates that you do something, anything, that may be worthwhile based on whatever the circumstances are. This is often where you use your imagination and, in a testing context, this would be a great place for some aspects of exploratory testing.
Finally, there’s also something called “area of effect” aspects to battles, and this is where you’re dealing with enemies that don’t just cause damage to an individual player but to an entire area.
When in an AoE situation, the rotation will all depend on your current standing of Warding Strike. I don’t want to bore you with too many details but essentially Warding Strike increases your own damage reduction by 3% for a certain number of seconds and deals a certain amount of damage to the target with a series of quick melee attacks.
So, with Warding Strike active going into an AoE situation, your threat generation is not only quite high but vastly easier to sustain due to the fact that the Guardian Slash ability deals a massive amount of threat and has a damage cap of eight enemy targets. That means you’ll likely be hitting all possible targets in your vicinity and generating threat against all of them. Without using Warding Strike, notice where Guardian Slash is. What this means is that you have to do some extra work to generate threat first, before you can build up enough to use your Guardian Slash.
I’m vastly oversimplifying the complex job of being a good tank but I hope the visuals are at least somewhat indicative of what I’m talking about. The idea at all points is that you are making choices. Some of those choices mean you have to consider how best to use your abilities to achieve whatever your desired effect is. That same sort of idea applies in testing as well, such as deciding what level of abstraction to test at, whether to mock or stub, whether to push testing down the stack (to units) or to push testing up the stack (to acceptance).
Think Like a Gamer ... and a Scientist
In these games, when confronted with a new enemy or a set of enemies, you have to look closely at their arrangement and their particular attributes. Are they a “strong”? An “elite”? A mob that is immune to knockbacks or crowd-control mechanisms? You have to take stock of what your abilities are and then determine exactly what you need to accomplish. If I have one elite, one strong, and one weak, do I try to burn down the elite since it’s the strongest? Or do I take out the weakest first, but allow the strong and the elite to get some good hits in on me while I’m doing so? Do I put up my shield right away or, given that it has a long cooldown, do I save that for later in the fight when I might desperately need it?
What you’re doing there is developing a mental model of what will happen when you start the attacks. Then you test your model directly and see how well your prediction matches (video game) reality. If you modeled correctly, you devastate the enemies and move on to the next group. If your model was incorrect or incomplete, your fight probably goes fairly poorly. You refine your strategy based on what you learned and try again.
What you’re basically doing here is using the same mental tricks as your average scientist.
Testing, much like any science, is an observational and experimental discipline. What we know evolves based on experiment and observation. This, as it turns out, is exactly what game players do regularly.
Science, Gaming, and Testing
So let’s put our Dr. Science hat on for a minute here and think this through. I think most everyone remembers the idea that a hypothesis is a reasonable guess for a possible fact, based on some evidence. In testing, this is what our (usually test-driven) specifications or acceptance criteria start out as. Our evidence is the business telling us what they in fact want. Our hypotheses are the examples or tests that provide an expectation for the “possible fact”, i.e., the feature, as described, actually existing and working.
There’s a test design component in here that has to build on these facts, however. Consider that in most scientific disciplines, “facts” must be based on accurate, carefully checked observations, usually verified by different people. Scientists use imagination as well as logical reasoning to build on the facts they have. They make hypotheses: reasonable guesses as to possible explanations for what they observe or, perhaps most importantly, for what they expect to observe.
Well, testers are doing the same thing when they take the facts they are given — the initial specifications — and build up further expectations for possible facts, such as variations in how the software will work, possible error handling scenarios, and so on. These must be capable of being verified by different people, not just the testers who did the writing. Going back to our game example, the game mechanics must be consistent and reliable for all players. All players must be able to draw the same conclusions about openers and rotations but, crucially, that information is also available to be questioned and amended. This is particularly the case as games like this are changed (often called “balanced”).
Experiments and Tests
In science, experiments are often designed to get new observations to test hypotheses. Well, that’s what testers are doing as well. Our experiments are the tests we are specifying (before implementation) and the execution of those tests (after implementation). In both phases, we get new observations and thus new ideas of what to test or how to test it. Business analysts likewise learn new or better ways to express what they want and developers learn new or better ways to build it. In the context of the game, the “balancing” that I mentioned is often tested by seeing if any classes end up “over-powered” due to the changes, meaning they have an unfair advantage over other classes. This causes much player angst and goes back to the emotions I mentioned earlier.
In science, hypotheses are not considered to be true until supported by very strong evidence. Well, that’s the same for testing. Not only must we see working functionality in an application, but we must see functionality that matches the other evidence we have: the specification and/or acceptance criteria. Further, we must keep seeing that evidence supported over time, just as scientists (and gamers) do.
Obviously you can do all of this thinking with any application but games tend to make all of this more interesting and certainly serve to intensify much of the experience. This is why I’m convinced that some of the best testers are those who focus not necessarily on just playing games, but on studying how games provide the experience they do and thinking about how they would test that experience.