There is a distinction I want to make in this post between what you change in a test specification and how a test specification itself may change in terms of the role it plays. That leads nicely into how team roles also change. Here by “test specification” I mean the traditional “feature file” of BDD tools like Cucumber, Lettuce, Spinach, SpecFlow, and so on.
Playing around with the Steam game service during the holidays had me thinking of how I would write features for some of the functionality I was observing. So here’s an example:
Feature: Steam Badges

  Scenario: Choice voting
    When participating in 5 community choice votes
    Then the badge earned task count goes up by 1

  Scenario: Add to wishlist
    When 10 games are added to a wishlist
    Then the badge earned task count goes up by 1

  Scenario: Purchase a game
    When 1 game is purchased
    Then the badge earned task count goes up by 1

  Scenario: Purchase a gift
    When 1 gift is purchased
    Then the badge earned task count goes up by 1
By itself, you might think this isn’t too bad. However, ask yourself this: what if a lot of those specific values change? For example, what if it now takes 10 community votes, but only 5 games added to the wishlist? What if purchasing a game now gives you two earned task counts rather than just one? Obviously these changes would be reflected in the source code of your application, but all of the scenarios above would also have to change. That’s the case even though the business intent has not changed. Some people consider that simply the cost of doing business with tests like this.
That’s one way to look at it. Another way to look at it is that you can write your feature more at the intent level rather than at the specific data implementation level. Here’s an example of what that might look like:
Feature: Steam Badges

  Scenario: Choice voting
    When participating in enough community choice votes
    Then one of the three tasks that count toward the badge total is fulfilled

  Scenario: Add to wishlist
    When enough games are added to a wishlist
    Then one of the three tasks that count toward the badge total is fulfilled

  Scenario: Purchase a game
    When an eligible game is purchased
    Then one of the three tasks that count toward the badge total is fulfilled

  Scenario: Purchase a gift
    When an eligible gift is purchased
    Then one of the three tasks that count toward the badge total is fulfilled
Notice here that some of the wording may even have allowed us to explore things a bit more. For example, we were forced to discuss what “enough” community choice votes meant. Maybe it ended up meaning 5. We had to discuss what “enough games” meant. Perhaps it was just 1. Likewise, the wording “eligible gift” and “eligible game” forced us to discuss whether some games or gifts would not be eligible. The notion of a certain number of tasks may have made us consider whether one action could count for more than one task in terms of the total.
I would argue that not only does this modified form make it easier to add extra conditions — whose values we may not know yet since the business is still deciding them — but we can also shorten up our wording even more:
Feature: Steam Badges

  Scenario: Choice voting
    * participating in enough community choice votes counts towards the badge total

  Scenario: Add to wishlist
    * adding enough games to a wishlist counts towards the badge total

  Scenario: Purchase a game
    * purchasing eligible games counts towards the badge total

  Scenario: Purchase a gift
    * purchasing eligible gifts counts towards the badge total
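If scenarios like the ones above were automated, the concrete numbers could live in one place behind the steps. What follows is only a minimal sketch in Python under that assumption. The names `BadgeRules`, `enough_votes`, `enough_wishlist_adds`, and `tasks_fulfilled` are hypothetical, the thresholds simply echo the numbers from the first example, and no particular BDD tool's API is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BadgeRules:
    """Hypothetical home for the specific values the scenarios leave unstated."""
    votes_needed: int = 5           # assumed: what "enough votes" currently means
    wishlist_adds_needed: int = 10  # assumed: what "enough games" currently means


def enough_votes(rules: BadgeRules) -> int:
    """Resolve the phrase 'enough community choice votes' to a number."""
    return rules.votes_needed


def enough_wishlist_adds(rules: BadgeRules) -> int:
    """Resolve the phrase 'enough games added to a wishlist' to a number."""
    return rules.wishlist_adds_needed


def tasks_fulfilled(votes: int, wishlist_adds: int, rules: BadgeRules) -> int:
    """Count how many badge tasks the given activity fulfils under the rules."""
    fulfilled = 0
    if votes >= rules.votes_needed:
        fulfilled += 1
    if wishlist_adds >= rules.wishlist_adds_needed:
        fulfilled += 1
    return fulfilled


# The step "participating in enough community choice votes" would ask the
# rules for the number instead of hard-coding it in the scenario text.
current = BadgeRules()
revised = BadgeRules(votes_needed=10, wishlist_adds_needed=5)

for rules in (current, revised):
    votes = enough_votes(rules)
    print(rules, "->", tasks_fulfilled(votes, 0, rules), "task fulfilled")
```

In this sketch the same intent-level step passes under both sets of rules: only the `BadgeRules` values change, while the scenario wording stays as it is.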
What this shorter form showcases even better is that if our specific data decisions change, we would simply change the value behind the scenes if these tests were automated. You are still testing the actual values; you are just not stating what they are in the test, so that the test keeps its relevance (from an intent perspective) even if some details change.

Now someone could ask: okay, that’s great for automation. But what about a tester who has to read these tests? What about a business user who has to read them? Aren’t they going to need to know what “enough” means? Or what “eligible” means? The answer is, yes, they are. The question is whether the test specification should be where they turn.

I keep battling with this question. One thing I have learned is that it’s important to keep in mind what test specifications based on features are for: they are for expressing a business need. They are not used as unit tests. Testing specific values in various combinations is something that can (and arguably should) be done in unit tests. That’s what unit tests are good for. And unit tests can be applied at various levels, such as database, business logic, and web logic. When the test specifications are automated as functional system tests, the specific values will be tested at the UI level.

But still: what about the manual tester or the business user who wants to confirm specifically how something works? Well, before getting to that, let’s consider another variation on the above example. Let’s say I want those business needs to be expressed as a workflow. For example, it’s mentioned that there is a badge count or badge total. So what about something like this:
Scenario: Getting a holiday badge
  When participating in enough community choice votes
  And adding enough games to the wishlist
  And buying an eligible game
  Then a holiday badge is granted
Again, there are no specific values for how many choice votes have to be participated in, how many games have to be added, and what an eligible game is.
Should there be?
The conclusion I’m starting to play around with is that the only reason for putting specific example values in scenarios is to facilitate communication when the test specifications (features) are written. Sometimes various stakeholders, like business analysts or testers, need an example to help uncover various aspects of behavior. For example, being forced to ask whether there is such a thing as an “ineligible game” might lead to the idea that only games that are not already discounted count towards the badge, and thus are eligible. But I don’t need a list of specific games to understand how to test that. I could write:
Scenario: Eligible Game
  * purchasing a non-discounted game counts towards the badge total

Scenario: Ineligible Game
  * purchasing a discounted game does not count towards the badge total
That works for the eligible/ineligible distinction because it’s essentially binary. But note that it allows me to talk about workflows a little easier:
Scenario: Badge Totals Only Count for Eligible Portions of an Order
  When a non-discounted game is purchased
  And a discounted game is purchased
  Then only the non-discounted game counts towards the badge total
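The rule behind the mixed-order scenario above could be expressed in code without naming any real game. This is a hedged sketch only: `Purchase`, `counts_toward_badge`, and `eligible_purchases` are hypothetical names, and the rule "non-discounted means eligible" is the assumption floated in the discussion, not a confirmed Steam rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Purchase:
    """A purchased game; the title is a placeholder, not a real catalog entry."""
    title: str
    discounted: bool


def counts_toward_badge(purchase: Purchase) -> bool:
    """Assumed eligibility rule: only non-discounted games count."""
    return not purchase.discounted


def eligible_purchases(order: List[Purchase]) -> List[Purchase]:
    """Return the portion of an order that counts towards the badge total."""
    return [p for p in order if counts_toward_badge(p)]


# An order mixing one non-discounted and one discounted game.
order = [
    Purchase(title="Full Price Game", discounted=False),
    Purchase(title="Sale Game", discounted=True),
]
print([p.title for p in eligible_purchases(order)])
```

The check depends only on whether a purchase is discounted, which is the same thing a tester would look for on screen.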
In all these cases, as long as a tester knows how to spot a discounted versus a non-discounted game, they can test these cases without having to know specific data, such as a list of games. But what about the case of numbers, such as “five community votes”? If I just say “enough community votes”, a tester will not necessarily know that this is five or more, right?
Well, ask yourself this: how would a user know that they have to put five community votes in place? Presumably this would be available to them as part of the operation of the site, as it certainly was on Steam. So what the application does would be a good indicator.
Wait … what? Use the application as a source of truth? That’s ridiculous. Well — perhaps — unless:
- The test specifications were the result of discussion between developer, tester, and business analyst.
- The decisions of those discussions were encoded as part of the application.
The above are some of the key elements people often fail to get when they have trouble understanding how all this works. They are not used to the following ideas:
- The idea of testing as a design activity.
- The idea of test specifications that are drivers to development.
- The idea of effective unit tests that catch bugs at one of the most responsible times: when the code is being written.
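To make the last idea concrete, here is a rough sketch of how specific values might be pinned down in unit tests while the feature files stay at the intent level. It uses Python's standard `unittest` module. The function `vote_task_fulfilled` and the constant `VOTES_NEEDED` are hypothetical, and the value 5 is only the number used in the earlier example.

```python
import unittest

VOTES_NEEDED = 5  # assumed threshold, taken from the earlier example


def vote_task_fulfilled(votes: int, votes_needed: int = VOTES_NEEDED) -> bool:
    """Return True when the number of votes meets the threshold."""
    return votes >= votes_needed


class VoteThresholdTest(unittest.TestCase):
    def test_boundaries_around_current_threshold(self):
        # Specific values and their combinations are exercised here,
        # not in the feature file.
        cases = [(0, False), (4, False), (5, True), (6, True)]
        for votes, expected in cases:
            with self.subTest(votes=votes):
                self.assertEqual(vote_task_fulfilled(votes), expected)

    def test_threshold_can_change_without_touching_scenarios(self):
        # If the business decides ten votes are needed, only the value moves.
        self.assertFalse(vote_task_fulfilled(5, votes_needed=10))
        self.assertTrue(vote_task_fulfilled(10, votes_needed=10))
```

Saved as a module, tests like these could be run with `python -m unittest`. The boundary cases give a tester or developer a place to look up exactly what “enough” means at any given moment.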
So can test specifications have specific data? Should they? Certainly they can and I think it’s up to a team to decide whether they need to or not. That’s one question. Another question is whether it is necessary or practical for specific examples to remain in test specifications (features) when and after they are implemented.
The answer to this question seems to hinge, in large part, on how practical it seems for business stakeholders to retain ownership of test specifications. (Assuming they ever had such ownership.) This is an important question and it’s yet another one that many people don’t consider. The key idea here is that once a feature is implemented based on a test specification, that feature effectively becomes source code. That business need becomes encoded as an artifact and that artifact, by definition, is maintained and refactored in a way that is different from how business needs are discussed.
That source code is ultimately what we deliver. That’s what customers pay for. Well, what they pay for is an application or service that provides value by meeting their needs in the most effective way possible. But what that ultimately means in most cases is the source code that drives the application. They are not paying for our test specifications or our discussions in round table spec workshops. They are paying for what we ultimately deliver to them. There is a feedback mechanism here: source code that results from discussions of shared quality can feed back into further discussions about how that code must be modified to support new or enhanced features.
What all of this is really speaking to is cross-team discipline: a developer and a tester really have to be able to function as a business analyst. A business analyst and a developer have to be able to function as a tester. A tester has to be able to function as a developer. Because our artifacts, which are a shared communication mechanism, can morph, so must our roles. Just as our specifications become our tests, which become our code, our roles have to accommodate this shifting dynamic and morph as well. When that happens, the knowledge that we have encoded as specifications and as various forms of tests can exist in different forms, so that everyone can get the amount of detail they need, but from the appropriate place.