Communicating In a Test Description Language

A TDL (Test Description Language) is a constructed language that we use to describe, and thus specify, our requirements as tests. Or our tests as requirements, if you prefer. This is what allows testing to be a design activity. What makes a style of writing a TDL is adherence to a structuring element and a set of principles and patterns that are used to guide expression.

Current forms of TDL swirl around various BDD concepts, such as Given-When-Then. But it’s clear that just having that focus in place does nothing for you by itself, because a lot of thought goes into how you want to express yourself. I’ve found many testers really struggle with this but, equally, I’ve found I struggle to adequately teach the level at which you should work with a TDL.

In an article on Introducing BDD, Dan North states that he and Chris Matts were trying to develop a template that, “had to be loose enough that it wouldn’t feel artificial or constraining to analysts but structured enough that we could break the story into its constituent fragments and automate them.” This gave birth to the Given, When, Then syntax. One of the examples they provide is this:

Scenario 1: Account is in credit
  Given the account is in credit
  And   the card is valid
  And   the dispenser contains cash
  When  the customer requests cash
  Then  ensure the account is debited
  And   ensure cash is dispensed
  And   ensure the card is returned

While reading that I came across an article about not using Given/When/Then. The author says that the GWT format could instead be this:

The dispenser when the account is in credit and the dispenser has cash should debit the account.
The dispenser when the account is in credit and the dispenser has cash should dispense the requested cash.
The dispenser when the account is in credit and the dispenser has cash should return the card.

A commenter to that post says that, in fact, it could be this:

The dispenser should dispense cash when the account has credit.
The dispenser should refuse to dispense cash when the account does not have credit.
The dispenser should return the card when all transactions are complete.

That does sound a little more like useful English, and while I would agree that the two latter sets read better than the scenario from North’s original article, I think the problem with these latter two approaches is that their sentences are not necessarily connected. For example, does “when all transactions are complete” mean just those transactions that were in the sentences above? In the first statement, the account having credit may be one condition, but what if the account has a hold on it?

I don’t think the problem here is with Given/When/Then so much as it is with the specification of the details. The key thing is a domain phrase. Let’s take just part of the original specification from North’s article. Here’s the first part:

  Given the account is in credit
  And   the card is valid
  And   the dispenser contains cash

This is set up as a context. Could we call that a “viable transaction”? That is, a “viable transaction” is defined as one where a valid card is being used against an account with sufficient credit and where the dispenser has cash on hand. Well, let’s replace the Given with that and see how the scenario reads as a whole:

  Given a viable transaction
  When  the customer requests cash
  Then  ensure the account is debited
  And   ensure cash is dispensed
  And   ensure the card is returned

Not bad, I guess. But are we hiding too much detail there? Is it better to stick with saying “in credit” to indicate specifically what is meant? Let’s leave it as it is for now and consider the outputs from the original scenario in North’s article:

  Then ensure the account is debited
  And  ensure cash is dispensed
  And  ensure the card is returned

Could we call that “appropriate transaction response”? In other words an “appropriate transaction response” is defined as one where the account is debited appropriately (based on the money being taken out), cash is actually given to the user, and the card is returned to the user. So let’s replace the Then:

  Given a viable transaction
  When  the customer requests cash
  Then  an appropriate transaction response occurs

Hmmm. Well … that’s kind of useless, isn’t it? It’s close to saying “Given that everything is as it should be, then everything will work as it should.” As a thinking tester, I suppose this is nice in that it suggests a lot of things I could test. But that also depends on my skill as a tester. If I gave that to a developer to implement a solution, there are no doubt many ways that the developer could implement the idea that a “viable transaction” leads to an “appropriate transaction response.” If I gave the above to a business analyst, they certainly could agree that we have the high level idea down but they could not say whether we actually understand what it means to use the system in a way that was intended.
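
To make concrete how much the two domain phrases hide, here is a minimal Python sketch that expands them back into the individual conditions from North’s scenario. Every name in it (AtmState, is_viable_transaction, request_cash, is_appropriate_response) is hypothetical and invented for illustration, and it assumes the simplest possible cash machine, with "in credit" read as "the balance covers the requested amount."

```python
from dataclasses import dataclass


@dataclass
class AtmState:
    """Hypothetical model of the facts the scenario talks about."""
    balance: int          # account balance in whole dollars
    card_valid: bool
    dispenser_cash: int   # cash on hand in the dispenser
    cash_dispensed: int = 0
    card_returned: bool = False


def is_viable_transaction(state: AtmState, amount: int) -> bool:
    # "Given a viable transaction" unpacks into the three original Givens.
    return (
        state.balance >= amount             # the account is in credit
        and state.card_valid                # the card is valid
        and state.dispenser_cash >= amount  # the dispenser contains cash
    )


def request_cash(state: AtmState, amount: int) -> AtmState:
    # Assumed behavior: only a viable transaction moves money.
    if is_viable_transaction(state, amount):
        state.balance -= amount
        state.dispenser_cash -= amount
        state.cash_dispensed = amount
    state.card_returned = True
    return state


def is_appropriate_response(before_balance: int, state: AtmState, amount: int) -> bool:
    # "Then an appropriate transaction response occurs" unpacks into the three Thens.
    return (
        state.balance == before_balance - amount  # the account is debited
        and state.cash_dispensed == amount        # cash is dispensed
        and state.card_returned                   # the card is returned
    )


state = AtmState(balance=500, card_valid=True, dispenser_cash=1000)
after = request_cash(state, 50)
print(is_appropriate_response(500, after, 50))  # True
```

The two one-line steps in the abstract scenario correspond to six separate checks here, which is exactly the detail a reader of that scenario is left to guess.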

As a related side note, you can take this kind of thing way too far. In the realm of reductio ad absurdum, I could remove so much of the detail from a scenario that it loses its ability to tell a story:

Scenario: The Entire App
  Given the app is called up in a browser
  When  the app is used
  Then  the app works perfectly, every time, for every user

Obviously this scenario is silly, and we’d never write something like that, right? But it illustrates what can happen when you keep moving up the abstraction chain and climb so high that your scenario fails to communicate intent and describe behavior. If you use scenarios like this, you would need an extremely high level of trust in the developers (to code something useful) and the testers (to actually test it usefully). Having telepathy wouldn’t hurt either. Clearly there’s no story being told here. (Or if there is, it’s like saying: “Once upon a time, a lot of stuff happened. The end.”) Again, this example may seem silly, but what I was showing above with the transaction example is how easily you might approach that sort of situation, where your scenario basically says the equivalent of “as long as nothing goes wrong, everything will work fine.”

In another context, consider that I could say something like this:

Customers should be prevented from entering invalid credit card details.

You could argue we have a statement of intent here. What we have is in fact a business rule. Here is a statement of intent regarding the details of what we mean by the business rule:

If a customer enters a credit card number that is not sixteen digits, when they try to submit the shopping cart, the shopping cart should be redisplayed with an error message telling the customer that they entered too few digits.

Certainly we would argue the latter is more testable in that it more specifically calls out what needs to be tested. It describes what is meant by “invalid credit card details” and describes what is meant by “prevented”.

Some people would say this is getting too much into implementation details. I would argue it’s not, though. Implementation details would be describing how we parse the number, or what button is used to submit the form, or the specific color and placement of the error message. Here we are fleshing out the business rule or business feature by describing the responsibilities of the system as it fulfills this feature.

How would the above get translated into a test specification? Here’s one possible example:

Feature: Feedback provided for invalid credit card details
  
  In order to avoid having customers unclear about invalid transactions
  the system must provide feedback about what specifically went wrong.
  
  Background:
    Given a user buying an item
    And   the user enters a credit card number (<-- needed?)

  Scenario: Credit card number too short
    When  the card number entered is less than sixteen digits
    And   all the other details are correct (<-- needed?)
    And   the form is submitted (<-- needed?)
    Then  the form should be redisplayed
    And   a message should appear indicating the correct number of digits required

Notice a few places there that I have "(<-- needed?)" in the text. Here I'm calling out whether those statements are even necessary. Are they incidentals that convey nothing useful about the purpose of the scenario? Put another way, could I change this to read:

  Background:
    Given a user buying an item

  Scenario: Credit card number too short
    When  the user tries to buy an item with a card number of less than sixteen digits
    Then  a message appears telling them that sixteen digits are required

What do I lose or gain by going with either scenario? It's worth thinking about. For example, in the second case I don't specify that the cart form is redisplayed so where the error message appears is not stated. Is that important? Or an incidental? In both cases, note, however, that my intent is clear ("credit card number too short") and the way I test it is clear ("card number of less than sixteen digits").
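
As a rough illustration of what the shorter scenario pins down and what it leaves open, here is a Python sketch of the behavior. The function name submit_cart, the CartResponse type, and the exact message wording are all assumptions made for this example; nothing here comes from a real shopping cart. Only the too-short case from the scenario is handled.

```python
from dataclasses import dataclass
from typing import Optional

REQUIRED_DIGITS = 16  # taken from the business rule in the text


@dataclass
class CartResponse:
    """Hypothetical result of submitting the shopping cart."""
    accepted: bool
    cart_redisplayed: bool
    error_message: Optional[str] = None


def submit_cart(card_number: str) -> CartResponse:
    # Count digits only, so spaces or dashes in the input are ignored.
    digits = [ch for ch in card_number if ch.isdigit()]
    if len(digits) < REQUIRED_DIGITS:
        return CartResponse(
            accepted=False,
            cart_redisplayed=True,
            error_message=f"Card number must have {REQUIRED_DIGITS} digits.",
        )
    return CartResponse(accepted=True, cart_redisplayed=False)


# When the user tries to buy an item with a card number of less than sixteen digits
response = submit_cart("4111 1111 1111")

# Then a message appears telling them that sixteen digits are required
assert response.error_message is not None
assert "16" in response.error_message

# Only the longer scenario would also check this:
assert response.cart_redisplayed
```

The last assertion is the one the shorter scenario drops. Whether that matters depends on whether redisplaying the form is part of the behavior or an incidental.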

You might also notice that since this feature is general enough -- being about invalid credit cards -- other scenarios can clearly be put in place that would test out other variations of invalid credit card usage. Some examples, just giving the scenario title:

Scenario: Credit card that is not accepted by the vendor
Scenario: Credit card that has a hold on it
Scenario: Credit card that has been flagged as stolen

So now, rather than taking Dan North's original example, let's take one of those variations from above. Specifically, let's look at this:

The dispenser should dispense cash when the account has credit.
The dispenser should refuse to dispense cash when the account does not have credit.
The dispenser should return the card when all transactions are complete.

Looked at a certain way, isn't that really just saying the same thing as what I just came up with when I said this:

  Given a viable transaction
  When  the customer requests cash
  Then  an appropriate transaction response occurs

In a way, yes, but of course the example spells it all out a bit more than my very abstract scenario. So that's where the main decision point of a TDL starts to come in: how abstract are you? How imperative versus how declarative? To start looking at this, how would that become a test spec? One example:

Feature: Dispensing Cash
  Background:
    Given a user with a valid account

  Scenario: Account has credit
    When  the user requests cash from an account with credit
    Then  the user is given cash
    And   the card is returned

  Scenario: Account does not have credit
    When  the user requests cash from an account with no credit
    Then  the user is not given cash
    And   the card is returned

But ... there's still potentially an issue there, right? Or no? Well, the phrases "the user is given cash" and "the user is not given cash" indicate what will happen in each scenario, but obviously a test can't check whether the user actually received cash in their hand. The "card is returned" makes sense because it's calling out that the system will return the card in those cases, as opposed to, say, this:

  Scenario: Account is flagged as a security risk
    When the user requests cash from an account flagged as a security risk
    Then the user is not given cash
    And  the card is not returned

We seem to have captured business rules at a high level here. Is that enough? In the post I referenced earlier, the author says "language and fluency are important." I agree, but specification is also important. Behavior is also important. And thus I think the first example (from the Introducing BDD article) is a little more in line with what I find useful, as it turns out. Why? Because it's providing a bit more specifics about what exactly happens or at least starts on that path. Consider that Dan North's original scenario says this:

  Then  ensure the account is debited
  And   ensure cash is dispensed

The "ensure the account is debited" gets closer to what we actually want to make sure happens. The "ensure cash is dispensed" gets back into the business rule a bit. Someone could read those two lines above as synonymous. In reality, of course, they are not. The system could dispense cash and yet have a bug that does not debit the account. But of those statements, the "ensure the account is debited" is the behavior I am looking for or, at least, that I can test for.
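
The difference between the two assertions can be shown with a small Python sketch. The BuggyAtm class below is hypothetical and deliberately wrong: it dispenses cash but never debits the account, which is the kind of bug described above.

```python
class BuggyAtm:
    """Hypothetical cash machine with a deliberate bug: no debit."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.cash_dispensed = 0

    def withdraw(self, amount: int) -> None:
        self.cash_dispensed = amount
        # Bug: the line that should debit the account is missing.
        # self.balance -= amount


atm = BuggyAtm(balance=500)
atm.withdraw(50)

cash_was_dispensed = atm.cash_dispensed == 50   # "ensure cash is dispensed"
account_was_debited = atm.balance == 450        # "ensure the account is debited"

print(cash_was_dispensed)   # True
print(account_was_debited)  # False
```

A scenario that only checks for dispensed cash would pass against this implementation; the debit check is what catches the bug.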

Yet sometimes when I write scenarios out that way, I feel like I'm crossing the line from intent to implementation. What about making sure that statements like "ensure the account is debited" are spelled out with examples? George Dinwiddie, in Contemplating Given-When-Then, gives an example similar to this:

  Given an account with $500
  When  $50 is withdrawn
  Then  the account has $450 remaining

Here there's no mention of "credit" or "dispense cash". Rather, it's just an example. What this gets us closer to is focusing on the output because that's what we are checking. You might even say:

  When $50 is withdrawn from an account with a $500 balance
  Then the account balance will be $450

The context and the action get wrapped up and what you really focus on is the output or the assertion.

When you start with the output, the example (business test) specifies the minimum necessary to get to the value or output in question. I can see this leading to smaller tests. I can see that leading to less cost per test. Yet, this might also challenge our assumptions about "BDD scenarios" and what level they should be written at. For example, with the above you could argue that we don't need to write a series of tests that showcase different numbers because the business rule is pretty clear:

* The remaining balance after a valid transaction is the amount of the transaction subtracted from the original balance

In fact, we could have said that as a test, right? The example just clarifies it. So it really gets into how much information should be included. While considering how much or how little information to include, I came across refactoring given/when/then and that led me to ask: what's to stop my examples from being totally minimalist?

$500 balance
$50 withdrawal
$450 remains

Nothing stops me from doing that, except perhaps readability. But more importantly this all leads me to ask: could the output ever NOT be what I think it is? Is there ever a reason why taking out $50 from an account with $500 would NOT have $450 remaining? Like what? Well, what about a fee on the account for some reason. Or perhaps there's some penalty applied to certain transactions. Now everything can become a variation on the key scenario. Let's say this:

  Given an account with $500
  And   a fee of 6% applied to transactions
  When  $50 is withdrawn
  Then  the account has $447 remaining

Or I could write it as:

  When $50 is withdrawn from an account with a $500 balance and a 6% fee
  Then the account balance will be $447

Then I came across the idea that concrete examples like this may be too little. Okay, going with that idea, we still need our big picture. What if we state the required behavior in a sentence or two before giving the examples? Each behavior is described by a specification like this:

The remaining balance after a valid transaction is the amount of the transaction
subtracted from the original balance and taking into account any fees.

  Scenario:
    When $50 is withdrawn from an account with a $500 balance
    Then the account balance will be $450

  Scenario:
    When $50 is withdrawn from an account with a $500 balance and a 6% fee
    Then the account balance will be $447
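
The stated rule and its two examples can be checked with a few lines of Python. This is only a sketch: the function name remaining_balance is made up, and it assumes the 6% fee is charged on the withdrawal amount rather than on the balance, which is the reading that produces $447 (6% of $50 is $3, and $500 - $50 - $3 = $447).

```python
from decimal import Decimal


def remaining_balance(balance: Decimal, amount: Decimal,
                      fee_rate: Decimal = Decimal("0")) -> Decimal:
    """Balance left after a valid withdrawal, including any fee.

    Assumption: the fee is a percentage of the withdrawal amount,
    not of the account balance. That reading matches the $447 example.
    """
    fee = amount * fee_rate
    return balance - amount - fee


# When $50 is withdrawn from an account with a $500 balance
assert remaining_balance(Decimal("500"), Decimal("50")) == Decimal("450")

# When $50 is withdrawn from an account with a $500 balance and a 6% fee
assert remaining_balance(Decimal("500"), Decimal("50"), Decimal("0.06")) == Decimal("447")
```

Having to state where the fee applies is itself a reminder that the one-sentence rule still leaves a question for the examples to answer.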

In the Given/When/Then approach the business rule describing the behavior generally isn't made explicit. The reader is often expected to guess -- or at least determine -- the rule from the examples, so naturally those examples have to have more context. That's why you often end up with complicated Givens.

If we don't make the rule explicit then we only have the concrete examples, so they have to be made much more verbose -- and potentially implementation-specific -- so that readers can correctly interpret them.

I would argue that context-free examples with explicitly stated rules are better, both in terms of avoiding lock-in (being tied to a specific implementation) and in terms of readability (how long it takes the reader to understand the behavior expected). And this isn't actually all that new of an idea. For a while it has been argued that in order to identify features in your system, you can use what has been called a "feature injection template." Something like this:

In order to (meet some goal) as a (type of user) I want (a feature)
As a (type of user) I need (a feature) so that (I can meet some goal)

The requirements of an application are determined by asking these kinds of questions:

What goals are the users trying to achieve when using this feature?
What tasks must the users perform to achieve those goals?
How does my application support those tasks?

My struggle is that I prefer having the rules specified as executable tests. I think what's important is to keep the test scenarios free of what Dale Emery calls "incidental details" and make those scenarios as declarative as possible. To conceptualize this, I often suggest that testers describe the behavior in general terms (acceptance criteria), then provide some examples around that behavior to help ensure a common understanding (acceptance tests).

On these ideas, consider "Imperative vs Declarative Scenarios in User Stories" or "Why Bother with Cucumber Testing?"

I do think it's important to push toward the more abstract, declarative form of describing behavior. The trick is finding the right level to frame your scenarios at. What you want to aim for is a style that is not coupled to any specific implementation of the user interface, to the extent that this makes sense. What I have observed is this:

The more imperative you get, the more you get into how to do something rather than talking about what you are doing.
The more imperative you get, the more you are failing to create a domain language. Instead you will end up speaking in the language of the user interface elements.
The more imperative you get, the closer you get to creating fragile scenarios.
The more imperative you get, the easier it is to fall into using incidental details.
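
One way to picture these observations is a Python sketch in which the scenario step names only the task and the imperative detail lives behind a replaceable driver. All of the names here (CheckoutDriver, WebFormDriver, ApiDriver, when_user_buys_item_with_card) are hypothetical, and the drivers just return descriptions of what they would do rather than touching a real application.

```python
from typing import List, Protocol


class CheckoutDriver(Protocol):
    """Hypothetical seam between a declarative step and the interface."""

    def buy_item_with_card(self, card_number: str) -> List[str]: ...


class WebFormDriver:
    """Imperative detail for a web form lives here, not in the scenario."""

    def buy_item_with_card(self, card_number: str) -> List[str]:
        return [
            "open the cart page",
            f"type {card_number} into the card number field",
            "click the submit button",
        ]


class ApiDriver:
    """The same task carried out without any user interface."""

    def buy_item_with_card(self, card_number: str) -> List[str]:
        return [f"POST the order with card number {card_number}"]


def when_user_buys_item_with_card(driver: CheckoutDriver, card_number: str) -> List[str]:
    # Declarative step: names the task and nothing about how it is done.
    return driver.buy_item_with_card(card_number)


for driver in (WebFormDriver(), ApiDriver()):
    print(when_user_buys_item_with_card(driver, "4111 1111 1111"))
```

If the scenario itself had said "type the number into the card number field and click submit", it would only ever describe the first driver, and a change to the form would break it.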

Yet, all this said, I'm finding it is difficult to get people to find the right balance. What's the appropriate amount of information that conveys intent, describes behavior, and reveals task (as opposed to app) implementation? The answer is that it seems to vary based on what you are doing and the people you have available to work on it with you. My thoughts are in flux on this and I will be posting more on the concept of a TDL as I figure out better ways to express myself.

One thought on “Communicating In a Test Description Language”

Jeff Lucas says:

12 March 2013 at 9:27 am

Thanks for posting this! Going through each of your examples, I understand the issue in abstraction vs specification that you present. But in each case, I started looking at each as a tester:

* How does each format spur me to ask questions about the context, risks, and alternate or overlooked scenarios?

* How can I document those questions and additional information in a format that will help test this test case later? (additional scenarios?, an expanded scenario?, associated notes?)

* How will each format ultimately provide a higher quality product from the team as a whole? (from the standpoint of business analysts, developers, testers, and customers)

I am not suggesting which is the better format or where the ultimate level of abstraction exists, but the blog definitely got me thinking about how to address that within the context of my team. I always enjoy posts that make it difficult to finish reading because they send my imagination in multiple directions. Very good job!

Stories from a Software Tester
