I’ve found myself in a position lately of having to explain a lot of concepts that are “obvious” to me. I found myself getting frustrated, but then I considered my own words regarding the “obvious” nature of Quality Assurance and I realized that maybe I wasn’t establishing the context of what I was talking about. So I took a step back and started to look at whether many of the testers I work with and meet these days are aware of, much less practice, the idea of requirements being tests; of acceptance test specifications that drive development; of specification workshops. As it turned out, no, most testers were not practicing these concepts and many were not even aware of them as a shift in the dynamic of how testing can be done.
So let’s first consider some specific points that I like to call out. These points are relevant to testers and product analysts who want to consider themselves working as part of a quality assurance function. (Regardless of whether they are part of a “QA Team”.)
- We need to build a shared understanding of what needs to be done.
- We want to produce specifications.
- We want those specifications to be easy to understand and easy to maintain.
- Those specifications should encode acceptance criteria.
- Acceptance criteria can be written as acceptance tests.
- When specifications are written as tests, they become executable specifications.
- Those specifications can be created or reviewed at specification workshops.
I want to come back to the sixth point above but I’ll first focus on the last point: the use of specification workshops as a technique. Spec workshops are business domain and feature scope exploration exercises that ensure that the implementation team, business stakeholders, and domain experts build a consistent, shared understanding of what the application should do in order to provide value to customers.
- The input to and output from a spec workshop is a test specification (or feature file, if you prefer).
- Test specifications will tend to use a structured format.
- A structured format example would be the Given-When-Then format.
- The structured format should allow you to write in a business domain language that expresses the intent of the tests.
The immediate goal of a spec workshop is to produce a set of examples that illustrate a feature. Illustrating a feature means using examples.
- Examples are added to clarify meaning and codify a shared notion of quality.
- Examples avoid ambiguities and communicate with precision.
- Examples must be consistent and they must be complete.
When examples are the prime drivers, you are using the technique of specification by example and thus example-driven testing. When those examples cover business workflows, you are using the technique of scenario-based testing.
Now let’s go back to that sixth point above, regarding “specifications are written as tests.” Treat the word specification as if it meant requirement and then consider this quote:
“As formality increases, tests and requirements become indistinguishable. At the limit, tests and requirements are equivalent.”
That came from the article “Tests and Requirements, Requirements and Tests: a Möbius Strip”, published in IEEE Software back in 2008 by Robert Martin and Grigori Melnik. Reading that article was a game changer for me as it got me thinking along different paths. So here’s how I try to explain that:
- Requirements, tests, examples — they all talk about the same thing.
- They talk about how a system will behave once it’s in the hands of users.
Now let’s bring in a few terms that people like to throw around:
- A story represents an executable increment of business functionality (i.e., a feature).
- A scenario represents a concrete example of the behavior of the feature.
- A scenario is a sequence of steps through the feature that exercises one path.
- A scenario can exercise related paths if those related paths are part of the business intent.
Note here that a “scenario” does not have to mean an “end-to-end scenario” necessarily. Or, put another way, the “end-to-end” part can be very focused and very small, as opposed to some workflow that takes you through all parts of the system. A few more points:
- Scenarios can be broken into examples.
- A representative set of examples will be the acceptance criteria for the feature.
- Examples should demonstrate the intent only and not include incidentals.
- Examples should generally avoid describing business rules in terms of the user interface.
All of the bullet points you see above were my attempt to distill down to the basics of what I want testers to understand, particularly as they work within the domain of the product/business analyst. Further, I want testers to see that we ultimately want requirements in the form of examples. Writing those examples is basically writing tests.
That’s why, to me, these elements all combine to make up test specifications. As with all test writing, it’s nice to have some type of structured format. One of the more popular — but often debated — formats is the “Gherkin” or BDD format known as Given/When/Then. The Given/When/Then format, like all such formats, is just a tool for thinking about context, actions, and observables. The real power in using such a format comes from being able to ask whether you’ve missed a particular aspect of the context, whether you are focusing on just one action, and whether there are different aspects to the observables that you’re not checking for.
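To make the context/action/observable idea a little more concrete, here is a minimal Python sketch of the questions the format prompts you to ask. The `Example` class and the `review_questions` helper are hypothetical names invented for illustration; they are not part of Gherkin or of any BDD tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Example:
    """One Given/When/Then example: context, actions, observables."""
    given: List[str] = field(default_factory=list)  # context
    when: List[str] = field(default_factory=list)   # actions
    then: List[str] = field(default_factory=list)   # observables


def review_questions(example: Example) -> List[str]:
    """Return the review questions that this example still leaves open."""
    questions = []
    if not example.given:
        questions.append("What context does this example assume?")
    if len(example.when) != 1:
        questions.append("Is this example focused on exactly one action?")
    if not example.then:
        questions.append("What should be observed after the action?")
    return questions


# A draft with two actions and no observable yet.
draft = Example(
    given=["a CRO customer with a standard plan"],
    when=["the sponsor-specific rate card is deleted",
          "the service provider is replaced on the plan"],
)
for question in review_questions(draft):
    print(question)
```

Run against the draft above, the sketch flags the two actions and the missing observable. The format's value lies in raising those questions, not in the code.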
A Case Study
So I want to walk you through a specific example here. I’ll take this from an actual environment I work within and an actual situation that occurred. We have a feature around billing rate cards. In this context, I’m working with clinical trials that have a Sponsor (the group who wants to do the trial) and a CRO (Clinical Research Organization) who does work for the Sponsor. In order for the CRO to bill the sponsor for activities they undertake as part of carrying out the trial, a billing rate card can be associated with any plan. The plan is the agreement between the CRO and the Sponsor for how the trial will be carried out. One day a bug report came in through our customer support. We decided to pilot the idea of a spec workshop on something very small, like this bug report. We framed it as: “How would we have written this problem as a requirement? How would we have made that requirement into a test?” The assumption being that had we done so, we would have found this internally rather than hearing about it via a customer support ticket.
In discussing the issue during a spec workshop, we wanted to make sure we understood what area of the application — or what basic feature — was being tested. Rather than have a long description about the feature, we’re trying to categorize test specifications based on common feature names within our application. So we started with this:
Feature: Billing Rate Cards
Now, keep in mind that we might have other feature files sitting around in our repository and some of them may have that exact same feature title. That’s okay. In fact, that’s how we stop worrying about finding each file so we can add tests to it. Technically, it doesn’t matter. As long as we use the same title for the feature, we can query our repository for all features that have the title Billing Rate Cards. You might ask: Isn’t that a bit ridiculous? Why have so many different files when there could just be one? That’s true. We could have just one. And, in fact, since we can query all feature files for a given title, we can later refine our test specifications by combining them if need be.
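As a rough illustration of what "query our repository" could look like, here is a small Python sketch that walks a directory and collects every file whose feature title matches. The `.feature` extension, the `find_features` name, and the directory layout are assumptions made for the example; our actual repository tooling is not shown here.

```python
from pathlib import Path
from typing import List


def find_features(root: Path, title: str) -> List[Path]:
    """Return every .feature file under root whose Feature: line matches title."""
    matches = []
    for path in sorted(root.rglob("*.feature")):
        for line in path.read_text(encoding="utf-8").splitlines():
            stripped = line.strip()
            if stripped.startswith("Feature:"):
                # Only the first Feature: line in each file is considered.
                if stripped[len("Feature:"):].strip() == title:
                    matches.append(path)
                break
    return matches
```

Calling `find_features(Path("features"), "Billing Rate Cards")` on a hypothetical `features` directory would return all the files that contribute to that one feature, which is also the list you would work from if you later decided to combine them.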
At this point, with the feature title written, people tend to jump into writing scenarios because that’s what BDD practices tend to encourage. However, we tried a slightly different approach. While the scenarios are supposed to capture the “titles” of requirements, I’ve found this doesn’t always work very well. So instead we just talked with the product analysts about what the requirements were for the particular aspect of billing rate cards we were dealing with this iteration. I’ll spare you the discussion, of course, but what we came up with was this:
Feature: Billing Rate Cards
Requirement: A plan should always use the "active" billing rate card.
Requirement: A plan should never use rates from a deleted billing rate card.
Essentially the bug report was that a plan was using rates from a deleted billing rate card. So what we did is make sure that the actual requirements were understood. Then we worked out at least one scenario that we could get our arms around. The bug report was quite specific in that it involved a CRO customer and their resources. We also realized, however, that the bug could occur for a Sponsor customer as well. So we added the scenario titles that we were going to want to write tests for:
Feature: Billing Rate Cards
Requirement: A plan should always use the "active" billing rate card.
Requirement: A plan should never use rates from a deleted billing rate card.
Scenario: CRO Customer
Scenario: Sponsor Customer
That was enough for right now. We decided to end the spec workshop with the promise that we would break down those scenarios into tests and pass those tests to product for review. Could we have done the breaking down right in that spec workshop? Yes, and, in fact, that might have been ideal. But the product analysts had a time constraint and spec workshops should be flexible and adaptive. You get as much as you need to continue work. So the test team went off to work. It actually took us a half hour or so to come up with our first test. That might seem like a long time but the issue was we had a tester that had the knowledge, but not necessarily the ability to communicate it effectively. I want to take you through this a bit because the back-and-forth is what you often don’t see documented for this kind of thing.
The Dialogue
So, for this issue, here was how the tester first described it:
- Tester: For a CRO customer, replacing a provider or adding a new provider always uses sponsor specific rate card even when sponsor specific rate card is deleted before replacing provider and there is one any sponsor rate card.
- Me: When you say “replacing the provider or adding new provider” you mean when this is done on a given plan, correct?
- Tester: Yes.
So I created the start of a test and asked the tester if this was good:
GIVEN a CRO customer
WHEN a plan has a service provider with a billing rate card
AND that service provider has been deleted
The response:
- Tester: No you are not deleting a service provider. You are deleting a rate card.
So I tried again:
GIVEN a CRO customer
WHEN a plan has a service provider with a billing rate card
AND that billing rate card has been deleted
AND a new service provider with a billing rate card is added to the plan
THEN ....
- Tester: No still not correct. I don’t know how would I write this.
- Me: Well, in my example: which part was wrong?
- Tester: The first ‘when’ statement is wrong. The ‘and’ statement is also wrong.
- Me: Okay, the first ‘And’ was based on the fact that you said “you are deleting a rate card.”
At this point, the tester started getting into it a bit and came up with the following:
WHEN there is a sponsor specific rate cad and Any sponsor rate card are created
AND sponsor specifc rate card is deleted
AND replacing provider on plan
THEN ....
- Tester: I don’t know if the above makes sense but somewhat like that.
- Me: Okay, so the “sponsor specific rate card” and the “any sponsor rate card” are preconditions. They have to exist for this test to work.
- Tester: Yes, I said you need to delete rate card but you need to delete for a provider whom you want to replace.
Again, notice the somewhat oblique way the tester communicated in that sentence compared to when they tried to write Given/When/Then. So I tried another version:
GIVEN a "sponsor specific rate card" and an "any sponsor rate card" exist
WHEN the sponsor specific rate card is deleted
AND the service provider is replaced on a plan
THEN ....
- Me: The problem here is the second ‘And’ clause. What service provider? What plan? That’s why I would start with my context as saying something like “a plan exists with a service provider that has a sponsor specific rate card.”
- Tester: Well the problem is the plan does not need to have a provider that has a service provider that has a specific sponsor rate card. What is needed is the provider that you are replacing should have a deleted sponsor specific rate card and an any sponsor rate card.
- Me: But where are you replacing the provider? On a plan, right?
- Tester: Yes.
- Me: So that implies the plan must exist already in a given state. So “Plan exists that has ….” what?
- Tester: Plan can exist with any provider — it does not matter.
I wanted to focus on this action, so I just provided a When:
WHEN you are replacing a service provider on a plan with a service provider that has a deleted sponsor-specific billing rate card....
- Tester: And an any sponsor rate card. Don’t forget that.
- Me: Are those two different conditions, though? Because I would then have another when:
WHEN you are replacing a service provider on a plan with a service provider that has a deleted any sponsor billing rate card
- Tester: No, not separate. I guess the correct version is this: “When you are replacing a service provider on a plan with a service provider that has a deleted sponsor specific rate card and Any sponsor rate card.”
- Me: Okay, so both types of rate card must have been there and been deleted, is that correct?
- Tester: No, only sponsor specific has been deleted. Any sponsor is not deleted.
- Me: Ah! Key thing there. How about:
WHEN you are replacing a service provider on a plan with a service provider that has an any sponsor billing rate card and a deleted sponsor-specific billing rate card
THEN ...
- Tester: Yes, that probably makes sense.
At this point the tester was concerned that I wasn’t writing out actual steps. This all seemed a bit too high-level for her.
- Tester: But if we had written steps than won’t that have made this easier?
- Me: It can, but the meaning can potentially get lost or the implementation may change. The above is a requirement: the When and the Then taken together are a statement of what should happen in a given context. The steps part will be done — just not at that level.
I still wanted to make sure that I had the context correct, now that we seemed to have settled on the action.
- Me: This service provider — they have to be CRO provider, right?
- Tester: Yes. Also this is only possible for CRO customer.
- Me: So if I use a Given, I might start like this:
GIVEN a CRO customer with a standard plan with a service provider
- Me: In fact, I may not even need to say with a service provider since a plan has to have one.
- Tester: Exactly.
- Me: So I could do something like this:
GIVEN a CRO customer with a standard plan with a service provider
AND a CRO service provider that has the following:
an any sponsor billing rate card
a deleted sponsor-specific billing rate card
WHEN you replace the plan's service provider with the CRO service provider
THEN ...
- Tester: When is still confusing. Because both the providers are type of CRO.
- Me: Okay, so the plan will have an existing CRO provider and that’s the one we’re replacing, right?
- Tester: Yes.
- Me: Okay. So here’s where “called” might help us. I say that because, imagine I had this:
GIVEN a CRO customer with a standard plan with a CRO service provider
AND a CRO service provider that has the following:
an any sponsor billing rate card
a deleted sponsor-specific billing rate card
WHEN you replace the plan's original CRO service provider with the new CRO service provider
THEN ...
- Me: The trick here is still the when. The “original” might be okay, but the “new CRO” might be confusing.
- Tester: May be you should use “another”? Instead of saying ‘new CRO’.
- Me: Right, but the trick is the “another” could be read as the one that was specified already. So what if we did this:
GIVEN a CRO customer with a standard plan with a CRO service provider
AND a CRO service provider called "TestCRO" that has the following:
an any sponsor billing rate card
a deleted sponsor-specific billing rate card
WHEN you replace the plan's original CRO service provider with TestCRO
THEN ...
- Me: Here the difference is I’ve used a “called” clause to name the provider that we are going to use in the When.
- Tester: Hmmm. Not sure.
- Me: The concept here is the When should be the thing you are testing and ONLY the thing you are testing. In this case, it is the action that will drive the Then. Meaning, whatever we expect to see in Then should be a direct result of the action we took in When. Everything else — when you strip away the action and the observable result — is the context.
- Me: The above, put another way, is a usage scenario or use case. This is something that a user story would tell us. (Ideally, anyway.) Also note that this visual appearance easily lets us ask: “Okay, but what if they DO NOT have the deleted card? What then? And what if they ONLY have a deleted card? Is the observable the same?”
- Tester: I guess I can see that.
- Me: One other thing I should add: notice how our discussion forced us to narrow this down to only those aspects that matter for the test. We ruled out incidentals and only included the information that mattered. Thus we refactored (refined) as we reviewed. We also did pair test design.
As the tester continued to think on this, she came up with something else:
- Tester: In the above given when and then, you also need to mention that rate card is published.
- Me: So my and could become:
AND a CRO service provider called "TestCRO" that has the following:
a published any sponsor billing rate card
a published deleted sponsor-specific billing rate card
- Tester: Yup. Still worried about steps, though.
- Me: Steps would be like this:
step %{I login to a CRO customer}
step %{I create a plan called "TestPlan" with a CRO service provider}
step %{I create a CRO service provider called "TestProvider"}
step %{I associate a rate card for a specific sponsor to "TestProvider" and publish it}
step %{I associate a rate card for any sponsor to "TestProvider" and publish it}
step %{I delete the specific sponsor rate card for "TestProvider"}
step %{I replace the CRO provider on "TestPlan" with "TestProvider"}
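For readers who have not seen how step text like this gets connected to code, here is a minimal sketch in Python rather than in the Ruby tooling those steps come from. It is a toy registry, not Cucumber's implementation: `step_definition`, `step`, and the `world` dictionary are hypothetical names, and only three of the steps are given definitions.

```python
import re

STEP_DEFINITIONS = []  # (compiled pattern, handler) pairs


def step_definition(pattern):
    """Register a handler for step text matching the given regex."""
    def register(handler):
        STEP_DEFINITIONS.append((re.compile(pattern), handler))
        return handler
    return register


def step(world, text):
    """Run the first registered handler whose pattern matches the step text."""
    for pattern, handler in STEP_DEFINITIONS:
        match = pattern.fullmatch(text)
        if match:
            return handler(world, *match.groups())
    raise LookupError(f"No step definition matches: {text!r}")


@step_definition(r'I create a CRO service provider called "([^"]+)"')
def create_provider(world, name):
    world.setdefault("providers", {})[name] = {"rate_cards": {}}


@step_definition(r'I associate a rate card for (any|a specific) sponsor to "([^"]+)" and publish it')
def associate_card(world, kind, name):
    key = "any sponsor" if kind == "any" else "sponsor-specific"
    world["providers"][name]["rate_cards"][key] = "published"


@step_definition(r'I delete the specific sponsor rate card for "([^"]+)"')
def delete_card(world, name):
    world["providers"][name]["rate_cards"]["sponsor-specific"] = "deleted"


# Build the context the same way the steps above describe it.
world = {}
step(world, 'I create a CRO service provider called "TestProvider"')
step(world, 'I associate a rate card for a specific sponsor to "TestProvider" and publish it')
step(world, 'I associate a rate card for any sponsor to "TestProvider" and publish it')
step(world, 'I delete the specific sponsor rate card for "TestProvider"')
```

The design point is the same one made in the dialogue: the step wording can change, or be swapped for different implementation details, without touching the Given/When/Then statement of the requirement.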
At this point, a developer chimed in.
- Developer: This is all great, but the bug report just says that you should “check rates”. But check them for what?
- Tester: Check rates so that they match the rate card of Any sponsor.
- Me: Perfect. That’s our observable. So how about:
THEN the rates match the Any Sponsor billing rate card
- Me: The trick here is “the rates match” — is that enough? For example, I can check this for a location on the assignments tab, right? (As per the bug.) So I might do this:
THEN the rates for a location match the any sponsor billing rate card
- Me: Which then leads me to: do I have to specify a specific location? Or is whatever location is on the plan good enough?
- Tester: Whatever location on plan is enough since rates are uploaded for all locations. Also I guess I may write the ‘Then’ as the rates for each of the resources for a location should MATCH the any sponsor billing rate card.
- Me: Okay. That’s a great clarification, as it implies the scope of what has to be observed. So what that would mean is that if this requirement was not met, the Then was NOT being observed. And that would thus imply that the rates for at least some resources DO NOT match the any sponsor billing rate card. So our full requirement+test at this point looks like:
GIVEN a CRO customer with a standard plan that has a CRO service provider
AND a CRO service provider called "TestCRO" that has the following:
a published any sponsor billing rate card
a published deleted sponsor-specific billing rate card
WHEN you replace the plan's original CRO service provider with TestCRO
THEN the rates for each of the resources for a location match the any sponsor billing rate card
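To show how a statement like this can double as an executable check, here is a small Python sketch run against a toy model. Everything in it is an assumption made for illustration: the `RateCard` and `Provider` classes, the resource names and rate values, and in particular the rule that a usable sponsor-specific card takes precedence over the any sponsor card. It is not our application's code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RateCard:
    kind: str                # "any sponsor" or "sponsor-specific"
    rates: Dict[str, float]  # resource name -> rate (made-up values)
    published: bool = True
    deleted: bool = False


@dataclass
class Provider:
    name: str
    rate_cards: List[RateCard]


def active_rate_card(provider: Provider) -> Optional[RateCard]:
    """Pick the card a plan should use; never a deleted or unpublished one.

    Assumption: a usable sponsor-specific card wins over the any sponsor card.
    """
    usable = [c for c in provider.rate_cards if c.published and not c.deleted]
    for kind in ("sponsor-specific", "any sponsor"):
        for card in usable:
            if card.kind == kind:
                return card
    return None


def replace_provider(plan: dict, new_provider: Provider) -> None:
    """Replace the plan's provider and re-derive the plan's rates."""
    card = active_rate_card(new_provider)
    plan["provider"] = new_provider.name
    plan["rates"] = dict(card.rates) if card else {}


# GIVEN a plan with an original provider, and TestCRO with both cards
any_card = RateCard("any sponsor", {"Monitor": 100.0, "Coordinator": 80.0})
deleted_card = RateCard("sponsor-specific",
                        {"Monitor": 150.0, "Coordinator": 120.0},
                        deleted=True)
test_cro = Provider("TestCRO", [any_card, deleted_card])
plan = {"provider": "OriginalCRO", "rates": {}}

# WHEN the plan's original provider is replaced with TestCRO
replace_provider(plan, test_cro)

# THEN the rates for each resource match the any sponsor card
assert plan["rates"] == any_card.rates
```

The bug we started from corresponds to a version of `active_rate_card` that forgets the `not c.deleted` filter, in which case the final assertion fails.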
Back to Product …
Whew! That was a lot of dialogue to get to one test, but it was a test that was thoroughly reviewed as we created it. Notice how this is meant to serve as a high-level, example-driven requirement that is in essence a user scenario. This is just one way to word this. The wording can be very flexible. Of note is that we specified only the necessary elements; no incidentals. We have a data context (GIVEN along with its AND), a specific test and only the specific test we are running (WHEN), and an observable (THEN) that should be a direct result of the action taken. If we come up with the Then we effectively have a test and a requirement bottled into one deliverable which could (hopefully) drive development. We can break out the above to cover related scenarios as well. Everyone involved liked what we came up with, felt they understood the example, and felt it clarified the requirement by making it more specific. We passed this to our product team as follows:
Feature: Billing Rate Cards
Requirement: A plan should always use the "active" billing rate card.
Requirement: A plan should never use rates from a deleted billing rate card.
Scenario: CRO Customer
GIVEN a CRO customer with a standard plan that has a CRO service provider
AND a CRO service provider called "TestCRO" that has the following:
a published any sponsor billing rate card
a published deleted sponsor-specific billing rate card
WHEN you replace the plan's original CRO service provider with TestCRO
THEN the rates for each of the resources for a location match the any sponsor billing rate card
Scenario: Sponsor Customer
Product agreed but noted that “resources” come in two types: built-in and custom (or user-defined). No one was sure if this mattered or not. There was still some debate about whether the sponsor mattered, but we broke out the scenarios like this:
Feature: Billing Rate Cards
Requirement: A plan should always use the "active" billing rate card.
Requirement: A plan should never use rates from a deleted billing rate card.
Scenario: CRO Customer, Built-In Resources
GIVEN a CRO customer with a standard plan that has a CRO service provider
AND a CRO service provider called "TestCRO" that has the following:
a published any sponsor billing rate card
a published deleted sponsor-specific billing rate card
WHEN you replace the plan's original CRO service provider with TestCRO
THEN the rates for each of the built-in resources for a location match the any sponsor billing rate card
Scenario: CRO Customer, Custom Resources
Scenario: Sponsor Customer, Built-In Resources
Scenario: Sponsor Customer, Custom Resources
Notice I changed the scenario titles. Also note the addition of “built-in” to the THEN statement. With this we could then go forth and write Given-When-Then examples that covered these other scenarios. Notice how the spec workshop was quite constrained and only built what was needed to further understand the aspect of the feature we were testing.
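The four scenario titles are simply the combinations of two variables, customer type and resource type. As a sketch of that idea, assuming nothing beyond the two lists of values that appear in the titles, the combinations can be generated in Python like this (the function name is hypothetical):

```python
from itertools import product

CUSTOMER_TYPES = ("CRO", "Sponsor")
RESOURCE_TYPES = ("Built-In", "Custom")


def scenario_titles():
    """Return one scenario title per customer type and resource type pair."""
    return [
        f"Scenario: {customer} Customer, {resource} Resources"
        for customer, resource in product(CUSTOMER_TYPES, RESOURCE_TYPES)
    ]


for title in scenario_titles():
    print(title)
```

Seeing the titles as a product of variables also makes it obvious what happens when someone raises a third variable: the list doubles, which is a good prompt to ask whether that variable really changes the observable.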
Guidelines
After all of this was said and done I wanted to point out some guidelines that I believed this manner of writing tests promoted. I used the specific Given-When-Then test we provided and stated the following as guidelines for test writing and test reviewing:
- The test focuses on the business intent rather than details of implementation.
- Incidentals are removed.
- The only information included is that which is necessary for the test to be executed.
- Anything extraneous should be edited out.
- The GIVEN establishes a full context that is required for a specific action to take place.
- The WHEN indicates the specific action that will lead to a specific observable.
- The focus should be on the specific action that directly leads to the observables.
- An action can be stated in workflow terms, as opposed to activity terms.
- The THEN should be the specific observable.
- An observable can be stated as exact values.
- An observable can be stated as a range of values.
- An observable can be stated as a relative difference in value.
- If the GIVEN and the WHEN can be stated as an actionable context, then it’s fine to combine them.
- The output conditions must be related to the same context and the same action.
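The three ways of stating an observable can be sketched as three small checks. This is a Python illustration with hypothetical helper names and made-up numbers, and it reads "relative difference" as a change measured against an earlier value, which is only one possible interpretation.

```python
import math


def matches_exactly(actual: float, expected: float) -> bool:
    """Observable stated as an exact value (allowing for float rounding)."""
    return math.isclose(actual, expected, rel_tol=0.0, abs_tol=1e-9)


def within_range(actual: float, low: float, high: float) -> bool:
    """Observable stated as a range of acceptable values (inclusive)."""
    return low <= actual <= high


def changed_by(before: float, after: float, expected_delta: float) -> bool:
    """Observable stated as a difference relative to an earlier value."""
    return math.isclose(after - before, expected_delta,
                        rel_tol=0.0, abs_tol=1e-9)


# Made-up rates: 150.0 before the provider is replaced, 100.0 after.
assert matches_exactly(100.0, 100.0)
assert within_range(100.0, 80.0, 120.0)
assert changed_by(150.0, 100.0, -50.0)
```

Whichever form is used, the check still has to be a direct result of the single action in the WHEN.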
If you read even remotely close to this far, I hope this was helpful to see how a real example was constructed and a little bit about the day-in-the-life discussion that took place. As a closing note, I would add that whether or not there is a “QA Team” the above activities involved a product analyst, a developer, and a tester. They were acting with a goal in mind, which was to produce a shared notion of quality about a particular feature. As such, they were most certainly engaging in a quality assurance function.
Thanks for providing such a detailed real-life example, much more useful to read than the contrived ones that I’ve read elsewhere.
Notice how when you are “backporting” specifications for a feature due to a bug, or because you wish to start writing specs for “new” features, you stumble across hidden and implicit requirements.
In your examples above, no one was sure whether “built-in” or custom resources mattered. So there may in fact be a hidden requirement floating around out there, but since it was never documented, no one knows for sure.
This is always an added expense to writing specifications after a product has been released and become mature. When you write new features, or are fixing one representation of a bug, you have to re-do the work to understand any hidden/implicit specifications that have never been captured.
I agree. The “backporting” aspect is refining the specification in this case, which of course is something you will probably do even if this was a new feature specification you were writing.
As far as the hidden requirement, I agree. So when our discussion took place we had to decide whether or not the nature of a built-in or a custom resource would have an impact on the particular scenario we had. It was decided that the observables should be the same: meaning, for the stated requirements, the same observable should apply whether or not the resources were custom or built-in. We chose to break that out as two scenarios but, arguably, we really could have just said “for any resources” or something along those lines.
I agree with the danger of trying to do this stuff after the fact. To at least try to mitigate that somewhat, what we did with this one is write the specification as if the feature was newly introduced. Because in both cases we were asking the same thing: “How should it behave?” Going with the guidelines, we also kept asking the questions about the full context and the removal of incidentals. As an example, someone did bring up “Oh, what about resources that are pinned vs. unpinned?” (Pinning a resource, in our context, means that the settings of the resource will not change at all, even if actions are taken that should change it.)
So this required that we had people on hand who understood the issue we were working with, meaning (at least) a tester, a product analyst, and a developer. Even that, however, does not guarantee that we won’t miss something that we should be considering. However, that issue with “backporting” can actually occur even when starting fresh because people can still fail to take into account the variables that affect the context or the action. With my example, even if this was a brand new feature we were writing “acceptance test first”, it would still require someone bringing up the idea of built-in resources vs. custom resources and then someone determining if those should matter.
That’s why I’m trying to come up with guidelines that serve as heuristics regardless of which “direction” you are going, so to speak. Time will tell if I’m barking up the wrong tree here.