BDD Specs and Parameterizing Phrases

If you plan on using a BDD tool (like Cucumber, SpecFlow, Behat, etc.) you are going to want some guidelines for how, and to what extent, you allow parameterized and conditionalized phrases. This is an area that I’ve found can become a rat’s nest of bad habits unless you establish those limits early on.

Here I’ll take you through some representative examples that are based on work I’ve seen. However, I’ve sufficiently obfuscated the examples so that I don’t violate any confidentiality rules. I’ll be focusing on Ruby examples here simply because they are the cleanest looking. What I’m talking about, however, applies to any programming language you happen to be using.

Let’s consider a simple scenario:

Scenario: With Steps, Match Terms Without Delimited Parameters
  Given a car that has a flux capacitor
  Then  the car will travel through time

So generating matchers for those would end up with literal step definitions like these:
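A minimal sketch of what such literal matchers might look like. Plain Ruby regular expressions stand in for the patterns you would hand to Cucumber's Given and Then, so the snippet runs without the gem; the constant names are hypothetical.

```ruby
# Hypothetical literal patterns. In a Cucumber step file each regex would be
# the argument to Given or Then; here they are plain constants.
LITERAL_GIVEN = /^a car that has a flux capacitor$/
LITERAL_THEN  = /^the car will travel through time$/

puts LITERAL_GIVEN.match?("a car that has a flux capacitor")    # true
puts LITERAL_THEN.match?("the car will travel through time")    # true
puts LITERAL_GIVEN.match?("a car that has 3 flux capacitors")   # false
```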

At the simplest level, you can parameterize as such:
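One possible form, assuming the first matcher stays literal and the second uses a non-capturing alternation for the verb. As before, plain regular expressions with hypothetical names stand in for the step definitions.

```ruby
# First matcher: still fully literal.
WORDED_GIVEN = /^a car that has a flux capacitor$/
# Second matcher: accepts "will" or "should", and nothing else, as the verb.
WORDED_THEN  = /^the car (?:will|should) travel through time$/

puts WORDED_THEN.match?("the car will travel through time")    # true
puts WORDED_THEN.match?("the car should travel through time")  # true
puts WORDED_THEN.match?("the car must travel through time")    # false
```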

In the second matcher notice how the regular expression allows you to enforce wording in a simplistic way: will or should. This means the following phrase would be unrecognized:

Then the car must travel through time

Looking at the first matcher, however, it’s still pretty literal. It would do nothing for a phrase like:

Given a starship that has a warp manifold

You could conditionalize the first matcher a bit further like this:
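A sketch of such a generalized matcher, assuming two greedy capture groups. The pattern and the constant name are illustrative, not necessarily what the team wrote.

```ruby
# Captures whatever appears in the "vehicle" and "component" positions.
GENERIC_GIVEN = /^a (.+) that has a (.+)$/

vehicle, component = GENERIC_GIVEN.match("a starship that has a warp manifold").captures
puts "#{vehicle} / #{component}"  # starship / warp manifold

# The same pattern happily accepts phrases far outside the domain.
puts GENERIC_GIVEN.match?("a cat that has a hat")  # true
```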

However, now that phrase stands for anything and everything as long as it fits the pattern of “a SOMETHING that has a SOMETHING”. Is that good? That depends on your local domain and what kind of code you are putting in the matcher. With very generalized phrases like this, you have to provide a data helper method or function somewhere that will determine what specific data was passed in and then act accordingly.

As I was presented with this example, I told the team that we should probably have the phrases specifically deal with cars and flux capacitors. Other phrases (like starships with warp manifolds) should be handled with different matchers. So we move our matcher back to this:

Notice here that I don’t have to worry about parameters at all. Easy, right? But then the team added another scenario:

Scenario: With Steps, Match Terms without Parameters, Different Data
  Given a car that has 3 flux capacitors
  Then  those cars will travel very far through time

Taken by itself, this would lead to the following new matchers:

Let’s just take the first one and consider the two statements we have matchers for:

Given a car that has a flux capacitor
Given a car that has 3 flux capacitors

Okay, so given that we’re not parameterizing by vehicle or component (i.e., we agree our phrase should hard code those values), it might make sense to combine a matcher to handle the two phrases above since all we’re really parameterizing by is the count of flux capacitors in the car. I could combine the two into a single matcher like this:
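A sketch of a combined matcher, assuming a single capture group that accepts either the article "a" or a number. The constant name and the printed message are modeled on the output discussed in this post, but the body is only an illustration.

```ruby
# One capture group for the count; "s?" allows "capacitor" or "capacitors".
COUNTED_GIVEN = /^a car that has (a|\d+) flux capacitors?$/

[
  "a car that has a flux capacitor",
  "a car that has 3 flux capacitors",
  "a car that has 1 flux capacitor"
].each do |phrase|
  count = COUNTED_GIVEN.match(phrase)[1]
  puts "Car with flux capacitor count of: #{count}."
end
# Prints counts of "a", "3" and "1" in that order.
```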

Notice here that I have to account for “a flux capacitor” or “3 flux capacitors”. This works well from a matching standpoint and I can even capture the case of multiple flux capacitors versus just one. For example, the following And clause could be added:

Given a car that has 3 flux capacitors
And   a car that has 1 flux capacitor

Since I conditionalize the word “capacitor” by indicating the “s?”, I allow for singular or plural. Notice, however, that this matcher leads to “a” being captured in the initial case. My output would look like this:

Scenario: With Steps, Match Terms Without Delimited Parameters
  Given a car that has a flux capacitor
    Car with flux capacitor count of: a.
  Then the car will travel through time

Scenario: With Steps, Match Terms Without Delimited Parameters, Different Data
  Given a car that has 3 flux capacitors
    Car with flux capacitor count of: 3.
  Then those cars will travel very far through time

Notice the output of the first scenario: “Car with flux capacitor count of: a.” Not ideal. In this case, I can’t make a non-capture group like this:
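To illustrate, here is what happens if the whole group is made non-capturing. This is a sketch with a hypothetical constant name; both phrases still match, but neither yields a count.

```ruby
# (?: ... ) groups without capturing, so no value reaches the step body.
NON_CAPTURING_GIVEN = /^a car that has (?:a|\d+) flux capacitors?$/

p NON_CAPTURING_GIVEN.match("a car that has a flux capacitor").captures   # []
p NON_CAPTURING_GIVEN.match("a car that has 3 flux capacitors").captures  # []
```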

That would work for the first phrase, but it would not work for the second since, in that case, I do want to capture the count.

I could, however, do this:
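One way to get that behavior, sketched under the assumption that the capture group is nested inside a non-capturing alternation. With this shape the count is nil when the phrase uses "a", so a real step body would need to treat nil as one.

```ruby
# Only the digits are captured; the article "a" matches but captures nothing.
OPTIONAL_COUNT_GIVEN = /^a car that has (?:a|(\d+)) flux capacitors?$/

p OPTIONAL_COUNT_GIVEN.match("a car that has a flux capacitor").captures   # [nil]
p OPTIONAL_COUNT_GIVEN.match("a car that has 3 flux capacitors").captures  # ["3"]
```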

That would work in that the first phrase would not capture on the “a” while the second would capture on the count value.

However, as I showed this to the team, I asked: “Are we losing clarity here?” That’s a question to always be asking with these matchers: am I losing more than I’m gaining? Specifically, yes, I’m cutting down on the number of matchers I have but, on the other hand, my existing matcher is starting to lose some expressiveness. The team agreed to think about it and went away to continue working.

When I caught up with them later, they had added another scenario:

Scenario: With Steps, Match Terms Without Delimited Parameters, Different Data
  Given 10 cars that have 1 flux capacitor
  Then  those cars should sell like hotcakes

This led to two more matchers:

The team showed me that they could bring that first matcher in line with our existing matchers, like this:
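A single expression that accepts every Given phrase so far might look like the sketch below. The team's actual expression may well have differed; this one simply extends the nested-capture idea to the car count and the verb.

```ruby
# Optional counts for cars and capacitors, optional plurals, has/have.
ALL_IN_ONE_GIVEN =
  /^(?:a|(\d+)) cars? that (?:has|have) (?:a|(\d+)) flux capacitors?$/

[
  "a car that has a flux capacitor",
  "a car that has 3 flux capacitors",
  "10 cars that have 1 flux capacitor",
  "a cars that have a flux capacitors"   # ungrammatical, yet accepted
].each do |phrase|
  p ALL_IN_ONE_GIVEN.match(phrase)&.captures
end
```

The last phrase shows the cost: each optional piece is independent, so the pattern also accepts combinations nobody intended.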

Wow, huh?

Now, to be sure, that does accept all of the phrases used so far but you have to really parse that matcher text to understand what it means and what it will accept. You also run into a danger of sometimes accepting more than you want. So I told the team: “Well, yeah … you could do that. Or you can have three matchers that will cover all the phrases so far.” I then just gave them this:
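A sketch of how three simpler matchers could divide the work. The hash keys and the split between patterns are assumptions made for illustration; the point is that each phrase lands on exactly one easy-to-read pattern.

```ruby
# Hypothetical names for the three patterns.
THREE_MATCHERS = {
  single_capacitor: /^a car that has a flux capacitor$/,
  counted:          /^a car that has (\d+) flux capacitors?$/,
  fleet:            /^(\d+) cars that have (\d+) flux capacitors?$/
}

[
  "a car that has a flux capacitor",
  "a car that has 3 flux capacitors",
  "10 cars that have 1 flux capacitor"
].each do |phrase|
  # Find the first (and here, only) pattern that accepts the phrase.
  name, pattern = THREE_MATCHERS.find { |_, re| re.match?(phrase) }
  puts "#{name}: #{pattern.match(phrase).captures.inspect}"
end
```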

What’s better? What’s worse? It really is up to the team but it does come down to the code clarity that you value as well as the maintainability of your matchers. Regular expressions do have their place and can be powerful. But you do have to consider how to use them wisely. Incidentally, some tools promote placeholders. You can do that with Cucumber like this:

That said, you’ll find these aren’t as flexible (currently) as regular expressions. With the first phrase, for example, it’s not possible via placeholders to indicate “capacitor” or “capacitors”. A tool called Turnip allows you to use actual placeholders that provide conditionalizing phrases. Those would look like this:

I believe Cucumber 2.0 is going to allow something similar. I’ve long been experimenting on whether to put something like this in my own Lucid tool. Since I tend to maintain compatibility with Cucumber, it’s likely I’ll do so. Spinach, by contrast, allows no regular expressions and no placeholders. Their view is that every phrase in test specs must have a one-to-one mapping with a matcher. To me this makes Spinach a complete non-starter in that it’s simply way too restrictive.

Here’s another example I came across, which, again, is obfuscated to protect internal interests:

Scenario: With Steps, Match Terms with Quote Parameters
  When looking up the definition of "CHUD"
  Then the result is "Contaminated Hazardous Urban Disposal"
  But  if looking up the true definition of "CHUD"
  Then the result is "Cannibalistic Humanoid Underground Dweller"

Looking at the When and But phrases, you’d end up with these matchers:
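A sketch of the two matchers, assuming the quoted term is captured with a "anything but a quote" group. Cucumber strips the leading keyword (When, But) before matching, so the patterns start at the step text; the constant names are hypothetical.

```ruby
WHEN_LOOKUP = /^looking up the definition of "([^"]*)"$/
BUT_LOOKUP  = /^if looking up the true definition of "([^"]*)"$/

puts WHEN_LOOKUP.match('looking up the definition of "CHUD"')[1]         # CHUD
puts BUT_LOOKUP.match('if looking up the true definition of "CHUD"')[1]  # CHUD
```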

Here is an example where it helps to look at the wording of the scenario itself. I see why they worded it the way they did with the “But” clause, but notice how it forced some awkward phrasing: “but if looking up the…” It’s basically stating another action (thus a When) yet is being used in the context of a Then. In any event, how the team combined the above two matchers was with:
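A combined matcher along those lines could be sketched as follows, assuming both "if " and "true " are made optional with non-capturing groups. This is an illustration of the approach rather than the team's exact expression.

```ruby
# Optional "if " prefix and optional "true " qualifier, one captured term.
COMBINED_LOOKUP = /^(?:if )?looking up the (?:true )?definition of "([^"]*)"$/

puts COMBINED_LOOKUP.match('looking up the definition of "CHUD"')[1]              # CHUD
puts COMBINED_LOOKUP.match('if looking up the true definition of "CHUD"')[1]      # CHUD
```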

Okay, that works. But my argument to them was simply this: why not just “reuse” English and make the scenario look like the following?

Scenario: With Steps, Match Terms with Quote Parameters
  When looking up the definition of "CHUD"
  Then the result is "Contaminated Hazardous Urban Disposal"
  When looking up the true definition of "CHUD"
  Then the result is "Cannibalistic Humanoid Underground Dweller"

Now you just need one matcher:
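A sketch of that single matcher, with a second, deliberately flawed variant to show why the space has to live inside the optional group. Both constant names are hypothetical.

```ruby
# The trailing space is part of the optional group: "true " or nothing.
SINGLE_LOOKUP = /^looking up the (?:true )?definition of "([^"]*)"$/

# Flawed variant: the space sits outside the group, so a phrase without
# "true" would need two consecutive spaces to match.
MISSPACED_LOOKUP = /^looking up the (?:true)? definition of "([^"]*)"$/

plain = 'looking up the definition of "CHUD"'
truth = 'looking up the true definition of "CHUD"'

puts SINGLE_LOOKUP.match?(plain)     # true
puts SINGLE_LOOKUP.match?(truth)     # true
puts MISSPACED_LOOKUP.match?(plain)  # false
puts MISSPACED_LOOKUP.match?(truth)  # true
```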

But notice how the spacing matters for the “true” part? Since it’s an optional word and not being captured you have to account for the fact that a phrase that doesn’t use the word will have different spacing to match than a phrase that does use the word. So, speaking to the team, I asked if it simply wouldn’t be better to just do this:
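The simpler alternative might be sketched as two matchers that spell out each phrase and parameterize only the quoted term. The names are hypothetical and plain regular expressions again stand in for the step definitions.

```ruby
# Each phrase gets its own readable pattern; only the looked-up term varies.
LOOKUP_DEFINITION      = /^looking up the definition of "([^"]*)"$/
LOOKUP_TRUE_DEFINITION = /^looking up the true definition of "([^"]*)"$/

puts LOOKUP_DEFINITION.match('looking up the definition of "CHUD"')[1]            # CHUD
puts LOOKUP_TRUE_DEFINITION.match('looking up the true definition of "CHUD"')[1]  # CHUD
```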

In fact, this gets very close to what Spinach advocates: don’t use regular expressions but rather just use the exact phrases. My problem is sometimes you do want to use a placeholder for terms like above. I would not want a matcher for every possible phrase where someone could look up a specific word. This is where Cucumber and Turnip could align. I could have this in Cucumber:

In Turnip that would be:

You can see here how interests of test spec (feature file) clarity do align with code-level clarity, in terms of matchers and step definitions. I maintain that it is useful to parameterize phrases but it is necessary to do so in a way that does not obfuscate how the phrase ties into specific English statements about the domain you are testing.

About Jeff Nyman

Anything I put here is an approximation of the truth. You're getting a particular view of myself ... and it's the view I'm choosing to present to you. If you've never met me before in person, please realize I'm not the same in person as I am in writing. That's because I can only put part of myself down into words. If you have met me before in person then I'd ask you to consider that the view you've formed that way and the view you come to by reading what I say here may, in fact, both be true. I'd advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.
This entry was posted in BDD, Cucumber, TDL.
