If you plan on using a BDD tool (like Cucumber, SpecFlow, Behat, etc.) you are going to want some guidelines for how, and to what extent, you allow parameterized and conditionalized phrases. This is an area that I’ve found can become a rat’s nest of bad habits unless you establish those limits early on.
Here I’ll take you through some representative examples that are based on work I’ve seen. However, I’ve sufficiently obfuscated the examples so that I don’t violate any confidentiality rules. I’ll be focusing on Ruby examples here simply because they are the cleanest looking. What I’m talking about, however, applies to any programming language you happen to be using.
Let’s consider a simple scenario:
    Scenario: With Steps, Match Terms Without Delimited Parameters
      Given a car that has a flux capacitor
      Then the car will travel through time
So generating matchers for those would end up with literal step definitions like these:
    Given(/^a car that has a flux capacitor$/) do
    end

    Then(/^the car will travel through time$/) do
    end
At the simplest level, you can parameterize as such:
    Given(/^a (car) that has a (flux capacitor)$/) do |vehicle, component|
      puts "Vehicle: #{vehicle} with Component: #{component}"
    end

    Then(/^the car (?:will|should) travel through time$/) do
    end
In the second matcher notice how the regular expression allows you to enforce wording in a simplistic way: will or should. This means the following phrase would be unrecognized:
Then the car must travel through time
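You can check this kind of wording enforcement outside of the BDD tool entirely. The following is a minimal sketch in plain Ruby, assuming the tool hands the pattern the phrase text without its Given/When/Then keyword. The `recognized?` helper is a name I made up for illustration; it is not part of Cucumber.

```ruby
# Plain-Ruby check of the second step pattern; no Cucumber required.
STEP_PATTERN = /^the car (?:will|should) travel through time$/

# Hypothetical helper: true when the phrase (minus its keyword) matches.
def recognized?(phrase)
  STEP_PATTERN.match?(phrase)
end

[
  "the car will travel through time",
  "the car should travel through time",
  "the car must travel through time"
].each do |phrase|
  puts "#{phrase.inspect} recognized? #{recognized?(phrase)}"
end
```

Only the "will" and "should" variants are recognized. The `(?:...)` group restricts the wording without handing an extra argument to the step body.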
Looking at the first matcher, however, it’s still pretty literal. It would not match a phrase like:
Given a starship that has a warp manifold
You could generalize the first matcher a bit further like this:
    Given(/^a (.*) that has a (.*)$/) do |vehicle, component|
      puts "Vehicle: #{vehicle} with Component: #{component}"
    end
However, now that phrase stands for anything and everything as long as it fits the pattern of “a SOMETHING that has a SOMETHING”. Is that good? That depends on your local domain and what kind of code you are putting in the matcher. With very generalized phrases like this, you have to provide a data helper method or function somewhere that will determine what specific data was passed in and then act accordingly.
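To make that cost concrete, here is a rough sketch of what such a data helper might look like. Everything in it is hypothetical: the `KNOWN_COMBINATIONS` table and the `build_vehicle` method are invented for illustration, and your own domain would dictate the real rules.

```ruby
# Hypothetical lookup table: which components make sense for which vehicles.
KNOWN_COMBINATIONS = {
  "car"      => ["flux capacitor"],
  "starship" => ["warp manifold"]
}.freeze

# Hypothetical helper that a generic step definition would call with its two
# captured values. It rejects combinations the domain does not know about.
def build_vehicle(vehicle, component)
  allowed = KNOWN_COMBINATIONS.fetch(vehicle) do
    raise ArgumentError, "Unknown vehicle: #{vehicle}"
  end
  unless allowed.include?(component)
    raise ArgumentError, "A #{vehicle} cannot have a #{component}"
  end
  { vehicle: vehicle, component: component }
end

# The generalized pattern happily captures whatever it is given.
GENERIC_PATTERN = /^a (.*) that has a (.*)$/

vehicle, component =
  GENERIC_PATTERN.match("a starship that has a warp manifold").captures
p build_vehicle(vehicle, component)
```

The point is that the knowledge you removed from the phrase has to go somewhere. With the generic matcher it moves into a helper that someone has to write, find, and maintain.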
As I was presented with this example, I told the team that we should probably have the phrases specifically deal with cars and flux capacitors. Other phrases (like starships with warp manifolds) should be handled with different matchers. So we move our matcher back to this:
    Given(/^a car that has a flux capacitor$/) do
      puts "Vehicle: 'Car' with Component: 'flux capacitor'"
    end
Notice here that I don’t have to worry about parameters at all. Easy, right? But then the team added another scenario:
    Scenario: With Steps, Match Terms without Parameters, Different Data
      Given a car that has 3 flux capacitors
      Then those cars will travel very far through time
Taken by itself, this would lead to the following new matchers:
    Given(/^a car that has (\d+) flux capacitors$/) do |count|
    end

    Then(/^those cars will travel very far through time$/) do
    end
Let’s just take the first one and consider the two statements we have matchers for:
    Given a car that has a flux capacitor
    Given a car that has 3 flux capacitors
Okay, so given that we’re not parameterizing by vehicle or component (i.e., we agree our phrase should hard code those values), it might make sense to combine a matcher to handle the two phrases above since all we’re really parameterizing by is the count of flux capacitors in the car. I could combine the two into a single matcher like this:
    Given(/^a car that has (a|\d+) flux capacitors?$/) do |count|
      puts "Car with flux capacitor count of: #{count}."
    end
Notice here that I have to account for “a flux capacitor” or “3 flux capacitors”. This works well from a matching standpoint and I can even capture the case of multiple flux capacitors versus just one. For example, the following And clause could be added:
    Given a car that has 3 flux capacitors
    And a car that has 1 flux capacitor
Since I conditionalize the word “capacitor” by indicating the “s?”, I allow for singular or plural. Notice, however, that this matcher leads to “a” being captured in the initial case. My output would look like this:
    Scenario: With Steps, Match Terms Without Delimited Parameters
      Given a car that has a flux capacitor
        Car with flux capacitor count of: a.
      Then the car will travel through time

    Scenario: With Steps, Match Terms Without Delimited Parameters, Different Data
      Given a car that has 3 flux capacitors
        Car with flux capacitor count of: 3.
      Then those cars will travel very far through time
Notice the output of the first scenario: “Car with flux capacitor count of: a.” Not ideal. In this case, I can’t simply switch to a non-capturing group like this:
    Given(/^a car that has (?:a|\d+) flux capacitors?$/) do |count|
      puts "Car with flux capacitor count of: #{count}."
    end

That pattern still matches both phrases, but it no longer captures anything. That suits the first phrase, but not the second, where I do want to capture the count.
I could, however, do this:
    Given(/^a car that has (?:a)?(\d+)? flux capacitors?$/) do |count|
      puts "Car with flux capacitor count of: #{count}."
    end
That would work in that the first phrase would not capture the “a” (the count simply comes through empty), while the second would capture the count value.
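If it helps to see the difference between those three variants side by side, here is a small plain-Ruby sketch that prints what each pattern captures for the two phrases. The constant names and the `captures_for` helper are my own, used only for this comparison.

```ruby
# The three variants discussed above, as plain Ruby regular expressions.
CAPTURING     = /^a car that has (a|\d+) flux capacitors?$/
NON_CAPTURING = /^a car that has (?:a|\d+) flux capacitors?$/
OPTIONAL      = /^a car that has (?:a)?(\d+)? flux capacitors?$/

# Hypothetical helper: the array of captured values, or nil when the
# phrase does not match the pattern at all.
def captures_for(pattern, phrase)
  match = pattern.match(phrase)
  match && match.captures
end

phrases = [
  "a car that has a flux capacitor",
  "a car that has 3 flux capacitors"
]

{ "capturing" => CAPTURING,
  "non-capturing" => NON_CAPTURING,
  "optional groups" => OPTIONAL }.each do |label, pattern|
  phrases.each do |phrase|
    puts "#{label}: #{phrase.inspect} -> #{captures_for(pattern, phrase).inspect}"
  end
end
```

For the first phrase, the capturing variant yields `["a"]`, the non-capturing variant yields `[]`, and the optional-group variant yields `[nil]`. So even the last variant leaves the step body responsible for deciding that "no count" means one flux capacitor.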
However, as I showed this to the team, I asked: “Are we losing clarity here?” That’s a question to always be asking with these matchers: am I losing more than I’m gaining? Specifically, yes, I’m cutting down on the number of matchers I have but, on the other hand, my existing matcher is starting to lose some expressiveness. The team agreed to think about it and went away to continue working.
When I caught up with them later, they had added another scenario:
    Scenario: With Steps, Match Terms Without Delimited Parameters, Different Data
      Given 10 cars that have 1 flux capacitor
      Then those cars should sell like hotcakes
This led to two more matchers:
    Given(/^(\d+) cars that have (\d+) flux capacitor$/) do |car_count, capacitor_count|
    end

    Then(/^those cars should sell like hotcakes$/) do
    end
The team showed me that they could bring that first matcher in line with our existing matchers, like this:
    Given(/^(?:a)?(\d+)? cars? that ha(?:s|ve) (?:a)?(\d+)? flux capacitors?$/) do |car_count, capacitor_count|
    end
Wow, huh?
Now, to be sure, that does accept all of the phrases used so far but you have to really parse that matcher text to understand what it means and what it will accept. You also run into a danger of sometimes accepting more than you want. So I told the team: “Well, yeah … you could do that. Or you can have three matchers that will cover all the phrases so far.” I then just gave them this:
    Given(/^a car that has a flux capacitor$/) do
    end

    Given(/^a car that has (\d+) flux capacitors?$/) do |count|
    end

    Given(/^(\d+) cars that have (\d+) flux capacitors?$/) do |car_count, capacitor_count|
    end
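The danger of accepting more than you want is easy to demonstrate. The sketch below compares the single combined pattern with the three separate ones in plain Ruby. The phrase lists and helper names are mine, chosen to show the difference. They are not phrases the team actually wrote.

```ruby
# The one-size-fits-all pattern the team proposed.
COMBINED = /^(?:a)?(\d+)? cars? that ha(?:s|ve) (?:a)?(\d+)? flux capacitors?$/

# The three separate patterns.
SEPARATE = [
  /^a car that has a flux capacitor$/,
  /^a car that has (\d+) flux capacitors?$/,
  /^(\d+) cars that have (\d+) flux capacitors?$/
].freeze

def accepted_by_combined?(phrase)
  COMBINED.match?(phrase)
end

def accepted_by_separate?(phrase)
  SEPARATE.any? { |pattern| pattern.match?(phrase) }
end

# Phrases that appear in the scenarios so far.
INTENDED = [
  "a car that has a flux capacitor",
  "a car that has 3 flux capacitors",
  "10 cars that have 1 flux capacitor"
].freeze

# Illustrative phrases nobody meant to support.
UNINTENDED = [
  "a cars that has a flux capacitors",
  "a10 cars that have a3 flux capacitors",
  "10 car that has 1 flux capacitor"
].freeze

(INTENDED + UNINTENDED).each do |phrase|
  puts format("%-42s combined=%-5s separate=%s",
              phrase.inspect,
              accepted_by_combined?(phrase),
              accepted_by_separate?(phrase))
end
```

Both approaches accept every intended phrase. Only the combined pattern also accepts the ungrammatical ones, because each optional piece can be switched on or off independently of the others.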
What’s better? What’s worse? It really is up to the team but it does come down to the code clarity that you value as well as the maintainability of your matchers. Regular expressions do have their place and can be powerful. But you do have to consider how to use them wisely.

Incidentally, some tools promote placeholders. You can do that with Cucumber like this:
    Given("a car that has $count flux capacitors") do |count|
    end

    Given("$count cars that have $count flux capacitor") do |car_count, capacitor_count|
    end
That said, you’ll find these aren’t as flexible (currently) as regular expressions. With the first phrase, for example, it’s not possible via placeholders to indicate “capacitor” or “capacitors”. A tool called Turnip allows you to use actual placeholders that provide conditionalizing phrases. Those would look like this:
    step "a car that has :count flux capacitor(s)" do |count|
    end

    step ":count cars that have :count flux capacitor" do |car_count, capacitor_count|
    end
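One way to think about the difference between the two placeholder styles is to imagine how each might be turned into a regular expression. The sketch below is purely illustrative. Both translator methods are hypothetical and are not the actual Cucumber or Turnip implementations. It assumes that a `$name` placeholder becomes a catch-all group and that a `:name` placeholder becomes a single-word group, with `(s)` treated as an optional plural.

```ruby
# Hypothetical translator (NOT Cucumber's real code): every "$name"
# placeholder becomes a catch-all group; all other text stays literal.
def dollar_style_to_regexp(text)
  escaped = Regexp.escape(text)
  Regexp.new("^" + escaped.gsub(/\\\$\w+/) { "(.*)" } + "$")
end

# Hypothetical translator (NOT Turnip's real code): every ":name"
# placeholder becomes a single-word group and "(s)" becomes optional.
def colon_style_to_regexp(text)
  escaped = Regexp.escape(text)
  pattern = escaped.gsub(/:\w+/) { "(\\S+)" }
                   .gsub("\\(s\\)") { "(?:s)?" }
  Regexp.new("^" + pattern + "$")
end

dollar = dollar_style_to_regexp("a car that has $count flux capacitors")
colon  = colon_style_to_regexp("a car that has :count flux capacitor(s)")

["a car that has 3 flux capacitors",
 "a car that has 1 flux capacitor"].each do |phrase|
  puts "#{phrase.inspect}: dollar=#{dollar.match?(phrase)} colon=#{colon.match?(phrase)}"
end
```

Under those assumptions, the `$count` version only matches the plural wording, since everything outside the placeholder is literal text. The `(s)` convention lets one phrase cover both the singular and the plural without dropping down to a regular expression.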
I believe Cucumber 2.0 is going to allow something similar. I’ve long been experimenting on whether to put something like this in my own Lucid tool. Since I tend to maintain compatibility with Cucumber, it’s likely I’ll do so. Spinach, by contrast, allows no regular expressions and no placeholders. Their view is that every phrase in test specs must have a one-to-one mapping with a matcher. To me this makes Spinach a complete non-starter in that it’s simply way too restrictive.
Here’s another example I came across, which, again, is obfuscated to protect internal interests:
    Scenario: With Steps, Match Terms with Quote Parameters
      When looking up the definition of "CHUD"
      Then the result is "Contaminated Hazardous Urban Disposal"
      But if looking up the true definition of "CHUD"
      Then the result is "Cannibalistic Humanoid Underground Dweller"
Looking at the When and But phrases, you’d end up with these matchers:
    When(/^looking up the definition of "(.*?)"$/) do |term|
    end

    But(/^if looking up the true definition of "(.*?)"$/) do |term|
    end
Here is an example where it helps to look at the wording of the scenario itself. I see why they worded it the way they did with the “But” clause but notice how it made them use interesting phrasing — “but if looking up the…” It’s basically stating another action (thus a When) yet is being used in the context of a Then. In any event, how the team combined the above two matchers was with:
    When(/^(?:if )?looking up the(?: true)? definition of "(.*?)"$/) do |term|
    end
Okay, that works. But my argument to them was simply this: why not just “reuse” English and make the scenario look like the following?
    Scenario: With Steps, Match Terms with Quote Parameters
      When looking up the definition of "CHUD"
      Then the result is "Contaminated Hazardous Urban Disposal"
      When looking up the true definition of "CHUD"
      Then the result is "Cannibalistic Humanoid Underground Dweller"
Now you just need one matcher:
    When(/^looking up the(?: true)? definition of "(.*?)"$/) do |term|
    end
But notice how the spacing matters for the “true” part? Since it’s an optional word and not being captured, you have to account for the fact that a phrase that doesn’t use the word will have different spacing to match than a phrase that does use the word. So, speaking to the team, I asked whether it wouldn’t simply be better to do this:
    When(/^looking up the definition of "(.*?)"$/) do |term|
    end

    When(/^looking up the true definition of "(.*?)"$/) do |term|
    end
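The spacing issue with the optional word is the kind of thing that bites people, so here is a short plain-Ruby sketch of it. The `SPACE_OUTSIDE` variant is a mistake I made up to show the problem. It did not come from the team.

```ruby
# The space travels with the optional word, so both phrases match.
SPACE_INSIDE  = /^looking up the(?: true)? definition of "(.*?)"$/

# Illustrative mistake: the spaces stay outside the optional group, so a
# phrase without "true" would need two spaces in a row to match.
SPACE_OUTSIDE = /^looking up the (?:true)? definition of "(.*?)"$/

plain     = 'looking up the definition of "CHUD"'
with_true = 'looking up the true definition of "CHUD"'

[plain, with_true].each do |phrase|
  puts "#{phrase.inspect}: inside=#{SPACE_INSIDE.match?(phrase)} " \
       "outside=#{SPACE_OUTSIDE.match?(phrase)}"
end
```

The second pattern quietly stops matching the phrase that omits "true". The two literal matchers above avoid that question entirely.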
In fact, this gets very close to what Spinach advocates: don’t use regular expressions but rather just use the exact phrases. My problem is sometimes you do want to use a placeholder for terms like above. I would not want a matcher for every possible phrase where someone could look up a specific word. This is where Cucumber and Turnip could align. I could have this in Cucumber:
    When("looking up the definition of $term") do |term|
      puts "Term: #{term}"
    end

    When("looking up the true definition of $term") do |term|
      puts "Term: #{term}"
    end
In Turnip that would be:
    step "looking up the definition of :term" do |term|
      puts "Term: #{term}"
    end

    step "looking up the true definition of :term" do |term|
      puts "Term: #{term}"
    end
You can see here how interests of test spec (feature file) clarity do align with code-level clarity, in terms of matchers and step definitions. I maintain that it is useful to parameterize phrases but it is necessary to do so in a way that does not obfuscate how the phrase ties into specific English statements about the domain you are testing.