A while back I wrote up some context around the question “Is Cucumber Truly Misunderstood?” There is a wider concept here that makes these tools quite applicable in the modern testing context, so I want to cover that here.
I focused on Cucumber in that last post but really what I’m referring to is the slate of tools that tend to be lumped together as “BDD solutions”, whether that be Cucumber (either the Ruby or JVM version), JBehave, Concordion, SpecFlow, Behat, Gauge, and so forth. Cucumber just happens to be the one I pick on because one of its creators, and some of his most vocal followers, routinely tout that “Cucumber is not a testing tool” and the implication is that none of those other tools are either.
In fact, some of these people argue that as if that is some sort of selling point. “We’re not a test tool!” This even gets widely distributed as a talking point, such as with this “news” article called BDD Tool Cucumber is Not a Testing Tool. Such tools are often touted as collaboration tools. And I agree that they can be that. But, quite frankly, so can Microsoft Excel, Atlassian Confluence, or the Slack messaging application. It’s my view that if tools like Cucumber are misunderstood, it’s often only by some of the misguided statements of their creators or the limited viewpoints of their most ardent supporters.
These thoughts have been in the back of my mind lately and this happened just as I read A Context-Driven Approach to Automation in Testing (PDF link, by the way) by James Bach and Michael Bolton. That paper has nothing to do with BDD tools, per se, but it is well worth the read because it has everything to do with the idea of testing tools and the context of automation, which tools like Cucumber are often used as front-ends for.
One part of this paper says that a “tool” is “any human contrivance that aids in fulfilling a human purpose.” Any tool that thus aids in fulfilling the human purpose of testing could rightfully be called a test tool. That is a large part of why I look at tools like Cucumber in this context because, to me, they are specification tools — or at least can be used as such — and specification is one of the techniques that testing can be applied to.
A few points really stuck out for me in the paper and I’ll quote them in full with a brief bit about why those points resonated with me.
Tools That Affirm Multiple Purposes
Tools that support many purposes are preferable to those optimized for one purpose. Some tools are designed with specific process assumptions in mind. If you work in a context where those assumptions apply, you are fine. But what if you decide to change the process? Will your tools let you change? In changing contexts, tools that are simple, modular, or adaptable tend to be better investments. Tools that operate through widely used interfaces and support widely used file formats are more easily adapted to new uses. Note that a tool may have only one major function, such as searching for patterns in text, and yet be a good fit for many purposes and many processes.
I couldn’t agree more, and that’s why I see BDD-style tools as being those that can support many purposes. For example, I’ve used such tools solely as specification tools and never had them delegate down to some underlying automation tool. I’ve used the tools solely in that context to help teams build up a shared language of phrases: a ubiquitous language, in the terms of Domain-Driven Design. But I could do that without these tools, right? I sure could … so why use those tools then? There are various reasons, but one is that many of them have a “dry-run” feature, which is kind of nice in that it has the tool parse all of the test specifications and check whether the phrases, as used, are recognized. This helps teams figure out if they are staying consistent in how they talk about the domain. I could build my own such solution but these tools handle that fairly well already.
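To make the dry-run idea a bit more concrete, here is a minimal Python sketch of that kind of phrase check. To be clear, this is not how Cucumber or any of the other tools implement their dry run. The step patterns, the `undefined_steps` helper, and the sample feature text are all hypothetical, made up purely for illustration.

```python
import re

# Hypothetical shared vocabulary: the phrases the team has agreed on.
KNOWN_STEPS = [
    re.compile(r'^a registered user named "[^"]+"$'),
    re.compile(r"^the user logs in$"),
    re.compile(r"^the dashboard is displayed$"),
]

STEP_KEYWORDS = ("Given ", "When ", "Then ", "And ", "But ")


def undefined_steps(feature_text):
    """Return (line_number, phrase) pairs for steps that match no known pattern."""
    missing = []
    for number, raw in enumerate(feature_text.splitlines(), start=1):
        line = raw.strip()
        for keyword in STEP_KEYWORDS:
            if line.startswith(keyword):
                phrase = line[len(keyword):]
                if not any(p.match(phrase) for p in KNOWN_STEPS):
                    missing.append((number, phrase))
                break  # a line carries at most one step keyword
    return missing


# Made-up specification; the last step drifts from the agreed wording.
FEATURE = """\
Feature: Login
  Scenario: Successful login
    Given a registered user named "Ana"
    When the user logs in
    Then the dashboard is shown
"""

for number, phrase in undefined_steps(FEATURE):
    print(f"line {number}: unrecognized phrase: {phrase}")
```

Nothing here touches the application at all. The only thing being checked is whether the wording in the specification matches the vocabulary the team has settled on, which is the part of the dry run I find useful for communication.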
I might note that I just said “test specifications” above and that’s another point: these tools can be used to write tests at varying abstraction levels, not just at the business rule level. Most proponents of the tools say this means you are not “doing it right.” But, again, the tools can serve multiple purposes. Some of those specifications may, in fact, delegate down to automated checkers whereas others are solely used for communication.
As just one example of this, I used Cucumber in a context where some test specifications were pure business rule statements but others were literal test steps for interacting with a user interface. I used Macros4Cuke to provide sequences in my own test specifications such that more imperative steps for tests could be provided. I attempted to implement this in my own Lucid tool (see the post Sequences in Lucid if curious, although that tool is pretty much deprecated right now). Using imperative style statements like that would be a major no-no according to most BDD practitioners but I found I could use those as immediate “steps to reproduce” in bugs as well as have those be training materials for new testers.
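The general idea of a sequence, where one higher-level phrase stands in for several imperative steps, can be sketched in a few lines of Python. This is not the Macros4Cuke API or the Lucid implementation. The `MACROS` table, the `<user>` placeholder syntax as handled here, and the `expand` helper are assumptions of mine for the sake of the example.

```python
import re

# Hypothetical macro table: one phrase mapped to its imperative sub-steps.
MACROS = {
    "I log in as <user>": [
        "I visit the login page",
        "I enter <user> in the username field",
        "I click the login button",
    ],
}

PLACEHOLDER = re.compile(r"<(\w+)>")


def _to_regex(phrase):
    """Turn a macro phrase into a regex with one named group per placeholder."""
    # Splitting on a capturing group alternates literal text and placeholder names.
    parts = PLACEHOLDER.split(phrase)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>.+?)"
        for index, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


def expand(step):
    """Return the sub-steps for a macro step, or the step itself if it is not a macro."""
    for phrase, substeps in MACROS.items():
        match = _to_regex(phrase).match(step)
        if match:
            values = match.groupdict()
            return [
                # Unknown placeholders are left untouched rather than raising.
                PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), sub)
                for sub in substeps
            ]
    return [step]


for line in expand('I log in as "ana"'):
    print(line)
```

The expanded list is what I would paste into a bug report as steps to reproduce, while the single macro phrase is what stays in the specification.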
Tools That Reinforce Human Engagement
Tools that require more human engagement and control are preferable to those that require less. This is due to a syndrome called “automation complacency,” which is the tendency of human operators to lose their skills over time when using a tool that renders skill unnecessary under normal circumstances. In order to retain our wits, we humans must exercise them. Tools should be designed with that in mind, or else when the tool fails, the human operator will not be prepared to react.
One thing I like about many of the BDD tools is that they don’t just have to be tools used to delegate down to automation. They can be used with other supporting tools to provide more insight into communication by looking at what kinds of test specifications have been written. Just sticking with the Ruby context again, let me provide two examples:
- I can use Cuke Sniffer, with my own rules applied, to check if our teams are using some “communication smells”. I could look for examples where the tables were too large or where certain phrases should be used. For example, I worked at an ad serving firm where the much abused phrase “beacon ad” was considered verboten by the business. With the sniffer, I could automatically check if we were using that phrase anywhere and have it flagged.
- I also used the Cucumber Query Language to query on some statistics regarding what’s been written. So I could run queries, for example, on how many distinct tags were being used or how often a particular tag was being used. This allowed me to see if we were using different tag names for what amounted to the same thing.
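Both of those checks boil down to treating the specifications as text that can be inspected. A rough Python sketch of the two ideas follows. It does not use Cuke Sniffer or the Cucumber Query Language, which are Ruby tools with their own APIs. The `find_forbidden` and `count_tags` helpers, the rule list, and the sample feature are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical rule list: phrases the business does not want to see.
FORBIDDEN_PHRASES = ["beacon ad"]

TAG_PATTERN = re.compile(r"@[\w-]+")


def find_forbidden(feature_text, phrases=FORBIDDEN_PHRASES):
    """Return (line_number, phrase) pairs for every forbidden phrase found."""
    hits = []
    for number, line in enumerate(feature_text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in phrases:
            if phrase in lowered:
                hits.append((number, phrase))
    return hits


def count_tags(feature_text):
    """Count how often each tag appears on tag lines (lines starting with @)."""
    counts = Counter()
    for line in feature_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@"):
            counts.update(TAG_PATTERN.findall(stripped))
    return counts


# Made-up specification with a banned phrase and two near-duplicate tags.
FEATURE = """\
@ads @smoke
Feature: Ad delivery
  @smoke-test
  Scenario: Serving a tracked ad
    Given a campaign with a beacon ad
    When the page loads
    Then the impression is recorded
"""

print(find_forbidden(FEATURE))
print(sorted(count_tags(FEATURE).items()))
```

Seeing `@smoke` and `@smoke-test` side by side in the tag counts is exactly the kind of prompt I want: it doesn’t decide anything for the team, it just surfaces a question for the humans to talk about.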
Consonant with Bach and Bolton’s paper, note here that I’m using various tools in support of my human purpose of testing but none of those necessarily has to be in the context of automated checking. When I combine tools like this, and frame BDD tools as communication and collaboration test tools, I have a way to engage the humans in the process of testing, which is a multi-faceted discipline.
Tools That Account for the Non-Specialist
Tools that can be useful to non-specialists are preferable to those that are not. We’re talking about tools that lower the cost of getting started; that afford ease of use; that don’t depend on proprietary languages; that have lower transfer and training cost. Microsoft Excel and spreadsheets in general provide a good example. It is possible to use Excel in a very specialized and sophisticated way, but there is a lot Excel can do for you, even if you have only basic skills with it.
BDD-style tools can be used by non-specialists in a variety of contexts which I’ve barely touched upon here. One thing that’s interesting is the mention of “proprietary languages”. While that’s likely referring more to programming languages, consider that many BDD tools are predicated upon the use of Gherkin, which is a standard language for expressing business rules in the context of “Given-When-Then” statements. This can be a bit limiting even though I do think it can be helpful to limit some aspects of structuring. That said, various tools, like Yadda or Gauge, are attempting to provide similar solutions but without necessarily restricting you to Gherkin. One area of development I’d like to see is the allowance for more expressive test specifications outside of a Gherkin context.
Was there a point to all this?
I think a lot of the tools out there have provided us with a direction. I think they have all been helpful in allowing us to think about what kind of tools we need to support testing, particularly when testing intersects with other areas such as specification writing. But I also think the future has a lot of room for continued growth in this area of thought. We just have to make sure that people are spending less time categorizing these tools and spending more time on thinking about how the tools engage the human process of testing.