This post follows on from my “code is a specification” post. I highly recommend reading that post first for context, because here I’m going to build on its sample code. The goal is to illustrate how test code and production code work together as an executable specification. I’ll also focus a bit on how this is relevant to the business.
Let’s consider the bit of test code we put in place to put pressure on the design:
    RSpec.describe 'a project' do
      it 'that has no tasks is done' do
        project = Project.new
        expect(project).to be_done
      end
    end
I’m not showing you the production code that makes this code pass but, again, keep in mind that it’s the two acting in concert that is the basis of practices like TDD. As I argued in the previous post, you could equally well argue this is BDD. By discussing the construction of this code along with developers, testers, and business, pressure was put not just on the design of the code but on the overall design of the behavior.
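For readers who want something concrete to picture, here is a minimal sketch of production code that could satisfy that first test. This is an assumption on my part about the simplest possible starting point, not the actual implementation; the only thing taken from the test is that RSpec's `be_done` matcher calls a `done?` predicate on the object.

```ruby
# Hypothetical first-pass implementation: just enough to satisfy
# "a project that has no tasks is done". Illustrative only.
class Project
  # RSpec's `be_done` matcher delegates to this predicate method.
  def done?
    true
  end
end

project = Project.new
puts project.done?
```

A hard-coded `true` is obviously not the final design, and that's the point: the next test has to force the code to become more honest.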
Incrementally Refine the Design
So as part of our nascent design, we’ve managed to encapsulate the idea that a new project is done. This would then lead to a question of what a “non-done project” is. Let’s say we end up with another test that drives our code, such as this:
    it 'that has an incomplete task is not done' do
      project = Project.new
      task = Task.new

      project.tasks << task
      expect(project).not_to be_done
    end
This test is similar to the first one, but now we have a second domain concept introduced: the Task. We now strictly model the fact that a Project can contain Tasks. From a code perspective we indicate a tasks-related attribute of the Project, but we’re still keeping this at the behavior level without worrying too much about implementation. The important thing to note is how this is forcing us to consider our level of abstraction and our domain terminology.
Also notice an assumption here. The assumption in this code is that a new task is incomplete. That in turn exposes a second assumption we are making: a project with an incomplete task is not done. It’s very important when putting pressure on design to raise assumptions to the surface.
All of this, of course, brings up the distinction between complete and incomplete tasks. So now let’s consider this next bit:
    it 'recognizes tasks that have been completed' do
      project = Project.new
      task = Task.new

      expect(task).not_to be_complete

      task.mark_complete
      expect(task).to be_complete
    end
Notice here how we are defining, in broad strokes, how the application behaves, but we are doing so without committing to many implementation details. This is important because we can imagine a situation where there is a GUI (say a web site) as well as a service interface (an API). The notion of “mark complete” can have meaning in both, but certainly via different levels of interaction. Here we are just talking about the design.
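To make the "new task is incomplete" assumption visible in code, here is one possible sketch of a Task that would satisfy the test above. The method names `mark_complete` and `complete?` come from the test; storing the state as a simple boolean is purely my assumption for illustration.

```ruby
# Hypothetical Task: completion tracked as a plain boolean.
class Task
  def initialize
    # The surfaced assumption: a brand-new task starts out incomplete.
    @complete = false
  end

  def mark_complete
    @complete = true
  end

  # RSpec's `be_complete` matcher delegates to this predicate.
  def complete?
    @complete
  end
end

task = Task.new
puts task.complete?
task.mark_complete
puts task.complete?
```

Whether "complete" should stay a boolean (rather than, say, a completion date) is exactly the kind of question the test invites but doesn't force you to answer yet.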
As a tester or developer, at this point I know an empty project is done. (Perhaps it’s an open question with business as to whether a project with no tasks should be considered “empty.”) We know that tasks can be part of projects and they can be completed or incomplete. This then leads to the next point of discussion and design which we wrap in a test:
    it 'that has all tasks completed is done' do
      project = Project.new
      task = Task.new

      project.tasks << task
      task.mark_complete

      expect(project).to be_done
    end
Let’s again reflect. Am I doing BDD here? TDD? Well, kind of both, I would argue. The above code separates out into what BDD would look like:
    # SCENARIO: a project with all tasks complete is done

    # GIVEN a new project and a new task
    project = Project.new
    task = Task.new

    # WHEN the task is marked as complete
    project.tasks << task
    task.mark_complete

    # THEN the project is considered done
    expect(project).to be_done
This is all pretty important. We now have a Project that can be populated with Tasks. Those tasks can be marked as complete. Further, the project recognizes when it has incomplete tasks, thus recognizing that it is not done.
Encode Understanding
We have created tests that encode our understanding. Those tests have natural language aspects that can be used to communicate at different levels. Further, these tests operate at a behavioral level, without getting too far into implementation details. That lets them serve as a good regression suite: they get updated as business rules change but stay relatively static if the implementation changes.
If I wanted to push English up at this point, here’s a possible output based entirely on what I’ve shown you in the logic above:
    a project
      that has no tasks is done
      that has an incomplete task is not done
      recognizes tasks that have been completed
      that has all tasks completed is done
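Since the test logic is wrapped with English statements, those statements can be pushed back up to provide insight. With RSpec, for example, running the specs with the documentation formatter (rspec --format documentation) prints this kind of nested outline.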
Use Tests to Pressure Design
As part of this project, as I mentioned in the first post, we need to be able to calculate how much of a project is remaining and the rate of completion, and then put them together to determine a projected end date. So the project ultimately needs to be able to calculate how much work is remaining. Let’s say the following is created:
    RSpec.describe 'a project' do
      describe 'providing estimates' do
        let(:project) { Project.new }
        let(:done) { Task.new(size: 2, complete: true) }
        let(:small_not_done) { Task.new(size: 1) }
        let(:large_not_done) { Task.new(size: 4) }

        before(:example) do
          project.tasks = [done, small_not_done, large_not_done]
        end

        it 'accurately calculates the total size' do
          expect(project.total_size).to eq(7)
        end

        it 'accurately calculates the remaining size' do
          expect(project.remaining_size).to eq(5)
        end
      end
    end
The book Rails 4 Test Prescriptions (from which I am borrowing the salient aspects of this example) has this to say about constructs like the above:
A couple of minor style choices make the test easier to manage. All the task objects have meaningful names so that at a glance I can tell each object’s reason for being in the test. If the tasks had descriptions or names I’d also give them meaningful data so that if the object gets printed to the terminal it’s easy to tell which object it is. The specific score numbers that I’m using for each are deliberate. Each task has a different score, and neither of the two adds up to the third, which is a very small thing that makes it harder to get a false positive test.
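The point about sizes is easy to gloss over, so here is a small sketch of what a false positive could look like. Everything in it is hypothetical: the Task class is a stand-in with just `size` and `complete?`, and `buggy_remaining_size` is a deliberately wrong calculation I made up that sums the completed tasks instead of the incomplete ones.

```ruby
# Stand-in Task with only what this illustration needs.
class Task
  attr_reader :size

  def initialize(size:, complete: false)
    @size = size
    @complete = complete
  end

  def complete?
    @complete
  end
end

# Correct calculation: total size of the tasks that are NOT complete.
def remaining_size(tasks)
  tasks.reject(&:complete?).sum(&:size)
end

# Deliberately buggy: totals the completed tasks by mistake.
def buggy_remaining_size(tasks)
  tasks.select(&:complete?).sum(&:size)
end

# Sizes where the two incomplete tasks add up to the completed one (1 + 2 = 3).
ambiguous = [Task.new(size: 3, complete: true), Task.new(size: 1), Task.new(size: 2)]

# Sizes from the spec above: no two add up to the third.
distinct = [Task.new(size: 2, complete: true), Task.new(size: 1), Task.new(size: 4)]

# The bug hides behind the ambiguous data: both calculations give 3.
puts buggy_remaining_size(ambiguous) == remaining_size(ambiguous) # true

# The spec's data exposes it: 2 versus 5.
puts buggy_remaining_size(distinct) == remaining_size(distinct)   # false
```

With the ambiguous data, a test asserting the remaining size would pass against the broken code. With the sizes chosen in the spec, it fails, which is what you want.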
Use Code to Discuss
The important thing here is that, yes, this is code. But it is code that is understandable. There seems to be this fear of introducing code like this as part of a discussion with business teams. But why is that? After all, business teams certainly (and rightly) expect developers and testers to understand their language and their business domain. And, guess what, that business team is operating in the context of a technical discipline. So there’s no reason they should be unexposed to what makes their business ideas realizable in a technical form.
This is probably one of the most important ideas I ultimately want test teams — and teams they work with — to start embracing. BDD, speaking generally, has worked to try to insulate the business from code and I think that’s a terrible mistake. I’ll also be the first to admit this is an idea I’m coming to after firmly drinking the BDD Kool-aid for quite some time.
Refactor to the Business Domain
There’s one last bit I want to show you here and this time it jumps into some (admittedly simplified) code. Here is what the Project class looks like that satisfies the above tests:
    class Project
      attr_accessor :tasks

      def initialize
        @tasks = []
      end

      def done?
        # tasks.reject(&:complete?).empty?
        incomplete_tasks.empty?
      end

      def incomplete_tasks
        tasks.reject(&:complete?)
      end

      def total_size
        tasks.sum(&:size)
      end

      def remaining_size
        # tasks.reject(&:complete?).sum(&:size)
        incomplete_tasks.sum(&:size)
      end
    end
The commented lines in the done? and remaining_size methods are there to show how a particular bit of code was refactored as part of the test design. Specifically, an incomplete_tasks method was created. As the book says:
[we wrapped] a slightly opaque functional call containing a negative condition in a method with a semantically meaningful name. And if the definition of completeness changes, we only have to change one location.
This type of refactoring does make the code cleaner, but it also makes the code easier to discuss with business. And that gets interesting for another reason entirely.
To explain, looking at that code, you might notice another bit of duplication there. At two points there is a summation of the tasks by calling sum(&:size). If you have a developer mindset, you might wonder if you could refactor that down to a method as well, such as sum_tasks or something like that.
But … does that method make sense on a Project? It certainly doesn’t make sense for the Task class. But it’s not clear that a Project should have this responsibility either. Yet, as the book indicates, this leads us to question whether the Project is even the correct abstraction. Perhaps what we really want is a TaskList for most of these activities. Then a Project would hold very little except references to TaskList objects.
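To give that discussion something to point at, here is a rough sketch of what a TaskList might look like. To be clear, this is my own guess at a shape, not code from the book: the `TaskList` methods (`incomplete`, `total_size`, `empty?`, `<<`) and the stand-in Task are all hypothetical names chosen for illustration.

```ruby
# Stand-in Task so the sketch is self-contained.
class Task
  attr_reader :size

  def initialize(size: 0, complete: false)
    @size = size
    @complete = complete
  end

  def mark_complete
    @complete = true
  end

  def complete?
    @complete
  end
end

# Hypothetical TaskList: owns the collection-level questions.
class TaskList
  def initialize(tasks = [])
    @tasks = tasks
  end

  def <<(task)
    @tasks << task
    self
  end

  # A new TaskList holding only the tasks that are not complete.
  def incomplete
    TaskList.new(@tasks.reject(&:complete?))
  end

  def total_size
    @tasks.sum(&:size)
  end

  def empty?
    @tasks.empty?
  end
end

# Project now holds a TaskList and mostly delegates to it.
class Project
  attr_reader :tasks

  def initialize
    @tasks = TaskList.new
  end

  # Accept a plain array, as the specs do, and wrap it.
  def tasks=(tasks)
    @tasks = TaskList.new(tasks)
  end

  def done?
    tasks.incomplete.empty?
  end

  def total_size
    tasks.total_size
  end

  def remaining_size
    tasks.incomplete.total_size
  end
end

project = Project.new
project.tasks = [Task.new(size: 2, complete: true), Task.new(size: 1), Task.new(size: 4)]
puts project.total_size     # 7
puts project.remaining_size # 5
puts project.done?          # false
```

Notice that the duplication question answers itself here: the summation lives in one place, on the object that actually is a list of tasks, and the behavior the specs describe is unchanged.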
This matters because deciding on those abstraction levels, which reflect the business domain, is key to aligning business, developers, and testers along the same axes of discussion. And please note that this puts pressure on design at both levels (intent and implementation) while still using code (production and test) as the ultimate specification.
Expressive and Intent-Revealing Code
One final thing I’d like to point out. Here I’m showing code, both production and test, written in Ruby. Clearly having the test language aligned with the production language makes sense when you are talking about testing at the unit and/or integration level. But when you start to get into integrated testing (and thus “system” and “acceptance” testing), you don’t necessarily have to use the same test language as your production language.
As I’ve talked about before, sometimes it makes sense to align your test language with your development language. Other times, however, your test language is not necessarily your development language. And, of course, you can have a resilient strategy wherein you are a bit polyglot in your approach.
I mentioned the “push English up” idea earlier to showcase just the natural language part of the specs. This is exactly what I was talking about regarding “pushing English up” versus “pulling English down” when I showcased the use of a tool like Serenity in the context of Cucumber.
This idea does have some impact on BDD-style tools. Adding Cucumber or whatever else often means adding the pure English abstraction layer (“feature files”), then a secondary layer (“step definitions”) that is used to match the English to regular-expression-annotated methods, and then finally that secondary layer delegates down to some code that performs the actions. Using an approach like what I’m showing here can short-circuit a lot of that. Further, you can output the natural language information from the code-based spec file. I’ve been showing Ruby here but an example of doing this in a Java context might be using a tool like Spock.
Even with all that being said, the test code needs to be written in such a way that it leverages a DSL if you want it to act as a communication mechanism for different people. It may seem that I’ve stacked the deck here with Ruby, given that it is a clean language with very little boilerplate. But as I showed in those posts on Serenity, you can do something like this with Java as well:
    Actor jeff = Actor.named("Jeff");
    jeff.can(BrowseTheWeb.with(theBrowser));

    givenThat(jeff).wasAbleTo(StartWith.anEmptyTodoList());
    when(jeff).attemptsTo(AddATodoItem.called("Digitize JLA vol 1 collection"));
    then(jeff).should(seeThat(TodoItemsList.displayed(), hasItem("Digitize JLA vol 1 collection")));
The same thing could be done in C#:
    IActor jeff = new Actor("Jeff");

    jeff.Can(BrowseTheWeb.With(webDriver))
        .WasAbleTo(StartWith.AnEmptyTodoList())
        .AttemptsTo(ToDoItem.AddAToDoItem("Digitize JLA vol 1 collection"));

    jeff.AsksFor(TheItems.Displayed()).Should().Contain("Digitize JLA vol 1 collection");
Using code as a specification does mean you want that code to be as boilerplate-free as possible, at least in terms of the test code that you use to communicate with different teams. Ideally your code should help you be expressive and intent-revealing. When you can accommodate such an approach, you will often feel much more comfortable about the idea of code being the ultimate specification.