The Art of Testing at Different Levels

Many testers entering the field get nervous when they consider the need to test at different levels of an application, particularly if that involves working with developers. Many testers will enter into environments that claim they are practicing Test-Driven Development (TDD) and/or Behavior-Driven Development (BDD). What I want to do here is show that this stuff isn’t really all that scary and it’s really not that arcane. Further, if you want to be competitive as a tester and continue to add value, it’s in your best interests to learn what it means to test at different levels.

I’m going to pretend that you’re an experienced tester (from a manual testing perspective) but you are new on the job in an environment that uses Ruby, a language you haven’t really done anything with at all. I pick Ruby because this is generally one of the easier languages to get new people up to speed on and to learn concepts with.

Ruby Stuff: If you want to try out the examples, you’ll need Ruby installed. I’d recommend Ruby 1.9.2. The directions at the Ruby site are generally more than adequate. You can also follow my Open Source Automation Setup directions if those help.

I’m going to cover three testing tools in the Ruby world: Test::Unit, RSpec, and Cucumber. I’ll also include a few side comparisons with similar tools in other languages just so you have an idea of what they look like. For example, the equivalent of Test::Unit in the Java world is JUnit and in the .NET world it would be NUnit. In Python you’d probably be using PyUnit (sometimes called unittest). The equivalents of RSpec and Cucumber in the Java world are probably easyb and JBehave, respectively. In the .NET world this would be SpecFlow. In Python you’d be using something like Freshen or Lettuce.

TDD is a methodology consisting of writing a failing test first and then writing the code that will make the test pass. Once this test is in place developers can theoretically refactor (basically, change) their code with much more confidence. The confidence isn’t because all of this testing will make them better coders, but rather because tests have been built in that will tell them if they screw something up. If the developers write code that causes the tests to break, then that should be immediately obvious. (This presumes people are actually running the tests, of course!) The nice thing about this approach is that it can help people (not just developers) consider how the code should be running and what kind of outputs it should give as the result of certain inputs.

BDD is a methodology based on TDD and, some would argue, an extension of it. Whereas with TDD you tend to write tests that check an isolated part of the code base (often just a single method), with BDD you write a test to check the interaction between different parts of the code base. You can sort of think of TDD as testing bits of code in isolation and BDD as testing various bits of code working together. Like TDD, BDD also places an emphasis on writing tests before anyone writes code but the focus is different in that BDD works at the level of features and scenarios that utilize those features rather than at the level of methods within objects.

TDD and BDD, as approaches, are meant to give developers (and others) time to think through decisions before too much code is written. By first writing the test for the implementation you eventually want, you are forced to think through the implementation so that it will be easy (or at least easier) to test. This is an important point. Many developers would argue — and I would agree — that if you find the test very difficult to write, then perhaps the implementation needs to be improved. If the test is relatively easy to write, but it’s quite cumbersome in terms of the number of steps, maybe that’s again pointing out ways that the implementation should be simplified.

Note that this kind of testing, which feeds directly into the development of maintainable and extensible code, is different from the tests that most testers write at the system level.

So let’s get started on a working example. I’d recommend creating a working directory where you can put the files we’ll create. Let’s say your new job involves testing a new game engine for role-playing games that your company is developing. The developers want to get a handle on how the various parts of the engine should work and, as a tester, you should be seeking the same thing. So let’s create a file called attack_test.rb. Put the following in the file:
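Here’s a minimal sketch of what that starting file might contain. I’m assuming the class and method names that come up later in this post (AttackTest and test_fight_result), and the test body is deliberately empty for now:

```ruby
# attack_test.rb
require 'test/unit'   # provides Test::Unit::TestCase

class AttackTest < Test::Unit::TestCase
  # Any method whose name starts with "test" is picked up and run.
  def test_fight_result
    # Intentionally empty: nothing is asserted yet.
  end
end
```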

To make this a Test::Unit test, you had to require test/unit, which is part of Ruby’s standard library. This provides the Test::Unit::TestCase class. The AttackTest class inherits the TestCase class. Inheriting from this class provides the functionality to run any method defined in this class whose name begins with test. This is very similar to how you would do this in just about any unit testing tool in just about any language. Here’s an example in Java with JUnit:

Here’s an example in C# with NUnit:

Here’s an example using PyUnit in Python:

As you can see, while each language will obviously have its own particular constructs, the overall approach is the same. To run our test file, you run the following command at your command line:

ruby attack_test.rb

When this command completes, you see some output. You’ll probably see something like this:

Loaded suite /Projects/attack_test
Started
.
Finished in 0.000000 seconds.

1 tests, 0 assertions, 0 failures, 0 errors, 0 skips

It’s easy to miss the fact that a test ran. The first line under the “Started” text is nothing more than a period. This is Test::Unit’s way of indicating that it ran a test and the test passed. If the test had failed, that period would have been replaced with an F. If the test had encountered an error, the period would have been replaced with an E. You can also see some statistics on what happened. In the above output we see that there was one test run, and there were zero assertions, zero failures, zero errors, and no tests were skipped. This would all be great if my test actually did anything. But it doesn’t. That’s what the zero assertions means: nothing was asserted and so nothing was tested. Let’s fix that.

It’s all pretty much the same: The above description of unit test output is what you will find in just about any unit testing tool. There are general conventions that are followed by just about all of these tools. So the good news is that if you learn one, you’ve pretty much learned them all, at least at a conceptual level.

Let’s change the test method so it looks like this:
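A sketch of how the changed test might read, based on the description in this post. Note that no Opponent class exists anywhere at this point, so running the file as-is is expected to end in an error rather than a pass:

```ruby
# attack_test.rb
require 'test/unit'

class AttackTest < Test::Unit::TestCase
  def test_fight_result
    # Opponent has not been written yet, so this line is expected
    # to raise a NameError when the test runs.
    assert Opponent.still_fighting?
  end
end
```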

The assert method in the test makes an assertion that the argument passed to it evaluates to true. This test passes given anything that’s not false or nil. When the assertion fails, it fails the test and raises an exception. In case you’re not sure what that line is saying, it’s asserting that an Opponent class responds to a method called still_fighting? and that the method does not return a false or a nil value.

Were you to run this test right now, you’d see something like this:

Loaded suite /Projects/attack_test
Started
E
Finished in 0.000000 seconds.

  1) Error:
NameError: uninitialized constant AttackTest::Opponent

Notice the E under “Started”? This is not a test failure. This is an error. Essentially the developer is being told that the code can’t even run. This is how some developers practice TDD. The developer who wrote this test would presumably have known that there was no Opponent class yet. And if there wasn’t an Opponent class, there certainly could not have been a still_fighting? method on that class. What the developer may have been doing here is writing how they would like the code to respond. They would like to be able to ask an Opponent if it was still fighting.

So now create a file called opponent.rb and put the following in it:
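A minimal sketch of that first version of the file, assuming the simplest thing that could make the test pass: a class method that always returns true.

```ruby
# opponent.rb
class Opponent
  # The "self." prefix makes this a class method,
  # so it is called as Opponent.still_fighting?
  def self.still_fighting?
    true   # hard-coded for now
  end
end
```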

In order to allow your unit test to find this file, you have to make the following changes at the top of your attack_test.rb file:

Here I’ve required the opponent.rb file and I’ve used that strange looking construct on the top to make sure that Ruby looks in the current directory for any files I reference.

Finding Source Code: The above change showcases what is probably one of the harder areas for testers who want to practice with these concepts. Each language has its own way to recognize where bits of source code are. C#, for example, uses assemblies that you have to reference. Java uses packages. You do have to know how to set up source code in the respective language in order to get everything to work.

Now run your test again. Everything should pass. In this case, however, everything passes because the method still_fighting always returns true. It’s impossible for this test to ever fail, which means it’s not really a good test. A developer would also tell you that this test doesn’t make a lot of sense because it’s calling the method on the class rather than on an object of that class. If practicing TDD, the developer would not start refactoring the code in opponent.rb to change this but would rather refine the expectations of the test. Change the test_fight_result method in attack_test.rb to look like this:
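One way the revised test might read, assuming a constructor that takes no arguments at this stage. Against an Opponent that only has a class-level still_fighting? method, this is expected to error:

```ruby
# attack_test.rb
require 'test/unit'

class AttackTest < Test::Unit::TestCase
  def test_fight_result
    goblin = Opponent.new          # a specific opponent, not the class
    assert goblin.still_fighting?  # now asked of the object
  end
end
```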

Running this test will cause an error. Not a test failure, but yet another error. Sheesh, now what? Well, the problem is that the still_fighting method is now being called on a particular object of the Opponent class (goblin) rather than on the class (Opponent) itself. This inability of the test to run is a good thing! It tells the developer they made a change in how they expected the logic to work. This lets them verify that kind of decision. In this case, the still_fighting method was defined with a “self.” in front of it. That makes it a class method, meaning a method that can be called on the class but not on objects of that class. The developer would probably refactor the opponent.rb class as follows:
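A sketch of what that refactored file would presumably look like, with still_fighting? now an instance method:

```ruby
# opponent.rb
class Opponent
  # No "self." prefix: this is an instance method,
  # called on an object such as Opponent.new
  def still_fighting?
    true   # still hard-coded
  end
end
```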

The only change here is removing “self.” from the front of the still_fighting method.

I should probably note that a developer who was not practicing “true” TDD may have just refactored the opponent.rb file first and then updated the test. While that’s certainly one way to go about it and while it would have led to the same result in this case, practitioners of TDD say this sort of violates the whole idea of test first.

There is still the problem of this being a useless test since it always returns true. However, note what all of this futzing around did: it allowed the developer to start thinking about how opponent objects must be communicated with. One such way is asking the object if it’s still fighting. So let’s take this further. The game will have hit points. If an opponent is attacked, they will lose a certain number of hit points. Rather than code up the logic, let’s go all test-first on this thing and add another test:

The developer has made a few decisions here. Apparently an opponent can be created and given a class or race designation. Just as we could ask if an object was still_fighting, now we can set the hit_points for an object. We can also tell the opponent that it was attacked and pass it a value that would seem to indicate how many hit points the attack was worth. We know this test would error because none of that code logic is defined. While the test would error, it wouldn’t fail and, thus, this actually isn’t a test yet because we haven’t said what we expect to happen. So let’s add one more line:

That assert_equal statement is saying that I expect the opponent’s hit points (which were 10) to be lowered (to 5) after the attack. Even further, you might notice that this also demands that the hit points can be read as well as set. Now I have a test! Now code can be written that makes that test pass. Let’s just say the developer does a lot of work and the opponent.rb file ends up looking like this:
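Here’s a sketch of one implementation that would satisfy the test as described. The method name attacked and the parameter name type are my assumptions for illustration; hit_points and still_fighting? come from the discussion above.

```ruby
class Opponent
  def initialize(type)   # an opponent is now created with a type or race
    @type = type
  end

  # Lets hit points be both set and read.
  attr_accessor :hit_points

  def still_fighting?
    true                 # still hard-coded
  end

  # Hypothetical name: lowers hit points by the damage of an attack.
  def attacked(damage)
    @hit_points -= damage
  end
end
```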

If you run the test — it fails! What?!? Here’s the output you’ll probably see:

Finished in 0.000000 seconds.

  1) Error:
ArgumentError: wrong number of arguments (0 for 1)
    C:/Terminus/Projects/opponent.rb:2:in `initialize'

Are you kidding me? Another error? Reading the output, this problem occurred in our first test — test_fight_result — and not the one we were just working on. A developer would now have to realize that the latest change made it so that an opponent must be created with a type or race specified. Our original test didn’t do that. The developer could update the test:
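As a sketch, the whole test file might now look something like the following. The Opponent class is inlined only so that this runs as a single file, and the names attacked and test_hit_points_reduced are hypothetical ones I picked for illustration.

```ruby
require 'test/unit'

# Inlined stand-in for opponent.rb so the sketch is self-contained.
class Opponent
  attr_accessor :hit_points

  def initialize(type)
    @type = type
  end

  def still_fighting?
    true
  end

  def attacked(damage)   # hypothetical method name
    @hit_points -= damage
  end
end

class AttackTest < Test::Unit::TestCase
  def test_fight_result
    goblin = Opponent.new("Goblin")   # now supplies the required type
    assert goblin.still_fighting?
  end

  def test_hit_points_reduced         # hypothetical test name
    orc = Opponent.new("Orc")
    orc.hit_points = 10
    orc.attacked(5)
    assert_equal 5, orc.hit_points
  end
end
```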

Now everything works. Okay, great for the developer. You’re a tester. So what does this have to do with you? Well, the important part here was not the implementation (in opponent.rb) but the intent (in attack_test.rb). The developer was essentially indicating how the application should work. In this case, subtracting a number of hit points probably seems like a pretty safe thing for the developer to do. But imagine more complex applications where the decisions may not be quite so simple. Conversations with developers about how they are planning to unit test can be helpful. If nothing else, this can help you understand what areas of the application are tested behind the scenes and what are not.

Still, though, this code stuff is not necessarily where many testers spend their time. (Although I would argue the industry is changing on that score, but that’s a discussion for another time.) TDD, while ostensibly being a generic term that refers to letting tests drive development, has tended to be relegated to the code-based unit testing arena, where many testers simply don’t live. At the very least they don’t live there as much as the developers do.

Enter an approach like BDD. BDD is similar to TDD, but the tests for BDD are written in an easier-to-understand language so that developers and pretty much anyone else can clearly understand what is being tested. I’ll cover that in a bit but let’s talk about what you’ve seen so far.

The main thing I wanted you to see here is that, as a tester, you can easily learn the practice of unit testing. If you followed along, you were able to write your own unit test. I realize that just copying a bunch of code that I wrote is not knowing how to code necessarily. However, that really wasn’t the intent here. The intent was to show that good testing practice does have to be applied to unit testing as well. This is testing that takes place very near the time of coding and, in fact, lives side by side with the code. In fact, these are tests expressed as code! A tester and a developer could certainly work together to say what the tests should do and consider other situations.

For example, wearing my tester hat: what if I set hit points on an opponent to 0? Or to a negative number? Should there be a highest number that hit points can be set to? What if I have an attack value of 5 but the opponent only has 3 hit points? Do different races have different “tolerances” for attacks? What’s with that still_fighting thing? It sounds like it’s meant to indicate whether an opponent can still fight after an attack, but clearly there’s nothing being done to enforce that. Should there be? What else can cause hit points to go down? Can anything cause them to go up? Do opponents who lose points slowly “regenerate” them in some fashion?

Code-based unit testing is just one of many techniques in a tester’s arsenal. It’s one that requires close collaboration with developers and it’s also one that can allow you a more nuanced view of test coverage. Code-based unit tests can even serve as a form of communication mechanism once the code is explained to those who don’t know how to read it. So now let’s talk about tests that don’t have to be explained as much because while they are still close to the code, they are written at a different level of abstraction.

When you start getting into BDD tools, you start along the path of writing tests in a domain-specific language. So far I just talked about the code-based unit tests and how those tests live so close to the code that they are, in effect, just another representation of the code itself. BDD tools can operate at various levels that are more or less removed from the code.

Get the Gems: If you want to follow along with these next parts, beyond having Ruby installed you also need to install the RSpec and Cucumber gems. Assuming you have Ruby and RubyGems installed, the commands gem install rspec and gem install cucumber should do the trick.

The first tool I’ll look at is RSpec. RSpec is an extension of the concepts already provided by Test::Unit. In fact, you can use Test::Unit methods inside of RSpec tests if you want to. However, RSpec does try to provide an easier-to-understand syntax and I’ll use that here. You’ve already seen how Test::Unit essentially just asks you to write test methods on test classes. Those test methods call assertion methods. With RSpec, the code you write is referred to as a spec as opposed to a test class. The spec contains examples instead of tests. (They are tests but they are just referred to as examples.)

So go ahead and create a file called opponent_spec.rb. Put the following in it:

Here you describe the Opponent class and write an example for it, declaring that an Opponent is still_fighting. The describe block contains tests (examples) that describe the behavior of Opponent. In this example, whenever you call still_fighting?, the result should be true. Here the “should” serves a similar purpose to assert, which is to assert that its object matches the arguments passed to it. If the outcome is not what you say it should be, then RSpec raises an error and goes no further with that example.

Let’s add the hit points test:

Okay so let’s compare the two versions here. Here was the Test::Unit version:

Here’s the RSpec version:

Hmmm. They sort of look pretty much alike, don’t they? In fact, they’re almost identical except for a few structural differences. Those few structural differences are, however, enough for some people. What they like is how the intent is made a bit clearer by the “describe” and “it” blocks. Let’s break down just the salient parts of each test and see how they would look for discussion purposes. Abstracting away a lot of stuff, here’s the Test::Unit breakdown:


And here’s the RSpec breakdown:

  "is still fighting"
  "can lose hit points"

Incidentally, other tools would be very similar to the RSpec approach. A tool like Yeti (for Python) would have tests like this:

Similar frameworks for Java and .NET are often not to be found but what you will find is that developers may utilize RSpec for .NET with IronRuby or for Java with JRuby. Again, as before, I just want you to see that once you learn one of these tools, you tend to learn most of them because they follow similar patterns and structures.

Okay, so what you can see is that a tool like RSpec, much like Test::Unit, is still sitting fairly close to the code. When you read the tests, you’re still reading code. So now let’s consider a further evolution. For a time (back around 2008) there was a tool called RSpec Stories, and what I would have done with that is write one of my tests like this:

Let’s pull out just the bits we want for discussion:

Scenario "opponent can lose hit points"
Given "An Orc"
And "the hit points are", 10
When "an attack causes damage of", 5
Then "hit points are reduced to", 5

That actually reads pretty nice, huh? Well, RSpec Stories eventually became Cucumber and what you have with that is this:

Scenario: Opponents can lose hit points.
  Given there is an Orc
  And its hit points are 10
  When an attack causes 5 points of damage
  Then the hit points are reduced to 5

Look at that, huh? No code-like stuff to be found. What you see here is known as a scenario in Cucumber. Under the scenario’s title, the remainder of the lines are called steps. Each step is prefaced with a clause (Given, When, etc). These steps are read by Cucumber’s parser and matched to what Cucumber refers to as step definitions. This is where the logic is kept that turns those English statements into actual working code. In Cucumber, scenarios are found inside a feature, which is a collection of related scenarios. So the above scenario may fit into a feature that is all about the hit point system in the game.

While it may not be obvious from the Cucumber example, you can see from the RSpec Stories example that Given, And, When, and Then are methods. These are the methods that Cucumber looks for to indicate that a line is a step as opposed to just some text that is in the file but is not executable. In many ways these clauses are really no different than how RSpec used “describe” and “it” as methods.

So let’s say you had a file called attack.feature with the following in the file:

Feature: The game engine has a hit point system.

Scenario: Opponents can lose hit points.
  Given there is an Orc
  And its hit points are 10
  When an attack causes 5 points of damage
  Then the hit points are reduced to 5

You set the stage by using the Given steps, play out the scenario using When steps, and ensure that the outcome is as you expected by using Then steps. The And word is used when you want a step to be defined in the same way as the previous step. In this case, the And is acting like another Given.

This natural language test can now be turned into an executable test. I don’t want to necessarily go through all that here because to effectively use Cucumber takes setting up a bit of a structure. If you are curious, I do a working example in my post Testing with Cucumber, Capybara and Selenium. But for now just know that behind the scenes, those statements would be translated by the following code:

Steps are defined by using regular expressions, which are used when you wish to match strings. You match a step in the feature with a step definition by putting the text after one of the clauses into a regular expression. With some steps you’ll see capture groups inside the regular expression. Anything in parentheses, like the (\d+) in the above steps, is a capture group, so called because it captures whatever it finds in the string at that point. Whatever is matched is stored in a variable which can then be used as part of the step’s operations.
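To see the capture group idea on its own, here’s a small plain-Ruby sketch that doesn’t need Cucumber at all. The pattern is just my guess at the kind of expression a step definition might use for the hit points step; the variable names are made up for illustration.

```ruby
# The step text as it appears after the "And" keyword in the feature file.
step_text = "its hit points are 10"

# (\d+) is the capture group: it grabs one or more digits at that position.
pattern = /^its hit points are (\d+)$/

match = pattern.match(step_text)

# Captured text is always a string, so convert it before doing arithmetic.
hit_points = match[1].to_i

puts hit_points   # prints 10
```

Cucumber does essentially this matching for you and hands the captured value to the block of the matching step definition.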

What I wanted you to see from this example is that Cucumber allows you to write tests for your code in syntax that can be understood by developers, customers, business users, and testers. People can write the test scenarios even if there is no underlying implementation yet. This allows you to start testing at a system and integration level, largely at the same time.

What you should probably notice is that, ultimately, what’s happening behind the scenes in all these cases — Test::Unit, RSpec, and Cucumber — looks remarkably similar from a code perspective. What you hopefully also saw was that many of these tools ultimately look remarkably similar and, if you play around, you’ll find they operate quite similarly as well. As one example, here’s the same “behind the scenes” logic in JBehave (which is a Cucumber-like tool for Java):

Here’s that same thing but using the Groovy scripting language with Java:

As a final note I just want to make it clear that all of my examples here were essentially dealing with a form of code-based unit or code-based integration testing. The code logic of these tests was more acting to test out the application behind the scenes rather than drive the application itself. Again, see my post on Testing with Cucumber, Capybara and Selenium, which shows how you can use a tool like Cucumber as a front-end for tests that drive a browser. You can see Using RSpec and Capybara Without Rails for a similar example with RSpec. With those examples you are dealing more with system testing. In all cases, both in this post and those others, testing was a form of design activity done in tandem with, or even before, development of the actual application code.

Hopefully this somewhat long post was helpful to testers who may find themselves having to broaden their definition of testing a bit and learn some tools in order to support that broadened definition.


This article was written by Jeff Nyman

