Learning Capybara, Part 1

Capybara is rapidly becoming the go-to test tool among Rubyists. I will NOT be covering this tool in the context of the Rails platform. A lot of people see the Ruby test ecosystem as existing largely to support Rails, and that’s simply not true. It is true that Capybara, in particular, was forged in and around a set of tools that exist largely to support testing Rails applications. What I’ll show you here is that any web application can be tested using Capybara.

Capybara is a Ruby gem and requires a Ruby version of at least 1.9.3. For purposes of this post, I’ll assume you have no trouble getting a working Ruby installation on your system of choice. Feel free to check out my page on setting up Ruby-based automation.

The Application to Practice With

It’s helpful to have something to test against. I’m going to be using my Planetary Weight Calculator page of my Dialogic application. You can feel free to grab a copy from GitHub if you want and run it locally. All of the URLs I post here will show a localhost, assuming you are running it locally. Simply replace that with the remote URL if such is your choice.

The Capybara Session

Capybara provides a domain-specific language (DSL) for test automation. This DSL is designed to extend the mostly human-readable BDD style of frameworks, such as Cucumber and RSpec, into the automation code itself. Generally what you’ll do is mix this DSL into your own code, whether that be modules or classes. However, before getting into that, I’ll show you that you can use a Capybara session “directly.” What that means is that you can use a Capybara session object as a way to avoid mixing Capybara into your own code. The idea here is that you create a new instance of the session object and then call the Capybara DSL methods on that instance.

To follow along, create a file called capy-session.rb and let’s get started.

The Driver

We need to tell Capybara how it is going to “drive” our application.
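A minimal sketch of how capy-session.rb might begin, assuming the capybara, selenium-webdriver, and rspec-expectations gems are installed:

```ruby
require 'capybara'
require 'rspec/expectations'

# Make RSpec's expect(...) and its matchers available at the top level of the script.
include RSpec::Matchers

# Create a session whose driver is Selenium WebDriver (a real browser).
session = Capybara::Session.new(:selenium)
```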

This is a fairly simple start. Obviously I must require Capybara in order to actually use it. The RSpec::Matchers module has also been included, which allows you to utilize standard RSpec matchers. Do note that you don’t have to use RSpec; you could choose a different way to assert behavior. The key thing to note is that I create a new session instance and I set the driver of that instance to Selenium.

But what does this mean?

In a very real sense, Capybara is simply acting as a translator. This translator is used to translate an expressive command, provided by the Capybara API, into the API of a given driver. Capybara is thus acting as an abstraction layer over the driver. This allows you to talk to any compatible driver in a much more friendly way than the driver often allows.
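To make the translator idea a bit more concrete, here is a small pure-Ruby sketch. The class names (FakeBrowserDriver, TinySession), the driver's method names, and the URL are all hypothetical, and this is not how Capybara is actually implemented. It only illustrates a friendly command being turned into lower-level driver calls.

```ruby
# A hypothetical low-level driver with a verbose, browser-centric API.
class FakeBrowserDriver
  attr_reader :log

  def initialize
    @log = []
  end

  def navigate_to(url)
    @log << "GET #{url}"
  end

  def locate_by_id(id)
    @log << "FIND id=#{id}"
    id
  end

  def send_keys(element, text)
    @log << "TYPE #{text} INTO #{element}"
  end
end

# A hypothetical session that exposes expressive commands and translates
# each one into calls on whichever driver it was given.
class TinySession
  def initialize(driver)
    @driver = driver
  end

  def visit(url)
    @driver.navigate_to(url)
  end

  def fill_in(field_id, with:)
    element = @driver.locate_by_id(field_id)
    @driver.send_keys(element, with)
  end
end

driver  = FakeBrowserDriver.new
session = TinySession.new(driver)
session.visit('http://localhost/example')
session.fill_in('wt', with: '200')

# Show the low-level calls that the two friendly commands produced.
puts driver.log
```

Swapping in a different driver object that answers the same three methods would leave the TinySession calls untouched, which is the point of the abstraction.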

A driver will be associated with an automation library. So, following the above train of thought, Capybara is simply an API that provides a layer of abstraction on top of your actual automation library. What’s interesting is that Capybara assumes by default that you are testing a Rack app. So the automation library that Capybara assumes it will use is one known as Rack::Test. This particular library involves no HTTP and instead directly accesses controller classes in MVC-style Rack applications, such as those created via Rails, Sinatra or Padrino.

I don’t want to get too sidetracked, but just in case the context of a “Rack app” is not clear, Rack is a library within the Ruby web application stack and serves as the basis for many, if not all, Ruby-based web frameworks and web servers. Rack is an abstraction layer that sits between any web application that wants to talk over HTTP and the web server that implements that communication. This means Rack falls into that somewhat nebulous realm of “middleware.”
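If it helps, the core Rack contract is small enough to sketch in plain Ruby: an application is any object that responds to call, takes an environment hash, and returns a status, headers, and a body. The path and the hand-built environment hash below are illustrative assumptions, since a real Rack environment carries many more keys. Calling the app directly like this, with no server in between, is roughly the spirit of testing without HTTP.

```ruby
# A Rack-style application: responds to #call, receives an environment
# hash, and returns [status, headers, body].
app = lambda do |env|
  body = "Hello from #{env['PATH_INFO']}"
  [200, { 'content-type' => 'text/plain' }, [body]]
end

# Call the app directly, with no web server and no HTTP involved.
# This env hash is a simplified stand-in for a real Rack environment.
status, headers, body = app.call({ 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/example' })

puts status
puts headers['content-type']
puts body.join
```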

Anyway, let’s get back on track. What I’ve done with the above source is say that the driver — and thus the automation library — to be used is Selenium. This refers to Selenium WebDriver, which involves a full-stack browser test communication mechanism.

So just keep in mind that Capybara bundles two drivers: Rack::Test and Selenium WebDriver. However, Capybara is designed in such a way that it’s particularly easy for developers to implement other drivers. These drivers sort of “plug in” to Capybara, thus allowing for an ecosystem of sorts.


Capybara brings two key ingredients to test automation: expressive and concise code via an elegant domain-specific language and the ability to write a script once and have it run on multiple drivers such as Selenium WebDriver or Rack::Test. I’m going to show you that DSL as we move through the example here. One particular component of that DSL is a simple way to navigate to pages:

The visit() method does just what it says: it navigates to the provided URL and thus “visits the page.” One thing I should note is that Selenium WebDriver has a built-in mechanism to wait for page loads in the browser. What this means is that (generally) you don’t have to put logic in your script regarding a check as to whether the page actually loaded. Another thing I should note, however, is that this does not include waiting for asynchronous JavaScript that may be running on the page. Capybara does have mechanisms in place to cover these situations as well, but those are not relevant for us right now.

Now let’s do some checking. Specifically, let’s do something really simple and focus on the title of the page.

Here I’m using the Capybara “Query” API directly. Capybara provides a whole set of methods for querying the page under test and returning a specific value or boolean values. Consider the specific things I’m doing:


Capybara provides a title property that returns the title of the current page. Capybara also provides a predicate method to check if the page has a particular title.

What I do above is wrap those checks in expectation blocks provided by RSpec. If you don’t know much about RSpec, there is plenty online (see http://rspec.info/) but essentially the matchers provide semantically friendly ways of asserting that the state of something is what you expect it to be. The key difference from traditional assertions is that RSpec matchers raise exceptions when conditions are not met, as opposed to just returning false. RSpec also has a way of turning predicate methods (like has_title?) into predicate matchers (like have_title). The predicate matcher can be used to directly check an expectation, rather than getting a true/false value and then wrapping that within an expectation. You can see examples of both approaches in the above source.
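The difference between a predicate method and a predicate matcher can be sketched in plain Ruby. Everything here (FakePage, ExpectationFailed, expect_to, the page title) is hypothetical and far simpler than what RSpec really does. The sketch only shows a have_ name being mapped onto a has_...? method and a failure being raised rather than returned.

```ruby
# A hypothetical page object with a title and a predicate method.
class FakePage
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Predicate method: answers true or false and never raises.
  def has_title?(expected)
    title == expected
  end
end

class ExpectationFailed < StandardError; end

# Hypothetical matcher-style helper: derives the predicate name from a
# "have_..." name and raises when the predicate returns false.
def expect_to(subject, matcher_name, *args)
  predicate = matcher_name.to_s.sub(/\Ahave_/, 'has_') + '?'
  return true if subject.public_send(predicate, *args)

  raise ExpectationFailed, "expected #{subject.class} to #{matcher_name} #{args.inspect}"
end

page = FakePage.new('Example Page')

puts page.has_title?('Wrong Title')           # plain predicate, prints false
expect_to(page, :have_title, 'Example Page')  # passes silently

begin
  expect_to(page, :have_title, 'Wrong Title')
rescue ExpectationFailed => e
  puts "raised: #{e.message}"
end
```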

By the way, to keep this example concise at each point I’m going to be replacing some code rather than simply adding code. You can do whatever you prefer, of course, but because I’m doing this I’ll tend to show the full code example each time with the previous elements removed and new elements added.

You can use other query elements, like this:

As a note on coding style, you could just wrap the checks directly in an expectation block:

In that case, however, you’ll probably want to use those predicate matchers I just talked about:

The have_content and have_selector matchers have a default wait mechanism built into them. This is useful because if the content you are waiting for happens to be loaded via asynchronous JavaScript — and thus not necessarily part of the initial page load — Capybara will retry the check for a configurable amount of time to see if the element eventually exists. Again, I’ll cover asynchronous JavaScript later; most likely in a different post.
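The general shape of such a retrying check can be sketched in plain Ruby. The wait_until helper and its timeout and interval parameters are hypothetical names for illustration, not Capybara’s real implementation.

```ruby
# Hypothetical helper: re-run the block until it returns a truthy value
# or the timeout elapses.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Simulate content that only "appears" on the third check, the way an
# element inserted by asynchronous JavaScript might.
attempts = 0
found = wait_until(timeout: 1.0) do
  attempts += 1
  attempts >= 3
end

puts "found=#{found} after #{attempts} attempts"
```

A check that never succeeds simply returns false once the timeout has passed, which is where a matcher would report a failure.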

That have_selector bit of code should stand out a bit because a “selector” is a CSS term for a pattern that identifies specific DOM elements. The Document Object Model (DOM) is a tree-like structure that browsers construct, and store in memory, when parsing an HTML page. CSS selectors and XPath queries allow you to search that structure to find content. Capybara relies entirely on such selectors to be able to locate content within web pages. Capybara uses CSS as the default selector type. This means that when you use the Capybara API and the Capybara DSL with CSS selectors, you will not need to specify what selector type to use.

So the argument I passed to the have_selector method is ‘#wt’, which means “find the element that has an id of ‘wt’”. It’s important to include the ‘#’ symbol because, in CSS, that stands for “id”. Without that ‘#’ symbol, Capybara would be looking for an HTML element called <wt>, and that isn’t going to exist.

Finding Elements

Let’s refine our example a bit, using another aspect of the Capybara API:

Here I’m using the find() method to have Capybara look for a selector that has an id of ‘calculate’. Once that element is found I’m returning the value of it. In this case, the element that corresponds to this selector is the “Calculate” button on the page. Once again, as per Capybara’s default, I’m using a CSS selector. If you wanted to use an XPath selector, you could simply pass :xpath as the first argument to the method and then the XPath expression as the second argument. So you could replace the find() method like this:

What you see there is that if you wish to use XPath selectors, you can explicitly state this when calling methods. However, let’s say you always want to use XPath over CSS. Having to type :xpath for each method can get annoying. So, as an alternative, you can set the selector globally:
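A minimal sketch of that global setting, assuming the capybara gem is installed:

```ruby
require 'capybara'

# Treat selector strings as XPath unless a call explicitly says otherwise.
Capybara.default_selector = :xpath
```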

Now you don’t have to specify the :xpath part each time. As you might imagine, even if you are using XPath as your default selector, you can now reverse the logic and add :css to your method call, like this:

Manipulating Elements

Now that I have a reference to an element, I can call actions upon that element. This works because a finder method, like find(), returns an actual element instance. Were you to inspect the above element variable, you would find it is an instance of Capybara::Node::Element. So for now just understand that find() is a method that takes an XPath expression or a CSS selector and returns a Capybara::Node::Element on which we can invoke an action.

So let’s say I want to enter a weight value into the form and click the calculate button. First, I’m going to remove the default_selector of xpath. I tend to use CSS. Now let’s deal with some elements on this page. Specifically, we’re going to enter a weight value of 200 and click the button.

If you run the script, you should see that working. This is demonstrating a very common task that you are likely to want to automate: form entry and submission. Capybara provides a lot of user-friendly API methods to do just this. Here I use the fill_in() method. Now, something you may notice: I pass in the ‘wt’ information to the fill_in() method but I don’t include the ‘#’ to indicate this is an id. A lot of Capybara’s specific element interaction methods use a “best guess” strategy when you tell them to find something on the page. Those methods look at various attributes on DOM elements to try to find the one you asked for. So, for example, when locating fields that can accept text input, Capybara will use one of the following to find those fields in the DOM:

  • The id attribute of the input element.
  • The name attribute of the input element.
  • A related label element for a given input element.
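The lookup order above can be pictured with a small pure-Ruby sketch. The Field struct, the locate_field helper, and the name and label values are hypothetical stand-ins (only the ‘wt’ id comes from the page under test), and Capybara’s real matching is more involved than this.

```ruby
# Hypothetical in-memory stand-ins for text inputs on a page.
Field = Struct.new(:id, :name, :label)

FIELDS = [
  Field.new('wt', 'weight', 'Your weight'),
  Field.new('planet', 'planet_name', 'Planet')
].freeze

# Return the first field whose id, name, or label text equals the locator.
def locate_field(fields, locator)
  fields.find do |field|
    [field.id, field.name, field.label].include?(locator)
  end
end

# The same field can be reached by id, by name, or by label text.
puts locate_field(FIELDS, 'wt').id
puts locate_field(FIELDS, 'weight').id
puts locate_field(FIELDS, 'Your weight').id
```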

After filling in my form field value, I then use the click_on() method. This is a generic method for clicking on any object that is clickable. As with fill_in(), Capybara uses a “best guess” strategy for much of the API when attempting to locate elements. In the case of links and buttons, Capybara looks at the following element properties when attempting to locate the element to click on:

  • The id attribute of the anchor, button, or input tag.
  • The title attribute of the anchor, button, or input tag.
  • The text within the anchor, button, or input tag.
  • The value attribute of the input element, when its type is one of ‘button’, ‘reset’, ‘submit’, or ‘image’.
  • The alt attribute where an image is used as an anchor or input.

Now let’s check the result of our action:

Notice here that I’m using a find_field() method. This is yet another finder method, built on the find() method, but with the syntax sugared a bit to make it more expressive. This find_field() finder searches for form fields by the related label element, or by the name/id attribute.

If you run this, you’ll find the expectations are being met.

Run on Other Browsers

If you’ve been running these examples, you’ll certainly see that Selenium uses Firefox by default. You can make your script run against Chrome by registering a driver. This is how Capybara provides different drivers for different libraries. Let’s add one for Chrome:
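A minimal sketch of such a registration, assuming the capybara and selenium-webdriver gems are installed and ChromeDriver is on your path. The driver name :chrome is simply a label chosen here.

```ruby
require 'capybara'

# Register a driver named :chrome that runs Selenium against Chrome.
Capybara.register_driver :chrome do |app|
  Capybara::Selenium::Driver.new(app, browser: :chrome)
end

# Create a session that uses the newly registered driver.
session = Capybara::Session.new(:chrome)
```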

You can do something similar with Internet Explorer:

If you want headless execution, you can use PhantomJS:

Do note that in all cases you must have the appropriate browser drivers installed. Specifically, you need to get the latest versions of ChromeDriver, the IE Driver Server, and PhantomJS.

Further posts will explore the API and DSL of Capybara in more depth.


About Jeff Nyman

Anything I put here is an approximation of the truth. You're getting a particular view of myself ... and it's the view I'm choosing to present to you. If you've never met me before in person, please realize I'm not the same in person as I am in writing. That's because I can only put part of myself down into words. If you have met me before in person then I'd ask you to consider that the view you've formed that way and the view you come to by reading what I say here may, in fact, both be true. I'd advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.