In previous posts about the integration / integrated distinction (see part 1 and part 2 of that series), I talked about how there is in fact a distinction and provided some rationale for why this distinction currently matters. So now let’s talk a little “around” the concept of integration — not integrated — and see where this takes us.
Why does it even matter if we have different terms for different testing? And why does it even matter if we have a few terms conflated? So what if “integration” and “integrated” are often treated the same?
Those are valid questions. One of the core issues has to do with the benefits and costs of testing. It’s always necessary to establish a positive balance between the benefits of testing and the cost of creating and maintaining tests.
It’s a truism that we need to figure out which features to test and not spend too much effort on tests that don’t offer much value. But it’s also a truism that we have to decide the right level of abstraction to focus on for tests. In all cases, your effort devoted to testing should provide meaningful benefit without excessive cost.
What this means, practically speaking, is that we need to focus on testing at those places where we are most likely to make mistakes. We want testing to find those mistakes as quickly as possible. There was a good line from the book Rails 4 Test Prescriptions. Author Noel Rappin said:
One reason why it is sometimes hard to pin down the benefit of testing is that good testing often just feels like you are doing a really good job programming.
The implication there, of course, is that you are testing very close to the actual code. And, further, you are using code to do the brunt of the testing. As was probably clear from my modern testing posts, I do believe in the idea of production and test code being the ultimate specifications.
But is that truly workable? Can code act as your primary specification? Beyond the fact that it already does, let’s consider another dimension to this.
Clearly the more testable something is, the more likely we are able to write tests for and against it.
The first step to improving testability in an application is to establish a natural feedback loop between application code and test code, using signals from testing to improve the application code. The energy devoted to writing complex tests for untestable code should be channeled into making the code itself more testable, allowing simpler tests to be written.
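To make that feedback loop concrete, here is a minimal sketch of using “test pain” as a design signal. Everything in it — the `apply_discount` function and its holiday-discount rule — is hypothetical, invented purely to illustrate the refactoring, not anything from a real codebase:

```python
import datetime

# Hard to test: the logic reaches out to the system clock itself,
# so a test would have to control real time to exercise both branches.
def apply_discount_untestable(price):
    today = datetime.date.today()
    if today.month == 12:  # hypothetical holiday discount
        return price * 0.9
    return price

# Easier to test: the collaborator (the date) is injected, so a
# simple unit test can pass in any date it likes. The test that was
# hard to write told us the design needed to change.
def apply_discount(price, today):
    if today.month == 12:
        return price * 0.9
    return price

# The simpler tests the refactoring allows:
assert apply_discount(100, datetime.date(2020, 12, 1)) == 90.0
assert apply_discount(100, datetime.date(2020, 6, 1)) == 100
```

The effort that would have gone into faking the system clock for the first version instead went into a one-line design change that makes the second version trivially testable.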
Note that the above thought, however, is different in some ways when you consider a lot of UI-based automation. With such tests, such as when you are driving a browser or controlling a mobile app, you are not dealing with tests at the code level directly, but rather with the implementation that the code ultimately exposes via a platform. But testability is still a concern in this context as well.
We can define testability as the degree to which a system can be verified to work as expected. At the smallest level, closest to the individual lines of code that make up our software, we are concerned with whether methods return the values we expect. At higher levels of abstraction, we are concerned with behaviors such as error handling, performance, and the correctness of entire end-to-end features.
Developers and testers often struggle to automate manual tests for a system with low testability. This common pitfall leads to high-cost, low-value tests and a system whose architecture and organization is not improved by the testing efforts. This is, in large part, what the integrated test scam is all about in my opinion.
This also brings in another dimension.
Simplicity and Faithfulness
So going with the discussion points so far, an important thing is to understand the costs and benefits of testing. The second part of this effort is focused on adjusting where you are on the continuum between production faithfulness and test simplicity as your system evolves, along with the goals it serves.
The production faithfulness part is important. Most of us know that you want test environments that are as close as possible to production environments. But as another point that most of us know, there are always tradeoffs in this and one hundred percent fidelity is usually impossible. Yet this distinction between being faithful to production and being as simple as possible in testing has some ramifications if we consider the distinction between “unit” and “integrated.”
Again, keep in mind that “integration” would, by what I’ve talked about previously, sit between “unit” and “integrated”. Further, “integration” is much more aligned with unit testing, and thus code-based testing, than it is with higher-level UI testing.
Let’s consider this breakdown of points:
- Test simplicity is best focused on in unit tests.
- Production faithfulness is best sought with integrated tests.
Would you agree with that? If so, let’s consider these further points:
- You should try to increase production faithfulness in unit tests.
- You should try to increase test simplicity in integrated tests.
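Here is a small sketch of turning those two dials. All of the classes in it (`PercentTax`, `Checkout`, `InMemoryOrders`) are hypothetical stand-ins chosen for illustration:

```python
# Dial 1: increase production faithfulness in a unit test by using a
# real collaborator rather than a stub, when it is cheap enough.
class PercentTax:
    RATE = 8  # percent; integer cents avoid floating-point noise

    def apply(self, amount_cents):
        return amount_cents + amount_cents * self.RATE // 100

class Checkout:
    def __init__(self, tax):
        self.tax = tax

    def total(self, amount_cents):
        return self.tax.apply(amount_cents)

def test_checkout_total_with_real_tax():
    # The real PercentTax is fast and deterministic, so faking it
    # would only reduce faithfulness for no gain in simplicity.
    assert Checkout(PercentTax()).total(10000) == 10800

# Dial 2: increase test simplicity in an integrated test by swapping
# a heavy dependency (a real database, say) for an in-memory
# stand-in that honors the same interface.
class InMemoryOrders:
    def __init__(self):
        self.saved = []

    def save(self, order):
        self.saved.append(order)

def test_order_is_persisted():
    orders = InMemoryOrders()
    orders.save({"amount": 10000})
    assert orders.saved == [{"amount": 10000}]

test_checkout_total_with_real_tax()
test_order_is_persisted()
```

The judgment call in each direction is the same: what does this substitution cost me in faithfulness, and what does it buy me in simplicity?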
Assuming you follow this kind of approach, then you have a formula for what you should do in between unit and integrated tests — where “integration testing” sits. Specifically, you need to be very clear about what is and what is not faithful to the production environment. This can be interesting when you consider that you are right between the direct code and the user interfaces that the code ultimately exposes once a higher level of design is put over it.
Along with being clear about faithfulness of the environment, you also have to be clear about whether the test simplicity is leading to tests that are not telling you anything. Or when the test simplicity is being compromised such that you are no longer looking at integration of components, but rather an integrated system.
The idea I’m going for there is that we should utilize unit testing principles but scale them for integrated testing. “Integration” seems to be a way we can calibrate our scale. Meaning, “integration” is a sort of a net that captures the idea of tests that aren’t quite ‘unit’ but that are expensive enough that we might not want to go full ‘integrated.’
Before we get ahead of ourselves, let’s ask what might seem a silly question depending on your experience: how much does good unit testing practice really matter?
Automated testing has evolved to include many categories of tests — for example, functional, integration, request, acceptance, and end-to-end. Testers will often consider the term “acceptance test” to be a synonym for integration test or end-to-end test. Granted, certain distinctions can — and are — made. But the point here is simply that automated testing encompasses a lot.
Along with this, sophisticated development methodologies have also emerged that are premised on automated verification, the most popular of which are TDD and BDD. Yet I think it’s critical that people not lose sight that the foundation for all of this is still the simple unit test.
Why do I say that? Consider that code with good unit tests is good code that works. You can build on such a foundation with more complex tests. You can base your entire development workflow on such a foundation. You are unlikely to get much benefit from complex tests or sophisticated development methodologies if you don’t build on a foundation of good unit tests. Further, the same factors that contribute to good unit tests also contribute, at a higher level of abstraction, to good complex tests.
Here when I say “complex” I mean tests that shade away from “integration” and into “integrated.”
Whether you are testing a single function or a complex system composed of separate services, the fundamental questions are the same. What are those questions? Some examples:
- Are the assertions/expectations clear and verifiable?
- Are the inputs and outputs precisely specified?
- Are error cases, both input and output, considered?
- Is the test logically coherent?
- Is the test readable and maintainable?
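A small example may help ground that checklist. The function under test here, `parse_quantity`, is invented for illustration; the point is how the tests answer the questions above:

```python
def parse_quantity(text):
    """Parse a quantity string like '3' into a positive int."""
    value = int(text)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value

# Inputs and expected outputs are stated precisely, and the
# assertion is clear and verifiable.
def test_parses_a_positive_integer():
    assert parse_quantity("3") == 3

# The error cases are considered explicitly, not left implicit.
def test_rejects_zero():
    try:
        parse_quantity("0")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for '0'")

test_parses_a_positive_integer()
test_rejects_zero()
```

Each test name and body makes it obvious which input produces which expected result, which is exactly the property the questions above are probing for.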
The questions can often ultimately be boiled down to considering whether a test makes it clear how its expected output relates to its input. The idea being that tests are designed so as to call attention to all input values that have a direct bearing on computing the expected result. This has a corollary which is minimizing the chances of tests providing false positives, where the test passes even though the system does not behave correctly, or false negatives, where the test fails even though the system works correctly.
Actually, I’m somewhat convinced that this notion of considering false positives/negatives is what has led to the bad thinking habit of testers referring to “positive tests” and “negative tests.” (Which I try to combat by telling testers not to be so negative … or positive.)
Ultimately what all of these questions and ideas are focusing on is simply this: Is the test providing value? Which is another way of asking: is this test more trouble than it’s worth?
So with this context established, let’s go back to those “sophisticated development methodologies” I mentioned that are, ideally, encapsulating all of the principles we’re talking about here.
TDD / BDD Intersection
Whereas TDD is concerned with tests and code, BDD is concerned with behaviors and benefits. Wait. Is that true? Well, BDD — as it is most often practiced — attempts to express the behavior of a system in plain, human language along with some justification for the benefits that the behavior provides. TDD, by contrast, is expressed in code and does not attempt to justify the value of any part of the system.
This is important. When you consider external quality, you are asking: How would a user know they are getting value from something? All the unit tests in the world won’t necessarily answer that question from a business or user standpoint. However, when you’re considering internal quality, you are asking: What are we doing to make the system maintainable, scalable, extendable, and discoverable?
BDD — ostensibly — concerns itself with ensuring that the right software is developed. Contrast this with our typical concern when writing tests, which is to ensure that the software works in the right way. Yet BDD is often treated as just some form of extension of TDD.
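The contrast can be sketched in code. The `Cart` class below is hypothetical, and the “BDD style” here is only mimicked with Given/When/Then comments and a benefit-stating test name rather than a real BDD tool:

```python
class Cart:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def count(self):
        return len(self.items)

# TDD style: exercises the code, with no statement of why the
# behavior matters to anyone.
def test_add_increments_count():
    cart = Cart()
    cart.add("book")
    assert cart.count() == 1

# BDD style: the behavior and its benefit are spelled out in plain
# language, then bound to the same assertion.
def test_shopper_sees_their_item_so_they_trust_the_cart():
    # Given an empty cart
    cart = Cart()
    # When the shopper adds a book
    cart.add("book")
    # Then the cart reflects exactly that one item
    assert cart.count() == 1

test_add_increments_count()
test_shopper_sees_their_item_so_they_trust_the_cart()
```

The assertions are identical; what differs is that the second test attempts to justify the behavior in user terms, which is the external-quality question the unit test alone does not answer.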
I have noticed what I consider some damaging behavior in teams that practice TDD: they duplicate a sizable amount of their effort by designing their objects with thorough unit or integration tests, but then go on to add a suite of integrated tests that verify a substantial amount of the same behavior.
Over time your code path coverage decreases because — and this is an important point — the complexity of your code base grows more quickly than your capacity to write enough integrated tests to cover it. This is particularly problematic when you realize that lots of bugs happen in the wiring between components.
But that’s interesting, right? Given what I’ve talked about here, is guaranteeing compatibility among services’ interfaces different than integrated testing? Whichever way you come down on answering that, I think many of us would agree that high-level test frameworks do need a way to guarantee compatibility among services’ interfaces. What I would maintain, however, is that these frameworks still need to maintain many properties where unit and integration mix.
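One shape such a guarantee can take is a consumer-driven contract check. Nothing below is a real framework API — the contract dictionary and helper are invented to show the idea: the consumer declares the response fields it relies on, and the provider verifies its output against that declaration in a fast, unit-sized test rather than a full integrated deployment.

```python
# The consumer declares the fields and types it depends on.
CONTRACT = {
    "id": int,
    "status": str,
}

def satisfies_contract(response, contract):
    """True if the response carries every field the consumer needs."""
    return all(
        key in response and isinstance(response[key], expected)
        for key, expected in contract.items()
    )

# The provider checks its real output against the consumer's
# contract. Extra fields are fine; missing or mistyped ones fail.
provider_response = {"id": 42, "status": "shipped", "extra": "ok"}
assert satisfies_contract(provider_response, CONTRACT)
assert not satisfies_contract({"id": "42"}, CONTRACT)
```

This is exactly the kind of test that sits in the middle ground: it exercises the wiring between components without standing up the whole integrated system.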
So This Leads Us Where?
I think if we can frame testing around that, then and only then does the distinction of integration and integrated begin to disappear. Or, rather, because everything must ultimately be integrated to provide business value, maybe this is where we simplify and just refer to end-to-end testing and edge-to-edge testing. Or are there other concepts out there that we can use?
What this is really doing is framing what I think is one of the key questions for testers and developers: Are there viable alternatives for testing the interaction between components without resorting to “heavier” integrated testing?
That’s a topic I plan to take up in a future post. In fact, the title of this post was meant literally in the sense of a “formal agreement between individuals or parties.” In this case, the formal agreement is the interface with unit testing on one side and integrated testing on the other.