Combination testing involves testing several variables together. As you can imagine, however, this leads to an explosion of tests. Combination testing techniques can be used to make your tests manageable. But you also have to make your strategy for determining combinations manageable. It’s the latter aspect that I’m going to talk about here.

The first critical problem of combination testing is the number of test cases. Imagine testing three variables together. That doesn’t sound so hard, right? But what about when each variable has one hundred possible values? In that case, the number of possible tests is 100 × 100 × 100. One million tests. Yikes! So reducing the number of tests is a critical priority. But how do you do that? That’s what I’ll be talking about here.

## Partitioning Is One Way…

The first step is to reduce the number of values that will be tested in each variable. The most common approach involves what’s known as **domain testing**. In other words, you partition the values of Variable 1 into subdomains and choose best representatives of the subdomains. Perhaps you can bring the number of tests of Variable 1 down to five in this way. If you can do the same for Variable 2 and Variable 3, you now only have 5 × 5 × 5 tests. That’s 125 tests in total and while that’s still a lot — keep in mind this *is* just for one set of variables — it’s still a lot less than a million.
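If you want to sanity-check that arithmetic, here's a minimal Python sketch. The function name `exhaustive_count` is mine, purely for illustration, and it assumes the variables are fully independent.

```python
from math import prod

def exhaustive_count(values_per_variable):
    """Number of tests needed to cover every combination of values."""
    return prod(values_per_variable)

before = exhaustive_count([100, 100, 100])  # every raw value of each variable
after = exhaustive_count([5, 5, 5])         # five representatives per variable
print(before, after)  # 1000000 125
```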

## Combinations Are Another Way…

There are a couple of different techniques within the broad technique of combination testing.

### All Singles

The simplest set of combination tests would ensure that you cover every value of interest of every variable. This is sometimes called **all singles**, usually contrasted with “all pairs” and “all triples”. The reason for the name is that you’re making sure that you hit every single value of every variable a single time. There is a procedure you can follow to make sure you achieve this.

Let’s consider an example based on a hypothetical trading application. One of the challenges with this kind of platform is the need to consider the different kinds of activities that can be applied to the different types of accounts that place trades. There are certain bits of functionality that you want to test on those combinations.

So let’s set some ground rules and nomenclature here.

First we’ll let V1, V2, and V3 stand for the three variables under consideration in a suite of tests. So those are the key variables in our system that we want to test for. Treat those as our test conditions. They are aspects of functionality that can take certain values.

### The Values of V1

V1 will represent the product and investment type to be tested. Let’s say that A, B, C, D, and E are the five values of interest in variable V1. Here’s the breakdown of what those values mean:

- A = Onshore EOM Redemptions/Account (Liquid)
- B = Onshore Sidepocket/Account (Liquid)
- C = Onshore FOM Redemptions/Account (Liquid)
- D = Offshore/Account (Liquid)
- E = Private Equity Fund/Account (Illiquid)

### The Values of V2

V2 is the business event being generated when a given product and investment type (V1) is being used for a trade. Let’s say that I, J, K, L, and M are the five values of interest in variable V2. Here’s the breakdown of what those values mean:

- I = Investor Subscription
- J = Investor Redemption (Redeem All)
- K = Investor Redemption (Redeem Across Transactions)
- L = Investor Redemption (Liquidating Redemption from Tranche)
- M = Investor Transfer

### The Values of V3

V3 refers to different actions that can happen with a business event. So let’s say that V, W, X, Y, and Z are the five values of interest in variable V3. Here’s the breakdown of what they mean:

- V = deleted
- W = canceled
- X = amount is changed
- Y = date is revised
- Z = status of event changes

### The Test Conditions

Just to be sure the test conditions are clear, with the above breakdown I could have a test condition like this:

V1,A --> V2,K --> V3,X

It probably helps if you table that out a bit:

Account (V1) | Business Event (V2) | Action (V3) |
---|---|---|
[A] Onshore EOM Redemptions/Account (Liquid) | [K] Investor Redemption (Redeem Across Transactions) | [X] amount is changed |

So I’ve tested the combination of V1,A with V2,K with V3,X. Great. Now I just have to do the others. Let’s talk about what that looks like, though.

### Test the Combinations

To test all combinations of these variables’ values, we would have 5 × 5 × 5 = 125 tests, as I already established earlier. The all-singles table below needs only five. It achieves “complete testing” when the criterion of completeness is that every value of every variable must appear in at least one test.

Test | Variable 1 | Variable 2 | Variable 3 |
---|---|---|---|
Test Case 1 | A (Onshore EOM) | I (Subscription) | V (deleted) |
Test Case 2 | B (Onshore Side) | J (Redeem All) | W (canceled) |
Test Case 3 | C (Onshore FOM) | K (Redeem Across Trans) | X (amount change) |
Test Case 4 | D (Offshore) | L (Liquidating Redemption) | Y (revised date) |
Test Case 5 | E (Private Equity) | M (Transfer) | Z (status change) |

This is pretty simple, right? If you read down the variable columns, you can see that I simply used each of the possible values for the variable at least once.
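The procedure is mechanical enough to sketch in a few lines of Python. This is only an illustration under my own assumptions: the function name `all_singles` is made up, the single letters stand in for the values listed earlier, and padding a shorter list with its first value is an arbitrary choice.

```python
from itertools import zip_longest

def all_singles(*value_lists):
    """Build tests so every value of every variable appears at least once.

    Variables with fewer values are padded by reusing their first value,
    which is an arbitrary choice made for this sketch.
    """
    return [
        tuple(v if v is not None else values[0]
              for v, values in zip(row, value_lists))
        for row in zip_longest(*value_lists)
    ]

tests = all_singles("ABCDE", "IJKLM", "VWXYZ")
for number, test in enumerate(tests, start=1):
    print(f"Test Case {number}: {test}")
```

The number of tests equals the size of the largest variable, which is why five tests are enough here.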

A serious problem with the approach, however, is that it misses predictably important configurations. For example, this one misses: Private Equity / Redeem Across Transactions. Why is that “predictably important”? Well, that’s part of the domain knowledge you have about what you are testing.

So how do I get my “predictably important” values in there? Well, a common solution to this problem is to simply specify additional test cases that include key pairs of variables — such as Private Equity with Redeem Across Transactions — or more key combinations of more than two variables. The problem there is that this can be a bit hit or miss. You might forget some, for example. Further, it starts to muddy up your test table. I say that because unless you apply good discipline to adding values, it can become harder to find rhyme or reason to what data conditions are tested with what test conditions.
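One way to add a little discipline to that patching is to keep the important combinations in an explicit list and check the table against it. Here's a rough Python sketch of the idea. The `key_combinations` list and the `covers` helper are hypothetical, and the filler value for the unmentioned column is arbitrary.

```python
def covers(test, required):
    """True if `test` has the required value in every required column."""
    return all(test[column] == value for column, value in required.items())

# The five all-singles tests, as (V1, V2, V3) tuples.
tests = [tuple(t) for t in ("AIV", "BJW", "CKX", "DLY", "EMZ")]

# Hypothetical list of important combinations, keyed by column index.
key_combinations = [{0: "E", 1: "K"}]  # Private Equity + Redeem Across Transactions

for required in key_combinations:
    if not any(covers(test, required) for test in tests):
        # Columns the combination doesn't mention get an arbitrary filler value.
        extra = tuple(required.get(col, tests[0][col]) for col in range(3))
        tests.append(extra)

print(tests[-1])  # ('E', 'K', 'V')
```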

So let’s consider a refinement of the above technique.

### All Pairs

In the **all pairs** approach, the set of test cases includes all of the pairs of values of every variable. So, E (Private Equity) isn’t just paired with M (Transfer) as in the table above. It’s also paired with I (Subscription), J (Redeem All), K (Redeem Across Transactions), and L (Liquidating Redemption). Similarly, E (Private Equity) is paired with every value of V3.

The table below illustrates a set of combinations that will meet the all-pairs criterion.

Test | Variable 1 | Variable 2 | Variable 3 |
---|---|---|---|
Test Case 1 | A (Onshore EOM) | I (Subscription) | V (deleted) |
Test Case 2 | A (Onshore EOM) | J (Redeem All) | W (canceled) |
Test Case 3 | A (Onshore EOM) | K (Redeem Across Trans) | X (amount change) |
Test Case 4 | A (Onshore EOM) | L (Liquidating Redemption) | Y (revised date) |
Test Case 5 | A (Onshore EOM) | M (Transfer) | Z (status change) |
Test Case 6 | B (Onshore Side) | I (Subscription) | W (canceled) |
Test Case 7 | B (Onshore Side) | J (Redeem All) | Z (status change) |
Test Case 8 | B (Onshore Side) | K (Redeem Across Trans) | Y (revised date) |
Test Case 9 | B (Onshore Side) | L (Liquidating Redemption) | V (deleted) |
Test Case 10 | B (Onshore Side) | M (Transfer) | X (amount change) |
Test Case 11 | C (Onshore FOM) | I (Subscription) | X (amount change) |
Test Case 12 | C (Onshore FOM) | J (Redeem All) | Y (revised date) |
Test Case 13 | C (Onshore FOM) | K (Redeem Across Trans) | Z (status change) |
Test Case 14 | C (Onshore FOM) | L (Liquidating Redemption) | W (canceled) |
Test Case 15 | C (Onshore FOM) | M (Transfer) | V (deleted) |
Test Case 16 | D (Offshore) | I (Subscription) | Y (revised date) |
Test Case 17 | D (Offshore) | J (Redeem All) | X (amount change) |
Test Case 18 | D (Offshore) | K (Redeem Across Trans) | V (deleted) |
Test Case 19 | D (Offshore) | L (Liquidating Redemption) | Z (status change) |
Test Case 20 | D (Offshore) | M (Transfer) | W (canceled) |
Test Case 21 | E (Private Equity) | I (Subscription) | Z (status change) |
Test Case 22 | E (Private Equity) | J (Redeem All) | V (deleted) |
Test Case 23 | E (Private Equity) | K (Redeem Across Trans) | W (canceled) |
Test Case 24 | E (Private Equity) | L (Liquidating Redemption) | X (amount change) |
Test Case 25 | E (Private Equity) | M (Transfer) | Y (revised date) |

Every value of every variable is paired with every value of every other variable in at least one test case. This is a much more thorough standard than all singles, but it still reduces the number of test cases from 125 (all combinations) to 25. *That* is a fairly big savings.

The first column (V1) shows its values in order: A, B, C, D, E. Each is repeated five times to match the five possible values of V2. The second column (V2) cycles through its values in order: I, J, K, L, M.

Each of the values in the third column (V3) needs to be paired up with each value from V2 at least once. So consider the case of “I (Subscription)”. If you look at the table, you’ll see that value is used in test cases 1, 6, 11, 16, and 21. Further, you’ll see that these test cases each test a different value for V3.
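You don't have to take the table on faith. The following Python sketch rebuilds the 25 tests from the table and collects every pair of values that shares a test. The names `V3_BY_SECTION` and `covered_pairs` are my own, and the single letters stand in for the full value names.

```python
from itertools import combinations

V1, V2 = "ABCDE", "IJKLM"
# V3 values for each V1 section, read off the table in V2 order (I, J, K, L, M).
V3_BY_SECTION = {"A": "VWXYZ", "B": "WZYVX", "C": "XYZWV", "D": "YXVZW", "E": "ZVWXY"}

tests = [(a, b, V3_BY_SECTION[a][i]) for a in V1 for i, b in enumerate(V2)]

def covered_pairs(tests):
    """Set of ((column, value), (column, value)) pairs hit by at least one test."""
    return {
        ((i, t[i]), (j, t[j]))
        for t in tests
        for i, j in combinations(range(len(t)), 2)
    }

pairs = covered_pairs(tests)
print(len(tests), len(pairs))  # 25 75
```

Three pairs of columns with 25 value pairs each gives 75 pairs to cover. Since 25 tests can cover at most 75 pairs between them, this table covers each pair exactly once.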

## Example Process for All-Pairing

Reading the above, you may see why many testers don’t necessarily follow this approach. It’s easy enough when it’s presented to you as a working and finished example. But how easy is this to do from scratch? To answer that, at least in part, I’ll step through the process of creating an all-pairs test set. In order to concentrate on the test technique itself, rather than on application specifics, I’m going to thoroughly generalize this example.

Once again let’s imagine a program with three variables and thus three test conditions. In this case, V1 has three possible values; V2 has two possible values; and V3 has two possible values. Clearly if V1, V2, and V3 are entirely independent, the number of possible combinations is 12 (3 × 2 × 2). Now let’s build an all-pairs table.

First, label the columns with the variable names, listing variables in descending order based on the number of possible values.

Variable 1 | Variable 2 | Variable 3 |
---|---|---|

Keep in mind that right off the bat you can start to get a feel for the scope of this table. If the variable in Column 1 has V1 possible values and the variable in Column 2 has V2 possible values, there will be at least V1 × V2 rows.

Second, you can now fill in the table, one column at a time. The general strategy is this:

- The first column repeats each of its elements V2 times, skips a line, and then starts the repetition of the next element.
- The second column lists all the values of the variable, skips a line, lists the values again, and so forth.

So if V1 is three (and the possible values are A, B, and C) and V2 is two (and the possible values are X and Y) then here’s what you would have:

Variable 1 | Variable 2 | Variable 3 |
---|---|---|
A | X | |
A | Y | |
| | | |
B | X | |
B | Y | |
| | | |
C | X | |
C | Y | |

The reason for interposing the blank row is because otherwise it’s hard to know how many tests — and thus how many rows — will be needed. Leave room for extras.
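The first two columns are just every value of Variable 1 crossed with every value of Variable 2, with a spacer after each section. A small Python sketch of that step follows. The function name `first_two_columns` is hypothetical, and I'm using `None` to stand for the blank row.

```python
def first_two_columns(values_1, values_2):
    """Rows pairing every value of column 1 with every value of column 2.

    A None row is left after each section as room for extra tests.
    """
    rows = []
    for v1 in values_1:
        rows.extend((v1, v2) for v2 in values_2)
        rows.append(None)  # blank spacer row
    return rows

for row in first_two_columns("ABC", "XY"):
    print(row)
```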

So now for the third step. When I talk about a “section” next, think of the two A rows as defining a section (AA), the two B rows as defining another (BB), and so on. With that in mind, each section of the third column will have to contain every value of variable 3. You should order the values so that they also make all pairs with variable 2. Suppose that V3 can have two values and those are 0 or 1. Then you might have this:

Variable 1 | Variable 2 | Variable 3 |
---|---|---|
A | X | 1 |
A | Y | 0 |
B | X | 0 |
B | Y | 1 |
C | X | 1 |
C | Y | 0 |

It’s important to understand here that my decision of 1,0 was arbitrary in terms of starting it off. I could have done 0,1. I’ll come back to this later because you may find that you make a bad choice. In any event, you can see that I have coverage for X1, X0, Y1, and Y0.

Now that we’ve solved the three-column exercise, try adding more variables. To keep it simple, each additional variable will have two values.

Incidentally, to add a variable with more than two values, you would have to start over, because the order of variables in the table must be from the one with the largest number of values to the next largest number and on down so that the last column has the variable with the fewest values.

The fourth column will go in easily. Start by making sure you hit all pairs of values of Column 4 and Column 2 — which can be done in the AA and BB blocks — and then make sure you get all pairs of Column 4 and Column 3.

Variable 1 | Variable 2 | Variable 3 | Variable 4 |
---|---|---|---|
A | X | 1 | E |
A | Y | 0 | F |
B | X | 0 | F |
B | Y | 1 | E |
C | X | 1 | F |
C | Y | 0 | E |

You can see that I cover XE, XF and YF, YE. That’s pairing up column 2 and column 4. I also cover 1E, 0E, 1F, 0F. That’s pairing up column 3 and column 4.

Now let’s make a first attempt on Column 5:

Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 |
---|---|---|---|---|
A | X | 1 | E | G |
A | Y | 0 | F | H |
B | X | 0 | F | H |
B | Y | 1 | E | G |
C | X | 1 | F | H |
C | Y | 0 | E | G |

This is adding a fifth variable to the matrix but note that this one doesn’t work. Yet it illustrates how to make a guess and then recover if you guess incorrectly. Why doesn’t it work, though? Well, it achieves all pairs of GH with Columns 1, 2, and 3 but misses pairing for Column 4. Specifically, you can see that I have EG and FH covered but I have no representative coverage for EH and FG.
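Spotting a missing pair by eye gets harder as columns are added, so here's a small Python sketch that does the check for this first attempt. The `missing_pairs` helper is my own illustration. Each test is written as a string with one character per column.

```python
from itertools import combinations, product

def missing_pairs(tests, domains):
    """Return value pairs (across two different columns) that no test covers."""
    missing = []
    for i, j in combinations(range(len(domains)), 2):
        seen = {(t[i], t[j]) for t in tests}
        missing += [p for p in product(domains[i], domains[j]) if p not in seen]
    return missing

domains = ["ABC", "XY", "10", "EF", "GH"]
first_attempt = ["AX1EG", "AY0FH", "BX0FH", "BY1EG", "CX1FH", "CY0EG"]
print(missing_pairs(first_attempt, domains))  # [('E', 'H'), ('F', 'G')]
```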

The most recent arbitrary choice was HG in the BB section. That was purely a choice I made, yet notice that once the order of H then G was determined for the BB section, HG became the necessary order for the third section (CC) in order to pair H with a 1 in the third column.

So to recover from guessing incorrectly that HG was a good order for the second section, I can try again. How do I do that? Here are the basic steps:

- Flip the most recent arbitrary choice (Column 5, Section BB, from HG to GH).
- Get rid of section CC. The choice of HG there was based on the preceding section being HG, which I just changed.
- Refill section CC by checking for missing pairs.

So let’s think about that third step there. Using GH,GH would give me two XG,XG pairs, so what I should do is flip to HG for the third section. This yields a Column 2X with a Column 5H and a Column 2Y with a Column 5G. That is exactly what is needed to obtain all pairs.

Here’s what you end up with:

Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 |
---|---|---|---|---|
A | X | 1 | E | G |
A | Y | 0 | F | H |
B | X | 0 | F | G |
B | Y | 1 | E | H |
C | X | 1 | F | H |
C | Y | 0 | E | G |

This now successfully adds that fifth variable.
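The guess, check, and flip routine can also be handed to a program. The Python sketch below is not the manual backtracking I walked through. It simply tries every per-section order and keeps the first one that completes all pairs, which is fine for a table this small. The names `fits` and `fill_column` are made up for the example.

```python
from itertools import product

def fits(rows, column):
    """True if `column` forms every value pair with every existing column."""
    new_values = set(column)
    for i in range(len(rows[0])):
        needed = set(product({r[i] for r in rows}, new_values))
        if needed - {(r[i], c) for r, c in zip(rows, column)}:
            return False
    return True

def fill_column(rows, a, b):
    """Try each section order, (a, b) or (b, a); return the first column that fits."""
    sections = len(rows) // 2
    for orders in product([(a, b), (b, a)], repeat=sections):
        column = [value for order in orders for value in order]
        if fits(rows, column):
            return column
    return None

four_columns = ["AX1E", "AY0F", "BX0F", "BY1E", "CX1F", "CY0E"]
print(fill_column(four_columns, "G", "H"))  # ['G', 'H', 'G', 'H', 'H', 'G']
```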

However, if you try to add yet another variable, it won’t fit in the six test cases. Try it with the IJs (the values of Variable 6) in any order, and it just won’t work. Check out the following two tables.

Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 | Variable 6 |
---|---|---|---|---|---|
A | X | 1 | E | G | I |
A | Y | 0 | F | H | J |
B | X | 0 | F | G | J |
B | Y | 1 | E | H | I |
C | X | 1 | F | H | J |
C | Y | 0 | E | G | I |

Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 | Variable 6 |
---|---|---|---|---|---|
A | X | 1 | E | G | I |
A | Y | 0 | F | H | J |
B | X | 0 | F | G | I |
B | Y | 1 | E | H | J |
C | X | 1 | F | H | J |
C | Y | 0 | E | G | I |

These six variables do not fit into the six tests in the all-pairs matrix.
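If you'd rather not try every order by hand, a brute-force check confirms it. This Python sketch tries all 64 ways of putting an I or a J in each of the six rows, keeping the first five columns exactly as they are. The helper name `all_pairs_covered` is my own.

```python
from itertools import combinations, product

domains = ["ABC", "XY", "10", "EF", "GH", "IJ"]
five_columns = ["AX1EG", "AY0FH", "BX0FG", "BY1EH", "CX1FH", "CY0EG"]

def all_pairs_covered(tests, domains):
    """True if every pair of values from two different columns shares a test."""
    for i, j in combinations(range(len(domains)), 2):
        if {(t[i], t[j]) for t in tests} != set(product(domains[i], domains[j])):
            return False
    return True

# Every assignment of I or J to the six rows, not just the per-section orders.
working = [
    column for column in product("IJ", repeat=6)
    if all_pairs_covered([t + v for t, v in zip(five_columns, column)], domains)
]
print(len(working))  # 0
```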

But that’s a clue! It’s telling us that we need two more test cases. Check out the next table:

Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 | Variable 6 |
---|---|---|---|---|---|
A | X | 1 | E | G | I |
A | Y | 0 | F | H | J |
| | | | | G | J |
B | X | 0 | F | G | I |
B | Y | 1 | E | H | J |
| | | | | H | I |
C | X | 1 | F | H | J |
C | Y | 0 | E | G | I |

What’s needed is a test that pairs a G with a J and another test that pairs an H with an I. You can see those additions added in on rows 3 and 6, respectively. The values of any of the other variables are irrelevant, at least as far as achieving all pairs, so you can fill them with anything you want.

If you’re going to keep adding variables, you might leave them blank, and decide later — as you try, say, to accommodate Variable 7 and Variable 8 into the same eight test cases — what values would be convenient in those rows.

If you tried to test all of the combinations of these variables, there would be 3 × 2 × 2 × 2 × 2 × 2 = 96 tests. As it is, we’ve reduced our set of tests, using all-pairs, from 96 to 8. That’s a fairly substantial savings to say the least.
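To double-check both numbers, here's a Python sketch that counts the full set of combinations and confirms that the eight tests leave no pair uncovered. The `uncovered` helper is hypothetical, and `None` marks the cells that are still blank.

```python
from itertools import combinations, product
from math import prod

domains = ["ABC", "XY", "10", "EF", "GH", "IJ"]
# None marks a "don't care" cell that is still blank in the table.
tests = [
    tuple("AX1EGI"), tuple("AY0FHJ"), (None, None, None, None, "G", "J"),
    tuple("BX0FGI"), tuple("BY1EHJ"), (None, None, None, None, "H", "I"),
    tuple("CX1FHJ"), tuple("CY0EGI"),
]

def uncovered(tests, domains):
    """Set of value pairs (from two different columns) that no test contains."""
    gaps = set()
    for i, j in combinations(range(len(domains)), 2):
        seen = {(t[i], t[j]) for t in tests}
        gaps |= set(product(domains[i], domains[j])) - seen
    return gaps

print(prod(len(d) for d in domains), len(tests), uncovered(tests, domains))
# 96 8 set()
```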

## But Be Careful…

There *are* risks if you only use the all-pairs cases. As with all-singles, you might know of a specific combination that is widely used or likely to be troublesome and that specific combination may not be falling into your pairing test matrix. The best thing to do is add the particular case you are considering to the table. After all, you’ve already cut back from 96 to 8 tests. It’s certainly reasonable to expand the set out to 10 or 15 tests, to cover the important special cases.

As you can see, this kind of technique can help you reduce your testing tasks. But it doesn’t do your thinking for you. As you noticed above, I had to make some choices. I showed you that it’s possible to make a wrong choice and you have to back it out to get a reasonable combination. Finally, while this approach helps you narrow down possible test conditions, you do have to apply thought towards test conditions that may fall off the matrix but still should be tested.
