Testing for Quality, Betting on Value

There’s an irony worth noting in my previous posts on Hollywood quality and gaming quality: testing exists, in part, to mitigate risk, but it does so only by helping people understand the risks that exist. Yet quality itself often requires reasoned and reasonable risk-taking! Let’s dig into this.

The studios and publishers in my examples from those previous posts weren’t (generally) reckless. They conducted exhaustive market research, analyzed comparable titles, and made data-driven projections. They mitigated technical and operational risk admirably. What they failed to do was take creative risk in service of genuine distinctiveness. They bet enormous sums on safe, derivative concepts (superhero fatigue, live-service clones, franchise extensions) precisely because those felt like lower-risk bets. But risk-aversion in creative decision-making often produces the blandness that guarantees failure.

The lesson for testing (just as with any experimentation) isn’t to abandon risk mitigation, which works by surfacing risks to people who can do something about them. Instead, it’s to recognize which risks matter. Testing can ensure your code works, your performance scales, and your features function as specified. But no amount of testing can save a product built on a fundamentally risk-averse creative foundation: one designed by committee to offend no one and appeal to everyone.

Sometimes, the highest-quality outcome requires taking a reasoned gamble: building something distinctive that deeply serves a specific audience rather than something generic that vaguely serves everyone. Testing can help teams execute that vision reliably. What testing can’t do is substitute for having a vision worth executing in the first place.

The Two Types of Risk

To understand this tension, we need to distinguish between two fundamentally different kinds of risk.

One is technical/operational risk. This is the risk that your product won’t work as intended: code defects and bugs, performance failures under load, security vulnerabilities, integration failures, usability issues that prevent valuable outcomes, requirement mismatches, and so on. This is the traditional domain of modern testers, and we’re (generally) good at it. We have frameworks, methodologies, automation, and clear pass/fail criteria. We can measure whether the login flow works, whether the API handles ten thousand requests per second, whether the checkout process completes successfully. These risks are largely deterministic. If we test thoroughly, we can find and fix most problems before users encounter them.

The other is strategic/creative risk. This is the risk that your product won’t matter. It shows up as building features users don’t actually want, solving problems users don’t actually have, creating experiences that feel generic or uninspired, misjudging what users will value in their actual context, missing cultural, competitive, or temporal shifts, pursuing “safe” choices that alienate your core audience, or simply failing to keep pace with your competition. This is the domain of product vision, market strategy, and creative differentiation.

And here’s the crucial point: you cannot test your way out of strategic risk. No amount of unit tests, integration tests, or even user acceptance testing can tell you whether users will care about what you’ve built. You can validate that the feature works perfectly; you cannot validate that it matters.
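To make that distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical: `apply_discount` stands in for any feature, and the usage figures are made-up placeholders rather than real data. The first check has a definite answer today. The second number only exists after real users have had the chance to ignore the feature.

```python
from dataclasses import dataclass


def apply_discount(total_cents: int, percent: int) -> int:
    """Hypothetical feature: apply a percentage discount to a cart total."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return total_cents - (total_cents * percent) // 100


def discount_works() -> bool:
    """Technical risk: a deterministic check with a clear pass/fail answer."""
    return apply_discount(10_000, 25) == 7_500


@dataclass
class UsageSnapshot:
    """Illustrative post-launch observation (placeholder numbers, not real data)."""
    active_users: int
    users_who_used_feature: int

    def adoption_rate(self) -> float:
        # Guard against dividing by zero when nobody has been exposed yet.
        if self.active_users == 0:
            return 0.0
        return self.users_who_used_feature / self.active_users


if __name__ == "__main__":
    # This can be answered before anyone outside the team sees the feature.
    print("feature works:", discount_works())

    # This can only be observed after release. No assertion written in
    # advance can tell us what this number will turn out to be.
    snapshot = UsageSnapshot(active_users=2_000, users_who_used_feature=60)
    print(f"adoption rate: {snapshot.adoption_rate():.1%}")
```

The asymmetry is the point: the first function can run in a build pipeline, while the second is a measurement of other people’s judgment.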

Hollywood and gaming taught us what happens when organizations excel at mitigating technical risk (production quality, visual effects, performance optimization) while failing catastrophically at strategic risk (making something people actually want to experience).

What We Can And Can’t Test

This distinction reveals the inherent limits of testing when we talk about “quality assurance.” We can test:

  • Functionality: Does this button do what it’s supposed to do?
  • Performance: Does it handle the expected load?
  • Reliability: Does it work consistently over time?
  • Security: Are vulnerabilities present?
  • Usability: Can users complete tasks without confusion?
  • Compliance: Does it meet specified requirements?

We cannot (with traditional methods, anyway) test:

  • Desirability: Will users want this feature in six months?
  • Value perception: Will users consider this worth their time/money/attention?
  • Cultural resonance: Will this feel relevant or tone-deaf when it ships?
  • Competitive positioning: Will this matter when alternatives exist?
  • Strategic fit: Does this align with what users actually need?

The second list isn’t a QA (or testing) failure: it’s a category error. These aren’t testable properties; they’re judgments made by users in context. You can gather feedback through beta testing, user interviews, and analytics, but you’re still fundamentally predicting future perception, not verifying present fact.

This is why a game like Skull and Bones could reportedly consume $800 million and ten years of development, pass every technical milestone, and still fail utterly. The game worked. It just wasn’t what players wanted by the time it launched. No QA process could have caught that, because the problem wasn’t in the code.

Reasoned Risk in Practice

So what does “reasoned and reasonable risk” look like in software development?

One argument is that it means building something distinctive for a specific audience rather than something generic for everyone.

  • Slack didn’t try to be “enterprise communication for all possible use cases.” It bet on dev teams and startups who valued speed and searchability over enterprise audit trails. That focus was a risk (smaller addressable market) but also what made it valuable.
  • Figma didn’t try to be “design software for everyone.” It bet on web-based collaborative design for product teams, sacrificing desktop power users. That was a risk, but it created genuine differentiation.
  • Notion didn’t try to be “productivity software for all workflows.” It bet on flexible, block-based documents for knowledge workers comfortable with structure. That specificity was a risk, but it’s what made people passionate about it.

In each case, the company made a strategic bet about what a specific audience would value, even though that bet narrowed their market. Testing, in the context of quality assurance, could ensure these products worked reliably; it couldn’t validate whether the strategic bet would pay off. That required creative risk.

So, what does this tell us? Well, what it tells me is that reasoned risk is:

  • Informed by user research and data, but not paralyzed by it
  • Bounded in scope (test hypotheses, don’t bet the company)
  • Specific in audience (serve someone deeply, not everyone vaguely)
  • Adaptive to feedback (tight loops to course-correct)

This also tells me that reasonable risk means:

  • Building for clarity of vision, not committee consensus
  • Accepting that some users won’t be your users
  • Prioritizing distinctiveness over universal appeal
  • Measuring success by depth of value for your audience, not breadth of reach

Feedback Loops as the Answer

Since we can’t perfectly predict quality in advance, the answer isn’t better prediction. It’s tighter feedback loops.

  • Hollywood’s problem: two-to-four-year production cycles with no user feedback until opening weekend. By then, $200 million is sunk.
  • Gaming’s problem: four-to-ten-year development cycles with limited beta testing. By launch, $400 million is spent and the cultural moment may have passed.

Software’s advantage over Hollywood is that we can build tighter loops, if we choose to:

  • Ship incrementally: Release smaller features more frequently
  • Measure actual usage: Don’t just test functionality; observe behavior
  • Iterate based on feedback: Treat launches as hypotheses, not conclusions
  • Limit sunk costs: Keep investments bounded until validation occurs
  • Stay adaptive: Be willing to pivot when perception shifts

This doesn’t eliminate risk. It makes risk manageable. You’re still betting on what users will value, but you’re making smaller bets and learning faster.
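One way to picture “launches as hypotheses” with bounded sunk costs is the Python sketch below. It is only an illustration under stated assumptions: the `LaunchHypothesis` and `evaluate` names, the adoption target, and the budget figures are all invented for this example, and a real decision would also weigh sample size, qualitative feedback, and plain judgment.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue investing"
    ITERATE = "adjust and re-measure"
    STOP = "stop and cut losses"


@dataclass(frozen=True)
class LaunchHypothesis:
    """A bounded bet: what we expect users to do, and what we will spend to find out."""
    description: str
    target_adoption: float  # fraction of exposed users we expect to use the feature
    budget_cap: float       # spend at which the bet must be re-examined


def evaluate(hypothesis: LaunchHypothesis, exposed_users: int,
             adopting_users: int, spent: float) -> Decision:
    """Compare observed behavior against the bet and suggest a next step."""
    if exposed_users <= 0:
        raise ValueError("need at least one exposed user to evaluate a launch")
    observed = adopting_users / exposed_users
    if observed >= hypothesis.target_adoption:
        return Decision.CONTINUE   # the bet is paying off so far
    if spent >= hypothesis.budget_cap:
        return Decision.STOP       # the bounded investment is used up
    return Decision.ITERATE        # budget remains, so adapt and measure again


if __name__ == "__main__":
    # Placeholder numbers chosen purely for illustration.
    bet = LaunchHypothesis(
        description="Teams will use shared search at least weekly",
        target_adoption=0.20,
        budget_cap=50_000.0,
    )
    decision = evaluate(bet, exposed_users=500, adopting_users=40, spent=15_000.0)
    print(decision.value)
```

The design choice worth noticing is that the budget cap is written down before launch. That is what keeps a disappointing result from quietly turning into a larger bet.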

The role of testing in this model shifts:

  • From “gatekeeper preventing broken releases” to “enabler of fast, reliable iteration”
  • From “validating complete products” to “ensuring safe experimentation”
  • From “testing against specifications” to “helping teams learn from reality”

The faster you can ship, measure, and adapt, the less you’re locked into bets made long ago on assumptions that have since gone stale. Quality becomes something you discover through iteration, not something you engineer upfront through exhaustive planning.

The Lesson?

Quality is, and will always be, a shifting perception of value over time. We cannot test our way to certainty about whether users will care. What we can do is recognize the limits of our craft: testing can help us determine reliable execution (under certain conditions); it cannot validate strategic vision.

The highest-quality outcomes come from pairing both: bold, reasoned creative bets with rigorous technical execution. Take the risks that matter, building something distinctive for someone specific. Then use testing as experimentation to be as certain as possible that you execute that vision reliably, ship it quickly, and stay adaptive when reality doesn’t match your projections.

In the end, quality isn’t what you project in your planning documents. It’s what users perceive when they finally experience what you’ve built. And, by then, it’s too late to change your bet. The only winning move is to bet smaller, learn faster, and stay humble about how little we can truly predict.

This article was written by Jeff Nyman

Anything I put here is an approximation of the truth. You're getting a particular view of myself ... and it's the view I'm choosing to present to you. If you've never met me before in person, please realize I'm not the same in person as I am in writing. That's because I can only put part of myself down into words. If you have met me before in person then I'd ask you to consider that the view you've formed that way and the view you come to by reading what I say here may, in fact, both be true. I'd advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.
