METRICS & ITERATION · LESSON 06.03 · intermediate

A/B testing for PMs.

Hypothesis, sample size, primary metric, guardrails.

↳ tl;dr

A/B testing is how mature product orgs decide. The PM's job isn't to run the test — it's to design a test that can actually answer the question. Strong tests have a clear hypothesis, a single primary metric, explicit guardrails, and a sample size that detects the smallest meaningful effect.

Anatomy of a good test

  1. Hypothesis. "If we change X, we expect Y because Z." Specific, falsifiable.
  2. Primary metric. The one number that decides win/lose. NOT a list.
  3. Guardrail metrics. What you don't want to break. (e.g. a lift in activation doesn't count as a win if signup completion drops 10%.)
  4. Sample size. Power calc: n = f(MDE, α, power, baseline rate). Don't eyeball.
  5. Stop conditions. Pre-decide when to call it. Stopping early when results look good is the most common abuse.
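
The power calc in step 4 can be sketched as follows. This is a minimal illustration, assuming a conversion-rate metric, a two-sided two-proportion z-test, equal traffic per arm, and a relative MDE; the function name `sample_size_per_arm` and the 10% baseline are made up for the example, and a real experimentation platform may use a different formula.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed per arm (normal approximation).

    baseline_rate: control conversion rate, e.g. 0.10 for 10%
    relative_mde:  smallest relative lift worth detecting, e.g. 0.05 for +5%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled rate under the null hypothesis
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical inputs: 10% baseline conversion, two candidate MDEs.
n_for_5pct = sample_size_per_arm(0.10, 0.05)
n_for_2pct = sample_size_per_arm(0.10, 0.02)
print(f"per arm for a 5% relative MDE: {n_for_5pct:,}")
print(f"per arm for a 2% relative MDE: {n_for_2pct:,}")
```

Running it with your own baseline and MDE before the test starts gives you the number to pre-commit to in step 5.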

MDE — Minimum Detectable Effect

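The smallest lift you'd care about, and the smallest one the test is sized to detect reliably. A test powered for MDE = 5% won't reliably detect a 2% lift, even if it's real. Picking MDE is a business decision: what lift would justify the build? Required sample size grows roughly with 1/MDE², so halving the MDE about quadruples the traffic you need. Pick too small and the sample size becomes impractical; pick too large and you miss real wins.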
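
To see why a test sized for one MDE struggles with a smaller real effect, here is a rough sketch that estimates power for a given true lift. It assumes a conversion-rate metric, a two-sided z-test at α = 0.05, and hypothetical numbers: a 10% baseline and 58,000 users per arm, which is roughly what a 5% relative MDE needs at 80% power under those assumptions. `approx_power` is an illustrative helper, not a library function.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(baseline_rate, true_relative_lift, n_per_arm, alpha=0.05):
    """Approximate chance the test reaches significance if the lift is real."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + true_relative_lift)
    # Standard error of the difference between the two observed rates.
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_arm)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Ignores the negligible chance of a significant result in the wrong direction.
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha)


N_PER_ARM = 58_000  # assumed: sized for a 5% relative MDE on a 10% baseline

power_at_5pct = approx_power(0.10, 0.05, N_PER_ARM)
power_at_2pct = approx_power(0.10, 0.02, N_PER_ARM)
print(f"chance of detecting a real 5% lift: {power_at_5pct:.0%}")
print(f"chance of detecting a real 2% lift: {power_at_2pct:.0%}")
```

Under these assumptions the 2% lift is missed most of the time, so a flat result from this test would not be evidence that the smaller effect is absent.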

the most common abuse

Peeking. Watching the test mid-run and stopping when it looks favorable. Every extra look is another chance for random noise to cross the significance threshold, so the false-positive rate climbs well above the nominal α. Kohavi et al. show that "significant" results from peeked tests are often noise. Pre-commit to a sample size and a stop condition.
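
The effect of peeking can be demonstrated with a small A/A simulation, where both arms are identical and every "significant" result is a false positive by construction. This is a sketch under assumed parameters (1,000 simulated experiments, 10 looks, 100 new users per arm per look, a 10% conversion rate, a pooled two-proportion z-test); none of these numbers come from the source.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided alpha = 0.05


def is_significant(conv_a, conv_b, n):
    """Pooled two-proportion z-test with n users in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:  # no variation observed yet, so nothing to test
        return False
    return abs(conv_b - conv_a) / n / se > Z_CRIT


def simulate(n_experiments=1000, looks=10, users_per_look=100, rate=0.10, seed=7):
    """Return (false-positive rate with peeking, rate at the fixed horizon)."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _ in range(n_experiments):
        conv_a = conv_b = n = 0
        flagged_at_any_look = False
        for _ in range(looks):
            # Both arms share the same true rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n += users_per_look
            significant = is_significant(conv_a, conv_b, n)
            flagged_at_any_look = flagged_at_any_look or significant
        peeking_hits += flagged_at_any_look  # would have stopped and "shipped"
        fixed_hits += significant  # only the final, pre-committed look counts
    return peeking_hits / n_experiments, fixed_hits / n_experiments


peeking_rate, fixed_rate = simulate()
print(f"false positives when stopping at any significant look: {peeking_rate:.1%}")
print(f"false positives when judging only the final look: {fixed_rate:.1%}")
```

The fixed-horizon rate should sit near the nominal 5%, while the peeking rate comes out noticeably higher, because every experiment flagged at the final look is also flagged under peeking, plus all the ones that crossed the threshold only temporarily.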

When NOT to A/B test

  • The change is too large to stage as a treatment (rebuilding the whole app).
  • You don't have enough traffic to power a test (sample size > total weekly users).
  • The decision is qualitative (brand redesign).
  • It's a regulatory / safety change — you don't test those, you ship them.

// sources

Sources cited

  1. [01]
    Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing

    Kohavi, R., Tang, D. & Xu, Y. · Cambridge University Press · 2020 · retrieved 2026-05

    Modern canonical text on industrial A/B testing.
