A/B Tests Rarely Make Sense: How and When to Use Them

You can't A/B test an invention.

Apple didn't A/B test the iPod or the iPhone. Tony Fadell, creator of the iPod and iPhone, didn't A/B test Nest, the revolutionary home thermostat he launched after leaving Apple. OpenAI could not A/B test ChatGPT before launching it.

An A/B test—testing one variation of a product against another to determine which resonates more with customers—cannot be used to invent new things. How can you test something that doesn't exist against a variation of itself? And yet, the A/B test has become a ubiquitous crutch at corporations of all sizes.

When teams don't have a decisive leader, people say things like, "let's test it to see what users want," or "we're a data-driven organization, so let's rely on a test to decide." This thinking is a thinly veiled disguise for decision by committee. It leads to marginal decisions and small changes. Committees distribute responsibility, and the buck stops with no one. Mediocrity follows.

X (Twitter)

Can you test laying off 80% of a company?

When Elon Musk purchased X (formerly Twitter), he had a problem: the company was quickly burning cash—his cash.

Musk solved this by drastically reducing costs, firing 80% of the workforce. Prior management, the media, and even X's own users at the time were sure that the company needed most of its 7,500 employees to function. Musk could not A/B test laying off a large portion of the workforce. Instead, he relied on judgement gained from running multiple companies with thousands of employees to estimate that X needed far fewer workers.

He did what Jeff Bezos made famous at Amazon: disagree and commit. He made an unpopular decision to slash headcount, one that would be impossible to undo. While big, untestable decisions certainly lead to mistakes, they avoid stagnation. Musk's alternative was preserving what was, which would have led to bankruptcy for a company that had not made a profit in years. Today, X has approximately 1,500 employees and is functioning well.

When should you run an A/B test?

An A/B test only works if you start with an opinionated hypothesis, have enough data, and can test an isolated feature.

Below is a link to the template I use to run every A/B test. Before running one, make sure three things are true: (i) you have an opinionated hypothesis, (ii) you can collect enough data, and (iii) you're testing an isolated feature.
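
Of the three conditions, "enough data" is the one you can check with arithmetic before you build anything. The sketch below is not part of the template. It is a rough illustration that uses the standard normal-approximation formula for comparing two conversion rates. The function name, the 5% significance level, the 80% power, and the example conversion rates are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Estimate how many users each variant needs (hypothetical helper).

    Uses the normal approximation for a two-sided test of two proportions.
    baseline_rate: current conversion rate, e.g. 0.10 for 10%
    expected_rate: conversion rate the hypothesis predicts, e.g. 0.12
    alpha: accepted false-positive rate (assumed 5%)
    power: chance of detecting the effect if it is real (assumed 80%)
    """
    if not (0 < baseline_rate < 1 and 0 < expected_rate < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    if baseline_rate == expected_rate:
        raise ValueError("the hypothesis must predict a different rate")

    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Sum of the binomial variances of the two variants.
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate

    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Illustrative numbers only: a 10% baseline and a hypothesized lift to 12%.
    needed = sample_size_per_variant(0.10, 0.12)
    print(f"Users needed per variant: {needed}")
```

If the result is more traffic than the feature will see in a reasonable amount of time, the test is unlikely to give a clear answer. The smaller the change you expect, the more users you need, which is one reason timid hypotheses make poor tests.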