Until recently, quantitative product experimentation was seen as the gold standard for any product organization. If you weren’t A/B testing, you weren’t doing product properly.
Controlled experiments were popularized by world-leading tech companies like Amazon, Google, Meta and Booking.com. Jeff Bezos has attributed much of Amazon’s success to “the number of experiments we run per year, per month, per day.”
However, more recently there’s been a backlash against controlled experiments as the only product development lever, as shown by Brian Chesky’s vocal championing of a more vision-led approach grounded in design.
These positions are closer than the soundbites would initially suggest.
Most companies should be running A/B tests when they have the traffic and the stakes are high enough, but shouldn’t see this as the only way to validate features or as a substitute for a strong product strategy.
This is obviously a judgement call that product teams need to make on a case-by-case basis.
To make sure you understand the principles involved and are equipped to make this call, in this article we’ll look at:
- The advantages of A/B testing
- The limitations of A/B testing
- When to use A/B tests and when not to
Get the Hustle Badger Guide to How to run effective A/B tests
Advantages of A/B testing
“When is the best time to A/B test? The short answer is always. The more realistic answer is whenever you can.”
– Jon Simpson, Forbes
An effective A/B testing program has a number of attractive benefits:
- Data-driven decision-making - By measuring key metrics (e.g. conversion rates or revenue) teams focus on what actually works rather than gut feel.
- Faster learning - Numerical insights reduce trial-and-error cycles, enabling faster, more effective iterations.
- Confident roll-outs - You can measure the impact a feature has before exposing it to the full user base.
Let's look at each of these reasons in detail.
Data-driven decision-making
A/B testing helps product teams become more data-informed, and better able to resist pressure from stakeholders to build features without a solid business plan.
Lots of product work is innovation - launching new features. Because these features are new, the impact they will have is uncertain. Combine this fact with the cost of running an engineering team, and product work is an expensive activity that offers an uncertain rate of return.
A/B testing lets you measure far more accurately the true impact that features have had, and as a result estimate the impact that future features might have.
This is the case even when the impact of new features is very small, and might be impossible to detect without controlled quantitative testing. Instead of having to take a gut call and hope for the best, you can be confident that you’re moving in the right direction and accumulate small changes that over time contribute to big improvements in the user experience and business results.
This allows product work to be significantly derisked, and increases the average rate of return that product teams might deliver.
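As a sketch of how this detection works in practice, a simple two-proportion z-test can tell you whether a small lift in conversion is a real effect or just noise. The numbers below are purely illustrative, not from any real experiment:

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical test: a 0.5pp lift on a 10% baseline, 50,000 users per arm
z, p = two_proportion_ztest(5000, 50000, 5250, 50000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Even though a 0.5 percentage point lift would be invisible to the naked eye in day-to-day dashboards, at this traffic level the test flags it as statistically significant.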
Once product teams have hard data to support their hypotheses, it’s much easier to steer stakeholders away from ideas that they feel are great, and towards options where there’s a solid hypothesis.
Faster learning
A/B testing can speed up product iteration by detecting smaller changes in metrics unambiguously.
Rather than waiting to collect qualitative data from customers, or investing heavily in an area to see if you can drive a significant impact, A/B testing gives you statistical confidence about whether small tests are driving change.
If they are, you can double down on them, confident that you’ll be able to drive more impact. If not, you can avoid further investment and experiment elsewhere to find pockets of value.
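The flip side is that smaller effects need more traffic to detect. A standard sample-size approximation shows how long a test needs to run for a given minimum detectable effect; the baseline and lift below are illustrative, using roughly 95% confidence and 80% power:

```python
import math

def sample_size_per_arm(baseline, mde, alpha_z=1.96, power_z=0.84):
    """Approximate users needed per variant to detect an absolute lift (mde)
    over a baseline conversion rate, at ~95% confidence and ~80% power."""
    p1, p2 = baseline, baseline + mde
    # Combined variance of the two binomial proportions
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((alpha_z + power_z) ** 2 * variance / mde ** 2)

# Hypothetical example: detecting a 0.5pp lift on a 10% baseline
print(sample_size_per_arm(0.10, 0.005))
```

Halving the effect size you want to detect roughly quadruples the traffic required, which is why low-traffic products struggle to test small changes.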
Get the Hustle Badger Case Study on Booking.com’s experimentation culture
Confident roll-outs
‘There are essentially three main reasons why we use the experimentation platform as part of our development process.
- It allows us to deploy new code faster and more safely.
- It allows us to turn off individual features quickly when needed.
- It helps us validate that our product changes have the expected impact on the user experience.’
– Lukas Vermeer, Booking.com experimentation
Another use of your experimentation platform is to help deploy code. By putting new features behind a feature flag and monitoring the results between the control and the new variant, you can see the impact new features are having and turn them off instantly if something goes wrong.
In this case, the A/B test platform acts as an extra safety net. If all shipped code is wrapped in an experiment, then it is easy to toggle features on and off.
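A minimal sketch of this pattern, with a made-up in-memory flag store and a hypothetical feature name (real platforms store this config server-side so it can be changed without a deploy):

```python
import hashlib

# Hypothetical in-memory flag store; in practice this config lives in
# your experimentation platform and can be changed without a deploy
FLAGS = {"new_checkout": {"enabled": True, "rollout_pct": 50}}

def variant_for(user_id: int, flag: str) -> str:
    """Deterministically bucket a user into 'control' or 'variant'.
    Setting enabled=False turns the feature off instantly for everyone."""
    config = FLAGS.get(flag)
    if not config or not config["enabled"]:
        return "control"
    # Stable hash keeps each user in the same bucket across sessions
    digest = hashlib.md5(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "variant" if bucket < config["rollout_pct"] else "control"

# The product code simply asks which experience to serve
variant = variant_for(user_id=42, flag="new_checkout")
```

Because bucketing is deterministic, the same users always see the same variant, which is what lets you compare control and variant metrics while retaining an instant kill switch.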
That said, a staged roll-out is not the same as a true A/B test, so be clear about your objectives here.