Running an effective A/B testing program requires product managers to be able to design and run tests well.
They can only do this if they have an informed understanding of what A/B testing is, how it works, and how to analyze data effectively.
There are lots of ways to create a bad A/B test, from picking the wrong success metrics, to setting too low a confidence level, to jumping to conclusions based on user behavior or metrics you’re not actually testing.
In this article we’ll walk you through designing, running and analyzing an A/B test step by step, sharing good and bad examples of A/B tests.
We’ll also dive deep into how you should interpret and act on the results you get from the process.
By the end of this article you should be able to confidently design and analyze A/B tests.
If you’re not clear on when and how to use A/B testing, its advantages and disadvantages, or your organization does not yet have a testing program set up, we cover these topics in an earlier article.
Get the Hustle Badger Guide on When to Use A/B Testing in Product Management
Choosing your parameters: what is your desired risk?
Before running any A/B tests, you need to be clear on your overall parameters for success.
The main one is the confidence level.
Most companies set a confidence level of 95%.
This means that if there were genuinely no difference between your variants, there would be only a 5% (1 in 20) chance of seeing a result like this purely by chance. Put another way, if you ran 20 experiments where nothing had actually changed, you’d expect only about one of them to show a false positive.
You can choose to set a higher, or lower, confidence level.
If you need more certainty as a business, set the level higher.
If you are happy to take on more risk, you can set it lower and get away with a smaller sample size.
The higher the confidence level, the more data is needed to conclude the experiment. This generally means the experiment needs to run for longer.
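To make the trade-off concrete, here is a minimal sketch in Python (using scipy) of how the confidence level you pick drives the sample size, and therefore the runtime, of a simple conversion-rate test. The baseline conversion rate, target uplift and daily traffic figures are illustrative assumptions rather than recommendations, and the calculation is the standard two-proportion approximation, not anything specific to a particular testing tool.

```python
import math
from scipy.stats import norm

def sample_size_per_variant(baseline, uplift, confidence, power=0.8):
    """Approximate visitors needed per variant for a two-sided test
    comparing two conversion rates (standard two-proportion formula)."""
    p1 = baseline
    p2 = baseline * (1 + uplift)
    alpha = 1 - confidence
    z_alpha = norm.ppf(1 - alpha / 2)  # e.g. 1.96 at 95% confidence
    z_beta = norm.ppf(power)           # e.g. 0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(((z_alpha + z_beta) ** 2 * variance) / (p1 - p2) ** 2)

baseline = 0.05        # 5% baseline conversion rate (illustrative assumption)
uplift = 0.10          # aiming to detect a 10% relative uplift (illustrative)
daily_visitors = 2000  # total traffic split across both variants (illustrative)

for confidence in (0.90, 0.95, 0.99):
    n = sample_size_per_variant(baseline, uplift, confidence)
    days = 2 * n / daily_visitors
    print(f"{confidence:.0%} confidence: {n:,} visitors per variant (~{days:.0f} days)")
```

Running this shows the required sample size climbing sharply as the confidence level rises, which is exactly why a more cautious threshold means a longer experiment.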
Designing the A/B test
Step 1: Be clear on the key success metric
If you don’t have clarity on the metric that you want to measure when A/B testing, you will run into issues:
- No vision of what success looks like: Not knowing what result you want to achieve undermines the experiment end to end, from how you design it to what actions you take afterwards, and weakens outcomes
- No idea how long it will take: You cannot predict how long you need to run the experiment to detect a change
- Texas sharpshooter problem: Without a predefined metric, you’ll cherry-pick whichever positive result appears and attribute it to the experiment
A great primary success metric is:
- Leading: Predictive metrics over lagging metrics. For example, revenue per visit rather than customer lifetime value
- Meaningful: Fully aligned with the broader goals you want to achieve
- Measurable: Easily updated and visible
- A big metric, not a small metric: Able to demonstrate impact in a short time frame
- Result focused: Has real impact upon the funnel rather than being a vanity metric. For example, people moving into the checkout funnel, rather than clicks on the product page.
Common primary metrics are:
- Conversion rates: movement from one stage of the Pirate / AARRR funnel to another. Often these focus on activation, purchase or retention metrics (see the analysis sketch after this list)
- Leads captured: demonstrated intent, such as requesting a demo
- CTA clicks: these might be on ads, or core site landing pages such as the homepage. The aim here is to show that marketing changes can drive meaningful acquisition uplifts
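As an illustration of how a conversion-rate primary metric is typically analyzed once a test has finished, here is a minimal sketch in Python using statsmodels’ two-proportion z-test. The visitor and conversion counts are made-up example data, and the 95% threshold is simply the confidence level discussed above; substitute whatever level your business agreed on.

```python
from statsmodels.stats.proportion import proportions_ztest

# Made-up example data: purchases out of visitors for variant B and control A.
conversions = [610, 530]
visitors = [10_000, 10_000]

# Two-proportion z-test on the primary conversion-rate metric.
z_stat, p_value = proportions_ztest(conversions, visitors)

confidence = 0.95  # the confidence level the business committed to up front
if p_value < 1 - confidence:
    print(f"Significant at {confidence:.0%} confidence (p = {p_value:.3f})")
else:
    print(f"Not significant at {confidence:.0%} confidence (p = {p_value:.3f})")
```

Whatever tool you use, the key point is the same: compare the p-value against the significance threshold (1 minus your confidence level) that you committed to before the test started, rather than hunting for whichever metric happens to look good.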
Get the Hustle Badger Guide to Success Metrics