A/B testing is a powerful technique for comparing different user experiences and seeing, at a granular, statistical level, which performs better.
When you can run A/B tests quickly and competently, you can rapidly improve the user experience and drive significant business impact.
However, A/B testing is prone to errors and misinterpretation if not done correctly, so you want to make sure you understand the principles behind it before you get going.
In this article we’ll walk you through designing, running and analysing A/B tests step by step, so you can run and interpret your own tests with confidence.
Get the Hustle Badger Guide on When to Use A/B Testing in Product Management
A/B Testing Prep
Before you start A/B testing, there are a number of steps you need to take to prepare. These lay the foundations to run high quality A/B tests quickly and efficiently:
- Have the right people involved
- Make sure your data is ready
- Set up a good A/B testing tool
- Get started with an easy test
Let's run through these in more detail.
Have the right people involved
For A/B testing, you’ll need three skill sets in particular:
- Analytics - To check data accuracy, help set up experiments and run analysis. Most likely means having a data analyst involved.
- Engineering - To write the code, create feature flags, and turn these off and on. Most likely means having a developer involved.
- Commercial - To prioritise tests, balance the use of other research techniques and make sure the testing program delivers business value. Most likely means having a product manager involved.
While it's possible that all these skills are present in a team of 1 or 2 people, a setup that involves a PM, engineer and data analyst is the most common.
Make sure your data is ready
Before you start A/B testing, you need to make sure that the data underpinning your tests is in good shape.
In practice that means:
- Define the metrics used
- Capture the metrics reliably
- QA the data
- Agree how you will segment users
Define the metrics used
Start by clarifying what metrics you might use to measure success. This could be different things for different experiments, but for any potential success metrics you’ll want to make sure:
- You have a clear definition of how to calculate this metric
- The definition is captured and widely shared in a data dictionary or similar (see the sketch below)
Rather than picking a single metric (e.g. revenue), make sure you have a few options to pick from that cover leading and lagging metrics throughout your funnel. These could include:
- Revenue / transactions
- Completion of funnel steps
- Engagement with core features
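As a minimal sketch of what a data-dictionary entry might look like, here’s one in Python. The metric, event names and fields are illustrative assumptions, not a standard schema:

```python
# Illustrative data-dictionary entry - the metric, event names and fields
# below are assumptions for this example, not a standard schema.
checkout_conversion_rate = {
    "name": "Checkout conversion rate",
    "definition": "Users completing a purchase / users starting checkout",
    "numerator_event": "purchase_completed",
    "denominator_event": "checkout_started",
    "type": "lagging",            # sits at the bottom of the funnel
    "owner": "growth-analytics",  # who to ask when the definition changes
}
```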
Capture the metrics reliably
Once you’ve decided which metrics you might use, you need to check that you are recording the events used to calculate these metrics accurately and reliably:
- Map out the events - List out the specific user actions you’ll track (e.g. “Clicked sign-up button,” “Visited pricing page,” “Completed purchase”), and make sure each is accurately captured. Don’t forget events that happen through edge cases or off the happy path.
- Stick to naming conventions - Define a standard naming scheme for your events, document it, and stick to it. Consistent labeling will help you avoid duplication, omission and other errors, as well as make your data queries more readable (see the sketch after this list).
- Set up data schemas - If you store data in a data warehouse such as BigQuery, Redshift, or Snowflake, create tables that align with your tracking plan, so each event has a well-defined schema.
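To make the tracking plan and naming convention concrete, here’s a hedged sketch in Python. The `track` function, the snake_case “object_action” convention and the event names are all assumptions for illustration; your analytics SDK will have its own API:

```python
import re
from datetime import datetime, timezone

# Assumed convention: snake_case "object_action" names, documented in one place
EVENT_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)*$")
TRACKING_PLAN = {"signup_button_clicked", "pricing_page_visited", "purchase_completed"}

def track(event_name: str, properties: dict) -> dict:
    """Validate an event against the tracking plan before sending it."""
    if event_name not in TRACKING_PLAN:
        raise ValueError(f"'{event_name}' is not in the tracking plan")
    if not EVENT_NAME_PATTERN.match(event_name):
        raise ValueError(f"'{event_name}' breaks the naming convention")
    payload = {
        "event": event_name,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **properties,
    }
    return payload  # in practice you'd send this to your analytics pipeline

track("purchase_completed", {"user_id": "u_123", "value": 49.99})
```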
Check out our 70+ product metrics cheat sheet for guidance on selecting and using leading metrics
QA the data
With everything set up correctly in theory, make sure you’ve actually tested the data collection and that it’s coming through as expected:
- Test events - Make sure that events are being captured as expected when test users complete the actions you’re tracking.
- Include tracking in your automated tests - Make this a standard part of your unit tests, so that you are confident future changes don’t disrupt the tracking.
- Cross-reference - Sense check the data you are capturing against other data points you have - both quantitative and qualitative.
- Check historical trends - Look back over time to make sure you don’t see any anomalies that would suggest tracking errors.
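The historical-trend check is easy to automate at a basic level. Here’s a rough sketch that flags days where event volume deviates sharply from the average - assuming you can pull daily event counts from your warehouse. It’s a crude signal, not a full anomaly-detection system:

```python
from statistics import mean, stdev

def flag_anomalies(daily_counts: list[int], threshold: float = 2.0) -> list[int]:
    """Return indices of days whose event volume is more than `threshold`
    standard deviations from the mean - a crude signal that an event may
    have stopped firing (or started double-firing)."""
    mu, sigma = mean(daily_counts), stdev(daily_counts)
    return [i for i, count in enumerate(daily_counts)
            if sigma and abs(count - mu) / sigma > threshold]

# A sudden drop to near zero usually means tracking broke on that day
counts = [1020, 987, 1043, 996, 1011, 12, 1008]
print(flag_anomalies(counts))  # -> [5]
```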
Agree how you will segment users
Finally you’ll want to make sure that you have a robust process for assigning users to different test segments (i.e. control and test variants):
- Agree user identifiers - Decide how you’ll assign users to different experiences. Will this be user ID, device ID or something else? Whatever you choose, you’ll need consistent usage across the user experience, tracking and test platform to make your data meaningful.
- Check the experience - Check that users from different segments are being shown the user experience you expect them to, and they stay in the right test groups throughout their journey and across different sessions.
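A common way to get this consistency is deterministic bucketing: hash a stable identifier together with the experiment name, and map the result to a variant. A minimal sketch, assuming a 50/50 split and an illustrative experiment name:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, test_percentage: int = 50) -> str:
    """Deterministically assign a user to 'control' or 'test'.

    The same user_id and experiment name always hash to the same bucket,
    so users see a consistent experience across sessions and devices that
    share the identifier - with no assignment state to store.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a stable bucket in 0..99
    return "test" if bucket < test_percentage else "control"

print(assign_variant("u_123", "new_checkout_flow"))  # stable across calls
```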
Set up a good A/B testing tool
You’ll also need a decent tool to manage your A/B tests. Experimentation tools have two main functions:
- Assign traffic - Allow you to adjust the proportion of traffic between variant and control, including turning A/B tests off or rolling them out to 100% (sketched after this list)
- Analytics - Provide real-time charts and analysis on which variants are performing best.
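To show how the traffic-assignment side typically works, here’s a sketch combining the bucketing idea above with a live config value of the kind your testing tool would expose. The config structure and names are assumptions:

```python
import hashlib

# Assumed config - in practice this lives in your experimentation tool,
# where you can edit it without deploying code.
EXPERIMENT_CONFIG = {
    "new_checkout_flow": {"enabled": True, "test_percentage": 50},
}

def get_experience(user_id: str, experiment: str) -> str:
    """Route a user to 'test' or 'control' based on live config."""
    config = EXPERIMENT_CONFIG.get(experiment, {})
    if not config.get("enabled"):
        return "control"  # kill switch: disabling the test reverts everyone
    bucket = int(hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest(), 16) % 100
    # Setting test_percentage to 100 rolls the variant out to all users
    return "test" if bucket < config["test_percentage"] else "control"
```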
There are plenty of A/B testing tools on the market. Each will require both a contract and an integration with your systems, so they make most sense when you want to run experiments on a regular basis, and not just as a one-off.
Get started with an easy test
If you don't have a lot of experience running A/B tests then start simple and build up. This will let you:
- Get used to the tools
- Test your internal process for running experiments
- Check the data is accurate
Starting with a low stakes test can help you build confidence and credibility before running a high profile test that senior stakeholders will be watching closely.
Some great places to start are:
- A/A tests to validate your tooling (see the sketch after this list)
- Paid ad experiments
- Minor UI changes to landing pages
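An A/A test shows both groups the identical experience, so you expect no significant difference in your metrics beyond the false-positive rate of your significance level. A quick sketch of that check using a two-proportion z-test (the conversion numbers are made up):

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Two identical experiences: p-values under 0.05 should appear only ~5% of
# the time across repeated A/A tests. Seeing them far more often suggests a
# problem with assignment, tracking or analysis.
print(two_proportion_z_test(480, 5000, 510, 5000))  # ~0.31 here
```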
Make sure everyone understands that the purpose of this first test is not to maximise impact, but to iron out any wrinkles in your ways of working. You don’t want to damage people’s confidence in your plans for wider experimentation as you’re getting going.
Get the Hustle Badger Guide to Success Metrics
