Quantitative testing measures the impact of product changes through numerical data like conversion rates and revenue.
While A/B testing dominates this field, other methods include multivariate testing, non-inferiority testing, and market comparisons.
Each approach has specific use cases and limitations.
This guide will help you understand when to use different testing methods—and when testing might not be the right choice at all.
Why is quantitative testing important?
Quantitative testing is important for a number of reasons:
- Data-driven decision-making - By measuring key metrics (e.g. conversion rates or revenue), teams focus on what actually works rather than gut feel.
- Faster learning - Numerical insights reduce trial-and-error cycles, enabling faster, more effective iterations.
- Confident rollouts - You can measure the impact a feature will have before exposing it to the full user base.
Categories of quantitative testing
Quantitative tests fall into two categories:
- Controlled experiments - These split traffic randomly between variants to isolate the impact of changes (see the assignment sketch below). Examples include A/B and multivariate testing, typically used by high-traffic B2C sites.
- Uncontrolled experiments - These compare existing groups, like different markets or time periods, rather than creating random splits. Common when controlled experiments aren't feasible, especially in B2B settings.
Each method has distinct strengths and ideal use cases that shape when to apply it.
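To make the distinction concrete, here is a minimal sketch of how a controlled experiment typically splits traffic. The `assign_variant` helper is hypothetical, not any particular tool's API: hashing a user ID together with the experiment name gives each user a stable, effectively random bucket.

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically bucket a user into a variant.

    Hashing (experiment + user_id) yields a stable, effectively random
    split: the same user always sees the same variant, and different
    experiments bucket users independently of each other.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Roughly half of all users land in each variant.
print(assign_variant("user_42", "new_checkout_flow"))
```

An uncontrolled experiment skips this step entirely: instead of creating random buckets, it compares groups that already exist, such as last month's users versus this month's.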
Controlled Experiments
Controlled experiments work best when testing small changes that could have a big impact, and when you need to be very sure about your results. They're typically used on larger websites that get lots of visitors, since smaller sites won't generate enough data to draw reliable conclusions.
Benefits:
- These tests are "controlled" because you only change one or two things while keeping everything else the same
- This makes it much easier to understand exactly what impact your changes had
- Because users are randomly assigned, outside factors hit both variants equally, so the results are reliable and trustworthy (see the significance-test sketch below)
Limitations:
- You can only test a few small changes at a time, which makes this better for fine-tuning than big changes
- These tests don't work in many situations, like when you have few visitors, when testing SEO changes, or when there are legal requirements
- They also won't work when you need to make big changes all at once, like launching a new brand across your whole site
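To show what "reliable" means in practice, here is one common way to read out a controlled experiment: a two-proportion z-test on conversion counts, sketched with statsmodels. All the numbers are made up for illustration.

```python
# Evaluating an A/B test with a two-proportion z-test (illustrative numbers).
from statsmodels.stats.proportion import proportions_ztest

conversions = [420, 480]       # control, treatment conversions
visitors = [10_000, 10_000]    # users exposed to each variant

z_stat, p_value = proportions_ztest(conversions, visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
# A p-value below your chosen threshold (commonly 0.05) suggests the
# difference between variants is unlikely to be random noise.
```

Because users were randomly assigned, a significant result here can be attributed to the change itself rather than to outside factors.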
Uncontrolled Experiments
These are often used when controlled tests aren't possible, or when you just need a rough idea of whether a change helped or hurt.
Benefits:
- You can move faster and launch more changes
- If a change has a big impact, you can spot it by comparing metrics before and after (see the sketch below)
- You can still catch any major problems
- This approach is often your only option, so it's a valuable tool to have
Limitations:
- The results aren't definitive - you can't be completely sure what caused what
- You can't tell if changes in your results came from your updates or from outside factors (like seasonal changes or market shifts)
- It's harder to understand why something worked or didn't work
- While you can segment the numbers in detail, without a randomized control group it's hard to build a complete picture of what drove the results
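As a sketch of what the before-and-after comparison looks like, here is a minimal example with made-up daily conversion rates. Note the caveat in the comments: nothing in this calculation rules out seasonality or other outside factors.

```python
# Before-and-after comparison for an uncontrolled experiment
# (all figures are made-up daily conversion rates).
before = [0.041, 0.043, 0.040, 0.044, 0.042, 0.039, 0.043]  # week before launch
after = [0.046, 0.048, 0.045, 0.047, 0.049, 0.044, 0.047]   # week after launch

mean_before = sum(before) / len(before)
mean_after = sum(after) / len(after)
lift = (mean_after - mean_before) / mean_before

# Unlike a controlled test, this lift could also reflect seasonality,
# a marketing push, or anything else that changed at the same time.
print(f"before: {mean_before:.1%}, after: {mean_after:.1%}, lift: {lift:+.1%}")
```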
Picking the right type of quantitative testing
Normally, which type of test you run comes down to one or more of the following constraints:
- Traffic constraints
- Technical constraints
- Legal constraints
- Commercial constraints
Traffic constraints
When running a controlled experiment, you want to get an answer to your hypothesis in a reasonable time frame - we’d suggest within 4-6 weeks.
The run time of a controlled experiment is influenced by:
- Overall traffic - How many users will your experiment reach?
- Size of impact - How much does your success metric actually move?
- Sample size - What proportion of your audience will be exposed to the experiment?
In many cases these constraints mean that you can’t reach statistical significance within that 4-6 week timeframe, so you’ll need to adapt your approach.
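To see how these three factors interact, here is a hedged sketch of a standard sample-size and run-time calculation using statsmodels. The baseline rate, target lift, traffic, and allocation figures are all assumptions for illustration.

```python
# Estimating A/B test run time (all input figures are assumptions).
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.040      # current conversion rate
target = 0.044        # smallest lift worth detecting (size of impact)
daily_users = 2_000   # overall traffic entering the experiment per day
allocation = 1.0      # proportion of traffic exposed (sample size)

# Users needed per variant for 80% power at a 5% significance level.
effect = proportion_effectsize(target, baseline)
per_variant = NormalIndPower().solve_power(effect_size=effect,
                                           alpha=0.05, power=0.8)

days = (2 * per_variant) / (daily_users * allocation)
print(f"~{per_variant:,.0f} users per variant, ~{days:.0f} days to run")
```

Under these assumptions the test needs roughly 20,000 users per variant, around three weeks at full allocation. Halve the detectable effect and the required sample roughly quadruples, blowing well past that 4-6 week window.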
Technical constraints
In some situations you cannot split traffic between the variants you want to test for technical reasons.
A common example is testing the effectiveness of changes aiming to improve SEO. Search engines only see one version of each page on your site, and serve up one set of results to users (barring their own A/B tests!). That means you can’t A/B test changes you make to the content or technical architecture of your pages to improve SEO.
Legal constraints
In heavily regulated industries, or when new legal requirements come in, sometimes you have no choice but to ship to all users.
For example, when GDPR was introduced, all websites serving EU users needed to add cookie consent banners, regardless of the impact this had on the user experience.
Commercial constraints
There may be times where you do not want to run a controlled test for commercial reasons.
For example, if you want to change your pricing, you might want to do this for everyone at the same time, rather than show different customers different prices.
Let’s now go through both categories in detail.