Are you tired of choosing deals at random and wasting your hard-earned cash on duds?

3 March 2023

8 min read

Does your method of selecting offers resemble betting on football matches at the bookies? Or maybe you rely on a hunch that this time you will succeed? Instead, why not try conducting a reliable A/B test in a few simple steps and choose a guaranteed winner without having the foggiest idea about statistics?

This short blueprint will enable you to:

  • Prepare the A/B test in 3 minutes
  • Carry it out as cheaply as possible and even make money on it
  • Before scaling, make sure that the new offer beats the control offer, giving you decent earnings

PART 1: A/B test calculator

How to find the correct test parameters in 3 minutes without having a clue about statistics (regardless of whether you use Voluum, Bemob, Binom or Keitaro)

A/B testing is founded on statistical principles. The more data we have, the greater the chance of making the correct decision. Conducting an A/B test with limited data is a waste of both time and money.

The results won’t be any better than a coin toss – scaling on this basis usually leads to bigger losses.

No matter what tool you use to conduct A/B testing, the principles are always the same. Below you’ll find a link to a simple calculator you can use to see if the data you’ve collected is sufficient to decide which offer to scale. You can use the calculator independently of the selected tracker.

https://marketing.dynamicyield.com/bayesian-calculator/

  • Samples: the number of samples, e.g. clicks on a banner or the number of Landing Page visits
  • Conversions: the number of conversions

By entering different values for Samples and Conversions, we can check the probability that, with the given data, one variant is better than the other, and whether the difference is statistically significant.
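If you're curious what such a Bayesian calculator does behind the scenes, here is a minimal sketch in Python of that kind of computation. The function name and all the figures are hypothetical and only meant for illustration; the calculator itself may use different priors and methods.

```python
# A minimal sketch of a Bayesian "probability to be best" computation.
# All numbers and names here are hypothetical, not taken from the calculator.
import numpy as np

def prob_red_beats_blue(samples_blue, conv_blue, samples_red, conv_red,
                        draws=200_000, seed=0):
    """Monte Carlo estimate of P(CR_red > CR_blue) using Beta(1, 1) priors."""
    rng = np.random.default_rng(seed)
    # Posterior of each variant's CR: Beta(conversions + 1, non-conversions + 1)
    cr_blue = rng.beta(conv_blue + 1, samples_blue - conv_blue + 1, draws)
    cr_red = rng.beta(conv_red + 1, samples_red - conv_red + 1, draws)
    return (cr_red > cr_blue).mean()

# Hypothetical example: 20,000 visits per variant, roughly 1% CR each
print(prob_red_beats_blue(20_000, 200, 20_000, 240))  # roughly 0.97 for this made-up data
```

The takeaway is the same as with the online calculator: with small Samples and Conversions values the probability hovers near 50%, and only a larger sample pushes it towards a confident answer.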

Estimating the probability of winning “at random” with little data to go on can be likened to flipping a coin – you have a 50% chance of losing money by promoting an offer selected this way.

PART 2: How to conduct the test correctly

While minimising the risk of losses

  1. Example of an incorrectly executed test – ending too early

Let’s look at a sample A/B test where both variants had a CR of about 1%, and see the mistake we would make by stopping the test too early, before enough data had been collected.

The chart below shows the number of conversions collected by two variants in the A/B test. Each variant received about 20,000 hits on the site. It should be noted that for the first 4,000–5,000 clicks, the red variant seemed to be inferior or no better than the blue.

It was only after about 100 conversions per variant that we could see that the red variant was actually better (it needed fewer clicks to reach 100 conversions). This means that finishing the test earlier would have led to rejecting the better variant and therefore losing $$$!

AND ALL IT TOOK WAS…

Waiting patiently for more data to be collected, for example, 100 conversions per variant. It should be remembered that A/B testing is not just about pure statistics – it is important to eliminate other factors that may distort test results, such as the day of the week when the test was conducted. The best A/B tests are run for full weeks (one or even several weeks), so you don’t run the risk of making a losing decision due to seasonality.
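To see why patience pays off, here is a rough simulation (with made-up conversion rates of 1.0% and 1.2%, not the article's actual data) of how often the truly better variant wins when the test is stopped after a given number of conversions per variant:

```python
# Rough simulation: how often does the truly better variant "win" (need fewer clicks)
# if we stop once each variant has collected k conversions? Rates are hypothetical.
import numpy as np

rng = np.random.default_rng(42)

def correct_pick_rate(k, cr_blue=0.010, cr_red=0.012, trials=5_000):
    """Fraction of simulated tests in which the red variant (truly better)
    reaches k conversions in fewer clicks than the blue variant."""
    # Clicks needed for k conversions at rate p: k + NegativeBinomial(k, p) failures
    clicks_blue = rng.negative_binomial(k, cr_blue, trials) + k
    clicks_red = rng.negative_binomial(k, cr_red, trials) + k
    return (clicks_red < clicks_blue).mean()

for k in (20, 50, 100, 300):
    print(k, round(correct_pick_rate(k), 2))
# Even at ~20 conversions per variant the worse variant still wins a sizeable
# share of runs; waiting until 100+ makes the correct pick far more likely.
```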

  2. Example of a correctly executed test

Below we see an example of an optimally executed test, where the decision was postponed until approximately 300 conversions per variant had been collected. Already at 200 conversions it was possible to conclude that the red variant was better (it needed fewer clicks), and prolonging the test further only strengthened the affiliate's confidence that the decision was correct.

  3. Conclusions

TEST PERFORMED INCORRECTLY = GAMBLE

In the first test, a variant that initially looked promising was mistakenly selected, as it proved to be weaker once more data had been acquired. Ending the test too early resulted in selecting an offer that converted about 20% worse!

TEST PERFORMED CORRECTLY = YOU ACT WITH CONFIDENCE AS IF YOU KNEW THE LOTTERY NUMBERS

In the second test, a decision was only made after 200–300 conversions were achieved, ensuring that the winning test variant was selected with confidence.

PART 3: Statistical terms used in A/B tests

using the example of basketball

If we want to check which player is a better 3-point shooter, we won’t be able to do this after a few shots, but only after a few dozen. The more confident we want to be in discovering which variant is better, the larger the sample (number of shots) we need. In statistics, this is called the power of a test.
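To make the analogy concrete, here is a quick hypothetical simulation (the shooting percentages are invented) showing how the chance of correctly spotting the better shooter grows with the number of shots, which is exactly what higher test power means:

```python
# Hypothetical "power" illustration: shooter A truly hits 33% of threes, shooter B 40%.
import numpy as np

rng = np.random.default_rng(1)

def chance_of_spotting_better_shooter(shots, p_a=0.33, p_b=0.40, trials=10_000):
    """How often does B (the truly better shooter) make strictly more of `shots` attempts?"""
    made_a = rng.binomial(shots, p_a, trials)
    made_b = rng.binomial(shots, p_b, trials)
    return (made_b > made_a).mean()

for shots in (10, 30, 100, 300):
    print(shots, round(chance_of_spotting_better_shooter(shots), 2))
# After a handful of shots it's close to a coin toss; after a few hundred
# the better shooter comes out on top in the vast majority of simulations.
```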

The less different the variants tested are, the larger the sample needed. In statistics, the expected difference between test subjects is referred to as the effect size. Basketball analogy – if both players shoot with similar effectiveness, you need to let them perform more shots in order to pick the better player. Because if one of them, for example, didn’t hit the basket at all and the other always did, you wouldn’t need many shots before you made your decision.

If we get two players with the same effectiveness, and we don’t want to accidentally conclude that one of them is better than the other, we assume the so-called significance level. The lower the significance level (e.g. 5%), the larger the sample we need to avoid this type of error.
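These three terms (power, effect size, significance level) feed straight into a classical sample-size estimate. Below is a rough sketch of that calculation; the 1% baseline CR, the 20% relative lift and the function name are assumptions for illustration, not fixed recommendations:

```python
# Rough classical sample-size sketch for comparing two conversion rates.
# Baseline CR, expected lift, alpha and power below are illustrative assumptions.
from math import ceil
from statistics import NormalDist

def clicks_per_variant(p_base, relative_lift, alpha=0.05, power=0.80):
    """Approximate sample size per variant for a two-proportion z-test."""
    p_new = p_base * (1 + relative_lift)
    p_avg = (p_base + p_new) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # significance level
    z_power = NormalDist().inv_cdf(power)           # power of the test
    n = ((z_alpha + z_power) ** 2 * 2 * p_avg * (1 - p_avg)) / (p_new - p_base) ** 2
    return ceil(n)

# ~1% baseline CR, hoping to detect a 20% relative lift (the effect size)
print(clicks_per_variant(0.01, 0.20))  # on the order of 40,000+ clicks per variant
```

Note how a smaller effect size or a stricter significance level quickly blows up the required sample, which is exactly the "similar shooters need more shots" intuition from the basketball analogy.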

Conclusion: Statistics can’t be cheated because this science is based on rigorous mathematical rules. If we don’t allow enough data to be collected, then we take the risk of choosing an inferior player for the team, making the team weaker and geared towards losing.
