Summary: Most of us have run an A/B test at some point, but how much do we really know about A/B testing? Is A/B just traffic splitting? What makes an A/B experiment scientific? In this article, an Ali front-end technology expert draws on some recent study to give a systematic and accessible introduction to A/B Testing, hoping it will help you.

What is A/B Testing?

A/B can be defined at several levels. In general terms, A/B is a tool that splits users between version A and version B and compares the data from each to see which version performs better and contributes more to the product goals.

But I would rather define it in terms of what A/B actually is.

In business iteration, we define data metrics for our products (which directly or indirectly reflect our business goals), and then we keep forming hypotheses, expecting that the changes they suggest will improve the corresponding metrics. Here, A/B is a method for measuring whether a proposed business-improvement hypothesis is valid; in statistical terms, it is a form of hypothesis testing.

I think the advantage of this definition is that A/B becomes not just a tool but an iterative way of thinking that is integrated with business development. In addition, there is a real statistical basis behind A/B, so you pay more attention to whether each business hypothesis actually holds.

In user growth, the biggest mistake is to blindly copy the growth tactics of other business lines while skipping the analysis and reasoning for your own business. Whether something works can only be known by testing it.

At what stage is a product suitable for A/B Testing?

For a start-up project, A/B testing is not a good fit when the product has just been incubated. The goal at that point is clear: quickly build a “prototype” product and its overall framework so that the product gets born, and there are not yet many details to optimize.

Once the product has reached a certain stage, with a relatively stable model and a phase of rapid iteration, A/B Testing is a much better fit for helping the business develop.

Steps of A/B Testing

Before talking about the steps, I want to stress that A/B Testing does not mean running one experiment, getting a result, and never needing A/B again. It is a process of continuously optimizing and understanding products and users.

Therefore, the steps described here are not about how to configure an A/B experiment on a platform, but more broadly about how to optimize a product with A/B Testing.

The industry generally divides A/B Testing into 8 steps.

This is the 8-step division I came across in my study. Note that creating the A/B experiment, the part we engineers pay most attention to, is only the fourth and fifth steps. A lot of work comes before that, so what should we do in each step to run A/B scientifically? Let’s see.

1. Build a product funnel

This step is often neglected in our work. In my opinion, everyone, whether on the business side or the technical side, needs to understand the product flow and the user funnel. Only when we know where users come from and where we want them to go can we prepare for growth. For example, a user-acquisition (“pull new”) flow has its own funnel of steps that new users pass through.

2. Determine the core metrics of the product flow

Having identified the product funnel, we need to decide which core metrics to watch along the product flow.

If your focus is a single page, you will probably look at the metrics of that page. If you are looking at a long product flow, you should look at the metrics between the nodes of the entire flow.

In the user-acquisition example above, we might look at the number of users at each node (PV/UV), the conversion rate at each layer (e.g., clicks/exposures), and so on.
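As a rough illustration of these per-node metrics, the sketch below computes the conversion rate between adjacent funnel nodes and across the whole funnel. The stage names and UV counts are invented for the example; they are not data from any real product.

```python
from typing import List, Tuple

# Hypothetical funnel for a user-acquisition flow: (stage name, unique visitors).
# Stage names and counts are made up purely for illustration.
FUNNEL: List[Tuple[str, int]] = [
    ("landing_page_exposure", 10_000),
    ("button_click", 2_500),
    ("signup_form_submit", 1_000),
    ("first_order", 200),
]


def step_conversion_rates(funnel: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Return the conversion rate from each stage to the next one."""
    rates = []
    for (prev_name, prev_uv), (next_name, next_uv) in zip(funnel, funnel[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        rate = next_uv / prev_uv if prev_uv else 0.0
        rates.append((f"{prev_name} -> {next_name}", rate))
    return rates


def overall_conversion_rate(funnel: List[Tuple[str, int]]) -> float:
    """Return the conversion rate from the first stage to the last one."""
    first_uv, last_uv = funnel[0][1], funnel[-1][1]
    return last_uv / first_uv if first_uv else 0.0


if __name__ == "__main__":
    for step, rate in step_conversion_rates(FUNNEL):
        print(f"{step}: {rate:.1%}")
    print(f"overall: {overall_conversion_rate(FUNNEL):.1%}")
```

Looking at the per-step rates rather than only the overall rate makes it easier to see which layer of the funnel is the weakest and therefore the most promising place for a hypothesis.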

Once the metrics are identified, we need to track them over the long term.

3. Observe metrics and propose optimization hypotheses

Our product colleagues can then analyze the current state of the business from the metrics and propose business hypotheses for the metrics that need to be improved. This is where statistics comes in.

The hypothesis here actually comes in two kinds:

The Null Hypothesis, also known as the original hypothesis, is the hypothesis that we hope to disprove through the experimental results.

The Alternative Hypothesis is the hypothesis that we hope to confirm through the experimental results.

You can see that the null hypothesis is pessimistic. Why set things up this way? I was really confused about this at the beginning. For now we just introduce the two concepts (the null hypothesis and the alternative hypothesis); their role will become clear a few steps later.

Let’s say our scenario is to optimize the click-through rate of a button on the page, and our idea is to increase the size of the button.

The null hypothesis is then: increasing the size of the button will not change its click-through rate.

The alternative hypothesis is: increasing the size of the button will change its click-through rate. (Strictly speaking, “change” covers both an increase and a decrease, which makes this a two-sided hypothesis. Many presentations state only the increase, a one-sided hypothesis, presumably because we do not normally set out to make a metric worse; this point is open to discussion.)

Note also that in hypothesis testing, the null hypothesis and the alternative hypothesis are mutually exclusive: exactly one of them is true.

Having identified the hypothesis, we proceed to the design of the experiment.

4. Design the A/B experiment plan

In terms of experimental design, we need to clarify some information:

We have to specify the objective of the experiment, including the hypothesis mentioned above.

For the experimental groups, we need to decide how to split them, whether to include an A/A control, and how much traffic to allocate to the experiment.

For delivery, we need to decide whom the experiment targets. Should it run only in a specific region? Or only on a particular client?

It is also better to change only one “variable” at a time in an A/B experiment, which makes the subsequent data analysis easier and the conclusions clearer. (Under time pressure you can change several variables at once, as in the classic A/B test from Obama’s election campaign.)

5. Develop the A/B experiment

This is the stage we are most familiar with. The usual project requirement review starts here. Developers write the UI logic and the bucketing logic with the help of the Runtime SDK.
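The Runtime SDK itself is not shown in this article, so the following is only a minimal sketch of what bucketing logic commonly looks like, not the SDK’s actual API. It assumes a hypothetical `assign_bucket` helper that hashes the user ID together with the experiment name, so that the same user always lands in the same bucket and different experiments split users independently of each other.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, traffic_percent: int = 100) -> str:
    """Deterministically map a user to 'A', 'B', or 'not_in_experiment'.

    Hypothetical helper for illustration only.
    traffic_percent is the share of all users (0-100) exposed to the experiment.
    """
    if not 0 <= traffic_percent <= 100:
        raise ValueError("traffic_percent must be between 0 and 100")

    # Hash experiment name + user ID so each experiment gets its own split.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    value = int(digest, 16)

    # Use one part of the hash to decide exposure ...
    exposure_slot = value % 100
    if exposure_slot >= traffic_percent:
        return "not_in_experiment"

    # ... and a different part to pick the variant, so the A/B split
    # stays even regardless of the traffic percentage.
    variant_slot = (value // 100) % 2
    return "A" if variant_slot == 0 else "B"


if __name__ == "__main__":
    # "bigger_button" is a made-up experiment name.
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_bucket(uid, "bigger_button", traffic_percent=50))
```

Hashing instead of drawing a random number at request time means no assignment table has to be stored, and a returning user keeps seeing the same version.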

6. Run the experiment

After development is complete, we are ready to go online. At this point we set up the experiment’s run-time configuration. We mainly need to set:

The required sample size for the metric (which in turn determines how long the experiment runs).

The significance level (α) and the statistical power (1-β) of the experiment. In the industry, α is generally set to 5% and β to 10%~20%, which corresponds to a power of 80%~90%.
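To show how these settings translate into a sample size and a running time, here is a sketch using the standard normal-approximation formula for comparing two proportions with a two-sided test. This is one common textbook formula, not necessarily what any particular A/B platform uses, and the baseline rate, minimum detectable effect, and daily traffic below are assumed values for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    baseline_rate: float,
    minimum_detectable_effect: float,
    alpha: float = 0.05,
    beta: float = 0.20,
) -> int:
    """Users needed in EACH group for a two-sided two-proportion test.

    baseline_rate: current conversion rate of the control group (e.g. 0.10).
    minimum_detectable_effect: absolute change we want to detect (e.g. 0.02).
    """
    if minimum_detectable_effect == 0:
        raise ValueError("minimum_detectable_effect must be non-zero")

    p1 = baseline_rate
    p2 = baseline_rate + minimum_detectable_effect
    p_bar = (p1 + p2) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def estimated_days(users_per_group: int, daily_users: int, groups: int = 2) -> int:
    """Days needed to collect the sample, given daily users entering the experiment."""
    return math.ceil(groups * users_per_group / daily_users)


if __name__ == "__main__":
    # Assumed numbers: 10% baseline click-through rate, detect a 2-point change.
    n = sample_size_per_group(0.10, 0.02)
    print("users per group:", n)
    print("days at 1000 users/day:", estimated_days(n, daily_users=1000))
```

The smaller the effect we want to detect, or the smaller the β we are willing to accept, the more users the experiment needs and the longer it has to run.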

Why set a significance level (α) and a statistical power (1-β)?

Because every experiment is probabilistic and subject to statistical error, and that error can lead us to the wrong judgment.

The common misjudgments are:

Type I error (false positive): rejecting the null hypothesis when it is true. The probability of a Type I error is denoted α (alpha), which is the significance level.

Type II error (false negative): failing to reject the null hypothesis when it is false. The probability of a Type II error is denoted β (beta), and its complement (1-β) is the statistical power.

To put it more simply, take the example above:

A Type I error: enlarging the button does not actually change its click-through rate, but because of random error we conclude that it does.

A Type II error: enlarging the button actually does change its click-through rate, but because of random error we conclude that it does not.
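One way to get a feel for the two kinds of error is to simulate many experiments. The sketch below is only an illustration under assumed numbers (click-through rates, group sizes and the number of runs are all made up), and it uses a standard two-proportion z-test as the decision rule. When both groups share the same true rate, every rejection is a Type I error; when the rates really differ, every failure to reject is a Type II error.

```python
import math
import random
from statistics import NormalDist


def two_sided_p_value(clicks_a: int, n_a: int, clicks_b: int, n_b: int) -> float:
    """p value of a two-proportion z-test with a pooled standard error."""
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all: no evidence against the null hypothesis
    z = (clicks_b / n_b - clicks_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(
    rate_a: float,
    rate_b: float,
    users_per_group: int,
    runs: int,
    alpha: float = 0.05,
    seed: int = 42,
) -> float:
    """Fraction of simulated experiments in which the null hypothesis is rejected."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    rejections = 0
    for _ in range(runs):
        clicks_a = sum(rng.random() < rate_a for _ in range(users_per_group))
        clicks_b = sum(rng.random() < rate_b for _ in range(users_per_group))
        if two_sided_p_value(clicks_a, users_per_group, clicks_b, users_per_group) < alpha:
            rejections += 1
    return rejections / runs


if __name__ == "__main__":
    # Same true rate in both groups: rejections here are Type I errors.
    type_1 = rejection_rate(0.10, 0.10, users_per_group=1000, runs=200)
    # Different true rates: the rejection rate is the power, 1 - power is Type II.
    power = rejection_rate(0.10, 0.15, users_per_group=1000, runs=200)
    print(f"Type I error rate: {type_1:.3f}")
    print(f"Power: {power:.3f}, Type II error rate: {1 - power:.3f}")
```

With enough runs, the simulated Type I error rate should hover around the chosen α, while the Type II error rate depends on how large the real difference is and how many users each group has.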

If this is a little tricky, read it over a few more times. Once these settings are in place and the code is released, we can launch the experiment.

7. Analyze the experimental data

As we said earlier, the statistical essence of A/B Testing is hypothesis testing.

Of course, before we start hypothesis testing, we need to verify that the data itself is correct.

Then we look at the experimental data:

Does the result reach the required significance?

Do the results confirm the hypothesis that the metric would improve?

Did the experiment cause other metrics in the funnel to deteriorate?

To check the significance of the experiment, we use a test such as the z-test to compute the p value.

The p value is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one in our sample. The smaller the p value, the stronger the evidence against the null hypothesis. If the p value is less than the significance level (α), we reject the null hypothesis.
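For the button example, the p value can be computed with a two-proportion z-test. The sketch below uses the common pooled-variance version of the test and a two-sided alternative; the click and exposure counts are invented for illustration, and a real platform may use a different test or correction.

```python
import math
from statistics import NormalDist
from typing import NamedTuple


class ZTestResult(NamedTuple):
    z: float
    p_value: float
    reject_null: bool


def two_proportion_z_test(
    clicks_a: int,
    exposures_a: int,
    clicks_b: int,
    exposures_b: int,
    alpha: float = 0.05,
) -> ZTestResult:
    """Two-sided z-test for the difference between two click-through rates."""
    rate_a = clicks_a / exposures_a
    rate_b = clicks_b / exposures_b

    # Under the null hypothesis both groups share one rate, so pool the data.
    pooled = (clicks_a + clicks_b) / (exposures_a + exposures_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / exposures_a + 1 / exposures_b))
    if se == 0:
        # No variation in the data: nothing to distinguish the groups.
        return ZTestResult(0.0, 1.0, False)

    z = (rate_b - rate_a) / se
    # Two-sided: probability of a |z| at least this large if the null is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return ZTestResult(z, p_value, p_value < alpha)


if __name__ == "__main__":
    # Made-up data: A = original button, B = larger button.
    result = two_proportion_z_test(1000, 10_000, 1100, 10_000)
    print(f"z = {result.z:.3f}, p = {result.p_value:.4f}, reject H0: {result.reject_null}")
```

With these made-up counts (10% versus 11% click-through on 10,000 exposures each), the p value comes out at roughly 0.02, below α = 5%, so the null hypothesis of “no change” would be rejected.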

8. Experimental conclusions

Finally, we summarize the experimental conclusions according to the analysis results of this experiment.

For example: in this experiment, we improved metric XX by doing XX, without hurting other metrics. Based on this conclusion, we infer that doing XX is a suitable way to improve metric XX in scenario XX.

Of course, if the desired goal is not achieved, we need to adjust the strategy and put forward further optimization hypotheses.

These eight steps are sometimes condensed into a five-step cycle, but the work involved is pretty much the same.

What challenges do we face in doing A/B Testing in e-commerce business?

With that said, let’s take a look at the current challenges of doing A/B testing in e-commerce.

Personally, I think the main challenges are:

A/B testing intuitively feels costly, and the bar for business teams to accept it is high.

E-commerce businesses are all about running fast, and I have talked with many colleagues about this. In practice, people are reluctant to buy in to A/B testing, because they intuitively feel that the cost is high: two (or n) versions have to be developed, which delays the launch. But it is not only about running fast; it is about running in the right direction.

From the discussion above, I believe we can see that doing business with A/B Testing is a relatively scientific process. With A/B Testing, we pay more attention to hypothesis verification and to data-driven reasoning throughout the business process. Compared with shipping a feature to everyone in one go, launching through A/B also reduces the business risk that each iteration brings. A/B can even help you discover existing problems in the business and better understand the behavior of your users. In addition, the growth experience gained through A/B can be accumulated and generalized.

Finally, A/B is not a one-time thing but a long-term iterative process. Everyone should approach A/B with a mindset of “continuous optimization” rather than “getting it right in one shot”.

From the perspective of an A/B “platform”, we have a lot of questions to answer in order to help the business solve these challenges:

To solve the problem of the high cost of A/B, we can approach it from several angles:

1. Platform usability: how efficient the platform is to operate (whether it is easy to use) and whether its tools are easy to understand (whether the platform can smooth over the cost of learning the many statistical concepts behind A/B).

2. More standardized development. We need to standardize the custom A/B development done by business teams and support it with a development SDK.

3. Improved development efficiency:

On the engineering side, we can use code scaffolding, code generation and similar means to improve efficiency.

In terms of platform functionality, we can provide tools such as a UI Editor that open up “static” and “configuration”-type content to operations and product staff, allowing them to make changes and run A/B experiments themselves, which reduces the cost to developers.

4. A/B capabilities need to be integrated into other processes, platforms and systems.

Then, when working on other platforms in the future, A/B configuration will not feel like a separate step. Of course, we still need to think through the approach here; at present, the cost of integrating A/B capabilities into other platforms is still very high.

I think these are the problems we need to solve step by step.

— — — — — — — —

Copyright: This article is an original article by CSDN blogger “Ali Technology official number”

The original link: blog.csdn.net/alitech2017…