Author: Cola

Source: Cola’s Road to Data Analysis

For reprint authorization, please contact the author (WeChat ID: data_COLA)

This is the second article in an intensive-reading series on The Data Analytic Mind: Analytical Methods and Business Knowledge.

Hello everyone, I am Cola. In last Friday’s intensive reading we summarized the commonly used indicators, how to choose indicators, and how to build an indicator system. If you have not read it yet, have a look: Intensive reading 1: We always talk about the business, so what are the commonly used business indicators?

Today we continue with the book. Chapter 2 describes the methods of data analysis, from the well-known 5W2H to cohort (group) analysis, covering most of the methods you may need in work and life. Let’s go through them one by one.

5W2H analysis method

This method approaches a problem by asking five W questions (What, Why, Who, When, Where) and two H questions (How, How much). It is easy to understand and works well for simple problems; complex business problems call for other methods.

Logical tree analysis method

The logic tree method turns a complex problem into simple ones: the problem is broken down level by level, like branches spreading from a trunk, until each sub-problem is easy to answer. It is the natural tool for Fermi problems, the estimation questions named after the physicist Enrico Fermi that often come up in interviews, such as how many product managers there are in Shenzhen or how many piano tuners there are in Chicago. What the interviewer assesses is usually not the final number but your analysis method, that is, your ability to break a problem down with a logic tree.
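To make the breakdown concrete, here is a minimal sketch of the piano tuner question as a two-branch logic tree (yearly demand for tunings divided by the yearly capacity of one tuner). Every number and the function name `estimate_piano_tuners` are illustrative assumptions, not figures from the book.

```python
# Logic-tree (Fermi) estimate: piano tuners in a city.
# Every default value below is an illustrative assumption, not a measured value.

def estimate_piano_tuners(
    population=2_500_000,               # assumed city population
    people_per_household=2.5,           # assumed average household size
    share_households_with_piano=0.05,   # assumed: 1 in 20 households owns a piano
    tunings_per_piano_per_year=1,       # assumed tuning frequency
    tunings_per_tuner_per_day=4,        # assumed workload of one tuner
    working_days_per_year=250,          # assumed number of working days
):
    # Branch 1: yearly demand for tunings
    households = population / people_per_household
    pianos = households * share_households_with_piano
    tunings_needed = pianos * tunings_per_piano_per_year

    # Branch 2: yearly capacity of a single tuner
    tunings_per_tuner = tunings_per_tuner_per_day * working_days_per_year

    # Root of the tree: demand divided by capacity
    return tunings_needed / tunings_per_tuner


print(round(estimate_piano_tuners()))
```

Changing any single assumption changes the answer, which is the point: the interviewer can challenge each leaf of the tree separately.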

Industry analysis method

When you need to analyze an industry and formulate a development plan, carry out an industry analysis. PEST analysis (Political, Economic, Social, Technological) is the first choice.

Multidimensional breakdown analysis method

Multidimensional breakdown means exactly what the name says: dimensions plus breakdown, that is, looking at a problem from several angles.

Along which dimensions can we break a problem down?

  • By the composition of the metric
  • By the business process

A common interview question is: “What would you do if next-day retention dropped by 5%?” This kind of problem can be tackled with a dimensional breakdown. For details, see the following article:

The metric has dropped yet again: how should I analyze it for my boss?

When breaking data down along several dimensions, we may run into a phenomenon called Simpson’s paradox: examining the data as a whole and examining its parts separately lead to opposite conclusions.

In other words, when two groups are compared within each subgroup, the one that wins in every subgroup can still lose once the subgroups are combined.

The best-known example is the 1973 gender-discrimination case at the University of California, Berkeley. The acceptance rate was 44% for men and 35% for women, and based on these figures some people felt the school tended to discriminate by gender. But looking at the acceptance rate of each department separately, women had a higher acceptance rate than men in four departments: A, B, D, and F. The paradox tells us that a single summary statistic cannot fully describe the complexity behind it, so it is a mistake to look only at the data as a whole and ignore the differences among its parts.
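The reversal is easy to reproduce with a small table. The sketch below uses made-up numbers for two hypothetical departments X and Y (these are not the Berkeley figures) and compares the per-department acceptance rates with the overall rates.

```python
# Hypothetical admissions data: (department, gender) -> (applicants, admitted).
# The numbers are invented for illustration; they are NOT the Berkeley figures.
applications = {
    ("X", "men"): (100, 60),
    ("X", "women"): (20, 14),
    ("Y", "men"): (20, 2),
    ("Y", "women"): (100, 15),
}


def rate_by_department(data):
    """Acceptance rate for every (department, gender) pair."""
    return {key: admitted / applied for key, (applied, admitted) in data.items()}


def overall_rate(data, gender):
    """Acceptance rate for one gender with all departments pooled."""
    applied = sum(a for (_dept, g), (a, _adm) in data.items() if g == gender)
    admitted = sum(adm for (_dept, g), (_a, adm) in data.items() if g == gender)
    return admitted / applied


for key, rate in rate_by_department(applications).items():
    print(key, f"{rate:.0%}")
for gender in ("men", "women"):
    print("overall", gender, f"{overall_rate(applications, gender):.0%}")
```

In this toy data women have the higher rate in both departments but the lower rate overall, because most women applied to the department that accepts few applicants.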

Comparative analysis

Comparative analysis comes down to two questions: whom to compare with, and how to compare.

Whom to compare with

  • Against yourself: year-over-year, period-over-period, fixed-base comparison, comparison with the target value, vertical comparison, horizontal comparison, comparison over a specific period
  • Against the industry: comparison with the industry average

How to compare

  • Overall size of the data: mean, median
  • Overall variability of the data: coefficient of variation
  • Trend: line chart, year-over-year, period-over-period
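The measures in the lists above are simple to compute. A minimal sketch follows; the function names are hypothetical, and the coefficient of variation here uses the population standard deviation, which is one possible choice rather than the book's stated definition.

```python
from statistics import mean, median, pstdev


def year_over_year(current, same_period_last_year):
    """Relative change against the same period one year earlier."""
    return (current - same_period_last_year) / same_period_last_year


def period_over_period(current, previous_period):
    """Relative change against the immediately preceding period."""
    return (current - previous_period) / previous_period


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population std is an assumption)."""
    return pstdev(values) / mean(values)


daily_orders = [100, 120, 80, 100]  # made-up sample data
print(mean(daily_orders), median(daily_orders))
print(coefficient_of_variation(daily_orders))
print(year_over_year(120, 100), period_over_period(90, 100))
```

Because the coefficient of variation is unitless, it can be used to compare the variability of indicators measured on different scales.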

Recommended article on comparative analysis: Data comparative analysis, this one is all you need!

Note: Comparative analysis is used in A/B testing

Hypothesis testing analysis

Hypothesis testing analysis is used to find the cause of a problem, which is why it is also known as attribution analysis. It answers “why” questions, such as why an indicator has declined.

Correlation analysis

Correlation analysis studies the relationship between two or more kinds of data. If one indicator changes along with another, the two are correlated; if a change in one indicator comes first and causes the change in the other, the relationship is causal. For more on correlation, see the following article: Again, correlation analysis

Note that correlation is not causation, and 100% causation is hard to find in real life. How do you tell correlation from causation? The answer is to control a single variable: change just one factor while holding the others fixed, and see how the result changes.
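One common way to quantify how strongly two indicators move together is the Pearson correlation coefficient. The text above does not name a specific measure, so treat this choice, and the function name `pearson_r`, as assumptions. A value near 1 or -1 shows a strong linear relationship, but says nothing about which indicator causes the other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


ad_spend = [1, 2, 3, 4]      # made-up sample data
new_users = [2, 4, 6, 8]     # made-up sample data
print(pearson_r(ad_spend, new_users))
```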

Cohort analysis

Also known as group analysis, this method groups the data and then compares the groups.

For example, when analyzing retention over time, the goal is to find the groups with low retention and then analyze those groups further.

Other applications include churned-user analysis and overdue-payment analysis in finance.
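The retention example can be sketched as follows: users are grouped by sign-up month, and for each group we compute the share still active 0, 1, 2, ... months later. The event log and the name `cohort_retention` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_month, active_month), months as integers.
events = [
    ("u1", 1, 1), ("u1", 1, 2), ("u1", 1, 3),
    ("u2", 1, 1), ("u2", 1, 2),
    ("u3", 2, 2),
    ("u4", 2, 2), ("u4", 2, 3),
]


def cohort_retention(log):
    """Return {signup_month: {months_since_signup: retention_rate}}."""
    cohort_users = defaultdict(set)   # signup_month -> all users in the cohort
    active_users = defaultdict(set)   # (signup_month, offset) -> users active then
    for user, signup_month, active_month in log:
        cohort_users[signup_month].add(user)
        active_users[(signup_month, active_month - signup_month)].add(user)

    table = {}
    for (signup_month, offset), users in active_users.items():
        rate = len(users) / len(cohort_users[signup_month])
        table.setdefault(signup_month, {})[offset] = rate
    return table


for cohort, row in sorted(cohort_retention(events).items()):
    print(cohort, dict(sorted(row.items())))
```

Reading the table row by row shows which sign-up group loses users fastest, and that group becomes the target of further analysis.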

RFM analysis

RFM analysis classifies users by value, from important value users down to general retention users. It identifies the valuable users so that operations can be tailored to each group and users can be steadily converted into important value users.

Here R, F and M correspond to:

  • R (Recency): time since the last purchase
  • F (Frequency): how often the user purchases
  • M (Monetary): how much the user spends
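A minimal sketch of how the three values can be turned into a segment label: each of R, F, and M is marked high or low against a threshold, and the combination is mapped to a name. The thresholds, the function name `rfm_segment`, and the eight-way labeling used here are assumptions following one common convention; in practice they should be adapted to the business.

```python
def rfm_segment(recency_days, frequency, monetary, thresholds):
    """Label a user from high/low flags on R, F and M.

    thresholds is a dict with keys "recency_days", "frequency", "monetary".
    A recent purchase (few days since the last one) counts as a HIGH R score.
    """
    r_high = recency_days <= thresholds["recency_days"]
    f_high = frequency >= thresholds["frequency"]
    m_high = monetary >= thresholds["monetary"]

    # Assumed convention: M decides important vs. general, R and F decide the type.
    importance = "important" if m_high else "general"
    if r_high and f_high:
        kind = "value"
    elif r_high:
        kind = "development"
    elif f_high:
        kind = "retention"
    else:
        kind = "win-back"
    return f"{importance} {kind} user"


# Hypothetical thresholds, for example the averages across all users.
limits = {"recency_days": 30, "frequency": 5, "monetary": 500}
print(rfm_segment(5, 10, 1000, limits))
print(rfm_segment(90, 10, 50, limits))
```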

Credit card membership tiers are an example of operations based on RFM analysis. Different users should not all get the same operating strategy, otherwise users may churn.

For more information on how to implement RFM analysis in Excel, please refer to this article: RFM Analysis – Precise operation method of User Value Segmentation

Note:

  • The RFM value should be used flexibly according to different businesses.

AARRR model

The AARRR model is used to analyze user behavior, support product operation decisions, and drive user growth. It corresponds to five key stages of product operation:

  • Acquisition: how do users find us
  • Activation: how was the user’s first experience
  • Retention: will users come back
  • Revenue: how to make more money
  • Referral: does the user recommend the product to others

For more on the AARRR model, see the following article: What indicators of the AARRR model should be focused on for data analysis

In the user acquisition phase, we are concerned with the following indicators:

  • Channel exposure
  • Channel conversion rate
  • New daily users
  • Daily App Downloads
  • Customer acquisition cost

In the user activation phase, you need to find the “aha moment,” which is the moment when users can’t help but love and admire the product.

In the retention phase, the core goal is to build user habits, so the focus is on retention metrics.

The revenue phase focuses on:

  • Total-volume indicators, such as total transaction amount and number of transactions
  • Per-user indicators, such as ARPU/ARPPU and average visit duration per user
  • Payment indicators, such as payment rate and repurchase rate
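As a quick illustration of the per-user and payment indicators above, the sketch below computes ARPU (average revenue per user), ARPPU (average revenue per paying user), and the payment rate from three totals. The function name and the sample figures are hypothetical, and "user" is assumed to mean an active user in the period.

```python
def revenue_metrics(total_revenue, active_users, paying_users):
    """ARPU, ARPPU and payment rate for one period."""
    if active_users <= 0 or paying_users <= 0:
        raise ValueError("user counts must be positive")
    return {
        "arpu": total_revenue / active_users,    # revenue per active user
        "arppu": total_revenue / paying_users,   # revenue per paying user
        "payment_rate": paying_users / active_users,
    }


# Made-up figures for one month
print(revenue_metrics(total_revenue=10_000, active_users=1_000, paying_users=50))
```

ARPU equals ARPPU multiplied by the payment rate, so a drop in ARPU can be traced either to fewer users paying or to paying users spending less.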

The referral phase, also known as viral marketing or self-propagation, focuses on:

  • Share rate
  • Conversion rate
  • K factor
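The K factor is commonly defined as the average number of invitations each user sends multiplied by the conversion rate of those invitations. The sketch below assumes that definition; the function name and the sample numbers are hypothetical.

```python
def k_factor(invites_per_user, invite_conversion_rate):
    """Average number of new users that each existing user brings in."""
    return invites_per_user * invite_conversion_rate


# Made-up example: each user shares with 5 friends and 30% of them sign up.
k = k_factor(invites_per_user=5, invite_conversion_rate=0.3)
print(k, "viral growth" if k > 1 else "no self-sustaining growth")
```

A K factor above 1 means each user brings in more than one new user on average, so the user base can grow through sharing alone; below 1, referral growth fades without other acquisition channels.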

Funnel analysis

Funnel analysis measures the conversion rate at each step of a business process. It is used across industries for user conversion analysis, user churn analysis, traffic monitoring, and so on. The goal is to locate the step where the problem occurs.
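A minimal sketch of the calculation, assuming we already have the number of users who reached each step: it reports the step-to-step conversion rate and the conversion relative to the first step, then picks the step with the lowest step conversion as the problem node. The step names, counts, and function names are made up.

```python
def funnel_report(steps):
    """steps: list of (step_name, user_count) in funnel order."""
    first_count = steps[0][1]
    report = []
    for (prev_name, prev_count), (name, count) in zip(steps, steps[1:]):
        report.append({
            "step": f"{prev_name} -> {name}",
            "step_conversion": count / prev_count,      # relative to previous step
            "overall_conversion": count / first_count,  # relative to funnel entry
        })
    return report


def weakest_step(report):
    """The transition with the lowest step conversion rate."""
    return min(report, key=lambda row: row["step_conversion"])["step"]


# Hypothetical e-commerce funnel
funnel = [("visit", 1000), ("add to cart", 400), ("checkout", 200), ("pay", 180)]
report = funnel_report(funnel)
for row in report:
    print(row)
print("problem node:", weakest_step(report))
```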
