There are five commonly used data analysis methods.

1. Comparative analysis

Comparative analysis compares indicators to reveal quantitative changes in the things being studied, and it is one of the most common methods in statistical analysis. The two usual forms are horizontal comparison and vertical (longitudinal) comparison.

Horizontal comparison compares different things at the same point in time, for example the prices of products purchased by users of different levels in the same period, or the sales volume and profit margin of different products in the same period.

Vertical (longitudinal) comparison tracks how the same thing changes over time. Common forms are period-over-period, year-over-year, and fixed-base comparison: for example, comparing this month's sales with last month's, comparing this January's sales with January of the previous year, and comparing each month's sales this year with last year's average monthly sales.

Comparative analysis is an effective way to judge and evaluate the size, level, and rate of change of data.
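
The three vertical comparisons can be illustrated with a small sketch. The sales figures, the dictionary names, and the `pct_change` helper below are all assumptions made for illustration and do not come from the book; the fixed base is taken to be last year's monthly average, as in the example above.

```python
# Hypothetical monthly sales figures (illustrative values only).
sales_last_year = {"Jan": 100.0, "Feb": 110.0, "Mar": 120.0}
sales_this_year = {"Jan": 120.0, "Feb": 132.0, "Mar": 126.0}


def pct_change(current: float, base: float) -> float:
    """Relative change of `current` against `base`, as a fraction."""
    return (current - base) / base


months = list(sales_this_year)

# Period-over-period: each month against the month before it.
month_over_month = {
    cur: pct_change(sales_this_year[cur], sales_this_year[prev])
    for prev, cur in zip(months, months[1:])
}

# Year-over-year: each month against the same month last year.
year_over_year = {
    m: pct_change(sales_this_year[m], sales_last_year[m]) for m in months
}

# Fixed-base: each month against one fixed reference value,
# assumed here to be last year's average monthly sales.
base = sum(sales_last_year.values()) / len(sales_last_year)
fixed_base = {m: pct_change(sales_this_year[m], base) for m in months}

for m in months:
    print(m, month_over_month.get(m), year_over_year[m], fixed_base[m])
```

The first month has no period-over-period value because this sketch only looks at the current year; a real report would usually pull in December of the previous year as well.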

2. Grouping analysis

Grouping analysis divides data into groups by chosen indicators, according to the nature and characteristics of the data, and then analyzes the internal structure of the groups and the relationships between them in order to understand how things develop. Depending on the nature of the indicator, grouping can be by attribute indicator or by quantity indicator. Attribute indicators describe the nature and characteristics of things, such as name, gender, and education level, and cannot be used in arithmetic. Quantity indicators are values that can be used in arithmetic, such as a person's age or wage income. Grouping analysis is usually used together with comparative analysis.
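
As a minimal sketch of the two kinds of grouping, the example below groups a few made-up user records first by an attribute indicator (gender) and then by a quantity indicator (age, cut into bands). The records, the band cut-offs, and the helper names `age_band` and `average_income_by` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical user records (illustrative values only).
users = [
    {"name": "A", "gender": "F", "age": 23, "income": 5200},
    {"name": "B", "gender": "M", "age": 35, "income": 8100},
    {"name": "C", "gender": "F", "age": 41, "income": 9400},
    {"name": "D", "gender": "M", "age": 29, "income": 6300},
]


def age_band(age: int) -> str:
    """Map a numeric age to a band; the cut-offs are arbitrary assumptions."""
    if age < 30:
        return "under 30"
    if age < 40:
        return "30-39"
    return "40 and over"


def average_income_by(records, key_func):
    """Group records by key_func and return the mean income per group."""
    groups = defaultdict(list)
    for record in records:
        groups[key_func(record)].append(record["income"])
    return {key: mean(values) for key, values in groups.items()}


# Attribute grouping: split on a categorical field.
by_gender = average_income_by(users, lambda r: r["gender"])

# Quantity grouping: cut a numeric field into bands, then group.
by_age_band = average_income_by(users, lambda r: age_band(r["age"]))

print(by_gender)
print(by_age_band)
```

Comparing the per-group averages side by side is the step where grouping analysis hands over to comparative analysis.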

3. Predictive analysis

Predictive analysis uses current data to judge and forecast how data will change in the future. It generally takes two forms. One is time-series prediction, for example forecasting sales for the next three months from past sales performance. The other is regression-type prediction, which predicts from the causal relationships between interacting indicators, for example predicting which goods a user is likely to buy from the user's web browsing behavior.
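
The sketch below illustrates both forms in their simplest versions: a moving-average forecast for a time series, and a one-variable least-squares line for regression-type prediction. These two techniques are chosen here for brevity and are not necessarily the ones the book uses; the data and the function names are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Time-series prediction: hypothetical monthly sales.
monthly_sales = [100, 104, 110, 108, 115, 119]
next_month = moving_average_forecast(monthly_sales)

# Regression-type prediction: hypothetical pages viewed vs. items purchased.
pages_viewed = [1, 2, 3, 4, 5]
purchases = [2, 4, 6, 8, 10]
slope, intercept = fit_line(pages_viewed, purchases)
predicted_purchases = slope * 6 + intercept

print(next_month, predicted_purchases)
```

Real sales data usually has trend and seasonality that a plain moving average ignores, so this is only a baseline to compare better models against.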

4. Funnel analysis

Funnel analysis, also called process analysis, focuses on the conversion rate at each key step of a process and is widely used in the Internet industry. Take a credit card application: users browse card information, fill in their details, and submit the application; the bank reviews and approves it; and finally the user activates and uses the card. The process has many key steps, and the number of users shrinks at each one, which forms a funnel. Funnel analysis lets the business side monitor and manage the conversion rate of every step. When the conversion rate of a step looks abnormal, the process can be optimized and appropriate measures taken to improve business indicators.
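
The credit card funnel can be sketched as a list of steps with user counts. The counts and the `step_conversion` helper below are made up for illustration; only the step names follow the example above.

```python
# Hypothetical user counts per step of the credit card application flow.
funnel = [
    ("browse card information", 10000),
    ("fill in application", 4000),
    ("submit application", 3000),
    ("approved by bank", 1500),
    ("activate card", 1200),
]


def step_conversion(steps):
    """Return (step, conversion from previous step, conversion from first step)."""
    first_count = steps[0][1]
    rows = []
    for (_, prev_count), (name, count) in zip(steps, steps[1:]):
        rows.append((name, count / prev_count, count / first_count))
    return rows


rates = step_conversion(funnel)

# The step with the lowest step-to-step conversion is a candidate for optimization.
weakest_step = min(rates, key=lambda row: row[1])

for name, from_previous, from_first in rates:
    print(f"{name}: {from_previous:.0%} of previous step, {from_first:.0%} overall")
```

Tracking the step-to-step rate over time, rather than a single snapshot, is what makes an abnormal drop at one step visible.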

5. A/B test analysis

A/B test analysis is in fact a form of comparative analysis, but it focuses on comparing two samples, A and B, that have a similar structure, and it analyzes their differences based on the sample indicator values. For example, two different styles and page layouts are designed for the same function of an app, and the two versions are randomly assigned to users. The strengths and weaknesses of each style are then evaluated from users' browsing conversion rate on the page, in order to understand user preferences and further optimize the product.
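
One common way to judge whether the difference between two conversion rates is more than noise is a two-proportion z-test. The text does not name a specific test, so this choice is an assumption, as are the function name and the visitor and conversion counts.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: conversions and visitors for page layouts A and B.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value suggests the two layouts really do convert differently. The normal approximation used here is only reasonable when both samples are fairly large and users were assigned at random.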

In addition, data analysis requires a certain mathematical foundation: basic descriptive statistics (mean, median, mode, variance, etc.); measures of dispersion and variability (range, quartiles, interquartile range, percentiles, etc.); data distributions (geometric distribution, binomial distribution, etc.); and the fundamentals of probability theory, statistical sampling, confidence intervals, and hypothesis testing. Applying these indicators and concepts makes data analysis results more professional.
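
Several of the descriptive statistics listed above are available in Python's standard `statistics` module. The sketch below computes them for a small made-up sample; the data values and variable names are assumptions.

```python
import statistics

# Hypothetical sample (illustrative values only).
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
value_range = max(data) - min(data)

# Quartiles; method="inclusive" treats the sample minimum and maximum
# as the 0th and 100th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(mean, median, mode, variance, value_range, iqr)
```

Different quartile conventions give slightly different values on small samples, so it is worth stating which method was used when reporting an interquartile range.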

This article is excerpted from the book Machine Learning Testing Introduction and Practice.

This book comprehensively and systematically introduces machine learning testing technology and the construction of a quality system. It is divided into 5 parts and 15 chapters. The first part (Chapters 1-4) covers the basics of machine learning, Python programming, and data analysis. The second part (Chapters 5-7) introduces the basics of big data, big data testing guidelines, and related tool practices. The third part (Chapters 8-10) explains the foundations of machine learning testing, dedicated feature testing, and model algorithm evaluation testing. The fourth part (Chapters 11-13) introduces the practice of a model evaluation platform, machine learning engineering technology, and the continuous delivery process for machine learning. The fifth part (Chapters 14 and 15) discusses the practice of AI (artificial intelligence) in the field of testing and the future of test engineers in the era of AI.

This book helps readers understand how machine learning works and how quality assurance is carried out for machine learning. Engineering developers and test engineers can gain a systematic understanding of big data testing, feature testing, and model evaluation. Algorithm engineers can learn model evaluation methods and broaden their thinking on model engineering practice. Technical experts and technical managers can learn how machine learning quality assurance and engineering efficiency are built.