I. Overview

1.1 What is global user behavior data

What is global user behavior data? Start with the background of this project. The global user behavior data unification ("pull-through") project aims to unify user behavior data across the Good Future Group. It covers the user behavior data of five front-office business units: online school, Peiyou, Xiaohou (Little Monkey), Zhikang, and Libu. The data is integrated along different dimensions, such as student, course, class, subject, school year and term, lesson, tutor, teacher, and classroom, and it empowers upper-layer applications such as Future Portrait, CDP, dashboards (kanban), the RFM model, industry market analysis, the product dictionary, and the global feature pool.

Early models were built mainly on demand, with little accumulation at the underlying layer, and several obvious problems emerged: requirements were hard to extend, data quality was low, data quality issues were hard to locate, and behavior data was incomplete. To address these problems, the project is expected to keep supporting upper-layer requirements while also using those requirements as guidance: starting from the underlying business and data sources, it covers more front-office business units, systematically sorts out and builds more key nodes of the user behavior life cycle, and accumulates unified user behavior data vertically across the whole domain.

From this background, it is easy to see what global data means: integrating the data of all business units. The objects being integrated are the existing user behaviors of each business unit. Take class purchase as an example: it may be called recall, wake-up, renewal, or subject expansion in the online school, order payment in Zhikang, and purchase in Libu. Behaviors that have different names but the same meaning must be unified, so that the same name always carries the same meaning.
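The naming problem above can be pictured as a lookup from each unit's local behavior name to one shared name. The following is only a minimal sketch: the identifiers (`class_purchase`, the unit keys, the event fields) are assumptions made for illustration, not the project's actual codes or schema.

```python
# Hypothetical lookup table: (business unit, local behavior name) -> unified name.
# All identifiers here are illustrative assumptions, not real project codes.
UNIFIED_BEHAVIOR = {
    ("online_school", "recall"): "class_purchase",
    ("online_school", "wake_up"): "class_purchase",
    ("online_school", "renewal"): "class_purchase",
    ("zhikang", "order_payment"): "class_purchase",
    ("libu", "purchase"): "class_purchase",
}


def unify_behavior(business_unit: str, local_name: str) -> str:
    """Return the unified behavior name for a unit-specific behavior name."""
    key = (business_unit.strip().lower(), local_name.strip().lower())
    try:
        return UNIFIED_BEHAVIOR[key]
    except KeyError:
        # Fail loudly so unmapped behaviors are noticed instead of silently kept.
        raise KeyError(f"no unified behavior mapped for {key}") from None


# Hypothetical raw events coming from three different business units.
raw_events = [
    {"unit": "online_school", "behavior": "renewal", "student_id": 1},
    {"unit": "Zhikang", "behavior": "order_payment", "student_id": 2},
    {"unit": "Libu", "behavior": "purchase", "student_id": 3},
]

# After unification, every event carries the same behavior name.
unified_events = [
    {**event, "behavior": unify_behavior(event["unit"], event["behavior"])}
    for event in raw_events
]
```

Raising on an unmapped name is a deliberate choice in this sketch: a behavior that slips through without a mapping would reintroduce exactly the naming inconsistency the project is trying to remove.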

1.2 Why unify global user behavior

  • Reduce costs

  • Enable technology

  • Integrate data

  • Empower the business

II. Architecture design



The overall architecture is divided into three layers: the data source layer, the data unification and modeling layer, and the data application layer.

  1. Data source layer: the bottom data source of the whole domain. All of it comes from the existing user behaviors of each business unit.
  2. Data unification and modeling layer: this layer has two parts, unification and modeling. In practice it is all modeling, but the behaviors already sorted out are first divided into private and common behaviors, and modeling is done on that basis.
  3. Data application layer: the upper-layer applications of the whole domain. Projects such as Future Portrait, the battle map, and global features consume the unified global data, which makes it more convenient to use.

III. Construction process

Once it is clear what global data is and why it should be built, the next step is to sort out the existing user behaviors of each business unit. The behaviors of all business units, 268 in total, are summarized in graphical form, and each behavior is marked in the graph, forming the user behavior life cycle of each business unit.

3.1 User behavior lifecycle

A: Xiaohou (Little Monkey) behavior life cycle



Number of existing behaviors in the Xiaohou life cycle: 31

B: Libu behavior life cycle



Number of existing behaviors in the Libu life cycle: 34

C: Online school behavior life cycle



Number of existing behaviors in the online school life cycle: 85

D: Peiyou behavior life cycle



Number of existing behaviors in the Peiyou life cycle: 99

E: Zhikang behavior life cycle

Number of existing behaviors in the Zhikang life cycle: 57

3.2 Private and common behaviors

The work above mainly sorts out and integrates the behaviors, so that developers gain an overall picture of global user behavior and understand the background and value of this project. The collected behaviors also need to be classified, mainly into private behaviors and common behaviors. Each team member is responsible for a different data domain and needs to know the current behaviors of all five business units, both for business familiarity and for the upstream and downstream dependencies of each model. The main purpose is to extend the features of the current behaviors more fully. Here are some screenshots of the private and common behaviors:

1. Private behavior

2. Common behavior
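One way to picture the split between private and common behaviors is shown in the sketch below. It rests on an assumption the article does not state explicitly: a behavior counts as common when it appears in more than one business unit (after names are unified) and as private when it appears in exactly one. The behavior names in the sample are hypothetical.

```python
from collections import defaultdict


def classify_behaviors(behaviors_by_unit):
    """Split behaviors into private (one unit) and common (several units).

    Assumption for this sketch: "common" means the behavior shows up in
    more than one business unit; "private" means exactly one.
    """
    units_per_behavior = defaultdict(set)
    for unit, behaviors in behaviors_by_unit.items():
        for behavior in behaviors:
            units_per_behavior[behavior].add(unit)

    private = {}  # behavior -> the single unit that owns it
    common = {}   # behavior -> sorted list of units that share it
    for behavior, units in units_per_behavior.items():
        if len(units) == 1:
            private[behavior] = next(iter(units))
        else:
            common[behavior] = sorted(units)
    return private, common


# Hypothetical, already-unified behavior names per business unit.
sample = {
    "xiaohou": {"register", "class_purchase", "app_checkin"},
    "libu": {"register", "class_purchase", "offline_trial"},
    "zhikang": {"class_purchase", "one_on_one_booking"},
}

private, common = classify_behaviors(sample)
```

In this sample, `class_purchase` is shared by all three units and `register` by two, while `app_checkin`, `offline_trial`, and `one_on_one_booking` each belong to a single unit.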

3.3 Specific approach

All the work above sorts out the global user behaviors of each business unit. Once the behaviors are identified, how should we proceed?

  1. Bottom-up: at the end of each month, collect the next month's requirements using the standardized requirement collection template provided by the data warehouse, to support horizontal data projects.

  2. Top-down: sort out the user behaviors of each Good Future business unit, mark the core user behaviors, and understand the data in combination with the business.

  3. Steps 1 and 2 together yield vertical accumulation in three forms:

     <1> Accumulating models: when there is a requirement, design the business process according to it, combine conformed dimensions and facts to support the project requirement, and distill a reusable model from the result. When there is no requirement, develop as usual based on the core behaviors already sorted out.

     <2> Accumulating business knowledge: working on a requirement gives some understanding of that part of the business. After developing a model, summarize and record the current business knowledge to avoid pitfalls in the future.

     <3> Accumulating methods: through 1 and 2, distill the overall approach to global development (technical architecture and process), write it up as TTC articles, and share them with everyone.

There are two other important points in this approach:

  1. An action must be built starting from the DWD of each business unit, which is the detail layer after cleaning. This also helps later model extension.

  2. When integrating an action, the oneData theory must be followed to annotate the model fields: which are modifiers, which are atomic metrics, which are time periods, and so on. Only when the annotation is clear can different features be derived automatically.

The specific model is as follows:

Time periods, modifiers, and atomic metrics are clearly marked in the model design.
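To illustrate why clear annotation makes automatic derivation possible, here is a minimal sketch. It assumes that a derived feature is simply one time period, one modifier, and one atomic metric combined; the field names (`pay_amount`, `last_7d`, and so on), the `FieldAnnotation` class, and the naming pattern are hypothetical and not taken from the actual model.

```python
from dataclasses import dataclass
from itertools import product

ROLES = ("atomic_metric", "modifier", "time_period")


@dataclass(frozen=True)
class FieldAnnotation:
    """Hypothetical annotation: a model field plus its oneData-style role."""
    name: str
    role: str  # one of ROLES

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role {self.role!r} for field {self.name!r}")


def derive_features(annotations):
    """Combine every time period, modifier, and atomic metric into a feature name.

    Assumed naming pattern for this sketch: <time_period>_<modifier>_<atomic_metric>.
    """
    by_role = {role: [] for role in ROLES}
    for annotation in annotations:
        by_role[annotation.role].append(annotation.name)
    return [
        f"{period}_{modifier}_{metric}"
        for period, modifier, metric in product(
            by_role["time_period"], by_role["modifier"], by_role["atomic_metric"]
        )
    ]


# Hypothetical annotated fields of a class-purchase model.
annotations = [
    FieldAnnotation("pay_amount", "atomic_metric"),
    FieldAnnotation("pay_count", "atomic_metric"),
    FieldAnnotation("online", "modifier"),
    FieldAnnotation("offline", "modifier"),
    FieldAnnotation("last_7d", "time_period"),
    FieldAnnotation("last_30d", "time_period"),
]

features = derive_features(annotations)
```

With two fields in each role, the sketch yields eight feature names, such as `last_7d_online_pay_amount`. A field with an unclear role cannot be placed in any combination, which is why the sketch rejects unknown roles up front.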

PS: The next two articles will focus on oneData, which is also the core of our data warehouse.

3.4 Bus Matrix

  • Understand the model and data domain that correspond to each business process;
  • Understand the business units covered by each business process;
  • Understand the dimensions that each business process supports.

For the bus matrix, see the Zhiyinlou document: yach-doc-shimo.zhiyinlou.com/sheets/dPkp… <04 Bus Matrix>
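As a rough illustration of what a bus matrix records, the sketch below stores one row per business process with its model, data domain, covered business units, and supported dimensions. Every entry (process names, table names, domains) is a hypothetical example, not the content of the linked document.

```python
# Hypothetical bus matrix: one row per business process.
# Process names, model names, and data domains are illustrative assumptions.
BUS_MATRIX = {
    "class_purchase": {
        "model": "dwd_global_class_purchase",
        "data_domain": "trade",
        "business_units": {"online_school", "zhikang", "libu"},
        "dimensions": {"student", "course", "class", "subject"},
    },
    "attend_lesson": {
        "model": "dwd_global_attend_lesson",
        "data_domain": "learning",
        "business_units": {"online_school", "xiaohou"},
        "dimensions": {"student", "course", "class", "teacher", "classroom"},
    },
}


def processes_supporting(dimension):
    """Business processes whose model can be analyzed by the given dimension."""
    return sorted(
        process
        for process, row in BUS_MATRIX.items()
        if dimension in row["dimensions"]
    )


def processes_covering(business_unit):
    """Business processes that already cover the given business unit."""
    return sorted(
        process
        for process, row in BUS_MATRIX.items()
        if business_unit in row["business_units"]
    )


def shared_dimensions():
    """Dimensions supported by every business process in the matrix."""
    return set.intersection(*(row["dimensions"] for row in BUS_MATRIX.values()))
```

With this shape, the three questions in the list above become simple lookups: `BUS_MATRIX[process]["model"]` for the model and data domain, `processes_covering(unit)` for coverage, and `processes_supporting(dimension)` for dimension support.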

3.5 Difficulties in global construction

IV. Empowering the business

  1. User portraits: use the automated feature derivation already developed to produce high-value features and quickly provide underlying data sources for the feature platform.
  2. CDP: provides unified user behavior data for rapid calculation, reducing communication and calculation costs.
  3. Future Kanban: provides the models required by global data requirement queries, improving development efficiency.