
Introduction

Hello everyone, I am ChinaManor, which literally translates to "Chinese code farmer". I hope to become a pathfinder on the road of national rejuvenation, a ploughman in the field of big data, and an ordinary but not mediocre person. I have been studying big data for almost a year, and recently I have been organizing my study notes. This series covers the must-know fundamentals of big data.

A typical data warehouse has a three-layer structure: the ODS layer, the DW layer, and the DA layer.

  • 1) ODS layer: stores the original data, which usually comes from business systems, such as RDBMS table data, log files, crawler data, and data purchased from third parties
  • 2) DW layer: the data warehouse layer; its data comes from the ODS layer and is integrated, widened into wide tables, and analyzed
  • 3) DA layer: the data application layer; its data comes from the DW layer and is analyzed and processed according to business requirements

Sometimes dimension data from business data is placed in a separate layer: the DIM layer (dimension layer), which stores all dimension table data.
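As a rough illustration of why dimension data is useful as a separate layer, here is a minimal Python sketch. The tables, column names, and values (`dim_product`, `fact_orders`, `category`) are hypothetical and not from any real system; the point is only that fact rows carry a dimension key, and the descriptive attributes are looked up in the DIM layer.

```python
# DIM layer: a dimension table keyed by product_id (hypothetical data).
dim_product = {
    "p1": {"product_name": "Keyboard", "category": "Peripherals"},
    "p2": {"product_name": "Monitor", "category": "Displays"},
}

# Fact rows from business data carry only the dimension key.
fact_orders = [
    {"order_id": 1, "product_id": "p1", "amount": 30.0},
    {"order_id": 2, "product_id": "p2", "amount": 200.0},
    {"order_id": 3, "product_id": "p1", "amount": 45.0},
]


def sales_by_category(facts, dim):
    """Join each fact row to its dimension row and sum amounts per category."""
    totals = {}
    for row in facts:
        # Rows whose key is missing from the dimension table fall under "unknown".
        category = dim.get(row["product_id"], {}).get("category", "unknown")
        totals[category] = totals.get(category, 0.0) + row["amount"]
    return totals


category_sales = sales_by_category(fact_orders, dim_product)
print(category_sales)
```

Keeping the dimension table in one place means every analysis that groups by category reads the same definition.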

When developing subject-area metrics, data is stored according to the warehouse's layered structure. A typical data warehouse architecture is divided into three layers: ODS, DW, and APP (the application layer, called DA above). More effective data organization and management makes the data system more orderly.

Benefits of data layering:

1. Clear data structure: each data layer has its own scope and responsibility, so tables are easier to locate and understand when they are used.
2. Less duplicated development: standardizing the layers and building common middle-layer data greatly reduces repeated computation.
3. Unified data caliber: layering provides a single data outlet, so the metric definitions of the output data stay consistent.
4. Simplified complex problems: a complex task is broken into multiple steps, and each layer solves one specific problem.

General data layering design:

  • ODS: Stores raw data
  • DW: Stores the intermediate-layer data of the data warehouse
  • APP: Stores application data customized for the business

Take the data system design of an e-commerce website as an example, focusing only on user access log data.
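To make the ODS, DW, and APP layers concrete for access logs, here is a minimal Python sketch. It is not the original design: the log format (`timestamp|user_id|url`), the function names, and the PV/UV metrics are all assumptions chosen for illustration. In-memory lists stand in for whatever storage each layer would really use.

```python
from collections import Counter
from datetime import datetime

# ODS: raw access-log lines kept exactly as received (hypothetical format).
ods_access_log = [
    "2021-06-01 10:00:01|u1|/home",
    "2021-06-01 10:00:05|u2|/item/42",
    "bad line",
    "2021-06-01 10:01:00|u1|/item/42",
]


def build_dw(raw_lines):
    """DW: parse and validate ODS rows into structured records; drop malformed ones."""
    rows = []
    for line in raw_lines:
        parts = line.split("|")
        if len(parts) != 3:
            continue
        ts, user_id, url = parts
        try:
            visit_time = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue
        rows.append({"visit_time": visit_time, "user_id": user_id, "url": url})
    return rows


def build_app(dw_rows):
    """APP: business-facing metrics, here page views (pv) and unique visitors (uv) per URL."""
    pv = Counter(row["url"] for row in dw_rows)
    visitors = {}
    for row in dw_rows:
        visitors.setdefault(row["url"], set()).add(row["user_id"])
    return {url: {"pv": pv[url], "uv": len(visitors[url])} for url in pv}


dw_access_log = build_dw(ods_access_log)
app_url_stats = build_app(dw_access_log)
print(app_url_stats)
```

Each function reads only from the layer directly beneath it, which is what lets a common DW table be reused by several APP-level reports.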

Computing engines and storage systems used by each layer:

Jingdong's data warehouse layering pattern is based on the standard model.

Data warehouse layering:

  • BDM (Buffer): buffer data layer, a direct mirror of the source data
  • FDM (Foundation): foundation data layer, with data zipper (zipper table) processing and partition processing
  • GDM (Generic): generic aggregation layer
  • ADM (Application): highly aggregated data, the application layer
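The "data zipper processing" mentioned for the FDM layer usually refers to a zipper table, which keeps the history of a record as rows with a validity range. The sketch below shows the general technique only; it is not Jingdong's implementation. The field names (`key`, `value`, `start_date`, `end_date`), the open-ended date `9999-12-31`, and the choice to close an old version on the load date are all assumptions.

```python
OPEN_END = "9999-12-31"  # assumed marker for "still the current version"


def apply_snapshot(zipper_rows, snapshot, load_date):
    """Merge one daily snapshot {key: value} into a zipper table.

    Each row has key, value, start_date, end_date. A changed value closes
    the current row and opens a new one; unchanged rows are left alone.
    """
    result = []
    unchanged_keys = set()
    for row in zipper_rows:
        row = dict(row)  # copy so the input table is not mutated
        if row["end_date"] == OPEN_END:
            new_value = snapshot.get(row["key"])
            if new_value is not None and new_value != row["value"]:
                row["end_date"] = load_date  # close the old version
            else:
                unchanged_keys.add(row["key"])
        result.append(row)
    for key, value in snapshot.items():
        if key not in unchanged_keys:
            # New key, or a new version of a key whose value changed.
            result.append({"key": key, "value": value,
                           "start_date": load_date, "end_date": OPEN_END})
    return result


table = apply_snapshot([], {"u1": "Beijing"}, "2021-06-01")
table = apply_snapshot(table, {"u1": "Shanghai", "u2": "Shenzhen"}, "2021-06-02")
for row in table:
    print(row)
```

Compared with storing a full snapshot every day, this keeps one row per version, at the cost of a slightly more involved merge step.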

Conclusion

These are the basic concepts of the data warehouse. I hope you gained something from reading this. For further reading, I recommend the book The Data Warehouse Toolkit (3rd edition).