Designing tables around products

I have written about similar database designs before. This article explains in detail how to design the product-related tables with reference to the Jingdong (JD.com) mall. Many fields need careful attention, so we analyze the meaning of each field's design together with the feature it supports.

After reading this article, you can also look at the previous four articles on database design for e-commerce projects. They differ a little from this one: most of them were designed with reference to Suning.

When we design a database, we mostly design it according to our own actual requirements. These articles are provided only for reference and learning, so that you can draw on them in real projects. If there are any mistakes in the article, I hope you will point them out so that we can learn together.

So let’s get down to business and start designing tables

Classification table

CREATE TABLE `tb_category` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'category id',
  `name` varchar(32) NOT NULL COMMENT 'category name',
  `parent_id` bigint(20) NOT NULL COMMENT 'parent category id, 0 for a top-level category',
  `is_parent` tinyint(1) NOT NULL COMMENT 'whether this is a parent node, 0 = no, 1 = yes',
  `sort` tinyint(2) NOT NULL COMMENT 'sort index, smaller values come first',
  `create_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`),
  KEY `key_parent_id` (`parent_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1424 DEFAULT CHARSET=utf8 COMMENT='Category table. Category and product (SPU) is a one-to-many relationship, category and brand is a many-to-many relationship.';

A screenshot of the Jingdong home page:

The menu bar on the left shows the product categories. Which fields should the category table have? We can work it out by looking at the Jingdong mall:

First, every table needs the common fields id, create_time, and update_time. You can also add a boolean is_delete field for logical deletion.

In the screenshot above I boxed three levels of the menu in turn. How do we represent these menu levels in the category table, and what should the fields be called? The simple answer is three fields, level1, level2, and level3, for the first-level, second-level, and third-level menus. But think about a real business: if we build a mall with hundreds of millions of users and later want deeper categories, do we have to add level4, level5…?

Instead, we design a parent_id field that holds the ID of the parent category, with 0 for top-level categories. This is extensible, avoids redundant fields, and conforms to the three normal forms of database design. If you are not familiar with the three normal forms, see blog.csdn.net/weixin_4243…

I also design an is_parent field to indicate whether a category is a parent node. If it is not, it must be a last-level menu. I avoid calling it the third-level menu here, because we may add more levels later.

Finally, a sort field assigns a weight for ordering: the smaller the value, the earlier the category appears. That completes the category table.
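To illustrate how parent_id supports any menu depth, here is a minimal Python sketch that rebuilds the menu tree from flat rows. The sample rows and the build_tree helper are hypothetical and only mirror the column names of tb_category; they are not taken from the article's data.

```python
from collections import defaultdict

# Hypothetical sample rows shaped like tb_category:
# (id, name, parent_id, is_parent, sort)
ROWS = [
    (1, "Phones & Digital", 0, 1, 1),
    (2, "Mobile Communication", 1, 1, 1),
    (3, "Mobile Phones", 2, 0, 1),
    (4, "Phone Accessories", 1, 1, 2),
    (5, "Phone Cases", 4, 0, 1),
]


def build_tree(rows, root_id=0):
    """Group rows by parent_id, then nest them recursively from root_id."""
    children = defaultdict(list)
    for cid, name, parent_id, is_parent, sort in rows:
        children[parent_id].append((sort, cid, name, is_parent))

    def nest(parent_id):
        nodes = []
        # Sorting the tuples orders siblings by their sort weight first.
        for sort, cid, name, is_parent in sorted(children[parent_id]):
            node = {"id": cid, "name": name}
            if is_parent:  # leaf categories get no "children" key
                node["children"] = nest(cid)
            nodes.append(node)
        return nodes

    return nest(root_id)


tree = build_tree(ROWS)
```

Adding a fourth or fifth level only means inserting more rows; neither the table nor this function has to change.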


Brand table

CREATE TABLE `tb_brand` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'brand id',
  `name` varchar(64) NOT NULL COMMENT 'brand name',
  `image` varchar(256) DEFAULT '' COMMENT 'brand logo image',
  `letter` char(1) DEFAULT '' COMMENT 'first letter of the brand name',
  `create_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=325403 DEFAULT CHARSET=utf8 COMMENT='Brand table';

To design the table, we still list the common fields id, create_time, and update_time

A screenshot of the Jingdong page:

From the figure above we can see that the brand design is not complicated. At the top there is "all brands" followed by initials, so we design a letter field that holds the first letter of the brand name and can be used for searching.

Moving on, what else can we see? Nothing more than the logo and name of each brand. So we design a name field for the name and an image field for the logo. Is there anything else?

Logical deletion can also be set up; here it is left to the association table.


Brand Classification Table

We have analyzed the category table and the brand table. What is the relationship between them? Let's work through it step by step.

Go back to the Jingdong home page, that is, the screenshot used when designing the category table, and click any third-level category. I clicked mobile phones, and the page shows a list of brands. In other words, one category has multiple brands, for example mobile phones: Huawei, Apple, Xiaomi…

Here’s a picture:

The Huawei brand in turn covers many categories, so one brand also has multiple categories.

That is, the category table and the brand table are in a many-to-many relationship, and a many-to-many relationship calls for an intermediate table:

CREATE TABLE `tb_category_brand` (
  `category_id` bigint(20) NOT NULL COMMENT 'category id',
  `brand_id` bigint(20) NOT NULL COMMENT 'brand id',
  PRIMARY KEY (`category_id`, `brand_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Intermediate table between category and brand, which are in a many-to-many relationship';

However, you may notice that this table has no foreign key constraints, which seems to go against database design norms. Why do it this way?

  • Foreign key constraints add checks to inserts, updates, and deletes, which hurts database performance
  • Deleting data becomes troublesome

In the e-commerce industry, performance is very important. We prefer to maintain table relationships logically in our code rather than setting foreign keys.

Whether logical deletion can solve the deletion problem is something you can think about.
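As a rough sketch of what "maintaining the relationship in code" can look like, the following Python example uses the standard sqlite3 module with an in-memory database. SQLite stands in for MySQL here only so the example is runnable; the trimmed table definitions, sample rows, and the helpers brands_of_category and delete_brand are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tb_brand (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- No foreign keys, mirroring the article's design: the composite primary
-- key only prevents duplicate (category_id, brand_id) pairs.
CREATE TABLE tb_category_brand (
    category_id INTEGER NOT NULL,
    brand_id INTEGER NOT NULL,
    PRIMARY KEY (category_id, brand_id)
);
""")
conn.executemany("INSERT INTO tb_category VALUES (?, ?)",
                 [(76, "Mobile Phones"), (77, "Tablets")])
conn.executemany("INSERT INTO tb_brand VALUES (?, ?)",
                 [(1, "Huawei"), (2, "Apple"), (3, "Xiaomi")])
conn.executemany("INSERT INTO tb_category_brand VALUES (?, ?)",
                 [(76, 1), (76, 2), (76, 3), (77, 1), (77, 2)])


def brands_of_category(category_id):
    """Resolve the many-to-many relationship through the intermediate table."""
    rows = conn.execute(
        "SELECT b.name FROM tb_brand b "
        "JOIN tb_category_brand cb ON cb.brand_id = b.id "
        "WHERE cb.category_id = ? ORDER BY b.id",
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


def delete_brand(brand_id):
    """Without foreign keys, the application removes the link rows itself."""
    with conn:  # both deletes run in one transaction
        conn.execute("DELETE FROM tb_category_brand WHERE brand_id = ?", (brand_id,))
        conn.execute("DELETE FROM tb_brand WHERE id = ?", (brand_id,))
```

The trade-off is that every code path that deletes a brand or category has to remember the link rows, which is the consistency work a foreign key would otherwise do.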


Product specification parameter tables

The specification parameter table is also essential when designing around products, but how should it be designed? The obvious approach: a product has many parameters. A mobile phone, for example, has a brand, product name, body length, body weight, CPU, memory, and so on, so we create one field for each and end up with a table of dozens of fields. But what is wrong with that? We are designing the database for a full-category e-commerce platform with many kinds of products. Let's open Jingdong and have a look.

Specifications and packaging of the mobile phone

Specification and packing of air conditioner

There are many, many more. Putting all of these into a single parameter table is clearly not appropriate.

However, although different products have different specifications, products in the same category, such as mobile phones, share the same specification parameter names and differ only in their values.

That is, the specification parameters of the goods should be bound to the classification. There is a uniform specification parameter template for each category, but the parameter values may vary from commodity to commodity.

Therefore:

  • The name and value of a specification parameter should be stored separately
  • A category corresponds to one specification parameter template, which holds only the parameter keys, not the values
  • A category corresponds to multiple products, and each product has its own set of specification values

So we introduce parameter groups and parameter lists

We ignore the value for the moment. When new products are added, we can fill in the specification parameter value. We first think about how to design the specification parameter template (key).

  • The specification data should be grouped first, and then there are different specification parameters in the group
  • A category specification template has multiple specification groups
  • Each specification group contains multiple specification parameters

From an object-oriented perspective, specification parameters and specification groups are two different kinds of things, and a group has a one-to-many relationship with the parameters inside it. They can be described by two classes, which in database terms means two tables:

  • Specification group: tb_spec_group
    • A product category has multiple specification groups
  • Specification parameter: tb_spec_param
    • A specification group has multiple specification parameters

As shown in figure:

The next thing you need to think about is:

  • What attributes are required to describe a specification group?
    • The category table and the specification group table are in a one-to-many relationship: one category has multiple groups. So the specification group table needs a category ID to associate with
    • So the fields can be set as follows:
      • id: ID of the specification group
      • category_id: ID of the product category
      • name: group name
      • create_time: creation time
      • update_time: update time
  • What attributes are required to describe a specification parameter? The specification group table also has a one-to-many relationship with the specification parameter table, that is, a group has multiple parameters. Therefore, a parameter must be associated with both the product category ID and the specification group ID

Once these questions are clear, we know how to design the tables.

Specification group table

CREATE TABLE `tb_spec_group` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `category_id` bigint(20) NOT NULL COMMENT 'product category id, one category has multiple specification groups',
  `name` varchar(32) NOT NULL COMMENT 'specification group name',
  `create_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `key_category` (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8 COMMENT='Specification group table, one product category has multiple specification groups';

As before, it has the common fields id, create_time, and update_time.

name is the group name. Because one product category has multiple specification groups, the relationship is one-to-many, and the category_id field implements it.

Specification parameter table

First look at the table design:

CREATE TABLE `tb_spec_param` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `category_id` bigint(20) NOT NULL COMMENT 'product category id',
  `group_id` bigint(20) NOT NULL COMMENT 'specification group id',
  `name` varchar(128) NOT NULL COMMENT 'parameter name',
  `numeric` tinyint(1) NOT NULL COMMENT 'whether the parameter is numeric, true or false',
  `unit` varchar(128) DEFAULT '' COMMENT 'unit of a numeric parameter, may be empty for non-numeric types',
  `generic` tinyint(1) NOT NULL COMMENT 'whether this is a generic attribute shared by all SKUs, true or false',
  `searching` tinyint(1) NOT NULL COMMENT 'whether the parameter is used for search filtering, true or false',
  `segments` varchar(1024) DEFAULT '' COMMENT 'for numeric parameters used in search, the predefined segments',
  `create_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `key_group` (`group_id`),
  KEY `key_category` (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=24 DEFAULT CHARSET=utf8 COMMENT='Specification parameter table, parameters belong to a specification group';

The three fixed fields id, create_time, and update_time need no further explanation.

A product category has multiple specification groups, and a specification group has multiple specification parameters, which is why the table carries both category_id and group_id.
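The category, group, and parameter hierarchy can be sketched in Python as follows. The rows, group names, and the spec_template helper are hypothetical; only the column names follow tb_spec_group and tb_spec_param.

```python
# Hypothetical rows shaped like tb_spec_group: (id, category_id, name)
GROUPS = [
    (1, 76, "Main body"),
    (2, 76, "Basic information"),
    (3, 77, "Main body"),
]

# Hypothetical rows shaped like tb_spec_param: (id, category_id, group_id, name)
PARAMS = [
    (1, 76, 1, "Brand"),
    (2, 76, 1, "Model"),
    (5, 76, 2, "Weight"),
    (20, 77, 3, "Brand"),
]


def spec_template(category_id):
    """Return one category's template as {group name: [parameter names]}."""
    groups = {gid: name for gid, cid, name in GROUPS if cid == category_id}
    template = {name: [] for name in groups.values()}
    for pid, cid, gid, name in PARAMS:
        if cid == category_id and gid in groups:
            template[groups[gid]].append(name)
    return template


phone_template = spec_template(76)
```

The template contains keys only. The values for a concrete product are stored elsewhere, which is the separation the design aims for.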

The name field is the parameter name, which is easy to understand.

That leaves numeric, unit, generic, searching, and segments.

  • numeric and unit

Some parameter values are numbers with units and some are not. The numeric field flags whether a parameter is numeric, and unit holds the unit of its value.

  • searching and segments are used for search. Let's open a search page at random

You’ll find screen size, operating memory, network, body memory, battery capacity, number of CPU cores, and so on in the filter criteria in the specifications:

In other words, some of the data in the specification parameters will be used as search criteria in the future. We can mark these attributes at design time and use them as filters for future searches.

There are two fields associated with the search:

  • searching: whether the parameter is used for search filtering
    • true: used for search filtering
    • false: not used for search filtering
  • segments: for numeric parameters, the ranges used as filter options must be defined in advance
    • For example, battery capacity: 0-2000 mAh, 2000-3000 mAh, 3000-4000 mAh
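Here is a small sketch of how predefined segments could be turned into a filter label for a numeric value. The article does not specify how segments is encoded, so the comma-separated "low-high" format, the half-open ranges, and both helper functions are assumptions made for illustration.

```python
def parse_segments(segments):
    """Parse '0-2000,2000-3000' into [(0.0, 2000.0), (2000.0, 3000.0)].

    Assumes a comma-separated list of non-negative 'low-high' ranges.
    """
    ranges = []
    for part in segments.split(","):
        low, high = part.split("-")
        ranges.append((float(low), float(high)))
    return ranges


def segment_label(value, segments, unit=""):
    """Return the label of the half-open range [low, high) containing value."""
    for low, high in parse_segments(segments):
        if low <= value < high:
            return f"{low:g}-{high:g}{unit}"
    return None  # value falls outside every predefined segment


BATTERY_SEGMENTS = "0-2000,2000-3000,3000-4000"  # hypothetical stored value
label = segment_label(3000, BATTERY_SEGMENTS, "mAh")
```

Treating each range as half-open keeps a boundary value such as 3000 in exactly one segment.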

A full-category e-commerce site carries a wide variety of products, each with its own attributes and differences. To describe products and their variants more precisely, two concepts are abstracted: SPU and SKU.

There is also the generic field, which marks generic attributes. We will come back to it when we discuss the product data structure.


Understanding SPU and SKU

SPU: Standard Product Unit, a set of products with common attributes

SKU: Stock Keeping Unit, each specific item within an SPU, subdivided by particular characteristics

Take the picture for example:

  • The Huawei Mate10 on this page is a product set (SPU).
  • Different colors, memory sizes, and so on subdivide it into different Mate10 variants, such as the bright black 128GB version (SKU).

As can be seen:

  • An SPU is an abstract concept, a set of products, which makes back-end management convenient.
  • SKUs are the specific products that are sold. Each SKU may have its own price and stock. Users buy SKUs, not SPUs.
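The distinction can be sketched with two small Python classes. The class names, fields, prices, and stock figures below are hypothetical; they only illustrate that price and stock live on the SKU while the SPU groups its SKUs.

```python
from dataclasses import dataclass, field


@dataclass
class Sku:
    title: str
    price: int  # hypothetical: price in the smallest currency unit
    stock: int


@dataclass
class Spu:
    name: str
    skus: list = field(default_factory=list)

    def price_range(self):
        """Lowest and highest price across this SPU's SKUs."""
        prices = [sku.price for sku in self.skus]
        return min(prices), max(prices)

    def purchasable(self):
        """SKUs that can actually be bought, i.e. those with stock left."""
        return [sku for sku in self.skus if sku.stock > 0]


mate10 = Spu("Huawei Mate10", [
    Sku("Huawei Mate10 bright black 64GB", 389900, 12),
    Sku("Huawei Mate10 bright black 128GB", 449900, 0),
])
```

A listing page would typically show the SPU with its price range, while the cart and the order always reference one concrete SKU.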

Now that we know the difference between SPUs and SKUs, we can start designing the SPU and SKU tables.

Let's see how the Jingdong page is laid out by opening any product:

Looking at this picture, let’s simulate the design of SPU table ourselves

  • id: primary key
  • c_id: category ID
  • brand_id: brand ID
  • name: product name
  • description: description
  • spec: specifications
  • after_service: after-sales service
  • comment: reviews
  • …

It doesn’t seem complicated.

What fields do you think the SKU table should have?

  • id: primary key
  • spu_id: the associated SPU
  • price: price
  • images: images
  • stock: stock
  • color? memory? hard disk?

The problem is that the unique attributes of an SKU vary. Different products may have different unique attributes, so we cannot fix these table fields in advance.

So how should the unique attributes of an SKU be designed?

First, what are the unique attributes of an SKU?

As the name implies, unique attributes are those that each product variant holds for itself. Take memory: different phones have different memory, such as 4G, 6G, 8G, 16G… Colors can be red, green, black, and so on.

Does that mean the SKU is related to the product category? In my view, the unique attributes of an SKU are related to the product's specifications. Let's look at the Jingdong page shown below:

  • Color: five options
  • Memory: three options

The same attributes appear in the specification parameters: whichever option you click, the parameters below change accordingly. So the unique attributes of an SKU are part of the product's specification parameters.

That is, there is no need to design the SKU's unique attributes separately. They can be treated as part of the specification parameters, which can then be split into two parts:

  • Specification attributes shared by all SKUs under the SPU (called common attributes)
  • Different specification attributes for each SKU under the SPU (called unique attributes)

Go back to the specification parameter table we designed earlier: it has a field called generic that marks whether an attribute is generic or unique. This is where it is used.

In this way, the SKU table only needs fields other than the specification attributes, because the attributes themselves are defined in the specification tables.

However, the value of the specification attribute still needs to be associated with the item. So let’s look at how do we design

The SPU table

CREATE TABLE `tb_spu` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'spu id',
  `name` varchar(256) NOT NULL DEFAULT '' COMMENT 'product name',
  `sub_title` varchar(256) DEFAULT '' COMMENT 'subtitle',
  `cid1` bigint(20) NOT NULL COMMENT 'level-1 category id',
  `cid2` bigint(20) NOT NULL COMMENT 'level-2 category id',
  `cid3` bigint(20) NOT NULL COMMENT 'level-3 category id',
  `brand_id` bigint(20) NOT NULL COMMENT 'brand id of the product',
  `saleable` tinyint(1) NOT NULL DEFAULT '1' COMMENT 'whether the product is on the shelf, 0 = off the shelf, 1 = on the shelf',
  `create_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=187 DEFAULT CHARSET=utf8 COMMENT='SPU table';

It is basically similar to our previous analysis, but seems to be missing some fields, such as the item description.

We split the table vertically, putting the SPU details into another table: tb_spu_detail

CREATE TABLE `tb_spu_detail` (
  `spu_id` bigint(20) NOT NULL,
  `description` text COMMENT 'product description',
  `generic_spec` varchar(2048) NOT NULL DEFAULT '' COMMENT 'generic specification parameter data',
  `special_spec` varchar(1024) NOT NULL COMMENT 'special specification parameters and their optional values, in JSON format',
  `packing_list` varchar(1024) DEFAULT '' COMMENT 'packing list',
  `after_service` varchar(1024) DEFAULT '' COMMENT 'after-sales service',
  `create_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`spu_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The data in this table is relatively large, so we split it out to keep queries on the main table efficient.

Notice these two fields: generic_spec and special_spec.

SKU table

CREATE TABLE `tb_sku` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'sku id',
  `spu_id` bigint(20) NOT NULL COMMENT 'spu id',
  `title` varchar(256) NOT NULL COMMENT 'product title',
  `images` varchar(1024) DEFAULT '' COMMENT 'product images',
  `stock` int(8) unsigned DEFAULT '9999' COMMENT 'stock',
  `price` bigint(16) NOT NULL DEFAULT '0' COMMENT 'sales price',
  `indexes` varchar(32) DEFAULT '' COMMENT 'subscript combination of the unique specification attributes in the SPU attribute template',
  `own_spec` varchar(1024) DEFAULT '' COMMENT 'key-value pairs of the SKU unique specification parameters, in JSON format',
  `enable` tinyint(1) NOT NULL DEFAULT '1' COMMENT 'whether the SKU is valid, 0 = invalid, 1 = valid',
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`),
  KEY `key_spu_id` (`spu_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=27359021564 DEFAULT CHARSET=utf8 COMMENT='SKU table, which represents specific product entities such as the black 64GB iPhone 8';

Next, the SPU and SKU table fields are described in detail

The SPU table stores three category IDs: cid1, cid2, and cid3. Strictly speaking, one category ID is enough, because the others can be found from it through parent_id. We keep all three mainly to make page display easier, and you can optimize this. The SPU table controls whether a product is on or off the shelf (saleable), while the SKU table controls whether a specific variant is valid (enable). For example, if one SKU has no stock, that SKU becomes invalid. Because one SPU contains multiple SKUs, the two switches are kept separate.
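To show why a single category ID would be enough in principle, this Python sketch walks up the parent_id chain from a third-level category. The PARENT mapping and its IDs are hypothetical stand-ins for id and parent_id in tb_category.

```python
# Hypothetical category rows: id -> parent_id (0 marks a top-level category)
PARENT = {74: 0, 75: 74, 76: 75}


def category_path(category_id):
    """Follow parent_id upward and return the ids from top level to leaf."""
    path = [category_id]
    while PARENT[path[-1]] != 0:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


cid1, cid2, cid3 = category_path(76)
```

Storing cid1, cid2, and cid3 directly on the SPU trades a little redundancy for skipping this lookup on every page render.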

We split the details out of the SPU table for performance: moving large fields into a separate table (a vertical split) makes retrieval from the main table more efficient.

We will focus on two fields in the commodity Detail table

  • generic_spec: generic specification parameter data
  • special_spec: special specification parameters and their optional values, in JSON format

As mentioned above, specification parameters are bound to commodity classification. Commodities in the same classification will have a set of the same specification parameter key (specification parameter template), but the specification parameter values of each commodity in this classification are different, so the following points should be met:

  • We have a specification table, associated with a category, which holds the specification template for a category.

  • We also need tables that are associated with items, that hold the values of the specifications associated with a particular item.

  • Specification parameters are divided into general specification parameters and specific specification parameters, so specification parameter values need to be associated with SPU and SKU respectively:

    • Generic specification parameter values are associated with the SPU.
    • Unique specification parameter values are associated with the SKU.

But instead of adding a new table, let’s see how our table stores this information.

generic_spec field

If you were to design a table holding the values of the generic specification attributes of an SPU, you would need at least the following fields:

  • spu_id: the associated product
  • param_id: the associated specification parameter
  • value: the specific value of that parameter

We did not design it that way. Instead, the specification values of a product are stored directly with the SPU, in tb_spu_detail, so that everything associated with a product is visible at a glance. Of the three fields above, spu_id no longer needs to be stored separately, which leaves param_id and the parameter value. These correspond one-to-one as key-value pairs, so a JSON structure is enough.

That is the generic_spec field in the tb_spu_detail table, which holds the values of the generic specification parameters.

The overall data looks like this:

# generic_spec in tb_spu_detail (spu_id = 2); each key corresponds to an id in tb_spec_param
{
  "1": "Huawei",
  "2": "G9 Youth",
  "3": 2016,
  "5": 143,
  "6": "porcelain",
  "7": "Android",
  "8": "Snapdragon",
  "9": "Snapdragon 617 (msm8952)",
  "10": "octa-core",
  "11": 1.5,
  "14": 5.2,
  "15": "1920*1080(FHD)",
  "16": 800,
  "17": 1300,
  "18": 3000
}

It is a JSON structure of key-value pairs:

  • key: ID of the corresponding tb_spec_param row
  • value: value of the specification parameter
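As an illustration of how the keys tie back to the template, the following Python sketch joins a generic_spec value with parameter definitions to produce display text. The SPEC_PARAMS contents (names, numeric flags, units) are assumptions; the article shows only the ids and values.

```python
import json

# Hypothetical slice of tb_spec_param: id -> (name, numeric, unit)
SPEC_PARAMS = {
    1: ("Brand", False, ""),
    3: ("Release year", True, ""),
    5: ("Weight", True, "g"),
    14: ("Screen size", True, "inch"),
}

# A subset of the generic_spec sample shown in the article
GENERIC_SPEC = '{"1": "Huawei", "3": 2016, "5": 143, "14": 5.2}'


def describe_generic_spec(generic_spec_json):
    """Map each param id in generic_spec to 'name -> value with unit'."""
    described = {}
    for param_id, value in json.loads(generic_spec_json).items():
        # JSON object keys are strings, the table ids are integers.
        name, numeric, unit = SPEC_PARAMS[int(param_id)]
        described[name] = f"{value} {unit}" if numeric and unit else str(value)
    return described


display = describe_generic_spec(GENERIC_SPEC)
```

Because the JSON holds only ids and values, renaming a parameter in the template changes how every product displays without touching any product row.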

special_spec field

We said that only generic specification parameters are kept in the SPU, so why is there an extra special_spec field?

Take mobile phones: brand, operating system, and so on are certainly generic attributes, while memory, color, and so on are certainly unique attributes.

Once an SPU is fixed, such as Xiaomi's Redmi 4X, it has multiple SKUs that differ in color, memory, and so on. Collecting the color, memory, and other information of every SKU gives the following result:

  • Color: [White, gold, rose gold]
  • Memory: [2G, 3G]
  • Storage: [16GB, 32GB]

In other words, the unique specification attribute values of all SKUs under one SPU are aggregated. That is the special_spec field.

Look at the data format:

# special_spec in tb_spu_detail (spu_id = 2); each key corresponds to an id in tb_spec_param
{
  "4": ["White", "Gold", "Rose Gold"],
  "12": ["3GB"],
  "13": ["16GB"]
}

It is also a JSON structure:

  • key: ID of a specification parameter
  • value: array of the optional values of that parameter within the SPU

So the question is: why store all the SKU specification attributes together in the SPU?

Because we sometimes need to query all specifications instead of just one SKU’s properties. For example, when the product details page displays optional specifications:

Our structure matches this exactly, which makes page rendering very convenient.
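To make the aggregation concrete, here is a small Python sketch that expands a special_spec value into every possible option combination using itertools.product. The option values are hypothetical (loosely following the color, memory, and storage lists above), and candidate_skus is an illustrative helper, not part of the original design.

```python
import itertools
import json

# Hypothetical special_spec value: param id -> optional values
SPECIAL_SPEC = json.loads(
    '{"4": ["White", "Gold", "Rose Gold"],'
    ' "12": ["2GB", "3GB"],'
    ' "13": ["16GB", "32GB"]}'
)


def candidate_skus(special_spec):
    """List every combination of options as a {param id: chosen value} dict."""
    param_ids = list(special_spec)
    return [
        dict(zip(param_ids, combo))
        for combo in itertools.product(*special_spec.values())
    ]


combos = candidate_skus(SPECIAL_SPEC)
```

Not every combination has to exist as a real SKU. The page can render all options from special_spec and disable the ones that have no matching SKU.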

To sum up, the relationship between SPU and commodity specification parameter template is shown in the figure below:

With the SPU table clear, the SKU table fields are easy to understand:

  • indexes: the combination of subscripts of the SKU's unique specification values within the SPU attribute template
  • own_spec: key-value pairs of the SKU's unique specification parameters, in JSON format. When deserializing, use a LinkedHashMap to preserve the order
  • enable: whether the SKU is valid

These three are the main SKU fields. The others should be clear at a glance.

Let's start with own_spec. It is JSON data, an object of key-value pairs, that describes the specific values of the unique specification parameters of one SKU.

Look at the data:

{
	"4": "White",
	"12": "3GB",
	"13": "16GB"
}

It holds key-value pairs for specific attributes.

The SPU stores optional but unspecified values, whereas the SKU stores specific values.
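The note about preserving order when deserializing own_spec can be illustrated in Python, where json.loads returns a dict that keeps the key order of the JSON text (much like a LinkedHashMap in Java). The variable names below are illustrative only.

```python
import json

# own_spec as stored in tb_sku (the sample value from the article)
OWN_SPEC_JSON = '{"4": "White", "12": "3GB", "13": "16GB"}'

# Python dicts preserve insertion order, so the parsed keys stay in the
# same order as they appear in the JSON text.
own_spec = json.loads(OWN_SPEC_JSON)

param_ids = list(own_spec)            # keys in document order
label = " ".join(own_spec.values())   # values in the same order
```

Keeping the order stable matters because the positions of these values are later matched against the option arrays in special_spec.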

Now the indexes field: the combination of subscripts of the unique specification values within the SPU attribute template.

The unique specifications and their options are already saved in tb_spu_detail, structured as follows:

# special_spec in tb_spu_detail (spu_id = 2); each key corresponds to an id in tb_spec_param
{
  "4": ["White", "Gold", "Rose Gold"],
  "12": ["3GB"],
  "13": ["16GB"]
}

Such as:

  • Nova 5, White, 3GB, 16GB: 0_0_0
  • Nova 5, Gold, 3GB, 16GB: 1_0_0
  • Nova 5, Rose Gold, 3GB, 16GB: 2_0_0

These unique attributes form three combinations here, and adding more options produces more combinations. For each combination we record the subscript of every chosen option, so that when the user clicks an option we can quickly locate the SKU.
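The subscript combination can be computed as in the following Python sketch, which derives indexes from own_spec and special_spec and then builds a lookup from indexes to SKU. The SKU ids and the indexes_for helper are hypothetical; the JSON values are the samples used in this article.

```python
import json

SPECIAL_SPEC = json.loads(
    '{"4": ["White", "Gold", "Rose Gold"], "12": ["3GB"], "13": ["16GB"]}'
)

# Hypothetical SKU rows: sku id -> own_spec JSON
SKUS = {
    101: '{"4": "White", "12": "3GB", "13": "16GB"}',
    102: '{"4": "Gold", "12": "3GB", "13": "16GB"}',
    103: '{"4": "Rose Gold", "12": "3GB", "13": "16GB"}',
}


def indexes_for(own_spec, special_spec):
    """Join the position of each chosen value, in template key order, with '_'."""
    return "_".join(
        str(special_spec[param_id].index(own_spec[param_id]))
        for param_id in special_spec
    )


# indexes -> sku id, so a click on an option combination resolves to one SKU
sku_by_indexes = {
    indexes_for(json.loads(own_spec), SPECIAL_SPEC): sku_id
    for sku_id, own_spec in SKUS.items()
}
```

When the user selects Gold, 3GB, 16GB, the page only has to build the string "1_0_0" from the selected positions and look it up.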

The values that these subscripts point to are what own_spec stores.

The Enable field controls whether a specific item specification is valid or not


Conclusion:

Which tables did we design, and how are they related?

  • Category table: tb_category
  • Brand table: tb_brand
  • Category-brand table: tb_category_brand
  • Specification group table: tb_spec_group
  • Specification parameter table: tb_spec_param
  • SPU table: tb_spu
  • SPU detail table: tb_spu_detail
  • SKU table: tb_sku

Relationship:

  • A category has multiple brands, a brand belongs to multiple categories, so it’s many-to-many
  • A classification has multiple specification groups, and a specification group has multiple specification parameters, so it is one-to-many
  • There are multiple SPUs under one category, so it’s one-to-many
  • There are multiple SPUs under one brand, so it’s one-to-many
  • There are multiple SKUs under one SPU, so it’s one-to-many