How do I store tree structures in a relational database

In daily work, we often use tree data structure, such as product category tree, comment tree, department tree, permission tree, etc. How to store tree structure in relational database? Today we are going to introduce a couple of options.

The business scenario

This article uses the company department structure tree as chestnut, to store the company department structure tree in mysql

The first approach: The Adjacency List

The key to using adjacency lists is to store the ID of each node’s parent node.

CREATE TABLE 'company_adjacency_list' (' id 'bigint(20) NOT NULL AUTO_INCREMENT COMMENT' primary key ', 'name' varchar(200) NULL DEFAULT NULL COMMENT 'parent ',' parent_id 'int(11) NULL DEFAULT NULL COMMENT' parent ', PRIMARY KEY (`id`) USING BTREE )Copy the code

The parent id and parent_id fields are stored in each department information

Import the data process without saying, let’s look at the data directly:

Data query mode

Here is a look at the query of this scheme using several commonly used query methods

Query the subordinate departments of a department

select * from company_adjacency_list where parent_id = ?
Copy the code

Parent_id allows you to quickly query the subordinate departments of a department

Query the immediate superior department of a department

Select * from company_adjacency_list where id= parent_id of the department to be queriedCopy the code

You can quickly query the parent node information by referring to the parent_id in the department information

Querying all subordinate departments of a department (querying child nodes including child nodes)

The query table structure, is a process of circulation, queries to the first child node, and then in the child nodes of the query child nodes, such loop check, although many articles online for the cyclic query of SQL and stored procedures to check, but the efficiency of these two approaches are relatively low, especially when using online service, this method of query efficiency is low, You are not advised to directly query data using SQL. In this way, all data can be queried, assembled into a tree structure in the application, and stored in the cache. Periodic updates and preheating can improve the query efficiency

Data Update mode

In this data storage structure, it is convenient and quick to update data. When adding data, you can directly find the ID of the parent node, and when the organization department changes, you can also directly change the parent ID. When deleting data, you can determine whether your business needs to delete child nodes.

Second option: Path Enumeration

The point of the path label is that each node stores the path from the root node to the node. In fact, I think it can be shared with several other schemes

CREATE TABLE 'company_path_enumeration' (' id 'bigint(20) NOT NULL AUTO_INCREMENT COMMENT '主键', 'name' varchar(200) NULL DEFAULT NULL COMMENT 'Department name ',' path 'varchar(200) NULL DEFAULT NULL COMMENT' path ', PRIMARY KEY (`id`) USING BTREE )Copy the code

In each department information stored its complete path, path field

Import the data process without saying, let’s look at the data directly:

Data query mode

Using the path table, it is difficult to query through the path field. Generally, it needs to use the like, CONCAT function, REPLACE function to do the string processing logic, which is complicated to query, and it is not shown here. Online services are not recommended to use this method, because the low query efficiency will affect the service performance. You are advised to add the parent_id and path fields in the same way as the adjacency list. Parent_id is used for query, and path is used to view the complete node path

Data Update mode

In this data storage structure, it is convenient to update data. When adding data, you can directly find the correct path. When changing an organization, you can also directly find the correct path.

Third option: Closure table? Close watch? (Closure Table)

Closure Table stores the main Table of department information of the company. Instead of storing the data of the relation between nodes, a separate relation Table is used to store the relation between nodes, which contains the relation information of any two related (superior and subordinate) nodes

Company department information main table

The main department information table stores only the department information

Company department structure relationship table

It consists of three fields

field	Literal translation field	Fields that
ancestor	The literal translation is ancestor node	The higher the node
descendant	The literal translation is descendant node	The lower nodes
distance	Literally called distance	The distance between a parent node and a child node

The point is that a record of a relational table is a parent node, a child node, and the path distance between them. Take the department chart for example

The relationship data between head office and planning department is as follows:

Ancestor: headquarters ID 1 Descendant: Planning ID 2 Distance: Headquarters - Planning department, so their route is 1Copy the code

The relationship data of head office-Region A is as follows:

Ancestor: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendant: descendantCopy the code

All node path information is stored in the relational table, and the distance is also used to represent the distance of the path. The path information between every two nodes in the tree structure needs to be maintained in the table.

CREATE TABLE `company_relation`  (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `ancestor` bigint(20) NULL DEFAULT NULL,
  `descendant` bigint(20) NULL DEFAULT NULL,
  `distance` int(11) NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE,
  INDEX `index_ancestor`(`ancestor`) USING BTREE,
  INDEX `descendant`(`descendant`) USING BTREE,
  INDEX `distance`(`distance`) USING BTREE
) ENGINE = InnoDB 
Copy the code

Data Store Example

The process of data storage is an example of the process of import into head office-store A. In the relationship, the path information of the department structure is stored. There are altogether the following paths between the head office and store A:

Head office – Region

Ancestor: id of the ancestor in 1 descendant: Id of big area in main table 4 distance: the route from head office to big area is: INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (1, 4, 1);Copy the code

Head office – Region A

Ancestor: id of the ancestor in 1 descendant: Id of region A in the main table 6 distance: the route from the parent company to region A is: INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (1, 6, 2); INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (1, 6, 2);Copy the code

Head office – Store A

Ancestor: id of the ancestor in 1 descendant: Id of store A in the main table 8 distance: the path from the parent company to store A is as follows: INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (1, 8, 3);Copy the code

Big sector. – Big sector A

Ancestor: large id in the main table 4 descendant: Id of region A in the main table 6 distance: the path from region A to region A is: INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (4, 6, 1);Copy the code

Region – Store A

Ancestor: large id in the main table 4 descendant: Id of store A in the main table 8 distance: the path from large region to store A is as follows: INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (4, 8, 2);Copy the code

Region A- Store A

Ancestor: id of area A in the main table 6 descendant: Id of store A in the main table 8 distance: the path from large area A to store A is: INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (6, 8, 1); INSERT INTO 'company_relation' (' ancestor ', 'descendant', 'distance') VALUES (6, 8, 1);Copy the code

See? It stores all the paths between head office and store A

Data query mode

Here is a look at the query of this scheme using several commonly used query methods

To query the immediate subordinate department of a department, query the node ID of the ancestor whose distance is 1

Select * from company_relation where ancestor = id and distance=1Copy the code

You can query the directly superior departments of a department in which the node ID is 1

Select * from company_relation where distance=1 and distance=1Copy the code

To query all sub-departments of a department (including sub-nodes of the sub-node), use the id of the node as the ancestor node to query all sub-departments

Select * from company_relation where ancestor = id of this nodeCopy the code

This design can also be very convenient to query to the specified level of nodes

Data Update mode

Under this data storage structure, updating data is more difficult, because it stores all path information of two nodes directly (including intermediate nodes).

Scheme comparison:

Scheme 1: Adjacency List

Advantages: Only the parent ID field is added to the main table, which occupies a low space. It is easy to add and delete nodes when querying the relationship between adjacent nodes

Disadvantages: The multi-node query will be more troublesome, need to use some means to optimize

Application scenario: Nodes are frequently added or deleted and the node hierarchy is not very complex

Option 2: Path Enumeration

I understand the second scheme, which uses the path field to store the path between nodes and top-level nodes. In fact, it is quite troublesome to query, so this scheme is not used alone. Generally, it is used together with the Adjacency List scheme. It is used to quickly support the query of the entire node path

Option three: Closure table? Close watch? (Closure Table)

Advantages: High efficiency in querying relationships between any level of a tree structure

Disadvantages: it is a space for time scheme, relational table storage data amount, increase, delete and change are more troublesome

Application scenario: Large query requests and small change frequency of the tree structure

How to store tree structure in a relational database for you to talk about this, welcome you to exchange, point out some of the wrong places in the article, let me deepen my understanding, I wish you no bug, thank you!

How do I store tree structures in a relational database

The business scenario

The first approach: The Adjacency List

Data query mode

Data Update mode

Second option: Path Enumeration

Data query mode

Data Update mode

Third option: Closure table? Close watch? (Closure Table)

Company department information main table

Company department structure relationship table

Data Store Example

Data query mode

Data Update mode

Scheme comparison:

Related Posts

This is the famous rainwater problem

How can Multi Task Learning excel in industry

What happened to Spring cycle?