This article will introduce some basic concepts, common problems and corresponding solutions in database architecture design. In order to facilitate readers to understand, it will take “user-centric” database as an example to explain the common gameplay of database architecture design.

 

I. User center

As a common service, the user center mainly provides user registration, login, information query and modification services. Its core metadata is as follows:

User(uid, uname, passwd, sex, age,nickname,…)

Among them:

  • Uid indicates the user ID and primary key

  • Uname, passwd, sex, age, nickname… Etc are attributes of the user

In terms of database design, generally speaking, at the beginning of the business, a single database and a single table can solve this requirement.

 

2. Illustration

In order to facilitate everyone’s understanding, there are more pictures in the following text, among which:

  • The gray box indicates service

  • The “purple” round box identifies master, the master library

  • “Pink” round box, slave, slave library

 

3. Single library architecture

The most common architectural design is as follows:

  • User-service: user-centric service that provides a friendly RPC interface for callers

  • User-db: a library for data storage

 

4. Grouping structure

What is grouping?

A: The grouping architecture is the most common one master, multiple slaves, master/slave synchronization, read-write separation database architecture:

  • User-service: indicates the user center service

  • User-db-m (master) : indicates the master database, providing database write services

  • User-db-s (slave) : provides the database read service

A database cluster consisting of master and slave databases is called a group.

 

What are the characteristics of the grouping?

A: Database clusters in the same group:

  • Data is synchronized between the master and slave using the binlog

  • Multiple instance databases have identical structures

  • Multiple instances store exactly the same data, essentially duplicating the data

 

What problem does the grouping architecture solve?

A: Most Internet services read too much and write too little. Database reads tend to be the first performance bottleneck if you want to:

  • Linearly improves database read performance

  • Improves database write performance by eliminating read/write lock conflicts

  • “Read High Availability” of Data through Redundant Slave Libraries

The grouping schema can be used at this point, but it is important to note that in the grouping schema, the main library of the database is still a write single point.

 

In a word, grouping to solve the “database read and write high concurrency high” problem, the implementation of the architecture design.

 

5. Sharding architecture

What is sharding?

A: Sharding is what everyone calls a horizontal sharding database architecture:

  • User-service: indicates the user center service

  • User-db1: Split horizontally into the first of two halves

  • User-db2: Split horizontally into the second of two portions

After sharding, multiple database instances also form a database cluster.

 

Horizontal shard, is it separate library or table?

A: Separate libraries rather than tables are strongly recommended because:

  • Partitioned tables still share a single database file, and there is still disk IO competition

  • Branch repositories can easily migrate data to different database instances, or even database machines, for better scalability

 

Horizontal segmentation, what’s the algorithm?

A: Common horizontal segmentation algorithms include “range method” and “hash method” :

The scope method is shown in the figure above: split the data horizontally into two database instances based on the service primary key UID of the user center:

  • User-db1: stores 0 to 10 million UID data

  • User-db2: Stores 0 to 20 million UID data

The hashing method is shown in the figure above: also based on the uid of the service primary key in the user center, the data is horizontally split into two database instances:

  • User-db1: stores UID data modelled to 1

  • User-db2: stores UID data modelled to 0

Both of these methods are used on the Internet, among which hashing is more widely used.

 

What are the characteristics of sharding?

A: Database cluster in the same shard:

  • Multiple instances are not directly related to each other, unlike binlog synchronization between master and slave

  • The multiple instance database structure is exactly the same

  • There is no intersection between data stored by multiple instances, and data between all instances is combined to form global data

 

What problem does sharding solve?

A: Most Internet services have a large amount of data, and the single database capacity is easy to become a bottleneck. In this case, sharding can:

  • Linearly improves database write performance. Note that grouping architectures do not linearly improve database write performance

  • Reduce the data capacity of a single database

 

In a word, sharding solves the problem of “large amount of database data” and implements the architecture design.

 

Group + Sharding architecture

If the concurrency of read and write services is high and the amount of data is also large, the grouping and sharding database architecture is usually required:

  • Sharding reduces the amount of data in a single database and linearly improves the write performance of the database

  • Groups are used to linearly improve the database read performance and ensure the high availability of the read library

 

Vertical segmentation

In addition to horizontal sharding, vertical sharding is also a common type of database architecture design. Vertical sharding is usually closely related to business.

Using user center as an example, we can do vertical shard like this:

User(uid, uname, passwd, sex, age,…)

User_EX (uid, intro, signs, etc.)

  • For vertically split tables, the primary key is uid

  • Login name, password, gender, age and other attributes are placed in a vertical table (library)

  • Attributes such as self-introduction, personal signature, etc. are placed in a separate vertical table

 

How do you do vertical segmentation?

Answer: When vertically shards data based on business, two factors are generally considered: the “length” and “access frequency” of attributes:

  • Shorter in length and more frequently visited together

  • The ones that are longer and less frequently accessed are put together

This is because the database loads data into the buffer in the unit of row. In the case of limited memory capacity, the memory can load more data with short length and high access frequency, resulting in higher hit ratio, reduced DISK I/O, and improved database performance.

 

What are the characteristics of vertical segmentation?

A: Vertical and horizontal cutting are similar, but not quite the same:

  • There is also no direct connection between instances, that is, no binlog synchronization

  • Multiple instance database structures, all different

  • Data stored in multiple instances must have at least one intersection column, which is generally a service primary key. Data between all instances is combined to form global data

 

What problem does vertical segmentation solve?

A: Vertical sharding can reduce the amount of data in a single library and reduce disk I/OS to improve throughput. However, vertical sharding is closely related to services. Not all services can be vertically sharded.

 

Eight, summary

I hope to keep these points in mind for a longer article:

  • At the beginning of business, single library is used

  • Read pressure, read high availability, use groups

  • Large amount of data, write linear expansion, with sharding

  • Attributes with short attributes and high access frequency are vertically split together

 

I hope you’ve had some fun.

= =

Related articles:

Design of Database Architecture with 10 billion data volume, 10,000 attributes, 100,000 Concurrent data

Architecture Solution for Smooth Second-level Database Expansion

30 Catch-22 rules for 58 Home Database

“Industry Challenges – Four Solutions to” Kuaku Paging”

10 billion Data Smooth Migration without Affecting Service


Did this article help you? Welcome to join the wechat group of back-end learning Group: