Typical database architecture design and practice

This article will introduce some basic concepts, common problems and corresponding solutions in database architecture design. In order to facilitate readers to understand, it will take “user-centric” database as an example to explain the common gameplay of database architecture design.

I. User center

As a common service, the user center mainly provides user registration, login, information query and modification services. Its core metadata is as follows:

User(uid, uname, passwd, sex, age,nickname,…)

Among them:

Uid indicates the user ID and primary key
Uname, passwd, sex, age, nickname… Etc are attributes of the user

In terms of database design, generally speaking, at the beginning of the business, a single database and a single table can solve this requirement.

2. Illustration

In order to facilitate everyone’s understanding, there are more pictures in the following text, among which:

The gray box indicates service
The “purple” round box identifies master, the master library
“Pink” round box, slave, slave library

3. Single library architecture

The most common architectural design is as follows:

User-service: user-centric service that provides a friendly RPC interface for callers
User-db: a library for data storage

4. Grouping structure

What is grouping?

A: The grouping architecture is the most common one master, multiple slaves, master/slave synchronization, read-write separation database architecture:

User-service: indicates the user center service
User-db-m (master) : indicates the master database, providing database write services
User-db-s (slave) : provides the database read service

A database cluster consisting of master and slave databases is called a group.

What are the characteristics of the grouping?

A: Database clusters in the same group:

Data is synchronized between the master and slave using the binlog
Multiple instance databases have identical structures
Multiple instances store exactly the same data, essentially duplicating the data

What problem does the grouping architecture solve?

A: Most Internet services read too much and write too little. Database reads tend to be the first performance bottleneck if you want to:

Linearly improves database read performance
Improves database write performance by eliminating read/write lock conflicts
“Read High Availability” of Data through Redundant Slave Libraries

The grouping schema can be used at this point, but it is important to note that in the grouping schema, the main library of the database is still a write single point.

In a word, grouping to solve the “database read and write high concurrency high” problem, the implementation of the architecture design.

5. Sharding architecture

What is sharding?

A: Sharding is what everyone calls a horizontal sharding database architecture:

User-service: indicates the user center service
User-db1: Split horizontally into the first of two halves
User-db2: Split horizontally into the second of two portions

After sharding, multiple database instances also form a database cluster.

Horizontal shard, is it separate library or table?

A: Separate libraries rather than tables are strongly recommended because:

Partitioned tables still share a single database file, and there is still disk IO competition
Branch repositories can easily migrate data to different database instances, or even database machines, for better scalability

Horizontal segmentation, what’s the algorithm?

A: Common horizontal segmentation algorithms include “range method” and “hash method” :

The scope method is shown in the figure above: split the data horizontally into two database instances based on the service primary key UID of the user center:

User-db1: stores 0 to 10 million UID data
User-db2: Stores 0 to 20 million UID data

The hashing method is shown in the figure above: also based on the uid of the service primary key in the user center, the data is horizontally split into two database instances:

User-db1: stores UID data modelled to 1
User-db2: stores UID data modelled to 0

Both of these methods are used on the Internet, among which hashing is more widely used.

What are the characteristics of sharding?

A: Database cluster in the same shard:

Multiple instances are not directly related to each other, unlike binlog synchronization between master and slave
The multiple instance database structure is exactly the same
There is no intersection between data stored by multiple instances, and data between all instances is combined to form global data

What problem does sharding solve?

A: Most Internet services have a large amount of data, and the single database capacity is easy to become a bottleneck. In this case, sharding can:

Linearly improves database write performance. Note that grouping architectures do not linearly improve database write performance
Reduce the data capacity of a single database

In a word, sharding solves the problem of “large amount of database data” and implements the architecture design.

Group + Sharding architecture

If the concurrency of read and write services is high and the amount of data is also large, the grouping and sharding database architecture is usually required:

Sharding reduces the amount of data in a single database and linearly improves the write performance of the database
Groups are used to linearly improve the database read performance and ensure the high availability of the read library

Vertical segmentation

In addition to horizontal sharding, vertical sharding is also a common type of database architecture design. Vertical sharding is usually closely related to business.

Using user center as an example, we can do vertical shard like this:

User(uid, uname, passwd, sex, age,…)

User_EX (uid, intro, signs, etc.)

For vertically split tables, the primary key is uid
Login name, password, gender, age and other attributes are placed in a vertical table (library)
Attributes such as self-introduction, personal signature, etc. are placed in a separate vertical table

How do you do vertical segmentation?

Answer: When vertically shards data based on business, two factors are generally considered: the “length” and “access frequency” of attributes:

Shorter in length and more frequently visited together
The ones that are longer and less frequently accessed are put together

This is because the database loads data into the buffer in the unit of row. In the case of limited memory capacity, the memory can load more data with short length and high access frequency, resulting in higher hit ratio, reduced DISK I/O, and improved database performance.

What are the characteristics of vertical segmentation?

A: Vertical and horizontal cutting are similar, but not quite the same:

There is also no direct connection between instances, that is, no binlog synchronization
Multiple instance database structures, all different
Data stored in multiple instances must have at least one intersection column, which is generally a service primary key. Data between all instances is combined to form global data

What problem does vertical segmentation solve?

A: Vertical sharding can reduce the amount of data in a single library and reduce disk I/OS to improve throughput. However, vertical sharding is closely related to services. Not all services can be vertically sharded.

Eight, summary

I hope to keep these points in mind for a longer article:

At the beginning of business, single library is used
Read pressure, read high availability, use groups
Large amount of data, write linear expansion, with sharding
Attributes with short attributes and high access frequency are vertically split together

I hope you’ve had some fun.

= =

Design of Database Architecture with 10 billion data volume, 10,000 attributes, 100,000 Concurrent data

Architecture Solution for Smooth Second-level Database Expansion

30 Catch-22 rules for 58 Home Database

“Industry Challenges – Four Solutions to” Kuaku Paging”

10 billion Data Smooth Migration without Affecting Service

Did this article help you? Welcome to join the wechat group of back-end learning Group:

Typical database architecture design and practice

Related Posts

Part 13: DRF Framework API versioning

Win64 version of the JDK web disk resources

Java Web implementation folder upload (preserving directory structure) instance parsing