Notes from reviewing Technical Architecture for Large Web Sites.

The technical challenges of large web sites come mainly from huge user numbers, high concurrency, and massive data volumes. Even a simple business function becomes very difficult once it has to handle tens of millions of records and hundreds of millions of users. Large web architectures are designed to solve these problems.

1. Initial site architecture

Large sites grow out of small ones and evolve gradually.

At first traffic is low, so the application, the database, files, and everything else run on the same server.

2. Separate application services from data services

As more and more users access the site, a single server may run short of memory, disk space, and CPU capacity.

Therefore, the application, the database, and files are moved onto separate servers:

  • Application server: faster, more powerful CPUs to handle heavy business logic;
  • Database server: faster disks to speed up data retrieval and more memory for data caching;
  • File server: more disk space to hold the large number of files uploaded by users.

3. Use caching to improve site performance

As the number of users increases, the database is under too much pressure, resulting in access delays and affecting the overall website experience.

Because most accesses concentrate on a small portion of the data, that portion can be cached, which improves query performance and reduces the load on the database.
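The caching idea above can be sketched as a cache-aside lookup. This is a minimal illustration, not a production design: the `CacheAside` class, the in-memory dictionary standing in for a remote cache, and the `products` data are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside lookup: check the cache first, fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._load_from_db = load_from_db   # hypothetical database lookup
        self._cache: Dict[str, str] = {}    # stands in for a remote cache
        self.db_hits = 0                    # how often the database was queried

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:              # cache hit: no database access
            return self._cache[key]
        self.db_hits += 1                   # cache miss: query the database
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value        # populate the cache for later reads
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, e.g. after the underlying row is updated."""
        self._cache.pop(key, None)


products = {"sku-1": "keyboard"}            # hypothetical hot data
cache = CacheAside(products.get)
first = cache.get("sku-1")                  # miss: loaded from the database
second = cache.get("sku-1")                 # hit: served from the cache
```

After the two reads, `cache.db_hits` is 1: only the first read reached the database.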

4. Use application server clusters to improve concurrent processing capacity

A single application server has limited processing capacity. Additional servers are therefore added to share the access load, with a load balancer distributing requests among them.
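As a rough sketch of request distribution, the following uses round-robin, which is only one of several possible balancing strategies and is chosen here for simplicity. The `RoundRobinBalancer` class and the host names are hypothetical.

```python
import itertools
from typing import Iterator, List


class RoundRobinBalancer:
    """Hands out application servers in a fixed rotation."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle: Iterator[str] = itertools.cycle(servers)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical hosts
assigned = [balancer.pick() for _ in range(5)]
# assigned == ["app-1", "app-2", "app-3", "app-1", "app-2"]
```

Round-robin only spreads requests evenly when the servers keep no per-user state locally, or when that state is shared between them.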

5. Separate database reads and writes

When the number of users reaches a certain scale, all write operations plus some read operations (cache misses and expired entries) still hit the database and cause high load.

With primary-secondary hot backup, data is synchronized from the primary database to the secondaries, so writes can go to the primary and reads to the secondaries. To make this split transparent to applications, a dedicated data access module is often used.
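A data access module of this kind might, in its simplest form, look like the sketch below. The `ReadWriteRouter` class, the node names, and the rule "statements starting with SELECT are reads" are simplifying assumptions for illustration; a real module would also have to deal with transactions and replication delay.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Chooses a database node per statement: writes to primary, reads to replicas."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self._primary = primary
        # Fall back to the primary when no replica is configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql: str) -> str:
        """Return the name of the node that should execute the statement."""
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replicas) if is_read else self._primary


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
targets = [
    router.route("SELECT * FROM orders WHERE id = 1"),
    router.route("UPDATE orders SET paid = 1 WHERE id = 1"),
    router.route("select name from users"),
]
# targets == ["db-replica-1", "db-primary", "db-replica-2"]
```

The application only calls `route` (or a wrapper around it) and never needs to know which node is the primary.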

6. Use reverse proxy and CDN to speed up site response

Because China’s network environment is complex, access speed varies widely between users in different regions.

Research shows that website access delay is positively correlated with user churn rate.

  • CDN: caches site data so that users fetch it from the data center of their nearest network provider;
  • Reverse proxy: deployed in front of the site. A user's request reaches the reverse proxy first, which returns the response directly if it has the data cached.
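The reverse proxy behavior described above can be sketched as a small caching front end. Everything here is an assumption made for the example: the `CachingReverseProxy` class, the 60-second time-to-live, and the lambda standing in for the origin application. The clock is injectable so the expiry can be shown without waiting.

```python
import time
from typing import Callable, Dict, Tuple


class CachingReverseProxy:
    """Answers from a local cache when possible, otherwise asks the origin."""

    def __init__(self, origin: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._origin = origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # path -> (expiry, body)
        self.origin_calls = 0

    def handle(self, path: str) -> str:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]                 # fresh cached copy, origin untouched
        self.origin_calls += 1
        body = self._origin(path)           # forward the request to the application
        self._store[path] = (now + self._ttl, body)
        return body


fake_now = [0.0]                            # controllable clock for the demo
proxy = CachingReverseProxy(lambda p: f"page for {p}", ttl_seconds=60,
                            clock=lambda: fake_now[0])
proxy.handle("/index.html")                 # first request goes to the origin
proxy.handle("/index.html")                 # second request is served from cache
fake_now[0] = 61.0                          # after the TTL the entry is stale
proxy.handle("/index.html")                 # goes to the origin again
```

A CDN node applies the same idea, only placed geographically close to the user instead of in front of the site's own servers.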

7. Use distributed file systems and distributed database systems

No single powerful server can meet the growing business needs of large web sites.

When the database and the file system become bottlenecks, a distributed database and a distributed file system are needed.
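One common way to spread data across several database nodes is to shard by a hash of the key. The notes do not say which scheme is meant, so the following is only an illustrative sketch; the `shard_for` function and the node names are hypothetical.

```python
import hashlib
from typing import List


def shard_for(key: str, shards: List[str]) -> str:
    """Map a key to one shard using a stable hash of the key."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would send the same key to different shards.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


shards = ["db-0", "db-1", "db-2", "db-3"]   # hypothetical database nodes
placement = {user: shard_for(user, shards) for user in ("alice", "bob", "carol")}
```

Plain modulo hashing remaps most keys when the number of shards changes, which is why schemes such as consistent hashing are often considered instead.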

8. Use NoSQL and search engines

As the site's business grows more complex, so do its data storage and retrieval requirements.

Introducing NoSQL stores and search engines allows more flexible and efficient data access and fast search over site content.
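To give a feel for why a search engine finds content quickly, here is a toy inverted index, the core data structure behind full-text search. It is a deliberately simplified sketch: the `build_index` and `search` functions and the sample documents are made up for the example, and tokenization is just whitespace splitting.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase word to the ids of the documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return sorted ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())    # keep only documents with this word too
    return sorted(result)


docs = {1: "red running shoes", 2: "blue running jacket", 3: "red jacket"}
index = build_index(docs)
hits = search(index, "red jacket")
# hits == [3]
```

A query looks up a few words in the index instead of scanning every document, which is what keeps search fast as content grows.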

9. Split services

To cope with increasingly complex business scenarios, websites are often split into different product lines.

The site is divided into separate applications by product line, and each application is deployed independently.

10. Distributed services

As the business is split into ever smaller pieces and the storage systems grow ever larger, the overall complexity of the application system increases exponentially, and deployment and maintenance become harder. Every application connects to every database, so database connections run short.

The solution is to extract the common business logic into services that are deployed independently and expose reusable business services for the upper-layer applications to call.

For example, user management, product management, and similar functions are extracted as services, and only these services connect to the database. The upper-layer applications handle the business logic related to the user interface.
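The layering can be sketched as follows. The class names `UserService` and `ShopApp` are hypothetical, and the call is an ordinary in-process method call here; in a real deployment the application would reach the service over the network through some remote call mechanism.

```python
from typing import Dict, Optional


class UserService:
    """Shared service: the only component that touches the user store."""

    def __init__(self) -> None:
        self._users: Dict[int, str] = {1: "alice"}   # stands in for the user database

    def get_name(self, user_id: int) -> Optional[str]:
        return self._users.get(user_id)


class ShopApp:
    """Upper-layer application: presentation logic only, no database access."""

    def __init__(self, users: UserService) -> None:
        self._users = users

    def greeting(self, user_id: int) -> str:
        name = self._users.get_name(user_id)
        return f"Welcome back, {name}!" if name else "Welcome, guest!"


user_service = UserService()                 # one service shared by all applications
shop = ShopApp(user_service)
known = shop.greeting(1)                     # "Welcome back, alice!"
unknown = shop.greeting(42)                  # "Welcome, guest!"
```

Because only `UserService` holds a connection to the user store, adding more applications does not add more database connections.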

Afterword.

If concurrent access can be kept under control, many tricky technical problems never arise.

Technology is used to solve business problems, and business problems can be solved by business means.

Understanding the background and history of mature web architecture solutions makes technology selection and architecture decisions targeted and to the point.



