1. I/O optimization

Increase caching to reduce the number of disk accesses.

At the underlying operating-system level, optimize the disk management system and design appropriate disk scheduling and disk addressing policies.

At the application level, design appropriate disk storage structures and policies for accessing them. For example, we can build indexes for stored data so that lookups go through the index and require fewer disk accesses, and we can access the disk asynchronously and in non-blocking ways to speed it up further.

Apply an appropriate RAID policy to improve disk I/O.

2. Tune the Web front-end

- Reduce the number of network interactions (merge multiple requests).
- Reduce the amount of data transferred over the network (compression).
- Minimize encoding work (convert characters to bytes in advance, or reduce the character-to-byte conversion process).
- Use the browser cache.
- Reduce Cookie transmission.
- Lay out the page properly.
- Use page compression.
- Lazy-load the page.
- Put CSS at the top of the page and JS at the bottom.
- Use a CDN.
- Use a reverse proxy.
- Serve pages statically.
- Remote deployment.

3. Service degradation (automatic graceful degradation)

Two common approaches are denying service (rejecting some requests, for example from low-priority callers or at random) and shutting down services (turning off non-essential functions).

4. Idempotent design

Some services are inherently idempotent, such as setting the user’s gender to male. No matter how many times you set the user’s gender, the result will be the same. However, for transfer transactions and other operations, the problem will be more complicated. It is necessary to verify the validity of service invocation through transaction numbers and other information, and only valid operations can continue to be executed.

(Note: Idempotence is a promise of the system's interface, not an implementation detail: as long as a call to the interface succeeds, calling it multiple times from outside has the same effect on the system as calling it once. An interface declared idempotent assumes that external call failures are normal and that retries after a failure are inevitable.)
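
As a minimal sketch of the transaction-number check described above (the class, method, and in-memory store are assumptions for illustration; a real system would persist transaction numbers with a unique constraint), a transfer service might look like this:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a transfer operation made idempotent by tracking
// transaction numbers. A production system would store these in a database
// with a unique constraint rather than in process memory.
public class TransferService {
    private final Set<String> processedTxIds = ConcurrentHashMap.newKeySet();

    public boolean transfer(String txId, String fromAccount, String toAccount, long amountCents) {
        // If this transaction number was already processed, report success
        // without applying the transfer again, so retries are harmless.
        if (!processedTxIds.add(txId)) {
            return true;
        }
        // ... validate the transaction and perform the actual debit/credit here ...
        return true;
    }
}
```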

5. Failover

If any server in the data server cluster goes down, all read and write operations on this server need to be rerouted to other servers to ensure that data access does not fail. This process is called failover.

Failover includes failure confirmation (heartbeat detection and application access failure report), access transfer, and data recovery.

Failover ensures that if one copy of data is inaccessible, other copies of data can be switched to ensure system availability.
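
Below is a hedged sketch of the access-transfer step (the `Replica` interface and class names are illustrative assumptions, not a specific product's API): a read that fails on one copy of the data is retried on the next copy.

```java
import java.util.List;

// Hypothetical sketch of access transfer during failover: skip replicas already
// marked dead by heartbeat detection, and if a read still fails, move on to the
// next copy so data access does not fail.
public class FailoverReader {
    interface Replica {
        byte[] read(String key) throws Exception;
        boolean isAlive(); // result of heartbeat detection
    }

    private final List<Replica> replicas;

    public FailoverReader(List<Replica> replicas) {
        this.replicas = replicas;
    }

    public byte[] read(String key) throws Exception {
        Exception last = null;
        for (Replica replica : replicas) {
            if (!replica.isAlive()) continue;   // failure confirmation: server is down
            try {
                return replica.read(key);       // access transfer: try this copy
            } catch (Exception e) {
                last = e;                       // record the failure, try the next copy
            }
        }
        throw new Exception("all replicas unavailable", last);
    }
}
```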

6. Performance optimization

According to the hierarchical architecture of websites, performance optimization can be divided into web front-end performance optimization, application server performance optimization, and storage server performance optimization.

  1. Web front-end performance optimization

Browser access optimization: reduce HTTP requests; use browser caching; enable compression; put CSS at the top of the page and JavaScript at the bottom; reduce Cookie transmission; use CDN acceleration; use a reverse proxy.

  2. Application server performance optimization

Distributed caching (Redis, etc.); asynchronous operations (message queues); clustering (load balancing); code optimization.
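
As a hedged sketch of the asynchronous-operation idea (the class and an in-process queue are assumptions standing in for a real message queue product), the request thread only enqueues the work and returns, while a consumer thread performs the slow write:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: requests enqueue messages and return immediately; a
// background consumer drains the queue and does the slow persistence work.
// A real deployment would use a message queue product instead of this
// in-process queue.
public class AsyncWriter {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    public AsyncWriter() {
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take(); // blocks until work arrives
                    persist(message);              // slow work off the request path
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    public boolean submit(String message) {
        return queue.offer(message); // false if the queue is full (back pressure)
    }

    private void persist(String message) {
        // ... write to the database or a downstream service ...
    }
}
```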

  3. Storage performance optimization

Mechanical disks vs. SSDs; B+ tree vs. LSM tree; RAID vs. HDFS.

7. Code optimization

Multithreading (How do you ensure thread safety? What lock-free mechanisms are available?); resource reuse (singleton, connection pool, thread pool); data structures; garbage collection.
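
The following is a minimal sketch of two of these ideas, resource reuse and lock-free thread safety (the class and pool size are assumptions for illustration): a fixed thread pool is reused across requests, and a CAS-based atomic counter replaces a synchronized block.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: reuse threads via a fixed pool instead of creating one
// per request, and count handled requests with a lock-free atomic counter.
public class CodeOptimizationSketch {
    private static final ExecutorService POOL = Executors.newFixedThreadPool(8);
    private static final AtomicLong HANDLED = new AtomicLong();

    public static void handleRequest(Runnable work) {
        POOL.submit(() -> {
            work.run();
            HANDLED.incrementAndGet(); // thread-safe without locks (CAS)
        });
    }

    public static long handledSoFar() {
        return HANDLED.get();
    }
}
```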

8. Load balancing

HTTP redirection load balancing

When a user sends a request, the Web server returns a new URL by setting the Location header in the HTTP response, and the browser then requests that new URL, effectively redirecting the request; load balancing is achieved through this redirection. For example, when we download the PHP source package and click the download link, a nearby mirror address is returned in order to deal with download speeds in different countries and regions. The HTTP status code for the redirect is 302.

Advantages: Relatively simple.

Disadvantages: the browser needs two requests to the server to complete a single access, so performance is poor; the processing capacity of the redirection service itself may become a bottleneck, which limits the scalability of the whole cluster; and search engines may treat HTTP 302 redirects as SEO cheating and lower the site's search ranking.
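
Below is a hedged sketch of such a redirect balancer (the class, backend host names, and round-robin choice are illustrative assumptions; the `javax.servlet` API is assumed, and newer containers use `jakarta.servlet` instead): it picks a backend and answers with a 302 Location header, so the browser makes its second request directly to the chosen server.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical sketch of HTTP redirection load balancing: choose a backend
// (simple round robin) and return 302 with a Location header pointing at it.
public class RedirectLoadBalancerServlet extends HttpServlet {
    private static final List<String> BACKENDS = List.of(
            "http://web1.example.com", "http://web2.example.com");
    private final AtomicInteger next = new AtomicInteger();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String backend = BACKENDS.get(Math.floorMod(next.getAndIncrement(), BACKENDS.size()));
        resp.setStatus(HttpServletResponse.SC_FOUND);               // 302
        resp.setHeader("Location", backend + req.getRequestURI());  // second request goes here
    }
}
```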

DNS (domain name resolution) load balancing

The Domain Name System (DNS) resolves domain names. A domain name is essentially an alias for a server; what it actually maps to is an IP address. Since a domain name can be configured to map to multiple IP addresses, DNS can also serve as a load balancing service.

In practice, large sites always use DNS resolution as the first level of load balancing: the addresses a DNS lookup returns are not the physical servers that provide the Web service but internal load balancing servers, and this group of internal load balancers then distributes each request to a real Web server.

Advantages: DNS offloads the load balancing work to the DNS server, saving the website the trouble of managing and maintaining it. In addition, many DNS servers support geo-location-based resolution, i.e., resolving the domain name to the server address nearest to the user, which speeds up access and improves performance.

Disadvantages: rules cannot be defined freely; changing a mapped IP address or handling a machine failure is cumbersome, and DNS changes take effect slowly. Moreover, control over DNS load balancing lies with the domain name service provider, so the website cannot make further improvements or exercise more powerful management.

Reverse proxy load balancing

A reverse proxy can cache resources to improve website performance. The reverse proxy server is deployed in front of the Web servers (in order to cache Web responses and speed up access), which is also where a load balancing server sits. Therefore most reverse proxy servers also provide a load balancing function and manage a group of Web servers, forwarding requests to different Web servers according to the load balancing algorithm. Responses produced by the Web servers are likewise returned to the user through the reverse proxy server. Because the Web servers do not serve external traffic directly, they do not need external IP addresses, whereas the reverse proxy server needs two network adapters and both internal and external IP addresses.

Advantages: integrated with the reverse proxy server's existing role, so it is easy to deploy.

Disadvantages: The reverse proxy server is the hub for all requests and responses, and its performance can be a bottleneck.

LVS-NAT: changes the IP address of the packet (network address translation).

LVS-TUN: encapsulates one IP packet inside another (IP tunneling).

LVS-DR: changes the MAC address of the data frame to the MAC address of the selected server and then sends the modified frame on the LAN shared with the server group (direct routing).

9. Caching

Caching means storing data in the location nearest to the computation in order to speed up processing. Caching is the first step in improving software performance. One of the most important factors making CPUs faster today is the use of more cache, and caches are ubiquitous in complex software designs. Large website architectures use caching in many ways.

CDN: a content delivery network is deployed at the network providers closest to end users, which is where a user's request always arrives first. Caching some of the website's static resources (those that change little) there lets them be returned to the user at the fastest speed from the nearest location; for example, video sites and portals cache their hottest content in the CDN.

Reverse proxy: a reverse proxy is part of the front-end architecture of a website and is deployed at the site's front end. When a user's request arrives at the website's data center, the reverse proxy server is the first server to receive it, and resources cached there can be returned to the user directly.

Local cache: Hotspot data is cached locally on the application server, allowing applications to access the data directly in native memory without accessing the database.

Distributed cache: a large website holds a huge amount of data, and even if only a fraction of it is cached, the required memory exceeds what a single machine can provide. So in addition to the local cache, a distributed cache is needed: the data is cached in a dedicated distributed cache cluster, and applications access the cached data over the network.

There are two prerequisites for using a cache. First, data access is unevenly distributed: some data is accessed far more frequently than the rest, and that hot data should be kept in the cache. Second, the data is valid for a certain period of time and does not expire quickly; otherwise the cached data becomes invalid and dirty reads occur, affecting the correctness of the results. In a website application, caching not only speeds up data access but also reduces the load on back-end applications and data storage, which is crucial to the website's database architecture: the database's load capacity is almost always designed on the assumption that a cache is in place.
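
A hedged sketch of the usual read path for such a cache, often called cache-aside (the `Cache` and `Database` interfaces here are placeholders, not a specific product's API): check the cache first, fall back to the database on a miss, then populate the cache with an expiry so stale data is bounded.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical cache-aside read path: serve hot data from the cache, go to the
// backing store only on a miss, and set a TTL so invalid data does not linger.
public class CacheAsideReader {
    interface Cache {
        String get(String key);
        void set(String key, String value, long ttl, TimeUnit unit);
    }

    interface Database {
        String query(String key);
    }

    private final Cache cache;
    private final Database db;

    public CacheAsideReader(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
    }

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;                                  // cache hit: served from memory
        }
        value = db.query(key);                             // cache miss: hit the backing store
        if (value != null) {
            cache.set(key, value, 10, TimeUnit.MINUTES);   // expiry bounds dirty reads
        }
        return value;
    }
}
```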

10. Load balancing algorithm

Round Robin; Weighted Round Robin; Random; Weighted Random; Least Connections; Weighted Least Connections; Source Address Hash.
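
As a minimal sketch of one of these, weighted round robin (the class is an illustrative assumption; production balancers use smoother weighting schemes), each server appears in the dispatch list as many times as its weight and requests cycle through the list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical weighted round robin: a server with weight 3 receives three
// times as many requests as a server with weight 1.
public class WeightedRoundRobin {
    private final List<String> schedule = new ArrayList<>();
    private final AtomicInteger position = new AtomicInteger();

    public void addServer(String address, int weight) {
        for (int i = 0; i < weight; i++) {
            schedule.add(address);              // repeat the entry "weight" times
        }
    }

    public String next() {
        if (schedule.isEmpty()) {
            throw new IllegalStateException("no servers registered");
        }
        int index = Math.floorMod(position.getAndIncrement(), schedule.size());
        return schedule.get(index);
    }
}
```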

Other algorithms

Fastest algorithm: passes new connections to the server that responds fastest. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the server queue and does not assign it new user requests until it recovers.

Observed algorithm: selects the server for a new request based on the best balance of connection count and response time. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the server queue and does not assign it new user requests until it recovers.

Predictive algorithm: BIG-IP uses the collected server performance indicators to perform predictive analysis and selects the server whose performance is expected to be best in the next time slice (as detected by BIG-IP).

Dynamic Ratio-APM algorithm: BIG-IP collects performance parameters of applications and application servers to dynamically adjust traffic allocation.

Dynamic Server Act. algorithm: dynamically adds backup servers to the primary server group when the number of servers in the primary group decreases due to failures.

Quality of Service (QoS) algorithm: allocates data flows according to different priorities.

Type of Service (ToS) algorithm: allocates data flows according to load balancing for different service types (identified in the Type of Service field).

Rule-based algorithm: users can set guiding rules for different data flows.

11. The difference between extensibility and scalability

Extensibility: the ability to continuously extend or add system functions with minimal impact on the existing system. The system infrastructure is stable and does not need frequent changes, and applications have little dependency and coupling between them, so they can respond quickly to changes in demand. It is the open-closed principle applied at the level of system architecture design (open for extension, closed for modification): the architecture is designed with future functional expansion in mind, so that when new functions are added to the system, the existing structure and code do not need to be modified.

The main criterion for measuring the extensibility of a website architecture is whether new business products can be added transparently, with no impact on existing products, and whether they can be launched with little or no change to existing business functions. It also asks whether there is little coupling between different products, so that a change in one product has no impact on the others and the other products and functions do not need to change along with it.

Scalability: website scalability is the ability to scale the site's service capacity up or down simply by changing the number of deployed servers, without changing the site's software or hardware design.

It is the ability of a system to increase (or decrease) the size of its resources in order to increase (or decrease) its computing and transaction-processing capacity. If this increase or decrease is proportional, it is called linear scalability. In a website architecture, it usually means using clusters: increasing the number of servers to improve the overall transaction throughput of the system.

The main measures of architectural scalability are whether a cluster can be built from multiple servers and how easy it is to add new servers to that cluster: whether a new server can provide the same service as the existing ones, and whether there is a limit on the total number of servers the cluster can hold.

12. Consistent hashing for distributed caches

The algorithm works as follows: first construct an integer ring of size 2^32 (called the consistent hash ring) and place each cache server node on the ring according to the hash of its node name (hash values range over [0, 2^32 - 1]). Then compute the hash of the key of the data to be cached (also in [0, 2^32 - 1]). Finally, search clockwise on the ring from that key's hash for the nearest cache server node; this completes the hash mapping from the key to a server.

Optimization strategy: virtualize each physical server as a group of virtual cache server nodes, place the hash values of the virtual nodes on the hash ring, and after locating a virtual node, map it back to its physical server.

How many virtual nodes per physical server are appropriate? Empirically, about 150.
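
The following is a hedged sketch of such a ring (the class name, the MD5-based hash, and the 150 virtual nodes per server are illustrative choices consistent with the description above, not a prescribed implementation), using a sorted map as the ring and a clockwise tail-map lookup:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical consistent hash ring with virtual nodes: each physical server is
// hashed onto the ring 150 times ("server#0" ... "server#149"), and a key maps
// to the first virtual node found clockwise from its own hash.
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 150;
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(server + "#" + i), server);   // virtual node -> physical server
        }
    }

    public String serverFor(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        SortedMap<Long, String> clockwise = ring.tailMap(hash(key)); // nodes at or after the key
        Long slot = clockwise.isEmpty() ? ring.firstKey() : clockwise.firstKey(); // wrap around
        return ring.get(slot);
    }

    // Hash a string into [0, 2^32 - 1] using the first four bytes of its MD5 digest.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((d[0] & 0xFFL) << 24) | ((d[1] & 0xFFL) << 16)
                 | ((d[2] & 0xFFL) << 8) | (d[3] & 0xFFL);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```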

13. Cyber security

  1. XSS attacks

A cross-site scripting (XSS) attack is one in which hackers tamper with web pages by injecting malicious HTML scripts, so that when users browse the pages their browsers are made to perform malicious operations.

Defense: sanitization (XSS attacks generally work by embedding malicious scripts in requests, which do not appear in normal user input; filtering and sanitizing means escaping dangerous HTML characters, for example turning ">" into "&gt;"); HttpOnly (prevents XSS attackers from stealing cookies).
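
Below is a minimal sketch of that sanitization step (the class is an illustrative assumption; in practice a maintained encoding library is usually preferable): dangerous HTML characters in user input are escaped before the input is written into a page.

```java
// Hypothetical sketch of output escaping for XSS defense: injected "<script>"
// text is rendered as harmless text instead of being executed by the browser.
public final class HtmlEscaper {

    private HtmlEscaper() {
    }

    public static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```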

  2. Injection attacks: SQL injection and OS injection

SQL injection defense: use PreparedStatement; use an ORM; avoid storing passwords in plain text; handle exceptions appropriately.
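
A minimal sketch of the PreparedStatement defense (the table, column, and class names are illustrative assumptions): the user-supplied value is bound as a parameter instead of being concatenated into the SQL string, so it cannot change the structure of the query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch: parameter binding keeps user input out of the SQL text.
public class UserDao {
    public boolean userExists(Connection conn, String username) throws SQLException {
        String sql = "SELECT 1 FROM users WHERE username = ?";  // '?' is a bound parameter
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, username);                          // binding, not concatenation
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```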

  3. CSRF (Cross-Site Request Forgery). It sounds similar to XSS, but there is a big difference: XSS exploits the trusted users within a site, while CSRF exploits a trusted site by forging requests that appear to come from those trusted users.

Defense: HttpOnly; add a token to requests; check the Referer header.

  4. File upload vulnerabilities

  5. DDoS attacks

14. Encryption

One-way hash (digest) algorithms: MD5, SHA. Symmetric encryption: DES, RC, AES. Asymmetric encryption: RSA.

Asymmetric encryption is usually used for secure information transmission, digital signatures, and similar scenarios. The digital certificate a browser relies on in HTTPS transmission is essentially a public key for asymmetric encryption, certified by an authoritative certificate authority.
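
As a minimal sketch of the digital-signature use of asymmetric keys (the class name and message are illustrative; the standard `java.security` API is used), the private key signs a message and the corresponding public key, which is what a certificate carries, verifies it:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

// Hypothetical sketch: sign with the RSA private key, verify with the public key.
public class RsaSignatureDemo {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        byte[] message = "order #42: pay 100".getBytes(StandardCharsets.UTF_8);

        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(message);
        byte[] signature = signer.sign();                 // produced with the private key

        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(message);
        System.out.println("valid = " + verifier.verify(signature)); // checked with the public key
    }
}
```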

15. Flow control

Discarding traffic

Simply discarding user requests is crude. In I/O-intensive applications (including network I/O and disk I/O), the bottleneck is usually not the CPU or memory, so letting excess requests wait briefly in a single-machine memory queue can both preserve the user experience and improve resource utilization.

Asynchronize user requests through distributed message queues.
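
Below is a hedged sketch of the single-machine waiting idea (the class name and limits are illustrative assumptions): at most a fixed number of requests are processed concurrently, excess requests wait for a bounded time instead of being dropped immediately, and are discarded only when the system is truly saturated.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical single-machine flow control: bounded concurrency with bounded
// waiting, so bursts are smoothed out rather than discarded outright.
public class RequestThrottle {
    private final Semaphore permits;
    private final long maxWaitMillis;

    public RequestThrottle(int maxConcurrent, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.maxWaitMillis = maxWaitMillis;
    }

    public boolean handle(Runnable request) throws InterruptedException {
        if (!permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
            return false;                    // discard only after a bounded wait
        }
        try {
            request.run();
            return true;
        } finally {
            permits.release();
        }
    }
}
```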

Resources

  1. LVS: a comparison of the three load balancing modes (plus three other load balancing modes)
  2. Technical Architecture of Large Websites: Core Principles and Technical Analysis, by Li Zhihui
  3. Hundred-Million-Level Web System Construction: From a Single Machine to a Distributed Cluster
  4. Design and Implementation of Large Distributed Web Architecture, by Chen Kangxian