What is high concurrency

High concurrency means that a large number of requests arrive at the server at the same time or within a very short time, and each request consumes server resources to process. The number of processes and threads that can run at the same time, the number of network connections, CPU, IO, and memory are all server resources. Because these resources are limited, the server can process only a limited number of requests at the same time. From this point of view, the problem to solve in high concurrency is the limitation of resources. This article is an overview of how the overall system architecture evolves and is designed, not just of concurrent programming in Java.
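
To make the resource limit concrete, here is a minimal Java sketch of a server that can run only two requests at once and hold two more in a queue. The class name `BoundedServerDemo` and the sizes are assumptions chosen for illustration, not values from any real deployment.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: 2 worker threads plus a queue of 2 waiting requests.
final class BoundedServerDemo {

    /** Submits `requests` slow tasks and returns how many were rejected. */
    static int countRejected(int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2,                                  // exactly 2 worker threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),           // at most 2 queued requests
                new ThreadPoolExecutor.AbortPolicy()); // reject when saturated

        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a slow request holding a thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // capacity (2 running + 2 queued) is exhausted
            }
        }

        release.countDown(); // let the accepted requests finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }
}
```

Calling `BoundedServerDemo.countRejected(10)` accepts four requests (two running, two queued) and rejects the other six, which is the resource limit described above in miniature.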

Example diagram: a client accessing a server in a system. In a real scenario, clients include not only browsers, mobile apps, and PC applications, but also servers that call other servers through APIs. In that case, the calling server acts as a client rather than a server.

Problems with high concurrency

When a large number of requests arrive in a short period of time, the server processes and responds more and more slowly, may discard some requests (which is appropriate in some cases), and may even crash. Server crashes in turn lead to financial losses and loss of customers. In addition, under high concurrency, processing logic that is not robust enough can produce errors in the business logic and, in turn, inconsistent data.
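
A typical data error of this kind is overselling: two threads both read "1 item left" and both sell it. The following is a minimal sketch, assuming a hypothetical in-memory `Inventory` class, of how a compare-and-set loop keeps the count from going negative. A real system would usually enforce this in the database or a shared store as well.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical inventory counter; names are illustrative only.
final class Inventory {
    private final AtomicInteger stock;

    Inventory(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    /** Atomically takes one unit; returns false when nothing is left. */
    boolean tryTakeOne() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return false; // sold out, never go negative
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true; // our decrement won the race
            }
            // Another thread changed the value first; retry with the new value.
        }
    }

    int remaining() {
        return stock.get();
    }

    /** Runs `threads` buyers, each trying `attempts` purchases; returns units sold. */
    static int simulate(int initialStock, int threads, int attempts) {
        Inventory inventory = new Inventory(initialStock);
        AtomicInteger sold = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < attempts; j++) {
                    if (inventory.tryTakeOne()) {
                        sold.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sold.get();
    }
}
```

With 8 threads making 50 attempts each against a stock of 100, `Inventory.simulate(100, 8, 50)` sells exactly 100 units, no matter how the threads interleave.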

Levels at which to handle high concurrency problems

In my opinion, the problem of high concurrency should be considered at the following levels: Web front end, Web server, Web application, and database. My experience is limited, so treat these only as a reference.

Basic idea of processing

When facing a high concurrency problem, start from the basic request/response model and ask two questions: how to improve the capability of the “client” and how to improve the capability of the “server”.

  1. From the client side (web browser and calling side)
    • Reduce the number of requests: use caching, or handle the work in the front end when it can (for example, paging and sorting small volumes of data)
    • Avoid wasting resources and reuse them where possible, for example with connection pooling
  2. From the server side
    • Increase the amount of resources: more network bandwidth, higher-specification servers, high-performance Web servers, and high-performance databases
    • Distribute (split) the requests
      • Clusters: deploy the application as a cluster behind LVS, Nginx, etc.
      • Distributed system architecture: split the system into multiple services by business function, and deploy key core services as multiple instances for high availability.
      • Application optimization: business logic optimization, SQL optimization, read/write separation, etc.
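
The idea of distributing requests across a cluster can be sketched in a few lines of Java. This is only an illustration of round-robin selection, with a hypothetical `RoundRobinBalancer` class and made-up backend names. LVS and Nginx implement their own, more elaborate strategies.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector over a fixed list of backend addresses.
final class RoundRobinBalancer {
    private final List<String> backends;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("at least one backend is required");
        }
        this.backends = List.copyOf(backends); // defensive, immutable copy
    }

    /** Returns the next backend in rotation; safe to call from many threads. */
    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), backends.size());
        return backends.get(index);
    }
}
```

For example, a balancer built from `List.of("app-1", "app-2", "app-3")` returns `app-1`, `app-2`, `app-3`, and then `app-1` again on successive calls to `next()`.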

Basic means

A complete solution combines these techniques as needed; none of them should be applied blindly.

Client level

  • Use browser caching for JS, CSS, and images to reduce requests to the server
  • Compress files before transfer to reduce network traffic
  • Use asynchronous requests and fetch data in batches
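
To show why compression reduces network traffic, here is a small Java sketch using the standard `java.util.zip` classes. The `GzipDemo` class is hypothetical. In practice, compression is usually enabled in the web server or framework rather than written by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper that gzips text the way a server might before sending it.
final class GzipDemo {

    /** Compresses UTF-8 text with gzip. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // complete once the gzip stream is closed
    }

    /** Restores the original text, as the client would after receiving it. */
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Repetitive text such as CSS or JSON compresses well, so `gzip(text).length` is typically far smaller than the original byte count, while very short or already compressed content (images, video) gains little.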

Front-end access layer (static servers)

  • Separate dynamic and static resources; some static resources are returned directly by Nginx.
  • Distribute requests to different back-end services, for load balancing or service splitting
  • Load-balance Nginx itself, for example with LVS
  • Use the CDN service

Varnish

  • Cache dynamic content, such as JSP pages
  • Page fragment cache
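
Varnish is an HTTP cache configured outside the application, so the following Java code is not how Varnish is used. It is only a sketch of the underlying idea: keep a rendered fragment for a fixed time and re-render it once it expires. The `FragmentCache` class, its TTL parameter, and the injected clock are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical in-process cache for rendered page fragments with a time-to-live.
final class FragmentCache {

    private static final class Entry {
        final String html;
        final long expiresAtMillis;

        Entry(String html, long expiresAtMillis) {
            this.html = html;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the expiry logic can be tested

    FragmentCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached fragment, rendering it again only if missing or expired. */
    String get(String key, Supplier<String> renderer) {
        long now = clock.getAsLong();
        // compute() is atomic per key, so a fragment is rendered by one thread at a time.
        // The renderer must not call back into this cache.
        Entry entry = entries.compute(key, (k, old) ->
                (old != null && old.expiresAtMillis > now)
                        ? old
                        : new Entry(renderer.get(), now + ttlMillis));
        return entry.html;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`. A short TTL trades some freshness for far fewer renders of an expensive fragment.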

Web server layer

  • Use the latest JVM and optimize the configuration
  • Adjust the Web server configuration, such as the amount of memory, the number of threads
  • Back-end server load balancing.
  • Dedicated servers by content type, for example for images, files, and videos

Web Application Layer

  • Make dynamic content static (for example, pre-generate static pages)
  • Java development optimization: use the concurrent programming model sensibly and correctly
  • Optimize business logic
  • Use caching efficiently
  • Optimize the SQL that accesses the database
  • Use an in-memory database
  • Avoid remote calls and heavy IO
  • Plan transactions and other resource-intensive operations sensibly
  • Use asynchronous processing wisely
  • Reduce real-time computing
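
As one illustration of asynchronous processing at the application layer, the sketch below runs two independent slow lookups in parallel with `CompletableFuture` instead of one after the other. The `AsyncPageAssembler` class and its `loadProfile` and `loadOrders` methods are hypothetical stand-ins for remote calls or database queries.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical page assembler that overlaps two independent slow lookups.
final class AsyncPageAssembler {

    // Stand-in for a slow remote call or query (assumed, not a real API).
    static String loadProfile(String userId) {
        pause(200);
        return "profile:" + userId;
    }

    // Second independent stand-in lookup.
    static String loadOrders(String userId) {
        pause(200);
        return "orders:" + userId;
    }

    /** Starts both lookups on the pool and combines their results when both finish. */
    static String assemble(String userId, ExecutorService pool) {
        CompletableFuture<String> profile =
                CompletableFuture.supplyAsync(() -> loadProfile(userId), pool);
        CompletableFuture<String> orders =
                CompletableFuture.supplyAsync(() -> loadOrders(userId), pool);
        // join() blocks the caller only until the slower of the two completes.
        return profile.thenCombine(orders, (p, o) -> p + "|" + o).join();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a pool of at least two threads, the combined call takes roughly as long as the slower lookup rather than the sum of both. Whether this helps in a real system depends on the lookups actually being independent and on the pool not being the bottleneck itself.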

Database level

  • Choose an appropriate database engine
  • Configuration optimization
  • Reasonable database design
  • Split databases and tables (sharding)
  • Use NoSQL wisely: store data that does not require strong transactional guarantees in NoSQL
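
Splitting tables needs a rule that maps each row to a shard. A minimal sketch, assuming a hypothetical `ShardRouter` class and tables named `orders_0`, `orders_1`, and so on, is to take the user ID modulo the shard count:

```java
// Hypothetical router that maps a user ID to one of N order tables.
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Returns a shard index in the range [0, shardCount). */
    int shardFor(long userId) {
        // floorMod keeps the result non-negative even for negative IDs.
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    /** Returns the name of the table that holds this user's orders. */
    String tableFor(long userId) {
        return "orders_" + shardFor(userId);
    }
}
```

With four shards, `new ShardRouter(4).tableFor(10)` yields `orders_2`. Simple modulo routing makes adding shards later expensive, because most keys move, which is one reason to plan the shard count carefully or to consider consistent hashing.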

Principle: divide and conquer (the “external skill”) and improve the speed of individual processing (the “internal skill”). The external skill is the easiest to improve; the internal skill requires solid ability.