Today, I will share how microservices handle high concurrency. In the previous part, I talked about the concept of microservices and how to divide modules.

In the case of high concurrency, a microservice system has several issues to consider: how services are partitioned, the high concurrency itself, database performance, middleware and caching, I/O performance bottlenecks, monitoring, automated deployment, and so on.

1. Division of microservices

Division of microservices: as mentioned before, services can be divided horizontally by function or vertically by business. The granularity can be chosen according to current product requirements. The most important goal is high cohesion and low coupling.

When an interviewer hears the phrase "high cohesion and low coupling", they may think this candidate is solid and has some foundation in technical design. So here's the question: what is high cohesion, and what is low coupling? High cohesion: every service lives in the same network or domain, and relative to the outside world the whole is a closed, secure box.

The external interfaces of the box remain unchanged, as do the interfaces between the modules inside it, but the internals of each module can change freely. Modules expose only minimal interfaces to avoid strong dependencies. Adding or removing a module should only affect the modules that depend on it, and should never touch unrelated ones.

Low coupling: at the small scale, it means reducing coupling between individual Java classes by programming to interfaces and using the object-oriented ideas of encapsulation, inheritance, and polymorphism to hide implementation details. At the module level, it means reducing the relationships between modules, cutting redundant, repeated, and crosscutting complexity, and keeping each module's responsibility as single as possible.

2. High concurrency

Once a company grows large, it has to consider compatibility, scalability, load, and so on. "High concurrency" is a common term, but how to actually guarantee it is the real question.

Supporting high concurrency involves the following aspects:

  1. Idempotence

  2. Standardized interface code

  3. DB operation performance

  4. Read/write separation

  5. Horizontal scaling of services

  6. Robustness of services (caching, rate limiting, disaster recovery)

Idempotence: idempotence means that one or many requests for a resource have the same effect on the resource itself (network timeouts aside); that is, executing the operation any number of times produces the same effect and returns the same result. This property is very effective under high concurrency. Imagine a user paying for a membership: under concurrency, a misclick, or a retry triggered by network or timing problems, may charge the user for the same transaction several times, which is a very bad experience. Interface idempotence solves exactly this kind of problem.

Idempotent solutions are as follows:

(1) Token mechanism

(2) Idempotent interface logic

(3) Idempotence handled at the database layer

Token mechanism: data is submitted together with a token; the token is stored in Redis with an expiration time. When the data is submitted, the backend validates the token, deletes it, then generates and returns a new token.
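Here is a minimal sketch of that token mechanism, assuming Spring Data Redis (StringRedisTemplate) is on the classpath; class names and key prefixes are illustrative:

```java
import java.time.Duration;
import java.util.UUID;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class IdempotentTokenService {

    private final StringRedisTemplate redis;

    public IdempotentTokenService(StringRedisTemplate redis) {
        this.redis = redis;
    }

    // Issue a one-time token with a validity period; the client echoes it back on submit.
    public String issueToken(String userId) {
        String token = UUID.randomUUID().toString();
        redis.opsForValue().set("idem:" + userId, token, Duration.ofMinutes(5));
        return token;
    }

    // Validate and consume the token; only the first request with a given token succeeds.
    // Note: get-then-delete is not atomic here; production code would wrap the
    // compare-and-delete in a Lua script.
    public boolean consumeToken(String userId, String token) {
        String key = "idem:" + userId;
        String stored = redis.opsForValue().get(key);
        if (stored == null || !stored.equals(token)) {
            return false; // missing, expired, or mismatched token
        }
        return Boolean.TRUE.equals(redis.delete(key));
    }
}
```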

Idempotent interface logic: a common approach is to add a serial number and a request source to the interface parameters, and put a unique index on the combination of the two. The requester and the request can then be identified reliably, preventing duplicate requests from being processed.

Database processing: the DB layer can handle idempotence in several ways: 1. pessimistic locks; 2. optimistic locks; 3. unique indexes or composite unique indexes; 4. distributed locks.

Pessimistic lock: a pessimistic lock assumes a conflict is coming and locks in advance (at query time) to prevent it, e.g. select * from XXX where id = 1 for update.

Optimistic lock: locks only at update time, with an optimistic mindset. An optimistic lock is usually controlled with a version number, for example:

update XXX set name = #name#, version = version + 1 where id = #id# and version = #version#

Distributed lock: implement a distributed lock with Redis or ZooKeeper. When inserting or updating data, acquire the distributed lock, perform the operation, and release the lock.
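A minimal sketch of a Redis-based distributed lock, again assuming Spring Data Redis; in production a mature client such as Redisson is usually preferred:

```java
import java.time.Duration;
import java.util.Collections;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;

public class RedisLock {

    // Release only if we still own the lock; Lua keeps the compare-and-delete atomic.
    private static final String UNLOCK_LUA =
        "if redis.call('get', KEYS[1]) == ARGV[1] then " +
        "  return redis.call('del', KEYS[1]) " +
        "else return 0 end";

    private final StringRedisTemplate redis;

    public RedisLock(StringRedisTemplate redis) {
        this.redis = redis;
    }

    // SET key value NX PX ttl: succeeds only if the lock is currently free.
    public boolean tryLock(String key, String owner, Duration ttl) {
        return Boolean.TRUE.equals(
            redis.opsForValue().setIfAbsent("lock:" + key, owner, ttl));
    }

    public boolean unlock(String key, String owner) {
        Long result = redis.execute(
            new DefaultRedisScript<>(UNLOCK_LUA, Long.class),
            Collections.singletonList("lock:" + key), owner);
        return result != null && result == 1L;
    }
}
```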

Interface standardization: interface performance ultimately depends on how the interface is implemented, i.e. code conventions, logical structure, and so on. This deserves particular attention when the business logic is complex.

DB operations: for the persistence layer, the most common choices are MyBatis, Hibernate, and perhaps JPA. Whichever it is, beans are ultimately injected through factory classes and SQL is executed against the DB in the end. SQL optimization therefore determines the time and effect of DB operations; poorly written SQL can lead to endless loops, deadlocks, or memory overflow. In addition, test with real, representative data rather than the same record over and over, and finish with a concurrent stress test.

Read/write separation: when services and data grow large enough, the read-to-write ratio may reach 10:1. In this case read/write separation is needed to keep frequent reads from degrading write performance. Common approaches include: the MyCat middleware; Amoeba, which implements read/write separation directly; manually modifying the MySQL operation classes to implement read/write separation with random load balancing and independent permission assignment; and mysql-proxy (which is still a test version and somewhat slower).
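At the application layer, read/write splitting can also be sketched with Spring's AbstractRoutingDataSource; the lookup keys below are illustrative, and the master/replica targets are registered via setTargetDataSources(...) at startup:

```java
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

// Routes each statement to the master or a replica based on a thread-local flag.
public class ReadWriteRoutingDataSource extends AbstractRoutingDataSource {

    private static final ThreadLocal<String> KEY =
        ThreadLocal.withInitial(() -> "master");

    public static void useReplica() { KEY.set("replica"); }  // call before read-only queries
    public static void useMaster()  { KEY.set("master"); }   // call before writes

    @Override
    protected Object determineCurrentLookupKey() {
        return KEY.get();
    }
}
```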

Horizontal scaling of services: deploy multiple service nodes to reduce the load that any single node has to carry.

Service robustness: service robustness includes caching, rate limiting, and disaster recovery.

For caching, as you all know, there is plenty of common middleware: Redis, Kafka, RabbitMQ, ZooKeeper, and so on. Redis is commonly used to cache and share sessions. For larger data that would otherwise load slowly, a caching layer can also solve the problem.

What is rate limiting? Limiting the number of requests that reach a node. So how do you limit the flow? Common rate-limiting algorithms include the counter algorithm, the token bucket, and the leaky bucket. There are several ways to apply them: use the Spring Cloud component Zuul to limit requests; use Google's RateLimiter (from Guava), which implements the common algorithms; implement rate limiting on Redis; or even use Nginx directly for counting limits, where you can limit the request rate, the number of connections per IP, and the number of connections per service. For example:

```nginx
limit_req_zone $binary_remote_addr zone=req_one:20m rate=12r/s;
limit_conn_zone $binary_remote_addr zone=addr:10m;
limit_conn_zone $server_name zone=perServer:20m;

server {
    listen 80;
    location / {
        proxy_pass http://ip:port;
        limit_req zone=req_one burst=80 nodelay;
        limit_conn addr 20;
    }
}
```
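In application code, a minimal sketch with Guava's RateLimiter (token-bucket based; the limit value is illustrative) looks like this:

```java
import com.google.common.util.concurrent.RateLimiter;

public class RateLimitDemo {

    // Allow at most 100 permits per second in this process.
    private static final RateLimiter LIMITER = RateLimiter.create(100.0);

    public static String handleRequest() {
        if (!LIMITER.tryAcquire()) {   // non-blocking: reject when over the limit
            return "429 Too Many Requests";
        }
        return "OK";
    }
}
```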

In fact, since Spring Boot 2.x, Spring Cloud has offered its own Spring Cloud Gateway as the gateway, and spring-cloud-gateway provides a Redis-based implementation for rate limiting.
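A minimal sketch of that Redis-based limiter, assuming spring-cloud-starter-gateway and a reactive Redis starter are on the classpath; the route, path, and service names are illustrative:

```java
import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.cloud.gateway.filter.ratelimit.RedisRateLimiter;
import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import reactor.core.publisher.Mono;

@Configuration
public class GatewayRateLimitConfig {

    @Bean
    public RedisRateLimiter redisRateLimiter() {
        // replenishRate = 10 requests/s, burstCapacity = 20
        return new RedisRateLimiter(10, 20);
    }

    @Bean
    public KeyResolver ipKeyResolver() {
        // Rate-limit per client IP (an illustrative choice of key).
        return exchange -> Mono.just(
            exchange.getRequest().getRemoteAddress().getAddress().getHostAddress());
    }

    @Bean
    public RouteLocator routes(RouteLocatorBuilder builder,
                               RedisRateLimiter limiter, KeyResolver ipKeyResolver) {
        return builder.routes()
            .route("order-service", r -> r.path("/order/**")
                .filters(f -> f.requestRateLimiter(c -> {
                    c.setRateLimiter(limiter);
                    c.setKeyResolver(ipKeyResolver);
                }))
                .uri("lb://order-service"))
            .build();
    }
}
```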

As for circuit breaking (fusing), it is essential in real business. For example: a user tries to snap up an item in a flash sale, or to buy a phone from the Mi mall right when it goes on sale, and finds huge numbers of requests arriving at once. Besides rate limiting, a circuit breaker (fuse) is needed to leave the user a good experience. When the user clicks the buy button and the current request volume is too high, ask the user to wait and show a friendly waiting page, rather than failing the request outright or throwing a red error; that is a terrible experience, and a frustrated user may never come back.

Spring Cloud Gateway, acting as the gateway, uses HystrixGatewayFilterFactory to create a filter that adds circuit breaking at the route level.
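A minimal sketch of that route-level fuse (newer Spring Cloud releases replace Hystrix with Resilience4j; the route and fallback path here are illustrative):

```java
import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GatewayHystrixConfig {

    @Bean
    public RouteLocator hystrixRoutes(RouteLocatorBuilder builder) {
        return builder.routes()
            .route("seckill", r -> r.path("/seckill/**")
                .filters(f -> f.hystrix(c -> c
                    .setName("seckillFallback")
                    .setFallbackUri("forward:/wait")))  // friendly waiting page
                .uri("lb://seckill-service"))
            .build();
    }
}
```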

3. Middleware or cache issues

As the number of users grows exponentially, caching is a good way to reduce pressure on services, because it buffers the load that requests place on them. Common middleware here includes Redis, MQ (RabbitMQ, RocketMQ), Kafka, ZooKeeper, and so on.

Redis is generally used to cache sessions or user information so that sessions can be shared across machines. It can also serve as a distributed lock under distributed high concurrency, for features such as flash sales and order grabbing. It is also used to cache order information so that a large backlog of orders does not put a high load on the server. In short, Redis often acts as a buffer.

ZooKeeper also stores a large amount of state. For example, when Hadoop uses YARN for resource scheduling, ZooKeeper stores the state machine status and task information (including historical information).

4. I/O performance bottlenecks

Every industry's business may differ, but most industries involve storage, the most direct case being the database; examples include e-commerce, logistics, AI algorithms, and so on. For e-commerce, storage is the interaction between page data and the backend store; for logistics, it is storing logistics and goods information; for AI, it is storing datasets, models, image files, training code, and so on. In general, whatever the storage, anything that touches disks or hard drives will run into I/O problems.

For I/O in blocking mode, threads are often in short supply even when a thread pool is used to reuse them. In blocking I/O, large numbers of threads block waiting for data; suspended, they can only wait, so CPU usage stays low, system throughput suffers, and memory usage climbs, possibly to the point of overflow. With network I/O, network jitter or failure can leave threads blocked for a long time, and the whole system becomes unreliable.

What is NIO? java.nio is the New I/O API introduced in JDK 1.4, which provides buffer support for all primitive types (except boolean) and a non-blocking, highly scalable networking model. The core NIO APIs are Channel, Buffer, and Selector.

Channel: an NIO channel is similar to a stream, with some differences: 1. a channel can both read and write, while a stream only reads or only writes; 2. a channel can read and write data asynchronously; 3. a channel always reads from or writes to a buffer.

Buffer: A Buffer is essentially a block of memory that can be written to and then read again. This object provides a set of methods to make it easier to use a block of memory. Reading and writing data using a Buffer usually involves these four steps:

  1. Write data to buffer;

  2. Call buffer.flip();

  3. Read data from the buffer;

  4. Call buffer.clear() or buffer.compact()

When data is written to a Buffer, the Buffer records how much was written. To read it back, the Buffer must be switched from write mode to read mode with the flip() method. In read mode, all data previously written to the Buffer can be read. Once all the data has been read, the buffer must be cleared so that it can be written to again.
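A minimal sketch of those four steps, reading a file through a FileChannel (the file name is illustrative):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioReadDemo {
    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(
                Paths.get("data.txt"), StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (channel.read(buffer) != -1) {  // 1. channel fills the buffer
                buffer.flip();                    // 2. switch to read mode
                while (buffer.hasRemaining()) {   // 3. read data from the buffer
                    System.out.print((char) buffer.get());
                }
                buffer.clear();                   // 4. clear for the next write
            }
        }
    }
}
```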

Selector: a component that checks whether read or write events are ready on one or more NIO channels. Multiple channels can register with the same Selector as events, which allows a single thread to handle multiple requests.

Calling the Selector's select() or selectNow() returns only those SelectableChannel instances that have data ready to read.
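A minimal sketch of one thread multiplexing many connections with a Selector (the port is illustrative and error handling is omitted):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorEchoServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);               // non-blocking is required
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                         // block until events are ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {              // new connection
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {         // data ready: echo it back
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(256);
                    if (client.read(buf) == -1) { client.close(); continue; }
                    buf.flip();
                    client.write(buf);
                }
            }
        }
    }
}
```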

In addition, NIO can be used to process large files in slices (chunked processing).

5. Monitoring issues

As the business keeps expanding, the number of services grows and the online environment becomes more and more complex. Operational pain points such as untangling complex service dependencies, mapping dependencies automatically, tracking calls in real time, analyzing exceptions in detail, tracing call links, planning capacity in real time, and analyzing historical trends become basic operational demands, and solutions for them are especially important.

The purposes of monitoring include performance monitoring, service robustness, operations management, automated problem analysis, dynamic scaling, and so on. There are also many monitoring approaches, such as the Dashboard provided by Spring Cloud for tracing service calls.

The Kubernetes Dashboard can track the status of each service's pods and each service's system metrics, on top of Kubernetes' basic monitoring (pod health, memory, CPU). To monitor JVM metrics in a microservice project, such as thread pools, TPS, QPS, RT, system load, threads, memory, classes, Tomcat, and GC, a Kubernetes-based Prometheus job can scrape the business metrics, and Prometheus works with Grafana as the front-end UI.
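A minimal sketch of exposing such JVM metrics for a Prometheus job to scrape, assuming micrometer-registry-prometheus is on the classpath; in a Spring Boot service the actuator's /prometheus endpoint does this out of the box, and the port and path below are illustrative:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpServer;

import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmMemoryMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmThreadMetrics;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;

public class MetricsEndpoint {
    public static void main(String[] args) throws IOException {
        PrometheusMeterRegistry registry =
                new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
        new JvmMemoryMetrics().bindTo(registry);   // mem
        new JvmGcMetrics().bindTo(registry);       // GC
        new JvmThreadMetrics().bindTo(registry);   // threads

        // Expose /metrics for the Prometheus job to scrape.
        HttpServer server = HttpServer.create(new InetSocketAddress(9404), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = registry.scrape().getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
    }
}
```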

6. Automated deployment

As the number of services grows, operating and maintaining them all becomes a hassle. A mechanism that can trigger automated deployment of every microservice with one click is definitely a good thing. For details, see the article "Microservice automated deployment CI/CD", a very detailed hands-on introduction to automated deployment.