I. Introduction to microservices

1. What are microservices?

Before introducing microservices, we should first understand what a microservice is. As the name implies, a microservice should be understood from two aspects: what is “micro” and what is “service”? The “micro” part is well captured by the small but famous “two-pizza team” (a term first proposed by Amazon CEO Jeff Bezos, meaning that a single service should be small enough that everyone involved in its design, development, testing, and operation can be fed with two pizzas). A “service” is distinct from a system: a service is one or a group of relatively small and independent functional units, the minimum set of functions a user can perceive.

2. Origin of micro services

Microservices were first proposed by Martin Fowler and James Lewis in 2014. The microservices architectural style is a way to develop a single application as a suite of small services. Each service runs in its own process and communicates through lightweight mechanisms, usually an HTTP API. These services are built around business capabilities, can be deployed independently through automated deployment mechanisms, can be implemented in different programming languages and with different data storage technologies, and require minimal centralized management.

3. Why microservices?

In the traditional IT industry, software mostly consists of piles of independent systems whose problems can be summed up as poor scalability, low reliability, and high maintenance cost. Later, SOA was introduced to service-orient these systems, but because early SOA used the bus pattern, which was strongly tied to a particular technology stack such as J2EE, the legacy systems of many enterprises were difficult to connect, migration took too long and cost too much, and it took time for new systems to stabilize. As a result, SOA looked beautiful on paper but became an enterprise luxury that small and medium-sized companies could not afford.

3.1 Problems caused by the monolithic architecture

The monolithic architecture works well at a relatively small scale, but as the system grows it exposes more and more problems, mainly the following:

1. Complexity increases

For example, some projects have hundreds of thousands of lines of code, with vague boundaries between modules and confused logic. The more code there is, the greater the complexity, and the harder problems become to solve.

2. Rising technical debt

Staff turnover is normal, but some employees neglect code quality before leaving, leaving many pitfalls behind. Because a monolithic project’s codebase is huge, these pitfalls are hard to find, which creates a lot of trouble for new staff. The greater the turnover, the more pitfalls are left behind: the so-called technical debt keeps accumulating.

3. Deployment slows down

This is easy to understand: a monolithic architecture has many modules and a large amount of code, so deploying the project takes longer and longer. At one point a single launch took 20 minutes; starting the project several times a day leaves developers very little time.

4. Hindering technological innovation

For example, a previous project was written with Struts2. Because the modules were inextricably linked, the code volume was large, and the logic was unclear, rebuilding the project with Spring MVC would be extremely difficult and costly. So more often than not, the company has to stick with the old Struts architecture, which hinders innovation.

5. Cannot scale as required

Suppose the movie module is CPU-intensive and the order module is IO-intensive, and we want to improve the performance of the order module, for example by adding memory or disks. Because all the modules live in one application, scaling the order module forces us to consider the performance factors of every other module as well. Since we cannot extend the performance of one module to the detriment of the others, we cannot scale on demand.

3.2 Differences between microservices and the monolithic architecture

All modules of a monolithic architecture are coupled together, resulting in a large amount of code that is difficult to maintain. Each module of a microservice system is the equivalent of a separate project, with significantly less code, so the problems encountered are relatively easy to solve.

In a monolithic architecture, all modules share one database and a single storage mode. Each microservice module can use a different storage mode (for example, some use Redis, some use MySQL), and each module has its own corresponding database.

In a monolithic architecture, all modules are developed with the same technology; in microservices, each module can use a different development technology, making the development mode more flexible.

3.3 Differences between microservices and SOA

Microservices are, by their very nature, an SOA architecture, but the connotation is different. Microservices are not bound to any particular technology: in a microservice system there can be services written in Java or in Python, unified into one system through a RESTful architectural style. The microservice approach is therefore independent of any specific technology implementation and highly extensible.

4. The nature of microservices

The key to microservices is in fact not the services themselves but the basic architecture the system provides. This architecture lets microservices be deployed, run, and upgraded independently; it also keeps microservices structurally “loosely coupled” while functionally presenting a unified whole. This so-called “unified whole” means a unified interface style, unified permission management, a unified security policy, a unified release process, unified logging and auditing, unified scheduling, unified access, and so on.

The purpose of microservices is to effectively split applications for agile development and deployment.

Microservices advocate that teams inter-operate, not integrate. Inter-operating means defining the boundaries and interfaces of a system, keeping the whole stack within one team, and letting the team be autonomous. If teams are organized this way and communication costs are kept within each system, every subsystem becomes more cohesive, mutual dependence and coupling weaken, and cross-system communication costs drop.

5. What kind of projects are suitable for microservices

Microservices can be divided according to the independence of the business functions themselves. If the functionality a system provides is very low-level, such as an operating system kernel, a storage system, a network system, or a database system, its functions cooperate too closely with one another. Forcing such a system apart into small service units sharply increases the integration work, and this artificial cutting brings no real business isolation and cannot achieve independent deployment and operation, so it is not suitable for microservices.

The success of micro services depends on four factors:

  • Small: microservices are small in size; a two-pizza team suffices.

  • Independent: Can be deployed and run independently.

  • Light: Uses lightweight communication mechanisms and architectures.

  • Loose: Services are loosely coupled.

6. Microservice splitting and design

Moving from a monolithic architecture to a microservice architecture constantly raises the problem of demarcating service boundaries. For example, if we have a User service that provides basic information about users, should users’ avatars, pictures, and so on be separated into a new service or merged into the User service? If the service granularity is too coarse, we are back on the old monolithic road; if it is too fine, the overhead of inter-service calls becomes significant and the difficulty of managing them increases exponentially. So far there is no standard for demarcating service boundaries; it can only be adjusted according to each business system.

The general principle of splitting is this: when a business function has little or no dependence on other services, has independent business semantics, and provides data to more than two other services or clients, it should be split into a separate service module.

6.1 Principles of microservice design

  • Single responsibility principle

This means that each microservice only needs to implement its own business logic, such as the order management module, which only needs to handle the business logic of the order.

  • Principle of service autonomy

This means each microservice is independent in development, testing, and operations, including having its own database. It has a complete process of its own and can be treated as an independent project that does not rely on other modules.

  • Lightweight communication principles

First, the communication protocol should be lightweight. Second, the communication method should be cross-language and cross-platform, so that each microservice is independent enough not to be restricted by any one technology (a minimal sketch follows this list of principles).

  • Interface specification principle

Because there may be call relationships between microservices, and to avoid having to adjust other microservices whenever one microservice’s interface changes, all circumstances should be considered at design time and the interface made as general and flexible as possible, so that other modules need no adjustment.
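To make the lightweight-communication and interface principles concrete, here is a minimal sketch (not from the original text) of a Spring Boot REST endpoint for a hypothetical order service; plain JSON over HTTP can be consumed from any language or platform:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical order endpoint: the service exposes only its own business
// logic (single responsibility) over a lightweight HTTP/JSON interface.
@SpringBootApplication
@RestController
public class OrderServiceApplication {

    record Order(String id, String status) {}

    @GetMapping("/orders/{id}")
    public Order get(@PathVariable String id) {
        return new Order(id, "CREATED"); // stub payload for illustration
    }

    public static void main(String[] args) {
        SpringApplication.run(OrderServiceApplication.class, args);
    }
}
```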

7. Advantages and disadvantages of microservices

7.1 Characteristics

  1. Each microservice can run independently in its own process;
  2. A series of independent microservices together build the whole system;
  3. Each service handles an independent piece of business; a microservice generally completes one specific function, such as order management or user management.
  4. Microservices communicate with each other through lightweight mechanisms, such as REST APIs or RPC.

7.2 Advantages

  • Easy to develop and maintain

Since a single microservice module is the equivalent of one project, developing it only requires caring about that module’s logic, which reduces the amount of code and logical complexity, making development and maintenance easy.

  • Faster startup

This is relative to the monolith: starting a single microservice module is significantly faster than starting an entire monolithic project.

  • Local changes are easy to deploy

If we find a problem during development under a monolithic architecture, we have to release and restart the whole project, which is very time-consuming. Microservices are different: whichever module has the bug, we only need to fix that module’s bug, and after fixing it we only need to restart that module’s service. Deployment is relatively simple, and not having to restart the whole project saves a lot of time.

  • The technology stack is not limited

For example, suppose the order microservice and the movie microservice were both originally written in Java, and now we want to rewrite the movie microservice in Node.js. That is perfectly fine, and since we only need to focus on the movie logic, the cost of the technology swap is much lower.

  • On-demand scaling

We mentioned above that when a monolithic architecture wants to scale one module, it has to consider whether the other modules will be affected. For microservices this is not a problem at all: improving the performance of the movie module does not require considering the situation of any other module.

7.3 Disadvantages

  • High operations and maintenance (O&M) requirements

For a monolithic architecture, we only need to keep one project running. For a microservice architecture, since the project is composed of multiple microservices, a problem in any module can cause the whole system to malfunction, and it is often not easy to know which module caused it, because we cannot trace it step by step with a debugger. This places high demands on operations staff.

  • Distributed complexity

A monolithic architecture can do without distribution, but for a microservice architecture distribution is almost a mandatory technology, and the inherent complexity of distributed systems makes the microservice architecture complicated as well.

  • Interface adjustment costs are high

For example, suppose the user microservice is called by both the order microservice and the movie microservice. Once the user microservice’s interface changes significantly, all the microservices that depend on it have to be adjusted accordingly; since there may be many of them, the cost of interface adjustments rises significantly.

  • Duplication of effort

In a monolithic architecture, if a piece of business logic is shared by multiple modules, we can abstract it into a utility class that all modules call directly. Microservices cannot do this, because one microservice’s utility class cannot be invoked directly by other microservices, so we have to build the same utility class in every microservice that needs it, which leads to code duplication.

8. Microservices development framework

At present, the most commonly used development frameworks for microservices are as follows:

  1. Spring Cloud: projects.spring.io/spring-clou…

  2. Dubbo: http://dubbo.io

  3. Dropwizard: www.dropwizard.io (focused on the development of individual microservices)

  4. Consul, etcd, etc. (service discovery components for microservices)

9. Difference between Spring Cloud and Spring Boot

Spring Boot:

Designed to simplify the creation of production-grade Spring applications and services: simplified configuration, embedded web servers, and a number of out-of-the-box microservice features; it can be deployed in combination with Spring Cloud.

Spring Cloud:

A microservice toolkit that provides developers with kits for configuration management, service discovery, circuit breakers, intelligent routing, micro-proxying, the control bus, and more in distributed systems.

II. Microservices in practice: questions to consider first

1. How do clients access these services? (API Gateway)

In traditional development, all services were local and the UI could call them directly. Now the system is separated by function into independent services, each running in a separate Java process, generally on a separate virtual machine. How does the client UI access them? With N services in the background, the front end would need to remember and manage N services; if a service goes offline, is updated, or is upgraded, the front end would need to be redeployed, which obviously defeats our goal of separation, especially when the front end is a mobile application, where the pace of business change is usually faster. In addition, calling N small services directly is a non-trivial network overhead. Moreover, microservices are generally stateless, so user login information and permission management are best maintained and managed uniformly in one place (OAuth).

Therefore, there is usually a proxy or API Gateway between the N backend services and the UI, whose roles are:

  • Provide a unified service entrance, making the microservices transparent to the front end

  • Aggregate backend services to save traffic and improve performance

  • Provides security, filtering, flow control and other API management functions

My understanding is that the API Gateway can be implemented in many ways: it can be a hardware or software box, a simple MVC framework, or even a Node.js server. Its most important role is to provide an aggregation point for backend services to the front end (usually a mobile application), providing a unified service entry and decoupling the two; but an API Gateway can also become a single point of failure or a performance bottleneck.
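As a hedged illustration of the unified-entrance idea, here is a minimal route configuration sketch using Spring Cloud Gateway; the service names (order-service, movie-service) and the paths are assumptions for this example, and `lb://` presumes a service registry is in place:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class GatewayApplication {

    public static void main(String[] args) {
        SpringApplication.run(GatewayApplication.class, args);
    }

    // Route /api/orders/** to the order service; "lb://" resolves the service
    // name through the registry and load-balances across its instances.
    @Bean
    public RouteLocator routes(RouteLocatorBuilder builder) {
        return builder.routes()
                .route("orders", r -> r.path("/api/orders/**")
                        .uri("lb://order-service"))
                .route("movies", r -> r.path("/api/movies/**")
                        .uri("lb://movie-service"))
                .build();
    }
}
```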

2. How do services communicate with each other? (Service invocation)

Because the microservices are independent Java processes running on independent virtual machines, communication between services is inter-process communication (IPC), for which there are many mature solutions. The most common approaches today are the following (synchronous REST or RPC, plus asynchronous messaging); a whole book could be written about each of them, and the details are generally familiar, so I won’t go into them:

  • REST (JAX-RS, Spring Boot)

  • RPC (Thrift, Dubbo)

  • Asynchronous message invocation (Kafka, Notify)

Generally, synchronous invocation is simple and consistent, but it is prone to call-chain problems and a poor performance experience, especially when there are many call layers. The comparison between REST and RPC is an interesting topic in itself. REST is generally based on HTTP, easier to implement and to get accepted, and the server-side implementation technology is flexible: every language can support it, and it works across clients with no special client-side requirement beyond wrapping an HTTP SDK for the calls, so it is the more widely used of the two. RPC also has its own advantages: the transmission protocol is more efficient and security is more controllable. Especially within a single company, if there is a unified development specification and a unified service framework, its development-efficiency advantage is more obvious. Choose according to your own technical accumulation and actual situation.

Asynchronous messaging is used especially widely in distributed systems. It not only reduces the coupling between calling services but also acts as a buffer between them: a backlog of messages will not overwhelm the callee, while the caller keeps a good service experience and can get on with its own work, insulated from slow backend performance. The price to pay is a loss of strong consistency, accepting eventual consistency of the data instead. Backend services also generally have to be idempotent, because for performance reasons messages may be delivered more than once (guaranteeing that a message is received exactly once is a big performance challenge). Finally, an independent broker has to be introduced, and if the company has no accumulated expertise in this area, operating a distributed broker is itself a great challenge.
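A minimal sketch of the asynchronous style with a Kafka producer (the broker address, topic name, and payload are assumptions); the caller publishes an event and gets on with its work, leaving consumers to process it at their own pace:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class OrderEventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Fire-and-forget publish: the order service keeps working even if
            // consumers are slow; the broker buffers the backlog.
            producer.send(new ProducerRecord<>("order-created", "order-42",
                    "{\"status\":\"CREATED\"}"));
        }
    }
}
```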

3. How can I find so many services? (Service discovery)

In a microservice architecture, each service usually has multiple copies for load balancing. A service may go offline at any time, and new service nodes may be added in response to temporary access pressure. How do services become aware of each other? How are services managed? That is the problem of service discovery. There are generally two approaches (client-side and server-side discovery), each with its own advantages and disadvantages. Basically, distributed management of service registration information is done through ZooKeeper or similar technologies. When a service goes live, the provider registers its service information with ZooKeeper (or a similar framework), maintains a long connection via heartbeats, and updates the connection information in real time. Service callers use ZooKeeper for addressing, find a service based on a customizable algorithm, and can cache service information locally to improve performance. ZooKeeper notifies service clients when a service goes offline.

Client-side discovery: the advantages are a simple architecture and flexible extension, depending only on the service registry. The downside is that the client has to maintain the addresses of all the services it calls, which is technically difficult; large companies generally have mature internal frameworks, such as Dubbo, to support it.

Server-side discovery: the advantage is simplicity, with all services transparent to the front-end caller; it is generally used more by small companies whose applications are deployed on cloud services.

4. What if the service is down?

The biggest feature of a distributed system is that the network is unreliable. Splitting into microservices can mitigate this risk, but without special safeguards the outcome can still be a nightmare. We recently hit an online fault caused by a very insignificant SQL counting function: when traffic rose, the database load became too high, which hurt the application’s performance and in turn affected all the front-end applications calling that application’s service. So when our system is a chain of service invocations, we must make sure that the failure of any one link does not affect the whole chain. There are many corresponding means:

  • Retry mechanism

  • Current limiting

  • Circuit breakers

  • Load balancing

  • Degradation (local caching)

These methods are generally well understood and generic, so I won’t go into detail here; see, for example, Netflix’s Hystrix: github.com/Netflix/Hys…

5. Issues to consider in microservices

Here’s a nice summary of the issues that a microservices architecture needs to consider, including:

  • API Gateway

  • Interservice invocation

  • Service discovery

  • Service fault tolerance

  • Service deployment

  • Data access

III. Important microservice components

1. Basic capabilities of microservices

2. Service registry

A service discovery mechanism is needed between services to help them become aware of each other. When a service starts, it registers its service information with the registry and subscribes to the services it needs to consume.

A service registry is the core of service discovery. It holds the network address (IP address and port) of each available service instance, and it must have high availability and real-time update capability. The aforementioned Netflix Eureka is a service registry. It provides a REST API for registering services and querying service information: a service registers its IP address and port with a POST request, refreshes its registration with a PUT request every 30 seconds (a hedged sketch of this renewal loop follows the list below), deregisters with a DELETE request, and clients obtain available service instance information with a GET request. Netflix achieves high availability by running multiple instances in Amazon EC2, each with an elastic IP address; DNS servers are dynamically assigned when the Eureka service starts, and a Eureka client obtains the Eureka network address (IP address and port) by querying DNS, generally getting back a Eureka server address in the same availability zone as the client. Other systems that can serve as a service registry include:

  • Etcd – a highly available, distributed, strongly consistent key-value store; both Kubernetes and Cloud Foundry use etcd.

  • Consul – a tool for configuring and discovering services. It provides an API that lets clients register and discover services, and it can run health checks to determine service availability.

  • ZooKeeper – a high-performance coordination service widely used in distributed applications. Apache ZooKeeper was originally a subproject of Hadoop but is now a top-level project.
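Back to Eureka’s REST contract described above: a hedged sketch of the 30-second renewal loop using Java’s built-in HttpClient. The URL follows Eureka’s documented v2 REST style, but the host, application name, and instance ID here are assumptions:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EurekaHeartbeat {
    public static void main(String[] args) {
        HttpClient http = HttpClient.newHttpClient();
        // Assumed v2-style path: PUT against the instance renews its lease.
        HttpRequest renew = HttpRequest.newBuilder()
                .uri(URI.create("http://eureka:8761/eureka/v2/apps/ORDER-SERVICE/100.19.20.01:16888"))
                .timeout(Duration.ofSeconds(2))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();

        // Renew every 30 seconds, as the registry contract above describes.
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            try {
                http.send(renew, HttpResponse.BodyHandlers.discarding());
            } catch (Exception e) {
                // A missed renewal eventually lets the registry evict this instance.
            }
        }, 0, 30, TimeUnit.SECONDS);
    }
}
```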

2.1 Service registration and discovery with ZooKeeper

To put it simply, ZooKeeper can act as a service registry: the service providers form a cluster, and a service consumer obtains a specific provider’s access address (IP + port) from the registry and then calls that provider directly.

ZooKeeper is essentially a distributed file system. Whenever a service provider deploys, it registers its service under a path in ZooKeeper: /{service}/{version}/{ip:port}. For example, if our HelloWorldService is deployed on two machines, ZooKeeper creates two directories: /HelloWorldService/1.0.0/100.19.20.01:16888 and /HelloWorldService/1.0.0/100.19.20.02:16888.

ZooKeeper provides a “heartbeat detection” function: it periodically sends a request to each service provider (in fact it keeps a socket connection established). If a provider fails to respond for a long time, the service center considers it “down” and removes it. For example, if the machine 100.19.20.02 goes down, the only path left in ZooKeeper will be /HelloWorldService/1.0.0/100.19.20.01:16888.

The service consumer watches the path (/HelloWorldService/1.0.0); whenever the data under this path changes (a provider is added or removed), ZooKeeper notifies the service consumer that the provider address list has changed, and the consumer updates its copy.

More importantly, ZooKeeper has inherent fault tolerance and disaster recovery capabilities (such as leader election) to ensure high availability of the service registry.
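Putting the pieces above together, here is a minimal registration-and-lookup sketch using Apache Curator as the ZooKeeper client (an assumption; the original text names no client library). The ephemeral node is what makes the heartbeat-based removal work: it disappears automatically when the provider’s session dies:

```java
import java.util.List;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

public class ZkServiceRegistry {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3)); // assumed ZK address
        client.start();

        // Provider side: register as an EPHEMERAL node so the entry vanishes
        // automatically if this process dies (the "heartbeat" removal above).
        client.create()
              .creatingParentsIfNeeded()
              .withMode(CreateMode.EPHEMERAL)
              .forPath("/HelloWorldService/1.0.0/100.19.20.01:16888");

        // Consumer side: list the currently available providers; a real
        // consumer would also watch this path to be notified of changes.
        List<String> providers = client.getChildren().forPath("/HelloWorldService/1.0.0");
        System.out.println(providers);
    }
}
```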

3. Load balancing

To ensure high availability, each microservice needs multiple deployed instances providing the service. The client then has to load-balance across them.

3.1 Common Load Balancing Policies

3.1.1 Random

Requests from the network are randomly assigned to multiple servers within the network.

3.1.2 Round robin

Each request from the network is assigned to the internal servers in turn, from 1 to N, then starting over again. This algorithm suits server groups in which all servers have the same configuration and average service requests are relatively balanced.

3.1.3 Weighted round robin

Assign a different weight to each server according to its processing capacity, so that each receives a number of requests proportional to its weight. For example, if server A’s weight is 1, B’s is 3, and C’s is 6, then A, B, and C receive 10%, 30%, and 60% of the requests respectively. This algorithm ensures that high-performance servers get more utilization while low-performance servers are not overloaded.
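A minimal sketch of weighted round robin (an illustration, not a production balancer): expanding the cycle by weight reproduces the 10%/30%/60% split from the example above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class WeightedRoundRobin {
    private final List<String> cycle = new ArrayList<>();
    private final AtomicInteger position = new AtomicInteger();

    // Each server appears in the cycle as many times as its weight,
    // so weights A=1, B=3, C=6 yield 10%, 30%, and 60% of requests.
    public void add(String server, int weight) {
        for (int i = 0; i < weight; i++) {
            cycle.add(server);
        }
    }

    public String next() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(position.getAndIncrement(), cycle.size());
        return cycle.get(index);
    }

    public static void main(String[] args) {
        WeightedRoundRobin lb = new WeightedRoundRobin();
        lb.add("A", 1);
        lb.add("B", 3);
        lb.add("C", 6);
        for (int i = 0; i < 10; i++) {
            System.out.print(lb.next() + " "); // one full weighted cycle
        }
    }
}
```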

3.1.4 IP Hash

This is done by generating a hash of the request’s source IP and using this hash to pick the real server. It means requests from the same host always land on the same server, so no session mapping needs to be stored. Note, however, that this approach can lead to server load imbalance.

3.1.5 Least connections

The time each client request spends on a server may vary greatly, and as working time accumulates, simple round-robin or random balancing may leave very different numbers of in-flight connections on each server, which is not real load balancing. The least-connections algorithm keeps a record for each server of the number of connections it is currently handling; a new request is dispatched to the server with the fewest current connections, so the balance better matches the actual situation and the load is more even. This algorithm suits services with long-running requests, such as FTP.

4. Fault tolerance

Fault tolerance, understood literally, means tolerating faults: not letting a fault spread, and keeping its impact within a fixed boundary. “An ant hole can cause the collapse of a thousand-mile dike”; fault tolerance is how we keep the ant hole from growing. The common techniques of degradation, rate limiting, circuit breaking, and timeout retries are all means of fault tolerance.

When invoking a service cluster, if a microservice call fails abnormally, such as a timeout, connection exception, or network exception, fault tolerance is applied according to the configured policy. Currently supported fault tolerance policies include fail-fast and failover. If calls fail several times in a row, the circuit is broken directly, preventing one service’s failure from dragging down all the services that depend on it.

4.1 Fault tolerance strategy

4.1.1 Fail-fast

If the service call fails, report an error immediately. Usually used for non-idempotent write operations.

4.1.2 Failover

When a service call fails, retry on another server. Typically used for read operations, though retries introduce longer latency. The number of retries is usually configurable.

4.1.3 Failsafe

Fail-safe: when a service invocation fails, ignore the exception. Usually used for writing logs.

4.1.4 Fail-back (automatic recovery)

When a service call fails, record the failed request and resend it periodically. Usually used for message notification.

4.1.5 Forking cluster

Call multiple servers in parallel and return as soon as one succeeds. Usually used for read operations with high real-time requirements. The maximum parallelism can be set with forks=n.

4.1.6 Broadcast Invocation

Broadcast the call to all providers one by one; if any one fails, the call fails. Typically used to notify all providers to update local resource information such as caches or logs.

5. Circuit breaking

Circuit breaking can be described as a kind of “intelligent fault tolerance”. When the number or ratio of failed calls reaches a threshold, the breaker trips open and the program automatically cuts off the current RPC calls to prevent the error from spreading further. Implementing a circuit breaker mainly means considering three states: closed, open, and half-open. The transitions between the states are described below.

When handling exceptions, we should decide how to deal with them according to the specific business situation. For example, if the commodity service we call has only temporarily been degraded, then as the gateway caller we should switch to an alternative service, or fetch fallback data, and give the user a friendly prompt. We should also distinguish between exception types: a dependent service may have crashed, which can take a long time to resolve, or it may just be a timeout due to temporarily high server load. A circuit breaker should be able to identify the exception type and adjust its tripping strategy according to the specific error. It should also offer manual settings, so that when the recovery time of a failed service is uncertain an administrator can switch the breaker state by hand. Finally, circuit breakers are meant for invoking remote services or shared resources that may fail; using them on locally cached private resources only adds overhead to the system. Also note that a circuit breaker cannot substitute for exception handling in your application’s business logic.

Some exceptions are stubborn, sudden, unpredictable, and hard to recover from, and can lead to cascading failures (for example, if part of a heavily loaded service cluster fails while still occupying a large share of resources, the whole cluster may suffer). If we just retry over and over, the result is mostly failure. Therefore, our application needs to enter a fail-fast state immediately and take appropriate measures to recover.

We can implement a circuit breaker with a state machine that has three states (a minimal sketch follows the list):

  • Closed: the circuit breaker starts closed by default, allowing operations to execute. It internally records the number of recent failed operations; if an operation fails, the count is incremented. If the number of failures (or the failure rate) reaches a threshold within a certain period, the breaker switches to Open. In the open state, the breaker starts a timeout timer, set to give the cluster time to recover from the failure; when the timer expires, the breaker switches to Half-open.

  • Open: in this state, operations fail immediately and an exception is thrown at once.

  • Half-open: in this state, the circuit breaker allows a limited number of operations through. If they all succeed, the breaker assumes the fault has recovered, switches to Closed, and resets the failure count. If any of them fails, the breaker assumes the fault persists, switches back to Open, and restarts the timer (giving the system more time to recover).
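A minimal, not fully thread-safe sketch of this state machine (class name, thresholds, and the single-trial half-open behavior are illustrative assumptions):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class CircuitBreaker {
    private enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openTimeoutMillis;
    private final AtomicInteger failures = new AtomicInteger();
    private volatile State state = State.CLOSED;
    private volatile long openedAt;

    public CircuitBreaker(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    public <T> T call(Supplier<T> operation, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt < openTimeoutMillis) {
                return fallback.get();        // OPEN: fail immediately
            }
            state = State.HALF_OPEN;          // timer expired: allow a trial call
        }
        try {
            T result = operation.get();
            state = State.CLOSED;             // trial (or normal) call succeeded
            failures.set(0);
            return result;
        } catch (RuntimeException e) {
            if (state == State.HALF_OPEN || failures.incrementAndGet() >= failureThreshold) {
                state = State.OPEN;           // trip and restart the recovery timer
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }
}
```

This sketch lets a single trial call through in the half-open state; production breakers usually allow a configurable number, as described above.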

6. Rate limiting and degradation

Rate limiting and degradation ensure the stability of core services. As the number of visits increases, a threshold is set for the number of requests the system can handle, and requests exceeding the threshold are rejected directly. At the same time, to guarantee the availability of core services, some non-core services can be degraded, both by limiting a service’s maximum access and by manually degrading individual microservices through the management console.

7. SLA

“SLA” stands for “Service-Level Agreement”: a contract between a network service provider and a customer that defines terms such as the type of service, the quality of service, and customer payment. A typical SLA includes the following items:

  • Minimum bandwidth allocated to customers;

  • Customer bandwidth limit;

  • The number of customers that can be served simultaneously;

  • Notification arrangements prior to network changes that may affect user behaviour;

  • Dial-in access availability;

  • Usage statistics;

  • Minimum network performance guaranteed by the service provider, such as 99.9% uptime or at most 1 minute of downtime per day;

  • Traffic priority of various customers;

  • Customer technical support and service;

  • Penalty provisions for the service provider’s failure to meet SLA requirements.

8. API gateway

The gateway here refers to the API gateway: all API calls are unified through the API gateway layer, with unified access and output at the gateway. The basic functions of a gateway include unified access, security protection, protocol adaptation, traffic control, support for long and short connections, and fault tolerance. With the gateway in place, each API service provider team can focus on its own business logic, while the API gateway focuses on security, traffic, routing, and similar concerns.

9. Multi-level caching

The simplest form of caching is to query the database, write the result into a cache such as Redis, and set an expiration time. For example, if queryOrder (called 1000 times/s) contains a nested DB query queryProductFromDb (called 300 times/s), then the Redis penetration rate is 300/1000. When using the cache this way, pay attention to the penetration rate: if it is high, the cache is not very effective.

Another way to use the cache is to make it persistent, that is, set no expiration, which raises the problem of data updates. There are generally two methods. One is to use timestamps: by default query Redis, storing a timestamp alongside each value; on every read, compare the current system time with the stored timestamp, and if, say, more than 5 minutes have passed, query the database again. This ensures there is always data in Redis and is generally a fault-tolerance fallback for the DB. The other is to really use Redis as the DB: subscribe to the database’s binlog and push data into the cache through a heterogeneous data system, with the cache arranged in multiple levels. Within the application, a JVM cache can serve as the level-1 cache (generally suitable for small values with a high access frequency), a Redis cluster serves as the level-2 remote cache, and an outermost level-3 Redis serves as the persistent cache.
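A sketch of the first pattern above, cache-aside with a TTL, using the Jedis client (an assumption; any Redis client works). The method names are illustrative, echoing the queryOrder example:

```java
import redis.clients.jedis.Jedis;

public class OrderCache {
    // Cache-aside with a TTL: read Redis first, fall back to the DB on a miss.
    public String queryOrder(Jedis redis, String orderId) {
        String key = "order:" + orderId;
        String cached = redis.get(key);
        if (cached != null) {
            return cached;                         // cache hit
        }
        String fromDb = queryOrderFromDb(orderId); // the "penetrating" call to watch
        redis.setex(key, 300, fromDb);             // expire after 5 minutes
        return fromDb;
    }

    private String queryOrderFromDb(String orderId) {
        return "{\"id\":\"" + orderId + "\"}";     // stand-in for the real DB query
    }
}
```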

10. Timeout and retry

The timeout and retry mechanism is also a fault-tolerance method. Whenever an RPC call occurs, such as reading Redis, the DB, or MQ, a network failure or a failure of the dependent service can keep the result from returning for a long time, which increases the number of blocked threads, raises the CPU load, and can even lead to an avalanche. So a timeout is set for every RPC call.

If the RPC call is to a strongly depended-on resource, there should also be a retry mechanism, but only 1-2 retries are recommended, and with retries the per-attempt timeout should be reduced accordingly. For example, with one retry there are two calls in total: if the timeout is set to 2s, the client may wait 4s for a response, so in retry + timeout mode the timeout should be set smaller (a hedged sketch follows).

It is also worth looking at where the time of an RPC call goes. The time of a normal call mainly comprises: (1) RPC framework execution time on the caller side + (2) network transmission time + (3) RPC framework execution time on the server side + (4) business code time on the server side. Both the caller and the server have their own performance monitoring; for example, the caller’s tp99 is 500ms while the server’s tp99 is 100ms, and colleagues in the network group have confirmed that the network is normal. So where is the time going? There are two likely causes: the client caller itself, or TCP retransmission on the network. Pay attention to these two things.
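A hedged sketch of timeout plus a single retry with Java’s built-in HttpClient; the endpoint is hypothetical, and the per-attempt timeout is kept small so that two attempts stay within an acceptable total budget:

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class TimeoutRetryClient {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(500))
                .build();
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create("http://inventory-service/stock/42")) // hypothetical endpoint
                .timeout(Duration.ofSeconds(1)) // per-attempt budget: 2 attempts => ~2s max total
                .GET()
                .build();

        // One retry only, and only because this is an idempotent read.
        for (int attempt = 1; attempt <= 2; attempt++) {
            try {
                HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
                System.out.println(resp.body());
                return;
            } catch (IOException e) {          // includes HttpTimeoutException
                if (attempt == 2) throw e;     // exhausted: fail fast to the caller
            }
        }
    }
}
```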

11. Thread pool isolation

Thread isolation was mentioned earlier in the context of resilience, back when Servlet 3 made processing asynchronous. The advantage of thread isolation is preventing cascading failures, or even avalanches. When a gateway calls N interface services, we need thread isolation for each interface. For example, we call orders, products, and users; the order business should not affect the handling of product and user requests. Without thread isolation, when a network failure delays access to the order service, threads back up until the whole service’s CPU is fully loaded and the service becomes completely unavailable, no matter how many machines there are, because requests jam them all in a moment. With thread isolation, the gateway can guarantee that a local problem does not affect the whole (see the sketch below).
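A minimal sketch of per-dependency thread pools (the pool sizes and the 500 ms budget are illustrative assumptions): a slow order dependency can only exhaust its own pool, not the threads serving products or users:

```java
import java.util.Map;
import java.util.concurrent.*;

public class BulkheadDemo {
    // One bounded pool per downstream dependency, so a slow "order" service
    // can only exhaust its own threads, never those of "product" or "user".
    private static final Map<String, ExecutorService> POOLS = Map.of(
            "order",   Executors.newFixedThreadPool(10),
            "product", Executors.newFixedThreadPool(10),
            "user",    Executors.newFixedThreadPool(10));

    static String call(String dependency, Callable<String> task) throws Exception {
        Future<String> future = POOLS.get(dependency).submit(task);
        try {
            return future.get(500, TimeUnit.MILLISECONDS); // fail fast instead of piling up
        } catch (TimeoutException e) {
            future.cancel(true);
            return "fallback";
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(call("order", () -> "order-42")); // fast call succeeds
    }
}
```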

12. Degradation and rate limiting

The industry has mature methods for degradation and rate limiting, such as the failback mechanism for degradation and, for limiting, the token bucket, the leaky bucket, and semaphores. Here is some of our own experience. Degradation is generally implemented with degradation switches configured in one central place. When many interfaces come from the same provider, and that provider’s system or machine-room network has problems, we want a unified degradation switch for the whole provider, rather than degrading one interface at a time; in other words, the switch should work at the granularity of a whole business type. Also remember brute-force degradation: for example, if the forum function is degraded, the user would otherwise see a big blank page, so we must cache some data in advance, that is, keep fallback data. Rate limiting is generally divided into distributed limiting and single-node limiting. Distributed limiting requires a common backend store such as Redis, for example reading Redis-held limit configuration via Lua on a large Nginx node. Our current limiting is single-node; we have not implemented distributed limiting (a semaphore-based single-node sketch follows).
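A minimal single-node sketch using the semaphore approach mentioned above (the permit count is an illustrative assumption):

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class SemaphoreLimiter {
    // Single-node limiting with a semaphore: at most 100 requests in flight.
    private final Semaphore permits = new Semaphore(100);

    public String handle(Supplier<String> business) {
        if (!permits.tryAcquire()) {
            return "rejected"; // over the threshold: reject instead of queueing
        }
        try {
            return business.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        SemaphoreLimiter limiter = new SemaphoreLimiter();
        System.out.println(limiter.handle(() -> "handled"));
    }
}
```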

13. Gateway monitoring and statistics

The API gateway makes serial calls, so exceptions at each step should be recorded and stored in one unified place, such as Elasticsearch, for the convenience of later analysis of call exceptions. Since the company’s Docker applications are uniformly allocated, and each Docker host already had three agents before allocation, no additional agents were allowed. We therefore implemented an agent that captures the log output of the server, sends it to a Kafka cluster, consumes it into Elasticsearch, and makes it queryable through the web. The tracing function is still relatively simple; this part needs further enrichment.