Avoiding a microservices pitfall: don’t go from monolith to distributed monolith

Recently the community has been questioning and rethinking microservices, and some teams have even abandoned them and returned to the monolith. Starting from the problem of the “distributed monolith”, this article explains how non-invasive solutions and Event/EDA help avoid a common mistake in microservices practice: going from monolith to microservices, only to end up with a distributed monolith.

Review: from monolith to microservices to Function

In the past few years, microservices architecture has become mainstream in the industry, with many companies adopting microservices and migrating monolithic applications to it. Architecturally, the biggest difference between microservices and the monolith is application granularity: a monolithic application, in which all business logic lives together, is split along the domain model into several cohesive, autonomous, “smaller” applications. Function takes the split one step further, down to the granularity of a “single operation”. FaaS and Serverless architectures have gradually evolved on top of Function.

Amid the enthusiasm for microservices and Serverless, there is also a growing chorus of skepticism and opposition: more and more people find that after excitedly migrating monolithic applications to microservices and Serverless architectures, the returns are not as good as they expected. Recently there have been all kinds of doubts about and reflections on microservices, and some teams have even given them up and returned to the monolith. For example, a simple search for the keyword “microservices” on InfoQ China returned the following titles in the first three pages:

Why did we stop microservices?
Why are these companies abandoning microservices?
What? If you don’t have 100 people on your team, don’t use microservices!
To our friends in traditional enterprises: if it doesn’t hurt enough, don’t do microservices; there are pitfalls
The psychological shadow brought by microservices
How big should microservices be? How to design the granularity of microservices?
The Uber team abandoned microservices and switched to macroservices
Why Segment went from microservices back to a monolith

Whether for or against microservices, most of these voices focus on organizational structure (Conway’s law, ownership of applications and code), service decomposition (granularity, how to identify domain models and business boundaries), distributed transactions (keeping data consistent across calls to multiple microservices), tooling (automated build and deployment, testability, monitoring, distributed tracing, CI/CD) and database separation (preventing multiple microservices, especially those outside the domain model, from sharing a database). I believe everyone is already familiar with these issues.

In this article I look at the mistakes made in microservices practice (including Serverless) from another angle: after working so hard to get from monolith to microservices, the system ends up as a distributed monolith.

Distributed monolith

“Distributed Monolith” is a sad technical term, yet it is usually the easiest trap for an enterprise to fall into after adopting microservices. In fact, many of the microservice systems I have seen ended up as distributed monoliths, unable to obtain the full benefits of microservices.

The problem stems from the way microservices are implemented: the monolith is broken down into multiple microservices according to business logic, API interfaces are defined, remote calls are made through REST or RPC, and finally these microservices are combined to provide the various business functions. Put simply, once the services are split, remote calls between processes merely replace method calls within a process. Meanwhile, the various distributed capabilities are still used in the same way as before. In short: the approach stays the same and only the granularity becomes smaller.

There’s nothing wrong with doing this methodologically, and it’s pretty standard in microservices adoption. But the problem is that stopping there is not enough — there are at least two areas where further efforts are needed.

Cause one of the distributed monolith: accessing distributed capabilities through shared libraries and network clients

Shared libraries and network clients for distributed capabilities are one of the causes of the distributed monolith, as Mohamad Byan of Verizon explains in his presentation Avoid the Distributed Monolith!! I quote his figure and point of view here:

The figure above shows the logical architecture of the microservice architecture, which consists of two parts:

The inner architecture (light blue) is the implementation architecture of each microservice;
The outer architecture (yellow) is the set of capabilities needed to build a robust microservices architecture, typically the familiar distributed capabilities;

Special note: the “network client” here refers to the client of a distributed capability, such as service registration and discovery, MQ middleware, key-value stores such as Redis, databases, monitoring/logging/tracing systems, or security systems. It does not refer to the client used for inter-service communication such as RPC.

The inner microservices access distributed capabilities provided by the outer architecture through shared class libraries and network clients:

Shared class libraries and network clients for distributed capabilities force strong coupling between the inner microservices and the distributed capabilities of the outer architecture. They increase the complexity of operation and maintenance (for example, version fragmentation caused by difficult upgrades), limit language choice to the languages the class libraries and network clients support, and expose the custom data formats and communication protocols that components such as messaging middleware tend to use. All of this means the inner microservices are substantially constrained by the technology choices of the outer architecture.

With Function the problem is even more obvious: a Function is much finer-grained and more focused on business logic. Some short functions may be only a few hundred lines of code, but the volume of shared libraries and network clients that must be pulled in to make those few hundred lines work can be staggering. Quoting a picture found online:

Cause two of the distributed monolith: simply replacing in-process method calls with remote calls

In the process of microservice architecture transformation, developers familiar with monolithic systems and architectures will habitually reuse the knowledge and experience from the monolithic era into the new microservice architecture. The most typical approach is to simply replace the original in-process method calls with remote calls such as REST or RPC when the existing monolithic applications are split into multiple microservices along business boundaries in accordance with the domain model.

When two logical business modules have a need to collaborate:

From monolith to microservices, direct method invocation is replaced by remote invocation (REST or RPC). Even if a Service Mesh is adopted, it only adds Sidecar nodes to the call path without changing the nature of the remote invocation:

This leads to the “distributed monolith” mentioned earlier:

Before microservices: the application consists of multiple modules coupled together that make method calls through shared memory.
After microservices: the application consists of multiple microservices coupled together that make remote calls over the network.

Looking at the system architecture before and after the adoption of microservices, we will find that the two are almost exactly the same!!

The microservice version may even be worse in some cases: invocation is more fragile because the network is far less reliable than memory. We use the network as “glue”, trying to stick the separated business logic modules (now split into microservices) back together in the same way as in the monolith era, which is certainly less reliable than direct method calls within a single process.

For a detailed discussion of this point, see the article “The Eight Fallacies of Distributed Computing”.

Similarly, with Function, if the approach above is kept, a FaaS/Serverless architecture gets built with the thinking and design patterns of the monolith or of microservices:

The essence of this will not change — it will simply turn microservices into smaller functions, resulting in a much larger number of remote calls in the system:

The coupling within the system has not changed: Serverless does not fix the internal coupling problem that exists in microservices. Where there is invocation, there is coupling! Only the granularity of the component changes, from “microservice” to “Function”.

Coupling exists because of the communication patterns between the different components of the system, not because of the technology that enables the communication.

If two components communicate remotely through an “invocation” (more on that later), they are tightly coupled regardless of how the invocation is implemented. Therefore, as a system moves from monolith to microservices to Serverless, it remains highly coupled if it does nothing more than replace in-process method calls with remote calls. In this sense:

Monolithic application ≈ Distributed monolith ≈ Serverless monolith
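
To make the point concrete, here is a minimal sketch in Java. Every name in it (InventoryService, InProcessInventory, RemoteInventoryStub, OrderService) is hypothetical and invented for illustration, and the network hop is only simulated with a flag rather than a real REST or RPC client. It shows that the caller’s code is the same whether the callee is in-process or remote, so the coupling is the same too; the remote variant merely adds a new way to fail.

```java
import java.util.HashMap;
import java.util.Map;

// All names here are hypothetical; this is a sketch, not a real framework API.
interface InventoryService {
    boolean reserve(String sku, int quantity);
}

// Monolith era: the callee is a module in the same process.
class InProcessInventory implements InventoryService {
    private final Map<String, Integer> stock = new HashMap<>();

    InProcessInventory(String sku, int initialStock) {
        stock.put(sku, initialStock);
    }

    @Override
    public boolean reserve(String sku, int quantity) {
        int available = stock.getOrDefault(sku, 0);
        if (available < quantity) {
            return false;
        }
        stock.put(sku, available - quantity);
        return true;
    }
}

// Microservice era: a stand-in for a REST/RPC client stub. The network hop is
// simulated by a flag; a real client would serialize the call and send it.
class RemoteInventoryStub implements InventoryService {
    private final InventoryService remoteEndpoint;
    private boolean networkUp = true;

    RemoteInventoryStub(InventoryService remoteEndpoint) {
        this.remoteEndpoint = remoteEndpoint;
    }

    void setNetworkUp(boolean networkUp) {
        this.networkUp = networkUp;
    }

    @Override
    public boolean reserve(String sku, int quantity) {
        if (!networkUp) {
            throw new IllegalStateException("inventory service unreachable");
        }
        return remoteEndpoint.reserve(sku, quantity);
    }
}

// The caller is identical in both eras: it still names its callee, waits for
// the answer, and fails when the callee fails.
class OrderService {
    private final InventoryService inventory;

    OrderService(InventoryService inventory) {
        this.inventory = inventory;
    }

    String placeOrder(String sku, int quantity) {
        return inventory.reserve(sku, quantity) ? "CONFIRMED" : "REJECTED";
    }
}
```

Constructing OrderService with either implementation gives the same behavior on the happy path. With the stub, switching the simulated network off makes placeOrder throw, which is the extra fragility the remote version brings without removing any dependency.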

Summary of the causes of the distributed monolith

Above we listed two main reasons why microservices and Serverless practice tends to produce a “distributed monolith”:

Accessing distributed capabilities through shared libraries and network clients;
Simply replacing in-process method calls with remote calls;

Below, we discuss solutions and countermeasures to these two problems.

Introduce non-invasive solutions: physical isolation + logical abstraction

One of the causes of the distributed monolith mentioned earlier is “accessing distributed capabilities through shared libraries and network clients”, which results in strong coupling between microservices or Lambda functions and the distributed capabilities. Non-invasive solutions, represented by Service Mesh, are an effective way to solve this problem. Similar solutions include RSocket, the Multiple Runtime architecture, and Mesh products for databases and messaging. The basic ideas are as follows:

Delegation: access distributed capabilities through a Sidecar or Runtime, avoiding the strong binding caused by direct communication between the application and the components that provide those capabilities. Decoupling is achieved through physical isolation;
Abstraction: the implementation details of the inner microservices are hidden and only network protocols and data contracts are exposed; the distributed capabilities of the outer architecture are exposed as APIs, while the concrete implementations of these capabilities are shielded. Decoupling is achieved through logical abstraction;

Take the Sidecar of a Service Mesh as an example. Once the Sidecar is introduced, the number of distributed capabilities that the business application must connect to directly is greatly reduced (physical isolation):

The recently emerged Multiple Runtime/Mecha architecture, and Microsoft’s open source product Dapr, which follows this idea, extend this approach to more distributed capabilities beyond service-to-service communication.

In addition to delegation, it also provides an abstraction of distributed capabilities. With Dapr, for example, a business application can use these distributed capabilities through the standard APIs Dapr provides, without caring about the concrete products behind them:

Take messaging in the pub-sub model as an example. This is the API provided by Dapr’s Java client SDK:

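// Excerpt from the Dapr Java client SDK (other methods omitted).
public interface DaprClient {
    // Publish an event to a topic.
    Mono<Void> publishEvent(String topic, Object event);
    // Publish an event to a topic, with additional metadata.
    Mono<Void> publishEvent(String topic, Object event, Map<String, String> metadata);
}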

Dapr completely hides the implementation of the underlying messaging mechanism when sending events. The client SDK gives the application a high-level abstraction for sending messages, and the Dapr Runtime interfaces with the underlying MQ implementation, completely decoupling the application from the MQ:
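
The “delegation plus abstraction” idea can be sketched in a few lines of Java. This is not the Dapr API: MessagingApi, BrokerBinding, RecordingBroker, SidecarRuntime and SignupApplication are hypothetical names made up for this illustration, and the process boundary between application and sidecar is collapsed into a plain object reference to keep the sketch self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is NOT the Dapr API, only the same idea.

// The only contract the business application sees.
interface MessagingApi {
    void publishEvent(String topic, String payload);
}

// Plug-in point inside the runtime: one binding per messaging product.
interface BrokerBinding {
    void send(String destination, byte[] body);
}

// Stand-in for a concrete MQ product; it only records what it was asked to send.
class RecordingBroker implements BrokerBinding {
    private final String productName;
    final List<String> sent = new ArrayList<>();

    RecordingBroker(String productName) {
        this.productName = productName;
    }

    @Override
    public void send(String destination, byte[] body) {
        sent.add(productName + ":" + destination + ":" + new String(body, StandardCharsets.UTF_8));
    }
}

// Stand-in for the sidecar/runtime: it translates the abstract API to the
// configured binding, so swapping products never touches the application.
class SidecarRuntime implements MessagingApi {
    private final BrokerBinding binding;

    SidecarRuntime(BrokerBinding binding) {
        this.binding = binding;
    }

    @Override
    public void publishEvent(String topic, String payload) {
        binding.send(topic, payload.getBytes(StandardCharsets.UTF_8));
    }
}

// Business code: no MQ client library, no product-specific types.
class SignupApplication {
    private final MessagingApi messaging;

    SignupApplication(MessagingApi messaging) {
        this.messaging = messaging;
    }

    void register(String userId) {
        messaging.publishEvent("user-registered", userId);
    }
}
```

The design choice worth noting is that SignupApplication compiles against MessagingApi only. Replacing one RecordingBroker with another changes what the runtime does, but not a single line of the business code.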

The Multiple Runtime/Mecha architecture is not covered further here; if you are interested, see my previous blog post “Mecha: Carrying Mesh Through to the End”.

I will later write an in-depth article on this topic, detailing how to abstract and standardize messaging and event-driven capabilities in messaging and EDA architectures, so that business applications are not bound or strongly coupled to the underlying messaging products.

Introduce events: Remove unnecessarily strong coupling

Having addressed the tight coupling between microservice/Serverless systems and external distributed capabilities, we move on to the tight coupling inside microservice/Serverless systems. As discussed earlier, from monolith to microservices to Function/Serverless, if you simply replace direct method calls with remote calls (REST or RPC), the tightly coupled call creates a dependency between the two communicating modules, and the dependency propagates along the call chain. The result is a tree-like network of dependencies, which shows up as a high degree of coupling between systems:

The basic idea for solving this problem is to look at the business semantics of the communication between two components and decide, on that basis, whether the Command pattern or the Event pattern should be used between them.

A refresher: Event and Command

First, let’s review the concepts of Event and Command and the differences between them. One picture summarizes them well:

What is an Event?

Event: “A significant change in state” — K. Mani Chandy.

An Event represents something that has happened in a domain: usually an Action has occurred and a Status has changed.

Because it has already happened:

An Event can be understood as an objective statement of a fact that has happened;
This means that events are usually immutable: the information in an event (which represents objective facts) cannot be tampered with, and the occurrence of an event cannot be reversed;
Naming: events are usually named with the past participle of a verb, such as UserRegisteredEvent;

Events are generated for subsequent Event propagation:

Interested observers are notified of events that have already occurred;
An observer receiving an Event makes its own judgment and decision based on the content of the Event: there may be follow-up actions, and some of them may need to communicate with other modules and so trigger Commands. After these actions are executed, the state of the domain may change and new events may be triggered.

How an Event is propagated:

An Event has a clear “Source”, that is, the place where the Event occurred (where the state changed);
However, since the producer does not know (and does not want or need to know) which observers are interested in the Event, the Event contains no “Destination” information;
Events are usually propagated in pub-sub mode through a message queue mechanism;
An Event usually does not require a reply or response;
An Event is usually “published” (Publish);

What is Command?

A Command is used to pass a request for an Action to be performed.

A Command represents something that is going to happen:

It usually means that the Action has not occurred yet but is about to (if the request is accepted and executed);
It may be rejected: the receiver may be unwilling to execute it (parameter validation failure, insufficient permission) or unable to execute it (receiver failure or resource inaccessible);
Naming: commands are usually named with the base form of a verb, such as UserRegisterCommand;

A Command is generated so that it can subsequently be executed:

The Command is sent to the expected executor;
The executor that receives a Command acts on the Command’s request: the execution may involve multiple actions, and some of them may need to communicate with other modules and so trigger further Commands. After these actions are executed, the state of the domain may change and new events may be triggered.

How a Command is transmitted:

A Command has an explicit Source, that is, the initiator of the Command;
A Command also has a very specific executor (usually exactly one), so it usually contains “Destination” information;
A Command usually goes through a point-to-point communication mechanism such as HTTP or RPC, and is usually synchronous;
A Command usually requires a Response: whether the Command was executed (it might be rejected) and the result of the execution (it might fail);
A Command is usually “sent” (Send);

Summary of Command and Event

Summary: the essential difference between Command and Event is their intent:

The intent of a Command is to say what you want to happen;
The intent of an Event is to announce what has already happened;
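
The properties listed above translate fairly directly into types. The following Java sketch is an illustration under assumptions: the field names, CommandResult, UserService and its outbox are hypothetical choices, and only the two class names UserRegisteredEvent and UserRegisterCommand follow the naming examples in the text.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// All types below are hypothetical illustrations, not part of any library.

// Event: a fact that has already happened. Past-participle name, immutable,
// carries its source but no destination.
final class UserRegisteredEvent {
    private final String source;
    private final String userName;

    UserRegisteredEvent(String source, String userName) {
        this.source = source;
        this.userName = userName;
    }

    String getSource() { return source; }
    String getUserName() { return userName; }
}

// Command: a request for an action that has not happened yet. Base-form verb
// name, carries both source and destination.
final class UserRegisterCommand {
    private final String source;
    private final String destination;
    private final String userName;

    UserRegisterCommand(String source, String destination, String userName) {
        this.source = source;
        this.destination = destination;
        this.userName = userName;
    }

    String getSource() { return source; }
    String getDestination() { return destination; }
    String getUserName() { return userName; }
}

// Response to a Command: the initiator must learn whether it was rejected.
final class CommandResult {
    private final boolean accepted;
    private final String detail;

    CommandResult(boolean accepted, String detail) {
        this.accepted = accepted;
        this.detail = detail;
    }

    boolean isAccepted() { return accepted; }
    String getDetail() { return detail; }
}

// The single, explicitly addressed executor of UserRegisterCommand.
class UserService {
    static final String NAME = "user-service";
    private final Set<String> users = new HashSet<>();
    final List<UserRegisteredEvent> outbox = new ArrayList<>(); // events awaiting publication

    CommandResult handle(UserRegisterCommand command) {
        if (!NAME.equals(command.getDestination())) {
            return new CommandResult(false, "wrong destination");
        }
        if (command.getUserName().trim().isEmpty()) {
            return new CommandResult(false, "parameter validation failed");
        }
        if (!users.add(command.getUserName())) {
            return new CommandResult(false, "user already exists");
        }
        // The state has changed, so the fact can now be recorded as an Event.
        outbox.add(new UserRegisteredEvent(NAME, command.getUserName()));
        return new CommandResult(true, "registered");
    }
}
```

Note the asymmetry: the command can be refused and always produces a reply, while the event is only created after the state change and has no addressee.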

This difference in intent ultimately has a particular effect on the dependencies between services:

The initiator of a Command must know exactly who the receiver is and state exactly what needs to be done (hence: command, instruction, control, orchestration). This is especially true when the initiator issues several Commands in succession, which usually have a very clear order and logical relationship and together make up a specific piece of business logic.

The dependency created by a Command is simple and unambiguous: the initiator “explicitly depends” on the receiver.

The initiator of an Event is only responsible for publishing the Event and does not need to care about its recipients, neither who they are (one or many) nor what they will do (hence: notification, driving, choreography). Even when an Event has multiple recipients, there is usually no defined order among them, and the business logic in their handlers is usually independent of one another.

The dependency created by an Event is slightly more complicated: the initiator clearly does not depend on the receiver, while the receiver has an “implicit reverse dependency” on the initiator. “Reverse” means the direction is the opposite of a Command’s dependency: the receiver depends on the initiator. “Implicit” means the dependency exists only indirectly, in that “the receiver depends on the Event, and the Event is published by the initiator”; there is no direct dependency between receiver and initiator.
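
The “implicit reverse dependency” can be seen in a small in-memory pub-sub sketch. EventBus, UserEvents, RegistrationService, WelcomeMailer and SignupStatistics are hypothetical names, and the bus is a toy stand-in for a real message queue. The initiator references only the bus and the event contract, while each receiver references the event contract but never the initiator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical names; the bus is a toy stand-in for a message queue topic.
class EventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // No return value: the publisher learns nothing about who was notified.
    void publish(String topic, String payload) {
        List<Consumer<String>> handlers = subscribers.get(topic);
        if (handlers == null) {
            return; // zero subscribers is a perfectly valid situation
        }
        for (Consumer<String> handler : handlers) {
            handler.accept(payload);
        }
    }
}

// The event contract: the only thing initiator and receivers have in common.
final class UserEvents {
    static final String USER_REGISTERED = "user-registered";

    private UserEvents() { }
}

// Initiator: depends on the bus and the event contract, not on any receiver.
class RegistrationService {
    private final EventBus bus;

    RegistrationService(EventBus bus) {
        this.bus = bus;
    }

    void register(String userId) {
        bus.publish(UserEvents.USER_REGISTERED, userId);
    }
}

// Receivers: each depends on the event (the implicit reverse dependency),
// never on RegistrationService itself, and they are independent of each other.
class WelcomeMailer {
    final List<String> mailed = new ArrayList<>();

    WelcomeMailer(EventBus bus) {
        bus.subscribe(UserEvents.USER_REGISTERED, mailed::add);
    }
}

class SignupStatistics {
    private int count;

    SignupStatistics(EventBus bus) {
        bus.subscribe(UserEvents.USER_REGISTERED, userId -> count++);
    }

    int getCount() {
        return count;
    }
}
```

Adding a third receiver, or removing both, requires no change to RegistrationService. With Commands, the same change would mean editing the initiator.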

From a business perspective: the relationship model determines communication behavior

After reviewing Command and Event, let’s return to the earlier question: why does simply replacing a direct method call with a remote call (REST or RPC) go wrong? The main reason is that “simply” means choosing the remote call without thinking about it, which amounts to using Command everywhere:

In real business scenarios, the business relationships between components (microservices or Functions) are usually not as extreme as in the figure above. Communication should not be all Command (nor, as discussed later, all Event) but a combination of the two, as described below, taking microservices as an example (the same applies to Functions):

Business input: microservice A receives a business request as input (either a Command or an Event).
The “implementation” of the business logic, that is, the execution process:
- Microservice A performs many actions while executing the Command (or while handling the Event that triggered it);
- Some are internal actions of microservice A, such as database operations, key-value store operations, and in-memory business logic processing;
- Some involve communicating with external microservices, such as running queries or asking the other party to perform an operation. This communication takes the form of Commands, as shown in the figure;
- Once these internal and external actions are complete, the execution process is complete;
- If the input was a Command, the result of the operation must be returned in the form of a reply.
The subsequent behavior “triggered” by a business state change:
- After the execution process is complete, if a business state change is involved, an event must be published for it;
- Events are distributed via the Event Bus to other microservices that are interested in them. Note that this step is decoupled: microservice A does not know or care which microservices are interested in the event, and the event needs no response;

While executing its business logic, microservice A needs to communicate with other microservices, B/C/D/E in the figure, by Command or Event. These microservices (which can be regarded as the downstream services of microservice A) process a business request in much the same way as microservice A does.

We can therefore extrapolate what happens when the business processing extends from microservice A to its downstream services (microservices B/C/D/E in the figure):

Summarizing how A/B/C/D/E in the figure behave while processing business logic, their communication behavior is basically the same:

In abstract terms, the typical communication behavior of a microservice during business processing can be summarized in the following four points:

1. Input: a Command request or an Event notification, which is the starting point of the business process.
2. Internal actions: the internal logic of the microservice, typically database operations or access to key-value stores such as Redis (corresponding to the various distributed capabilities in the Multiple Runtime/Mecha architecture). Optional, usually 0 to N.
3. External access: accessing other microservices by Command. Optional, usually 0 to N.
4. Notification of state change: an Event is published to announce the business state change caused by the preceding operations. Optional, usually 0 to 1.

In this behavior pattern, steps 2 and 3 have no fixed order and may be interleaved, while step 4 usually comes at the end of the process: only when the internal actions and external Commands are complete, the business logic has been carried out and the state change is final can the accomplished fact be published as an Event: “operation completed, state changed, hereby notified”.
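
The four points can be sketched as one handler. This is a minimal illustration under assumptions: OrderMicroservice, PaymentClient and EventPublisher are hypothetical names, a HashMap stands in for the database, and the payment rule used in practice would of course live in another service.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; a sketch of the four-point pattern, not a framework.

// Target of "external access": another microservice, reached by Command.
interface PaymentClient {
    boolean charge(String orderId, int amountCents);
}

// Target of "notification of state change": the event bus.
interface EventPublisher {
    void publish(String eventName, String payload);
}

class OrderMicroservice {
    private final Map<String, String> orderStatus = new HashMap<>(); // stand-in for a database
    private final PaymentClient payments;
    private final EventPublisher events;

    OrderMicroservice(PaymentClient payments, EventPublisher events) {
        this.payments = payments;
        this.events = events;
    }

    // Point 1, input: here a Command, so the return value is its reply.
    boolean handlePlaceOrder(String orderId, int amountCents) {
        // Point 2, internal action: record the order locally.
        orderStatus.put(orderId, "PENDING");

        // Point 3, external access by Command: part of the implementation,
        // so its failure fails the whole business operation.
        if (!payments.charge(orderId, amountCents)) {
            orderStatus.remove(orderId); // roll back: no state change to announce
            return false;
        }
        orderStatus.put(orderId, "PAID");

        // Point 4, notification: published last, only once the state change is final.
        events.publish("OrderPaid", orderId);
        return true;
    }

    String statusOf(String orderId) {
        return orderStatus.get(orderId);
    }
}
```

The handler depends on the payment service because paying is part of placing an order. It does not depend on whoever reacts to OrderPaid, which is exactly the split between “implementation” and “notification”.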

To recap, the essential difference between events and commands is their intent:

The intent of an Event is to announce what has already happened;
The intent of a Command is to say what you want to happen;

From the perspective of business logic processing, the Commands for external access and the Actions for internal operations form the “implementation” part of the business logic: together they make up the complete business logic, and if they fail the business operation is directly affected (it fails, or partially fails). Publishing an Event is the “notification” part that follows the completion of the business logic: once the business logic has run and the state has changed, the Event drives further processing. Note that it drives; it does not directly control.

Viewed on a timeline, the whole business process is shown in the figure below:

The problem with using Command everywhere: unnecessarily strong coupling

The problem with an all-Command microservice system is that, in the final “notification of state change” stage described above, it does not use Events and the pub-sub model but keeps calling the downstream microservices one by one with Commands:

Events decouple producers from consumers, so with Events there is no strong dependency between microservice A and microservices C/D/E in the figure, and they need not even be aware of each other’s existence. Commands are different: when Commands are used, microservice A forms a strong dependency on the downstream microservices C/D/E, the dependency spreads, and the result is a huge and deep dependency tree. Because of its finer granularity, a Function-based system often suffers even more:

If, however, Events are introduced for the “notification of state change” step, the microservice is decoupled from the downstream microservices it notifies, which removes the dependency and stops it from spreading without limit. In the figure below, the left side shows the dependencies after using Events instead of Commands to announce state changes. Given the decoupling effect of Events on producers and consumers, we can “cut” the green Event arrows. The result, on the right, is a dependency graph broken down into several small dependency trees:
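
The effect of “cutting” the Event arrows can be checked with a toy dependency graph. ServiceGraph and its methods are hypothetical, and the topology used in the note after the code (A commanding B, A notifying C/D/E, plus an extra service F behind C) is an assumed example rather than a reproduction of the figure. Only Command edges count as a dependency of the sender on the receiver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for reasoning about dependencies; not a real library.
class ServiceGraph {
    enum Kind { COMMAND, EVENT }

    private static final class Edge {
        final String to;
        final Kind kind;

        Edge(String to, Kind kind) {
            this.to = to;
            this.kind = kind;
        }
    }

    private final Map<String, List<Edge>> edges = new HashMap<>();

    // Records that `from` communicates with `to` using the given pattern.
    ServiceGraph link(String from, String to, Kind kind) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, kind));
        return this;
    }

    // Everything `service` depends on, directly or transitively. Event edges
    // are skipped: the publisher does not depend on its subscribers.
    Set<String> dependenciesOf(String service) {
        Set<String> found = new TreeSet<>();
        collect(service, found);
        return found;
    }

    private void collect(String from, Set<String> found) {
        List<Edge> outgoing = edges.get(from);
        if (outgoing == null) {
            return;
        }
        for (Edge edge : outgoing) {
            // found.add returns false for already-visited nodes, which also
            // guarantees termination if the graph contains a cycle.
            if (edge.kind == Kind.COMMAND && found.add(edge.to)) {
                collect(edge.to, found);
            }
        }
    }
}
```

With every link declared as COMMAND, dependenciesOf("A") contains B, C, D, E and F: one deep tree. Declaring the links from A to C, D and E as EVENT shrinks it to B alone, while C keeps its own small tree containing F.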

Suggestions for using Event and Command:

When splitting a monolithic application into microservices, do not simply replace every original method call with a Command;
Examine the business semantics of each invocation: is it part of the execution of the business logic, or is it a notification of state after execution has completed?
Then decide on that basis whether to use a Command or an Event;

Orchestration and choreography

Two other concepts are associated with the use of Command and Event: orchestration and choreography.

I highly recommend the blog post Microservices Choreography vs Orchestration: The Benefits of Choreography, by Jonathan Schabowsky, CTO of Solace. In it he summarizes two models for making microservices work together and gives a telling analogy:

Orchestration: all elements and interactions must be actively controlled, much like a conductor directing the musicians of an orchestra. This corresponds to Commands;
Choreography: a pattern is established, and the microservices dance to the music without supervision or instruction. This corresponds to Events;

I have seen many articles with similar views. One picture in particular left a deep impression on me, and I reproduce it here:

On the left is the ideal result that Orchestration is expected to deliver; on the right is the wreck that actually results.

The problem with using Event everywhere: development difficulty and blurred business boundaries

Besides using Command everywhere, there is the opposite extreme of using Event everywhere, which is more common with Lambda (FaaS):

The first problem with this approach is that Events are used even where Command semantics apply. Because of the semantic difference between Command and Event, this substitution is awkward:

A Command is one-to-one, so the Event that replaces it has to degenerate from “1:N” to “1:1”, and the pub-sub model no longer really exists;
Commands need to return results, and Query commands in particular must return query results. Once Events replace them, “Events that support a Response” have to be implemented, typically by building a request-reply model on top of the messaging mechanism;
Alternatively, another Event is introduced to report the result in the reverse direction, that is, two asynchronous Events instead of one synchronous Command. This requires extra subscription and handling on the initiator’s side and is far more complex than simply using a Command;
It also introduces a very troublesome state problem: the context of communication between services is usually stateful, so the Reply Event must be delivered to exactly the instance that initiated the Request Event, not to a randomly chosen one. This binds the Reply Event not only 1:1 to the subscriber service but to a specific instance of that service. Such a Reply Event is no longer really an Event;

A common way around this state problem is to restrict it to stateless scenarios: if the Reply Event can be handled without any state, it can simply be sent to any instance.
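
The state problem described above becomes visible as soon as request-reply is emulated with two events. In this sketch, PriceRequested, PriceReplied, QuoteServiceInstance and PricingService are hypothetical names, the fixed price is a placeholder, and delivery is done by hand instead of through a broker so that the routing decision stays explicit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical event shapes: a "request" event and the "reply" event answering it.
final class PriceRequested {
    final String correlationId;
    final String sku;

    PriceRequested(String correlationId, String sku) {
        this.correlationId = correlationId;
        this.sku = sku;
    }
}

final class PriceReplied {
    final String correlationId;
    final int priceCents;

    PriceReplied(String correlationId, int priceCents) {
        this.correlationId = correlationId;
        this.priceCents = priceCents;
    }
}

// One running instance of the initiating service. The pending map is
// in-memory state that belongs to this instance alone.
class QuoteServiceInstance {
    private final Map<String, String> pendingSkuByCorrelationId = new HashMap<>();
    private final Map<String, Integer> quotes = new HashMap<>();

    // Emits the request event and remembers that a reply is expected.
    PriceRequested requestPrice(String sku) {
        String correlationId = UUID.randomUUID().toString();
        pendingSkuByCorrelationId.put(correlationId, sku);
        return new PriceRequested(correlationId, sku);
    }

    // Reply handler. Returns false when this instance never sent the request,
    // which is what happens if the reply is routed to the wrong instance.
    boolean onPriceReplied(PriceReplied reply) {
        String sku = pendingSkuByCorrelationId.remove(reply.correlationId);
        if (sku == null) {
            return false;
        }
        quotes.put(sku, reply.priceCents);
        return true;
    }

    Integer quoteFor(String sku) {
        return quotes.get(sku);
    }
}

// The responder: answers every request event with a reply event.
class PricingService {
    PriceReplied onPriceRequested(PriceRequested request) {
        return new PriceReplied(request.correlationId, 499); // placeholder price
    }
}
```

If two QuoteServiceInstance objects subscribe to the reply and the broker hands it to the one that did not send the request, onPriceReplied returns false and the quote is lost. A synchronous Command would have needed none of the correlation bookkeeping.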

Statelessness is often hard to achieve for coarse-grained microservices, so using Events everywhere in microservices is awkward, and in practice few people do it. In very fine-grained Function/FaaS systems, however, it is common to use Events for the whole flow.

Personally, I have reservations about using Events everywhere, and I prefer to keep using Commands even in FaaS: if an operation is an integral part of the execution of the “business logic”, the tight coupling of a Command better reflects the existence of that business logic:

The “complete” decoupling of an all-Event design creates new problems (leaving aside the extra coding complexity): the business logic becomes hard to see in a mass of fine-grained Event calls, and the Domain Model and Bounded Contexts are submerged in those calls and hard to identify:

Note: this problem is known as “Lambda Pinball”. I will not expand on it here; a follow-up article is planned to discuss the origin of “Lambda Pinball” and its solutions in detail.

Choosing between Command and Event: be pragmatic and avoid extremes

To sum up, my personal advice on choosing between Command and Event is to avoid a one-size-fits-all approach: the disadvantages of using Command everywhere are easy to understand, but simply replacing everything with Events may not be appropriate either.

My own inclination is to decide based on the semantics of the actual “business logic” processing:

If it is part of the “implementation” of the business logic: prefer Command;
If it is a follow-up “notification” after the business logic has completed: Event is strongly recommended;

Summary and reflection

Caution: do not end up as a distributed monolith

Above we listed the two main reasons why microservices and Serverless practice tends to produce a “distributed monolith”, together with countermeasures:

Accessing distributed capabilities through shared libraries and network clients: introduce non-invasive solutions to decouple applications from the various distributed capabilities;
Simply replacing in-process method calls with remote calls: distinguish Command from Event, and introduce Events to remove unnecessary strong coupling between microservices;

The former is not yet technically mature. Typical projects such as Istio and Dapr still need to be strengthened, and for the time being adoption meets considerable resistance. The latter, however, has been a well-established practice in the industry for many years, widely used even before the rise of microservices and Serverless, so I recommend making this improvement immediately.

Stay tuned for a follow-up article on how Events and Event-Driven Architecture can be introduced into microservices and Serverless more easily, without coupling to the specific implementation that provides the Message Queue capability.

Reflection: thinking calmly amid the noise and the criticism

If we stick to “simply replacing in-process method calls with remote calls” in microservices and Serverless practice, and keep the monolith-era habit of pulling in all kinds of SDKs, then the distributed monolith is inevitable. Our microservices transformations and Serverless practices will often end up like this:

Turning a monolith into… an even worse distributed monolith.

Of course, microservices can turn into a distributed monolith, but that does not mean microservice architecture is a lie or is worse than the monolith. Serverless may also suffer from the distributed monolith (and from Lambda Pinball, which a follow-up article will cover), but that does not mean Serverless is a bad idea. Both microservices and Serverless are tools for solving specific problems, and as with all tools, we need to study and understand them and learn how to use them properly before using them:

Microservices need the right architecture, which can be very different from a monolithic architecture: do not replace method calls with remote calls “as is”, and preferably do not consume distributed capabilities directly through shared class libraries and network clients;
Serverless requires us to thoroughly rethink the architecture and change our way of thinking in order to ensure that the benefits outweigh the drawbacks;

Resources and recommended reading

Avoid the Distributed Monolith!!: a September 2018 presentation by Mohamad Byan of Verizon, describing the distributed monolith pitfall in microservices practice and ways to solve it.
“Mecha: Carrying Mesh Through to the End”: my previous post, detailing the Multiple Runtime/Mecha architecture, which extends the Mesh approach to more distributed capabilities.
The Eight Fallacies of Distributed Computing
Opportunities and Pitfalls of Event-Driven Utopia: a presentation by Bernd Rucker at QCon. Some of the images in this article are taken from its slides.
Practical DDD: Bounded Contexts + Events => Microservices: a talk by Indu Alagarsamy on the intersection of Domain-Driven Design (DDD) and messaging. It recommends using messaging to communicate between clean, well-defined bounded contexts in order to remove temporal and spatial coupling.
Building Event-driven Cloud Applications and Services: a series of tutorials that discuss common practices and techniques for building event-driven applications and services. See also “Building event-driven cloud applications and services”.
The Architect’s Guide to Event-Driven Microservices: a brochure in PDF format from the Solace website, subtitled “The Architect’s Guide to Building a Responsive, Elastic and Resilient Microservices Architecture”. See “Event-Driven Microservices Architect Guide” for a Chinese translation.
To our friends in traditional enterprises: if it doesn’t hurt enough, don’t do microservices; there are pitfalls: an excellent article by Liu Chao of NetEase Cloud, a very realistic and comprehensive account of the many aspects and problems to consider when adopting microservices. Strongly recommended.

Financial-Grade Distributed Architecture (Antfin_SOFA)