The interview began

A dapper middle-aged man in a plaid shirt walks toward you carrying a scratched-up Mac. Looking at his nearly bald head, you think he must be a top architect. But we have read warm-hearted Aobing's series, so we come well prepared and are not nervous at all.

Oh, young man, it's you again. Last time you slipped away before I had finished asking. This time I am going to question you properly.

Hello, interviewer. I was in a hurry last time because the Aobing series had just been updated, so I rushed home to read it!

I'll believe you. Let's get started. Last time we touched on repeated consumption in message queues. Can you tell me what that scenario looks like?

Repeated message consumption is a serious and common problem that must be considered once you use a message queue. Whenever I use a message queue in development, the first thing I think about is repeated consumption.

For example, after a user successfully places an order, a promotion (activity) page needs to add the order amount to the user's GMV (total sales amount). At the end of the promotion, the user gets a reward according to the accumulated GMV. This is a common mechanic in e-commerce promotions.

It works like tiered rewards: whichever tier your cumulative order amount reaches, you get the reward for that tier.

What I can tell you is that adding GMV on this promotion page is 100% done asynchronously (don't ask me how I know; the backend was written by Aobing 😂, that is, by me). Otherwise, think about it: if every order a user placed triggered an immediate update, every order would mean one operation on that table. Have you considered how many operations that table would see on Double Eleven (November 11)? Neither the database nor the cache would hold up.

You have probably experienced this yourself: you place an order and immediately open a promotion page. Sometimes the result shows up right away, sometimes only after a long delay. Why? The speed depends on how fast the message queue is being consumed.

Once your order is paid successfully, a payment-success message is sent out. The promotion service described above listens for that message. When it receives the payment-success message for your order, it adds the amount to your record in the promotion's GMV table.

But message queues generally come with a retry mechanism: if my downstream business logic hits an exception, I throw it and ask for the message to be sent again.

If something goes wrong in my promotion service, I can ask for a resend. But think about it: what problem does that create?

Right: you are not the only one listening to this message. Other services listen to it too, and they can fail as well. When one of them fails, it asks for a resend, but you had already processed the message successfully. After the resend, isn't your amount added twice?

Right? Doesn't that make sense?

Still don't get it? Look at the following example.

Suppose the points system fails while processing the message. It will ask for the message to be sent again, and after the resend the points system processes it successfully. But other services such as promotions and coupons are listening to the same message. Couldn't the promotion system end up adding the user's GMV twice, and the coupon system deducting the coupon twice?

In fact, retries are perfectly normal: network jitter between services, bugs in a developer's code, or bad data can all cause processing to fail and trigger a resend request.
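The duplicate described above can be reproduced with a deliberately naive model. This is only a sketch of the scenario in the text, not of any real broker: the `deliver` function, the two handlers, and the message fields are all hypothetical, and a resend is assumed to reach every listener.

```python
def deliver(message, handlers, max_attempts=3):
    """Naive broker: resend to every handler until all succeed in one round."""
    for attempt in range(1, max_attempts + 1):
        results = [handler(message) for handler in handlers]
        if all(results):
            return attempt
    return max_attempts


gmv = {"user-1": 0}
points_attempts = []


def activity_handler(msg):
    # Not idempotent: every delivery adds the amount again.
    gmv[msg["user"]] += msg["amount"]
    return True


def points_handler(msg):
    # Simulates a service that fails on the first delivery and succeeds on retry.
    points_attempts.append(msg["order"])
    return len(points_attempts) > 1


attempts = deliver(
    {"user": "user-1", "order": "order-1001", "amount": 200},
    [activity_handler, points_handler],
)
print(attempts, gmv["user-1"])  # 2 400
```

The points service only needed one retry, yet the promotion handler counted the same 200 twice.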

Well, that’s a very careful analysis. How do you guard against it during development?

Normally we handle this by making the interface idempotent.

**Idempotence** is a concept from mathematics and computer science, commonly found in abstract algebra.

In programming, an idempotent operation is one where executing it any number of times has the same effect as executing it once.

An idempotent function, or idempotent method, is a function that can be executed repeatedly with the same parameters and produce the same result. Repeating the call does not change the system state any further, so you do not have to worry about repeated execution altering the system.

For example, the “setTrue()” function is idempotent: the result is the same no matter how many times it is executed. For more complex operations, idempotence is usually guaranteed with a unique transaction number (serial number).

Put simply: if you call my interface with the same parameters, the result is the same no matter how many times you call it. If you add GMV for the same order number, the amount added after N calls is the same as after one call.

Without idempotence, calling the interface several times for one order adds the money several times, and in the same way calling a refund several times deducts the money several times.
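The difference can be shown with a small in-memory sketch. The `GmvAccount` class and its method names are hypothetical; the order number plays the role of the unique serial number mentioned above.

```python
class GmvAccount:
    """In-memory stand-in for one user's promotion GMV record (hypothetical)."""

    def __init__(self):
        self.gmv = 0
        self.processed_orders = set()

    def add_gmv_naive(self, order_id, amount):
        # Not idempotent: every redelivery adds the amount again.
        self.gmv += amount

    def add_gmv_idempotent(self, order_id, amount):
        # Idempotent: an order number that was already seen is ignored.
        if order_id in self.processed_orders:
            return
        self.processed_orders.add(order_id)
        self.gmv += amount


naive = GmvAccount()
safe = GmvAccount()
for _ in range(3):  # the same payment message delivered three times
    naive.add_gmv_naive("order-1001", 200)
    safe.add_gmv_idempotent("order-1001", 200)

print(naive.gmv)  # 600
print(safe.gmv)   # 200
```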

The general processing flow is as follows:

How do you guarantee that?

Here’s what I said:

Hello, handsome interviewer. For idempotence I generally decide by scenario whether to do a strong check or a weak check. Money-related scenarios are critical, so they get a strong check; less important scenarios get a weak check.

Strong check:

For example, you listen for the user's payment-success message. When it arrives, adding GMV means calling the add-money interface, and that interface in turn calls an interface that writes a flow record (a transaction log entry). The two run in one transaction: they succeed together or fail together.

Every time a message arrives, query the flow table with a unique identifier such as order number + business scenario (for example, the Tmall Double 11 event) to see whether that flow record already exists. If it does, return immediately and skip the rest of the process. If it does not, execute the logic that follows.

The reason for using a flow table is that in money-related activities, if anything goes wrong, you can reconcile accounts against the flow table, and it also helps developers locate problems.

Some of you may still be a little confused, and some readers in the tech exchange group suggested adding a little pseudo-code to the examples, so starting with this installment I will write pseudo-code for the parts that can be expressed in code.
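As one possible illustration of the strong check, here is a minimal sketch using an in-memory SQLite database. The table names (`gmv`, `flow`), the column names, and the `on_payment_success` function are assumptions made for this example, not the author's actual schema. The flow record and the GMV update share one transaction, and the primary key on order number + scenario backs up the lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gmv (user_id TEXT PRIMARY KEY, total INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE flow ("
    " order_id TEXT NOT NULL,"
    " scene TEXT NOT NULL,"
    " amount INTEGER NOT NULL,"
    " PRIMARY KEY (order_id, scene))"
)
conn.execute("INSERT INTO gmv VALUES ('user-1', 0)")
conn.commit()


def on_payment_success(user_id, order_id, scene, amount):
    """Return True if the message was applied, False if it was a duplicate."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            exists = conn.execute(
                "SELECT 1 FROM flow WHERE order_id = ? AND scene = ?",
                (order_id, scene),
            ).fetchone()
            if exists:
                return False  # flow record already there: skip the rest
            conn.execute(
                "INSERT INTO flow (order_id, scene, amount) VALUES (?, ?, ?)",
                (order_id, scene, amount),
            )
            conn.execute(
                "UPDATE gmv SET total = total + ? WHERE user_id = ?",
                (amount, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another writer inserted the same flow record first: treat as done.
        return False


first = on_payment_success("user-1", "order-1001", "double11", 200)
second = on_payment_success("user-1", "order-1001", "double11", 200)  # redelivery
total = conn.execute("SELECT total FROM gmv WHERE user_id = 'user-1'").fetchone()[0]
print(first, second, total)  # True False 200
```

Because the flow rows are persisted, they can later be used for reconciliation, which is the point made above.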

Weak check:

This one is simple. For unimportant scenarios, such as sending someone a text message, I use id + scenario identifier as the Redis key and put it in the cache. The expiration time depends on your scenario; any message arriving within that period is checked against Redis first.

With a KV store, even if a message is lost it may not matter in such a scenario; at worst an unimportant notification SMS is dropped (dare you say you have never had a verification-code SMS go missing?).
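A weak check along these lines might look like the sketch below. To keep it runnable without a server, `TtlKeyStore` is a hypothetical in-memory stand-in for Redis; with a real Redis the same idea is usually expressed with `SET key value NX EX seconds`. The key layout and the 60-second expiry are arbitrary choices for the example.

```python
import time


class TtlKeyStore:
    """Tiny in-memory stand-in for a Redis-like store with expiring keys."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expires_at = {}

    def set_if_absent(self, key, ttl_seconds):
        """Store the key and return True unless it is present and unexpired."""
        now = self._clock()
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > now:
            return False
        self._expires_at[key] = now + ttl_seconds
        return True


def should_send_sms(store, user_id, scene):
    # Key = id + scenario, as described above; the TTL is an assumption.
    return store.set_if_absent(f"sms:{scene}:{user_id}", ttl_seconds=60)


fake_now = [0.0]  # controllable clock so the example does not need to sleep
store = TtlKeyStore(clock=lambda: fake_now[0])

results = [should_send_sms(store, "user-1", "order-paid")]      # first delivery
results.append(should_send_sms(store, "user-1", "order-paid"))  # duplicate
fake_now[0] = 61.0                                              # key has expired
results.append(should_send_sms(store, "user-1", "order-paid"))
print(results)  # [True, False, True]
```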

Many companies also implement the weak check with tokens; there are plenty of variations. But important scenarios must use a strong check. When you actually have to investigate a problem and there is no persisted data on disk, you feel uneasy, just like the empty feeling after breaking up with your girlfriend. (How would I know, being single? Guess.)

Have you ever worked with ordered message consumption? How did you guarantee the order?

No! It's over!

I have asked a lot of people about ordered consumption over the past week. Such scenarios are not common in day-to-day development. I have discussed it with Sanwei several times.

Generally, the messages for several different operations in the same business scenario are sent one after another, and the sending order itself is correct. But when they are consumed, they get mixed up, and that is where the problem arises.

We all know that data synchronization is under heavy pressure when the data volume is large. Sometimes a large table needs to synchronize hundreds of millions of rows. (This does not mean master-slave replication, where replication lag is too much of a problem; the data may be synchronized from a replica or from the primary database to a standby database.)

In this case we usually push the changes into a queue and then consume them gradually. The problem: we insert, update, and delete the row with the same Id in the database, in that order, but when the messages are consumed the order becomes update, delete, insert, so the data ends up wrong.

A row that should have been deleted is still there on your side. That is a big problem!

The results are completely different, aren't they?
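The effect of the scrambled order can be checked with a tiny simulation. The `apply_events` function and the event tuples are hypothetical; a dict stands in for the target table.

```python
def apply_events(events):
    """Apply (operation, row_id, value) events to a dict acting as the table."""
    table = {}
    for op, row_id, value in events:
        if op == "insert":
            table[row_id] = value
        elif op == "update":
            if row_id in table:      # updating a missing row does nothing
                table[row_id] = value
        elif op == "delete":
            table.pop(row_id, None)  # deleting a missing row does nothing
    return table


sent = [("insert", 1, "v1"), ("update", 1, "v2"), ("delete", 1, None)]
scrambled = [sent[1], sent[2], sent[0]]  # update, delete, insert

in_order = apply_events(sent)
out_of_order = apply_events(scrambled)
print(in_order)      # {}
print(out_of_order)  # {1: 'v1'}
```

Consumed in order, the row is gone; consumed out of order, a row that should have been deleted survives with stale data.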

So how do you solve it?

Let me briefly mention a simple implementation of RocketMQ that we use.

RocketMQ is an open-source product from Alibaba. I asked around, and many companies use it, so I will use it as the example. I will cover RocketMQ and Kafka in more detail later.

Producers and consumers typically need ordered messages within a single business flow, for example order creation, payment, shipment, and receipt.

Do these steps share one order number? One order always has exactly one order number, so that makes things easy.

There are multiple queues under a topic. In order to ensure the order of sending, RocketMQ provides MessageQueueSelector queue selection mechanism, which has three implementations:

We can use the hash-based selector so that messages for the same order go to the same queue, and send synchronously: only after the creation message for an order has been sent successfully do we send its payment message. This keeps the sending order correct.

The queue mechanism inside a RocketMQ topic guarantees that storage is FIFO (first in, first out); the rest is up to the consumers, who consume in order.

RocketMQ only guarantees ordered delivery; ordered consumption has to be guaranteed by the consumer's business logic!

This is easy to understand. When you send the messages for an order, you choose the queue by hashing the order number. Hashing the same order number always gives the same result, doesn't it? So the messages land in the same queue and are handled by the same consumer, and the order is preserved.
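The routing idea can be sketched without RocketMQ at all. In the example below, `select_queue`, `QUEUE_COUNT`, and the message tuples are hypothetical, and plain lists stand in for the queues under a topic. It is not the RocketMQ API, only the hashing principle.

```python
import zlib
from collections import defaultdict

QUEUE_COUNT = 4  # hypothetical number of queues under the topic


def select_queue(order_id, queue_count=QUEUE_COUNT):
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(order_id.encode("utf-8")) % queue_count


queues = defaultdict(list)
messages = [
    ("order-1001", "created"),
    ("order-1002", "created"),
    ("order-1001", "paid"),
    ("order-1002", "paid"),
    ("order-1001", "shipped"),
]
for order_id, step in messages:
    queues[select_queue(order_id)].append((order_id, step))

# Reading one queue front to back yields an order's steps in sending order.
steps_1001 = [s for o, s in queues[select_queue("order-1001")] if o == "order-1001"]
print(steps_1001)  # ['created', 'paid', 'shipped']
```

Two different orders may share a queue, but the steps of any single order can never be spread across queues.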

Different middleware implements true ordered consumption in its own way; this is just one example so that you get the idea.

While I was writing this, someone in the tech exchange group asked me: “Isn't it enough for one queue to deliver in order and one consumer to consume?” My answer is that consumers are multi-threaded. You deliver the messages to them in order, but can you be sure they are processed in order? It is safer to send the next message only after the previous one has been consumed successfully.
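One way a consumer can keep the order on its side is to let exactly one thread drain each queue. The sketch below uses Python's standard `queue` and `threading` modules as stand-ins; the `worker` function and the message tuples are hypothetical and say nothing about how a particular client library does it.

```python
import queue
import threading


def worker(q, handled):
    """Drain one queue on one thread, so its messages are handled in order."""
    while True:
        msg = q.get()
        if msg is None:  # sentinel: no more messages
            return
        handled.append(msg)


queues = [queue.Queue(), queue.Queue()]
handled = [[], []]
threads = [
    threading.Thread(target=worker, args=(q, h)) for q, h in zip(queues, handled)
]
for t in threads:
    t.start()

for step in ("created", "paid", "shipped"):
    queues[0].put(("order-1001", step))
    queues[1].put(("order-1002", step))
for q in queues:
    q.put(None)
for t in threads:
    t.join()

print([step for _, step in handled[0]])  # ['created', 'paid', 'shipped']
```

The two queues are still processed in parallel, so throughput is not reduced to a single thread overall.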

Can you tell me about distributed transactions?

Distributed transactions are almost unavoidable in today's distributed systems.

Let’s talk a little bit about what is a transaction?

Distributed transactions, transaction isolation levels, ACID, I’m sure you’re all familiar with these things, but what is a transaction?

Concept:

In everyday usage: something that is to be done or has been done.

In computing: a program execution unit that accesses and possibly updates various data items in a database.

Transactions are typically initiated by the execution of user programs written in a high-level database manipulation language or programming language such as SQL, C++, or Java, and are delimited by statements (or function calls) of the form begin transaction and end transaction.

A transaction consists of all operations performed between begin transaction and end transaction.

Features:

Transactions are the basic unit of recovery and concurrency control.

Transactions should have four properties: atomicity, consistency, isolation, and durability. These four properties are commonly referred to as the ACID properties.

Atomicity: A transaction is an indivisible unit of work in which all or none of the operations involved are performed.

Consistency: The transaction must change the database from one consistent state to another. Consistency is closely related to atomicity.

Isolation: The execution of a transaction cannot be interfered with by other transactions. That is, the operations and data used within a transaction are isolated from other concurrent transactions, and concurrent transactions cannot interfere with each other.

Durability: Also called permanence. It means that once a transaction is committed, its changes to the data in the database should be permanent. Subsequent operations or failures should not affect them in any way.

If some of you still do not follow, here is my (Aobing's) summary: a transaction is a series of operations that either all succeed or all fail. The ACID properties (atomicity, consistency, isolation, durability) follow from that.

A transaction is a sequence of operations carried out as one unit, and it must satisfy all four ACID properties.
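The "all succeed or all fail" behaviour is easy to observe with a local database. The sketch below uses an in-memory SQLite database; the `account` table and the `transfer` function are hypothetical. The credit is applied first and the debit then violates a constraint, so the whole transaction is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


ok = transfer("alice", "bob", 150)  # alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, balances)  # False {'alice': 100, 'bob': 0}
```

Bob's credit had already been executed when the debit failed, yet his balance is unchanged afterwards: that is atomicity.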

So what are distributed transactions?

Think about it: placing an order may involve a dozen or more steps. Your order and payment both succeed, but the coupon deduction fails, or the points are not added. In the first case the company gets taken advantage of; in the second the user is unhappy. These steps live in different services, so how do you make sure that all of them succeed?

Smart: distributed transactions. See, you already know the answer!

A real application scenario would be many times more complex than the one I presented. I used a simple example just to make it easier to understand.

The distributed transaction schemes I have come across and learned about are as follows:

  • 2PC (two-phase commit)

  • 3PC (three-phase commit)

  • TCC (Try, Confirm, Cancel)

  • Best-effort notification

  • XA

  • Local message table (proposed by eBay)

  • Half message / eventual consistency (RocketMQ)

Here I will introduce the simplest one, 2PC (two-phase commit), and the one you will probably use most often, the half-message transaction (eventual consistency). The aim is to show the role message middleware plays in distributed transactions. The other schemes are similar, and each has its advantages.

Of course, there are also various disadvantages:

For example, database resources are locked for a long time. As a result, the system responds slowly and cannot keep up with concurrency.

Network jitter can cause split-brain, so that transaction participants cannot properly carry out the coordinator's instructions, which leads to data inconsistency.

Single point of failure: for example, the transaction coordinator crashes at some moment. A new leader can be chosen through an election mechanism, but problems are inevitable during that window. TCC, meanwhile, is only feasible with a strong technical team, because its development cost is high.

So, without further ado, let's introduce these two.

2PC (two-phase commit):

2PC (two-phase commit) can be called the ancestor of distributed transactions. Like a matchmaker, the message middleware coordinates multiple systems: each system locks its resources when it starts its transaction but does not commit. When both systems are ready, they tell the message middleware, and then each commits its own transaction.

But do you see the problem?

Yes, as some of you may have noticed: if system A commits successfully but system B fails to commit because of a network blip or some other reason, the transaction as a whole still fails and the data is left inconsistent.
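The two phases can be sketched in a few lines. The `Participant` class and the `two_phase_commit` function are hypothetical in-memory stand-ins, and the sketch assumes that phase 2 itself never fails, which is exactly the weakness just described.

```python
class Participant:
    """One system taking part in the transaction (in-memory stand-in)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Phase 1: lock resources and vote, but do not commit yet.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()    # phase 2: everyone was ready
        return True
    for p in participants:
        p.rollback()      # phase 2: at least one participant was not ready
    return False


a, b = Participant("A"), Participant("B")
c, d = Participant("C"), Participant("D", can_prepare=False)
print(two_phase_commit([a, b]), a.state, b.state)  # True committed committed
print(two_phase_commit([c, d]), c.state, d.state)  # False aborted aborted
```

If `commit()` on B could raise after A had already committed, this coordinator would have no way to undo A.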

Eventual consistency:

Throughout the process, we can ensure that:

  • If the active side of the business (the upstream service) fails to commit its local transaction, the passive side (the downstream service) never receives the message.

  • As long as the active side's local transaction commits successfully, the message service will deliver the message to the downstream passive side and ultimately guarantee that the passive side consumes it to a final state (either consumed successfully or marked as failed).
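The two guarantees above can be illustrated with a simplified half-message flow. The `Broker` class and `send_in_transaction` function are hypothetical in-memory stand-ins, not the RocketMQ API, and the sketch leaves out what a real broker must also handle, such as a producer that crashes before confirming and retries on the consumer side.

```python
class Broker:
    """In-memory stand-in for a broker that supports half messages."""

    def __init__(self):
        self.half = {}     # msg_id -> body, not yet visible to consumers
        self.visible = []  # messages that consumers can receive

    def send_half(self, msg_id, body):
        self.half[msg_id] = body

    def commit(self, msg_id):
        self.visible.append(self.half.pop(msg_id))

    def rollback(self, msg_id):
        self.half.pop(msg_id, None)


def send_in_transaction(broker, msg_id, body, local_transaction):
    """Send a half message, run the local transaction, then confirm or discard."""
    broker.send_half(msg_id, body)   # 1. half message, invisible downstream
    try:
        local_transaction()          # 2. the active side's local transaction
    except Exception:
        broker.rollback(msg_id)      # 3a. local failure: message is discarded
        return False
    broker.commit(msg_id)            # 3b. local success: message is delivered
    return True


def local_ok():
    pass


def local_fail():
    raise RuntimeError("local transaction failed")


broker = Broker()
sent_ok = send_in_transaction(broker, "m1", "order-1001 paid", local_ok)
sent_bad = send_in_transaction(broker, "m2", "order-1002 paid", local_fail)
print(sent_ok, sent_bad, broker.visible)  # True False ['order-1001 paid']
```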

But that is how technology is: we have to consider all kinds of extreme cases, and a perfect scheme is hard to come by. That is why there are so many distributed transaction schemes, such as three-phase commit, TCC, and best-effort notification. You only need to know what each one does, what it is good at, and what its drawbacks are, and then choose carefully during actual development. Systems are designed according to business scenarios. Technology without business is meaningless, and business without technology has no foundation.

Again: there is no perfect system, only the most suitable system.

The end of the interview

Young man, I couldn't tell at first, but you really know your stuff. Those were good answers. Can you come back tomorrow and talk to me about RocketMQ?

Aobing wrote so much in this chapter that I'm not sure he can cover everything. I feel for him and would like to give him a thumbs up. Topics such as message backtracking will be covered in a separate installment on messaging middleware, since this one is already a bit long.

Conclusion

This chapter actually took me longer to write than the earlier seckill (flash sale) one, because I did not know which ordered-message scenario would be easiest for everyone to understand. In the end I consulted material online; practical scenarios for ordered messages are not as common as those for other topics. I also talked with 3Y several times, and we finally settled on the binlog synchronization scenario.

In short, my creative well is running a little dry this time. This chapter was really difficult to write, including distributed transactions, which are also very complicated in actual development and take a long time to design when they are needed.

I always try to write in a way that is easy to understand, yet I still feel this article is not easy to follow; that is simply the nature of message-queue scenarios. Also, when you add me as a contact, please don't just ask me for the answers. Digging into the details yourself will probably help you more than my telling you the answer, right?