Preface

Hello everyone, I'm the boy picking up snails. A friend of mine (6 years of work experience) recently interviewed at Tencent Cloud. Below are the interview questions and answers. Come on, let's grind through them together.

  1. Talk about projects, good design, good code
  2. What is zero copy?
  3. How many IO models are there? What’s the difference between NIO and multiplexing?
  4. How does a Future implement blocking to wait for the result?
  5. The difference between ReentrantLock and Synchronized? The principle of Synchronized?
  6. Talk about AQS? How does ReentrantLock work?
  7. Optimistic locks and pessimistic locks: how would you implement them?
  8. Paxos protocol? What’s the workflow like?
  9. Talk about the B+ tree? Is the B+ tree ordered? What's the main difference between a B+ tree and a B-tree? What does a lookup through a B+ tree index look like?
  10. TCP congestion mechanism
  11. Talk about JVM tuning
  12. What are the disadvantages of sharding databases and tables?
  13. How are distributed transactions resolved? TCC understand?
  14. How does RocketMQ ensure the accuracy and security of messages?
  15. Algorithm: sum of three numbers
  • WeChat official account: a boy picking up snails

1. Talk about projects, good design, good code

For projects, you can talk about the projects you have worked on, especially ones with highlights. If you don't have a highlight project, talk about good design: which interfaces you've optimized and how much performance improved, and which slow SQL you've tuned. Even some well-written code will do.

If it’s about optimizing interfaces, you can check out this article:

A practical summary of interface performance optimization: Eight suggestions for optimizing interface performance

For code optimization details, check out this post:

Work for 4 years and share 50 tips to make your code better

For slow SQL optimizations, check out my previous MySQL column series:

  • Order by in detail
  • Group by
  • Actual combat! How to solve deep pagination problem in MySQL
  • Back-end Programmers must have: 30 tips for writing high quality SQL
  • Ali gave some SQL and asked how many tree searches should be performed?
  • Production problem analysis! Delete in subquery does not move index? !

2. What is zero copy?

Zero-copy means that the CPU does not need to copy data from one storage area to another when performing I/O operations on a computer, thus reducing context switching and CPU copy time. It is an I/O operation optimization technique.

Traditional I/O execution process

The traditional IO flow includes a read process and a write process.

  • Read: read data from the disk into the kernel buffer, then copy it into the user buffer.
  • Write: write data into the socket buffer first, then copy it to the NIC device.

  1. The user application process invokes the read function, making an IO call to the operating system; the context switches from user mode to kernel mode (switch 1).
  2. The DMA controller reads data from the disk into the kernel buffer.
  3. The CPU copies the kernel buffer data into the user application buffer; the context switches from kernel mode back to user mode (switch 2), and read returns.
  4. The user application process invokes the write function, making another IO call; the context switches from user mode to kernel mode (switch 3).
  5. The CPU copies data from the user buffer into the socket buffer.
  6. The DMA controller copies data from the socket buffer to the NIC device; the context switches from kernel mode back to user mode (switch 4), and write returns.

The traditional I/O read/write process includes four context switches (four user/kernel switches) and four data copies (two CPU copies and two DMA copies).

Zero copy implementation:

Zero copy does not mean no data is copied, but it reduces the number of user/kernel mode switches and CPU copy times. Zero copy is generally implemented in three ways:

  • mmap+write
  • sendfile
  • sendfile with DMA scatter/gather copy

mmap+write

mmap leverages virtual memory by mapping the kernel read buffer into user space, which reduces the number of data copies.

  1. The user process initiates an IO call to the operating system kernel via mmap; the context switches from user mode to kernel mode.
  2. The DMA controller copies data from the disk into the kernel buffer.
  3. The context switches from kernel mode back to user mode, and the mmap call returns.
  4. The user process initiates an IO call to the operating system kernel via write; the context switches from user mode to kernel mode.
  5. The CPU copies data from the kernel buffer into the socket buffer.
  6. The DMA controller copies data from the socket buffer to the NIC; the context switches from kernel mode back to user mode, and the write call returns.

With the mmap + write implementation of zero copy, an I/O involves four context switches between user space and kernel space and three data copies (two DMA copies and one CPU copy).
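As an illustration, here is a minimal Java sketch of the mmap idea using FileChannel.map: a MappedByteBuffer maps the file into the process address space, so no extra user-space byte[] copy is needed. The file names are placeholders.

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapCopy {
    public static void main(String[] args) throws Exception {
        try (FileChannel src = FileChannel.open(Path.of("data.bin"), StandardOpenOption.READ);
             FileChannel dst = FileChannel.open(Path.of("copy.bin"),
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // Map the whole source file into memory; no read() copy into a user-space array is needed
            MappedByteBuffer buffer = src.map(FileChannel.MapMode.READ_ONLY, 0, src.size());
            // Write the mapped region out; the kernel pulls the pages directly
            dst.write(buffer);
        }
    }
}
```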

sendfile

sendfile transfers data between two file descriptors. The whole operation happens inside the operating system kernel, which avoids copying data from the kernel buffer into the user buffer.

  1. The user process initiates the sendfile system call; the context switches from user mode to kernel mode (switch 1).
  2. The DMA controller copies data from the disk into the kernel buffer.
  3. The CPU copies data from the kernel read buffer into the socket buffer.
  4. The DMA controller asynchronously copies data from the socket buffer to the NIC.
  5. The context switches from kernel mode back to user mode (switch 2), and the sendfile call returns.

With the sendfile implementation of zero copy, an I/O involves two context switches between user space and kernel space and three data copies (two DMA copies and one CPU copy).

sendfile with DMA scatter/gather copy

Since Linux 2.4, sendfile has been optimized: if the NIC supports SG-DMA (scatter/gather DMA), data can be read directly from the kernel buffer to the NIC, saving one more CPU copy.

  1. The user process initiates the sendfile system call; the context switches from user mode to kernel mode (switch 1).
  2. The DMA controller copies data from the disk into the kernel buffer.
  3. The CPU sends the file descriptor information (including the kernel buffer's memory address and offset) to the socket buffer.
  4. The DMA controller copies data directly from the kernel buffer to the NIC based on that file descriptor information.
  5. The context switches from kernel mode back to user mode (switch 2), and the sendfile call returns.

So with sendfile + DMA scatter/gather, an I/O involves two context switches between user space and kernel space and only two data copies, both of which are DMA copies. This is true zero-copy: data is never transferred by the CPU, and all copies are done by DMA.
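In Java, FileChannel.transferTo() typically maps to sendfile on Linux. A minimal sketch, with host, port, and file name as placeholder values:

```java
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SendfileCopy {
    public static void main(String[] args) throws Exception {
        try (FileChannel file = FileChannel.open(Path.of("data.bin"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("127.0.0.1", 9000))) {
            long position = 0;
            long remaining = file.size();
            while (remaining > 0) {
                // transferTo asks the kernel to move bytes from the file to the socket directly
                // (sendfile under the hood), without copying them through a user-space buffer
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```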

Read it once to understand: zero copy details

3. How many IO models are there? What’s the difference between NIO and multiplexing?

There are five IO models

  • Blocking IO model
  • Non-blocking IO model
  • IO multiplexing model
  • Signal-driven IO model
  • Asynchronous IO model (AIO)

NIO (Non-blocking IO model)

NIO is the non-blocking IO model. The flow of non-blocking IO is as follows:

  1. The application process calls recvfrom to read data from the operating system kernel.
  2. If the kernel data is not ready, the call immediately returns an EWOULDBLOCK error code.
  3. The application process keeps polling, repeatedly calling recvfrom to try to read the data.
  4. Once the kernel data is ready, it is copied from the kernel buffer into user space.
  5. The call completes and returns success.

NIO (the non-blocking IO model) has a performance problem: frequent polling leads to frequent system calls, which consumes a lot of CPU. The IO multiplexing model was introduced to solve this.

IO multiplexing model

With IO multiplexing, the process waits until the kernel data is ready, and the kernel actively notifies the application process, which then makes the system call to read it.

Core idea of the IO multiplexing model: the system provides a family of functions (such as select, poll and epoll) that can monitor multiple fds at the same time. As soon as any fd's kernel data is ready, the function returns, and the application process then issues the recvfrom system call.

Select for IO multiplexing

The application process can monitor multiple fds at the same time by calling the select function. As soon as any of the monitored fds has data ready, select returns a readable state, and the application process then issues a recvfrom request to read the data.

In the non-blocking IO model (NIO), N (N >= 1) polling system calls are needed, whereas with the select-based IO multiplexing model a single select call covers all the monitored fds, which greatly improves performance.

However, select has several disadvantages:

  • The maximum number of I/O connections it can monitor is limited, typically 1024 on Linux.
  • After select returns, the application must iterate through the fd set to find the ready descriptors (it knows I/O events occurred, but not on which fds, so it traverses all of them).

poll was later proposed to remove the connection-number limit. Compared with select, poll solves that limitation. However, poll, like select, still needs to iterate through the file descriptors to find the ready sockets. When a large number of clients are connected at the same time, very few may be ready at any given moment, so efficiency decreases linearly as the number of monitored descriptors grows.

Epoll for IO multiplexing

To solve the problems of select/poll, the multiplexing model epoll was born. It is event-driven, and the flow is as follows:

epoll registers an fd (file descriptor) via epoll_ctl(). Once an fd becomes ready, the kernel uses a callback mechanism to activate it quickly, and the process is notified when it calls epoll_wait(). Instead of traversing all file descriptors, we listen for event callbacks. That is where epoll shines.
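To make the multiplexing model concrete, here is a minimal Java NIO sketch: a single Selector (backed by epoll on Linux) watches many channels, and one thread handles only the channels that are ready. The port number is arbitrary.

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class MultiplexingEchoServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                       // block until at least one fd is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {            // new connection: register it for reads
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {       // data ready: read without blocking
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int n = client.read(buf);
                    if (n == -1) {
                        client.close();
                    } else {
                        buf.flip();
                        client.write(buf);           // echo the data back
                    }
                }
            }
        }
    }
}
```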

4. How does a Future implement blocking to wait for the result?

Future.get() is used to fetch the asynchronous result, and it blocks. What happens behind the scenes?
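As a quick illustration of the blocking behavior before digging into the source, here is a minimal sketch; the single-thread pool and the 2-second sleep are just stand-ins for a slow task.

```java
import java.util.concurrent.*;

public class FutureBlockingDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // submit() wraps the Callable in a FutureTask and runs it on the pool thread
        Future<String> future = pool.submit(() -> {
            TimeUnit.SECONDS.sleep(2);   // simulate a slow computation
            return "result";
        });

        System.out.println("doing other work while the task runs...");
        // get() blocks the main thread (it parks inside awaitDone) until the task completes
        System.out.println("got: " + future.get());
        pool.shutdown();
    }
}
```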

We can look at the FutureTask class structure diagram:

FutureTask implements the RunnableFuture interface. RunnableFuture inherits both Runnable and Future interfaces.

A Future represents the life cycle of a task and provides methods to determine whether it has been completed or cancelled, to get the result of the task, to cancel the task, and so on.

public interface Future<V> {
    boolean cancel(boolean mayInterruptIfRunning);
    boolean isCancelled();
    boolean isDone();
    // Get the result of the Future. If the Future has not finished yet,
    // the current thread blocks and waits.
    V get() throws InterruptedException, ExecutionException;
    // Like get(), but waits at most the given timeout for the result.
    V get(long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException;
}

FutureTask is a combination of Runnable and Future. We can think of Runnable as a producer and Future as a consumer. FutureTask is shared by both, with the producer running the run method to calculate the results and the consumer getting the results.

Producer-consumer model, if producer data is not ready, consumers will be blocked. When the producer data is ready, the consumer is awakened to continue executing. Let’s take a look at how FutureTask is implemented internally.

FutureTask maintains task state internally

// Task states maintained internally by FutureTask
private static final int NEW          = 0;
private static final int COMPLETING   = 1;
private static final int NORMAL       = 2;
private static final int EXCEPTIONAL  = 3;
private static final int CANCELLED    = 4;
private static final int INTERRUPTING = 5;
private static final int INTERRUPTED  = 6;

Producer run method:

public void run() {
    // If state is not NEW, or the CAS setting the runner thread fails,
    // another thread is already running this task: just return
    if (state != NEW ||
        !UNSAFE.compareAndSwapObject(this, runnerOffset, null, Thread.currentThread()))
        return;
    try {
        Callable<V> c = callable;
        if (c != null && state == NEW) {
            V result;
            boolean ran;
            try {
                result = c.call();   // run the task
                ran = true;
            } catch (Throwable ex) {
                result = null;       // the task failed
                ran = false;
                setException(ex);    // record the exception as the outcome
            }
            if (ran)
                set(result);         // record the result and wake up waiting threads
        }
    } finally {
        runner = null;
        int s = state;
        if (s >= INTERRUPTING)
            handlePossibleCancellationInterrupt(s);
    }
}

Consumer get method

public V get() throws InterruptedException, ExecutionException {
    int s = state;
    // If the task has not completed yet (state <= COMPLETING),
    // call awaitDone to make the current thread wait
    if (s <= COMPLETING)
        s = awaitDone(false, 0L);
    return report(s);
}

What does awaitDone do?

private int awaitDone(boolean timed, long nanos)
    throws InterruptedException {
    final long deadline = timed ? System.nanoTime() + nanos : 0L;
    WaitNode q = null;
    boolean queued = false;
    for (;;) {
        // If the current thread has been interrupted, remove node q from the wait list and throw
        if (Thread.interrupted()) {
            removeWaiter(q);
            throw new InterruptedException();
        }

        int s = state;
        if (s > COMPLETING) {          // FutureTask has completed
            if (q != null)
                q.thread = null;
            return s;
        }
        else if (s == COMPLETING)      // almost done: cannot time out yet, just yield
            Thread.yield();
        else if (q == null)            // first iteration: create a wait node
            q = new WaitNode();
        else if (!queued)              // enqueue the wait node with CAS
            queued = UNSAFE.compareAndSwapObject(this, waitersOffset,
                                                 q.next = waiters, q);
        else if (timed) {              // timed wait (get with a timeout)
            nanos = deadline - System.nanoTime();
            if (nanos <= 0L) {
                removeWaiter(q);
                return state;
            }
            // let the current thread wait for at most nanos
            LockSupport.parkNanos(this, nanos);
        }
        else
            LockSupport.park(this);    // park until woken up by finishCompletion()
    }
}

Of course, in an interview you don't need to go through the source code in this much detail; a general idea is enough.

5. The difference between ReentrantLock and Synchronized? The principle of Synchronized?

The difference between ReentrantLock and Synchronized?

  • Synchronized is implemented by the JVM, while ReentrantLock is implemented at the API (JDK) level.
  • Before the synchronization optimizations, Synchronized performed much worse than ReentrantLock, but since biased locks and lightweight locks (spin locks) were introduced, their performance is about the same.
  • Synchronized is more convenient and simple to use: the compiler guarantees the lock is acquired and released. ReentrantLock requires manually acquiring and releasing the lock, preferably releasing it in finally.
  • ReentrantLock can be specified as a fair or unfair lock, while Synchronized can only be an unfair lock.
  • ReentrantLock can respond to interruption while waiting for the lock (lockInterruptibly) and supports trying to acquire the lock with a timeout (tryLock), while Synchronized cannot. (See the sketch below.)
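A minimal sketch of the ReentrantLock-specific features mentioned above (fair mode and tryLock with a timeout); the class and method names are made up for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockDemo {
    // A fair ReentrantLock; Synchronized can only be an unfair lock
    private final ReentrantLock lock = new ReentrantLock(true);

    public void doWork() throws InterruptedException {
        // Unlike synchronized, we can give up after a timeout instead of blocking forever
        if (lock.tryLock(1, TimeUnit.SECONDS)) {
            try {
                // critical section
            } finally {
                lock.unlock();   // always release the lock in finally
            }
        } else {
            // failed to acquire the lock within 1 second, take another path
        }
    }
}
```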

As for the principle of Synchronized, you can read my article

Synchronized parsing — if you’d like to peel me open layer by layer

6. Talk about AQS? How does ReentrantLock work?

The core points to cover for AQS (AbstractQueuedSynchronizer) are:

  • state maintenance
  • CLH queue
  • ConditionObject notification
  • Template method design pattern
  • Exclusive and shared modes
  • Custom synchronizers

You can look at my previous article: AQS analysis and practice

The same article also covers how ReentrantLock is built on top of AQS. The sketch below illustrates the custom-synchronizer idea.
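To give a feel for the "custom synchronizer" point, here is a minimal sketch of a non-reentrant mutex built on AQS. It is a toy example, not how ReentrantLock itself is written: state 0 means unlocked and state 1 means locked.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

public class SimpleMutex {
    private static class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int acquires) {
            // CAS state 0 -> 1; on success the current thread owns the lock
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;   // AQS then enqueues the thread in the CLH queue and parks it
        }

        @Override
        protected boolean tryRelease(int releases) {
            setExclusiveOwnerThread(null);
            setState(0);    // unlock; AQS then unparks the successor of the queue head
            return true;
        }
    }

    private final Sync sync = new Sync();

    public void lock()   { sync.acquire(1); }
    public void unlock() { sync.release(1); }
}
```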

7. Optimistic locks and pessimistic locks: how would you implement them?

Pessimistic locking:

A pessimistic lock is single-minded and insecure: its heart belongs only to the current thread, and it constantly worries that its beloved data may be modified by another thread. So once one thread acquires a pessimistic lock, no other thread can modify the data until the lock is released.

  • The SQL statementselect ... for updateIs an implementation of pessimistic locking
  • Java’s synchronized keyword is another example of pessimistic locking

Optimistic locking:

Optimistic locking is optimistic: it assumes that data is not modified frequently and that operations generally do not conflict. Therefore it does not lock; instead, when updating the data it checks whether another thread has modified it in the meantime. Implementation: optimistic locking is usually implemented with a version-number mechanism or the CAS algorithm, as sketched below.
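A minimal sketch of the CAS approach using AtomicInteger (a made-up counter example): read the old value, compute the new one, and only write it back if nothing changed in between, retrying otherwise.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    public int increment() {
        for (;;) {                        // spin until the CAS succeeds
            int oldValue = value.get();   // 1. read the current value
            int newValue = oldValue + 1;  // 2. compute the new value
            // 3. write back only if the value is still oldValue; otherwise retry
            if (value.compareAndSet(oldValue, newValue)) {
                return newValue;
            }
        }
    }
}
```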

CAS has been used in business to solve concurrency problems before. If you are interested, you can have a look.

  • CAS optimistic locking is a practice for solving concurrency problems

8. Understand the Paxos protocol? What’s the workflow like?

8.1 Why is the Paxos algorithm needed?

Nowadays applications are deployed as clusters, and all machines need to agree on the same state. Suppose there are two machines, A and B: A wants to change the state to a and B wants to change it to b. Whom should we listen to? As in 2PC, we can introduce a coordinator and listen to whichever request arrives first.

The problem is that the coordinator is a single point: what if it goes down? So we can introduce multiple coordinators.

But with so many coordinators, who should you listen to?

Paxos algorithm is introduced to solve this problem. Paxos algorithm is a distributed consistency algorithm based on messaging.

8.2 Roles of Paxos

Paxos involves three roles: Proposer, Acceptor, and Learner.

  • Proposer: makes proposals. A proposal contains a proposal number and a proposal value.
  • Acceptor: accepts proposals. Once a proposal is accepted by enough Acceptors, the value inside it (denoted V) is considered chosen.
  • Learner: learns whichever proposal has been chosen.

A process may be a Proposer, an Acceptor, or a Learner.

8.3 Derivation of the Paxos algorithm

A consistency algorithm must satisfy the following requirements:

  • Among the proposals that are put forward, only one value is chosen.
  • If no proposal is put forward, no value is chosen.
  • Once a proposal is chosen, Learners can learn the chosen proposal.

Assuming only one Acceptor, it is guaranteed that only one value will be selected as long as the Acceptor accepts the first proposal it receives. However, when this Acceptor fails, the entire system becomes unavailable.

With multiple Proposers and multiple Acceptors, how do we select a proposal?

We can add a constraint, say P1: an Acceptor must accept the first proposal it receives. But P1 alone is not enough: if each Proposer sends a different value (V1, V2, V3) to a different Acceptor, then different values get accepted and no single value is chosen.

So we add P1a: a proposal is chosen only when it is accepted by more than half of the Acceptors. In other words, if a proposal with some value has been accepted by a majority of Acceptors, that value is considered chosen. A proposal here consists of a proposal number plus a proposal value, denoted [M, V].

You can now allow multiple proposals to be selected, but you must ensure that all selected proposals have the same value. Otherwise there will be inconsistency again. So we can add another constraint P2:

If proposal P[M1, V1] is chosen, then every chosen proposal P with a number higher than M1 also has value V1.

A proposal must be selected and accepted by at least one Acceptor, so we can change the P2 constraint to the Acceptor constraint P2a:

If proposal P[M1, V1] is accepted, then every proposal P with a number higher than M1 that is accepted by any Acceptor also has value V1.

The problem of multiple proposals being selected has been resolved, but there are still problems if the network is unstable or down.

Suppose there are five Acceptors. Proposer2 proposes [M1, V1], which is accepted by Acceptor2~5 (more than half), so Acceptor2~5 and Proposer2 all consider V1 chosen. Meanwhile Acceptor1 has just recovered from a crash, and Proposer1 sends it a proposal [M2, V2] (with V2 ≠ V1 and M2 > M1). This is the first proposal Acceptor1 receives, so by P1 (an Acceptor must accept the first proposal it receives) it must accept it, and Acceptor1 also considers V2 chosen.

This raises two questions:

  • Acceptor1 thinks V2 is chosen, while Acceptor2~5 and Proposer2 think V1 is chosen. There is an inconsistency.
  • V1 was chosen, but the higher-numbered proposal [M2, V2] accepted by Acceptor1 has value V2, and V2 ≠ V1. This contradicts P2a (if proposal P[M1, V1] is accepted, every accepted proposal with a number higher than M1 also has value V1).

We’re going to strengthen the constraint P2a to get the constraint P2b,

If proposal P[M1, V1] is chosen, then every higher-numbered proposal produced by any Proposer also has value V1.

For P2b, how do we guarantee that every higher-numbered proposal produced by any Proposer also has value V1? It suffices to satisfy P2c:

For any M and V, if the proposal [M, V] is put forward, there must be a set S consisting of more than half of the Acceptors such that one of the following two conditions holds:

  • Either no Acceptor in S has accepted any proposal numbered less than M,
  • or V is the value of the highest-numbered proposal among all proposals numbered less than M accepted by the Acceptors in S.

8.4 Algorithm flow

8.4.1 Proposer generates proposals

  • Prepare the request
  • Accept the request

How do you generate proposals based on P2c constraints?

A Proposer selects a new proposal number N and sends a request to some set S consisting of more than half of the Acceptors, asking each Acceptor in S to respond as follows:

  • If the Acceptor has not accepted any proposal, it promises the Proposer that it will no longer accept any proposal numbered less than N.
  • If the Acceptor has already accepted proposals, it returns to the Proposer the highest-numbered proposal it has accepted with a number less than N.

We call this a Prepare request numbered N.

  • If the Proposer receives responses from more than half of the Acceptors, it generates a proposal [N, V] with number N, where V is the value of the highest-numbered proposal among all the responses.
  • If none of the responses contains an accepted proposal, V is decided by the Proposer itself. The Proposer then sends the proposal [N, V] to S, hoping the Acceptors will accept it.

We call this request an Accept request

8.4.2 Acceptor accepts the proposal

An Acceptor may receive two kinds of requests from Proposers: Prepare requests and Accept requests. When an Acceptor responds, it follows constraint P1b:

An Acceptor may accept a proposal numbered N as long as it has not yet responded to any Prepare request numbered greater than N.

If an Acceptor receives a Prepare request numbered N but has already responded to a Prepare request numbered greater than N, then by P1b it will never accept a proposal numbered N, so it can simply ignore this Prepare request.

An Acceptor only needs to remember two things: the highest-numbered proposal it has accepted and the highest number of any Prepare request it has responded to.

8.4.3 Paxos algorithm description

Phase 1:

  • The Proposer selects a proposal number N and sends a Prepare request numbered N to more than half of the Acceptors.
  • If an Acceptor receives a Prepare request numbered N that is greater than the number of any Prepare request it has already responded to, it sends back the highest-numbered proposal it has accepted (if any) as the response, and promises not to accept any proposal numbered less than N.

Phase 2:

  • If the Proposer receives responses to its Prepare request numbered N from more than half of the Acceptors, it sends an Accept request for the proposal [N, V] to more than half of the Acceptors. Note: V is the value of the highest-numbered proposal among the responses; if the responses contain no proposal, V is decided by the Proposer itself.
  • If an Acceptor receives an Accept request for a proposal numbered N, it accepts the proposal as long as it has not responded to a Prepare request numbered greater than N. (A minimal code sketch of the Acceptor side follows.)
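To make the two phases above concrete, here is a minimal single-node sketch of the Acceptor logic. It is a toy illustration only: no networking, no persistence of the promised/accepted state, and no Learner.

```java
public class Acceptor {
    private long maxPreparedNumber = -1;   // highest Prepare number this Acceptor has responded to
    private long acceptedNumber    = -1;   // number of the highest proposal accepted so far
    private Object acceptedValue   = null; // value of the highest proposal accepted so far

    // Phase 1: handle a Prepare request numbered n.
    public synchronized Promise prepare(long n) {
        if (n <= maxPreparedNumber) {
            return new Promise(false, -1, null);   // already promised an equal or higher number
        }
        maxPreparedNumber = n;                     // promise: never accept proposals numbered < n
        return new Promise(true, acceptedNumber, acceptedValue);
    }

    // Phase 2: handle an Accept request for proposal [n, value].
    public synchronized boolean accept(long n, Object value) {
        if (n < maxPreparedNumber) {
            return false;                          // promised a higher-numbered Prepare in the meantime
        }
        maxPreparedNumber = n;
        acceptedNumber = n;
        acceptedValue = value;
        return true;
    }

    // Response to a Prepare request: the promise plus the highest accepted proposal (if any)
    public static class Promise {
        public final boolean ok;
        public final long acceptedNumber;
        public final Object acceptedValue;

        public Promise(boolean ok, long acceptedNumber, Object acceptedValue) {
            this.ok = ok;
            this.acceptedNumber = acceptedNumber;
            this.acceptedValue = acceptedValue;
        }
    }
}
```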

8.4.4 Learner learns the chosen value

9. Talk about the B+ tree? Is the B+ tree ordered? What's the main difference between a B+ tree and a B-tree? What does a lookup through a B+ tree index look like?

B+ trees are ordered.

What’s the main difference between a B+ tree and a B- tree?

  • The internal nodes of a B-tree store data; the internal nodes of a B+ tree do not store data and serve only as an index, while its leaf nodes store the data.
  • The adjacent leaf nodes of a B+ tree are connected by linked-list pointers, but a B-tree's are not.
  • A B-tree's search can end at any node where the key is found, whereas a B+ tree's search always goes through the index down to a leaf node.
  • Any key in a B-tree occurs in only one node, whereas in a B+ tree a key can occur multiple times.

Suppose you have an SQL statement like this:

select * from Temployee where age=32;

Suppose there is an ordinary (secondary) index on the age column, idx_age. To illustrate, we can draw an example secondary index tree,

then draw the primary key (id) index, that is, the clustered index structure, as follows:

Therefore, the execution of this SQL query looks like this:

  • Search the idx_age index tree: load disk block 1 into memory; since 32 < 37, follow the left branch to disk block 2.
  • Load disk block 2 into memory and continue searching in memory; find the record with age = 32 and obtain id = 400.
  • With id = 400, go back to the primary key index tree (a back-to-table lookup).
  • Search the id primary key index tree: load disk block 1 into memory and search in memory; find 400, but B+ tree non-leaf nodes do not store data, so continue down the right branch of 400 to disk block 3.
  • Load disk block 3 into memory, search in memory, find id = 400, and get row R4. Done.

10. How does TCP implement congestion control?

Congestion control acts on the network: it prevents too many packets from being injected into the network and keeps the network from being overloaded. Its main goal is to make full use of the bandwidth of the bottleneck link.

In fact, there are several commonly used congestion control algorithms

  • Slow start
  • Congestion avoidance
  • Congestion occurs
  • Fast recovery

Slow start algorithm

The slow start algorithm means, literally, take it slow. After the TCP connection is established, do not send a large amount of data right away; first probe how congested the network is, and grow the congestion window from small to large. As long as there is no packet loss, each ACK received increases the congestion window cwnd by 1 MSS, so the sending window roughly doubles every round, growing exponentially. If packet loss occurs, the congestion window is halved and the connection enters the congestion avoidance stage.

  • When the TCP connection completes, cwnd is initialized to 1, meaning one MSS of data can be transmitted.
  • Each time an ACK is received, cwnd increases by 1.
  • After each RTT, cwnd doubles, growing exponentially.

To prevent cwnd from growing so large that it causes network congestion, a slow start threshold (ssthresh) state variable is set. When cwnd reaches this threshold, growth is slowed, as if the water pipe is turned down. That is, when cwnd reaches ssthresh, the connection enters the congestion avoidance algorithm.

Congestion avoidance algorithm

Generally, the slow start threshold ssthresh is 65535 bytes. After cwnd reaches the slow start threshold:

  • For each ACK received, cwnd = cwnd + 1/cwnd.
  • For each RTT, cwnd = cwnd + 1.

Obviously this is linear growth, which avoids running into network congestion too quickly.

Congestion occurs

When network congestion occurs, packets are lost. There are two cases:

  • RTO timeout retransmission
  • Fast retransmission

If an RTO timeout retransmission occurs, the "congestion occurred" algorithm is used:

  • Slow start threshold ssthresh = cwnd / 2
  • cwnd is reset to 1
  • A new slow start process begins

This is like decades of hard work wiped out overnight, back to square one. In fact there is a gentler way to handle it: fast retransmission. When the sender receives three duplicate ACKs in a row, it retransmits quickly without waiting for the RTO to time out.

In that case the slow start threshold ssthresh and cwnd change as follows:

  • Congestion window cwnd = cwnd / 2
  • Slow start threshold ssthresh = cwnd
  • The fast recovery algorithm is entered

Fast recovery

Fast retransmission and fast recovery are usually used together. The fast recovery algorithm takes the view that receiving three duplicate ACKs means the network is not that bad, so there is no need to react as drastically as an RTO timeout.

As mentioned above, cwnd and ssthresh have already been updated before entering fast recovery:

- cwnd = cwnd / 2
- ssthresh = cwnd

Then the fast recovery algorithm itself proceeds as follows:

  • cwnd = ssthresh + 3
  • Retransmit the lost packet indicated by the duplicate ACKs
  • If another duplicate ACK is received, cwnd = cwnd + 1
  • If an ACK for new data is received, cwnd = ssthresh. Receiving an ACK for new data means the recovery process is over, and the connection enters the congestion avoidance algorithm again. (A toy simulation of the window growth rules is sketched below.)
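The sketch below is a toy simulation (not real TCP) of how cwnd grows per RTT under the slow start and congestion avoidance rules described above; the ssthresh value of 8 MSS is arbitrary.

```java
public class CwndSimulation {
    public static void main(String[] args) {
        double cwnd = 1;      // slow start begins with cwnd = 1 MSS
        double ssthresh = 8;  // slow start threshold (arbitrary value for the demo)

        for (int rtt = 1; rtt <= 10; rtt++) {
            if (cwnd < ssthresh) {
                cwnd = cwnd * 2;   // slow start: cwnd doubles every RTT (exponential growth)
            } else {
                cwnd = cwnd + 1;   // congestion avoidance: cwnd grows by 1 MSS every RTT (linear)
            }
            System.out.printf("RTT %2d: cwnd = %.0f MSS%n", rtt, cwnd);
        }

        // On an RTO timeout: ssthresh = cwnd / 2, cwnd = 1, and slow start restarts.
        // On 3 duplicate ACKs (fast retransmit): cwnd = cwnd / 2, ssthresh = cwnd, enter fast recovery.
    }
}
```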

11. Talk about JVM tuning

11.1 When is JVM tuning generally considered?

  • Heap memory continues to rise to the maximum memory value set.
  • Full GC times are frequent;
  • GC pauses are too long (more than 1 second);
  • Memory exceptions such as OutOfMemory occur in applications.
  • Some applications use local cache and take up a lot of memory space.
  • The system throughput and response performance are low or degraded.

11.2 Objectives for JVM tuning

  • Latency: low GC pauses and low GC frequencies;
  • Low memory usage;
  • High throughput;

11.3 JVM tuning quantified goals

  • Heap memory usage <= 70%;
  • Old generation memory usage <= 70%;
  • Average GC pause <= 1 second;
  • Full GC count is 0, or the average Full GC interval >= 24 hours;

11.4 Steps for JVM tuning

  • Analyze GC logs and dump files to determine whether optimization is needed and identify bottlenecks.
  • Determine JVM tuning quantification goals;
  • Determine JVM tuning parameters (adjusted based on historical JVM parameters);
  • Tuning memory, latency, throughput and other metrics in turn;
  • Compare and observe the difference before and after tuning;
  • Continue analyzing and tuning until you find the right JVM parameter configuration;
  • Find the most appropriate parameters, apply them to all servers, and follow up.

11.5 Common JVM Parameters

Heap and stack configuration

-Xmx3550m -Xms3550m -Xmn2g -Xss128k 
-XX:MaxPermSize=16m -XX:NewRatio=4 -XX:SurvivorRatio=4 -XX:MaxTenuringThreshold=0
  • -Xmx3550m: The maximum heap size is 3550m.
  • -Xms3550m: Sets the initial heap size to 3550m.
  • -Xmn2g: Sets the size of the young generation to 2g.
  • -Xss128k: The stack size of each thread is 128K.
  • -XX:MaxPermSize=16m: sets the permanent generation size to 16 MB.
  • -XX:NewRatio=4: sets the ratio of the young generation (including Eden and the two Survivor zones) to the old generation (excluding the permanent generation).
  • -XX:SurvivorRatio=4: sets the size ratio of the Eden zone to a Survivor zone in the young generation. If set to 4, the ratio of the two Survivor zones to one Eden zone is 2:4, so one Survivor zone takes up 1/6 of the whole young generation.
  • -XX:MaxTenuringThreshold=0: sets the maximum tenuring age. If set to 0, young generation objects go straight into the old generation without passing through the Survivor zones.

Garbage collector configuration

-XX:+UseParallelGC -XX:ParallelGCThreads=20 -XX:+UseConcMarkSweepGC
-XX:CMSFullGCsBeforeCompaction=5 -XX:+UseCMSCompactAtFullCollection
  • -XX:+UseParallelGC: selects the parallel collector as the garbage collector.
  • -XX:ParallelGCThreads=20: configures the number of threads for the parallel collector.
  • -XX:+UseConcMarkSweepGC: enables concurrent collection for the old generation.
  • -XX:CMSFullGCsBeforeCompaction: because the concurrent collector does not compact or defragment the memory space, "fragments" appear after it runs for a while, reducing efficiency. This value sets how many GC runs happen before the memory space is compacted.
  • -XX:+UseCMSCompactAtFullCollection: enables compaction of the old generation. Performance may be affected, but fragmentation is eliminated.
  • -XX:+UseConcMarkSweepGC: uses the CMS garbage collector.

Auxiliary information

-XX:+PrintGC
-XX:+PrintGCDetails

11.6 Common Tuning Policies

  • Choose the appropriate garbage collector
  • Adjust memory size (garbage collection is very frequent, if the memory is too small, adjust memory size appropriately)
  • Adjust the memory region size ratio (GC in one region is frequent, all others are normal).
  • Adjust the age threshold for promoting objects to the old generation (the old generation GCs frequently and many objects are collected each time).
  • Adjust the threshold for large objects (the old generation GCs frequently, many objects are collected each time, and individual objects are relatively large).
  • Adjust when GC is triggered (CMS or G1 does Full GC frequently and the program stutters severely).
  • Adjust the JVM's native memory size (GC counts, durations, and collected objects look normal and the heap has enough space, yet OOM is reported).

12. What are the disadvantages of sharding databases and tables?

  1. Transaction problems: local transactions can no longer be used; distributed transactions are needed.
  2. Cross-node join problems: can be solved by querying twice.
  3. Cross-node count, order by, group by and aggregate function problems: get results on each node and merge them on the application side.
  4. ID problems: after the database is sharded, you can no longer rely on the database's own primary key generation; the simplest alternative is UUID.
  5. Cross-shard sorting and pagination problems.

13. How are distributed transactions solved? Do you understand TCC?

Distributed transactions:

This means that transaction participants, transaction supporting servers, resource servers, and transaction managers are located on different nodes in different distributed systems. To put it simply, distributed transactions refer to transactions in distributed systems, which exist to ensure data consistency between different database nodes.

When it comes to distributed transactions, you need to know these two basic theories.

  • CAP theory
  • BASE theory

CAP theory

  • Consistency (C: Consistency) : Consistency refers to whether data is consistent across multiple copies. For example, after one partition node is updated, the data read from other partition nodes is also updated.
  • Availability (A): the services provided by the system must always be available, and the result of each operation request can always be returned within a limited time. The emphasis here is on "limited time" and "returning a result".
  • P:Partition tolerance: When a distributed system encounters any network Partition failure, it still needs to ensure that it can provide services that meet the requirements of consistency and availability.

BASE theory

It is an extension of AP in CAP: for our business systems, we consider sacrificing consistency in exchange for availability and partition tolerance. BASE is short for Basically Available, Soft state, and Eventually consistent.

  • Basically Available: This is achieved by supporting local failures rather than system-wide failures. If users are partitioned across five database servers, the failure of one user database affects only the 20% of users on that particular host, leaving the rest unaffected.
  • Soft state: intermediate states are allowed, and data may be out of sync for a period of time.
  • Eventually Consistent: It’s ok if the final data is Consistent, rather than strongly Consistent all the time.

Several solutions for distributed transactions:

  • 2PC (two-phase commit), 3PC
  • TCC (Try, Confirm, Cancel)
  • Local message table
  • Best effort notice
  • Seata transaction

TCC (Compensation Mechanism)

TCC uses a compensation mechanism. The core idea is that for every operation, a corresponding confirmation and compensation (undo) operation is registered. Try-Confirm-Cancel (TCC) consists of three stages:

  • Try phase: Perform the consistency check for all services and reserve necessary service resources.
  • Confirm phase: Services are submitted without any check because the try phase has already been checked. By default, no error occurs in the Confirm phase.
  • Cancel phase: This phase releases all service resources occupied by the try phase and rolls back all operations performed in the Confirm phase if services fail to be executed.

The following example is used to simulate the process of TCC implementing distributed transactions:

Let's say user A has a balance of 100 gold coins and 5 gifts. A spends 10 gold coins and places an order to buy 10 roses. Balance, orders, and gifts are in different databases.

Try phase of TCC:

  • Generate an order record with status "to be confirmed".
  • Update the gold coin balance in user A's account to 90 and freeze 10 gold coins (reserving the business resource).
  • Keep the user's gift count at 5 and record a pre-increase of 10 (reserving the business resource).
  • After the Try succeeds, the Confirm phase is entered
  • If any exception occurs during the Try process, it enters the Cancel phase

TCC Confirm phase:

  • Order status updated to paid
  • Update the user's balance to 90 and the frozen amount to 0
  • Update the user's gift count to 15 and the pre-increase to 0
  • If any exception occurs during the Confirm process, the Cancel phase is entered
  • If the Confirm process is successful, the transaction ends

Cancel phase of TCC:

  • Change the order status to Cancelled
  • Update the user balance back to 100
  • Update user gift count to 5

  • The advantage of TCC is that you can customize the granularity of database operations, reducing lock conflicts and improving performance.
  • The disadvantage of TCC is that it is highly intrusive to the application: different rollback strategies must be implemented for different failure causes (network failures, system failures, and so on), which is hard to do by hand. TCC is usually used with an open-source framework such as ByteTCC, tcc-transaction, or Hmily. (See the sketch below.)
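For illustration, here is a minimal sketch of what the TCC interface for the "deduct gold coins" branch might look like. The names are hypothetical; real projects usually implement this with a TCC framework such as tcc-transaction or Seata's TCC mode.

```java
public interface CoinAccountTccService {

    // Try: check the balance and freeze 10 coins (reserve the business resource); no real deduction yet
    boolean tryDeduct(String userId, long orderId, int amount);

    // Confirm: actually deduct the frozen coins; assumed to succeed because Try has already checked everything
    boolean confirmDeduct(String userId, long orderId);

    // Cancel: release the frozen coins and roll back the reservation made in Try
    boolean cancelDeduct(String userId, long orderId);
}
```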

If you are interested, you can read my previous article:

Essential for backend programmers: Distributed Transaction Basics

14. How does RocketMQ ensure the accuracy and security of messages?

As far as I'm concerned, this question is about making sure RocketMQ does not lose messages, does not consume messages twice, keeps message ordering, and handles message backlog.

To avoid losing messages, consider the producer side, the storage (broker) side, and the consumer side.

You can take a look at my previous article:

Message queue classic ten questions

15. Sum of three numbers

Given an array nums containing n integers, determine whether there are three elements a, b and c in nums such that a + b + c = 0. Find all unique triples that sum to 0.

Note: the answer must not contain duplicate triples.

Example 1:

Input: nums = [-1, 0, 1, 2, -1, -4]  Output: [[-1, -1, 2], [-1, 0, 1]]

Example 2:

Input: nums = [0]  Output: []

Ideas:

We can sort the array first, then use left and right pointers (the two-pointer technique).

The complete code is as follows:

import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class Solution {
    public List<List<Integer>> threeSum(int[] nums) {
        List<List<Integer>> result = new LinkedList<>();
        if (nums == null || nums.length < 3) {  // empty or fewer than 3 elements: return directly
            return result;
        }
        Arrays.sort(nums);  // sort first
        for (int i = 0; i < nums.length - 2; i++) {
            if (nums[i] > 0) {  // the smallest number is already positive, no triple can sum to 0
                break;
            }
            if (i > 0 && nums[i] == nums[i - 1]) {  // skip duplicates for the first number
                continue;
            }
            int left = i + 1;             // left pointer
            int right = nums.length - 1;  // right pointer
            int target = -nums[i];        // we need b + c = -a so that a + b + c = 0
            while (left < right) {
                if (nums[left] + nums[right] == target) {  // b + c = -a, so a + b + c = 0
                    result.add(Arrays.asList(nums[i], nums[left], nums[right]));
                    left++;
                    right--;
                    // skip duplicates for the second and third numbers
                    while (left < right && nums[left] == nums[left - 1]) left++;
                    while (left < right && nums[right] == nums[right + 1]) right--;
                } else if (nums[left] + nums[right] < target) {
                    left++;
                } else {
                    right--;
                }
            }
        }
        return result;
    }
}

Reference and thanks

  • Callable/Future usage and principle analysis
  • Summary of JVM tuning
  • Distributed theory (v) – Consistency algorithm Paxos
  • This must be the best Paxos consistency algorithm in the whole web