Preface

Brothers, just as the title says, Anjiang got beaten black and blue in this interview. It all boils down to one thing: I'm just not good enough (really not good enough).

What a sad story. To be fair, I was pretty confident before I went in. As the interview went on, I found I couldn't hold up: I really don't know database internals or networking (it's not that I forgot them), and the interviewer picked exactly those topics to ask about instead of the technical points on my resume.

The company is said to have been founded by a team led by a tech guru from Weibo (it's in Guangzhou, of course; if any brother wants to take on the challenge, let me know in the comments). Once inside I also did a written test, and to be fair, the questions were pitched straight at ByteDance level: a coin probability problem, two algorithm problems (I remember one had to be done in place), a question on MySQL's implementation of MVCC, and, I think, something on synchronized and volatile. That took 45 minutes.

Think that was the end of it? The interviewer was even more brutal: he basically cut me off every time I mentioned a technical point and followed up on it. For example, the question about the difference between HTTP and Netty data transfer grew out of the question about the difference between Feign and Dubbo. How should I put it? This style is hard to withstand!!

Below I've recorded some of the questions I didn't answer well. Anjiang can't say whether they are hard or not (the pictures are borrowed from others; the links are at the end).

What is the underlying data structure of a MySQL index

The data structure used by a MySQL index is the B+ tree (as shown below). The main difference from a B-tree is that a B+ tree distinguishes leaf nodes from non-leaf nodes: only leaf nodes store data, while non-leaf nodes store only keys.

Leaf nodes are connected by bidirectional pointers, so the bottom level of the tree forms an ordered doubly linked list.
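To make the leaf-level linked list concrete, here is a minimal Python sketch. The `Leaf` class, the helper names, and the sample keys are all made up for illustration; this is not how InnoDB lays out pages. A real lookup would first descend from the root to find the starting leaf; the sketch simply starts at the leftmost leaf.

```python
from typing import List, Optional


class Leaf:
    """One leaf page: sorted keys plus links to the neighbouring leaves."""

    def __init__(self, keys: List[int]) -> None:
        self.keys = keys
        self.prev: Optional["Leaf"] = None
        self.next: Optional["Leaf"] = None


def link_leaves(pages: List[List[int]]) -> Leaf:
    """Chain leaf pages into a doubly linked list and return the first leaf."""
    leaves = [Leaf(keys) for keys in pages]
    for left, right in zip(leaves, leaves[1:]):
        left.next, right.prev = right, left
    return leaves[0]


def range_scan(start_leaf: Leaf, low: int, high: int) -> List[int]:
    """Collect keys in [low, high] by walking the leaf chain left to right."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # keys are sorted, so nothing later can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


# Hypothetical leaf contents, already sorted across pages.
first = link_leaves([[1, 3, 5], [7, 8, 9], [12, 15, 20]])
print(range_scan(first, 5, 12))  # [5, 7, 8, 9, 12]
```

The scan never goes back up into the non-leaf levels: once it is on the leaf chain it only follows `next` pointers.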

Why does the MySQL index use a B+ tree rather than a red-black tree or some other tree

My answer amounted to nothing (I blanked at that moment; I actually knew it, but I couldn't get it out).

  1. A B+ tree is a multi-way balanced search tree. Compared with a red-black tree or any other binary tree, the height of the whole tree is greatly reduced, which means fewer node visits per index search and therefore better query efficiency.

  2. A B+ tree distinguishes leaf nodes from non-leaf nodes, and only leaf nodes actually store data. This matters because MySQL's InnoDB storage engine reads one page at a time (the default page size is 16 KB). In a B-tree, both leaf and non-leaf nodes store the real data, so the more columns a row has, the more space each entry takes, the fewer keys fit in one page, the taller the tree becomes, and the more disk I/Os a lookup needs.

  3. A B+ tree supports range lookups better. A B-tree can also do range lookups, but it has to move up and down the tree to reach the next key. A B+ tree already maintains an ordered doubly linked list across its leaf nodes, so it is a natural fit for range lookups.
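A rough back-of-the-envelope calculation shows how large the height difference in points 1 and 2 can be. Only the 16 KB page size comes from the text above; the 8-byte key, 6-byte child pointer, and 1 KB row size are assumptions chosen for illustration, and the B-tree estimate ignores pointer overhead, so treat the numbers as a sketch rather than a measurement.

```python
import math

PAGE_SIZE = 16 * 1024   # InnoDB default page size, from the text
KEY_BYTES = 8           # assumption: BIGINT primary key
POINTER_BYTES = 6       # assumption: size of a child-page pointer
ROW_BYTES = 1024        # assumption: about 1 KB per row


def bplus_levels(rows: int) -> int:
    """Levels of a B+ tree whose non-leaf pages hold only keys and pointers."""
    fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)
    rows_per_leaf = PAGE_SIZE // ROW_BYTES
    pages = math.ceil(rows / rows_per_leaf)   # number of leaf pages
    levels = 1
    while pages > 1:
        pages = math.ceil(pages / fanout)     # pages needed one level up
        levels += 1
    return levels


def btree_levels(rows: int) -> int:
    """Best-case levels of a B-tree that keeps whole rows in every node."""
    keys_per_node = PAGE_SIZE // ROW_BYTES
    levels = 1
    capacity = keys_per_node
    while capacity < rows:
        levels += 1
        capacity = (keys_per_node + 1) ** levels - 1
    return levels


def binary_levels(rows: int) -> int:
    """Levels of a perfectly balanced binary tree with one row per node."""
    return rows.bit_length()


rows = 20_000_000
print(bplus_levels(rows), btree_levels(rows), binary_levels(rows))  # 3 6 25
```

Under these assumptions, a lookup among 20 million rows touches 3 pages in the B+ tree, about 6 in the B-tree, and 25 nodes in a balanced binary tree. A red-black tree is not perfectly balanced, so it would not do better than the binary figure.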

How is a MySQL index file loaded into memory

I didn't understand what the interviewer meant during the interview, but it suddenly came to me on the way to the bus afterwards. What the interviewer most likely wanted was the process of an index lookup, which involves loading the index file page by page (what a pity).

The main point here is that the index file is loaded page by page. For example, for an equality lookup on ID (the primary key index), the process is as follows:



Suppose we query for the row whose key equals 9. The query path is disk block 1 -> disk block 2 -> disk block 6.

First disk I/O: load disk block 1 into memory and compare from the beginning in memory. Since 9 < 15, follow the left pointer to the disk address of disk block 2.

Second disk I/O: load disk block 2 into memory and compare from the beginning in memory. Since 7 < 9 < 12, follow the pointer to the disk address of disk block 6.

Third disk I/O: load disk block 6 into memory and compare from the beginning in memory; 9 is found at the third entry, so its Data is fetched. If Data holds the row record, the row is returned and the query ends. If Data holds a disk address, the row must still be read from disk at that address before the query ends. (The difference: in InnoDB, Data holds the row itself, while in MyISAM it holds a disk address.)
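The three-step walk above can be simulated in a few lines of Python. The `DISK` dictionary and its contents are hypothetical: the keys 15, 7, 12 and 9 come from the walk-through, while everything else (the second root key, the other leaf entries, the row strings) is invented so the example runs. Only the pages on the path are populated.

```python
import bisect

# Hypothetical pages keyed by disk block number. Only the blocks on the
# path described above are filled in; the rest of the tree is omitted.
DISK = {
    1: {"keys": [15, 56], "children": [2, 3, 4]},
    2: {"keys": [7, 12], "children": [5, 6, 7]},
    6: {"keys": [7, 8, 9, 10], "rows": ["row-7", "row-8", "row-9", "row-10"]},
}


def lookup(key):
    """Return (row or None, list of block numbers loaded, in order)."""
    loaded = []
    block_no = 1                      # start at the root page
    while True:
        page = DISK[block_no]         # one simulated disk I/O
        loaded.append(block_no)
        if "rows" in page:            # leaf page: search it in memory
            i = bisect.bisect_left(page["keys"], key)
            found = i < len(page["keys"]) and page["keys"][i] == key
            return (page["rows"][i] if found else None), loaded
        # Non-leaf page: pick the child whose key range contains `key`.
        child = bisect.bisect_right(page["keys"], key)
        block_no = page["children"][child]


row, path = lookup(9)
print(row, path)  # row-9 [1, 2, 6]
```

Each pass through the loop corresponds to one disk I/O, and all comparisons inside a page happen in memory after the page has been loaded.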

How does TCP ensure reliable transmission

When the interviewer asked me this, my answer was "I don't know" (really, I didn't)!!

This article only goes over the main points of TCP reliability without any in-depth analysis (I couldn't do one anyway). It's time to study networking.

  1. Connection management. TCP uses a three-way handshake to establish a connection and a four-way handshake to close it. (At the time this was the only point I gave, and the interviewer shot back: "And that alone guarantees safe, reliable transmission?" I really wanted to say yes.)

  2. Checksum. TCP computes its checksum the same way UDP does: a 12-byte pseudo-header is added for the calculation, and the checksum covers the TCP header and the data. The difference is that the checksum is optional in UDP (over IPv4) but mandatory in TCP. The calculation: the sender splits the whole segment into 16-bit words, adds them all with one's-complement arithmetic, and stores the complement of the sum in the checksum field. The receiver adds up all the words the same way, checksum included; if every bit of the result is 1 the segment is accepted (equivalently, complementing the result gives 0), otherwise it contains an error.

  3. Sequence numbers. TCP numbers every byte of data it sends; that number is the sequence number. It serves the following purposes:

  • 3.1 Ensuring reliability (if the data for some sequence number is missing from what has been received, the receiver knows immediately).
  • 3.2 Ensuring that data is delivered in order.
  • 3.3 Improving efficiency: several segments can be sent and then confirmed with a single acknowledgement.
  • 3.4 Removing duplicate data.
  4. Acknowledgement mechanism (ACK). TCP achieves reliable data transmission through acknowledgements. The TCP header contains a flag bit, ACK, which indicates whether the acknowledgement number is valid. The receiver acknowledges data that has arrived in order. When the ACK bit is 1, the acknowledgement number field in the header is valid, and its value means that all data before that number has arrived in order. If the sender receives an acknowledgement for the data it sent, it goes on to transmit the next part of the data. If no acknowledgement arrives within a certain period of time, the retransmission mechanism kicks in.

  5. Timeout retransmission. If a segment is not acknowledged by the receiver within a certain period of time, the sender retransmits it (typically a timer is started when the segment is sent, and the segment is retransmitted if no acknowledgement has arrived when the timer expires).

  6. Flow control. The receiver can only process data at a limited speed. If the sender sends too fast, the receiver's buffer fills up; if the sender keeps sending, packets are dropped, which sets off a chain reaction of packet loss and retransmission. TCP therefore has the receiver advertise its remaining buffer space in the window field of the TCP header, and the sender keeps the amount of unacknowledged data within that window.

  7. Congestion control. Flow control deals with packet loss caused by a mismatch in speed between the two hosts, which covers one side of TCP reliability. However, if the network itself is heavily congested, sending more data only adds to the load, and segments may exceed their maximum lifetime without reaching the receiver, which again means packet loss. TCP uses slow start: it first sends a small amount of data, like a scout, to find out how congested the network is, and then decides how fast to send. This is where the congestion window comes in.
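The checksum calculation from point 2 can be sketched in Python as follows. This is a simplified illustration, not a full TCP implementation: it leaves out the pseudo-header, and it appends the checksum to the end of the data instead of placing it in a header field. The function names are my own.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit big-endian words, folding any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def checksum(data: bytes) -> int:
    """Sender side: the complement of the one's-complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def verify(data_with_checksum: bytes) -> bool:
    """Receiver side: summing everything, checksum included, gives all 1s."""
    return ones_complement_sum(data_with_checksum) == 0xFFFF


payload = b"hello, tcp!!"                        # even length keeps words aligned
packet = payload + checksum(payload).to_bytes(2, "big")
print(verify(packet))                            # True

damaged = bytearray(packet)
damaged[0] ^= 0x01                               # flip one bit in transit
print(verify(bytes(damaged)))                    # False
```

The receiver does not need to know where the checksum sits: because addition is commutative, summing every 16-bit word, checksum included, yields 0xFFFF for an undamaged segment.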

The difference between HTTP and Netty data transmission

Actually, this question came out of the discussion about RPC. I had just been asked what the difference between Dubbo and Feign is. Then Anjiang walked right into the trap: I said that Dubbo uses Netty for transport and then invokes the local method through reflection, while Feign calls HTTP interfaces in the style of RestTemplate. That was exactly what the interviewer was waiting for; it led to this question, and my answer amounted to nothing.

The comparison here is really between HTTP and sockets, since Netty itself also supports HTTP.

  1. HTTP is an application-layer protocol. A socket is not a protocol; it is the programming interface through which applications talk to TCP/IP.

  2. HTTP typically uses short connections (long connections are supported, but how long a connection stays open is decided by the server). A socket connection is a long connection. Normally a socket connection is a TCP connection, so once it is established the two sides can keep sending data to each other until the connection is closed.

  3. HTTP is based on the request/response model: during a request, the roles of server and client cannot be swapped. A socket is full-duplex: the server and the client can each send to the other at any time.

HTTP is the car: it provides the concrete form for packaging or presenting data. The socket is the engine: it provides the ability to communicate over the network.
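A small Python sketch can illustrate the split between the two. It uses `socket.socketpair()` to get two already-connected sockets standing in for client and server, so no real network is involved; the host name `example.local` and the message contents are made up. The messages are small enough here that a single `recv` returns each one, which real code should not rely on.

```python
import socket

# Two connected sockets standing in for a client and a server.
client, server = socket.socketpair()

# The socket only moves bytes. HTTP is the format of those bytes.
client.sendall(b"GET /hello HTTP/1.1\r\nHost: example.local\r\n\r\n")
request = server.recv(4096).decode("ascii")
method, path, version = request.split("\r\n")[0].split(" ")
print(method, path, version)  # GET /hello HTTP/1.1

# The server answers in HTTP's response format: request first, then response.
body = b"hi"
server.sendall(
    b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s" % (len(body), body)
)
response = client.recv(4096)

# Nothing in the socket API itself forces that order: the server end can
# just as well send without having been asked.
server.sendall(b"server speaks first")
pushed = client.recv(4096)
print(pushed)  # b'server speaks first'

client.close()
server.close()
```

The request/response discipline lives entirely in how the bytes are formatted and in what order the two programs choose to send them; the socket underneath stays a two-way byte channel.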

Reference:

MySQL > select * from ‘MySQL’

How does TCP ensure reliability