Preface

Today we take a deep dive into a classic interview question: "Describe the full path from typing a keyword into Taobao to the final page of goods being displayed." It is harder than it looks. On the hardware side it involves how switches, routers, and network cards work; on the software side it involves TCP and LVS. Many explanations on the Internet contain mistakes or lack detail, and gloss over key points in a single sentence. This article uses many diagrams to explain in detail how the hardware and software in a network work, and I believe you will get something out of it. Writing a long post like this is not easy, so a like, a favorite, and a share would be much appreciated.

A friendly note: I will cover how LVS works in the next post, and LVS will be much easier to follow after reading this one.

Next, we will deeply analyze the communication process between A and B in the figure below. Sit tight and start!

Understanding network concepts through real-life scenarios

Many technical terms are hard to grasp on their own, but they become clear at a glance once we map them onto everyday scenes. The design ideas behind many technologies come from the small details of daily life. Below we use a school as an example to understand the design of a network step by step.

Xuejun Elementary School has just opened for the term. On the first day, all the students of Class 1, Grade 1 are present. The new teacher does not know any of them yet, but that does not matter, because she has the student list.

Now the teacher wants to find out which student in the class is "Li Si", so she calls out "Li Si" to the whole class. Everyone hears her, but because the name called is Li Si, the others stay silent and only Li Si answers "here".

The teacher then links the name Li Si on the list with the actual person. The next time she looks for Li Si, she can go straight to that student without shouting to the whole class. In the same way, if the teacher wants to find anyone else, she shouts once and waits for that student to respond. We call this approach broadcast. Once she has matched a name to a person, later communication is one-to-one.

So why doesn't the teacher go to another class to look for Zhang San and Li Si? After all, other classes might also have students with those names. Notice that every entry on the student list has the same prefix.

The prefix is "Xuejun Elementary School, Class 1, Grade 1", so the teacher only looks for people in Class 1, Grade 1 and does not run off to other classes (she could shout in other classes, or in all of them, but no one would respond, so there is no point). To summarize, if the teacher wants to know which student is "Xuejun Elementary School, Class 1, Grade 1, Li Si", she needs to:

  1. First, find where Class 1, Grade 1 of Xuejun Elementary School is

  2. Shout "Li Si" in Class 1, Grade 1. The other students hear the shout but stay silent because the call is not for them; only Li Si answers "here"

  3. Match the name to the person, so that next time she can go to Li Si directly instead of shouting

That was within the same class. What about students in different classes? For example, what should Zhang San of Class 1, Grade 1 do if he needs to find Wang Wu of Class 2, Grade 1? First check whether the prefixes, that is, the class names, are the same.

They are obviously different, so Zhang San cannot just shout in his own class. He has to go out the door, find "Xuejun Elementary School, Class 2, Grade 1", and shout "Wang Wu" there. Wang Wu answers "here", and Zhang San now knows who Wang Wu is, so afterwards he can go to Class 2 and talk to Wang Wu directly.

Here we already have the prototype of a network. Now let's see how the class, the students, and so on in this example correspond to network terminology. I believe it will all click into place.

  • Host: student

  • Subnet: a class corresponds to a subnet. The Internet is made up of countless subnets; to communicate with a computer, you must first find the subnet it is in

  • Network address: "Xuejun Elementary School, Class 1, Grade 1", that is, the class name, is equivalent to the network address (the subnet number). It is mainly used to determine whether hosts are in the same subnet

  • Host address: "Zhang San" and "Li Si" are equivalent to host addresses. Once the class is determined, host addresses can be assigned to its students

  • IP address: as you can see, an IP address is the network address plus the host address. A host's IP address (IPv4) is a 32-bit binary number, that is, 4 bytes, and is usually written in dotted decimal form (A.B.C.D), such as 192.168.1.10

  • Subnet mask: although an IP address contains the network address (for example, the network address of "Xuejun Elementary School, Class 1, Grade 1, Li Si" is "Xuejun Elementary School, Class 1, Grade 1"), the IP address alone does not tell you where the network address ends. You must specify how many leading positions form the network address. In the school example the network address is the first 9 characters (the length of the class name), so we could write "Xuejun Elementary School, Class 1, Grade 1, Li Si/9". On a computer we write 192.168.1.10/24, meaning the first 24 bits of the IP address are the network address. The value that marks which bits of an IP address identify the subnet and which identify the host is called the subnet mask

  • MAC address: a student's "face + ID" is equivalent to a computer's MAC address, that is, the address of its network card. Every network card address is meant to be unique in the world. A MAC address has 48 bits and is usually written as 12 hexadecimal digits, with every two digits separated by a colon, such as 08:00:20:0A:8C:6D. Once the subnet is found, you first find the MAC address from the IP address and then find the computer from the MAC address. Generally, one MAC corresponds to one IP (a student is assigned one number in the class)

  • Gateway: if two students are not in the same class (not in the same subnet), one has to go out the door to find the other. The door is equivalent to the gateway
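The notation in the list above can be tried out with Python's standard `ipaddress` module. This is only a small sketch: the helper names `split_address` and `format_mac` are made up for illustration and are not part of any library.

```python
import ipaddress


def split_address(cidr: str):
    """Split 'A.B.C.D/prefix' into (network address, subnet mask, host part)."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network.network_address   # the "class name"
    mask = iface.network.netmask               # which bits are the network part
    # The host part is whatever is left after removing the network bits.
    host_part = int(iface.ip) & int(iface.network.hostmask)
    return str(network), str(mask), host_part


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as colon-separated pairs of hexadecimal digits."""
    if len(raw) != 6:
        raise ValueError("a MAC address is 48 bits (6 bytes)")
    return ":".join(f"{byte:02x}" for byte in raw)


print(split_address("192.168.1.10/24"))
print(format_mac(bytes.fromhex("0800200a8c6d")))
```

For 192.168.1.10/24 the network part is 192.168.1.0 and the host part is 10, which mirrors "class name + student name" in the school example.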

One more point is implied here. Before a student can be called on, he or she must first register at the school, which then assigns the student to a class based on "face + ID" (the MAC address), for example "Xuejun Elementary School, Class 1, Grade 1, Zhang San" (the corresponding IP address). In other words, a computer must be assigned an IP address in a subnet before it can communicate. This is easy to understand: a student who is not in any class (has no number) cannot hear the teachers or classmates of any class calling. So we get the basic structure of the network as follows

At this point you know the basics of the network. All that’s left is a few implementation nuances. The rest of the article will show you that our example is strikingly similar to how the network works

How do computers communicate

With that knowledge in mind, let's look at how two computers communicate. From the analysis in the last section, there are only two situations in which computers communicate

  1. Communication between two computers in the same subnet (in the same class)

  2. Communication between two computers not in the same subnet (not in the same class)

1. The two computers belong to the same subnet

People can communicate by voice, but computers must be connected through network cables

But a computer generally has only one network interface and can connect to only one network cable, while a subnet may contain hundreds of machines. How do we connect them all?

There is a famous saying in computing: there is no problem that cannot be solved by adding another layer, and if there is, add one more. So we add a middle layer: every computer connects to the middle layer, and the middle layer forwards traffic between them

The birth of the IP

As mentioned in the previous section, a computer must be assigned an IP address in a subnet before it can communicate. You can specify one manually for every machine (static IP), but manual configuration is tedious, so IP addresses are usually assigned dynamically by a DHCP server. Suppose A is a computer that has just joined the subnet and has no IP address yet. It sends a broadcast packet containing its own MAC address. Because it is a broadcast packet, every machine receives it, but only the DHCP server responds, with a packet carrying the IP address assigned to A

The procedure for obtaining an IP address is as follows

  1. First, A sends a DHCP DISCOVER broadcast packet. Since A does not have an IP address yet, the source address is set to 0.0.0.0. Since A has just connected and does not know the DHCP server's IP address, the destination address is set to 255.255.255.255 (the limited broadcast address, which is only broadcast within the local subnet). All machines receive the packet, but only the DHCP server responds

  2. The DHCP server replies with a broadcast packet (a DHCP OFFER) containing the assigned IP address and A's MAC address. All machines receive this packet and compare the MAC address in it with their own; only A matches, so only A accepts the offered IP address

  3. A then sends a request for that IP address (a DHCP REQUEST). After receiving it, the DHCP server records that the IP address is in use and sends a DHCP ACK to confirm that A may use it

Note: to avoid getting lost in detail, the DHCP process has been simplified here and differs from the real protocol, but this does not affect the overall understanding

In this way, every machine that joins the subnet obtains its own IP address through DHCP
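The simplified exchange above can be modeled in a few lines of Python. This is a toy simulation under loud assumptions, not the real DHCP wire protocol: packets are plain dictionaries, the class `DhcpServer` and the function `obtain_ip` are hypothetical names, and the address pool is invented for the example.

```python
BROADCAST_IP = "255.255.255.255"   # limited broadcast address
UNSPECIFIED_IP = "0.0.0.0"         # source used before a client has an IP


class DhcpServer:
    """Toy DHCP server that hands out addresses from a fixed pool."""

    def __init__(self, pool):
        self.free = list(pool)   # addresses not yet handed out
        self.offered = {}        # client MAC -> IP offered, not yet confirmed
        self.leases = {}         # client MAC -> IP recorded as in use

    def on_discover(self, packet):
        ip = self.free.pop(0)    # sketch: no handling of an empty pool
        self.offered[packet["client_mac"]] = ip
        return {"type": "OFFER", "dst": BROADCAST_IP,
                "client_mac": packet["client_mac"], "ip": ip}

    def on_request(self, packet):
        mac = packet["client_mac"]
        ip = self.offered.pop(mac)
        self.leases[mac] = ip    # record that the address is now in use
        return {"type": "ACK", "dst": BROADCAST_IP, "client_mac": mac, "ip": ip}


def obtain_ip(client_mac, other_macs, server):
    """Run DISCOVER -> OFFER -> REQUEST -> ACK for one client."""
    discover = {"type": "DISCOVER", "src": UNSPECIFIED_IP,
                "dst": BROADCAST_IP, "client_mac": client_mac}
    offer = server.on_discover(discover)
    # Every machine sees the broadcast offer; only the matching MAC accepts it.
    accepted_by = [mac for mac in [client_mac, *other_macs]
                   if mac == offer["client_mac"]]
    request = {"type": "REQUEST", "src": UNSPECIFIED_IP, "dst": BROADCAST_IP,
               "client_mac": client_mac, "ip": offer["ip"]}
    ack = server.on_request(request)
    return ack["ip"], accepted_by


server = DhcpServer(["192.168.1.10", "192.168.1.11"])
print(obtain_ip("macA", ["macB", "macC"], server))
```

Here `macA`, `macB`, and `macC` stand in for real MAC addresses; only the machine whose MAC matches the offer ends up with the lease.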

Ok, now that each machine has been assigned an IP address and the MAC address is known (the network card comes with it), how do two machines communicate?

Suppose A knows D's IP address. To see how A and D communicate, A first has to determine whether D is in the same subnet. How? With the subnet mask mentioned in the previous section: when a network is set up, we specify which bits are the network address and which bits are the host address. Two hosts with the same network address are in the same subnet

An IP address has 32 bits. If we specify the first 24 bits as the network address, then the last 8 bits are the host address

If the first 24 bits are the network address, we need to keep the first 24 bits and clear the remaining 8 bits to 0. To do this, AND the IP address with a mask whose first 24 bits are 1 and whose last 8 bits are 0 (255.255.255.0 in decimal). For 192.168.1.10 the result is 11000000.10101000.00000001.00000000, which is 192.168.1.0 in decimal

Since the first 24 bits are network numbers, the network addresses of A (192.168.1.10/24) and D (192.168.1.13/24) are the same (192.168.1.0), which means they are on the same subnet
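The masking step can be checked by hand with plain integer arithmetic. The following is a minimal sketch; the function names are chosen for illustration only.

```python
def to_int(dotted: str) -> int:
    """Pack a dotted-decimal IPv4 address into one 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def prefix_to_mask(prefix_len: int) -> int:
    """A /24 prefix becomes 24 one-bits followed by 8 zero-bits."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF


def network_address(ip: str, prefix_len: int) -> str:
    """Bitwise AND of the address and the mask keeps only the network bits."""
    return to_dotted(to_int(ip) & prefix_to_mask(prefix_len))


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    return network_address(ip_a, prefix_len) == network_address(ip_b, prefix_len)


print(to_dotted(prefix_to_mask(24)))                      # 255.255.255.0
print(network_address("192.168.1.10", 24))                # 192.168.1.0
print(same_subnet("192.168.1.10", "192.168.1.13", 24))    # True
print(same_subnet("192.168.1.10", "192.168.2.11", 24))    # False
```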

ARP

Since they are in the same subnet, it is easy for A to find D. At the beginning, A knows D's IP address but cannot match it to a specific machine, just like the new head teacher who holds the student list but cannot match names to faces. So A sends a broadcast packet in the subnet asking: which machine has the IP address 192.168.1.13?

Since it is a broadcast packet, B, C, and D all receive it. Each compares the requested IP address with its own, and only D matches, so B and C do not respond while D replies with its MAC address. A then records locally that 192.168.1.13 maps to macD (D's MAC address). Note that A's broadcast packet contains A's own IP and MAC addresses, so D, on receiving it, also records locally that 192.168.1.10 maps to macA. This process is called the Address Resolution Protocol (ARP), a protocol for obtaining a MAC address from an IP address.
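As a rough illustration of the ARP exchange, here is a toy model in which a "broadcast" is just a loop over the hosts in the subnet. The `Host` class and `arp_resolve` function are hypothetical, and the IP addresses given to B and C are assumptions (the article only states the addresses of A and D).

```python
class Host:
    def __init__(self, name, ip, mac):
        self.name, self.ip, self.mac = name, ip, mac
        self.arp_cache = {}   # IP address -> MAC address

    def on_arp_request(self, sender_ip, sender_mac, target_ip):
        """Handle a broadcast ARP request; reply only if we own target_ip."""
        if target_ip != self.ip:
            return None                       # not for us: stay silent
        self.arp_cache[sender_ip] = sender_mac  # learn the asker's mapping
        return self.mac


def arp_resolve(sender, target_ip, subnet_hosts):
    """Return the MAC for target_ip, broadcasting only on a cache miss."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    for host in subnet_hosts:                 # every host hears the broadcast
        if host is sender:
            continue
        reply = host.on_arp_request(sender.ip, sender.mac, target_ip)
        if reply is not None:
            sender.arp_cache[target_ip] = reply
            return reply
    return None                               # nobody owns that address


a = Host("A", "192.168.1.10", "macA")
b = Host("B", "192.168.1.11", "macB")   # assumed address
c = Host("C", "192.168.1.12", "macC")   # assumed address
d = Host("D", "192.168.1.13", "macD")
print(arp_resolve(a, "192.168.1.13", [a, b, c, d]))
print(a.arp_cache, d.arp_cache)
```

After one exchange both A and D hold each other's mapping, while B and C have learned nothing, which matches the description above.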

In the process, if the intermediate device is a switch, it also notes which port A's packet arrived on and records that this port is connected to A (identified by A's MAC address). When D responds through port 4, the switch records that port 4 is connected to D. So after this ARP exchange, the records on A, D, and the switch are as follows

The switch keeps updating its table of machines and the ports they are connected to as ARP packets (and any other frames) pass through it

After this, it is easy for A to send data to D: the packet carries A's IP, A's MAC, D's IP, and D's MAC

When the switch receives the packet, it looks at the destination MAC address, macD, and looks it up in its table. macD is reached through port 4, so the switch forwards the packet directly to D with no need to broadcast. Because the switch only records the mapping between MAC addresses and ports, IP is not involved.
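The switch behaviour described above (learn the source MAC, then forward by destination MAC or flood) might be sketched as follows. The `Switch` class is a hypothetical model, and placing A on port 1 is an assumption made for the example; the article only states that D is on port 4.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class Switch:
    """Toy layer 2 switch that learns which MAC address sits behind each port."""

    def __init__(self, port_count):
        self.ports = list(range(1, port_count + 1))
        self.mac_table = {}   # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Learn from the source, then return the ports to send the frame out of."""
        self.mac_table[src_mac] = in_port
        if dst_mac != BROADCAST_MAC and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]            # known: one port only
        return [p for p in self.ports if p != in_port]  # unknown or broadcast


switch = Switch(port_count=4)
print(switch.receive(1, "macA", BROADCAST_MAC))  # A's ARP request is flooded
print(switch.receive(4, "macD", "macA"))         # D's reply goes only to A's port
print(switch.receive(1, "macA", "macD"))         # later data goes only to port 4
print(switch.mac_table)
```

Note that the table never mentions an IP address, which is the point the paragraph above makes.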

That covers communication between two computers on the same subnet. Now let's look at how two computers on different subnets communicate.

2. The two computers belong to different subnets

This case is like students from two different classes communicating: you must first go out the door, find the other class, and then find the student. In networking this door is called the default gateway. The role of gateway is generally played by a router, and by convention the gateway address is usually the first host address of the subnet: if the network address is 192.168.1.0/24, the default gateway IP address is 192.168.1.1. Each host in the subnet is configured with a default gateway

Now let's see how A communicates with D

  1. First, determine whether D and A are in the same subnet. A's subnet is 192.168.1.0/24, that is, its network address is 192.168.1.0 and its subnet mask is 255.255.255.0. ANDing A's subnet mask with D's IP address, 192.168.2.11, gives 192.168.2.0, which differs from 192.168.1.0, so A and D are not on the same subnet

  2. At first, A does not know the MAC address of the gateway, so A first sends an ARP packet to obtain the MAC address of the gateway, and then sends the following packet to the gateway

  

Note: the destination MAC is the gateway's MAC, but the destination IP is still D's IP, not the gateway's IP! It is not hard to see why: a parcel may pass through several transfer stations (MAC addresses) on the way, but its final destination address (IP) cannot change
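The sending host's decision in steps 1 and 2 can be sketched like this. The function `build_frame` is a hypothetical helper, frames are plain dictionaries, and `arp_table` stands in for whatever ARP has already resolved; the MAC names are placeholders.

```python
import ipaddress


def build_frame(src_ip, src_mac, prefix_len, gateway_ip, dst_ip, arp_table):
    """Pick the layer 2 destination: the peer itself, or the default gateway."""
    local_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    on_same_subnet = ipaddress.ip_address(dst_ip) in local_net
    # Off-subnet traffic is handed to the gateway, so we need ITS MAC address.
    arp_target = dst_ip if on_same_subnet else gateway_ip
    return {
        "src_mac": src_mac,
        "dst_mac": arp_table[arp_target],
        "src_ip": src_ip,
        "dst_ip": dst_ip,          # always the final destination, never the gateway
    }


arp_table = {"192.168.1.1": "macGateway", "192.168.1.13": "macD1"}
frame = build_frame("192.168.1.10", "macA", 24, "192.168.1.1",
                    "192.168.2.11", arp_table)
print(frame)
```

For the off-subnet destination 192.168.2.11 the frame carries the gateway's MAC but D's IP, exactly the combination the note above points out.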

  3. When the router receives the packet, it takes out the destination IP address and looks it up in the routing table

| Destination address | Subnet mask | Next hop | Port |
| ----------- | ------------- | ------ | ---- |
| 192.168.1.1 | 255.255.255.0 | | 0 |
| 192.168.2.1 | 255.255.255.0 | | 1 |
| 192.168.2.3 | 255.255.255.0 | | 2 |

Each item in the routing table consists of a destination address, subnet mask, next hop, and port. The destination address and subnet mask are used to calculate the network address. Take the second item as an example: all IP addresses whose network address is 192.168.2.1 & 255.255.255.0 = 192.168.2.0 go out through port 1

The router matches the destination IP address against each item and forwards the packet through the port of the matching item

Here the destination IP address is 192.168.2.11, whose network address is 192.168.2.0, so it matches the second item and the packet should go out through port 1. The router therefore first uses ARP to obtain the MAC address corresponding to this IP address, and then changes the destination MAC address of the packet to D's MAC address. Note that the source MAC address must also be changed to the MAC address of the router's outgoing port

  

During forwarding, the source and destination IP addresses do not change, but the source and destination MAC addresses change at every hop

   

  4. The switch forwards the above packet to D

This is why routers are layer 3 devices. Layer 3 is the network layer, which is responsible for addressing by IP. In addition, it is not hard to see that each port on a router is its own broadcast domain
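The router's lookup-and-rewrite step described above can be sketched as follows. The routing entries are copied from the article's table; everything else (the function names, the dictionary-based frame, the MAC placeholders) is an assumption for illustration. The sketch simply takes the first matching entry, whereas real routers pick the most specific (longest-prefix) match.

```python
import ipaddress

# Routing table from the article: (destination address, subnet mask, port).
ROUTES = [
    ("192.168.1.1", "255.255.255.0", 0),
    ("192.168.2.1", "255.255.255.0", 1),
    ("192.168.2.3", "255.255.255.0", 2),
]


def network_of(ip: str, mask: str) -> int:
    """Network address as an integer: address AND mask."""
    return int(ipaddress.ip_address(ip)) & int(ipaddress.ip_address(mask))


def lookup_port(dst_ip, routes=ROUTES):
    """Return the port of the first entry whose network contains dst_ip."""
    for destination, mask, port in routes:
        if network_of(dst_ip, mask) == network_of(destination, mask):
            return port
    return None


def forward(frame, port_macs, arp_table, routes=ROUTES):
    """Rewrite only the MAC addresses; the IP addresses stay untouched."""
    port = lookup_port(frame["dst_ip"], routes)
    if port is None:
        return None                                    # no route: drop
    rewritten = dict(frame)
    rewritten["src_mac"] = port_macs[port]             # router's outgoing port
    rewritten["dst_mac"] = arp_table[frame["dst_ip"]]  # resolved via ARP
    return port, rewritten


frame = {"src_mac": "macA", "dst_mac": "macGateway",
         "src_ip": "192.168.1.10", "dst_ip": "192.168.2.11"}
print(forward(frame, {0: "macR0", 1: "macR1", 2: "macR2"},
              {"192.168.2.11": "macD"}))
```

Comparing the frame before and after shows the rule of thumb directly: both MAC fields change, both IP fields do not.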

How do routers communicate with each other

If there are multiple routers on the same network, how do the hosts communicate with each other

What is the process if A wants to communicate with D? First, the path from A to router 1 is the same as before. The key question now is how router 1 forwards packets to router 2. The answer is again the routing table, but this time the table is a little different from before

| Destination address | Subnet mask | Next hop | Port |
| ----------- | ------------- | ------------ | ---- |
| 192.168.2.0 | 255.255.255.0 | 192.168.10.6 | |
| 192.168.3.0 | 255.255.255.0 | | 0 |
| 192.168.4.0 | 255.255.255.0 | | 1 |

Looking at the first record, there is a next hop, 192.168.10.6, which corresponds to the port on router 2. The main reason for using a next hop here instead of only an outgoing port is to avoid building a large ARP cache on router 1

Let's see what would happen if only a port were used. As said earlier, a port represents a broadcast domain. Suppose a packet destined for subnet 192.168.2.0 is sent out of router 1 through the port facing router 2. Before forwarding, router 1 must set the packet's destination MAC address to the MAC address of the destination IP. If it does not know that MAC address, it sends an ARP request and saves the answer in its ARP cache. If the destination subnet has many hosts, router 1 ends up sending an ARP request and caching an entry for each of them. If the IP address of the next hop is used instead, every packet for the 192.168.2.0 subnet is simply forwarded to 192.168.10.6, so only the ARP entry for this one IP address needs to be stored. This is why forwarding between routers generally uses a next hop.
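A small counting experiment makes the ARP-cache argument concrete. It is only a sketch: the helper names are invented, the 200 destination hosts are made up, and the cached "MAC" values are placeholder strings.

```python
def arp_target(dst_ip, next_hop):
    """With a next hop we resolve the next hop's MAC, otherwise the destination's."""
    return next_hop if next_hop is not None else dst_ip


def arp_cache_after(destinations, next_hop):
    """Return the ARP cache router 1 would hold after forwarding to each destination."""
    cache = {}
    for dst_ip in destinations:
        target = arp_target(dst_ip, next_hop)
        cache.setdefault(target, f"mac-of-{target}")  # placeholder MAC value
    return cache


# Hypothetical traffic: 200 different hosts in the 192.168.2.0/24 subnet.
hosts = [f"192.168.2.{n}" for n in range(1, 201)]

print(len(arp_cache_after(hosts, next_hop=None)))            # one entry per host
print(len(arp_cache_after(hosts, next_hop="192.168.10.6")))  # a single entry
```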

One last question: how does a router obtain its routing table? Through static routing or dynamic routing. A static route is configured manually by an administrator. With dynamic routing, routers learn routes by sharing their own routing table information with adjacent routers, so that each router on the Internet gradually completes its own routing table
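The article does not say which dynamic routing algorithm it has in mind. As one possible illustration, the sketch below uses the distance-vector idea (the approach behind protocols such as RIP): a router merges a neighbor's advertised table into its own and keeps the cheaper route. The function name, the table layout, and the hop-count cost are all assumptions.

```python
def merge_advertisement(own_table, neighbor, advertised, link_cost=1):
    """Merge a neighbor's advertised routes; return True if anything changed.

    own_table maps network -> (next_hop, cost); next_hop None means directly
    connected. advertised maps network -> cost as seen from the neighbor.
    """
    changed = False
    for network, cost in advertised.items():
        new_cost = cost + link_cost          # cost of going via this neighbor
        current = own_table.get(network)
        if current is None or new_cost < current[1]:
            own_table[network] = (neighbor, new_cost)
            changed = True
    return changed


# Router 1 starts out knowing only its directly connected networks.
router1 = {"192.168.3.0/24": (None, 0), "192.168.4.0/24": (None, 0)}
# Router 2 (reachable at 192.168.10.6) advertises its own connected network.
print(merge_advertisement(router1, "192.168.10.6", {"192.168.2.0/24": 0}))
print(router1["192.168.2.0/24"])
```

After one advertisement router 1 has learned a route to 192.168.2.0/24 via 192.168.10.6, which is the shape of the first row in the table shown earlier.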

Public network, Intranet, and NAT

The communication between hosts discussed so far takes place on subnets, that is, on private networks, not on the public network

What are private networks and public networks

A private network is also called an intranet or a LAN. Networks built by companies or home users are private networks. For example, the computers in a company form a subnet and can communicate with each other; the subnet uses private addresses, and different subnets may reuse the same private addresses. But to reach Google you have to leave this subnet and go onto the public network, which is what we call the Internet in the broad sense.

Every device IP on the public network is globally unique. This is easy to understand: Hangzhou has a Xuejun Elementary School, and Beijing may have one too. If students of the two schools want to write to each other, they must fill in the other party's full address. Writing just "Xuejun Elementary School" is not enough, because the courier would not know which one is meant. So each Xuejun Elementary School needs a globally unique address, such as "Xuejun Elementary School, Xihu District, Hangzhou, Zhejiang Province", so that the courier knows where to deliver the letter.

In other words, if a host on the private network wants to access the public network, its private IP address must be converted to a public IP address. This process is called Network Address Translation (NAT).

If you have been reading carefully, you will have spotted a problem: after a request from a subnet address goes out to the public network through NAT, how does the response packet find its way back to the requesting host? There must be a mapping between the private IP address and the public IP address, and the most commonly used kind is port mapping. A TCP connection is identified by a four-tuple: the IP address and port of both the requester and the requested party

That is, in addition to the IP address in the request packet, there is actually a port

To save IP resources, NAT uses port mapping

In this case, the source IP address and port of the outgoing request are changed to the public ones, and when the response comes back, the public IP address and port are translated back to the private IP address and port, which solves the problem of the response packet not finding the host
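The port-mapping idea can be sketched with a small translation table. This is a toy model, not a real NAT implementation: the `Nat` class, the starting port 40000, and the example addresses (taken from the documentation ranges 203.0.113.0/24 and 198.51.100.0/24) are all assumptions.

```python
class Nat:
    """Toy port-mapping NAT: many private (IP, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outgoing = {}   # (private_ip, private_port) -> public_port
        self.incoming = {}   # public_port -> (private_ip, private_port)

    def outbound(self, packet):
        """Rewrite the source of a request leaving the private network."""
        key = (packet["src_ip"], packet["src_port"])
        if key not in self.outgoing:          # first packet of this flow
            self.outgoing[key] = self.next_port
            self.incoming[self.next_port] = key
            self.next_port += 1
        return {**packet, "src_ip": self.public_ip,
                "src_port": self.outgoing[key]}

    def inbound(self, packet):
        """Rewrite the destination of a response back to the private host."""
        private_ip, private_port = self.incoming[packet["dst_port"]]
        return {**packet, "dst_ip": private_ip, "dst_port": private_port}


nat = Nat("203.0.113.5")
request = {"src_ip": "192.168.1.10", "src_port": 5000,
           "dst_ip": "198.51.100.7", "dst_port": 80}
sent = nat.outbound(request)
reply = {"src_ip": "198.51.100.7", "src_port": 80,
         "dst_ip": sent["src_ip"], "dst_port": sent["src_port"]}
print(sent)
print(nat.inbound(reply))
```

Because the table is keyed by port, two private hosts that both use source port 5000 still get distinct public ports, so their replies cannot be confused.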

Summary

After reading this article, I believe you understand how two hosts on the Internet communicate with each other. In addition, many people are confused about "why do we need a MAC address if we have an IP address, or why do we need an IP address if we have a MAC address". The IP address locates the subnet from afar, which limits broadcast storms: once the subnet is located, a broadcast (an ARP request) affects far less of the network. After the MAC address is obtained through ARP, the corresponding machine can be found within the subnet. Both are needed, just as when looking for someone in a school you first find the class and then the student.

Thinking about the practical principles of technology in terms of life scenarios often achieves twice the result with half the effort

In addition, it should now be easy to understand why the switch works at layer 2 and the router works at layer 3

At layer 2, the data link layer, MAC addresses are used to identify nodes on the data link. The switch compares the MAC addresses and determines which port to forward the packets from

At layer 3, the network layer, which corresponds to IP addressing, the router takes out the IP header to decide which port to forward the packet through

Next up is LVS. I promise not to stand you up this time, so stay tuned!