Remote network communication protocol

In modern Internet development, whatever the architecture, network communication is an unavoidable and important link, and a well-chosen communication scheme makes a system more efficient and less time-consuming. In Java development we mostly work with protocols built on TCP/IP, so a working knowledge of remote network protocols is part of every good software engineer's must-have stack.

OSI seven-layer network model

When we talk about the network model we usually mean the OSI model, which is divided into seven layers. From top to bottom they are:

Application layer –> Presentation layer –> Session layer –> Transport layer –> Network layer –> Data link layer –> Physical layer

A request passes through the layers roughly like this:

As the figure shows, the OSI model spells out every step in detail. In day-to-day development, though, what we work with most is the TCP/IP protocol suite, whose layers map onto the OSI layers.

TCP/IP four (five) layer network model

The TCP/IP model merges several of the OSI layers. It is traditionally regarded as a four-layer model:

Application layer > Transport Layer > Network layer > Network interface layer

Its layers correspond to the OSI layers as follows:

However, some people argue that the network interface layer should not merge the data link layer and the physical layer, because the two behave differently, and so there is also a five-layer model. The three models compare as follows:

TCP/IP request flow

Now that we know the general TCP/IP model, let's ask ourselves: what are these layers for, and what does each one do? Before answering, let's look at what the layers of the model contain.

The application layer

Hypertext Transfer Protocol (HTTP): the basic protocol of the World Wide Web.

Trivial File Transfer Protocol (TFTP): file transfer.

Telnet: remote access to other hosts. It lets a user log in to an Internet host and execute commands on that host.

Simple Network Management Protocol (SNMP): network management. It provides methods for monitoring network devices, configuration management, statistics collection, performance management, and security management.

Domain Name System (DNS): translates domain names, and the network nodes they publicly advertise, into IP addresses on the Internet.

The network layer

Internet Protocol (IP)

Internet Control Message Protocol (ICMP)

Address Resolution Protocol (ARP)

Reverse Address Resolution Protocol (RARP)

Network interface layer

The network interface layer is also called the network access layer or the host-to-network layer. Its functions include mapping IP addresses to hardware (physical) addresses and encapsulating IP packets into frames. It defines how to connect to the physical medium for network interfaces of different hardware types.

Let's look at how a complete request travels through the four-layer TCP/IP model:

As shown above, when the client issues a request (application layer), the transport layer adds a TCP header to it and passes it to the network layer. The network layer processes the request (working out the IP addresses and related information), adds an IP header, and passes it to the data link layer. The data link layer looks up the MAC address that corresponds to the destination IP address and adds a MAC header. An IP address can be reassigned and reused, whereas a MAC address is unique to a network interface, so the MAC header identifies exactly which machine the request is for.

When the data reaches the server, the request is parsed in the reverse order. The data link layer parses and strips the MAC header and passes the rest upward, the network layer parses the IP header, and the transport layer parses the TCP header to obtain the port and the request payload. The server then finds the process listening on that port and hands it the request to handle. That completes one full request.
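The flow above can be sketched in a few lines of Java. This is only an illustration of the ordering: the headers are plain strings rather than real wire formats, and the class and method names (`LayeredEncapsulationSketch`, `encapsulate`, `decapsulate`) are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: each "header" is a plain string, not a real wire format.
final class LayeredEncapsulationSketch {

    // Sender side: every layer puts its own header in front of what it received.
    static Deque<String> encapsulate(String payload, int dstPort, String dstIp, String dstMac) {
        Deque<String> frame = new ArrayDeque<>();
        frame.push("DATA:" + payload);      // application layer
        frame.push("TCP:port=" + dstPort);  // transport layer adds the TCP header
        frame.push("IP:dst=" + dstIp);      // network layer adds the IP header
        frame.push("MAC:dst=" + dstMac);    // data link layer adds the MAC header
        return frame;
    }

    // Receiver side: headers are removed in reverse order (MAC, then IP, then TCP).
    static String decapsulate(Deque<String> frame) {
        String mac = frame.pop();
        String ip = frame.pop();
        String tcp = frame.pop();
        if (!mac.startsWith("MAC:") || !ip.startsWith("IP:") || !tcp.startsWith("TCP:")) {
            throw new IllegalArgumentException("unexpected header order");
        }
        return frame.pop().substring("DATA:".length());
    }

    public static void main(String[] args) {
        Deque<String> frame = encapsulate("GET /index", 8080, "192.168.1.20", "aa:bb:cc:dd:ee:ff");
        System.out.println(frame);              // outermost header first
        System.out.println(decapsulate(frame)); // payload handed to the application
    }
}
```

The outermost header is the last one added and the first one removed, which is why the server parses MAC, IP, and TCP in that order.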

ARP addressing protocol

We have seen that an IP header is added to the request first and a MAC header after it, which raises a question: how are IP and MAC related? Every device on the network has both a MAC address and an IP address. To find the MAC address that belongs to an IP address, the client uses the Address Resolution Protocol (ARP), which works like this: knowing the target machine's IP, the client broadcasts a message asking for that IP. The host that owns the IP address answers with a response carrying its own MAC address. In this way a host can obtain a MAC address from an IP address.

Note: to avoid an ARP lookup on every request, the local machine caches the results. Generally, when the IP information changes, the cached entry becomes invalid and ARP resolution is performed again.
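A minimal sketch of the lookup-with-cache behaviour described above, assuming the broadcast itself is supplied from outside as a function. `ArpCacheSketch`, `resolve`, and `invalidate` are hypothetical names for illustration. A real ARP cache lives in the operating system and also expires entries by age, which is left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of "check the cache first, broadcast only on a miss".
final class ArpCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> broadcast; // stand-in for the real ARP broadcast
    int broadcastCount = 0;                           // how many lookups reached the network

    ArpCacheSketch(Function<String, String> broadcast) {
        this.broadcast = broadcast;
    }

    // Returns the MAC address for the IP, or null if no host answered.
    String resolve(String ip) {
        String mac = cache.get(ip);
        if (mac == null) {
            broadcastCount++;
            mac = broadcast.apply(ip);   // "who has this IP?"
            if (mac != null) {
                cache.put(ip, mac);      // remember the answer for next time
            }
        }
        return mac;
    }

    // Called when the IP information changes, so the next resolve broadcasts again.
    void invalidate(String ip) {
        cache.remove(ip);
    }

    public static void main(String[] args) {
        ArpCacheSketch arp = new ArpCacheSketch(
                ip -> "192.168.1.20".equals(ip) ? "aa:bb:cc:dd:ee:ff" : null);
        System.out.println(arp.resolve("192.168.1.20"));
        System.out.println(arp.resolve("192.168.1.20"));
        System.out.println("broadcasts sent: " + arp.broadcastCount);
    }
}
```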

Load balancing by layer

In distributed development we often hear the terms layer-2, layer-4 and layer-7 load balancing. The layer number refers to the layer of the network protocol stack at which the load balancing scheme works, that is, the layer at which the server side parses the request on its way back up the stack.

Layer-2 load balancing

Layer-2 load balancing generally balances on the MAC header. Suppose we have a cluster that should be reachable from outside through a single IP address, although each machine has a different MAC address, and we want requests spread across the machines. The load balancer exposes a virtual MAC address; when it parses the MAC header of an incoming request, it replaces the destination MAC address with the real MAC address of the cluster machine chosen to receive the request.

Layer-3 load balancing

Layer-3 load balancing works at the IP level and is similar to layer-2 balancing on MAC addresses. The load balancer exposes a virtual IP address and, when parsing a request, rewrites the virtual IP address to the IP address of the real machine the request is distributed to.

Layer-4 load balancing

Layer-4 load balancing works at the transport layer of the OSI model, where protocols such as TCP and UDP live. At this layer the request packet carries the source IP, destination IP, source port, and destination port, so a layer-4 load balancer generally receives the request and then modifies the IP address and port number in it to forward it to different applications (Nginx, for example, can balance load this way).
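As a rough sketch of the idea, the class below keeps the source address of a packet and rewrites only its destination IP and port. The round-robin choice of backend, the class name `Layer4BalancerSketch`, and the addresses are assumptions made for this example. A real balancer would also remember which backend each connection was assigned to, so that later packets of the same connection go to the same place.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical layer-4 balancer: only the addressing fields are modelled.
final class Layer4BalancerSketch {

    static final class Packet {
        final String srcIp;
        final int srcPort;
        final String dstIp;
        final int dstPort;

        Packet(String srcIp, int srcPort, String dstIp, int dstPort) {
            this.srcIp = srcIp;
            this.srcPort = srcPort;
            this.dstIp = dstIp;
            this.dstPort = dstPort;
        }
    }

    private final List<String> backendIps;
    private final int backendPort;
    private int next = 0;

    Layer4BalancerSketch(List<String> backendIps, int backendPort) {
        this.backendIps = backendIps;
        this.backendPort = backendPort;
    }

    // Called once per new connection in this sketch: the source is kept,
    // the destination IP and port are rewritten to the next backend in turn.
    Packet forward(Packet in) {
        String target = backendIps.get(next);
        next = (next + 1) % backendIps.size();
        return new Packet(in.srcIp, in.srcPort, target, backendPort);
    }

    public static void main(String[] args) {
        Layer4BalancerSketch lb =
                new Layer4BalancerSketch(Arrays.asList("10.0.0.11", "10.0.0.12"), 8080);
        Packet in = new Packet("203.0.113.7", 51000, "198.51.100.1", 80);
        for (int i = 0; i < 3; i++) {
            Packet out = lb.forward(in);
            System.out.println(out.dstIp + ":" + out.dstPort);
        }
    }
}
```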

Layer-7 load balancing

Besides the common kinds above there is one more, layer-7 load balancing, which operates at the application layer. This is the layer the client's request interacts with directly, and its protocols are typically HTTP and DNS. At this layer many more balancing conditions are available: for example, requests can be distributed to different servers according to their URL or request type.
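Routing by URL can be sketched as a prefix table. This is a simplified illustration, not how any particular proxy is configured: `Layer7RouterSketch`, the path prefixes, and the backend names are all invented for the example, and the first matching prefix wins.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical layer-7 router: picks a backend from the request path.
final class Layer7RouterSketch {
    private final Map<String, String> routes = new LinkedHashMap<>(); // keeps insertion order
    private final String defaultBackend;

    Layer7RouterSketch(String defaultBackend) {
        this.defaultBackend = defaultBackend;
    }

    Layer7RouterSketch route(String pathPrefix, String backend) {
        routes.put(pathPrefix, backend);
        return this;
    }

    // The first registered prefix that matches the path decides the backend.
    String choose(String path) {
        for (Map.Entry<String, String> entry : routes.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return defaultBackend;
    }

    public static void main(String[] args) {
        Layer7RouterSketch router = new Layer7RouterSketch("web-pool")
                .route("/api/", "api-pool")
                .route("/static/", "static-pool");
        System.out.println(router.choose("/api/orders/42"));
        System.out.println(router.choose("/static/logo.png"));
        System.out.println(router.choose("/index.html"));
    }
}
```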

TCP three-way handshake and four-way wave

Three-way handshake

A TCP connection is established through the three-way handshake: while connecting, the client and the server exchange three packets in total to confirm each other and set up the connection. In Socket programming, the handshake is triggered by connect.

As the figure above shows, the three handshakes proceed as follows:

First handshake: the client sends a SYN packet that names the server port it wants to connect to and carries its initial sequence number X in the Sequence Number field. After sending it, the client enters the SYN-SENT state.

Second handshake: after receiving the client's request, the server replies with a packet in which both the SYN and ACK flags are set to 1. The server chooses its own initial sequence number (ISN) Y, stores it in the Sequence Number field, and sets the Acknowledgement Number to the client's sequence number plus 1 (X+1). After sending it, the server enters the SYN-RCVD state.

Third handshake: the client sends a packet with the ACK flag set to 1, using the acknowledgement number it received (X+1) as its sequence number and the server's sequence number plus 1 (Y+1) as the Acknowledgement Number. After sending it, the client enters the ESTABLISHED state, and the server also enters ESTABLISHED once it receives this acknowledgement. At this point the handshake is complete.
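The sequence and acknowledgement numbers in the three steps can be checked with a small model. This sketch only reproduces the arithmetic described above. `HandshakeSketch` and `Segment` are hypothetical names, and the wraparound of real 32-bit sequence numbers is ignored.

```java
// Hypothetical model of the three handshake segments; not a real TCP implementation.
final class HandshakeSketch {

    static final class Segment {
        final boolean syn;
        final boolean ack;
        final long seq;     // Sequence Number field
        final long ackNum;  // Acknowledgement Number field (meaningful only when ack is true)

        Segment(boolean syn, boolean ack, long seq, long ackNum) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNum = ackNum;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " ACK=" + (ack ? 1 : 0)
                    + " seq=" + seq + " ack=" + ackNum;
        }
    }

    // First handshake: SYN carrying the client's initial sequence number X.
    static Segment clientSyn(long x) {
        return new Segment(true, false, x, 0);
    }

    // Second handshake: SYN+ACK with the server's ISN Y and ack = X + 1.
    static Segment serverSynAck(Segment clientSyn, long y) {
        return new Segment(true, true, y, clientSyn.seq + 1);
    }

    // Third handshake: ACK with seq = X + 1 and ack = Y + 1.
    static Segment clientAck(Segment serverSynAck) {
        return new Segment(false, true, serverSynAck.ackNum, serverSynAck.seq + 1);
    }

    public static void main(String[] args) {
        Segment first = clientSyn(100);
        Segment second = serverSynAck(first, 300);
        Segment third = clientAck(second);
        System.out.println(first);
        System.out.println(second);
        System.out.println(third);
    }
}
```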

Four-way wave

Unlike the three handshakes needed to connect, closing a connection reliably takes four waves:

First wave: when the client wants to disconnect, it sends a packet with the FIN flag set to 1, meaning "I have no more data to send and am ready to disconnect, but I can still receive your data". After sending it, the client is in the FIN-WAIT-1 state.

Second wave: the server receives the client's FIN and sends an acknowledgement packet, meaning "I have received your request to close the connection". The ACK flag is 1, the server generates its own sequence number, and the acknowledgement number is the client's sequence number plus 1. After sending it, the server is in the CLOSE-WAIT state. When the client receives this response it moves to FIN-WAIT-2. The server side of the connection is still open, and it may still have data to send.

Third wave: when the server has no more data to send, it sends a packet with both the FIN and ACK flags set to 1, again with its own sequence number and with the same acknowledgement number as in its previous packet. After sending it, the server is in the LAST-ACK state.

Fourth wave: the client receives the server's FIN and sends an acknowledgement packet, using the acknowledgement number from the server's packet as its sequence number and the server's sequence number plus 1 as the acknowledgement number. The client then enters the TIME-WAIT state and waits for 2MSL. When the server receives this acknowledgement it closes the connection. If the client receives nothing more from the server within 2MSL, it concludes that the server has closed and closes the connection as well.
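The state changes named in the four waves can be written down as a small transition table. This is a simplified sketch: it covers only the states mentioned above, it leaves out cases such as both sides closing at the same time, and the `TeardownSketch` class with its `State` and `Event` enums is invented for illustration.

```java
// Hypothetical, simplified state machine for closing a TCP connection.
final class TeardownSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSE_WAIT, LAST_ACK, CLOSED }
    enum Event { SEND_FIN, RECV_ACK, RECV_FIN, TIMEOUT_2MSL }

    static State next(State state, Event event) {
        switch (state) {
            case ESTABLISHED:
                if (event == Event.SEND_FIN) return State.FIN_WAIT_1;  // first wave (client)
                if (event == Event.RECV_FIN) return State.CLOSE_WAIT;  // server acknowledges: second wave
                break;
            case FIN_WAIT_1:
                if (event == Event.RECV_ACK) return State.FIN_WAIT_2;  // client saw the second wave
                break;
            case CLOSE_WAIT:
                if (event == Event.SEND_FIN) return State.LAST_ACK;    // third wave (server)
                break;
            case FIN_WAIT_2:
                if (event == Event.RECV_FIN) return State.TIME_WAIT;   // client acknowledges: fourth wave
                break;
            case LAST_ACK:
                if (event == Event.RECV_ACK) return State.CLOSED;      // server saw the fourth wave
                break;
            case TIME_WAIT:
                if (event == Event.TIMEOUT_2MSL) return State.CLOSED;  // client waited 2MSL
                break;
            default:
                break;
        }
        throw new IllegalStateException(event + " is not expected in state " + state);
    }

    // Applies the events in order and returns the final state.
    static State run(State start, Event... events) {
        State current = start;
        for (Event event : events) {
            current = next(current, event);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println("client: " + run(State.ESTABLISHED,
                Event.SEND_FIN, Event.RECV_ACK, Event.RECV_FIN, Event.TIMEOUT_2MSL));
        System.out.println("server: " + run(State.ESTABLISHED,
                Event.RECV_FIN, Event.SEND_FIN, Event.RECV_ACK));
    }
}
```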

The SYN attack

In the three-way handshake, once the server has sent its SYN+ACK and until it receives the client's final ACK, the connection is called a half-open connection. If the client never returns the ACK, the server keeps resending the SYN+ACK until it times out. If an attacker forges a large number of non-existent client IP addresses within a short period and uses them to initiate connection requests, a large number of useless half-open connections occupy the queue, normal user connections are blocked, and the service can become unreachable. The SYN attack is one of the most common DDoS attacks.

Common defenses: 1. filtering at the gateway 2. hardening the TCP/IP stack
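Why a flood of forged SYNs blocks normal users can be seen in a toy model of the half-open queue. The sketch assumes a fixed-capacity queue keyed by client address. `SynBacklogSketch` and its methods are hypothetical, and real kernels add retransmission timers and further mitigations that are not shown.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a server's queue of half-open connections.
final class SynBacklogSketch {
    private final int capacity;
    private final Set<String> halfOpen = new HashSet<>();

    SynBacklogSketch(int capacity) {
        this.capacity = capacity;
    }

    // SYN received: the connection is remembered only if the queue has room.
    boolean onSyn(String clientAddress) {
        if (halfOpen.contains(clientAddress)) {
            return true;                 // a retransmitted SYN for a known entry
        }
        if (halfOpen.size() >= capacity) {
            return false;                // queue full: the new connection is dropped
        }
        halfOpen.add(clientAddress);
        return true;
    }

    // The final ACK arrived, or the entry timed out: the slot is freed.
    boolean onAckOrTimeout(String clientAddress) {
        return halfOpen.remove(clientAddress);
    }

    int pending() {
        return halfOpen.size();
    }

    public static void main(String[] args) {
        SynBacklogSketch queue = new SynBacklogSketch(3);
        for (int i = 1; i <= 3; i++) {
            queue.onSyn("198.51.100." + i + ":1000");   // forged sources that never send the ACK
        }
        System.out.println("real client accepted: " + queue.onSyn("203.0.113.7:40000"));
        queue.onAckOrTimeout("198.51.100.1:1000");      // one forged entry times out
        System.out.println("real client accepted: " + queue.onSyn("203.0.113.7:40000"));
    }
}
```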

Why does connecting take three handshakes while closing takes four waves?

Connecting takes three steps because, when the server receives the client's SYN request, it can send its SYN and its ACK together in one SYN+ACK packet: the ACK is the reply and the SYN is for synchronization. When closing, however, a server that receives a FIN may not be able to close the socket immediately, because it may still have pending data to send. So it first replies with an ACK that tells the client "I received your FIN packet", and it can send its own FIN only after all of its data has been sent. The ACK and the FIN therefore cannot be sent together, and four steps are required.

Why wait for 2MSL to officially close after four waves?

The network is unreliable. The client could close as soon as it has sent its final acknowledgement, but that acknowledgement may be lost or delayed, in which case the server sends its FIN again and the client has to acknowledge it again. The client's packet can take up to one MSL (maximum segment lifetime) to reach the server, and the server's reply can take up to one MSL to come back, so the client waits two of these maximum times. If nothing arrives from the server in that period, the client considers the server closed.

TCP IO communication mechanism

Duplex protocol

TCP is a full-duplex protocol: data can be transmitted in both directions at the same time. Full duplex is in effect the combination of two simplex channels, and it requires both the sending device and the receiving device to have independent sending and receiving capabilities.

Protocol | Concept
Simplex | Data can be transmitted in only one direction
Half duplex | Data can be transmitted in both directions, but in only one direction at a time
Full duplex | Data can be transmitted in both directions at the same time; requires devices with independent sending and receiving capabilities

IO communication process

TCP and UDP are transport protocols aimed at different application scenarios, and applications use both through the Socket abstraction. A Socket is an abstraction layer through which applications send and receive data, much as an application opens a file handle to read and write data on disk. Through a Socket an application joins the network and communicates with other applications on the same network. Different types of Socket are associated with different underlying protocol families. The main types are the stream socket and the datagram socket. A stream socket uses TCP as its end-to-end protocol (on top of IP) and provides a reliable byte-stream service. A datagram socket uses UDP (also on top of IP) and provides a best-effort datagram service.

For TCP communication, each TCP Socket has a send buffer and a receive buffer in the kernel. TCP's full-duplex operation and the TCP sliding window both depend on these two independent buffers and on how full they are. The receive buffer caches incoming data in the kernel: if the application never calls the Socket's read method, the data stays in that buffer, and read is what copies it into the application's own buffer. The send method usually just copies data from the application's buffer into the kernel send buffer and returns. If the application does not read, the data is not removed from the receive buffer; once the buffer is full, the window advertised to the peer closes and the peer stops sending, which is part of how TCP keeps transmission reliable. If the peer sends more data than the window allows, the receiver discards the excess.
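The effect of the kernel buffers can be observed with two sockets on the loopback interface. In this sketch both sides write before either side reads, and the bytes simply wait in the receiving socket's buffer until read is called. It assumes a loopback interface is available, and `SocketBufferSketch` and `exchange` are names made up for the example. The buffer sizes printed by `main` depend on the operating system.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Loopback demo: data written by one side waits in the peer's kernel buffer until it is read.
final class SocketBufferSketch {

    // Returns { what the server read, what the client read }.
    static String[] exchange() {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket server = listener.accept()) {

            // Both directions are written before anything is read (full duplex).
            client.getOutputStream().write("ping".getBytes(StandardCharsets.US_ASCII));
            server.getOutputStream().write("pong".getBytes(StandardCharsets.US_ASCII));

            // read copies the waiting bytes from the kernel buffer into our arrays.
            byte[] atServer = new byte[4];
            byte[] atClient = new byte[4];
            new DataInputStream(server.getInputStream()).readFully(atServer);
            new DataInputStream(client.getInputStream()).readFully(atClient);

            return new String[] {
                new String(atServer, StandardCharsets.US_ASCII),
                new String(atClient, StandardCharsets.US_ASCII)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        String[] result = exchange();
        System.out.println("server read: " + result[0] + ", client read: " + result[1]);
        try (Socket socket = new Socket()) {
            // Sizes of the two kernel buffers for a fresh socket (platform dependent).
            System.out.println("send buffer: " + socket.getSendBufferSize()
                    + " bytes, receive buffer: " + socket.getReceiveBufferSize() + " bytes");
        }
    }
}
```

The messages are kept tiny on purpose: with payloads larger than the buffers, a single thread that writes in both directions before reading could block.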

The sliding window

Early network communication simply sent data without considering that network congestion could cause data loss. To solve this problem a flow-control technique, the sliding window protocol, was developed. Both sender and receiver maintain a sequence of data frames, which is called a window.

Window size: the maximum number of frames that can be sent without waiting for a reply.

Send window: the frames the sender may send without waiting for a reply.

Receive window: the frames the receiver is currently prepared to accept. An arriving frame is processed if it falls inside the window and may be discarded if it falls outside.
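The three definitions can be tried out in a small simulation. This is a frame-counting sketch with cumulative acknowledgements, which is an assumption of the example. Real TCP windows are measured in bytes and change size dynamically, and `SlidingWindowSketch` and its methods are hypothetical names.

```java
// Hypothetical frame-based sliding window with cumulative acknowledgements.
final class SlidingWindowSketch {
    private final int windowSize;
    private int base = 0;     // oldest frame that has not been acknowledged yet
    private int nextSeq = 0;  // next frame number to send

    SlidingWindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    // Send window: frames base .. base + windowSize - 1 may be in flight without a reply.
    boolean canSend() {
        return nextSeq < base + windowSize;
    }

    int send() {
        if (!canSend()) {
            throw new IllegalStateException("window full, wait for an acknowledgement");
        }
        return nextSeq++;
    }

    // Cumulative acknowledgement: every frame up to and including seq is confirmed,
    // so the window slides forward.
    void ack(int seq) {
        if (seq >= base && seq < nextSeq) {
            base = seq + 1;
        }
    }

    int inFlight() {
        return nextSeq - base;
    }

    // Receive window: a frame is processed only if it falls inside the window.
    static boolean receiverAccepts(int expected, int windowSize, int seq) {
        return seq >= expected && seq < expected + windowSize;
    }

    public static void main(String[] args) {
        SlidingWindowSketch sender = new SlidingWindowSketch(3);
        while (sender.canSend()) {
            System.out.println("sent frame " + sender.send());
        }
        sender.ack(0);
        System.out.println("after ack 0, sent frame " + sender.send());
        System.out.println("frame 5 accepted by receiver expecting 1: "
                + receiverAccepts(1, 3, 5));
    }
}
```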

See the sliding-window demo online for this:

Media.pearsoncmg.com/aw/ecs_kuro…