This article is participating in the "Network Protocols You Must Know" essay campaign.

Hello, I’m Little Egg.

The entire process of data transfer can be summed up in four words: create, connect, send and disconnect. Each interaction is explained in detail, so keep reading.

The interaction process is shown in the figure below:

How sockets are created

Internal structure of the protocol stack

As shown in the figure above, the request interaction is divided into several layers. At the top is the application, followed by the Socket library, which I introduced in the article X x X; if you are not familiar with it, have a look at that first.

Below that is the inside of the operating system, which contains the protocol stack. The upper half of the stack consists of TCP and UDP, which are responsible for sending and receiving data: TCP requires a connection, while UDP sends and receives data directly without one. I will cover the detailed differences between the two later in the article; for now, a rough understanding is enough.

The lower half of the stack is the IP protocol, which turns the data into network packets, the form in which it actually travels across the network.

Below IP is the network adapter driver, which controls the network adapter hardware.

Getting to know sockets

Inside the protocol stack, there is a memory area for storing control information, which records the IP address, port number, status and other information about the communication peer.

The socket itself is just a concept; there is no physical entity behind it. If the concept has to be given an entity, the control information can be regarded as that entity.

When sending data, the protocol stack looks up the IP address and port number of the peer the socket is connected to. Once the data is sent, the socket keeps a record of how much time has passed since sending and whether a response has been received.

Let's take a look at what information a socket actually holds. You can query it by typing the netstat command into the console on your computer:

  • Proto: the protocol type. Here it is TCP; if UDP is used, it shows UDP.
  • Local Address: the IP address and port number of the local host.
  • Foreign Address: the IP address and port number of the communication peer.
  • State: the communication status. ESTABLISHED means the connection has been established. CLOSE_WAIT means the peer has closed its side and the local side has not yet closed. Another common state is LISTENING, which means the socket is waiting for a connection.

When the browser calls socket in the Socket library to request a socket from the protocol stack, the protocol stack creates one in response to the request.

The stack first allocates an area of memory for the socket and then writes the initial control information into it. At that point the socket has been created.
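As a minimal sketch of the creation step, the snippet below uses Python's standard socket module in place of the browser's Socket library (an assumption for illustration only). The application never sees the control information itself; it only gets back a descriptor that refers to it.

```python
import socket

# Ask the operating system's protocol stack for a new TCP socket
# (IPv4 addresses, stream-oriented).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# The stack has set up the socket's control information; the application
# only holds a small integer descriptor that identifies it.
descriptor = sock.fileno()
print("descriptor:", descriptor)
print("family:", sock.family.name, "type:", sock.type.name)

# Closing releases the socket; the descriptor is no longer valid afterwards.
sock.close()
closed_descriptor = sock.fileno()
```

Every later call (connect, write, close) refers to the socket through this descriptor.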

Connecting to the server

Once the socket is created, the browser calls connect and the protocol stack connects the local socket to the server's socket.

Connecting means exchanging control information between the two communicating parties. What is exchanged during the connection operation is determined by the communication rules. As long as both parties follow the rules, the connection can be established and everything is ready for sending and receiving data.
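The connect call can be tried out locally. The sketch below assumes a throwaway listener on the loopback address 127.0.0.1 standing in for the web server; the operating system picks the port. After connect returns, each side's socket records the other side's address and port, which is the same information netstat shows as Local Address and Foreign Address.

```python
import socket

# A stand-in "server": listen on loopback, let the OS choose a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_address = listener.getsockname()

# The client creates its socket and asks the protocol stack to connect.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
client.connect(server_address)

# The server side receives a new socket dedicated to this connection.
server_side, client_address = listener.accept()

print("client local  :", client.getsockname())
print("client foreign:", client.getpeername())
print("server foreign:", server_side.getpeername())

local_seen_by_client = client.getsockname()
foreign_seen_by_client = client.getpeername()

for s in (client, server_side, listener):
    s.close()
```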

Control information

Control information can be divided into two types. One is the control information exchanged between the client and server when they communicate. It is needed throughout the whole communication process: establishing the connection, sending and receiving data, and disconnecting. This information is called a header. Ethernet and IP also have control information of their own, so to tell them apart we call them the TCP header, Ethernet header, and IP header respectively.

Some TCP header information is listed here for reference only.
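To make the header concrete, here is a small sketch that packs and unpacks the fixed 20-byte part of a TCP header with Python's struct module. The field order follows the TCP specification; the helper names build_header and parse_header and the sample port numbers are made up for illustration, and the checksum is simply left at zero rather than computed.

```python
import struct

# Fixed 20-byte TCP header, network byte order:
# source port, destination port, sequence number, ACK number,
# data offset + control bits, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a header without options; checksum is NOT computed here."""
    offset_and_flags = 5 << 12          # header length: 5 x 32-bit words
    for name in flags:
        offset_and_flags |= FLAG_BITS[name]
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, 0, 0)

def parse_header(raw):
    """Unpack the first 20 bytes of a TCP segment into a dict."""
    src, dst, seq, ack, off_flags, window, _checksum, _urgent = \
        TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (off_flags >> 12) * 4,
        "flags": sorted(n for n, bit in FLAG_BITS.items() if off_flags & bit),
        "window": window,
    }

raw = build_header(50000, 80, 1000, 0, ["SYN"], 65535)
print(parse_header(raw))
```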

The other type is the control information stored in the socket: information passed in by the application, information received from the communication peer, and the execution state of send and receive operations.

The actual process of the connection operation

The first step of the connection operation is for the TCP module to create a header carrying the connection control information, with the SYN control bit set to 1. Once the TCP header is created, the TCP module hands it to the IP module and entrusts it with sending. The IP module sends the packet, which travels through the network to the server. The IP module on the server passes the received data to the TCP module, which finds the corresponding socket according to the header information and writes the relevant information into that socket.

When the server returns its response, it likewise sets SYN to 1 in the TCP header and, at the same time, sets the ACK control bit to 1 to indicate that the network packet has been received. The server's TCP module sends this response to the client through the IP module.

After the client receives the response, its IP module passes it to the TCP module, which checks the TCP header to see whether the connection succeeded. A SYN of 1 means the connection succeeded. The client then sets ACK to 1 and sends a packet back to the server to confirm that it received the response.

Once the connection is established, data can be sent and received at any time, and the connection will remain until close is called.
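The three packets exchanged during connection can be traced with a toy model. This is only a sketch of the control bits: the Header class, the function name and the intermediate state names (SYN_SENT, SYN_RECEIVED) are illustrative choices, and no real packets are sent.

```python
from dataclasses import dataclass

@dataclass
class Header:
    syn: int = 0
    ack: int = 0

def three_way_handshake():
    """Walk through the connection exchange and record each header sent."""
    state = {"client": "CLOSED", "server": "LISTENING"}
    trace = []

    # Step 1: the client asks to connect with SYN = 1.
    first = Header(syn=1)
    state["client"] = "SYN_SENT"
    trace.append(("client", first.syn, first.ack))

    # Step 2: the server answers with SYN = 1 and ACK = 1.
    if first.syn == 1:
        second = Header(syn=1, ack=1)
        state["server"] = "SYN_RECEIVED"
        trace.append(("server", second.syn, second.ack))

    # Step 3: the client sees SYN = 1, treats the connection as successful,
    # and confirms receipt with ACK = 1.
    if second.syn == 1 and second.ack == 1:
        third = Header(ack=1)
        state["client"] = "ESTABLISHED"
        trace.append(("client", third.syn, third.ack))

    # The server receives the final ACK and is connected as well.
    if third.ack == 1:
        state["server"] = "ESTABLISHED"
    return state, trace

final_state, packets = three_way_handshake()
for sender, syn, ack in packets:
    print(f"{sender}: SYN={syn} ACK={ack}")
```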

Sending and receiving data

Sending and receiving data is triggered by the application, which calls write and specifies the data to send and its length.

When the protocol stack receives data from the application, it may not send it immediately but put it in the send buffer first. Why do this?

Some programs hand over all their data at once, while others hand it over line by line. If the stack sent data the moment it arrived, it could end up sending a large number of small packets, which is inefficient. How much data to accumulate before sending is generally decided by two factors: the length of data each network packet can hold, and how much time has passed.

The length of data a network packet can hold

Let’s start with two terms:

MTU: the maximum length of a network packet, usually 1500 bytes on Ethernet. This is the total length including the IP and TCP headers.

MSS: The maximum length of all data contained in a network packet, excluding headers.
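The relationship between the two terms can be worked out with a little arithmetic. The sketch below assumes the common case of a 20-byte IP header and a 20-byte TCP header with no options; the 4000-byte payload is an arbitrary example.

```python
MTU = 1500          # typical Ethernet value, headers included
IP_HEADER = 20      # assumption: IPv4 header without options
TCP_HEADER = 20     # assumption: TCP header without options

# MSS is what is left for application data once the headers are subtracted.
MSS = MTU - IP_HEADER - TCP_HEADER

def split_into_segments(total_bytes, mss):
    """Return the data length carried by each network packet."""
    sizes = []
    while total_bytes > 0:
        size = min(mss, total_bytes)
        sizes.append(size)
        total_bytes -= size
    return sizes

segments = split_into_segments(4000, MSS)
print("MSS:", MSS)
print("segment sizes:", segments)
```

Under these assumptions a 4000-byte write needs three packets: two full ones and a shorter final one.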

The processing time

When an application sends data infrequently, waiting every time until the buffered length reaches the MSS would take too long. To deal with this, the protocol stack has a timer: once a certain amount of time has passed, it sends the network packet out even if the data is far short of the MSS.
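The two factors, length and time, can be combined in a toy send buffer. This is a simplified sketch rather than how any real protocol stack is implemented: the class name, the 0.2 second limit and the explicit now parameter (used instead of a real clock so the behaviour is easy to follow) are all assumptions.

```python
class SendBuffer:
    """Toy model: flush when a full MSS is buffered or a timer expires."""

    def __init__(self, mss=1460, max_wait=0.2):
        self.mss = mss
        self.max_wait = max_wait      # illustrative value, in seconds
        self.pending = bytearray()
        self.waiting_since = None
        self.sent = []                # packets handed on for sending

    def write(self, data, now):
        if not self.pending:
            self.waiting_since = now
        self.pending += data
        # Length rule: send every full MSS-sized chunk right away.
        while len(self.pending) >= self.mss:
            self._emit(self.mss)
            self.waiting_since = now  # leftover bytes arrived with this write

    def tick(self, now):
        # Time rule: do not let a short chunk wait longer than max_wait.
        if self.pending and now - self.waiting_since >= self.max_wait:
            self._emit(len(self.pending))

    def _emit(self, n):
        self.sent.append(bytes(self.pending[:n]))
        del self.pending[:n]

buf = SendBuffer(mss=10, max_wait=0.2)
buf.write(b"abc", now=0.0)             # too short, stays in the buffer
buf.tick(now=0.1)                      # timer has not expired yet
buf.tick(now=0.25)                     # timer expired: b"abc" goes out
buf.write(b"0123456789XYZ", now=1.0)   # one full chunk goes out, 3 bytes wait
print(buf.sent, bytes(buf.pending))
```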

The ACK mechanism confirms the receiving status of network packets

When the client sends data to the server, TCP works out which byte of the overall data each chunk starts at and writes that value into the TCP header as the sequence number. The sequence number does not start from 1; its initial value is random, so the client has to tell the server where counting begins. It does this while connecting, in the packet whose SYN control bit is set to 1. After receiving a packet, the server uses the sequence number together with the length of the data it actually received to check that nothing is missing.

If the data is fine, the receiver needs to tell the sender how much data it has received so far, which it does through the ACK number. The ACK number is the sequence number of the next byte the receiver expects, so it reflects the total number of bytes received.

With this mechanism, we can verify that the receiver received the data correctly, and if not, we can resend the network packet.

Whatever errors occur in the network, we can find them and take remedial action.
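The bookkeeping behind sequence and ACK numbers can be sketched in a few lines. The initial sequence number 1000 is a fixed stand-in for the random value a real stack would pick, and both function names are hypothetical. Note that the SYN packet itself uses up one sequence number, so the first data byte is numbered one past the initial value.

```python
def number_segments(initial_seq, lengths):
    """Sequence number of each segment and the ACK number expected back."""
    next_seq = initial_seq + 1        # the SYN consumes one sequence number
    plan = []
    for length in lengths:
        plan.append({"seq": next_seq, "len": length,
                     "expected_ack": next_seq + length})
        next_seq += length
    return plan

def receiver_ack(expected_seq, seq, length):
    """ACK number the receiver returns for one arriving segment."""
    if seq == expected_seq:
        return seq + length           # in order: acknowledge everything so far
    return expected_seq               # gap: keep asking for the missing byte

plan = number_segments(1000, [1460, 1460, 1080])
for segment in plan:
    print(segment)

in_order = receiver_ack(expected_seq=1001, seq=1001, length=1460)
after_loss = receiver_ack(expected_seq=1001, seq=2461, length=1460)
```

If the sender does not get back the ACK number it expects, it knows which bytes to resend.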

Sliding window

If we were to wait for an ACK to acknowledge each network packet before sending the next one, this waiting time would be wasted.

The idea of the sliding window is that, after sending one network packet, the sender goes on to send the next ones without waiting for the ACK to come back, which cuts down on wasted waiting time.

However, this approach has a problem. If the sender keeps sending, new data may arrive before the receiver has finished processing the earlier data. The data piles up in the receive buffer and the buffer eventually overflows. To avoid this, the receiver tells the sender how much data it is able to receive, and the sender limits what it sends based on this value.
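Here is a small simulation of that rule: the sender keeps sending as long as the unacknowledged data fits in the amount the receiver advertised, and otherwise waits for an ACK. It is a sketch under simplifying assumptions: the window is fixed for the whole transfer, ACKs arrive in order, and the function name is made up.

```python
from collections import deque

def send_with_window(total, mss, window):
    """Return the ordered list of ("send" | "ack", size) events."""
    assert window >= mss, "sketch assumes at least one segment fits"
    in_flight = deque()   # sizes of segments sent but not yet acknowledged
    remaining = total
    events = []
    while remaining or in_flight:
        size = min(mss, remaining)
        if remaining and sum(in_flight) + size <= window:
            # Room in the receiver's buffer: send without waiting for ACKs.
            in_flight.append(size)
            remaining -= size
            events.append(("send", size))
        else:
            # Window full (or nothing left to send): wait for the oldest ACK.
            events.append(("ack", in_flight.popleft()))
    return events

events = send_with_window(total=5000, mss=1000, window=3000)
print([kind for kind, _ in events])
```

With a window of 3000 bytes, three packets go out back to back before the sender has to pause for the first ACK.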

Deleting the socket

Once all data has been sent and received, the disconnection process begins. Take the Web as an example: when the server has finished sending its response, it initiates disconnection by calling close in the Socket library. The server's protocol stack generates a TCP header with the FIN control bit set to 1 and entrusts the IP module with sending it to the client.

When the client receives the TCP header with FIN set to 1, the client's protocol stack marks its socket as being in the disconnecting state. To tell the server that it has received the packet with FIN set to 1, the client returns an ACK number to the server.
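From the application's point of view, the arrival of FIN shows up as the end of the data stream. The sketch below assumes a loopback listener standing in for the web server: the server side sends a response and calls close, and the client's recv then returns an empty result once everything has been read.

```python
import socket

# Stand-in server on loopback; the OS chooses the port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname(), timeout=5)
server_side, _ = listener.accept()

# The server sends its response, then starts disconnecting with close.
server_side.sendall(b"response")
server_side.close()

# The client reads until recv returns b"", which signals that the
# server's side of the connection has been closed.
chunks = []
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
received = b"".join(chunks)
print("received:", received)

# The client closes its own side as well.
client.close()
listener.close()
```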

UDP protocol sending and receiving operations

So far we have explained sending and receiving data over TCP. As you can see, the whole process is quite complicated, but sometimes we do not need such elaborate checking, and UDP is enough for simple exchanges of data. For example, the DNS lookup mentioned earlier, which queries a DNS server for an IP address, uses UDP.

UDP has none of TCP's mechanisms such as acknowledgment and windows, and it does not need to exchange control information or perform a connection operation before sending and receiving data.

Receiving data is as simple as finding the corresponding socket based on the receiver and sender IP addresses in the IP header and the receiver and sender port numbers in the UDP header and handing the data to the appropriate application.
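For comparison with TCP, here is a minimal UDP exchange on the loopback address (chosen only so the example needs no real server). There is no connect, no listen and no accept: the sender just names the destination address and port with each message.

```python
import socket

# Receiver: bind to an address and port so the stack can find this socket.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))       # the OS chooses a free port
receiver.settimeout(5)

# Sender: no connection step, just send to the receiver's address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver.getsockname())

# The stack matches the packet to the receiver's socket by address and
# port, and hands over the data together with the sender's address.
data, sender_address = receiver.recvfrom(1024)
sender_port = sender.getsockname()[1]
print(data, sender_address)

sender.close()
receiver.close()
```

Nothing here confirms delivery; if the packet were lost, recvfrom would simply time out.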

Follow Little Egg, and let's grow and improve together.