1. Transport layer features

The transport layer (Layer 4 in the OSI numbering) of the TCP/IP protocol family carries data between applications running on network hosts, using either the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP). It sends data on behalf of the applications running on its host and delivers received data to them. The transport layer associates port numbers with the application processes on the host and adds a TCP or UDP header to each message received from an application to specify the source and destination port numbers.

This image is from Microchip

The transport layer provides end-to-end communication between two hosts, and TCP/IP relies on it to control that communication effectively. When a communication session over IP must start or end, the transport layer establishes or releases the connection. It has the following characteristics:

  1. It provides a reliable end-to-end connection between two address spaces (when TCP is used).
  2. Data can be sent bidirectionally as an unstructured sequence of bytes of any length.
  3. Different transport mechanisms are supported.

2. Transport layer protocol

At the transport layer, there are two important protocols: the connection-oriented Transmission Control Protocol (TCP), which provides reliable transport, and the connectionless User Datagram Protocol (UDP), which is unreliable.

The Transmission Control Protocol (TCP) has error detection, data retransmission, and acknowledgement mechanisms. It ensures that data is delivered error-free and in the correct order. In other words, host B receives the sequence of messages sent by host A as expected. TCP also reorders data on arrival. During communication, a message may be larger than the packet size that the network layer can carry. In that case the message cannot be transmitted in a single packet. It must go through a process called segmentation, in which the message is split into several pieces and a number is assigned to each piece. After receiving all the pieces, the transport layer sorts them by number and combines them into the complete message.

The User Datagram Protocol (UDP) is connectionless, meaning that no connection state is checked when a message is sent. Messages can be lost or corrupted, and there is no guarantee that they are delivered in order.

2.1 The TCP protocol

2.1.1 TCP header structure

The Transmission Control Protocol (TCP) header structure is shown in the following figure:

This image is from NMAP.ORG

If the figure above is hard to read because it lacks color, refer to the following figure instead, in which each field is colored separately so that the layout is clear at a glance.

The two figures above are schematic views of the TCP header (a data structure) defined in RFC 793. Each layer of the TCP/IP protocol family has corresponding structure declarations, and the layers interact by filling in variables of these structure types. In other words, the application layer has structured data types declared by the user, while the transport layer has structure types declared by the system, as do the network, data link, and physical layers. For example, the following TCP struct declarations come from the tcpdump source code and the Linux kernel source code.

tcpdump 4.9.2, tcp.h:

typedef  uint32_t  tcp_seq;
/*
 * TCP header.
 * Per RFC 793, September, 1981.
 */
struct tcphdr {
  uint16_t   th_sport;    /* source port */
  uint16_t   th_dport;    /* destination port */
  tcp_seq    th_seq;      /* sequence number */
  tcp_seq    th_ack;      /* acknowledgement number */
  uint8_t    th_offx2;    /* data offset, rsvd */
  uint8_t    th_flags;    /* control flags */
  uint16_t   th_win;      /* window */
  uint16_t   th_sum;      /* checksum */
  uint16_t   th_urp;      /* urgent pointer */
} UNALIGNED;

The following declaration is from tcp.h in Linux kernel 5.4.3:

struct tcphdr {
    __be16  source;
    __be16  dest;
    __be32  seq;
    __be32  ack_seq;
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u16   res1:4,
            doff:4,
            fin:1,
            syn:1,
            rst:1,
            psh:1,
            ack:1,
            urg:1,
            ece:1,
            cwr:1;
#elif defined(__BIG_ENDIAN_BITFIELD)
    __u16   doff:4,
            res1:4,
            cwr:1,
            ece:1,
            urg:1,
            ack:1,
            psh:1,
            rst:1,
            syn:1,
            fin:1;
#else
#error "Adjust your <asm/byteorder.h> defines"
#endif
    __be16  window;
    __sum16 check;
    __be16  urg_ptr;
};

As the member lists of the two struct declarations above show, they correspond to the TCP header structure. For details about the individual header fields, see TCP Segment Header.

2.1.2 TCP three-way handshake

The Transmission Control Protocol (TCP) is connection-oriented: each transaction (a complete communication) first establishes a reliable connection through a three-way handshake. The connection is bidirectional, and either side can initiate it, that is, host A can initiate a handshake with host B, or host B with host A.

Each of the three handshakes is a message exchange. Together, the three messages establish three important pieces of information that both sides of the connection need to know:

  1. The initial sequence number (ISN) for the data to be sent (to hinder attackers, ISNs should be unpredictable).
  2. The buffer space (window) available locally for incoming data, in bytes.
  3. The maximum segment size (MSS), a TCP option that sets the largest segment the local host will accept. The MSS is usually the link MTU minus 40 bytes for the TCP and IP headers, but many implementations use 512- or 536-byte segments (the MSS is a maximum, not a requirement).

Note that no data is sent during the three-way handshake until a connection has been successfully established. As shown below:

No application data is sent during the three-way handshake

Let’s start the three-way handshake with an example of code.

The client.c file implements the client side of the communication. It creates a socket handle sock_fd and connects it to the server host (IP 10.66.114.115, port 8888) with connect, which completes the three-way handshake. It then sends a message to the server.

It then waits for the server's response and checks the result. If receiving fails, it repeats the process; otherwise it prints the response data received from the server, closes the socket handle, and exits. This is the main logic of the code in client.c.

Client code (client.c)

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

#define DEFAULT_PORT 8888

int main(void)
{
    int sock_fd = 0;
    char buf[1024] = {0};
    struct sockaddr_in server;
    ssize_t n;

    /* Create a TCP (stream) socket. */
    if (-1 == (sock_fd = socket(AF_INET, SOCK_STREAM, 0))) {
        printf("failed to create socket.\n");
        exit(-1);
    }

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(DEFAULT_PORT);                 /* server port */
    server.sin_addr.s_addr = inet_addr("10.66.114.115");   /* server IP address */

    /* connect() performs the three-way handshake and returns 0 on success. */
    if (connect(sock_fd, (struct sockaddr *)&server, sizeof(server)) < 0) {
        perror("connect failed.");
        exit(-1);
    }

    for (;;) {
        snprintf(buf, sizeof(buf), "client index %d", 0);
        send(sock_fd, buf, strlen(buf), 0);

        memset(buf, 0x00, sizeof(buf));
        /* Leave room for the terminating NUL so buf can be printed with %s. */
        n = recv(sock_fd, buf, sizeof(buf) - 1, 0);
        if (n < 0) {
            printf("No data received...\n");
            continue;
        }
        if (n == 0) {
            printf("server closed the connection.\n");
            break;
        }
        printf("client recv data:[%s]\n", buf);
        break;
    }

    /* Close the socket connection handle and exit the process. */
    close(sock_fd);
    return 0;
}

The server.c file implements the server side of the communication. It creates a listening socket on the specified IP and port, waits for a client's three-way handshake request, and accepts the connection (accept). It then waits for data written by the client, prints it, and responds to the client.

Finally it releases the socket handles. Here the server is the active party that initiates closing the connection, and the client is the passive party. The code looks like this:

Server code (server.c)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <errno.h>

#define DEFAULT_PORT 8888

int main(void)
{
    char buf[1024] = {0};
    int sock_fd = 0, conn_fd = 0;
    ssize_t len = 0;
    struct sockaddr_in server;

    if (-1 == (sock_fd = socket(AF_INET, SOCK_STREAM, 0))) {
        printf("failed to create socket.\n");
        exit(-1);
    }

    memset(&server, 0x00, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_ANY);
    server.sin_port = htons(DEFAULT_PORT);

    if (bind(sock_fd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        printf("bind socket error: %s(errno: %d)\n", strerror(errno), errno);
        exit(-1);
    }

    if (listen(sock_fd, 8) < 0) {
        printf("listen failed.\n");
        exit(-1);
    }

    puts("waiting for client's connection ...");

    /* accept() returns once a client has completed the three-way handshake. */
    if ((conn_fd = accept(sock_fd, (struct sockaddr *)NULL, NULL)) < 0) {
        printf("accept socket error: %s(errno: %d)\n", strerror(errno), errno);
        exit(-1);
    }

    /* Leave room for the terminating NUL so buf can be printed with %s. */
    len = recv(conn_fd, buf, sizeof(buf) - 1, 0);
    if (len > 0) {
        printf("server recv data:[%s]\n", buf);
    }

    memset(buf, 0x00, sizeof(buf));
    snprintf(buf, sizeof(buf), "server index %d", 0);
    send(conn_fd, buf, strlen(buf), 0);

    /* After replying to the client, close the sockets (four-way wave). */
    close(conn_fd);
    close(sock_fd);

    /* Keep the process alive after closing; pause() sleeps until a signal arrives. */
    for (;;)
        pause();

    return 0;
}

The communication between client and server implemented by the code above is shown in the figure below. The server starts and listens on port 8888. After the client sends the first of the three TCP handshake segments, the server parses it and responds to the client (the second handshake). After receiving the segment from the server, the client parses it and, after checking that it is correct, responds (the third handshake). After that, the server blocks, waits for the client to write data, prints the data, and responds to the client's message. The server then actively closes the socket connection.

Communication diagram between client and server

2.1.2.1 tcpdump network packet capture

To see the details of TCP communication (the three-way handshake), tcpdump is used on the server to capture packets once the listening process has started.

tcpdump -i bond0 -w ./3WayHandshake.cap port 8888

Then start the client process. Once it is running, it initiates a handshake with the server. The whole process is shown in the figure below:

In the screenshot, segments 1, 2, and 3 in the red box correspond to the three segments exchanged during the TCP three-way handshake. Each segment is described in detail below.

The following figure shows the first handshake of TCP connection establishment. The red line in the figure marks the first segment sent from the client to the server, which contains important values such as the source port, destination port, a TCP data length of 0, the SYN flag, and the MSS (maximum segment size).

TCP three-way handshake: the first handshake

During the second handshake, the server, having received the client's SYN and MSS, responds with a segment carrying its own SYN and MSS, the source and destination ports, and an ACK.

TCP three-way handshake: the second handshake

During the third handshake, the client parses and checks the response from the server. If all is well, it sends an ACK to the server. At this point the connection is established, and data communication can begin.

TCP three-way handshake: the third handshake

The three-way handshake described above is shown below as a sequence diagram, to make the process of TCP connection establishment clearer.

TCP three-way handshake

The TCP three-way handshake has the following important properties: it ensures that both parties know the other is ready to transmit data, and it lets both parties agree on the initial sequence numbers, which are sent and acknowledged during the handshake (so neither side is mistaken about them).

Packets 4, 5, and 6 are the data exchange between the client and the server after the three-way handshake has completed. Packet 4 is the 14 bytes of data, "client index 0", sent to the server with the send function. After receiving the data from the client, the server sends an ACK and then a response containing the 14 bytes "server index 0". The server then initiates the disconnection.

2.1.2.2 About sequence numbers

Each TCP host assigns its own sequence numbers. The initial sequence number (ISN) used in a TCP connection is chosen by the host, which uses a random algorithm to generate an unpredictable value. The acknowledgement number a host sends equals the sequence number received plus the number of bytes received. In other words, the acknowledgement number field contains the next sequence number that the host expects from the other party.

2.1.3 TCP half-open connection

In general, TCP completes the three-way handshake and establishes the connection. However, real networks are complex, and some surprises are inevitable. As shown in the figure, when a host sends the first SYN to another host (suppose the host on the left is the client), the server responds with a SYN + ACK segment. Normally the client then completes the handshake by sending the third segment, an ACK. However, the client may lose its network connection or go down before doing so. The server then keeps retransmitting the SYN + ACK while waiting for the client's ACK. Because the server does not know the state of the client, the handshake never ends with a final ACK, and a so-called half-open connection results.

2.1.4 TCP four-way wave

Section 2.1.2.1 explained in detail the three-way handshake that establishes the connection, by examining the packet file captured with tcpdump. Packets 4, 5, and 6 in the capture file are the data exchange between the client and the server. Four packets in the complete capture, numbers 7, 8, 9, and 10, have not been described yet. These four segments are TCP's four-way wave, which releases the connection.

TCP four-way wave in the packet capture

As can be seen from the figure, it is the server that first requests closing the socket connection (port 8888 -> 37948). This is the first of the four waves that release the connection, and the segment carries the FIN (finished) and ACK (acknowledgement) flags. On receiving the server's request to release the connection, the client responds with an ACK; this is the second wave. The client then sends a FIN + ACK segment, the third wave. After receiving this segment, the server checks it; if the sequence and acknowledgement numbers are correct, it replies with an ACK (the fourth wave), confirming that the connection is closed.

TCP four-way wave

2.1.4.1 More about sequence numbers

Have you noticed that the sequence number (SEQ) and acknowledgement number (ACK) went from 1 and 1 at the end of the three-way handshake to 15 and 15 now, and then to 16 and 16?

Now let's go over the topic of sequence numbers again.

Regarding how sequence numbers are generated, the following three points are important enough to read several times. (Note: the three points and the example below are taken from “TCP/IP Principles and Applications – 4th Edition” by Jeffrey L. Carroll, Laura A. Chapman, Ed Tittel, James Pyles.)

  1. Each host assigns its own sequence numbers.
  2. The initial sequence number used in a TCP connection is chosen by the host. For security, it should be chosen randomly.
  3. The acknowledgement number field is incremented only when data is received, except during the TCP connection setup and teardown sequences.

Because the data flow can change direction (host 1 sends data to host 2, and then host 2 sends data to host 1), the sequence number field on one side may increase for a while and then pause while the other side sends and its sequence number field increases.

Now that we have a general understanding of sequence numbers, let's take a closer look at the SEQ and ACK values in our sample code by examining each segment in the capture file.

The figure above shows the segments captured by tcpdump. Based on how sequence numbers increase, the following sequence diagram shows how the SEQ and ACK values change:

2.2 The UDP protocol

This image is from NMAP.ORG

Compared to TCP’s header structure, UDP’s header structure is very simple, with only four fields. They are: source port number, destination port number, length, and checksum.

  1. Source port: this 16-bit field identifies the source port of the packet.
  2. Destination port: this 16-bit field identifies the application-level service on the destination machine.
  3. Length: this 16-bit field specifies the total length of the UDP packet, including the header. Its minimum value is 8 bytes, the size of the UDP header itself.
  4. Checksum: this field stores the checksum computed by the sender before sending. In IPv4 the checksum is optional; when it is not used, all bits of the field are set to zero.

Because UDP provides connectionless, unreliable data transfer, it has less overhead than TCP and can be around 40% faster in the same configuration environment. If you need fast response and can tolerate packet loss, UDP is the better transport layer protocol. If you cannot tolerate packet loss but can accept higher latency, TCP is the one to choose. The following applications use UDP to transfer data:

  · Domain Name System (DNS)
  · Simple Network Management Protocol (SNMP)
  · Trivial File Transfer Protocol (TFTP)
  · Routing Information Protocol (RIP)

The differences between TCP and UDP communication processes are shown in the following figure:

This figure is from CLOUDFLARE

  · UDP packets may be received in a different order from the one in which they were sent.
  · A packet is not guaranteed to reach the destination and may be lost during transmission.
  · No connection needs to be established before packets are sent.

3. Summary

This article explained the functions of the transport layer and its position in the TCP/IP protocol family, as well as its two most important protocols, the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). It analyzed the TCP header structure and TCP's characteristics in detail, and walked through the three-way handshake that establishes a TCP connection, using figures and a packet capture to show each message exchange. It also described the function and header structure of UDP.