1. Introduction

In the TCP/IP suite, the transport layer is represented by two protocols, TCP and UDP. Let's take a look at how the TCP and UDP headers are constructed.

2. The TCP protocol:

The meaning of each field in the TCP header:

2.1 Source Port and Destination Port:

A process on a computer communicates with other processes through a port, and a port can be occupied by only one process at a time. Therefore, by specifying a source port and a destination port (2 bytes each), you can tell which two processes are communicating.

The source port and destination port are each 16 bits wide, so there are 2^16 (65536) possible port numbers, ranging from 0 to 65535.

2.2 Sequence Number

TCP is byte stream oriented, and each byte in the byte stream transmitted over a TCP connection is numbered sequentially.

The sequence number (4 bytes) is the number of the first byte of data carried in this segment. Since the sequence number is represented by 32 bits, it wraps around every 2^32 bytes and starts again at 0.

For example, a 100 KB HTML document contains 102400 bytes (100 x 1024), and each byte is numbered, so the byte numbers of the document range from 0 to 102399. The sequence number field holds the number of the first byte of data sent in a segment. If the 100 KB document is divided into four equal parts, the first TCP segment carries the first 25 KB of data (bytes 0 to 25599) and its sequence number is 0; the second TCP segment carries the second 25 KB (bytes 25600 to 51199) and its sequence number is 25600; and so on.

Since 1 byte = 8 bits, 4 bytes can represent the range [0, 2^32 - 1], a total of 2^32 (4294967296) sequence numbers. When the sequence number reaches the maximum value, the next one returns to 0. In other words, TCP can number 4 GB of data before wrapping. In general, this ensures that by the time a sequence number is reused, the data that carried the old sequence number has already reached its destination or been lost.
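The numbering in this example can be sketched in Python. This is only an illustration, not how a real TCP stack is written: the helper name `segment_sequence_numbers` is made up for the example, and starting at sequence number 0 is an assumption carried over from the text (a real connection chooses its initial sequence number during the handshake).

```python
SEQ_SPACE = 2 ** 32  # the sequence number field is 32 bits wide


def segment_sequence_numbers(total_bytes, segment_size, initial_seq=0):
    """Return a (sequence number, payload length) pair for each segment.

    Hypothetical helper: it numbers every byte of the stream and wraps
    around modulo 2^32, the way the sequence number field does.
    """
    segments = []
    offset = 0
    while offset < total_bytes:
        length = min(segment_size, total_bytes - offset)
        seq = (initial_seq + offset) % SEQ_SPACE
        segments.append((seq, length))
        offset += length
    return segments


# The 100 KB document from the text, split into four 25 KB segments.
document_segments = segment_sequence_numbers(100 * 1024, 25 * 1024)

# The same split, but starting 100 bytes before the sequence space wraps.
wrapped_segments = segment_sequence_numbers(
    100 * 1024, 25 * 1024, initial_seq=SEQ_SPACE - 100
)
```

In the second call the first segment starts just below 2^32, so the sequence number of the second segment has already wrapped around to a small value.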

2.3 Acknowledgement Number

The reliability of TCP is based on the fact that every segment of data must be acknowledged. In other words, after receiving a segment from the other party, each side sends back an acknowledgement segment (which contains the acknowledgement number) to confirm receipt.

Acknowledgement number (4 bytes, 32 bits): the number of the first byte of the next segment that the receiver expects to receive from the sender.

For example, after receiving the first 25 KB segment, whose sequence number is 0, the receiver sends an acknowledgement with acknowledgement number = 25600 (bytes 0 to 25599 have arrived, so byte 25600 is expected next).
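The rule for an in-order segment is a single addition, sketched below. The function name `next_ack_number` is an assumption for illustration, and the sketch ignores the fact that a SYN or FIN also consumes one sequence number.

```python
SEQ_SPACE = 2 ** 32  # acknowledgement numbers share the 32-bit sequence space


def next_ack_number(seq, payload_len):
    """Acknowledgement number for a segment that arrived in order.

    It is the sequence number of the next byte the receiver expects,
    wrapped to 32 bits. SYN and FIN handling is left out of this sketch.
    """
    return (seq + payload_len) % SEQ_SPACE
```

For the first segment of the example, `next_ack_number(0, 25600)` gives 25600, which matches the acknowledgement number in the text.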

2.4 Data Offset

The data offset takes 0.5 bytes (4 bits). It indicates the distance from the start of the TCP segment to the start of its data, so this field is in effect the header length of the TCP segment. It is necessary because the header contains an options field of variable length. Note that the data offset is expressed in 32-bit words (that is, in units of 4 bytes). Since the largest value a 4-bit binary number can represent is 15, the maximum data offset is 60 bytes, which is also the maximum size of the TCP header (so the options cannot exceed 40 bytes, since the header is 20 bytes without options).
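As a rough sketch of how the field is read, the snippet below extracts the data offset from raw header bytes and converts it to a length in bytes. The function name `tcp_header_length` and the hand-built `sample_header` are assumptions made for the example; the field positions follow the standard TCP header layout.

```python
import struct


def tcp_header_length(header: bytes) -> int:
    """Return the TCP header length in bytes, read from the data offset field."""
    if len(header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # Byte 12 holds the 4-bit data offset in its upper half.
    data_offset_words = header[12] >> 4
    if data_offset_words < 5:
        raise ValueError("a data offset below 5 words is invalid")
    return data_offset_words * 4  # the field counts 32-bit words


# A hand-built 20-byte header for illustration: ports 12345 and 80,
# sequence 0, acknowledgement 0, data offset 5, flags byte 0x02 (SYN),
# window 65535, checksum 0, urgent pointer 0.
sample_header = struct.pack(
    "!HHIIBBHHH", 12345, 80, 0, 0, 5 << 4, 0x02, 65535, 0, 0
)
```

A data offset of 5 corresponds to the 20-byte header without options, and the largest value, 15, corresponds to the 60-byte maximum.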

2.5 Reserved

It occupies 0.5 bytes (4 bits). Reserved for future use; it should be set to 0 for now.

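2.6 Control Bits

1 byte (8 bits). The six flags described below occupy the low-order 6 bits. The two high-order bits (CWR and ECE) are used for explicit congestion notification and are not covered here.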

1. Urgent URG:

When URG=1, the urgent pointer field is valid. It tells the system that this segment contains urgent data that should be sent as soon as possible (the equivalent of high-priority data) instead of in the original queued order. For example, suppose a long program has been sent to run on a remote host, but a problem is found that requires the program to be cancelled, so the user issues an interrupt command (Ctrl+C) from the keyboard. If urgent data is not used, these two characters are stored at the end of the receiving TCP's buffer and are delivered to the receiving application process only after all earlier data has been processed, which wastes a lot of time. When URG is set to 1, the sending application tells the sender's TCP that it has urgent data to transmit. The sender's TCP then inserts the urgent data at the front of the data in this segment, while the data after the urgent data remains normal data. This flag is used together with the Urgent Pointer field in the header.

2. Acknowledge ACK: indicates whether the acknowledgement number field is valid. The acknowledgement number field is valid only if ACK=1. TCP specifies that ACK must be set to 1 in every segment after the connection is established, that is, in every segment except the initial SYN that requests the connection.

3. Push PSH: When two applications communicate interactively, the application on one side sometimes wants a response immediately after a command is typed. In this case, TCP can use the push operation: the sender's TCP sets PSH to 1 and immediately creates a segment and sends it. On receiving a segment with PSH=1, the receiving TCP delivers the data to the receiving application process as soon as possible instead of waiting for the entire buffer to fill before delivering it.

4. Reset RST: If you receive a segment with RST=1, something is seriously wrong with your connection to the host (for example, the host has crashed), and you must release the connection and then re-establish it. It can also mean that the last data you sent to the host was invalid and the host refused to respond.

If this bit is 1, the TCP connection is in an abnormal state and must be forcibly torn down. For example, when a connection request arrives for a port that is not in use, no communication is possible, so the host can return a segment with RST set to 1. In addition, when a host restarts after a program crash or power outage, all of its connection information is reinitialized, so the original TCP communication cannot continue. In this case, if one side sends a segment with RST set to 1, the communication is forcibly disconnected.

5. Synchronize SYN: Used when establishing a connection to synchronize sequence numbers. SYN=1 means the segment either requests or accepts a connection: SYN=1 with ACK=0 is a connection request, and SYN=1 with ACK=1 means the peer agrees to establish the connection. SYN is set to 1 only in the first two segments of the handshake. A segment with SYN=1 also carries the sender's initial sequence number in its sequence number field.

6. Finish FIN: marks the end of data transmission. If this bit is 1, the sender has no more data to send and wants to close the connection. When communication ends, the two hosts exchange TCP segments with FIN set to 1, and once each host has acknowledged the other's FIN, the connection can be closed. However, a host does not have to reply with its own FIN immediately after receiving a segment with FIN set to 1. It can wait until all data in its buffer has been sent successfully and removed from the buffer.
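The six flags can be decoded from the control-bits byte with simple bit masks. The sketch below is illustrative: the names `TCP_FLAGS`, `decode_flags`, and `handshake_step` are assumptions, while the bit positions follow the standard TCP header layout (FIN is the lowest bit).

```python
# Bit masks for the six flags described above, lowest bit first.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(flag_byte: int) -> list:
    """Return the names of the flags that are set in the control-bits byte."""
    return [name for name, mask in TCP_FLAGS.items() if flag_byte & mask]


def handshake_step(flag_byte: int) -> str:
    """Classify a segment by its SYN/ACK combination (illustrative only)."""
    syn = bool(flag_byte & TCP_FLAGS["SYN"])
    ack = bool(flag_byte & TCP_FLAGS["ACK"])
    if syn and not ack:
        return "connection request"
    if syn and ack:
        return "connection accepted"
    return "not a SYN segment"
```

For example, a flags byte of 0x12 has both SYN and ACK set, which is the second segment of the handshake.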

2.6 Window

It takes 2 bytes. It indicates the amount of data the other party is currently allowed to send, that is, the number of bytes the other party may send starting from the acknowledgement number in this segment. This limit exists because the receiver's buffer space is limited. In short, the window value is the basis on which the receiver lets the sender set its send window.

2.7 Checksum

It takes 2 bytes and provides additional reliability. I won't go into too much detail here, but I'll write about it later. What is the purpose of the checksum? If a bit error occurs on a noisy link, it can be detected by the FCS of the data link layer. So why do TCP and UDP need checksums of their own?

In fact, the TCP and UDP checksums are less about detecting errors caused by noise and more about detecting data damaged by a router memory failure or a program bug.

Anyone who has written C knows that pointers, if used incorrectly, can easily destroy data structures in memory. A router's software may also contain bugs or fail abnormally. Packets travel across the Internet through many routers, and once one of them malfunctions, the packets, protocol headers, or data passing through that router are likely to be corrupted. Even in this case, if TCP or UDP provides a checksum, it can determine whether the protocol header and data have been corrupted.

2.8 Urgent Pointer

It takes 2 bytes. It marks the location of the urgent data in the data field. The urgent pointer is meaningful only if URG=1, and it gives the number of bytes of urgent data in this segment (the data after the urgent data is normal data). In other words, the urgent pointer indicates where the urgent data ends within the segment. When all the urgent data has been processed, TCP tells the application to resume normal operation. It is worth noting that urgent data can be sent even when the window is 0. How urgent data is handled is up to the application. It is generally used when communication is temporarily suspended or interrupted. For example, clicking the stop button in a web browser or typing Ctrl+C in TELNET produces a segment with URG set to 1. In addition, the urgent pointer is also used as a marker that indicates divisions in the data stream.

2.9 Options

The options field is used to improve TCP transport performance. Because its length is bounded by the data offset (header length) field, it can be at most 40 bytes long.

Type 2, MSS (Maximum Segment Size): used to decide the maximum segment length when a connection is established. This option is used by most operating systems. The MSS specifies the maximum length of the data field; the length of the data field plus the length of the TCP header equals the length of the entire TCP segment. The MSS value is the data field length that one side expects the other side to use when sending TCP segments to it. The two sides can have different MSS values. If the option is not specified, 536 bytes is used by default. MSS appears only in SYN segments, that is, in segments with SYN=1.

Type 3, Window Scale: an option used to improve throughput. The window field in the TCP header has only 16 bits, so the largest value it can represent is 65535. As a result, at most 64 KB of data can be sent within one round trip time (RTT). With this option, the maximum window size can be expanded to 1 GB, so high throughput can be achieved even on networks with a long RTT. The option was introduced because links with a large bandwidth-delay product (such as satellite links) need larger windows to achieve good performance and throughput.
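The arithmetic behind these numbers can be sketched as follows. The function names are assumptions for illustration; the maximum shift count of 14 comes from the window scale specification rather than from the text above.

```python
MAX_WINDOW_FIELD = 0xFFFF  # largest value of the 16-bit window field
MAX_SHIFT = 14             # largest shift count the window scale option allows


def effective_window(window_field: int, shift: int) -> int:
    """Window size in bytes after applying the negotiated shift count."""
    if not 0 <= window_field <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


def max_throughput_bytes_per_second(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: at most one window of data per round trip."""
    return window_bytes / rtt_seconds


# Largest window with scaling: 65535 << 14 == 1073725440 bytes, just under 1 GB.
largest_window = effective_window(MAX_WINDOW_FIELD, MAX_SHIFT)
```

Without scaling, a connection with a 0.5 second RTT is limited to about 131 KB per second no matter how fast the link is, which is why long-delay paths benefit from the option.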

Type 8, Timestamps: can be used to calculate the RTT (round trip time). When the sender sends a TCP segment, it places the current time value in the timestamp field. When the receiver sends the acknowledgement, it copies that timestamp value into the acknowledgement. When the sender receives the acknowledgement, it can calculate the RTT.

The option is also used for sequence number management in high-speed communication. When several gigabytes of data are sent into the network at high speed, the 32-bit sequence number space can be used up quickly. On an unstable network, a segment with an old sequence number that has been wandering in the network may arrive much later. Reliable transmission is impossible if the receiver confuses old and new sequence numbers. To avoid this problem, the timestamp option is used to distinguish between them. (This mechanism is known as PAWS, protection against wrapped sequence numbers. Since the sequence number is 32 bits wide, it wraps every 2^32 bytes, and the timestamp field makes it easy to tell apart different segments that carry the same sequence number.)
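Both uses of the timestamp can be sketched with 32-bit modular arithmetic. This is a simplified illustration under stated assumptions: the function names are made up, timestamps are treated as plain tick counters, and a real implementation applies additional rules before accepting or rejecting a segment.

```python
TS_SPACE = 2 ** 32  # timestamp values are 32 bits wide


def rtt_from_echo(now_ticks: int, echoed_ticks: int) -> int:
    """Round-trip time in clock ticks, given the timestamp echoed in an ACK."""
    return (now_ticks - echoed_ticks) % TS_SPACE


def is_older_timestamp(ts_value: int, ts_recent: int) -> bool:
    """True if ts_value is older than ts_recent in 32-bit modular arithmetic.

    A receiver applying a check of this kind can discard a delayed segment
    whose sequence number only looks valid because the space has wrapped.
    """
    diff = (ts_value - ts_recent) % TS_SPACE
    return diff >= TS_SPACE // 2
```

The modular comparison is what lets the check keep working after the timestamp counter itself wraps around.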

Type 5, SACK (Selective Acknowledgement): used so that only the missing segments are retransmitted, rather than everything after the loss. For example, host A sends segments 1, 2, and 3, but host B receives only segments 1 and 3. The SACK option lets B tell the sender which data is missing so that only that data is resent. How is the gap described? The option starts with two bytes: one identifies the option as SACK, and the other gives the number of bytes the option occupies. The missing segment 2 is then described through boundaries: each block in the option carries the left edge and right edge of a range of data that did arrive, so the gap between the acknowledgement number and the reported block is what is missing. The edges are sequence numbers, so each block needs 64 bits, or 8 bytes. Therefore the options field can describe at most (40-2)/8, rounded down to 4, blocks.

Type 1, NOP (no-operation): a one-byte filler. It is used to pad an option so that the next option starts on a 4-byte boundary, and it can also separate different options. For example, the window scale option and SACK can be separated by NOP.
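The option types above share one layout: a kind byte, a length byte, and then the data, with NOP being a single byte on its own. A minimal parser under that assumption might look like the sketch below. The function names and the sample bytes are made up for illustration, and kind 0 (end of option list) is a standard value that the text does not mention.

```python
import struct


def parse_tcp_options(options: bytes) -> list:
    """Split the option bytes of a TCP header into (kind, data) pairs."""
    parsed = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # end of option list
            break
        if kind == 1:  # NOP: a single byte with no length field
            parsed.append((1, b""))
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("truncated option")
        length = options[i + 1]  # counts the kind and length bytes too
        if length < 2 or i + length > len(options):
            raise ValueError("bad option length")
        parsed.append((kind, options[i + 2:i + length]))
        i += length
    return parsed


def sack_blocks(data: bytes) -> list:
    """Decode SACK option data into (left edge, right edge) pairs."""
    if len(data) % 8:
        raise ValueError("SACK data must be a multiple of 8 bytes")
    return [struct.unpack("!II", data[j:j + 8]) for j in range(0, len(data), 8)]


# MSS = 1460, then NOP, then window scale with shift 7: 4 + 1 + 3 = 8 bytes.
sample_options = bytes([2, 4]) + struct.pack("!H", 1460) + bytes([1, 3, 3, 7])
```

Here the NOP pushes the 3-byte window scale option so that the option area ends on a 4-byte boundary.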

2.10 Padding

Just as the IP header must be a multiple of 32 bits long, the options must be followed by a padding field that fills out the header. The padding is made up of zero bits and extends the options to a 32-bit boundary, so that the entire header length is a multiple of 4 bytes.
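The amount of padding follows directly from the length of the options. The helpers below are a small sketch with assumed names; they also show how the padded length feeds back into the data offset field.

```python
def pad_options(options: bytes) -> bytes:
    """Append zero bytes so the options end on a 4-byte boundary."""
    if len(options) > 40:
        raise ValueError("TCP options cannot exceed 40 bytes")
    padding = (-len(options)) % 4  # 0 to 3 bytes of padding
    return options + b"\x00" * padding


def data_offset_for(options: bytes) -> int:
    """Data offset value (in 32-bit words) for a header with these options."""
    return (20 + len(pad_options(options))) // 4
```

A lone 3-byte window scale option, for instance, needs one byte of padding and produces a data offset of 6.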

3. The UDP protocol:

3.1 Source Port Number:

Indicates the port number of the sender. The field is 16 bits long. This field is optional, and sometimes the source port number is not set; in that case the field is set to 0. This can be used for communication that does not require a reply.

3.2 Destination Port Number:

Indicates the port number of the receiver. The field is 16 bits long.

3.3 Packet Length:

This field stores the sum of the length of the UDP header and the length of the data, in bytes.
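The UDP header is four 16-bit fields (the checksum is described next), so it can be unpacked in one call. The sketch below uses assumed names (`UdpHeader`, `parse_udp_header`) and a hand-built sample datagram.

```python
import struct
from typing import NamedTuple


class UdpHeader(NamedTuple):
    source_port: int
    destination_port: int
    length: int    # header plus data, in bytes
    checksum: int


def parse_udp_header(datagram: bytes) -> UdpHeader:
    """Unpack the 8-byte UDP header from the front of a datagram."""
    if len(datagram) < 8:
        raise ValueError("a UDP header is 8 bytes long")
    header = UdpHeader(*struct.unpack("!HHHH", datagram[:8]))
    if header.length < 8 or header.length > len(datagram):
        raise ValueError("length field does not match the datagram")
    return header


# Source port 0 (not set), destination port 53, 5 bytes of data, no checksum.
sample_datagram = struct.pack("!HHHH", 0, 53, 8 + 5, 0) + b"hello"
```

The length field of the sample is 13: the 8-byte header plus 5 bytes of data.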

3.4 Checksum:

The checksum is designed to protect the integrity of the UDP header and data.

1. What is the purpose of using checksums?

If a bit error occurs on a noisy link, it can be detected by the FCS of the data link layer. So why do TCP and UDP need checksums of their own?

In fact, the TCP and UDP checksums are less about detecting errors caused by noise and more about detecting data damaged by a router memory failure or a program bug. Anyone who has written C knows that pointers, if used incorrectly, can easily destroy data structures in memory. A router's software may also contain bugs or fail abnormally. Packets travel across the Internet through many routers, and once one of them malfunctions, the packets, protocol headers, or data passing through that router are likely to be corrupted. Even in this case, if TCP or UDP provides a checksum, it can determine whether the protocol header and data have been corrupted.

2. How to check?

It is similar to the method used to check the IP datagram header, except that UDP checks the header and the data together. This method is not strong at detecting errors, but it is simple and fast.

(1) Sender: first set the checksum field to all zeros, then treat the pseudo-header and the UDP datagram as a sequence of 16-bit words. If the data part of the UDP datagram has an odd number of bytes, append one all-zero byte (this byte is not sent). Compute the one's complement sum of these 16-bit words, write the one's complement of that sum into the checksum field, and send the UDP datagram.

(2) Receiver: compute the one's complement sum of the received datagram together with the pseudo-header. If there is no error, the result is all 1s; otherwise, the datagram is discarded.

In addition, UDP may be used without a checksum. In that case, the checksum field is filled with 0. Because no checksum calculation is performed, the overhead of protocol processing is reduced, which increases the speed of data forwarding. However, if a port number in the UDP header or an IP address in the IP header is corrupted, other communications may be adversely affected. Therefore, enabling the checksum is recommended for communication over the Internet.
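The sender and receiver steps can be sketched in Python for IPv4. This is an illustrative sketch, not production code: the function names are assumptions, and the pseudo-header layout used here (source address, destination address, a zero byte, protocol number 17, and the UDP length) is the standard IPv4 one rather than something spelled out in the text.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # padding for the calculation only; it is not sent
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header: both addresses, a zero byte, protocol 17, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    """Sender side: checksum over pseudo-header, header (checksum 0), and data."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF  # 0 is reserved to mean "no checksum"


def verify_udp(src_ip, dst_ip, datagram: bytes) -> bool:
    """Receiver side: the sum over everything must come out as all 1s."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(datagram)) + datagram)
    return total == 0xFFFF
```

Because the IP addresses are part of the sum, a datagram that is intact but arrives with a corrupted address in its IP header fails `verify_udp` as well.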

3. Why does the checksum need the UDP pseudo-header?

Why does the pseudo-header also have to be included when the checksum is calculated? In TCP/IP, five items are required to identify an application that is communicating: the source IP address, destination IP address, source port, destination port, and protocol number. However, only two of them (the source and destination ports) are contained in the UDP header; the remaining three are contained in the IP header.

Suppose the other three were corrupted. What would happen? Obviously, applications that should receive a packet would most likely not receive it, and applications that should not receive it might. To avoid such problems, it is necessary to verify that all five identifiers of a communication are correct. This is why the pseudo-header was introduced into the checksum calculation.

In addition, IPv6 has no checksum field in its IP header. Through the pseudo-header, TCP or UDP can verify all five items, so reliable communication is still possible even when the IP header itself cannot be relied on.

4. UDP is unreliable. Why is a checksum required?

UDP is unreliable for the following reasons. Message delivery is not guaranteed: there are no acknowledgements, retransmissions, or timeouts. Delivery order is not guaranteed: packets carry no sequence numbers, are not reordered, and there is no head-of-line blocking. Connection state is not tracked.

UDP does not provide packet segmentation, reassembly, or ordering. That is, after a packet is sent, there is no way to know whether it arrived safely and intact. Even so, each individual datagram should be transmitted as reliably as possible, and the checksum in UDP is the means of maximizing the reliability of that datagram.

