1. Overview

TCP/IP is supposed to be basic knowledge for a programmer, but for a long time it was only a vague idea to me, and it took some time to sort things out. This article introduces the basic elements of network transport, including the following:

  • Frame classification, structure and field meaning
  • Using Wireshark to learn the structure and field meanings of frames, IP packets, and TCP segments in TCP/IP
  • Sequence number and acknowledgment number
  • TCP window mechanism and available window size negotiation mechanism

2. The frame

The amount of data that can be transferred at one time on a network is limited, so during transmission large data is divided into several packets. On Ethernet such a packet is called an Ethernet frame, which is the protocol data unit (PDU) of the data link layer. Different protocols use different frame formats and MTU (Maximum Transmission Unit) values. Frames fall into two main categories:

  • IEEE 802.3 Ethernet standard: Divided into three types

    • Novell raw IEEE 802.3
    • IEEE 802.2 LLC
    • IEEE 802.2 SNAP
  • Ethernet II frames, also known as DIX frames, are the most common frame type and are the Ethernet frame format used in TCP/IP networks

IEEE 802.3 frames and Ethernet II frames are compatible with each other, and their formats are similar, as shown below:



An Ethernet frame is between 64 bytes (46 bytes of payload + 18 bytes of header and frame check sequence) and 1522 bytes (1500 bytes of payload + 22 bytes) long; the 1500-byte payload limit is the MTU. Ethernet II and 802.3 originally specified a maximum frame size of 1518 bytes. In 1998, IEEE 802.3ac added VLAN support and raised the maximum to 1518 + 4 (VLAN tag) = 1522 bytes.

Note: if a frame is shorter than 64 bytes, padding must be added to bring it up to 64 bytes.
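The size limits above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the constant and function names are made up for illustration, and the 18 bytes of overhead are assumed to be the 14-byte header plus the 4-byte frame check sequence of an untagged frame.

```python
# Sizes in bytes, taken from the figures quoted in the text.
MIN_FRAME = 64        # minimum frame length
HEADER_AND_FCS = 18   # assumed: 14-byte header + 4-byte frame check sequence
VLAN_TAG = 4          # optional 802.1Q tag
MAX_PAYLOAD = 1500    # largest payload (the MTU)


def padded_frame_length(payload_len: int) -> int:
    """Length of an untagged frame after padding short payloads."""
    if not 0 <= payload_len <= MAX_PAYLOAD:
        raise ValueError("payload must be between 0 and 1500 bytes")
    return max(payload_len + HEADER_AND_FCS, MIN_FRAME)


def max_frame_length(vlan_tagged: bool) -> int:
    """Largest frame: 1518 bytes untagged, 1522 bytes with a VLAN tag."""
    return MAX_PAYLOAD + HEADER_AND_FCS + (VLAN_TAG if vlan_tagged else 0)


if __name__ == "__main__":
    print(padded_frame_length(10))    # a short payload is padded up to 64
    print(max_frame_length(False))    # 1518
    print(max_frame_length(True))     # 1522
```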

2.1. IEEE 802.3 format

The first line in the previous figure shows the IEEE 802.3 format. The meanings of each field are as follows:

  • A frame begins with a 7-byte Preamble and a 1-byte Start of Frame Delimiter.
  • The header contains the destination MAC address (MAC destination) and the source MAC address (MAC source). An optional IEEE 802.1Q VLAN tag (802.1Q tag (optional)) describes VLAN membership and transmission priority. The length field (Length (IEEE 802.3)) holds the length of the payload.
  • The frame check sequence is a 32-bit cyclic redundancy check (CRC) code used to verify whether the frame data is corrupted.
  • Interpacket gap: after a frame is sent, the sender must keep the line idle for at least 12 octets before sending the next frame.
  • Payload: Payload of a frame

2.2. Ethernet II frame format

Ethernet II frames are the de facto standard. They are basically the same as IEEE 802.3 frames except for the following:

  • At the start of the frame, 802.3 splits the first 8 octets into a Preamble (7 octets) and an SFD (Start of Frame Delimiter, 1 octet), whereas Ethernet II treats them as a single Preamble (8 octets).
  • In the type or length field (Ethertype (Ethernet II) or Length (IEEE 802.3)), 802.3 stores the payload length, whose value is less than or equal to 1500, while Ethernet II stores the type, whose value is greater than or equal to 1536 (0x0600). The two frame types can therefore be told apart by the size of this value. Values from 1501 to 1535 are undefined.
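As a rough illustration of how the type or length field separates the two formats, the following Python sketch unpacks the 14-byte header of a captured frame (destination MAC, source MAC, type or length) and classifies it. The function name and the returned dictionary keys are hypothetical. The sketch assumes the preamble has already been stripped, as it is in a capture, and it does not look inside an 802.1Q tag.

```python
import struct


def parse_ethernet_header(frame: bytes) -> dict:
    """Split the first 14 bytes of a frame into its three header fields."""
    if len(frame) < 14:
        raise ValueError("an Ethernet header needs at least 14 bytes")
    # "!" = network byte order, 6s = 6-byte MAC address, H = 16-bit unsigned
    dst, src, type_or_len = struct.unpack("!6s6sH", frame[:14])
    if type_or_len <= 1500:
        kind = "IEEE 802.3 length"
    elif type_or_len >= 1536:
        kind = "Ethernet II type"
    else:
        kind = "undefined"
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "value": type_or_len,
        "kind": kind,
    }


if __name__ == "__main__":
    # Broadcast destination, made-up source address, EtherType 0x0800 (IPv4)
    sample = bytes.fromhex("ffffffffffff" "001122334455" "0800")
    print(parse_ethernet_header(sample))
```

Comparing the value against 1500 and 1536 is all a receiver needs in order to decide how to interpret the rest of the frame.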

The “802.1Q Tag (optional)” field is optional. Without this field, the common Ethernet II frame format is as follows:

3. Overview of packets

You can use Wireshark to capture packets and select one to inspect.

Wireshark shows the selected packet in five parts:

  • First line: physical layer data, an overview of the physical layer data frame. The data unit of this layer is the bit.
  • Second line: data link layer data, the Ethernet frame header. The data unit of this layer is the frame.
  • Third line: network layer data, the IP header. The data unit of this layer is the packet.
  • Fourth line: transport layer data, the TCP header. The data unit of this layer can also be called a packet; to tell them apart, a TCP data unit is called a segment and a UDP data unit is called a datagram.
  • Fifth line: application layer data, the payload. This part is not always present.

The detailed structure and fields of the first four parts are described below.

4. Overview of physical layer data frames



The content above is not transmitted over the network; it is generated when the frame is parsed, so it does not count toward the length of the frame.

5. Ethernet frame header information at the data link layer

The format of the Ethernet frame header is as follows:

The following information is displayed in Wireshark:

6. IP datagram format

The IP datagram format is as follows:

The meanings of each field are as follows:

  • Version: the current protocol version number is 4, so IP is sometimes called IPv4.
  • Header length: the number of 32-bit words in the header, including any options. Because it is a 4-bit field, the header can be at most 60 bytes long (15 * 4).
  • Type of service (TOS): most current implementations do not support the TOS feature.
  • Total length: the length of the entire IP datagram, in bytes. Using the header length field and the total length field, we can work out where the data in the IP datagram starts and how long it is. Since this field is 16 bits long, an IP datagram can be up to 65535 bytes long.
  • Identification: uniquely identifies each datagram sent by the host. It is usually incremented by 1 each time a datagram is sent.
  • Flags field and fragment offset field: used for fragmentation. When the size of an IP packet to be sent exceeds the MTU, the IP layer must fragment the data; otherwise the data cannot be sent. Fragmentation happens at the IP layer. Different networks have different MTUs, so if the MTU of a network on the transmission path is smaller than that of the source network, a router may fragment the IP packets again. Fragments are reassembled only at the IP layer of the destination.
  • Time to live (TTL): sets the maximum number of routers a datagram can pass through, and so limits the lifetime of the datagram. The initial TTL value is set by the source host, and each router that processes the datagram decrements it by 1. When the value reaches 0, the datagram is discarded and an ICMP message is sent to notify the source host.
  • Protocol: identifies the upper-layer protocol carried in the datagram, such as TCP or UDP.
  • Header checksum: a checksum computed over the IP header only. It does not cover the data after the header.
  • Source IP address
  • Destination IP address
  • Options: optional information of variable length in a datagram

Note: the IP header is 20 bytes long when no options are included.
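The field layout described above can be made concrete with a short Python sketch that unpacks the fixed 20-byte part of an IPv4 header. The function name and dictionary keys are illustrative assumptions, options are ignored, and the sample header at the end is hand-built rather than taken from a real capture.

```python
import ipaddress
import struct


def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are skipped)."""
    if len(data) < 20:
        raise ValueError("an IPv4 header needs at least 20 bytes")
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    header_len = (ver_ihl & 0x0F) * 4          # 4-bit count of 32-bit words
    return {
        "version": ver_ihl >> 4,
        "header_length": header_len,
        "tos": tos,
        "total_length": total_len,
        "payload_length": total_len - header_len,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": (flags_frag & 0x1FFF) * 8,   # stored in 8-byte units
        "ttl": ttl,
        "protocol": proto,                      # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


if __name__ == "__main__":
    # Hand-built header: version 4, 5 words, total length 40, DF set, TTL 64, TCP
    sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                         bytes([192, 168, 0, 1]), bytes([10, 0, 0, 2]))
    print(parse_ipv4_header(sample))
```

The payload length falls out of the two length fields exactly as the total length bullet describes: total length minus header length.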

The following information is displayed in Wireshark:

7. TCP segment format

TCP segment format:

The meanings of each field are as follows:

  • Port number of the source end and destination end, used to search for the originating and receiving application processes. These two values, together with the source and destination IP addresses in the IP header, uniquely determine a connection
  • Sequence number and acknowledgment number: if you think of the byte stream as a one-way flow between two applications, TCP counts every byte in it with a sequence number.

    • Sequence Number: identifies the position of the segment's data in the byte stream sent from the TCP sender to the TCP receiver. It is the sequence number of the first data byte in the segment. It is mainly used to solve the problem of segments arriving out of order.
    • Acknowledgment Number: this 32-bit field contains the next sequence number that the end sending the acknowledgment expects to receive, so it is the sequence number of the last successfully received data byte plus 1. It is used to solve the packet loss problem.
  • Header length: required because the length of the options field is variable. This field takes up 4 bits, so a TCP header is at most 60 bytes long. Without options, the normal length is 20 bytes.
  • Flags: there are six flag bits in the TCP header, and several of them can be set to 1 at the same time.

    • URG: the urgent pointer is valid. It tells the receiver that the segment carries urgent data that should be processed as quickly as possible.
    • ACK: the acknowledgment number is valid.
    • PSH: the receiver should deliver this segment to the application layer as soon as possible. This flag represents the push operation: when the segment arrives at the receiving end, it is handed to the application immediately rather than queued in a buffer.
    • RST: a connection reset request. It is used to reset connections that have run into errors and to reject faulty or invalid packets.
    • SYN: synchronizes sequence numbers. It is used to initiate a connection.
    • FIN: the sending end has reached the end of its data and has no more data to transmit.
  • Window size: flow control is provided by the window size that each end of the connection advertises. The window size is a number of bytes, starting from the byte indicated by the acknowledgment number field, which is the byte the receiver expects to receive next. The window size is a 16-bit field, so the maximum window size is 65535 bytes.
  • Checksum: covers the entire TCP segment, that is, the TCP header and the TCP data. This is a mandatory field that must be computed and stored by the sender and verified by the receiver.
  • Urgent pointer: valid only when the URG flag is set to 1. The urgent pointer is a positive offset that is added to the value in the sequence number field to give the sequence number of the last byte of urgent data. Urgent mode is a way for the sender to send urgent data to the other end.

Note: the TCP header is 20 bytes long when no options are included.
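To show how these fields sit in the fixed 20-byte TCP header, here is a minimal Python sketch that unpacks them and lists the flag bits that are set. The function name, the dictionary keys, and the sample values are all made up for illustration, and options are not decoded.

```python
import struct

# Flag names from least significant bit upward: FIN = 0x01 ... URG = 0x20
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped)."""
    if len(data) < 20:
        raise ValueError("a TCP header needs at least 20 bytes")
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_length": (offset_flags >> 12) * 4,   # 4-bit count of 32-bit words
        "flags": [name for bit, name in enumerate(FLAG_NAMES)
                  if offset_flags & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


if __name__ == "__main__":
    # Hand-built SYN+ACK header: data offset 5 words, flags 0x12, window 8192
    sample = struct.pack("!HHIIHHHH", 443, 51000, 1000, 2000,
                         (5 << 12) | 0x12, 8192, 0, 0)
    print(parse_tcp_header(sample))
```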

The following information is displayed in Wireshark:

8. Length calculation

Several lengths appear in the sections above. They are listed here:

  • Frame length, the total length of the entire packet = 2220 bytes
  • Physical layer data frame overview length: this data is not transmitted over the network but is a record generated when the frame is received, so it is not counted in the frame length, that is, 0 bytes
  • Data link layer Ethernet frame header length = 14 bytes, a fixed value
  • IP header length = 20 bytes
  • TCP header length = 20 bytes
  • TCP payload data = 2166 bytes

Frame length (2220 bytes) = Physical layer data frame overview length (0 bytes) + Ethernet frame header length (14 bytes) + IP header length (20 bytes) + TCP header length (20 bytes) + TCP payload data (2166 bytes)
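The same sum can be turned around to recover the payload length from the frame length. The Python sketch below only restates that arithmetic; the names are hypothetical and the header sizes are the option-free values used in this example.

```python
ETHERNET_HEADER = 14   # bytes, fixed
IP_HEADER = 20         # bytes, assuming no IP options
TCP_HEADER = 20        # bytes, assuming no TCP options


def tcp_payload_length(frame_length: int,
                       ip_header: int = IP_HEADER,
                       tcp_header: int = TCP_HEADER) -> int:
    """Payload bytes left after removing the three headers from the frame."""
    return frame_length - ETHERNET_HEADER - ip_header - tcp_header


if __name__ == "__main__":
    print(tcp_payload_length(2220))   # 2220 - 14 - 20 - 20 = 2166
```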

9. Sequence number increment rules and relative sequence numbers

Each end of a TCP session maintains a 32-bit sequence number. The sequence number counts the bytes successfully sent by the current end, and the acknowledgment number counts the bytes successfully received by the current end. Note: if the received packet has the SYN or FIN flag set, the acknowledgment number must be incremented by 1.

In each TCP session the initial sequence number is random and can be any value, such as 0xf61C6CBE. For convenience, Wireshark displays relative sequence and acknowledgment numbers instead of the actual ones. Relative value displayed = (actual sequence number - initial sequence number of the session)
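Both rules can be sketched in a few lines of Python. The function names are hypothetical. The modulo 2^32 step is an addition not spelled out above: it reflects that a 32-bit sequence number wraps around to 0 after 0xFFFFFFFF.

```python
SEQ_MODULO = 2 ** 32   # sequence numbers are 32 bits wide and wrap around


def relative_seq(actual: int, initial: int) -> int:
    """Relative value = actual sequence number - initial sequence number."""
    return (actual - initial) % SEQ_MODULO


def expected_ack(seq: int, payload_len: int,
                 syn: bool = False, fin: bool = False) -> int:
    """Acknowledgment number the peer should send for a received segment.

    SYN and FIN each add 1, on top of the payload bytes.
    """
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_MODULO


if __name__ == "__main__":
    isn = 0xF61C6CBE
    print(relative_seq(isn, isn))                  # 0 for the first segment
    print(hex(expected_ack(isn, 0, syn=True)))     # SYN with no data: ISN + 1
```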

10. TCP window mechanism and available window size negotiation mechanism

TCP transmits data reliably through its sequence number and acknowledgment mechanism. If every piece of data had to be acknowledged before the next one could be sent, transmission would be very inefficient, so the sliding window mechanism was introduced. The sliding window makes full use of the bandwidth and of the buffers on both sides: the sender does not have to wait for the other party's acknowledgment but can send several packets in a row, while the receiver temporarily stores the data in its buffer and sends back an acknowledgment. This greatly increases the transmission speed. To avoid losing data when the receiver's buffer fills up, TCP also provides a mechanism for negotiating the available window size. The two parties use the Window scale and Window size value fields in the TCP packet to tell each other how much buffer space is currently available.
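One way to picture the sliding window from the sender's side is as a budget: the advertised window minus the bytes already sent but not yet acknowledged. The Python sketch below is a deliberately simplified model with hypothetical names. It ignores sequence number wrap-around and congestion control.

```python
def usable_window(last_ack: int, advertised_window: int, next_seq: int) -> int:
    """Bytes the sender may still send without waiting for a new acknowledgment.

    last_ack          -- highest acknowledgment number received so far
    advertised_window -- window size announced by the receiver, in bytes
    next_seq          -- sequence number of the next byte to be sent
    """
    in_flight = next_seq - last_ack          # sent but not yet acknowledged
    return max(advertised_window - in_flight, 0)


if __name__ == "__main__":
    print(usable_window(1000, 8192, 1000))   # nothing in flight: full window
    print(usable_window(1000, 8192, 5096))   # 4096 bytes in flight: half left
    print(usable_window(1000, 8192, 9192))   # window full: sender must wait
```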

10.1. How TCP sessions negotiate the available window size, as seen in captured packets

By default, TCP can only use the 16-bit window field, so the maximum window is 65535 bytes. Now that bandwidth has increased massively and gigabit links are common, a 65535-byte window is no longer sufficient. The Window Size Scaling option was introduced to break this limit. The true available window size is 2^(Window scale) * Window size value.

The three-way handshake illustrates how the window is negotiated. In the first handshake packet, the important attributes are as follows:

Window size value: 8192

Calculated window size: 8192 (at this point the calculated window size equals the window size value, because scaling has not been negotiated yet)

Window scale: 2 (multiply by 4)

The important attributes in the second handshake packet are as follows:

Window size value: 14600

Calculated window size: 14600 (again equal to the window size value, because the two sides have not yet finished negotiating whether the other side supports window scaling)

Window scale: 7 (multiply by 128), which also indicates that this side supports the option

The important attributes in the third handshake packet are as follows:

Window size value: 16652

Calculated window size: 66608

Since both sides support the Window Size Scaling option, it is now enabled, which gives a larger calculated window size:

Window size scaling factor = 2^(Window scale) = 2^2 = 4

Calculated window size = Window size value * Window size scaling factor = 16652 * 4 = 66608

In other words, this side tells its peer that its available window size is 66608 bytes.

If either party does not support this option, the window size scaling factor is not available, and the window size can only be the Window size value.
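The calculation for the three handshake packets can be reproduced with a small Python sketch. The function and parameter names are hypothetical. The rule that the window in a SYN packet is never scaled matches the first two packets above, where the calculated window size equals the raw value.

```python
def calculated_window_size(window_size_value: int,
                           window_scale: int,
                           both_sides_support_scaling: bool,
                           is_syn: bool = False) -> int:
    """Effective window in bytes for one TCP packet.

    window_scale is the shift count this side announced during the handshake.
    The raw value is used as-is in SYN packets and whenever either side did
    not offer the scaling option.
    """
    if is_syn or not both_sides_support_scaling:
        return window_size_value
    return window_size_value << window_scale   # same as value * 2**window_scale


if __name__ == "__main__":
    # First handshake packet (SYN): scale 2 is announced but not applied yet
    print(calculated_window_size(8192, 2, False, is_syn=True))    # 8192
    # Second handshake packet (SYN+ACK): scale 7 is announced, not applied
    print(calculated_window_size(14600, 7, True, is_syn=True))    # 14600
    # Third handshake packet (ACK): scaling is active, 16652 * 4
    print(calculated_window_size(16652, 2, True))                 # 66608
```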