TCP/IP not fully explained

TCP/IP is a general name for a family of communication protocols. First, let's take a look at the OSI reference model.

OSI reference model

The roles of the layers in the OSI reference model are as follows:

To summarize:

  • The application layer: provides services to applications and specifies the details of communication within an application, including protocols such as file transfer, email and remote login.
  • The presentation layer: converts the information processed by the application into a format suitable for network transmission, or converts data coming up from the lower layer into a format the application can process. It is therefore mainly responsible for data format conversion: concretely, converting a device's native data format into the network's standard transmission format. Different devices may interpret the same bit stream differently, and keeping these interpretations consistent is the main role of this layer.
  • The session layer: responsible for establishing and tearing down communication connections, and for other management related to data transfer, such as deciding how the data is split up for sending.
  • The transport layer: plays the role of reliable transmission. It is processed only on the communicating end nodes, not on routers.
  • The network layer: transfers data to the destination address, which may lie on another network reached through routers. This layer is therefore responsible for addressing and routing.
  • The data link layer: responsible for communication between nodes connected at the physical level, for example between two nodes on the same Ethernet. It divides the sequence of 0s and 1s into meaningful data frames and sends them to the peer (frame generation and reception).
  • The physical layer: responsible for converting the bit stream of 0s and 1s into physical signals such as voltage levels or light pulses, and back again.

Basic TCP/IP knowledge

TCP/IP and OSI reference model

Hardware (physical layer)

At the bottom of TCP/IP is the hardware responsible for data transfer: physical-layer media and devices such as Ethernet or telephone lines. TCP/IP does not give them a unified definition. Depending on the transmission medium used at the physical level (wired network cables or wireless, for example), bandwidth, reliability, security, latency and so on all differ, and TCP/IP sets no fixed requirements in these respects. In short, TCP/IP is a protocol suite built on the premise that the interconnected devices are already able to communicate with each other.

Network Interface Layer (Data link layer)

The network interface layer uses the data link layer (Ethernet, for example) to communicate, which is why it is also treated as the data link layer. It does no harm to think of it as the "driver" that makes the NIC work. A driver is software that bridges the operating system and the hardware: a peripheral or expansion card cannot be used the moment it is plugged into the computer or an expansion slot; it also needs the corresponding driver.

Internet Layer (Network layer)

The Internet layer uses the IP protocol, which corresponds to layer 3, the network layer, in the OSI model. IP forwards packetized data based on IP addresses.

The functions of the Internet layer and the transport layer in TCP/IP are usually provided by the operating system. In particular, a router must be able to forward packets at the Internet layer.

In addition, all hosts and routers connected to the Internet must implement IP functions. Other network devices connected to the Internet (such as Bridges, Repeaters, or hubs) do not necessarily implement IP or TCP functions.

IP

  • IP is a protocol that sends packets across networks, making them available to the entire Internet. The IP protocol enables data to be sent to the other side of the globe, during which time it identifies the host using the IP address.
  • IP also implies the functions of the data link layer. Through IP, hosts that communicate with each other can communicate regardless of the underlying data link.
  • Although IP is also a packet-switching protocol, it has no retransmission mechanism: a packet is not resent even if it fails to reach the peer host. IP is therefore an unreliable protocol.

ICMP

  • If an IP packet fails to reach the destination address of the peer due to an exception, an exception notification needs to be sent to the sender. ICMP was designed for this function. It is also sometimes used to diagnose the health of a network.

ARP

  • A protocol that resolves the physical address (MAC address) from the IP address of a packet.

The transport layer

The transport layer of TCP/IP has two representative protocols. The functionality of this layer itself is similar to that of the transport layer in the OSI reference model.

The transport layer's primary function is to enable communication between applications. Multiple programs usually run on a computer at the same time, so it must be possible to distinguish which programs are communicating with which. What identifies these applications is the port number.
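
As a minimal sketch (not from the original text), the following snippet shows how port numbers let the operating system deliver data to the right application; the port numbers 50001/50002 are arbitrary choices for illustration.

```python
import socket

# Two applications on the same host are told apart by their port numbers.
# We bind two sockets to different ports; the OS delivers an incoming
# datagram to whichever socket owns the destination port.
app_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
app_a.bind(("127.0.0.1", 50001))   # "application A" listens on port 50001

app_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
app_b.bind(("127.0.0.1", 50002))   # "application B" listens on port 50002

# A third socket sends to port 50002, so only app_b receives the message.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello B", ("127.0.0.1", 50002))

data, addr = app_b.recvfrom(1024)
print(data, "arrived on port", app_b.getsockname()[1])

# Well-known services also map names to fixed port numbers:
print(socket.getservbyname("http", "tcp"))   # usually 80
print(socket.getservbyname("smtp", "tcp"))   # usually 25

for s in (app_a, app_b, sender):
    s.close()
```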

TCP

  • TCP is a connection-oriented transport layer protocol. It can ensure the communication between hosts on both ends is reachable. TCP correctly handles packet loss and transmission sequence disorder during transmission. In addition, TCP can effectively use bandwidth and alleviate network congestion.
  • However, establishing and tearing down a connection requires at least seven packets to be sent and received, which wastes some network traffic. In addition, to improve network utilization TCP defines a variety of complex mechanisms, which makes it less suitable for uses such as video conferencing, where a fixed volume of audio and video data must be sent at fixed intervals.

UDP

  • UDP is different from TCP in that it is a connectionless transport-layer protocol. UDP does not care whether the peer actually receives the transmitted data; if you need to check whether the peer received a packet, or whether the peer is even connected to the network, you must implement that in the application yourself.
  • UDP is commonly used in multimedia fields such as multicast, broadcast communication, and video communication with less packet data.

Application layer (layer above the session layer)

In the TCP/IP layering, the functions of the OSI session, presentation and application layers are all implemented within the application. Sometimes these functions are implemented by a single program, sometimes by several. So a closer look at TCP/IP's application-layer functionality reveals that it covers not only the OSI application layer but also the session and presentation layers.

The architecture of TCP/IP applications overwhelmingly follows the client/server model. The program that provides a service is called the server, and the program that receives the service is called the client. In this communication mode, the service-providing program is deployed on a host in advance, waiting to receive whatever request a client may send at any moment.

The client can send requests to the server at any time. Sometimes the server may process exceptions or be overloaded. In this case, the client can resend the request after waiting for a while.

WWW

  • The WWW can be said to be an important driving force for the popularity of the Internet. With the help of a mouse and keyboard, users can surf freely and easily on the Web using software called a Web browser. This means that information stored on a remote server is displayed in the browser with a click of the mouse. The browser can not only display text, pictures, animation and other information, but also play sound and run programs.
  • The HyperText Transfer Protocol (HTTP) is used for communication between the browser and the server. The main format of the data transferred is HTML (HyperText Markup Language). HTTP in the WWW belongs to the OSI application layer protocol, while HTML belongs to the presentation layer protocol.

E-Mail

  • E-mail actually means sending letters over the Internet. With E-mail, no matter how far away people are, as long as they are connected to the Internet, they can send emails to each other. The protocol used to send E-mail is called SMTP.

File Transfer (FTP)

  • File transfer refers to transferring files saved on another computer’s hard disk to a local hard disk, or transferring files from a local hard disk to another machine’s hard disk.
  • This process uses a protocol called the File Transfer Protocol (FTP). FTP has been in use for a long time and can transfer files in either binary or text mode.
  • During file transfer in FTP, two TCP connections will be established, which are the control connection used for sending the transmission request and the data connection used for actual data transmission.

Remote Login (TELNET and SSH)

  • Remote login is the function of logging on to a remote computer so that programs on that computer can run. TELNET and SSH are commonly used for remote login on the TCP/IP network.

Network Management (SNMP)

  • The Simple Network Management Protocol (SNMP) is used for network management over TCP/IP. The hosts, bridges and routers managed through SNMP are called SNMP agents, and the managing side is called the manager. SNMP is the protocol the manager and the agents use to communicate.
  • The SNMP agent stores information about network interfaces, traffic volumes, error counts, device temperature and so on. This information is accessed through the Management Information Base (MIB). In TCP/IP network management, SNMP is therefore the application protocol and MIB acts as the presentation-layer protocol.
  • The larger the scope and the more complex the structure of a network, the more it needs to be managed effectively. SNMP allows administrators to check network congestion in a timely manner, discover faults in a timely manner, and collect necessary information for network expansion.

TCP/IP layered model and communication example

Packet header

In each layer, a header is appended to the data to be sent; it contains information that layer needs, such as the destination address and protocol-related information. Generally speaking, the information supplied for the protocol is the header, and the content to be sent is the data.


A packet transmitted on the network consists of two parts: the header used by the protocol and the data handed down from the upper layer. The structure of the header is defined in detail by the protocol specification: for example, which bits of the packet hold the field identifying the upper-layer protocol, how the checksum is computed and into which bits it is inserted, and so on. Two computers cannot communicate with each other if they identify these protocol fields or calculate the checksum differently.

The header therefore makes clear, right at the start of the packet, how the protocol should read the data. Conversely, looking at the header tells you what information the protocol needs and what it will do with the data. Looking at a packet's header is thus like looking at the protocol's specification; the header is, so to speak, the face of the protocol.

Example: Sending data packets

Suppose user A sends an email saying "Good morning" to user B. In terms of TCP/IP communication, the email is sent from one computer, A, to another computer, B. Let's use this example to walk through the process of TCP/IP communication.

  • Application processing

    Start the mail application to create an email, fill in the recipient's address, type the message "Good morning" on the keyboard, and click the "Send" button; TCP/IP communication then begins.

    First, encoding takes place in the application (into UTF-8, for example). This encoding corresponds to the OSI presentation-layer function.

    After the encoding, the message is not necessarily sent immediately, because some mail software can send several messages in one batch, and users may also click a "Receive" button to fetch new mail. Functions that manage when a communication connection is established and when data is sent belong, broadly speaking, to the session layer of the OSI reference model.

    The application establishes a TCP connection the moment the email is sent and uses that connection to transmit the data: it hands the application data to TCP in the layer below, which then performs the actual sending.

  • TCP module processing

    TCP is responsible for establishing connections, sending data, and disconnecting connections according to the instructions of the application. TCP provides reliable transmission of data sent from the application layer to the peer end.

    To implement these functions, a TCP header is attached to the front of the application-layer data. The TCP header contains the source and destination port numbers (identifying the sending and receiving applications), the sequence number (identifying which part of the data this packet carries) and the checksum (used to detect corrupted data). The packet with the TCP header attached is then passed to IP.

  • IP module processing

    IP treats the TCP header and TCP data passed down from TCP as its own data and prepends its own IP header. In an IP packet, the IP header is therefore followed by the TCP header, which is followed by the application's own header and data. The IP header contains the receiver's and sender's IP addresses, along with a field indicating whether the data that follows is TCP or UDP.

    After the IP packet is generated, the router or host it should be handed to next is determined by consulting the routing table. The IP packet is then passed to the driver of the network interface leading to that router or host, which actually sends the data.

    If the MAC address of the receiving end is not known, the Address Resolution Protocol (ARP) is used to look it up. Once the peer's MAC address is known, the MAC address and IP address can be handed to the Ethernet driver for transmission.

  • Network interface (Ethernet driven) processing

    To the Ethernet driver, an IP packet passed down from IP is nothing more than data. The driver attaches an Ethernet header to it and sends it out. The Ethernet header contains the receiver's MAC address, the sender's MAC address and the Ethernet type field identifying the protocol carried as payload. The Ethernet frame generated from this information is transmitted to the receiving end over the physical layer. During sending, the FCS is calculated by the hardware and appended to the end of the frame; its purpose is to detect frames corrupted by noise.

The summary is as follows:


Packets passing through a data link

As the packet travels down the stack, the Ethernet header, IP header, TCP header (or UDP header) and the application's own header and data are appended from front to back, and an Ethernet trailer is appended to the end of the frame.

Each packet header contains at least two pieces of information: the addresses of the sender and receiver, and the protocol type of the upper layer.

As each protocol layer passes, there must be information that identifies the sender and receiver of the packet. Ethernet uses MAC addresses, IP uses IP addresses, and TCP/UDP uses port numbers to identify hosts at both ends. Even within an application, information like an E-mail address is an address identifier. This address information is appended to the header of each packet as it passes through each layer. As shown in figure:

In addition, each layer's packet header contains an identification field that indicates the protocol of the layer above: the Ethernet type field in the Ethernet header, the protocol field in IP, and the two port numbers in TCP/UDP all play this role. Even an application's header sometimes contains a tag identifying its data type.

Packet receiving processing

Receiving and sending are reverse processes:

  • Network interface (Ethernet driven) processing

    After receiving an Ethernet frame, the host first checks the MAC address in the Ethernet header to decide whether the frame is addressed to it. Frames not addressed to the host are discarded.

    If the frame is addressed to the host, the type field in the Ethernet header is examined to determine what kind of data the frame carries. In this example the data type is an IP packet, so the data is passed on to the subroutine that handles IP; if it were some other protocol such as ARP, it would be passed to ARP instead. If the type field contains a protocol type the host does not recognize, the data is discarded.

  • IP module processing

    The IP module does something similar when it receives the IP header and the data that follows. If the destination IP address in the header matches the host's own address, it accepts the data and looks up the upper-layer protocol: if that protocol is TCP, the part after the IP header is handed to TCP; if it is UDP, it is handed to UDP. On a router, the destination address is usually not the router's own address; in that case the routing table is consulted to find the next host or router, and the data is forwarded accordingly.

  • TCP module processing

    In the TCP module, the checksum is evaluated first to determine whether the data is corrupted. It then checks whether the data is being received in sequence, and finally looks at the port number to identify the specific application.

    After the data is received, the receiver sends an acknowledgement back to the sender. If that acknowledgement never reaches the sender, the sender assumes the data was not received and keeps retransmitting it.

    Once the data has been received in its entirety, it is passed to the application identified by the port number.

  • Application processing

    The receiving application receives the data sent by the sender. By parsing it, the application learns that the mail is addressed to user B. If host B has no mailbox for user B, it returns a "no such address" error to the sender.

    In this example, however, host B does have a mailbox for user B, so the body of the email can be received and is saved to the local hard disk. If saving succeeds, the receiver returns a "processing OK" acknowledgement to the sender; if the disk is full or the mail fails to be saved, it returns a "processing error" acknowledgement instead.

    Thus, user B can use the mail client on host B to receive and read the “good morning” E-mail sent by user A on host A.


The data link

Data link refers to the data link layer in the OSI reference model, and sometimes to Ethernet, wireless local area network and other means of communication.

The protocol of the data link layer defines the specification for transmission between devices interconnected by a communication medium. Communication media include twisted-pair cable, coaxial cable, optical fiber, radio waves and infrared. In addition, data is sometimes relayed between devices through switches, bridges, repeaters and so on.

The data link can also be regarded as the smallest unit of network transmission. In fact, a close look at the Internet connecting the world shows that it is nothing more than a combination of many such data links, which is why the Internet can also be called "a collection of data links".

The MAC address

MAC addresses are used to identify interconnected nodes on a data link. In Ethernet and FDDI, MAC addresses follow the IEEE 802.3 specification. Other technologies such as wireless LAN (IEEE 802.11a/b/g/n) and Bluetooth use the same kind of MAC address.

The MAC address is 48 bits long. Bits 3 through 24 form the vendor identifier; each vendor has its own unique identifier. Bits 25 through 48 are assigned internally by the vendor to identify each individual NIC. In this way it can be guaranteed that no two network cards in the world share the same MAC address.

The IEEE 802.3 MAC address specification is not limited to any particular type of data link: no two MAC addresses on any data link network (Ethernet, FDDI, ATM, wireless LAN, Bluetooth, etc.) will be the same.

  • Exception: on some boards the MAC address can be set freely. For example, when multiple VMs are started on one host, the virtual NICs have no physical hardware, so the virtualization software has to assign their MAC addresses itself, and it is then hard to guarantee that the generated addresses are unique. Nevertheless, whatever the protocol or communication device, the design always assumes that MAC addresses are unique; this can be regarded as a basic principle of the networked world.
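
As a small illustration of the bit layout described above, the sketch below splits a MAC address into its vendor (OUI) part and its device part; the sample address is hypothetical (it happens to have the locally-administered bit set, as a virtual NIC often would).

```python
def split_mac(mac: str) -> dict:
    """Split a MAC address into its vendor (OUI) and device-specific parts.

    The first 24 bits carry the vendor identifier (the first two bits being
    the I/G and U/L flags); the last 24 bits are assigned by the vendor to
    each individual NIC.
    """
    octets = [int(part, 16) for part in mac.split(":")]
    assert len(octets) == 6, "a MAC address is 48 bits = 6 octets"

    vendor_part = octets[:3]    # OUI, administered by the IEEE
    device_part = octets[3:]    # assigned by the vendor per NIC

    first = octets[0]
    is_multicast = bool(first & 0b00000001)              # I/G bit
    is_locally_administered = bool(first & 0b00000010)   # U/L bit

    return {
        "oui": "-".join(f"{o:02X}" for o in vendor_part),
        "device": "-".join(f"{o:02X}" for o in device_part),
        "multicast": is_multicast,
        "locally_administered": is_locally_administered,
    }


# Hypothetical address, e.g. one assigned to a virtual NIC:
print(split_mac("02:42:ac:11:00:02"))
```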

Shared media network

From the point of view of how the communication medium is used, networks can be divided into shared-media and non-shared-media types.

A shared-media network is one in which multiple devices share a single communication medium. The earliest Ethernet and FDDI were shared-media networks. In such networks the devices use the same carrier channel for both sending and receiving; for this reason they basically use half-duplex communication and need media access control.

There are two media access control methods in shared-media networks: one is contention, the other is token passing.

  • Contention way

    Contention means competing for the right to transmit data; this is also known as CSMA (Carrier Sense Multiple Access). In this method each station in the network occupies the channel and sends data on a first-come, first-served basis. If several stations send frames at the same time, collisions occur, which leads to congestion and degraded performance.

Ethernet and some other networks use an improved version of CSMA: the CSMA/CD (Collision Detection) method. CSMA/CD requires each station to check for collisions while sending and to release the channel as early as possible when a collision occurs. Its specific working principle is as follows:

  • If there is no data flow on the carrier channel, any station can send data.

  • While sending, check whether a collision occurs. If a collision occurs, abandon the data being sent and release the carrier channel immediately.

  • After abandoning the transmission, wait a random delay, then contend for the medium again and retransmit the frame (a backoff sketch follows the token-passing description below).

  • Token passing mode

    Token passing sends a special frame called a "token" around a token ring, and only the station holding the token may send data. This approach has two characteristics: there are no collisions, and every station gets a fair chance to obtain the token as it circulates, so performance does not collapse even when the network is busy.

    Of course, since a station cannot send data frames until it receives the token, link utilization stays below 100% unless the network is fairly busy. For this reason a variety of token-passing refinements have been derived, such as early token release, token appending and running multiple tokens on the ring at the same time, all aimed at maximizing network performance.
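
The "random delay" used by CSMA/CD after a collision is, in classic Ethernet, chosen by truncated binary exponential backoff. The sketch below illustrates that rule; the 51.2 µs slot time is the value for 10 Mbit/s Ethernet and is used here purely for illustration.

```python
import random

SLOT_TIME_US = 51.2   # slot time of classic 10 Mbit/s Ethernet (illustrative value)

def backoff_delay(collision_count: int) -> float:
    """Truncated binary exponential backoff after a collision.

    After the n-th consecutive collision, a station picks a random number of
    slot times in [0, 2**min(n, 10) - 1] and waits that long before contending
    for the medium again. After 16 attempts the frame is abandoned.
    """
    if collision_count > 16:
        raise RuntimeError("too many collisions, frame abandoned")
    k = min(collision_count, 10)
    slots = random.randint(0, 2 ** k - 1)
    return slots * SLOT_TIME_US

# Example: delays (in microseconds) chosen after the 1st, 2nd and 3rd collision.
for n in (1, 2, 3):
    print(f"collision {n}: wait {backoff_delay(n):.1f} us")
```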

Unshared media network

In a non-shared-media network the medium is not shared; instead a dedicated transmission control scheme is used for the medium. Each station in the network is connected directly to a switch, and the switch forwards the data frames. In this mode the sender and receiver do not share the communication medium, so full-duplex communication is used in most cases.

Not only does ATM use this mode of transmission control, it has also become the mainstream mode for Ethernet: building the network with Ethernet switches gives a one-to-one connection between each computer and a switch port, enabling full-duplex communication. With one-to-one full-duplex connections there are no collisions, so communication is more efficient and the CSMA/CD mechanism is not needed.

Half duplex and full duplex communication

Half duplex means that at any moment a station can only send or only receive, much like a walkie-talkie: if both sides talk at the same time, neither hears the other. Full duplex, by contrast, allows data to be sent and received at the same time, like a telephone, where both parties can speak simultaneously.

Forwarding by MAC address

In shared-media Ethernet over coaxial cable (10BASE5, 10BASE2), only one host can transmit at a time, so communication performance degrades noticeably as the number of hosts on the network grows. Connecting hosts in a star topology through a newer kind of device, the switching hub, brings the switch technology of non-shared-media networks into Ethernet. Switching hubs are also called Ethernet switches.

An Ethernet switch is essentially a bridge with multiple ports. At the data link layer it decides which interface to send each frame out of based on the frame's destination MAC address. The reference table that records which interface to use is called the forwarding table.

  • Switch forwarding mode

    Switch forwarding has two modes: one is called store-and-forward, the other cut-through.

    In store-and-forward mode, the switch checks the FCS field at the end of the Ethernet frame before forwarding it. This avoids forwarding frames broken by collisions or corrupted by noise.

    In cut-through mode, the switch does not wait to receive the whole frame before forwarding; it starts forwarding as soon as it knows the destination address. This gives it the advantage of low latency, but it may also end up forwarding corrupted frames.

TCP/IP protocol

IP is equivalent to layer 3 of the OSI reference model.

IP (IPv4, IPv6) corresponds to layer 3, the network layer, in the OSI reference model. The main function of the network layer is to “realize the communication between terminal nodes”. This kind of communication between terminal nodes is also called “end-to-end communication”.

The layer below the network layer, the data link layer, transfers packets between nodes connected to the same data link. Once data must cross multiple data links, the network layer is needed: it can span different data links and deliver packets between the end nodes even when they sit on different links.

The main function of IP is to send packets to the final destination address in a complex network environment.

In the Internet world, devices with IP addresses are called hosts. They can be supercomputers or minicomputers. However, to be precise, a host is defined as “a device that is configured with an IP address but does not perform routing control.” A device with both IP address and routing control capability is called a router, which is different from a host. Nodes are a general name for hosts and routers.

The relationship between the network layer and data link layer

The data link layer provides direct communication between two devices. In contrast, IP, as the network layer, is responsible for communication between two networks that are not directly connected.

IP Basics

IP is roughly divided into three functional modules: IP addressing, routing (forwarding packets up to the final node) and IP fragmentation and reassembly.

A MAC address is an identifier used to identify different computers on the same link. IP addresses are used to identify the target address for communication among all hosts connected to the network.

Routing control

Routing refers to the ability to deliver packet data to its final destination address. Even when the network is very complex, routing control determines the path to the destination; if routing control goes wrong, packets are likely to be "lost" and never arrive. Whether a packet successfully reaches its final destination therefore depends entirely on routing control.

IP is connectionless

IP is connectionless. That is, before sending packets there is no need to establish a connection with the peer's destination address. Whenever the upper layer has data that needs to be sent, it is immediately packaged into IP packets and sent out.

With a connection-oriented approach, a connection has to be established beforehand. If the peer host is switched off or does not exist, the connection cannot be established; conversely, a host without a connection cannot send data.

The connectionless case is different. Packets are sent even if the peer host is off or does not exist, and conversely a host never knows when, or from where, data will arrive. It therefore generally has to keep listening to the network so that it receives the packets addressed to it; if it is not prepared, it may miss packets it should have received. For these reasons a connectionless approach can involve a lot of redundant communication.

IP is connectionless for reasons of simplicity and speed. Connection-oriented processing is more complex than connectionless processing, and even managing each individual connection is a considerable chore. In addition, requiring a connection before each exchange slows processing down. When a connection is needed, that service can be delegated to the upper layer. IP therefore takes the connectionless approach in order to stay simple and fast.

The corresponding upper-layer (transport layer) protocol, TCP, is connection-oriented

IP's job is to try its best to deliver packets toward the destination; it does not verify that they are finally received. Packets may be lost, arrive out of order, or be duplicated along the way. Those guarantees are provided by TCP, which is responsible for making sure the peer host actually receives the data.

DNS

DNS is used to maintain a database that represents the correspondence between host names and IP addresses within an organization. In the application, when a user enters a host name (domain name), DNS automatically retrieves the database that registered the host name and IP address and quickly locates the corresponding IP address. In addition, if the host name or IP address needs to be changed, it only needs to be processed within the organization, and there is no need to apply for or report to other organizations.

Domain names and DNS servers are organized hierarchically. If a DNS server goes down, DNS queries for that domain stop working, so at least two DNS servers are needed to improve disaster tolerance: if the first server cannot answer, the query automatically falls back to the second or even a third server.

The hosts and software that issue DNS queries are called DNS resolvers; the workstation or PC a user is using acts as a resolver. A resolver must have at least one DNS server IP address registered, usually including at least the address of the organization's own DNS server.

The DNS query

DNS query mechanism:

  • Computer A accesses www.baidu.com.
  • First, query the IP address in the DNS server.
  • If the DNS server knows the IP address, it returns the IP address directly. If the DNS server does not know the IP address, it sends a query request to the root DNS server.
  • The root domain server returns the address of the DNS server responsible for the domain.
  • That DNS server is queried for the IP address of www.baidu.com.
  • The IP address is returned to the client.
  • Computer A establishes communication with www.baidu.com.
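
In practice an application usually just asks the operating system's resolver, which performs (or delegates) the steps above. A minimal sketch using Python's standard library:

```python
import socket

# socket.getaddrinfo hands the query to the OS resolver, which consults its
# cache / the hosts file first and otherwise asks the configured DNS servers,
# following the recursive steps described above.
for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        "www.baidu.com", 80, proto=socket.IPPROTO_TCP):
    print(socket.AddressFamily(family).name, sockaddr[0])
```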

TCP and UDP

TCP is a connection-oriented, reliable streaming protocol. A stream is an uninterrupted data structure. TCP implements sequence control or retransmission control to provide reliable transmission. In addition, it has many functions such as “flow control”, “congestion control” and improving network utilization.

UDP is an unreliable datagram protocol; the finer points of control are left to the application layer above it. With UDP you can confirm the size of each message sent, but you cannot guarantee that it arrives, so applications sometimes have to implement retransmission themselves.

The characteristics of UDP

UDP provides no elaborate control mechanisms; it offers connectionless communication on top of IP and simply hands data from the application to the network as it is received. Even when the network is congested, UDP has no flow control to avoid adding to the congestion; it does not retransmit packets lost in transit, and it has no way to correct packets that arrive out of order. If any of these are needed, they have to be handled by the UDP-based application itself. Because UDP is connectionless and its processing is simple and efficient, it is often used for the following:

  • Communication involving a small total number of packets (DNS, SNMP, etc.)
  • Video, audio and other multimedia communication
  • Communication restricted to a specific network such as a LAN
  • Broadcast communication (broadcast, multicast)
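
A minimal sketch of UDP's connectionless style: the sender simply fires a datagram at an address and port, with no connection setup and no delivery confirmation. The port number below is arbitrary.

```python
import socket

# Connectionless: nothing tells the sender whether the datagram arrived;
# any acknowledgement would have to be implemented by the application itself.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 50007))          # arbitrary port for this sketch

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", ("127.0.0.1", 50007))   # no connection setup

data, addr = receiver.recvfrom(1024)
print("received", data, "from", addr)

sender.close()
receiver.close()
```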

The characteristics of TCP

TCP is quite different from UDP. It implements a full set of controls during data transfer, including retransmission when packets are lost and sequence control when segments arrive out of order, none of which UDP provides. Moreover, as a connection-oriented protocol, TCP sends data only after confirming that the communication peer exists, which avoids wasting traffic. TCP achieves reliable transmission through checksums, sequence numbers, acknowledgements, retransmission control, connection management and window control.

TCP connection setup (handshake) and teardown (wave) schematic diagram:
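
Beyond the diagram, the connection-oriented behaviour can be seen directly from the socket API: connect() triggers the three-way handshake and close() starts the teardown, while the kernel takes care of acknowledgements and retransmission. A self-contained sketch:

```python
import socket
import threading

def echo_server(listener: socket.socket) -> None:
    conn, _addr = listener.accept()          # completes the handshake
    with conn:
        conn.sendall(conn.recv(1024))        # echo the data back

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))              # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))          # three-way handshake happens here
client.sendall(b"good morning")              # reliable, ordered byte stream
print(client.recv(1024))                     # b'good morning'
client.close()                               # starts the connection teardown
listener.close()
```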

TCP window control and retransmission control

With window control, first consider the case in which an acknowledgement fails to come back even though the segment arrived. In that case the data has already reached the peer and does not need to be retransmitted; without window control, however, any data whose acknowledgement was not received would be retransmitted.

Next, consider the case in which a segment is lost in transit. If the receiving host receives data with a sequence number other than the one it expects, it keeps returning acknowledgements for the data it has received so far. As shown in the diagram below:

When a segment is lost, the sender keeps receiving acknowledgements numbered 1001, which in effect say "I want to receive data starting from 1001". So with a large window and a lost segment, acknowledgements with the same sequence number are returned repeatedly. If the sending host receives the same acknowledgement three times in a row, it retransmits the corresponding data. This mechanism is more efficient than the timeout management mentioned earlier and is known as fast retransmit.

Flow control

The sender sends data according to its own situation, but the receiver may be busy with unrelated work when a packet arrives and take time before it can deal with it; under heavy load it may even be unable to receive any data at all. If the receiver then drops data it was supposed to receive, the retransmission mechanism is triggered and network traffic is wasted unnecessarily.

To prevent this phenomenon, TCP provides a mechanism for the sender to control the amount of data sent according to the actual receiving capability of the receiver. This is called flow control. The way it works is that the receiving host notifies the sending host of how much data it can receive, and the sending host sends data up to this limit. This size limit is called the window size.

In the TCP header, there is a special field for notifying the window size. The receiving host notifies the sender of the size of the buffer it can receive in this field. The larger the value of this field is, the higher the network throughput is.

Conversely, when the receive buffer is close to overflowing, a smaller window size is advertised to the sender, which limits the amount of data sent. In other words, the sending host controls how much data it sends according to the receiving host's instructions; this constitutes TCP flow control.

Congestion control

Generally speaking, computer networks live in a shared environment. Therefore, there is also the possibility of network congestion due to communication between other hosts. If a large amount of data is suddenly sent during network congestion, it is very likely to cause the entire network to break down.

To prevent this problem, at the beginning of a communication TCP limits the amount of data it sends using a value computed by an algorithm called slow start.
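
A toy illustration of the idea (the threshold of 16 segments is an assumption, not a real stack's value): the congestion window doubles every round trip during slow start and then grows roughly linearly during congestion avoidance.

```python
# Sketch of slow start: the congestion window (cwnd) starts at one segment
# and doubles every round trip until it reaches the slow-start threshold
# (ssthresh); after that it grows about one segment per RTT. Real TCP stacks
# add many refinements on top of this.
MSS = 1               # count the window in segments for simplicity
ssthresh = 16         # assumed threshold for this illustration

cwnd = 1 * MSS
for rtt in range(1, 9):
    print(f"RTT {rtt}: cwnd = {cwnd} segments")
    if cwnd < ssthresh:
        cwnd *= 2     # exponential growth during slow start
    else:
        cwnd += 1     # linear growth during congestion avoidance
```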

Application protocol

Definition of application layer protocols

There are many applications that take advantage of the network, including Web browsers, E-mail, remote logins, file transfers, network management, and so on. It is the application protocol that enables these applications to perform specific communication processing.

Low-level protocols, such as TCP and IP, are widely applicable and independent of upper-layer application types. An application protocol is a protocol designed and created to implement an application.

The application layer of TCP/IP covers all the functions of layers 5, 6 and 7 in the OSI reference model, including not only the session layer function of managing communication connections, the presentation layer function of transforming data formats, but also the application layer function of interacting with the peer host.

Remote login

Logging in from your own local computer to a computer on the other side of the network so that you can run programs on it is called remote login. After remotely logging in to a general-purpose computer or UNIX workstation, you can not only use the applications on that host directly but also change its settings. TELNET and SSH are used for remote login.

TELNET

TELNET uses a TCP connection to send text commands to the remote host and execute them there; the local user operates as if directly connected to a shell inside the remote host.

TELNET provides two basic services: one is terminal emulation, the other is the option negotiation mechanism.

SSH

SSH is an encrypted remote login system. With TELNET, passwords are sent unencrypted when you log in, which creates a risk of eavesdropping and unauthorized intrusion. SSH encrypts the communication content, so even if the packets are intercepted it is not possible to recover the passwords sent, the commands issued, or the results the commands return.

The file transfer

FTP is the protocol used to transfer files between two connected computers.

FTP Working Mechanism

FTP uses two TCP connections for file transfer: one for control and the other for data (file) transfer.

The TCP connection for control is mainly used in the control part of FTP. For example, verify the login user name and password, set the name of the sent file, and set the sending mode. With this connection, you can send requests and receive replies via an ASCII string. Data cannot be sent on this connection; it requires a dedicated TCP connection.

Control connections remain connected until the user asks to be disconnected. However, most FTP servers will forcibly disconnect users who have not entered any new commands for a long time.

Typically, TCP connections for data transmission are set up in the opposite direction to connections for control.
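
As a sketch of how this looks from an application, Python's ftplib drives the control connection and opens a separate data connection for each transfer; the host, credentials and file name below are placeholders, not real endpoints.

```python
from ftplib import FTP

# ftplib opens the control connection (port 21) when FTP() connects, and a
# separate data connection for each transfer such as retrlines/retrbinary,
# matching the two-connection model described above.
with FTP("ftp.example.com") as ftp:                     # placeholder host
    ftp.login("anonymous", "guest@example.com")         # sent over the control connection
    ftp.retrlines("LIST")                                # listing arrives over a data connection
    with open("file.bin", "wb") as fh:
        ftp.retrbinary("RETR file.bin", fh.write)        # binary-mode transfer
```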

E-mail

The protocol that provides e-mail service is called SMTP (Simple Mail Transfer Protocol). SMTP uses TCP at the transport layer to deliver mail efficiently.

WWW

The WWW defines three important concepts: the means and location for accessing information (URI, Uniform Resource Identifier), the presentation of information (HTML, HyperText Markup Language), and the transfer of information (HTTP, HyperText Transfer Protocol).

URI

URI is an abbreviation of Uniform Resource Identifier and is used to identify resources. URIs are versatile identifiers that can also be used outside the WWW for all kinds of things, such as home page addresses, e-mail addresses and phone numbers.

URLs are often used to indicate the location of resources (files) on the Internet. But URIs are not limited to identifying Internet resources; they can serve as identifiers for any kind of resource. Current RFC documents no longer use the narrower term URL but the broader term URI, and URIs can be used by application protocols other than the WWW as well.

URI format

http://hostname:port/path?query#fragment

The TLS/SSL with HTTPS

Encrypting HTTP traffic over TLS/SSL is called HTTPS traffic. As shown in figure:

The last question

By this point the article is nearly complete. Let's finish by consolidating the whole knowledge system around one question.

What happens between entering a URL and the page finishing loading?

Comb through the main part first:

  1. From the browser receiving the URL to the network request thread (this part can expand the mechanism of the browser and the relationship between the process and thread)
  2. Start the network thread to issue a complete HTTP request (this part covers DNS query, TCP/IP request, layer 5 Internet protocol stack, etc.)
  3. Requests received from the server to the corresponding backend (this part may involve load balancing, security interception, internal backend processing, etc.)
  4. HTTP interaction between background and foreground (this part includes HTTP header, response code, packet structure, cookie and other knowledge, which can mention cookie optimization of static resources, as well as encoding and decoding, such as GZIP compression, etc.)
  5. HTTP caching (this section includes HTTP cache headers, ETag, Cache-Control, etc.)
  6. The parsing process after the browser receives HTTP packets (parsing HTML-lexical analysis and then parsing into DOM tree, parsing CSS to generate CSS rule tree, merging into render tree, Then layout, painting and rendering, composite layer composition, GPU rendering, processing of external chain resources, loaded and domContentLoaded, etc.)
  7. The CSS visual formatting model (the rules for rendering elements, such as the containing block, the controlling box, BFC, IFC, etc.)
  8. JS engine parsing process (JS interpretation stage, preprocessing stage, execution stage to generate execution context, VO, scope chain, recycle mechanism, etc.)
  9. Others (can expand different knowledge modules, such as cross domain, Web security, Hybrid mode, etc.)

From receiving the URL from the browser to starting the network request thread

Browsers are multi-process: there is a master process, and a new process is created for each TAB page.

Processes include the master process, plug-in processes, the GPU process, TAB page processes (the browser kernel), and so on

  • Browser process: the main process of the Browser (responsible for coordination and control), there is only one
  • Third-party plug-in process: Each type of plug-in corresponds to one process, which is created only when the plug-in is used
  • GPU process: a maximum of one, used for 3D drawing
  • Browser rendering process (kernel) : one process per Tab page by default, with no influence on each other, controlling page rendering, script execution, event handling, etc. (sometimes optimized, such as multiple blank tabs will be merged into one process)

Each TAB page can be thought of as a browser kernel process, and this process is multi-threaded, with several classes of child threads

  • The GUI thread
  • JS engine thread
  • Event trigger thread
  • Timer thread
  • Network request thread

When a URL is entered, it is parsed (a URL is essentially a uniform resource locator). A URL generally consists of several parts:

  • protocol, the protocol scheme, such as http, ftp, etc.
  • host, the host domain name or IP address
  • port, the port number
  • path, the directory path
  • query, the query parameters
  • fragment, the hash value after #, used to locate a position within the page
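
These components map directly onto what a standard URL parser returns; for example, a sketch with Python's urllib.parse (the URL itself is made up):

```python
from urllib.parse import urlparse

# urlparse splits a URL into exactly the components listed above.
url = "https://example.com:8080/docs/index.html?lang=en#section-2"
parts = urlparse(url)

print(parts.scheme)     # 'https'            -> protocol
print(parts.hostname)   # 'example.com'      -> host
print(parts.port)       # 8080               -> port
print(parts.path)       # '/docs/index.html' -> path
print(parts.query)      # 'lang=en'          -> query
print(parts.fragment)   # 'section-2'        -> fragment (the part after #)
```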

A separate thread is created for each network request; for example, if the URL is resolved as HTTP, a new network thread is created to handle downloading the resource.

Therefore, the browser will open up a network thread to request the resource according to the parsed protocol.

Start the network thread until it makes a complete HTTP request

This part mainly includes: DNS query, TCP/IP request construction, five layer Internet protocol stack and so on

The DNS

If you enter a domain name, perform DNS resolution to obtain an IP address. The process is as follows:

  • If the browser has a cached entry, it is used; otherwise the operating system's cache or the hosts file is consulted
  • If no entry exists locally, the corresponding IP address is queried from a DNS server (going through routing and caches along the way)

The domain name lookup may go through a CDN scheduler (if the site uses a CDN). It is also worth knowing that DNS resolution is time-consuming, so if too many domain names have to be resolved, the first screen will load slowly

TCP/IP request

TCP splits long HTTP data into smaller segments and establishes a connection with the server through the three-way handshake for reliable transmission.

Then, when the connection is torn down, four waves are needed (four because the connection is full duplex)

TCP/IP concurrency limit

Browsers limit the number of concurrent TCP connections to the same domain name (roughly 2-10), and in HTTP/1.0 each resource download often requires its own TCP/IP request

Get and POST

GET and POST are both carried over TCP/IP, but besides the HTTP-level differences they also differ at the TCP/IP level.

A GET produces one TCP packet, while a POST produces two:

  • For a GET request, the browser sends the headers and data together, and the server responds with 200 (returning the data).
  • For a POST request, the browser sends headers, the server responds with 100 continue, the browser sends data, and the server responds with 200 (returns data).

HTTP layer differences:

  • GET is harmless when the browser falls back, while POST resubmits the request.
  • GET requests are actively cached by browsers, whereas POST requests are not, unless set manually.
  • GET requests can only be url encoded, while POST supports multiple encoding methods.
  • GET requests pass parameters in the URL with length limits, whereas POST does not.
  • GET is less secure than POST because parameters are exposed directly to the URL and therefore cannot be used to pass sensitive information.
  • The GET argument is passed through the URL, and the POST is placed in the Request body.
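
A small sketch of the two request shapes using Python's http.client; httpbin.org is used here merely as an assumed test endpoint.

```python
import http.client
from urllib.parse import urlencode

conn = http.client.HTTPSConnection("httpbin.org")   # assumed test endpoint

# GET: parameters are appended to the URL (length-limited, cacheable, visible).
conn.request("GET", "/get?name=alice&page=2")
resp = conn.getresponse()
print("GET:", resp.status)
resp.read()          # drain the body so the connection can be reused

# POST: parameters travel in the request body with a matching Content-Type.
body = urlencode({"name": "alice", "page": "2"})
headers = {"Content-Type": "application/x-www-form-urlencoded"}
conn.request("POST", "/post", body=body, headers=headers)
resp = conn.getresponse()
print("POST:", resp.status)
resp.read()

conn.close()
```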

Five layer Internet protocol stack

The five layers are the application layer, transport layer, network layer, data link layer and physical layer. They correspond to the OSI model as follows:

The request is received from the server to the corresponding background

Load balancing

All user-initiated requests first reach a scheduling server (a reverse proxy server, for example one running Nginx for load balancing). The scheduler then assigns each request to a server in the appropriate cluster according to its scheduling algorithm, waits for that server's HTTP response, and forwards it back to the user

HTTP packet Structure

  1. General headers
     Request URL: the address of the requested web server
     Request Method: the request method (GET, POST, OPTIONS, PUT, HEAD, DELETE, CONNECT, TRACE)
     Status Code: the response status code
     Remote Address: the address of the remote server handling the request (resolved to an IP)
  2. Status codes
     1xx: informational; the request was received and processing continues
     2xx: success; the request was received, understood and accepted
     3xx: redirection; further action must be taken to complete the request
     4xx: client error; the request has a syntax error or cannot be fulfilled
     5xx: server error; the server failed to fulfil a valid request
     Common status codes:
     200: the request completed successfully and the requested resource is returned to the client
     304: the requested page has not changed since the last request; the client should use its local cache
     400: client request error (for example, blocked by a security module)
     401: unauthorized request
     403: access forbidden (for example, when not logged in)
     404: resource not found
     500: internal server error
     503: service unavailable
  3. Common request headers
     Accept: the content types the client can receive
     Accept-Encoding: the compression types the browser supports, such as gzip; content encoded in other ways cannot be received
     Content-Type: the type of the entity content the client sends
     Cache-Control: the caching mechanism for requests and responses, for example no-cache
     If-Modified-Since: paired with the server's Last-Modified to check whether the file has changed (accurate to 1 s)
     If-None-Match (HTTP/1.1): paired with the server's ETag to check whether the file content has changed (very accurate)
     Connection: for example keep-alive
     Host: the requested server
     Origin: where the original request came from (down to the port only; Origin respects privacy more than Referer)
     Referer: the URL of the page the request came from (applies to all request types, down to the full page address; often used by CSRF filters)
     User-Agent: necessary information about the user's client, i.e. the UA string
  4. Common response headers (partial)
     Access-Control-Allow-Headers: the request headers allowed by the server
     Access-Control-Allow-Methods: the request methods allowed by the server
     Access-Control-Allow-Origin: the origins allowed by the server
     Content-Type: the type of the entity content returned by the server
     Date: the time the data was sent from the server
     Cache-Control: tells the browser or other clients under what conditions and for how long the document may be cached
     Last-Modified: the time the document was last changed
     Expires: the time at which the document should be considered expired and may no longer be cached
     Max-Age: how long (in seconds) the client may cache the response
     ETag: the current value of the entity tag for the requested resource
     Set-Cookie: sets a cookie associated with the page; the server passes cookies to the client through this header
     Keep-Alive: if the client requested keep-alive, the server responds accordingly (for example timeout=38)
     Server: some information about the server

cookie

Long connection and short connection

TCP/IP level:

  • Long connection: Multiple packets can be sent continuously on a TCP/IP connection. If no packets are sent during the TCP connection, both parties need to send detection packets to maintain the connection. Generally, they need to maintain the connection online (similar to heartbeat packets).
  • Short connection: A TCP connection is established when data is exchanged between communication parties. After data is sent, the TCP connection is disconnected

HTTP level

  • In HTTP/1.0, short connections are used by default: the browser establishes a connection for each HTTP operation and closes it when the task finishes, for example a separate connection for every static resource request
  • From HTTP/1.1 on, long connections are used by default, with Connection: keep-alive. With keep-alive, once a web page has been opened, the TCP connection used for HTTP between the client and the server is not closed; if the client visits another page on the same server, the existing connection is reused

Keep-alive does not hold the connection forever; it has a timeout that is usually configured on the server (for example in Apache). Keep-alive only takes effect when both the client and the server support it

HTTP 2.0

Significant differences between HTTP2.0 and HTTP1.1:

  • In HTTP/1.1, each resource request needs its own TCP/IP connection, so every resource corresponds to one TCP/IP request; since concurrent TCP/IP connections are limited, loading many resources becomes noticeably slow
  • In HTTP/2.0, one TCP/IP connection can request multiple resources: the requests are split into smaller frames and sent over a single connection, so speed improves markedly.

Http2.0 features:

  • Multiplexing (i.e. one TCP/IP connection can request multiple resources)
  • Header compression (HTTP header compression, reduced volume)
  • Binary frame splitting (a binary frame splitting layer is added between the application layer and the transmission layer to improve transmission performance and achieve low latency and high throughput)
  • Server-side push (the server can send multiple responses to a request from the client and can actively notify the client)
  • Request priority (If a stream is assigned a priority, it is processed based on that priority, with the server deciding how many resources it needs to process the request.)

HTTPS

SSL/TLS handshake:

  1. The browser requests an SSL/TLS connection and sends the server a random number (Client Random) together with the encryption methods the client supports (for example RSA); this is sent in plaintext.
  2. The server picks a set of encryption and hash algorithms, returns its own random number (Server Random), and sends its identity back to the browser in the form of a certificate (the certificate contains the website address, the public key for asymmetric encryption, the certificate authority and other information).
  3. The browser receives the server's certificate:
  • It verifies the validity of the certificate (whether the authority is legitimate and whether the URL contained in the certificate matches the one being visited). If the certificate is trusted, the browser shows a small lock; otherwise it shows a warning.
  • After the certificate check, the browser generates a new random number (Premaster Secret), encrypts it with the public key from the certificate using the agreed method, and sends it to the server.
  • The Client Random, Server Random and Premaster Secret are used, through an agreed algorithm, to generate the symmetric key for the HTTP data transfer: the session key.
  • The browser computes a hash of the handshake messages with the agreed hash algorithm, encrypts it with the generated session key, and finally sends all the information generated so far to the server.
  4. The server receives the browser's reply:
  • It decrypts with the known method and its own private key to obtain the Premaster Secret.
  • It generates the session key using the same rules as the browser.
  • It uses the session key to decrypt the handshake message sent by the browser and verifies that the hash matches the one the browser sent.
  • It encrypts its own handshake message with the session key and sends it to the browser.
  5. The browser decrypts and computes the hash of the server's handshake message. If it matches what the server sent, the handshake is complete.
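
From an application's point of view, all of these steps happen inside the TLS library. A minimal sketch with Python's ssl module, which validates the certificate chain and host name and derives the session keys during wrap_socket():

```python
import socket
import ssl

# create_default_context() loads the system's trusted CA certificates and
# enables hostname checking (the certificate validation step above);
# wrap_socket() performs the whole TLS handshake before any HTTP data is sent.
context = ssl.create_default_context()

with socket.create_connection(("www.baidu.com", 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname="www.baidu.com") as tls_sock:
        print("TLS version:", tls_sock.version())
        print("cipher:", tls_sock.cipher())
        # From here on, everything sent is encrypted with the session key.
        tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: www.baidu.com\r\nConnection: close\r\n\r\n")
        print(tls_sock.recv(200))
```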

HTTP cache

Caches can be easily divided into two types: strong cache (200 from cache) and negotiated cache (304).

  • With a strong cache (200 from cache), if the browser decides the local copy has not expired, it uses the cache directly and sends no HTTP request
  • With a negotiated cache (304), the browser sends an HTTP request to the server; the server then tells the browser the file has not changed, and the browser uses its local copy

Cache control in HTTP1.1:

  • Cache-Control: Cache control header, including no-cache and max-age
  • Max-Age: used together with Cache-Control, e.g. Cache-Control: max-age=3600; its value is a relative time in seconds, and the browser calculates the expiry itself
  • If-None-Match / ETag: the browser sends the If-None-Match header and the server sends ETag. After the request, if If-None-Match and ETag match, the content has not changed and the browser is told to use its local cache. Unlike Last-Modified, ETag is more accurate: it is something like a fingerprint, generated (for example in Apache) from FileETag INode MTime Size, so the fingerprint changes whenever the file changes and there is no 1-second accuracy limit.

Last-modified:

  • Indicates when the file on the server was last modified
  • One drawback is that it is only accurate to 1 second
  • Another problem is that some files on the server are regenerated periodically even though their content is unchanged, so the modification time changes and the cache becomes invalid unnecessarily

E-tag:

  • Is a fingerprint mechanism: it represents a fingerprint tied to the file's content
  • It changes only when the file changes, and it changes whenever the file changes
  • There is no 1-second accuracy limit: as soon as the file changes, the ETag becomes different

If both ETag and Last-Modified are present, the server checks the ETag first.

If both Cache-Control and Expires are set, Cache-Control takes precedence.
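
As a hedged sketch of the two cache modes above, the following Node http server (TypeScript) answers with a strong-cache header plus an ETag, and replies 304 when the browser revalidates with a matching If-None-Match. The response body and the MD5-based fingerprint are illustrative assumptions.

```typescript
// Sketch: strong cache (Cache-Control: max-age) plus negotiated cache (ETag / 304).
import { createServer } from "http";
import { createHash } from "crypto";

const body = "hello cache";
const etag = `"${createHash("md5").update(body).digest("hex")}"`; // content fingerprint

createServer((req, res) => {
  // Negotiated cache: the fingerprint still matches, so answer 304 with no body.
  if (req.headers["if-none-match"] === etag) {
    res.writeHead(304);
    return res.end();
  }
  res.writeHead(200, {
    "Cache-Control": "max-age=3600", // strong cache: reuse locally for one hour
    "ETag": etag,                    // fingerprint for later revalidation
    "Content-Type": "text/plain",
  });
  res.end(body);
}).listen(3000);
```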

Parsing the page flow

  1. Parse the HTML and build the DOM tree
  2. Parses the CSS and generates the CSS rule tree
  3. Combine DOM tree and CSS rules to generate render tree
  4. Lay out the render tree (layout/reflow), computing the size and position of each element
  5. Paint the render tree (paint), drawing the pixel information of the page
  6. The browser sends each layer’s information to the GPU, which then composes the layers and displays them on the screen

HTML parsing is the first step in the rendering process. Put simply, the browser parses the HTML and builds a DOM tree, going from bytes → characters → tokens → nodes → DOM:

  1. Conversion: the browser converts the received HTML content (bytes) into individual characters based on its encoding
  2. Tokenizing: the browser converts these characters into tokens according to the HTML specification; each token has its own unique meaning and set of rules
  3. Lexing: the tokens produced by the previous step are converted into objects that define their attributes and rules
  4. DOM construction: because HTML tags define relationships between one another, these relationships form a tree structure; for example, the parent of the body object is the html object, and the parent of a p (paragraph) object is the body object

The CSS rule tree generation is similar.
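
As a small sanity check of the bytes → characters → tokens → nodes → DOM chain, the browser's built-in DOMParser can parse a string of HTML and expose the resulting parent/child relationships; the markup string here is just an example.

```typescript
// Sketch: parse a small HTML string and inspect the DOM tree it produces.
const html = "<html><body><p>hello</p></body></html>";
const doc = new DOMParser().parseFromString(html, "text/html");

const p = doc.querySelector("p");
console.log(p?.textContent);              // "hello"
console.log(p?.parentElement?.tagName);   // "BODY"  (body is the parent of p)
console.log(doc.documentElement.tagName); // "HTML"  (html is the parent of body)
```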

Building a Render tree

Once the DOM tree and CSSOM are in place, it's time to build the render tree. In general, the render tree corresponds to the DOM tree, but not strictly one-to-one, because DOM elements that will never be displayed are not inserted into it, for example the head element or elements with display: none.

Rendering

With the Render tree in place, it’s time to start rendering. The basic flow is as follows:

  1. Compute the CSS styles
  2. Build the render tree
  3. Layout: mainly determines coordinates and sizes, whether elements wrap, and properties such as position, overflow, and z-index
  4. Paint: draw the pixels of the page

After that, dynamically changing the DOM or CSS through JS results in another layout (reflow) or repaint.

Layout and Repaint:

  • Layout, also known as reflow, means that the content, structure, position, or size of an element has changed, so the styles and the render tree need to be recalculated
  • Repaint means that a change only affects the appearance of an element (for example its background color, border color, or text color), so the new style simply needs to be painted onto the element

What causes reflow?

  1. The initial page render
  2. The DOM structure changes, for example a node is removed
  3. The render tree changes, for example padding is reduced
  4. The window is resized
  5. The trickiest one: reading certain properties triggers reflow. Many browsers optimize reflow by queuing changes and flushing them in one batch, but besides direct changes to the render tree, reading the properties below forces the browser to flush the queue immediately so it can return an up-to-date value, defeating that optimization (a short sketch follows below):
  • offset(Top/Left/Width/Height)
  • scroll(Top/Left/Width/Height)
  • client(Top/Left/Width/Height)
  • width, height
  • calling getComputedStyle(), or currentStyle in IE

Changing the font size also causes reflow.
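
A short sketch of that last cause, assuming a hypothetical element with id "box": interleaving style writes with reads of offsetWidth forces the browser to flush its queued layout work each time.

```typescript
// Sketch: reading a layout-dependent property forces a synchronous reflow.
const box = document.getElementById("box")!; // assumed element

box.style.width = "200px";    // write (the browser would like to batch this)
console.log(box.offsetWidth); // read: forces layout to be flushed right now
box.style.width = "300px";    // write again
console.log(box.offsetWidth); // read: forces a second synchronous layout
```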

Optimization scheme

  • Instead of changing styles one property at a time, change them in one go, for example by defining the styles in a class and switching the class once
  • Instead of manipulating the DOM inside a loop, create a documentFragment or a detached div, apply all DOM operations to it, and finally add it to the document in a single step (see the sketch after this list)
  • Avoid reading properties such as offset repeatedly; if you cannot avoid it, cache the value in a variable
  • Position elements with complex animations absolutely or fixed so they are out of the document flow; otherwise the reflow they trigger would be costly
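
The sketch below illustrates two of these optimizations under assumed element ids: batching insertions with a DocumentFragment, and caching a layout-triggering read instead of repeating it inside a loop.

```typescript
// Sketch: batch DOM insertions and cache layout reads to avoid repeated reflows.
const list = document.getElementById("list")!; // assumed container element

// Build all items off-DOM, then insert them in a single operation.
const fragment = document.createDocumentFragment();
for (let i = 0; i < 100; i++) {
  const li = document.createElement("li");
  li.textContent = `item ${i}`;
  fragment.appendChild(li);
}
list.appendChild(fragment); // one insertion instead of 100

// Read the layout property once and reuse the cached value in the loop.
const width = list.offsetWidth;
Array.from(list.children).forEach((child, i) => {
  (child as HTMLElement).style.width = `${width - i}px`;
});
```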

Simple layer and composite layer

  • You can assume that by default there is only one composite layer, and all DOM nodes sit under it
  • If hardware acceleration is enabled, a node can be turned into its own composite layer
  • Composite layers are painted independently of one another and are handled directly by the GPU
  • Within a single layer, even an absolutely positioned element does not trigger a full reflow when it changes, but painting still affects the rest of that layer, so animation performance there is still poor; composite layers are independent, which is why hardware acceleration is generally recommended for animation (see the sketch after this list)
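
A minimal sketch of promoting an element to its own composite layer for animation, assuming a hypothetical .animated-box element; will-change and a transform transition are the usual hints.

```typescript
// Sketch: hint the browser to give this element its own composite layer,
// then animate it with transform so the work stays on the compositor/GPU.
const box = document.querySelector<HTMLElement>(".animated-box")!; // assumed element
box.style.willChange = "transform";
box.style.transition = "transform 0.3s";

// Wait until the initial styles have been committed, then move the element.
requestAnimationFrame(() =>
  requestAnimationFrame(() => {
    box.style.transform = "translateX(200px)";
  })
);
```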

The download of resources outside the chain

When an external link is encountered, a separate download thread is opened to download the resource.

The handling of CSS resources has several characteristics:

  • CSS downloads asynchronously and does not block the browser from building the DOM tree
  • It does, however, block the generation of the render tree, i.e. it blocks rendering (this is a browser optimization that prevents CSS rules from changing mid-render and avoids repeated rebuilds)
  • As an exception, CSS declared with a media query that does not match does not block rendering

The processing of JS script resources has several characteristics:

  • Scripts block browser parsing: when an external script is encountered, HTML parsing does not continue until the script has been downloaded and executed
  • Modern browsers generally optimize this: while a script blocks parsing, the browser keeps downloading other resources in the background (subject to the concurrency limit), but even though scripts can be downloaded in parallel, parsing is still blocked, i.e. parsing only resumes after the script has executed; the parallel download is just an optimization
  • defer and async: a normal script blocks parsing, but adding the defer or async attribute makes the script load asynchronously so that parsing can continue

Note that defer and async are different: defer defers execution until parsing is finished, while async executes as soon as the script has downloaded.

  • async means asynchronous execution: the script runs as soon as its download completes. It is guaranteed to run before the onload event, but there is no guarantee whether it runs before or after DOMContentLoaded (a small sketch follows this list)
  • defer means deferred execution: it behaves as if the script had been placed at the end of the body (according to the specification it should run before DOMContentLoaded, but browser optimizations vary and it may end up running after it)
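
As a hedged sketch of the async behavior (the script URL is an assumption), a dynamically created script does not block parsing and runs as soon as it finishes downloading; for scripts written directly in the markup, the equivalent is the async or defer attribute on the script tag.

```typescript
// Sketch: load a script without blocking HTML parsing.
const script = document.createElement("script");
script.src = "https://example.com/widget.js"; // assumed URL
script.async = true; // run as soon as downloaded; execution order not guaranteed
document.head.appendChild(script);
```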

When images and similar resources are encountered, they are downloaded asynchronously and do not block parsing; once the download finishes, the image simply replaces the content at its original src position.

load and DOMContentLoaded

  • The DOMContentLoaded event fires as soon as the DOM has been built, without waiting for stylesheets or images (note that scripts loaded with async may not have executed yet)
  • The load event fires only after all of the DOM, stylesheets, scripts, and images on the page have finished loading (see the sketch after this list)
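
A tiny sketch of observing the two events:

```typescript
// Sketch: DOMContentLoaded fires when the DOM is built; load fires when everything
// (stylesheets, scripts, images) has finished loading.
document.addEventListener("DOMContentLoaded", () => {
  console.log("DOMContentLoaded: DOM tree is ready");
});
window.addEventListener("load", () => {
  console.log("load: all resources have finished loading");
});
```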

Block Formatting Context (BFC)

Features:

  • Internal boxes are laid out vertically, one after another
  • Vertical distances between boxes are determined by margin; the vertical margins of two boxes belonging to the same BFC collapse (overlap)
  • The area of a BFC does not overlap with float boxes (this can be used for layout)
  • A BFC is an isolated container on the page: child elements inside it do not affect elements outside, and vice versa
  • Floating elements are included when computing the height of a BFC (so the container does not collapse because of floats)

How do I trigger the BFC?

  • The root element
  • The float property is not none
  • position is absolute or fixed
  • display is inline-block, flex, inline-flex, table, table-cell, or table-caption
  • overflow is not visible, e.g. hidden, auto, or scroll (see the sketch after this list)
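
As a small, hedged example of the last trigger (the container id is an assumption): switching overflow away from visible creates a BFC, so the container's height includes its floated children.

```typescript
// Sketch: trigger a BFC from script so the container wraps its floated children.
const container = document.getElementById("float-container")!; // assumed element
container.style.overflow = "hidden"; // any value other than visible creates a BFC
```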

Thanks

This article drew on "The process from entering the URL to loading the page? How to improve your front-end knowledge system through one question!" and "Diagram of TCP/IP".

Due to my limited skills, if there is any mistake in this article, please contact me, thank you!