This article introduces several common network protocols, mainly IP, UDP, TCP, and HTTP, and walks through how a data packet travels from client to server.

Preface

The Internet is essentially an architecture of ideas and protocols. A protocol is a well-known set of rules and standards that, when all parties agree to follow it, lets them communicate without obstruction.

When we talk about requesting resources from a server, our computer (the client) is actually requesting information from another computer (the server).

Data between the two computers is transmitted through data packets. If a large amount of data is sent, it is broken up into smaller packets for transmission.

Data travelling between computers passes through many nodes, and different stages of the journey use different protocols. Layering tames this complexity: network communication is divided into multiple layers, and each layer is given its own responsibility, so each layer only has to concentrate on its own job. With this “divide and conquer” approach, one big problem is split into several small ones, which makes network communication manageable.

There are two layered network models: the TCP/IP model (four layers) and the OSI model (seven layers). The four-layer model is introduced here.

Four-layer model

  • The first layer is the link layer (also called the data link layer or network interface layer). It is responsible for sending raw packets over the underlying network, such as Ethernet or Wi-Fi. It works at the network interface card (NIC) level and uses MAC addresses to identify devices on the network, so it is also known as the MAC layer.
  • The second layer is the network layer (also called the internet layer), and the IP protocol lives here. It builds on the link layer by addressing devices with IP addresses instead of MAC addresses, which joins many LANs and WANs into one huge virtual network. To reach a device on the local network, the IP address is “translated” back into a MAC address.
  • The third layer is the transport layer, where TCP and UDP work. TCP is responsible for the reliable transfer of data between two endpoints identified by IP addresses. It is a stateful protocol: data can be sent only after a connection has been established with the peer, which ensures that data is neither lost nor duplicated.
  • The fourth layer is called the application layer, which has various application-specific protocols. A typical example is HTTP.

IP protocol

The IP protocol is responsible for delivering packets to the destination host.

Packets sent over the Internet comply with the Internet Protocol (IP) standard. Every device on the Internet is identified by an address, called an IP address. Visiting a website is really just your computer asking another computer for information.

To send a packet from host A to host B, host B’s IP address is attached to the packet before transmission so that it can be routed correctly. Host A’s own IP address is attached as well, so that host B can reply to host A.

This additional information is stored in a data structure called the IP header. The IP header sits at the beginning of an IP packet and includes the IP version, source IP address, destination IP address, and time to live (TTL).
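As a rough illustration of the header fields just listed, the following Python sketch packs and then parses a minimal 20-byte IPv4 header. The helper names `build_ipv4_header` and `parse_ipv4_header` and the example addresses are hypothetical, and the checksum and identification fields are simply left at zero, so this is not a header a real network stack would emit.

```python
import socket
import struct

# Fixed 20-byte IPv4 header layout (no options), in network byte order:
# version/IHL, TOS, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_ipv4_header(src: str, dst: str, ttl: int = 64, protocol: int = 17) -> bytes:
    """Pack a minimal header; checksum and identification are left at 0."""
    version_ihl = (4 << 4) | 5  # version 4, header length 5 * 4 = 20 bytes
    return IPV4_HEADER.pack(
        version_ihl, 0, IPV4_HEADER.size, 0, 0, ttl, protocol, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def parse_ipv4_header(raw: bytes) -> dict:
    """Extract the fields mentioned in the text from the first 20 bytes."""
    fields = IPV4_HEADER.unpack(raw[:IPV4_HEADER.size])
    return {
        "version": fields[0] >> 4,
        "ttl": fields[5],
        "source": socket.inet_ntoa(fields[8]),
        "destination": socket.inet_ntoa(fields[9]),
    }


header = build_ipv4_header("192.0.2.1", "198.51.100.7")
info = parse_ipv4_header(header)
```

Here `info` holds the version, TTL, source, and destination read back from the packed bytes, which is the same information host B would use to reply to host A.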

UDP protocol

UDP, also known as the User Datagram Protocol, is responsible for delivering packets to applications.

UDP works at the transport layer. Like TCP, it interacts with application programs: IP delivers the packet to the target computer, and then UDP or TCP takes over and determines which application on that computer should receive it.

One of the most important pieces of information in UDP is the port number, a number bound to each application that wants to use the network. IP uses the IP address to deliver a packet to the right computer, and UDP uses the port number to deliver it to the right program.

Just as IP addresses are carried in the IP header, port numbers are carried in the UDP header, which is prepended to the original data to form a UDP packet. The UDP header contains the source port and the destination port, among other fields.
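To make the UDP header concrete, here is a small Python sketch that prepends the 8-byte header (source port, destination port, length, checksum) to a payload and then reads it back. The function names and port numbers are made up for illustration, and the checksum is left at zero rather than computed.

```python
import struct

# UDP header: source port, destination port, length, checksum (2 bytes each).
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header to the payload (checksum 0 = not computed)."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram: bytes):
    """Return (source port, destination port, payload)."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack(
        datagram[:UDP_HEADER.size]
    )
    return src_port, dst_port, datagram[UDP_HEADER.size:length]


datagram = build_udp_datagram(50000, 53, b"hello")
src_port, dst_port, payload = parse_udp_datagram(datagram)
```

The destination port is what lets the receiving computer hand the payload to the right program; the source port tells that program where to send a reply.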

Various factors can corrupt packets sent over UDP. UDP can verify whether the data is correct by means of a checksum, but it provides no retransmission mechanism: a corrupted packet is simply discarded. After sending, UDP has no way of knowing whether the packet reached its destination.
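The verification step can be sketched with the 16-bit one’s-complement Internet checksum. This simplified Python version covers only a byte string; real UDP also includes the header and a pseudo-header in the sum, so treat `internet_checksum` and `is_intact` as illustrative helpers, not as a full UDP implementation.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver side: recompute and compare; a mismatch means discard."""
    return internet_checksum(data) == checksum


message = b"hello, udp"
checksum = internet_checksum(message)
corrupted = b"hellO, udp"  # one byte changed in transit
```

In this sketch `is_intact(message, checksum)` holds while `is_intact(corrupted, checksum)` does not, and nothing asks the sender to try again.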

UDP is a connectionless protocol. Unlike TCP, it does not need a three-way handshake to establish a connection and can send data whenever it wants, which is also why it is unreliable. UDP does not guarantee delivery, but it is very fast, so it is used where speed matters more than data integrity, such as online video and interactive games.
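The connectionless behaviour can be seen with Python’s standard `socket` module. This is a minimal sketch that assumes the loopback interface is available: the sender transmits immediately without any handshake, and the receiver uses a timeout because nothing guarantees the datagram will arrive.

```python
import socket

# Receiver: bind to an OS-chosen free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

# Sender: no connect or handshake step, just send to (address, port).
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", ("127.0.0.1", port))

try:
    data, peer = receiver.recvfrom(1024)
except socket.timeout:
    # UDP gives no delivery guarantee, so a loss has to be handled here.
    data, peer = None, None
finally:
    sender.close()
    receiver.close()
```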

TCP protocol

Transmission Control Protocol (TCP) is responsible for sending data to applications.

There are two problems with UDP transport:

  • Data packets are easily lost during transmission.
  • Large files are broken into smaller packets for transmission. These packets take different routes and arrive at the receiver at different times. UDP does not know how to assemble these packets into a complete file.

TCP is a connection-oriented, reliable, byte-stream-based transport layer protocol. Compared with UDP, TCP has the following characteristics:

  • TCP provides a retransmission mechanism for packet loss.
  • TCP introduces a packet ordering mechanism so that out-of-order packets can be reassembled into a complete file.

The life cycle of a complete TCP connection consists of three phases: establishing the connection, transmitting data, and closing the connection.

  • The first is the connection establishment phase.
    • This stage establishes the connection between the client and server through a three-way handshake.
    • TCP provides connection-oriented communication transport.
    • Connection-oriented refers to the preparation work between the two ends before data communication begins.
    • The three-way handshake means that when a TCP connection is established, the client and server send a total of three packets to confirm the connection.
  • The second is the data transmission phase.
    • At this stage, the receiving end must acknowledge each packet: after receiving a packet, it sends an acknowledgement back to the sender.
    • If the sender does not receive an acknowledgement within a specified period after sending a packet, the packet is considered lost and the retransmission mechanism is triggered.
    • Similarly, a large file is divided into many small packets for transmission. When these packets arrive, the receiving end sorts them by the sequence number in the TCP header to reconstruct the complete data.
  • The last is the disconnection phase. Once the data has been transferred, the connection is terminated with a four-way handshake (the “four waves”), which ensures that both parties can close the connection cleanly.
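The data transmission phase can be approximated with a toy simulation. The Python sketch below is not real TCP: the function names are hypothetical, sequence numbers are plain byte offsets starting at 0, and the “network” is a dictionary. It only illustrates how sequence numbers let the receiver reorder segments and how an acknowledgement that stops at a gap tells the sender what to retransmit.

```python
def split_into_segments(data: bytes, size: int) -> dict:
    """Map each segment's sequence number (byte offset) to its payload."""
    return {
        offset: data[offset:offset + size]
        for offset in range(0, len(data), size)
    }


def reassemble(received: dict) -> tuple:
    """Join segments in sequence order, stopping at the first gap.

    Returns the contiguous bytes and the next expected sequence number,
    which is what a cumulative acknowledgement would carry.
    """
    assembled = b""
    next_seq = 0
    while next_seq in received:
        assembled += received[next_seq]
        next_seq += len(received[next_seq])
    return assembled, next_seq


original = b"The quick brown fox jumps over the lazy dog"
segments = split_into_segments(original, 10)

# Simulated network: the segment at offset 20 is lost, the rest arrive
# out of order.
arrived = {seq: segments[seq] for seq in (30, 0, 40, 10)}
partial, ack = reassemble(arrived)

# The sender sees that bytes from `ack` onward were never acknowledged
# and retransmits that segment.
arrived[ack] = segments[ack]
complete, final_ack = reassemble(arrived)
```

After the retransmission, `complete` equals `original` even though the segments arrived out of order and one was lost on the first attempt.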

The HTTP protocol

HTTP is also known as HyperText Transfer Protocol.

  • Hypertext: a mixture of text, pictures, audio, video, etc., containing “hyperlinks” that can jump from one “hypertext” to another, forming complex non-linear, net-like structural relationships.
  • Transfer: HTTP is a convention and specification for transferring data between two points in the computer world.
  • Protocol: HTTP is a protocol used in the computer world. It uses a language that computers can understand to establish a specification for communication between computers, as well as related controls and error handling.

HTTP is a convention and specification for the transfer of hypertext data, such as text, pictures, audio and video, between two points in the computer world.

HTTP allows browsers to fetch resources from servers. It is the foundational technology of the Web. It is not a concrete entity in itself: it relies on many other technologies to be implemented, and many technologies in turn rely on it. In a broad sense, HTTP can be described as “the sum total of all application-layer technologies associated with the HTTP protocol.”

The HTTP protocol has the following characteristics:

  • Flexible and extensible: You can add any header field to achieve any function.
  • Reliable transmission: HTTP runs on top of TCP/IP, which delivers data reliably “as far as possible”.
  • Application-layer protocol: It is more general-purpose than FTP or SSH and can transmit arbitrary data.
  • Request-response model: The client initiates a request and the server returns a response.
  • Stateless: Each request is independent and unrelated, and the protocol does not require the client or server to record information about the request.

Other protocols

  • HTTPS: HTTP transmits data in plaintext, which is not secure. HTTPS solves this by transmitting ciphertext. HTTPS means “HTTP over SSL/TLS”, that is, HTTP running on top of SSL/TLS, which is equivalent to “HTTP + SSL/TLS + TCP/IP”.

  • SSL/TLS: a security protocol used to encrypt communication. After the TCP three-way handshake completes, a TLS handshake is performed to establish the secure channel before any HTTP data is sent.

  • DNS (Domain Name System): uses meaningful names as equivalent alternatives to IP addresses. A domain name is split into multiple labels separated by “.”, with the level rising from left to right; the rightmost label is the top-level domain. Domain name resolution means mapping a domain name to an IP address.

  • URI/URL: A URI (Uniform Resource Identifier) uniquely identifies a resource on the Internet. A URL (Uniform Resource Locator), commonly known as a “web address”, is a subset of URI.

  • Proxy: A proxy sits between the requester and the responder in the HTTP protocol. Acting as a “transfer station”, it can forward requests from clients or responses from servers. Types include anonymous proxies, transparent proxies, forward proxies (which act on behalf of the client sending requests), and reverse proxies (which act on behalf of the server answering requests). Proxies can provide load balancing, content caching, security protection, data processing, and so on.
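The URL and domain name structure described in the list above can be inspected with Python’s standard `urllib.parse` module. The URL below is a made-up example; the sketch only splits strings and performs no DNS lookup.

```python
from urllib.parse import urlsplit

# A hypothetical URL used only to show the individual components.
parts = urlsplit("https://www.example.com:8443/docs/index.html?lang=en#intro")

scheme = parts.scheme      # protocol, here HTTPS
hostname = parts.hostname  # the domain name that DNS would resolve
port = parts.port
path = parts.path

# Domain labels are separated by "."; the level rises from left to
# right, so the rightmost label is the top-level domain.
labels = hostname.split(".")
top_level_domain = labels[-1]
```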

Data transfer process

When the browser initiates an HTTP request:

  • Construct the HTTP request line: The browser builds the request line and prepares to make a network request.
  • Look up the cache: Before making a network request, the browser checks its cache for a copy of the requested file.
  • DNS resolution: Obtain the server’s IP address through the DNS system.
  • Wait in the TCP queue: Browsers such as Chrome open at most six TCP connections to the same domain at a time. If six are already in use, the request must wait.
  • Establish a TCP connection: Establish a connection with the server using a three-way handshake.
  • Send the HTTP request: The browser sends the request to the server, including the request method, request URI, and HTTP protocol version.
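The message sent in the last step can be sketched as plain text. The helper `build_http_request` below is hypothetical and assumes HTTP/1.1; it only assembles the bytes (request line, headers, blank line) and does not open a connection.

```python
def build_http_request(method: str, host: str, path: str, headers=None) -> bytes:
    """Assemble an HTTP/1.1 request: request line, headers, blank line."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines end with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_http_request(
    "GET", "example.com", "/index.html", {"Connection": "keep-alive"}
)
request_line = request.split(b"\r\n", 1)[0]
```

The request line carries exactly the three items listed above: the method, the request URI, and the protocol version.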

The flow of packets from client to server:

  • Application layer packaging: When a browser (application) makes an HTTP request, it adds HTTP headers to the data and passes it to the transport layer.
  • Transport layer adds the TCP header: The transport layer prepends the TCP header to the data to form a TCP packet and hands it to the network layer.
  • Network layer adds the IP header: The network layer then prepends the IP header to form an IP packet and hands it to the underlying layer.
  • The packets are transmitted by the underlying layer to the network layer of the server (another computer).
  • Network layer unpacking: The IP header is stripped here and the remaining data is handed to the transport layer.
  • Transport layer unpacking: The TCP header is stripped and the data portion is handed to the upper-layer application.
  • Arrival: Finally, the data reaches the application on the server.
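The wrapping and unwrapping steps above can be mimicked with a toy model. In this Python sketch the headers are just placeholder strings such as `[TCP header]`, not real binary headers, and the function names are invented for illustration; the point is only the order in which layers add and remove their headers.

```python
# Layers below the application layer, in the order the sender passes
# through them.
LAYERS = ["TCP", "IP"]


def encapsulate(http_message: bytes) -> bytes:
    """Sender side: each layer prepends its own (placeholder) header."""
    packet = http_message
    for layer in LAYERS:
        packet = f"[{layer} header]".encode() + packet
    return packet


def decapsulate(packet: bytes) -> bytes:
    """Receiver side: strip headers in the reverse order, IP first."""
    for layer in reversed(LAYERS):
        header = f"[{layer} header]".encode()
        if not packet.startswith(header):
            raise ValueError(f"missing {layer} header")
        packet = packet[len(header):]
    return packet


http_message = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
on_the_wire = encapsulate(http_message)
delivered = decapsulate(on_the_wire)
```

The IP header ends up outermost because it is added last, which is why the server’s network layer is the first to unpack the packet.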

How the server handles the HTTP request:

  • Return the response: The server returns a response header and a response body. The response header includes the protocol version and status code; the response body contains the actual content.
  • Disconnect:
    • Typically, once the server has returned the requested data, it closes the TCP connection.
    • If the header fields include keep-alive (Connection: keep-alive), the TCP connection is kept open. Reusing it saves the time of establishing a new connection on the next request and speeds up resource loading.
  • Redirect: If the status code is 30X, a redirect is required, and the browser requests the address given in the Location field.
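The decisions in this list can be sketched by parsing a raw response. The Python helpers `parse_http_response` and `next_action` are hypothetical and deliberately simplified: they assume a well-formed status line with a reason phrase, and they ignore details such as chunked bodies or repeated headers.

```python
def parse_http_response(raw: bytes) -> dict:
    """Split a raw response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status_code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(status_code),
        "headers": headers,
        "body": body,
    }


def next_action(response: dict) -> str:
    """Decide what the client does next, following the steps above."""
    headers = response["headers"]
    if 300 <= response["status"] < 400 and "location" in headers:
        return "redirect to " + headers["location"]
    if headers.get("connection", "").lower() == "keep-alive":
        return "keep connection open"
    return "close connection"


raw = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Location: https://example.com/new\r\n"
    b"Connection: keep-alive\r\n\r\n"
)
response = parse_http_response(raw)
action = next_action(response)
```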

Conclusion

  • Data on the Internet is transmitted through packets, which can get lost or go wrong during transmission.
  • The TCP/IP network layered model (four-layer protocol) consists of the link layer, network layer, transport layer, and application layer.
  • The IP protocol is responsible for delivering packets to the destination host.
  • UDP is responsible for delivering data packets to specific applications. It does not guarantee reliability, but it is fast.
  • TCP is also responsible for delivering data packets to specific applications, and it guarantees complete data transmission. A connection goes through three phases: establishing the connection, transferring data, and closing the connection.
  • HTTP, the HyperText Transfer Protocol, is a convention and specification for transferring hypertext data such as text, pictures, audio, and video between two points in the computer world.
  • Other protocols include HTTPS, SSL/TLS, DNS, URI/URL, and proxy.