Computer networking is the foundation of interviews in the computer and software industry. Whether the position is software or hardware development, technical support, or testing, it will involve basic knowledge of computer networks. This article is based on the notes the author compiled while preparing for interviews. Main contents of this article:

  • OSI and TCP/IP models
    • OSI seven-layer model
    • Defects of OSI
    • TCP/IP five-layer model
  • TCP three-way handshake and four-way wave
    • Establishing a TCP connection
    • Why three handshakes instead of two?
    • Closing a TCP connection
    • Why is it three handshakes to establish a connection, but four waves to close it?
    • Why is the TIME_WAIT state needed?
    • How TCP ensures reliable transmission
  • HTTP and HTTPS
    • Basic concepts
    • The difference between HTTPS and HTTP
    • HTTP response status codes
    • HTTP long connections and short connections

OSI and TCP/IP models

OSI seven-layer model

OSI stands for Open System Interconnection. Commonly known as the OSI reference model, it is a network interconnection model developed by the ISO (International Organization for Standardization) in 1985.

OSI defines a seven-layer framework for network interconnection (physical layer, data link layer, network layer, transport layer, session layer, presentation layer, and application layer), namely the ISO Open Systems Interconnection Reference Model.

The functions of each layer, described from top to bottom:

  1. Application layer: The layer closest to the user in the OSI reference model. It provides the application interface to the computer user and delivers various network services directly to the user. Common application-layer protocols include HTTP, HTTPS, FTP, POP3, SMTP, and so on.
  2. Presentation layer: The presentation layer provides encoding and conversion functions for application-layer data, ensuring that data sent by the application layer of one system can be understood by the application layer of another system. If necessary, this layer converts a computer's various internal data formats into a standard representation for communication. Data compression and encryption are also among the conversion capabilities the presentation layer can provide.
  3. Session layer: Responsible for establishing, managing, and terminating communication sessions between presentation-layer entities. Communication at this layer consists of service requests and responses between applications on different devices.
  4. Transport layer: The transport layer establishes end-to-end links between hosts. Its role is to provide reliable, transparent end-to-end data transmission services for upper-layer protocols, including error control and flow control. This layer hides the details of lower-level data communication from the layers above, so that an upper-layer user sees only a reliable, user-controllable host-to-host data path between the two transport entities. TCP and UDP, which we usually talk about, operate at this layer, and port numbers are the “ports” referred to here.
  5. Network layer: This layer establishes the connection between two nodes through IP addressing, selects appropriate routes and switching nodes for the packets handed down by the transport layer at the source, and delivers them correctly to the transport layer at the destination according to the address. It is usually called the IP layer, and the IP protocol that runs here is the foundation of the Internet.
  6. Data link layer: Combines bits into bytes and bytes into frames, accesses the medium using link-layer addresses (Ethernet uses MAC addresses), and performs error detection. The data link layer is divided into two sublayers: the logical link control (LLC) sublayer and the media access control (MAC) sublayer.
    • The MAC sublayer handles the CSMA/CD algorithm, data error checking, framing, and so on.
    • The LLC sublayer defines the fields that allow upper-layer protocols to share the data link layer. In practice, the LLC sublayer is not required.
  7. Physical layer: The actual signal transmission ultimately happens at the physical layer, which transmits bit streams over physical media and specifies voltage levels, transmission rates, and cable pinouts. Common devices include hubs, repeaters, modems, network cables, twisted-pair cables, and coaxial cables; these are the transmission media of the physical layer.

Defects of OSI

OSI’s seven-layer architecture is conceptually clear and theoretically complete, but it is complex and impractical. Despite being backed by some big companies and even some national governments, OSI failed, for several reasons:

  • OSI experts lacked practical experience, and there was no commercial drive behind completing the OSI standards
  • The OSI protocols are too complex to implement and inefficient to run
  • The OSI standards took too long to finish, so they missed the market window. (By the early 1990s, although the full set of OSI international standards had been developed, the TCP/IP-based Internet was already running successfully on a large scale around the world.)
  • The OSI layering is not ideal, and some functions appear repeatedly in multiple layers

TCP/IP five-layer model

The correspondence between the TCP/IP five-layer model and the OSI seven-layer model is as follows, described layer by layer (a short sketch tying the layers together follows the list):

  1. Application layer: The task of the application layer is to complete specific network applications through interaction between application processes. Application-layer protocols define the rules for communication and interaction between application processes (processes: programs running on hosts). Different network applications require different application-layer protocols. The Internet has many application-layer protocols, such as the domain name system DNS, the HTTP protocol that supports World Wide Web applications, and the SMTP protocol that supports e-mail. The data units exchanged at the application layer are called messages.
  2. Transport layer

    The main task of the transport layer is to provide a general data transfer service for communication between processes on two hosts. Application processes use this service to transmit application-layer messages. “General” means that the service is shared by many applications rather than tailored to one specific network application. Because a host may run multiple processes at the same time, the transport layer provides multiplexing and demultiplexing: multiplexing means that multiple application-layer processes can use the transport layer's service simultaneously, while demultiplexing means that the transport layer delivers the received data to the corresponding process in the application layer. The transport layer mainly uses the following two protocols:
    • Transmission Control Protocol (TCP) – provides a connection-oriented, reliable data transfer service.
    • User Datagram Protocol (UDP) – provides a connectionless, best-effort data transfer service (with no guarantee of reliability).
  3. Network layer: The network layer provides communication services for different hosts on a packet-switched network. When sending data, the network layer encapsulates the segments or user datagrams produced by the transport layer into packets for transmission. In the TCP/IP architecture, packets are also called IP datagrams, or datagrams for short, because the network layer uses the IP protocol. Note: do not confuse the transport layer's “user datagram (UDP)” with the network layer's “IP datagram”. In addition, data units at any layer can be referred to generically as “packets”. Another task of the network layer is to select appropriate routes, so that the packets passed down from the transport layer of the source host can find the destination host via routers in the network. The Internet is composed of a large number of heterogeneous networks connected to each other through routers. Because the network-layer protocol of the Internet is the connectionless Internet Protocol (IP), together with many routing protocols, the Internet's network layer is also called the internet layer or IP layer.
  4. Data link layer: The data link layer is usually referred to simply as the link layer. Data transmitted between two hosts always travels over one link segment at a time, which requires special link-layer protocols. When transmitting data between two adjacent nodes, the data link layer assembles the IP datagrams handed down by the network layer into frames and transmits the frames over the link between the adjacent nodes. Each frame contains the data plus the necessary control information (such as synchronization information, address information, and error control). When receiving data, the control information lets the receiver know at which bit a frame begins and ends, so that when the data link layer receives a frame it can extract the data portion and hand it up to the network layer. The control information also lets the receiver detect errors in the received frame. If an error is detected, the data link layer simply discards the faulty frame to avoid wasting further network resources. If errors in link-layer transmission must be corrected (that is, the data link layer must not only detect but also correct errors), a reliable transmission protocol is used, which complicates the link-layer protocol.
  5. Physical layer: The unit of data transferred at the physical layer is the bit. The role of the physical layer is to achieve transparent transmission of bit streams between adjacent computer nodes and, as far as possible, hide the differences between specific transmission media and physical devices, so that the data link layer above it does not have to consider what the network's actual transmission medium is. “Transparent transmission of the bit stream” means that the bit stream is not changed by the actual circuit; the circuit appears invisible to the transmitted bit stream. Of the various protocols used on the Internet, the most important and best known are TCP and IP. Today, TCP/IP usually refers not just to TCP and IP themselves but to the entire TCP/IP protocol family used by the Internet.
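
To tie the layers together, here is a minimal sketch in Python (the language and the host example.com are my own assumptions, not from the original article): the application layer composes an HTTP message and hands it to a TCP socket, while the transport, network, data link, and physical layers below are handled by the operating system's protocol stack.

import socket

# Application layer: compose an HTTP request message.
request = b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"

# Transport layer and below: SOCK_STREAM selects TCP; the kernel segments the
# data and adds TCP headers, the network layer adds IP headers, and the link
# layer frames the packets for transmission over the physical medium.
with socket.create_connection(("example.com", 80)) as sock:
    sock.sendall(request)             # hand the message to the transport layer
    reply = sock.recv(4096)           # first chunk of the server's reply

print(reply.split(b"\r\n", 1)[0])     # status line, e.g. b'HTTP/1.1 200 OK'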

TCP three-way handshake and four-way wave

TCP connection setup and teardown are handled automatically by the TCP/IP protocol stack, so developers do not need to control this process, but it is helpful to understand how TCP works underneath.

Establishing a TCP connection

TCP establishes a connection with a three-way handshake, which requires the client and the server to exchange three packets in total. The process is as follows (a socket-level sketch follows the list):

  • The client sends a packet with the SYN flag to the server (first handshake);
  • The server replies with a packet carrying the SYN and ACK flags (second handshake);
  • The client sends a packet with the ACK flag back to the server (third handshake).
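
As a concrete illustration, here is a minimal sketch in Python (the language and the loopback address are my own choices, not from the original article): the three-way handshake is carried out by the protocol stack inside connect() and accept(), so application code never sends SYN or ACK packets itself.

import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())   # SYN sent, SYN/ACK received, ACK sent
conn, addr = server.accept()           # the fully established connection is taken from the queue

print("connected from", addr)
conn.close()
client.close()
server.close()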

Why three handshakes instead of two?

Consider a two-handshake scheme. Suppose A sends a connection request to B, but the message is held up at some node due to network problems, and the delay exceeds the timeout period, so A assumes the message is lost and retransmits it. Later, after the communication between A and B has finished, the stale message that A already regards as invalid finally reaches B.

To B, this looks like a new connection request, so B sends an acknowledgement to A. To A, however, the previous exchange has already ended and it has nothing more to send, so A ignores B’s acknowledgement, while B keeps waiting for data from A. B’s time is wasted (for a server, resources such as CPU are wasted), which is unacceptable; this is why two handshakes are not enough.

Hence the three-way handshake. The third handshake looks redundant but is not: its main purpose is to prevent the server from mistakenly establishing a connection when a stale, already-invalid connection request segment suddenly arrives.

Closing a TCP connection

Tearing down a TCP connection requires four packets, so the process is called a four-way wave (or four-way handshake). Either the client or the server may initiate it; in socket programming, whichever side calls close() triggers the wave (a socket-level sketch follows the list).

  • The client sends a FIN to shut down client-to-server data transfer;
  • When the server receives the FIN, it sends back an ACK whose acknowledgement number is the received sequence number plus one. Like a SYN, a FIN consumes one sequence number;
  • When the server has finished sending its remaining data, it closes its side of the connection and sends a FIN to the client;
  • The client sends back an ACK for confirmation, setting the acknowledgement number to the received sequence number plus one.
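
As an illustration, here is a minimal sketch in Python (the language, loopback address, and message contents are my own assumptions, not from the original article). Calling shutdown(SHUT_WR) sends a FIN for one direction; the peer sees end-of-stream but may still send data before closing its own side, which is exactly why the passive side's ACK and FIN are usually sent separately.

import socket

# Set up a listening socket and a client connection on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))           # port 0: let the OS choose a free port
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, _ = server.accept()

client.shutdown(socket.SHUT_WR)         # first wave: the client sends its FIN, no more client data
print(conn.recv(1024))                  # b'': the server sees the client's FIN (already ACKed by the stack)
conn.sendall(b"last data from server")  # the server may still send data before closing
print(client.recv(1024))                # b'last data from server'
conn.close()                            # server FIN and the client's final ACK are handled by the stack
client.close()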

Why is it three handshakes to establish a connection, but four waves to close it?

When establishing a connection, the server is in the LISTEN state; when it receives a SYN packet, it can place its own SYN and the ACK in a single packet and send them to the client together, so only three packets are needed.

When closing a connection, receiving a FIN from the peer only means that the peer will send no more data; we can still receive, and we may not have finished sending our own data yet. So we can either close our side immediately, or first send the remaining data and only then send our own FIN. For this reason, the ACK and the FIN are generally sent separately, which makes four packets in total.

Why is the TIME_WAIT state needed?

According to the TCP state transitions during connection closure, the party that actively closes the connection enters the TIME_WAIT state after sending the ACK for the other party's FIN. The TIME_WAIT state is also called the 2MSL state.

What is 2MSL? MSL is the Maximum Segment Lifetime, the longest time any segment can exist in the network before being discarded, so 2MSL is twice that time. The party that performed the active close keeps waiting for this period (roughly two to four minutes) so that the applications on both ends can finish cleanly.

Why the TIME_WAIT state is required:

  1. During the 2MSL wait, the socket pair that defines this connection (client IP address and port number, server IP address and port number) cannot be used again; a connection with the same addresses and ports can only be created after the 2MSL wait ends, so delayed segments from the old connection cannot be mistaken for segments of a new one.
  2. Each TCP implementation must choose a maximum segment lifetime (MSL), the longest time any segment may remain in the network before being discarded; waiting 2MSL allows old segments of the closed connection to expire and disappear from the network.

Impact of TIME_WAIT:

While one end of a connection is in the TIME_WAIT state, that connection cannot be used; in practice what matters to us is that the port cannot be reused. While a port is still part of a connection in TIME_WAIT, the TCP connection has not been completely released, so an attempt to bind that port will fail. For a server this means that if the process crashes, it cannot be restarted within 2MSL because bind() will fail. One way to solve this is to set the socket's SO_REUSEADDR option, which means the address (and port) may be reused.
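
A minimal sketch in Python (the language and the port number are my own choices, not from the original article) of using SO_REUSEADDR so that a restarted server can re-bind a port whose old connections are still in TIME_WAIT:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Must be set before bind(); allows binding even if old connections on the
# port are still in the TIME_WAIT state.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("0.0.0.0", 8080))   # without the option this may raise "Address already in use"
sock.listen(5)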

How TCP ensures reliable transmission

TCP ensures reliable data transmission in the following ways:

  1. Checksum: TCP maintains a checksum over its header and data. This is an end-to-end checksum designed to detect any change to the data in transit; if a received segment's checksum is wrong, TCP discards the segment and does not acknowledge it (a small checksum sketch follows this list).
  2. Acknowledgements plus sequence numbers (cumulative acknowledgement + SEQ): after receiving data, the receiver acknowledges it, and a cumulative acknowledgement covers all data received in order. TCP numbers the data it sends, so the receiver can reorder out-of-order segments, deliver the data to the application layer in sequence, and discard duplicates.
  3. Flow control: Each side of a TCP connection has a fixed amount of buffer space, and the receiver only lets the sender send as much data as the receiver's buffer can accept. When the receiver cannot keep up with the sender, it prompts the sender to lower its sending rate, preventing packet loss. TCP implements flow control with a variable-size sliding window protocol.
  4. Congestion control: The sender reduces its transmission rate when the network is congested.
  5. Stop-and-wait: The stop-and-wait protocol is also designed for reliable transmission. Its basic principle is that after sending a packet, the sender stops and waits for the other party's acknowledgement, and only sends the next packet after the acknowledgement arrives.
  6. Timeout retransmission: When TCP sends a segment, it starts a timer and waits for the destination to acknowledge the segment. If an acknowledgement does not arrive in time, the segment is retransmitted.
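
To make item 1 concrete, here is a minimal sketch in Python (my own illustration, not from the original article) of the 16-bit one's-complement Internet checksum described in RFC 1071, on which TCP's checksum is based; the real TCP checksum additionally covers a pseudo-header containing the source and destination IP addresses, which is omitted here.

def internet_checksum(data: bytes) -> int:
    """RFC 1071 style 16-bit one's-complement checksum (simplified)."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # fold any carry back into the sum
    return ~total & 0xFFFF                        # one's complement of the folded sum

segment = b"example TCP segment payload"
print(hex(internet_checksum(segment)))
# The receiver recomputes the sum over the data plus the checksum field;
# if nothing changed in transit, the sum (before complementing) is 0xFFFF.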

HTTP and HTTPS

Basic concepts

HTTP is the most widely used network protocol on the Internet. It is a request/response standard between clients and servers (carried over TCP), used to transfer hypertext from WWW servers to local browsers. It can make browsers more efficient and reduce network traffic.

HTTPS is an HTTP channel built for security; in short, it is the secure version of HTTP, obtained by adding an SSL layer underneath HTTP. The security foundation of HTTPS is SSL, so SSL handles the encryption details. The HTTPS protocol serves two main purposes: to establish a secure channel that protects the data in transit, and to verify the authenticity of the website.

The difference between HTTPS and HTTP

  1. HTTPS requires applying for a certificate from a CA. Free certificates are relatively rare, so there is usually some cost.
  2. HTTP is the hypertext transfer protocol and transmits information in plain text, whereas HTTPS is a secure transfer protocol encrypted with SSL (a small sketch follows this list).
  3. HTTP and HTTPS use completely different connection methods and different ports: 80 for the former, 443 for the latter.
  4. HTTP connections are simple and stateless; HTTPS is built from SSL plus HTTP and provides encrypted transmission and identity authentication, making it more secure than HTTP.
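
As a concrete illustration, here is a minimal sketch in Python (the language and the host example.com are my own assumptions, not from the original article) of the “SSL layer under HTTP” idea: the same HTTP request is sent over a TLS-wrapped TCP connection on port 443, and the server's certificate is verified against trusted CAs.

import socket
import ssl

context = ssl.create_default_context()        # loads the trusted CA certificates
with socket.create_connection(("example.com", 443)) as tcp:
    # Wrap the plain TCP connection in TLS; the handshake verifies the
    # server's certificate and negotiates the encryption keys.
    with context.wrap_socket(tcp, server_hostname="example.com") as tls:
        print(tls.version())                  # e.g. 'TLSv1.3'
        tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
        print(tls.recv(4096).split(b"\r\n", 1)[0])   # status line, e.g. b'HTTP/1.1 200 OK'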

HTTP response status codes

As a developer you should know what the HTTP status codes returned by servers mean; only after you have understood these status codes one by one can the various problems encountered at work be handled with ease. Let's look at some of the more common HTTP status codes (a small example of checking them follows the list).

  1. 1xx – Informational: these status codes represent provisional responses. A client should be prepared to receive one or more 1xx responses before the final response. For example:

    • 100: Continue – the initial part of the request has been received and the client should continue sending the rest of it. (New in HTTP 1.1)
    • 101: Switching Protocols – the server is switching to another protocol, as requested by the client. (New in HTTP 1.1)
  2. 2xx – Success: this type of status code indicates that the server successfully accepted the client's request. For example:

    • 200 – OK: everything is fine; the response document for a GET or POST request follows.
    • 201 – Created: the server has created the document, and the Location header gives its URL.
    • 202 – Accepted: the request has been accepted, but processing is not yet complete.
  3. 3xx – Redirection: the client browser must take further action to fulfill the request. For example, the browser may have to request a different page on the server or repeat the request through a proxy server. Example: 304 Not Modified.

  4. 4xx – Client error: an error has occurred and the client appears to be at fault; for example, the client requested a page that does not exist or did not provide valid authentication information. For example:

    • 400 – Bad Request: the request contains a syntax error.
    • 401 – Unauthorized: access was denied; the client tried to access a password-protected page without authorization.
  5. 5xx – Server error: the server was unable to complete the request because it encountered an error. For example:

    • 500 – Internal Server Error: the server encountered an unexpected condition and could not complete the client's request.
    • 502 – Bad Gateway: a server acting as a gateway or proxy received an invalid response from the upstream server it contacted while trying to fulfill the request.
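
As an illustration, here is a minimal sketch in Python (the language, the host example.com, and the branching logic are my own assumptions, not from the original article) of checking a response's status code with the standard http.client module:

from http.client import HTTPConnection

conn = HTTPConnection("example.com", 80)
conn.request("GET", "/")
resp = conn.getresponse()
print(resp.status, resp.reason)            # e.g. 200 OK, 301 Moved Permanently, 404 Not Found

if 200 <= resp.status < 300:
    body = resp.read()                     # 2xx: success, read the response body
elif 300 <= resp.status < 400:
    print(resp.getheader("Location"))      # 3xx: the redirect target to request next
else:
    print("client or server error")        # 4xx / 5xx
conn.close()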

HTTP long connections and short connections

HTTP/1.0 uses short connections by default: each time the client and server perform an HTTP operation, a connection is established and then torn down when the task ends. When a client browser accesses an HTML page or another type of web page that references other web resources (such as JavaScript files, images, and CSS files), the browser re-establishes an HTTP session each time it encounters such a resource. From HTTP/1.1 onwards, long (persistent) connections are used by default. HTTP with long connections adds this line to the response header:

Connection: keep-alive

With a long connection, after a web page has been opened, the TCP connection used for transferring HTTP data between the client and the server is not closed; when the client accesses this server again, it keeps using the already-established connection. Keep-alive does not hold a connection open forever; it has a hold time that can be configured in server software such as Apache. To use persistent connections, both the client and the server must support them.
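
A minimal sketch in Python (the language and the host example.com are my own assumptions, not from the original article) of reusing one persistent connection for several requests with the standard http.client module:

from http.client import HTTPConnection

conn = HTTPConnection("example.com", 80)       # one TCP connection for all requests

for path in ("/", "/index.html"):
    conn.request("GET", path)                  # HTTP/1.1: keep-alive is the default
    resp = conn.getresponse()
    resp.read()                                # the body must be consumed before reusing the connection
    print(path, resp.status, resp.getheader("Connection"))

conn.close()                                   # explicitly tear down the TCP connection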

To get the latest articles, you are welcome to follow my official account.