One, what is the HTTP protocol

1. Definition

HTTP stands for HyperText Transfer Protocol. It is an application-layer protocol for transferring hypertext from a World Wide Web server to a local browser.

2. Packet structure

The structure of an HTTP message is similar to that of a TCP segment. A TCP segment consists of a TCP header followed by data. An HTTP message consists of: start line + headers + blank line + entity (data).

● Start line

1) Request message

E.g.: GET /home HTTP/1.1 (method, path, version)

2) Response message

The start line of a response message is also called the status line, e.g.: HTTP/1.1 200 OK (version, status code, reason phrase)

● Headers (request headers & response headers)

Each header is a key-value pair written in the form Key: Value.
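
The message structure described above (start line + headers + blank line + entity) can be sketched in a few lines of Python. This is only a minimal illustration, not a complete HTTP parser: the function name parse_http_message and the sample request are made up for this example, and details such as repeated headers or chunked bodies are ignored.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into start line, headers, and body."""
    # The blank line (CRLF CRLF) separates the head from the entity body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header line has the form "Key: Value".
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical request, used only for illustration.
raw_request = (
    b"POST /home HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)

start_line, headers, body = parse_http_message(raw_request)
method, path, version = start_line.split(" ", 2)
print(method, path, version)   # POST /home HTTP/1.1
print(headers["host"])         # example.com
print(body)                    # b'hello'
```

The same function works on a response: its start line is the status line, which splits into version, status code, and reason phrase in the same way.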

3. Dependencies (network basics)

HTTP is built on top of TCP. In the four-layer TCP/IP model it sits in the application layer, above the transport layer (TCP), the internet layer (IP), and the link layer, as shown in the following figure:

When a Web application communicates over HTTP, the data flows as follows:

Capturing packets

You can install a packet capture tool yourself (I used Fiddler) to listen for network requests and retrieve the raw packet data. The following is a raw packet obtained with the capture tool; you can use it to verify the packet structure described above.

Nowadays you can view request data directly in the browser's Network panel, so why capture packets at all? In some scenarios the Network panel cannot show the HTTP requests (for example on mobile devices or in applets). If a problem occurs in production and the requests need to be analyzed, a packet capture tool is required. Likewise, for network-related performance optimization, we need to make targeted improvements based on the actual network behavior of the existing website.

Two, long connections & short connections

HTTP long and short connections are essentially TCP long and short connections.

1. What is a long connection

Multiple HTTP requests can reuse the same TCP connection. A long connection is not permanent: if no HTTP request is sent for a certain period of time (the timeout can be specified in the Keep-Alive header), the connection is closed.

2. What is a short connection

A new connection is established for each HTTP request and closed as soon as the request has been served.

3. How to enable the long connection

Set Connection: keep-alive in the message headers. It must be set on both the server and the client. Here is how the company's ERP system project (test environment) is configured:

If a client speaks HTTP/1.1 and does not want a long connection, it must set the Connection header to close. If the server does not want to support long connections either, it should likewise set Connection to close in its response.
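
To see the effect of the Connection header, the following sketch starts a throwaway local server with Python's standard library and counts how many TCP connections it accepts. It is only an illustration under stated assumptions: CountingHandler and count_connections are names invented here, the server runs on 127.0.0.1 with a random free port, and nothing in it reflects the ERP project mentioned above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # long connections are the default in HTTP/1.1
    connections = 0                 # number of TCP connections accepted so far

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        if self.close_connection:
            # The client sent "Connection: close", so answer with it as well.
            self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def count_connections(requests: int, keep_alive: bool) -> int:
    """Send `requests` GETs and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port)
        for _ in range(requests):
            headers = {} if keep_alive else {"Connection": "close"}
            conn.request("GET", "/", headers=headers)
            conn.getresponse().read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

Calling count_connections(3, keep_alive=True) should report a single TCP connection for the three requests, while count_connections(3, keep_alive=False) should report three, one per request.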

4. Concurrency performance with and without long connections

Since we have not yet learned how to run load tests or set up a Web server, let's look at other people's test reports.

Three, the evolution of the HTTP protocol

1. HTTP/1.0

By default, a new TCP connection is opened for each request and closed as soon as the response has been received. With a large number of requests it is easy to hit the browser's limit on concurrent connections (mainstream browsers such as Firefox and Chrome allow at most 6 concurrent connections per host), and the communication overhead is high.

2. HTTP/1.1

● Long connections are implemented and enabled by default, that is, multiple HTTP requests and responses can be sent over one TCP connection. When a client issues many requests, this reduces the network resources and time wasted on repeated TCP handshakes.

● More cache-control mechanisms were introduced than in 1.0: ETag, If-Unmodified-Since, If-Match, If-None-Match, and so on. For a more detailed description of HTTP caching, see MDN.

● Requests must include the Host header, which lets a server distinguish between different domain names (and port numbers) served from the same IP address.
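
As an illustration of the ETag / If-None-Match pair listed above, here is a simplified sketch of the decision a server makes for a conditional GET. The helper names make_etag and respond are invented for this example, and the comparison is deliberately naive: a real server also has to handle lists of tags, weak validators (W/"..."), and the * wildcard.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Derive the tag from the content; real servers may use other schemes.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still valid: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


# First visit: no validator yet, so the full resource comes back.
status, headers, payload = respond({}, b"<html>v1</html>")
print(status)   # 200

# Revisit: the client echoes the ETag it cached.
status, _, payload = respond({"If-None-Match": headers["ETag"]}, b"<html>v1</html>")
print(status, payload)   # 304 b''
```

If the resource changes on the server, its tag changes too, the comparison fails, and the client receives a fresh 200 response with the new content.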

3. HTTP/2.0

  • Binary framing: HTTP/1.x is parsed as text. Text protocols have an inherent weakness: the same content can be written in many forms, so a robust parser has to handle many cases. A binary format has fixed-size fields with a single interpretation, so HTTP/2.0 adopted binary framing, which is both easier to implement and more robust.

  • Multiplexing: One TCP connection can carry multiple streams, so multiple requests can be in flight at once. Each request belongs to a stream with its own ID. Frames of different requests can be interleaved in any order on the connection, and the receiver uses the stream ID to reassemble each request or response.

  • Header compression: In HTTP/1.x the message body is usually compressed with the encoding named in the Content-Encoding header field, but the headers themselves are not compressed. When a request carries many header fields, and especially for GET requests, the message is almost entirely headers, which leaves a lot of room for optimization. HTTP/2 therefore compresses header fields with a dedicated algorithm, HPACK.

  • Server push: In HTTP/2 the server no longer only receives and answers requests passively; it can also open a stream and send messages to the client on its own initiative. For example, once the TCP connection is established and the browser requests an HTML file, the server can return the HTML and also push the other resource files the HTML references, reducing the time the client spends waiting.
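
The binary framing and multiplexing points can be made concrete with a small sketch. Every HTTP/2 frame starts with a fixed 9-byte header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and one reserved bit followed by a 31-bit stream ID. The helpers pack_frame, iter_frames, and demultiplex below are names made up for this example, and the sketch is heavily simplified: it only handles DATA frames and skips HEADERS frames, HPACK, settings, and flow control.

```python
import struct

FRAME_HEADER_LEN = 9   # fixed-size frame header
DATA = 0x0             # frame type for DATA frames
END_STREAM = 0x1       # flag marking the last frame of a stream


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame: 24-bit length, type, flags, 31-bit stream ID, payload."""
    header = len(payload).to_bytes(3, "big") + struct.pack(
        ">BBI", frame_type, flags, stream_id & 0x7FFFFFFF
    )
    return header + payload


def iter_frames(data: bytes):
    """Yield (type, flags, stream_id, payload) for each frame in a byte string."""
    offset = 0
    while offset < len(data):
        length = int.from_bytes(data[offset:offset + 3], "big")
        frame_type, flags, stream_id = struct.unpack(
            ">BBI", data[offset + 3:offset + FRAME_HEADER_LEN]
        )
        stream_id &= 0x7FFFFFFF   # drop the reserved bit
        start = offset + FRAME_HEADER_LEN
        yield frame_type, flags, stream_id, data[start:start + length]
        offset = start + length


def demultiplex(data: bytes) -> dict:
    """Reassemble interleaved DATA frames into one byte string per stream ID."""
    streams = {}
    for frame_type, _flags, stream_id, payload in iter_frames(data):
        if frame_type == DATA:
            streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


# Frames of two streams (IDs 1 and 3) interleaved on one connection.
wire = (
    pack_frame(DATA, 0, 1, b"he")
    + pack_frame(DATA, 0, 3, b"wor")
    + pack_frame(DATA, END_STREAM, 1, b"llo")
    + pack_frame(DATA, END_STREAM, 3, b"ld")
)
print(demultiplex(wire))   # {1: b'hello', 3: b'world'}
```

Because every field has a fixed position and width, the parser needs no guessing about whitespace, casing, or line endings, and the stream ID in each header is all the receiver needs to sort interleaved frames back into their requests.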

4. HTTP/3.0

Although HTTP/2 solves many of the problems of 1.1, it still has defects. They come not from HTTP/2 itself but from the underlying TCP protocol. TCP is a reliable, ordered transport: when a packet is lost, everything behind it must wait for the retransmission. HTTP/1.1 can use up to six TCP connections at the same time, so if one is blocked the other five keep working, but HTTP/2 uses a single TCP connection, which magnifies the blocking problem. Because TCP is so widely deployed, it is hard to change TCP itself. HTTP/3 therefore takes a different route and builds on UDP: it implements multiplexing, 0-RTT connection setup, TLS encryption, flow control, and packet-loss detection with retransmission on top of UDP (through the QUIC protocol).

To sum up, HTTP/1.x is adequate when requests are infrequent. When loading a home page that issues many requests to the back end and needs fast responses, HTTP/2.0 is a good fit. On weak or unstable networks, such as mobile networks, where packets are easily lost, HTTP/3.0 can be used to optimize.
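
The blocking problem can be illustrated with a toy model. This is not a network simulation and not how TCP or QUIC are implemented: the functions deliverable_tcp and deliverable_quic and the sample packets are invented here purely to show which data the application can read while one lost packet is still awaiting retransmission.

```python
def deliverable_tcp(packets, lost):
    """One ordered sequence for the whole connection: stop at the first gap."""
    delivered = []
    for seq, (stream_id, chunk) in enumerate(packets):
        if seq in lost:
            break  # everything after the gap waits for the retransmission
        delivered.append((stream_id, chunk))
    return delivered


def deliverable_quic(packets, lost):
    """Ordering is tracked per stream: a gap stalls only its own stream."""
    stalled = set()
    delivered = []
    for seq, (stream_id, chunk) in enumerate(packets):
        if seq in lost:
            stalled.add(stream_id)
        elif stream_id not in stalled:
            delivered.append((stream_id, chunk))
    return delivered


# Two streams multiplexed on one connection; the packet at index 1 is lost.
packets = [(1, "a1"), (3, "b1"), (1, "a2"), (3, "b2")]
lost = {1}

print(deliverable_tcp(packets, lost))    # [(1, 'a1')]
print(deliverable_quic(packets, lost))   # [(1, 'a1'), (1, 'a2')]
```

In the single-ordered-sequence model, stream 1 is held up by a loss that belongs to stream 3. With per-stream ordering, stream 1 keeps flowing and only stream 3 waits.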

Four, HTTP and HTTPS

HTTPS is HTTP layered over SSL or TLS (TLS is the standardized successor to SSL 3.0). It encrypts data, verifies the identity of the peer, and protects data integrity.

1. HTTPS features

  • Content encryption: hybrid encryption is used, so a man in the middle cannot read the plaintext content directly
  • Authentication: a certificate lets the client verify that it is talking to the intended server
  • Data integrity protection: prevents transmitted content from being forged or tampered with by a man in the middle

2. The difference between HTTP and HTTPS

  • Certificates: HTTPS requires a certificate issued by a certificate authority (CA). Free certificates are less common, so a fee is often required.
  • Ports: the default HTTP port is 80, and the default HTTPS port is 443.
  • Encryption: HTTP runs directly on top of TCP, and all transmitted content is in plaintext. HTTPS runs on top of SSL/TLS, which in turn runs on top of TCP, and all transmitted content is encrypted.
  • Security: HTTPS is HTTP plus SSL/TLS, providing encrypted transmission and identity authentication. It effectively prevents hijacking by network carriers and is more secure than HTTP.
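
From the client's point of view, the certificate check and the layering of TLS over TCP look roughly like this with Python's standard ssl module. This is only a sketch: describe_context and tls_info are helper names invented for this example, and tls_info needs network access and a reachable HTTPS host, so it is defined here but not called.

```python
import socket
import ssl

# The default client context verifies the server certificate against the
# system's trusted CAs and checks that it matches the requested host name.
context = ssl.create_default_context()


def describe_context(ctx: ssl.SSLContext) -> dict:
    """Summarize the two checks that implement server authentication."""
    return {
        "verifies_certificate": ctx.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": ctx.check_hostname,
    }


def tls_info(host: str, port: int = 443, timeout: float = 5.0):
    """Open TCP first, then wrap it in TLS; return the TLS version and cert subject."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.getpeercert().get("subject")


print(describe_context(context))
# {'verifies_certificate': True, 'checks_hostname': True}
```

The nesting in tls_info mirrors the layering described above: a plain TCP socket to port 443 is created first, and the TLS session is established on top of it before any HTTP data is sent.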

Q: What does the process of establishing an HTTPS connection look like? A: Reference

Five, WebSocket

WebSocket is a protocol that arrived with HTML5 and is essentially built on TCP. It performs its handshake with a special HTTP request (an Upgrade request sent over HTTP/HTTPS) and then uses the underlying TCP connection to exchange data in both directions. It is also a long connection.

HTTP/2.0 can also do server push, so how does it differ from WebSocket? HTTP/2.0 lets the server push resources to the client, but the push is not visible to the application. Its main purpose is to let the browser (user agent) cache static resources ahead of time, so we cannot expect HTTP/2 to provide two-way real-time communication the way WebSocket does. For real-time communication we therefore still need to talk to the server through WebSocket.
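
The "special HTTP request" of the handshake can be sketched as follows. The client sends an Upgrade request containing a random Sec-WebSocket-Key, and the server proves it understood the request by returning a Sec-WebSocket-Accept value derived from that key. The function name websocket_accept and the /chat path are chosen for this example; the key is the sample value used in the WebSocket specification.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket specification (RFC 6455).
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


sample_key = "dGhlIHNhbXBsZSBub25jZQ=="   # sample key from the specification

# Hypothetical handshake request; only the path and host are made up.
handshake_request = (
    "GET /chat HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    f"Sec-WebSocket-Key: {sample_key}\r\n"
    "Sec-WebSocket-Version: 13\r\n"
    "\r\n"
)

print(websocket_accept(sample_key))   # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The server answers with status 101 Switching Protocols and this accept value. From then on the same TCP connection carries WebSocket frames instead of HTTP messages.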