Hi guys, this is Programmer Cxuan. Welcome to the latest issue of my article.
In this article, we will talk about HTTP 2.0, what changes HTTP 2.0 has made over HTTP 1.1, and what features HTTP 2.0 has.
By the way, if you haven’t read my HTTP 1.1 series, I suggest you read the following articles, which are very nice, and you’ll get something out of them.
By the end of this HTTP post, you’ll be fine with bickering with your interviewer
By the end of this HTTPS, you’ll be fine with bickering with the interviewer
The original HTTP
When HTTP was first created, it was used only to fetch web content, generally for page access. Pages then were far less rich than they are now: there was little interaction, no huge and complicated CSS and JS, and pages loaded very quickly. With the advent of Web 2.0, pages display more content, have better typography, and support more user interaction, so they keep getting bigger and load more and more slowly.
The bottleneck of HTTP
There are two main factors that affect an HTTP network request: bandwidth and latency.
Let’s start with bandwidth. In the dial-up era, bandwidth was an easy bottleneck because so little data could be transferred per unit of time. Now that 10 Mbps, 100 Mbps, and even 1,000 Mbps connections are common in homes, bandwidth is rarely the bottleneck for network requests.
That leaves latency.
Latency has three main sources:
- HOL (head-of-line) blocking: Browsers block requests for a number of reasons. A browser keeps only a limited number of connections to the same domain at a time, four in this example (the exact limit varies by browser kernel). Once that limit is reached, subsequent requests are blocked.
- DNS lookup: The browser needs to know the IP address of the target server to establish a connection. The system that resolves domain names into IP addresses is called DNS. Cached DNS results can often be used to reduce this time.
- Initial connection: HTTP is based on TCP, so the browser can send data only after the three-way handshake completes. If connections cannot be reused, every request pays for a three-way handshake and for TCP slow start. The handshake cost is most visible on high-latency links, and the slow-start cost is most visible on large file requests.
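To make the cost of the initial connection concrete, here is a rough back-of-the-envelope sketch. It assumes each handshake costs about one round trip and each request/response costs one more, and it ignores DNS, TLS, slow start, and transfer time. The function name and the numbers are illustrative, not measurements.

```python
def total_time_ms(requests, rtt_ms, reuse_connection):
    """Rough total time for sequential requests over plain TCP."""
    handshake = rtt_ms  # assumption: the three-way handshake costs about one RTT
    exchange = rtt_ms   # assumption: one request/response pair costs one RTT
    if reuse_connection:
        # One handshake up front, then every request rides on the same connection.
        return handshake + requests * exchange
    # A fresh connection, and therefore a fresh handshake, for every request.
    return requests * (handshake + exchange)


print(total_time_ms(10, 100, reuse_connection=False))  # 2000
print(total_time_ms(10, 100, reuse_connection=True))   # 1100
```

With a 100 ms round trip, ten requests take roughly twice as long when no connection is reused, which is why high-latency links suffer the most.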
One of the most common complaints about HTTP 1.0 is that connections cannot be reused: every new request goes through a three-way handshake and a four-way teardown, and setting up and releasing connections consumes a lot of server resources. This is manageable when a page makes few requests, but as the number of requests grows, HTTP 1.0 struggles more and more.
However, Connection: keep-alive can be set in HTTP 1.0 headers. If it is set, the connection can be reused for a certain period of time. The server controls how long, usually around 15 seconds. From HTTP 1.1 onward, Connection defaults to keep-alive. To disable connection reuse, explicitly set Connection: close.
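The keep-alive rule described above can be written down as a small decision function. This is only a sketch of the defaults mentioned in the text: the function name is made up for illustration, and real servers also apply timeouts and other limits that are not modelled here.

```python
def connection_persists(http_version, headers):
    """Return True if the connection should stay open after the response."""
    token = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "connection":
            token = value.strip().lower()
    if http_version == "HTTP/1.0":
        # HTTP 1.0 closes by default; reuse must be requested explicitly.
        return token == "keep-alive"
    # HTTP 1.1 keeps the connection open unless told to close it.
    return token != "close"


print(connection_persists("HTTP/1.0", {}))                            # False
print(connection_persists("HTTP/1.0", {"Connection": "keep-alive"}))  # True
print(connection_persists("HTTP/1.1", {}))                            # True
print(connection_persists("HTTP/1.1", {"Connection": "close"}))       # False
```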
Another frequent complaint about HTTP is head-of-line blocking: because later requests must wait for earlier ones, bandwidth is not fully utilized.
If five requests are issued at the same time, the later requests cannot be processed until the first one completes, as shown in the figure below.
Requests 2, 3, 4, and 5 stay blocked until request 1 has been processed. When the network is healthy, the performance impact is small. But if request 1 fails to reach the server for some reason, or its response is delayed by network congestion, every subsequent request is affected and may be blocked indefinitely, which makes the problem much more serious.
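A tiny simulation shows how one slow request drags the others down. It is a simplified model, assuming a single connection that handles requests strictly one after another. The millisecond values are invented for illustration.

```python
def completion_times(service_times):
    """Finish time of each request when requests are handled one after another."""
    finished = []
    clock = 0
    for duration in service_times:
        clock += duration        # a request cannot start before the previous one ends
        finished.append(clock)
    return finished


# Five quick requests on a healthy network.
print(completion_times([10, 10, 10, 10, 10]))   # [10, 20, 30, 40, 50]
# The same requests when request 1 is stuck for 500 ms.
print(completion_times([500, 10, 10, 10, 10]))  # [500, 510, 520, 530, 540]
```

Requests 2 to 5 are just as cheap in both runs, yet in the second run none of them can finish before the 500 ms mark.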
In HTTP 1.1, pipelining was designed to solve head-of-line blocking, as shown in the following figure.
While at first glance this pipelined design might seem to solve the blocking problem, since the three requests on the right are sent one after the other rather than waiting for the response to arrive, in fact, this is not the case.
Pipelining is no saviour, and it has its flaws:
- Only idempotent requests such as GET and HEAD can be pipelined. Non-idempotent requests such as POST cannot, because there may be sequential dependencies between requests.
- Head-of-line blocking is not actually solved, because the server must still return responses in the order the requests arrived: first in, first out (FIFO).
- Most HTTP proxy servers do not support Pipelining.
- There were issues negotiating with older servers that did not support Pipelining.
Because of these problems, browser vendors either do not support pipelining at all or ship it turned off by default, with stringent conditions for enabling it.
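The FIFO flaw of pipelining can be sketched in a few lines. The model below assumes the server may work on pipelined requests in any order but must send the responses back in request order, and it ignores transmission time. The function name and timings are illustrative.

```python
def pipelined_delivery(ready_times):
    """ready_times[i] is when the server has response i ready.

    Responses must leave in request order, so a response can never be
    delivered before the one in front of it.
    """
    delivered = []
    previous = 0
    for ready in ready_times:
        previous = max(previous, ready)  # wait for everything ahead in the queue
        delivered.append(previous)
    return delivered


# Response 1 needs 300 ms; responses 2 and 3 are ready almost immediately.
print(pipelined_delivery([300, 20, 30]))  # [300, 300, 300]
```

Even though the second and third responses are ready early, they sit behind the first one, so the head of the queue still blocks the rest.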
SPDY
HTTP 1.0 and HTTP 1.1 had many problems, and the industry came up with all sorts of optimizations, but they only treated the symptoms. It was not until Google introduced SPDY in 2009 that people began to tackle the problems of the old HTTP protocol itself, which directly accelerated the birth of HTTP 2.0.
Let’s talk about what SPDY is and what features it has.
Getting to know SPDY
SPDY aims to address HTTP’s shortcomings, namely latency and security. We have covered latency above. We have not discussed security in detail, but HTTP’s plaintext transport is a problem. With lower latency as the goal, both HTTP at the application layer and TCP at the transport layer leave room for adjustment. But TCP, as the lower-level protocol, has existed for decades and is deeply embedded in the global network infrastructure. Changing it would be a major upheaval, and the industry would be slow to respond. So SPDY’s scalpel is aimed at HTTP.
- Reduce latency. The client’s one-request-at-a-time use of a connection and the server’s FIFO response queue are both major sources of latency.
- Let the server initiate transfers. HTTP was originally designed so that the client initiates a request and the server responds. The server cannot actively send content to the client.
- Compress HTTP headers. HTTP 1.x headers are getting bloated, and cookies and user agents can easily increase the size of headers to 1KB or more. And because of the stateless nature of HTTP, headers must be carried repeatedly with each request, wasting traffic.
To improve its chances of solving these problems, Google deliberately avoided the transport layer from the start and planned to use the open source community to spread the protocol. For users of the protocol, adopting it only requires setting the User-Agent in the request header and adding support on the server side, which greatly reduces the difficulty of deployment. The design of SPDY is as follows.
SPDY functions are divided into basic functions and advanced functions. Basic functions are enabled by default, while advanced functions need to be manually enabled.
SPDY basic functionality
- Multiplexing reduces the overhead of establishing and releasing TCP connections and improves bandwidth utilization by sharing one connection with multiple requests.
- Request prioritization. One problem multiplexing introduces is that key requests can be blocked by others on the shared connection. SPDY lets you set a priority for each request so that the most important ones get a response first.
- Header compression. As mentioned above, HTTP 1.x headers are often repetitive and redundant. Choosing the right compression algorithm reduces the size and number of packets. SPDY can reduce header size by more than 80%.
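Why repeated headers compress so well is easy to see with zlib, the compression library SPDY used for headers. The sketch below is not SPDY’s actual wire format: it simply feeds the same HTTP 1.x style header text through one shared compression stream, the way a long-lived connection would. The header values are made up.

```python
import zlib

# Illustrative request headers; the values are invented for this example.
HEADERS = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"User-Agent: Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0\r\n"
    b"Accept: text/html,application/xhtml+xml\r\n"
    b"Cookie: session=abcdef0123456789; theme=dark\r\n\r\n"
)


def compress_on_one_connection(header_blocks):
    """Compress each block with one shared zlib stream and return each size."""
    compressor = zlib.compressobj()
    sizes = []
    for block in header_blocks:
        # Z_SYNC_FLUSH emits the bytes for this block but keeps the history,
        # so later blocks can refer back to earlier ones.
        chunk = compressor.compress(block) + compressor.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(chunk))
    return sizes


sizes = compress_on_one_connection([HEADERS, HEADERS, HEADERS])
print(len(HEADERS), sizes)
```

The first block shrinks only moderately, while the repeats cost just a handful of bytes each, because the compressor has already seen the same text on this connection.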
SPDY advanced features
- Server push. In plain HTTP, only the client can initiate a transfer, and the server can only respond passively. With server push enabled, the X-Associated-Content header tells the client that the server is pushing new content.
- Server hint. Unlike server push, server hint does not push content. It only tells the client that new content is available, and the client must still request it. The X-Subresources header carries the hint. Normally the client has to query the server’s state before downloading resources, so the hint saves one query request.
With SPDY, page load times dropped by as much as 64 percent compared to HTTP, and major browser vendors added support within a little over a year. However, SPDY did not last as long as people might think: it took off around 2012 and went out of maintenance in 2016. If HTTP 2.0 had not come along, I’m sure Google would have collected more real-world feedback and data. Still, SPDY did its job during that time.
A first look at HTTP 2.0
HTTP 2.0, also known as HTTP/2, is version 2 of the Hypertext Transfer Protocol. The popularity of SPDY showed the IETF how effective the optimizations were and that HTTP could be improved by changing the protocol layer, so the IETF began formal work on HTTP 2.0. Some of SPDY’s designers were also invited to contribute to the design of HTTP 2.0.
From the start, HTTP 2.0 was designed to be universal. SPDY was more like Google’s own toy, which Google could change however it wanted. HTTP 2.0 had no such freedom: every early design decision affects future maintenance, and any flaw or gap can have a huge impact, so each problem had to be considered rigorously and carefully.
HTTP 2.0 was designed with some important presuppositions:
- The basic model of the client sending requests to the server does not change.
- The URL scheme does not change. Services and applications keep using http:// and https://, and no http2:// scheme is introduced.
- Clients and servers using HTTP 1.x can smoothly upgrade to HTTP 2.0.
- Proxy servers that do not recognize HTTP 2.0 can downgrade requests to HTTP 1.x.
Before choosing between HTTP 1.x and HTTP 2.0, the client needs to find out whether the server supports HTTP 2.0, so the two sides must negotiate, and a separate negotiation would add one RTT of delay to every connection. The whole point of changing HTTP 1.x was to reduce latency, so an extra RTT is clearly unacceptable. Google ran into the same problem when developing SPDY. Its approach was to force negotiation to happen at the SSL layer, and it developed a TLS extension called NPN (Next Protocol Negotiation) for this. HTTP 2.0 takes a similar approach, with two differences settled after discussion: HTTP 2.0 does not make the SSL layer mandatory, and instead of NPN it uses a TLS extension called ALPN (Application-Layer Protocol Negotiation). SPDY also planned to migrate to ALPN.
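Here is a minimal sketch of what ALPN negotiation looks like from code, using Python’s standard ssl module. The client lists the protocols it can speak inside the TLS handshake, and the server picks one, so no extra round trip is spent on negotiation. The helper names client_context and server_select are made up for this example, and server_select is only a plain-Python model of the server’s choice, not part of any library.

```python
import ssl


def client_context(protocols=("h2", "http/1.1")):
    """Build a TLS client context that offers these protocols via ALPN."""
    ctx = ssl.create_default_context()
    # "h2" is the ALPN identifier for HTTP/2 over TLS.
    ctx.set_alpn_protocols(list(protocols))
    return ctx


def server_select(server_preference, client_offer):
    """Model of the server's choice: its first preference that the client offered."""
    for proto in server_preference:
        if proto in client_offer:
            return proto
    return None  # no protocol in common


print(server_select(["h2", "http/1.1"], ["h2", "http/1.1"]))  # h2
print(server_select(["h2", "http/1.1"], ["http/1.1"]))        # http/1.1
```

After a real handshake, a client can read the result with SSLSocket.selected_alpn_protocol() and fall back to HTTP 1.x when the answer is not "h2".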
Major changes in HTTP 2.0
HTTP 2.0 has changed a lot since its design and inception, but for developers and vendors, there are a few major changes:
Binary format
HTTP 1.x is a plaintext protocol whose messages consist of three main parts: the request line, the headers, and the body. A parser must identify these three parts, and because parsing is text-based, it has to cope with many variations in how the text can be written. A binary format only deals in 0s and 1s in fixed positions, so it is much more regular. For this reason HTTP 2.0 adopted a binary format, which is easy to implement and robust.
The following figure nicely illustrates the different message formats used by HTTP1.x and HTTP 2.0.
In an HTTP 2.0 frame, Length gives the size of the frame payload, so the receiver knows where the frame ends. Type identifies the frame type, and there are ten different types of frame. Flags carries type-specific parameters. Stream ID identifies the stream the frame belongs to. The payload that follows carries the frame’s content, such as header fields or the body of the request.
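The frame fields described above fit in a fixed 9-byte header: a 24-bit length, an 8-bit type, 8 bits of flags, then one reserved bit and a 31-bit stream ID. The sketch below packs and parses that header. The function names are my own, and this is only the header, not a full HTTP 2.0 implementation.

```python
# The ten frame types and their type codes.
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def pack_frame_header(length, frame_type, flags, stream_id):
    """Encode the 9-byte frame header."""
    return (
        length.to_bytes(3, "big")                       # 24-bit payload length
        + bytes([frame_type, flags])                    # 8-bit type, 8-bit flags
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")   # reserved bit cleared
    )


def parse_frame_header(data):
    """Decode a 9-byte frame header into (length, type name, flags, stream ID)."""
    if len(data) < 9:
        raise ValueError("a frame header is 9 bytes long")
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF  # drop reserved bit
    return length, FRAME_TYPES.get(frame_type, "UNKNOWN"), flags, stream_id


header = pack_frame_header(16, 0x1, 0x4, 1)
print(parse_frame_header(header))  # (16, 'HEADERS', 4, 1)
```

Because every field sits at a fixed offset, the parser needs no text scanning at all, which is the robustness argument made for the binary format.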
Although the HTTP 2.0 packet format looks completely different from HTTP 1.x, HTTP 2.0 does not change the semantics of HTTP 1.x. It just encapsulates the semantics of HTTP 1.x, as shown in the figure below
As you can see from the figure above, the HTTP 1.x request line and headers are encapsulated in a HEADERS frame by HTTP 2.0, while the HTTP 1.x body is encapsulated in a DATA frame. Browsers even convert HTTP 2.0 frames back to HTTP 1.x format automatically during debugging.
Connection sharing
HTTP 1.x never really solved connection multiplexing, so connection sharing is one of the big problems HTTP 2.0 set out to solve. With HTTP 2.0, frames from many streams can be mixed together and transmitted over the same connection, and the receiver reassembles each stream using the Stream ID at the start of every frame.
What is a stream?
A stream is a virtual channel within a connection that can carry messages in both directions. Each stream has a unique integer identifier. To prevent stream ID conflicts between the two ends, streams initiated by the client have odd IDs and streams initiated by the server have even IDs.
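Reassembling interleaved frames by stream ID can be sketched as follows. This toy model treats a frame as a (stream ID, payload) pair and ignores frame types, flags, and flow control. The names reassemble and initiated_by are invented for the example.

```python
from collections import defaultdict


def reassemble(frames):
    """Group interleaved (stream_id, chunk) frames back into per-stream data."""
    streams = defaultdict(bytes)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk  # frames of one stream arrive in order
    return dict(streams)


def initiated_by(stream_id):
    """Tell who opened a stream from the parity of its ID."""
    if stream_id == 0:
        return "connection"  # stream 0 is used for connection-level frames
    return "client" if stream_id % 2 == 1 else "server"


# Frames from streams 1 and 3 mixed together on one connection.
wire = [(1, b"<html>"), (3, b"body{"), (1, b"</html>"), (3, b"}")]
print(reassemble(wire))  # {1: b'<html></html>', 3: b'body{}'}
print(initiated_by(1), initiated_by(2))  # client server
```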
As we mentioned above, one of the main reasons HTTP 1.x does not really address connection sharing is the inability to prioritize different requests, which can block critical requests. HTTP 2.0 allows you to assign different priorities to different streams, and dependencies can be set between streams. Dependencies and priorities can be adjusted dynamically, which solves the problem of critical requests being blocked.
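One way to picture stream priorities is as weights that split the available capacity. The sketch below assumes the streams are siblings with no dependencies between them and that each gets a share proportional to its weight, with weights from 1 to 256. It is a simplification for illustration, not a scheduler, and the function name is made up.

```python
def bandwidth_shares(weights):
    """Split capacity between sibling streams in proportion to their weights.

    weights maps stream ID to an integer weight between 1 and 256.
    """
    for stream_id, weight in weights.items():
        if not 1 <= weight <= 256:
            raise ValueError(f"weight for stream {stream_id} must be in 1..256")
    total = sum(weights.values())
    return {stream_id: weight / total for stream_id, weight in weights.items()}


# A critical stylesheet on stream 1, a less important image on stream 3.
print(bandwidth_shares({1: 12, 3: 4}))  # {1: 0.75, 3: 0.25}
```

Raising the weight of a critical stream at run time shifts capacity toward it without opening another connection.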
Header compression
HTTP 2.0 uses an encoder to reduce the size of headers. Each side of the connection caches a table of header fields, which avoids transmitting the same headers repeatedly and reduces the transfer size. HTTP 2.0 uses the HPACK compression algorithm.
The main idea of this compression algorithm can be found in the official documentation httpwg.org/specs/rfc75…
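The idea of a header table cached on both sides can be shown with a toy model. To be clear, this is not HPACK: real HPACK also has a predefined static table, a size-limited dynamic table with eviction, and Huffman coding. The class below only demonstrates the core trick of sending an index instead of a header that both sides have already seen.

```python
class HeaderTable:
    """Toy header table; the encoder and the decoder each keep their own copy."""

    def __init__(self):
        self.entries = []

    def encode(self, headers):
        out = []
        for field in headers:  # field is a (name, value) tuple
            if field in self.entries:
                out.append(("index", self.entries.index(field)))
            else:
                self.entries.append(field)
                out.append(("literal", field))
        return out

    def decode(self, encoded):
        headers = []
        for kind, value in encoded:
            if kind == "index":
                headers.append(self.entries[value])
            else:
                # Mirror the encoder so both tables stay in sync.
                self.entries.append(value)
                headers.append(value)
        return headers


request = [(":method", "GET"), ("user-agent", "ExampleBrowser/1.0")]
encoder, decoder = HeaderTable(), HeaderTable()

first = encoder.encode(request)    # everything is sent as a literal
second = encoder.encode(request)   # everything is sent as a small index
print(first)
print(second)  # [('index', 0), ('index', 1)]
print(decoder.decode(first) == request, decoder.decode(second) == request)  # True True
```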
Server push
As discussed above, with HTTP 2.0 the server can push content to the client ahead of time. Because the client does not have to send a request or set up a connection for the pushed resource, Server Push can greatly speed up the delivery of static resources. Server push has an even bigger advantage: caching. Pushed resources can be cached and shared between pages.
Note the following points:
1. Push follows the same origin policy;
2. A server push is always tied to a request from the client: the server pushes resources in response to a request it has received.
When a server wants to push a resource, it sends a frame of type PUSH_PROMISE carrying the ID of the stream the push will use. It tells the client: I’m about to send you something on this stream ID, so get ready. When the client parses the frame and sees that it is of type PUSH_PROMISE, it prepares to receive the stream that the server is pushing.
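A PUSH_PROMISE frame can be sketched at the byte level like this. The frame uses type code 0x5, travels on the stream of the client’s original request, and its payload begins with the 31-bit promised stream ID. The sketch leaves out padding, and the header block is a placeholder byte string rather than real HPACK output. The function names are illustrative.

```python
PUSH_PROMISE = 0x5  # frame type code
END_HEADERS = 0x4   # flag: the header block is complete in this frame


def build_push_promise(request_stream_id, promised_stream_id, header_block):
    """Build an unpadded PUSH_PROMISE frame as raw bytes."""
    if promised_stream_id == 0 or promised_stream_id % 2 != 0:
        raise ValueError("server-initiated streams use even, non-zero IDs")
    payload = (promised_stream_id & 0x7FFFFFFF).to_bytes(4, "big") + header_block
    header = (
        len(payload).to_bytes(3, "big")
        + bytes([PUSH_PROMISE, END_HEADERS])
        + (request_stream_id & 0x7FFFFFFF).to_bytes(4, "big")
    )
    return header + payload


def promised_stream(frame):
    """Client side: read the promised stream ID out of a PUSH_PROMISE frame."""
    if frame[3] != PUSH_PROMISE:
        raise ValueError("not a PUSH_PROMISE frame")
    return int.from_bytes(frame[9:13], "big") & 0x7FFFFFFF


# The client asked for a page on stream 1; the server promises a push on stream 2.
frame = build_push_promise(1, 2, b"placeholder-header-block")
print(promised_stream(frame))  # 2
```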
HTTP 2.0 pitfalls
The most impressive thing HTTP 2.0 brings is multiplexing. It has all kinds of benefits, but consider this: multiplexing is built on top of a single TCP connection, so when requests are frequent, all of that traffic puts pressure on that one connection. From this perspective, TCP can easily become the performance bottleneck.
Also, using HTTP 2.0 adds a TLS handshake, which costs extra round trips, as we discussed above.
In HTTP 2.0, multiple requests share the same TCP connection, so when a packet is lost, the whole TCP connection waits for the retransmission, which blocks every request on that connection.
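The effect of a single lost packet can be modelled in a few lines. The sketch assumes TCP hands data to the application strictly in sequence order, so a segment that arrives late (after retransmission) holds back every segment behind it, whichever stream it belongs to. The arrival times are invented.

```python
def stream_completion(segments):
    """segments: (stream_id, arrival_ms) pairs listed in TCP sequence order.

    Returns the time at which the last segment of each stream becomes
    readable by the application.
    """
    done = {}
    deliverable_at = 0
    for stream_id, arrival_ms in segments:
        # In-order delivery: nothing is readable before everything ahead of it.
        deliverable_at = max(deliverable_at, arrival_ms)
        done[stream_id] = deliverable_at
    return done


# No loss: each stream finishes as soon as its own data arrives.
print(stream_completion([(1, 10), (3, 11), (5, 12), (1, 13)]))
# {1: 13, 3: 11, 5: 12}

# The first segment is lost and only arrives at 210 ms after retransmission.
print(stream_completion([(1, 210), (3, 11), (5, 12), (1, 13)]))
# {1: 210, 3: 210, 5: 210}
```

Streams 3 and 5 lost nothing themselves, yet they wait just as long as stream 1, because they share one ordered TCP byte stream.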
Conclusion
In this article, we followed HTTP from 1.x through SPDY to HTTP 2.0: the pain points and shortcomings of HTTP 1.0 and 1.1, the background and development of SPDY, the main features of HTTP 2.0, how it differs from HTTP 1.x, and what its drawbacks are.
HTTP 2.0, a bit of a blast!