Let’s start with a brief history of HTTP and what HTTP/2 is good for.


HTTP/0.9

HTTP/0.9, released in 1991, is an application-layer protocol that runs on top of TCP/IP. It does not concern itself with how packets are transmitted; it only specifies the communication format between client and server, using port 80 by default. This version is extremely simple: it has a single command, GET, and supports only text transfer.

Request process: after the TCP connection is established, the client requests a page such as index.html from the server, and the server can only respond with an HTML-formatted string. When the server finishes sending, it closes the connection.
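In today’s terms, the entire protocol fits in a couple of lines. A minimal sketch (the helper name is hypothetical):

```python
# Sketch of an HTTP/0.9 exchange. The whole request is one line:
# "GET" plus the path -- no version, no headers.
def build_http09_request(path):
    """Build the one-line HTTP/0.9 request."""
    return "GET " + path + "\r\n"

# The server replies with raw HTML only (no status line, no headers)
# and then closes the TCP connection to signal the end of the response.
print(build_http09_request("/index.html"))  # "GET /index.html\r\n"
```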

HTTP/1.0

In May 1996, HTTP/1.0 was released.

First: content in many formats can now be sent, so the Internet can transmit not only text but also images, video, and binary files. This update laid the foundation for the rapid growth of the Web.

Second: in addition to the GET command, POST, HEAD, and other commands were added, enriching the ways browsers and servers interact.

Finally: the format of HTTP requests and responses changed. Besides the data section, each message must now include headers (HTTP headers) that describe metadata.

Other new features include status codes, multi-character-set support, multi-part sending, authorization, caching, content encodings, and more.
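As a sketch, an HTTP/1.0 request and response might look like this (the path, header values, and client name are made up for illustration):

```python
# Illustrative HTTP/1.0 messages: a version in the request line,
# headers describing metadata, and a blank line before the body.
request = (
    "GET /logo.png HTTP/1.0\r\n"      # request line now carries a version
    "User-Agent: DemoClient/1.0\r\n"  # headers: metadata about the request
    "Accept: image/png\r\n"
    "\r\n"                            # blank line ends the header section
)

response = (
    "HTTP/1.0 200 OK\r\n"             # status line with a status code
    "Content-Type: image/png\r\n"     # non-text bodies are now possible
    "\r\n"
    # ...binary image bytes would follow here...
)

# Headers and body are separated by the empty line:
header_part = request.split("\r\n\r\n")[0]
print(header_part.splitlines()[0])  # "GET /logo.png HTTP/1.0"
```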

A big change from HTTP/0.9.

However, HTTP/1.0 has some drawbacks, the main one being that only one request can be sent per TCP connection. Once the data is sent, the connection is closed, and if additional resources are requested, a new connection must be created.

New TCP connections are expensive: the client and server must complete a three-way handshake, and the initial sending rate is low because TCP begins in slow start. As a result, HTTP/1.0 performs poorly, and the more external resources a page loads, the worse the problem becomes.

A workaround is for both client and server to send a Connection: keep-alive header so the connection can be reused. However, this field was not a standard one and could behave differently across implementations, so it was not a fundamental solution.

HTTP/1.1

In January 1997, HTTP/1.1 was released, just months after version 1.0. It further refined the HTTP protocol and, more than 20 years later, is still the most widely used version.

It added some new features, such as:

  • A persistent connection

    Connection: keep-alive behavior is now the default: the TCP connection is not closed after a response and can be reused by multiple requests, with no need to declare Connection: keep-alive explicitly.

The client and server can close the connection if they find the other side inactive for a period of time. Standard practice is for the client to send Connection: close on its last request, explicitly asking the server to close the TCP connection.

Currently, most browsers allow up to six persistent connections to the same domain name, to keep server load and resource usage in check.

Pipelining: the ability of a client to send multiple requests over the same TCP connection without waiting for each response, further improving the efficiency of the HTTP protocol. (It can improve efficiency somewhat, but it was poorly supported, rarely used in practice, and introduced problems of its own; more on this in the HTTP/2 section.)

  • The Content-Length field

    A single TCP connection can now carry multiple responses, so a mechanism is needed to tell which response a given packet belongs to. This is what the Content-Length field is for: declaring the length of the response.

    For example: Content-Length: 3000

    It tells the browser that this response is 3000 bytes long; any bytes after that belong to the next response.

    In HTTP/1.0 the Content-Length field is not required, because a connection never carries multiple responses: when the browser detects that the server has closed the TCP connection, it knows all packets have been received.
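The delimiting logic can be sketched as follows (a simplified parser with hypothetical payloads; real header parsing has more edge cases):

```python
# Sketch: using Content-Length to split back-to-back responses read
# from the same persistent connection.
def split_responses(stream: bytes):
    """Return (headers, body) pairs from a byte stream of responses."""
    out = []
    while stream:
        head, _, rest = stream.partition(b"\r\n\r\n")
        length = 0
        # Find the declared body length in the header block.
        for line in head.split(b"\r\n"):
            if line.lower().startswith(b"content-length:"):
                length = int(line.split(b":")[1])
        out.append((head, rest[:length]))
        stream = rest[length:]  # whatever follows is the next response
    return out

data = (b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
        b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nworld")
print([body for _, body in split_responses(data)])  # [b'hello', b'world']
```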

  • Chunked transfer encoding

    To use the Content-Length field, the server must know the full length of the response before it starts sending.

    For some time-consuming dynamic operations, this means that the server has to wait for all operations to complete before sending data, which is obviously inefficient. A better approach is to send a chunk of data as it is generated, using a “stream” instead of a “buffer”.

    Therefore, HTTP/1.1 allows chunked transfer encoding to be used instead of the Content-Length field. A request or response whose headers include a Transfer-Encoding field indicates that the body will consist of an undetermined number of data chunks.

    Each non-empty chunk is preceded by a hexadecimal number indicating its length. Finally, a chunk of size 0 indicates that the data for this response has been fully sent.
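A minimal sketch of encoding and decoding chunked bodies (simplified; real decoders also handle chunk extensions and trailers):

```python
def encode_chunked(parts):
    """Encode byte chunks using HTTP/1.1 chunked transfer encoding."""
    out = b""
    for part in parts:
        # Each chunk: hex length, CRLF, chunk bytes, CRLF.
        out += b"%x\r\n" % len(part) + part + b"\r\n"
    return out + b"0\r\n\r\n"   # zero-length chunk terminates the body

def decode_chunked(data):
    """Reassemble the body from a chunked byte stream."""
    body = b""
    while True:
        size_line, _, data = data.partition(b"\r\n")
        size = int(size_line, 16)
        if size == 0:
            return body
        body += data[:size]
        data = data[size + 2:]   # skip chunk bytes and trailing CRLF

msg = encode_chunked([b"Mozilla", b" Developer"])
print(msg)                  # b'7\r\nMozilla\r\na\r\n Developer\r\n0\r\n\r\n'
print(decode_chunked(msg))  # b'Mozilla Developer'
```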

  • Other features

    HTTP/1.1 also added a number of verb methods: PUT, PATCH, OPTIONS, and DELETE.

    In addition, the Host field has been added to the client request header to specify the domain name of the server.

    With the Host field, requests can be sent to different websites on the same server, laying the foundation for the rise of virtual hosting.
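Virtual hosting can be sketched as a lookup keyed on the Host header (the hostnames and paths below are hypothetical):

```python
# Sketch of virtual hosting: one server, one IP, several sites,
# selected purely by the Host request header.
sites = {
    "blog.example.com": "/var/www/blog",
    "shop.example.com": "/var/www/shop",
}

def document_root(headers):
    """Pick a site's document root from the Host header."""
    host = headers.get("Host", "").split(":")[0]  # strip optional port
    return sites.get(host, "/var/www/default")

print(document_root({"Host": "shop.example.com:8080"}))  # /var/www/shop
print(document_root({"Host": "unknown.example.com"}))    # /var/www/default
```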

  • Disadvantages and workarounds

    • Spriting: combining many small images into one large image, which the page then slices back apart. The downside is that even when a page needs only one of the small images, the entire large image must be downloaded.

    • Inlining: embedding image data directly in URLs inside CSS files, with drawbacks similar to Spriting.

    • Concatenation: if there are many JS files, a build tool can merge them into a single file so the browser downloads once instead of making countless requests. But if any part of the JS changes, the whole file must be downloaded again, which hurts caching and makes life harder for debugging and developers alike.

    • Sharding: because the HTTP/1.1 specification recommends that a client establish only a limited number of TCP connections to the same host, Sharding spreads resources across as many hostnames as possible. This lets the user open many TCP connections to multiple hosts at the same time, reducing load times.

    • Head-of-line blocking

      Despite the addition of persistent connections and pipelining, all data communication within one TCP connection happens in sequence. The server does not process the next response until it has finished the current one, so one particularly slow response leaves a long queue of requests waiting behind it. This is called “head-of-line blocking”.

      To avoid this problem, there are only two options: reduce the number of requests, or open more persistent connections at the same time. This led to many web optimization techniques, such as merging scripts and stylesheets, embedding images in CSS code, domain sharding, and so on.
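A back-of-the-envelope model shows why more connections help (the timings are illustrative, not a benchmark):

```python
import math

# Toy model of head-of-line blocking: each response takes 100 ms, and
# requests on one connection are served strictly one after another.
def load_time_ms(requests, connections, per_response_ms=100):
    """Requests queue evenly over the connections; each connection
    serves its queue sequentially."""
    rounds = math.ceil(requests / connections)
    return rounds * per_response_ms

print(load_time_ms(12, 1))  # 1200 ms: one connection, twelve in a row
print(load_time_ms(12, 6))  # 200 ms: six parallel connections, two rounds
```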

    • Other problems with HTTP/1.1:

    • Too many options and too much detail

      HTTP/1.1 contains too many details and optional parts, which makes the protocol large.

      With so many options, less common features were rarely supported in later implementations, and some features that were implemented early on were rarely used.

      As time went on and these seemingly marginal features were finally exercised, interoperability problems between clients and servers were exposed. HTTP pipelining is a good example.

    • Underutilized TCP

      HTTP/1.1 does not use TCP efficiently. Making better use of it would reduce pauses in transmission and exploit time that could otherwise be spent sending and receiving more data.

    • Transfer size and number of resources

      In recent years, the amount of data needed to load a website’s home page has grown steadily and now exceeds 1.9 MB. Even more concerning, the average page requires more than 100 resources to display and render.

    • Significant delay

      HTTP/1.1 falls short on network latency. Part of the problem is that HTTP pipelining is still turned off by default for most users. Although network bandwidth has grown in recent years from a few hundred KB/s to tens of MB/s, network latency has not dropped accordingly. On mobile devices in particular, it is hard to get a fast, smooth web experience even with a high connection rate.

      One mitigation, for example, is to distribute resources across different hosts so that more connections can be established and pages load faster.



SPDY protocol

In 2009, Google unveiled its own SPDY protocol to address the inefficiency of HTTP/1.1.

Once this protocol proved viable in Chrome, it was used as the basis for HTTP/2, which inherited its major features.

If you use Chrome, you can see SPDY in action on sites such as Baidu.


HTTP/2

In 2015, HTTP/2 was released. It is called HTTP/2 rather than HTTP/2.0 because the standards committee does not plan to release minor versions; the next new version will be HTTP/3.

  • Design goals

    • Reduce the latency sensitivity of the protocol

    • Fix pipelining and head-of-line blocking problems

    • Remove the need for hosts to open a large number of connections

    • Preserve all existing interfaces, content, URI formats, and structures

  • HTTP/2 is a binary protocol

    • You can use tools that understand HTTP/2 framing, such as Wireshark, to analyze and debug the protocol.

    • HTTP/2 sends different types of binary frames. They share some common fields, and the specification defines 10 frame types. The two most basic, DATA and HEADERS, correspond to the body and headers of HTTP/1.1.

    • Streams are multiplexed over one connection, enabling bidirectional, real-time communication.

    • Frames carry priorities and dependencies, so the server can process the most important ones first.

    • Header compression can reduce transmission costs, but there may be security issues.

    • Support for resetting: if the receiver no longer wants a resource midway through transmission, it can send an RST_STREAM frame to terminate just that stream. With HTTP/1.1, the only option is to close the entire connection.
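These binary frames share a 9-byte header, which can be packed and parsed in a few lines (the frame type numbers follow the HTTP/2 specification; the helper names are our own):

```python
import struct

# The 9-byte HTTP/2 frame header: 24-bit payload length, 8-bit type,
# 8-bit flags, then a reserved bit and a 31-bit stream identifier.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM"}

def pack_frame_header(length, ftype, flags, stream_id):
    return (struct.pack(">I", length)[1:]           # low 3 bytes only
            + bytes([ftype, flags])
            + struct.pack(">I", stream_id & 0x7FFFFFFF))

def parse_frame_header(header):
    length = int.from_bytes(header[0:3], "big")
    ftype, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, FRAME_TYPES.get(ftype, "UNKNOWN"), flags, stream_id

hdr = pack_frame_header(4, 0x3, 0, 7)  # RST_STREAM on stream 7
print(parse_frame_header(hdr))         # (4, 'RST_STREAM', 0, 7)
```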

  • Server active push, also known as cache push

    • If the client requests a resource, the server can infer which resources the client will request next, prepare them proactively, and push them to the client. If the client does not need them, it can send an RST_STREAM frame to cancel the push.

    • Flow control: each side of an HTTP/2 connection advertises a flow-control window that limits how much data the other end may send. It works in a similar way to SSH.
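A toy model of such a window (greatly simplified; real HTTP/2 keeps both per-stream and per-connection windows):

```python
# Sketch of HTTP/2-style flow control: the receiver advertises a
# window; the sender may not send more DATA bytes than the window
# allows until a WINDOW_UPDATE replenishes it.
class FlowWindow:
    def __init__(self, initial=65535):   # default initial window size
        self.available = initial

    def can_send(self, n):
        return n <= self.available

    def on_data_sent(self, n):
        assert self.can_send(n), "would overflow the peer's window"
        self.available -= n

    def on_window_update(self, increment):
        self.available += increment

w = FlowWindow(initial=1000)
w.on_data_sent(900)
print(w.can_send(200))   # False: only 100 bytes of window left
w.on_window_update(500)
print(w.can_send(200))   # True again after the WINDOW_UPDATE
```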



A captured HTTP/2 packet contains a lot of information, including fields like Stream ID and SEQ/ACK.

The following graph illustrates the performance difference between HTTP/1.1 and HTTP/2.




Text: Blank

Editors: Chen PI Shuang, Ying sound


Author’s past articles:

Summary: analysis of TCP connections and common attack methods (available on the WeChat public platform and the Nuggets community)


This article is published with the author’s authorization; the copyright belongs to Chuangyu Front. Please indicate the source when reproducing it. Link to this article: https://knownsec-fed.com/2018-09-25-guan-yu-http2-de-yan-jiu/

For more sharing from the front line of KnownsecFED development, search for our WeChat official account KnownsecFED. Feel free to leave a comment to discuss; we will reply when we can.


Thank you for reading.