How Java Engineers Become Gods, 19K stars on GitHub: come and take a look!

HTTP/3 has been in the news a lot lately, and more and more large international companies have started using HTTP/3.

With HTTP/3 already on the way, it is only a matter of time before it is widely available, so as front-line developers it is time for us to understand what HTTP/3 is and why we need it.

So, what is HTTP/3? What technologies does it use? What problems does it solve?

Problems with HTTP/2

Before this article, I wrote one entitled “What did HTTP/2 do wrong? Analyzing the problems with HTTP/2 and the reasons behind them”.

I will not repeat the details here. I strongly recommend reading that article first, as it will help with this one.

In that article, we mentioned that because HTTP/2 still uses TCP as its underlying transport layer protocol, it suffers from TCP head-of-line blocking, TCP handshake latency, and protocol ossification.

As a result, although HTTP/2 uses techniques such as multiplexing and binary framing, there is still room for optimization.

The QUIC protocol

We know that HTTP/2 is being “abandoned” because the transport layer protocol it uses is still TCP, so the first problem HTTP/3 has to solve is getting around TCP.

If a brand-new transport protocol were developed, it would also be held back by the ossification of intermediate devices and could not be deployed on a large scale. So developers came up with an implementation based on UDP.

Google was the first to take this approach and put it into practice, launching a protocol called QUIC in 2013, which stands for Quick UDP Internet Connections.

As the name suggests, this is a protocol based entirely on UDP.

From the start, Google designed this protocol to replace HTTPS/HTTP and make web pages load faster. In June 2015, an Internet-Draft of QUIC was formally submitted to the Internet Engineering Task Force (IETF). In October 2018, the IETF HTTP and QUIC working groups officially renamed HTTP over QUIC to HTTP/3.

So the HTTP/3 we are talking about now is HTTP over QUIC, that is, HTTP implemented on top of the QUIC protocol.

So, to understand how HTTP/3 works, all you need to know is QUIC.

The QUIC protocol has the following features:

  • UDP-based transport layer protocol: it uses UDP port numbers to identify a specific server on a given machine.
  • Reliability: although UDP is an unreliable transport protocol, QUIC adds mechanisms on top of UDP to provide reliability similar to that of TCP. It provides packet retransmission, congestion control, transmission pacing, and other features found in TCP.
  • Unordered, concurrent byte streams: a single QUIC stream guarantees in-order delivery, but there is no ordering between streams. Data within one stream arrives in order, while data on different streams may be received in a different order than it was sent.
  • Fast handshake: QUIC provides 0-RTT and 1-RTT connection establishment.
  • TLS 1.3 for transport layer security: TLS 1.3 has a number of advantages over earlier versions of TLS, but the main reason to use it is that its handshake takes fewer round trips, which reduces protocol latency.

So, where does QUIC belong in the TCP/IP protocol family? As we know, QUIC is implemented on top of UDP and is the protocol that HTTP/3 depends on. In the TCP/IP layering it therefore belongs to the transport layer, the same layer as TCP and UDP.

To be more specific, QUIC is not only responsible for transport but also includes the security capabilities of TLS. The figure below shows where QUIC fits in the HTTP/3 implementation.

Next, we will look at the QUIC protocol piece by piece. Let’s first see how it establishes a connection.

QUIC connection establishment

We know that TCP, as a reliable transport protocol, requires a three-way handshake, which costs an extra 1.5 RTT. When TLS is added, establishing a connection takes 3-4 RTT in total.

So how does QUIC establish a connection, and how does it reduce the number of RTTs?

QUIC introduces a new connection establishment mechanism that makes a fast handshake possible. A QUIC connection can be established in 0 RTT or 1 RTT.

During the handshake, QUIC uses the Diffie-Hellman algorithm to secure the exchange, and it merges the encryption setup with the connection handshake to reduce the number of round trips needed to establish a connection.

Diffie-Hellman (DH) key exchange is a method for agreeing on a key, and one of the earliest key exchange methods put into practice in cryptography. DH allows two parties to derive a shared key over an insecure channel without any prior (private) information about each other. This key is then used for symmetric encryption of the subsequent communication.
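The exchange described above can be sketched in a few lines of Python. This is only a toy illustration: the 64-bit prime P, the base G, and the function names are assumptions chosen for readability, and a group this small offers no real security.

```python
import secrets

# Toy public parameters (assumption: picked for readability, NOT secure).
P = 0xFFFFFFFFFFFFFFC5  # 2**64 - 59, a 64-bit prime
G = 5                   # public base


def generate_keypair():
    """Return (private, public), where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private, peer_public):
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, P)


# Each side generates a key pair and sends only the public half.
client_private, client_public = generate_keypair()
server_private, server_public = generate_keypair()

# Both sides arrive at the same secret without ever transmitting it.
client_secret = shared_secret(client_private, server_public)
server_secret = shared_secret(server_private, client_public)
print(client_secret == server_secret)
```

An eavesdropper sees only P, G, and the two public values. In a realistically sized group, recovering the shared secret from those is believed to be computationally infeasible.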

The overall process of establishing a QUIC connection is as follows. During the handshake, QUIC uses the Diffie-Hellman algorithm to negotiate an initial key. The initial key depends on a set of configuration parameters stored on the server, which are updated periodically. After the initial key has been negotiated, the server provides a temporary random number, and the two parties generate the session key from it. The client and server then use this new key to encrypt and decrypt data.

The process is divided into two steps: the initial handshake and the final (and repeat) handshake.

Initial handshake

When establishing a connection, the client first sends a greeting message, an inchoate Client Hello (CHLO), to the server. Because this is the first connection, the server returns a rejection message (REJ), indicating that the handshake has not been completed or that the key has expired.

However, the rejection message also carries additional information (configuration parameters), mainly:

  • Server Config: the server’s configuration, including the server’s long-term Diffie-Hellman public value
  • Certificate Chain: the chain of trust used to authenticate the server
  • Signature of the Server Config: the Server Config signed with the private key of the leaf certificate in the chain of trust
  • Source-Address Token: an authenticated-encryption block containing the client’s publicly visible IP address and the server’s timestamp

After receiving the rejection message (REJ), the client parses it, verifies the signature, and caches the necessary configuration.

At the same time, the client randomly generates its own ephemeral Diffie-Hellman private value and ephemeral Diffie-Hellman public value for this connection.

After that, the client packs the short-term public key it has just generated into a message called the Complete CHLO and sends it to the server. The purpose of this request is to give the server the client’s short-term public key, which is needed for forward secrecy, as described in more detail later.

To reduce RTT, the client does not wait for the server’s response after sending the Complete CHLO but starts transferring data immediately.

To keep that data secure, the client combines its own short-term private key with the long-term public key returned by the server to compute an initial key.

The client uses this initial key to encrypt the data it wants to send, so that the data reaches the server securely.

The server, on receiving the Complete CHLO, parses it and now holds both the client’s short-term public key and its own long-term private key. From these it can compute the same initial key as the client.

When the server then receives data encrypted with the initial key, it can decrypt it with that key and encrypt its response to the client with the same key.

Therefore, from first contact to data transfer, the initial connection establishment costs only 1 RTT.

Final handshake

So, can all subsequent data transfers be encrypted with the initial key?

In fact, not entirely. The initial key is derived from the server’s long-term public key, and until that public key expires almost all connections use the same one, which carries a certain risk.

Therefore, to achieve forward secrecy, the client and server each need to combine the other side’s short-term public key with their own short-term private key.

In cryptography, forward secrecy (FS) is a security property of a communication protocol: the compromise of a long-term master key does not lead to the compromise of past session keys.

The problem now is that the client has sent its short-term public key to the server, but the server has only given the client its long-term public key, not a short-term one.

So, when the server receives the Complete CHLO, it sends a Server Hello (SHLO) message to the client, encrypted with the initial key.

The SHLO message contains a short-term public key newly generated by the server.

Now the client and the server each have the other’s ephemeral Diffie-Hellman public value.

Both sides can then combine their own short-term private key with the other’s short-term public key to generate a forward-secure key that is used only for the current connection. Subsequent requests are encrypted and decrypted with this key.

With that, the two parties have completed the final key exchange and the connection handshake, and the QUIC connection is established.
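The two keys described above can be traced with a short Python sketch. It is a simplification under several assumptions: the toy 64-bit group is not secure, `derive_key` (a SHA-256 hash of a label plus the DH secret) is a hypothetical stand-in for a real key derivation function, and the nonces and certificate checks of the real handshake are left out.

```python
import hashlib
import secrets

P = 0xFFFFFFFFFFFFFFC5  # toy 64-bit prime (assumption, NOT secure)
G = 5


def keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def derive_key(own_private, peer_public, label):
    """Hypothetical KDF: hash a label together with the DH shared secret."""
    secret = pow(peer_public, own_private, P)
    return hashlib.sha256(label + secret.to_bytes(8, "big")).digest()


# The server's long-term key pair; the public half travels in the REJ message.
server_long_private, server_long_public = keypair()

# Initial handshake: the client's ephemeral public key travels in the Complete CHLO.
client_eph_private, client_eph_public = keypair()
client_initial = derive_key(client_eph_private, server_long_public, b"initial")
server_initial = derive_key(server_long_private, client_eph_public, b"initial")

# Final handshake: the server's ephemeral public key travels in the SHLO.
server_eph_private, server_eph_public = keypair()
client_forward = derive_key(client_eph_private, server_eph_public, b"forward-secure")
server_forward = derive_key(server_eph_private, client_eph_public, b"forward-secure")

print(client_initial == server_initial, client_forward == server_forward)
```

The forward-secure key depends only on the two ephemeral private values, which can be discarded when the connection ends. A later leak of the server’s long-term private key therefore reveals the initial key but not the forward-secure one.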

When the client connects again later, it retrieves the server’s long-term public key from its cache, generates a new short-term key pair, and derives a new initial key. It then encrypts the data it wants to send with that initial key and sends it to the server together with a Complete CHLO. This achieves 0-RTT data transfer.

So, if the client has a cached long-term public key, data transfer starts directly, and the setup time is 0 RTT.
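As a rough sketch of that decision, the client-side cache could look like the following. `ServerConfigCache` and `round_trips_before_data` are hypothetical names made up for this illustration, and the timestamps are plain numbers rather than real clock values.

```python
class ServerConfigCache:
    """Hypothetical client-side cache of server long-term public keys, keyed by host."""

    def __init__(self):
        self._entries = {}  # host -> (long_term_public, expires_at)

    def store(self, host, long_term_public, expires_at):
        self._entries[host] = (long_term_public, expires_at)

    def lookup(self, host, now):
        """Return the cached public key, or None if it is missing or expired."""
        entry = self._entries.get(host)
        if entry is None:
            return None
        long_term_public, expires_at = entry
        if now >= expires_at:
            del self._entries[host]  # expired: a fresh REJ is needed
            return None
        return long_term_public


def round_trips_before_data(cache, host, now):
    """0 if a valid config is cached (0-RTT), otherwise 1 (CHLO/REJ first)."""
    return 0 if cache.lookup(host, now) is not None else 1


cache = ServerConfigCache()
first_visit = round_trips_before_data(cache, "example.com", now=0)      # nothing cached yet
cache.store("example.com", long_term_public=12345, expires_at=100)      # learned from REJ
repeat_visit = round_trips_before_data(cache, "example.com", now=50)    # cached and valid
after_expiry = round_trips_before_data(cache, "example.com", now=150)   # config expired
print(first_visit, repeat_visit, after_expiry)
```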

To sum up, QUIC negotiates keys with the Diffie-Hellman algorithm and merges the encryption setup with the handshake, which greatly reduces the RTT of connection establishment. A QUIC-based connection can be established in as little as 1 RTT, or even 0 RTT.

To help you understand the process, here is the flowchart of QUIC connection establishment from Google’s website.

In addition, the handshake process above shows that QUIC encrypts and decrypts data throughout, which provides good security.

Multiplexing

One of the biggest problems with HTTP over TCP is head-of-line blocking. How does QUIC solve it?

During TCP transmission, data is split into sequentially numbered packets. These packets travel over the network to the receiver, which reassembles them in order into the original data.

But if one of the packets does not arrive in order, the receiver holds the connection and waits for that packet, blocking everything behind it. This is TCP head-of-line blocking.

Similar to HTTP/2, QUIC can carry multiple independent logical data streams on the same physical connection. These streams are transmitted in parallel over the same connection, with no ordering requirement between them, and they do not affect each other.

Streams in QUIC provide a lightweight, ordered byte-stream abstraction.

A single QUIC stream guarantees in-order delivery, but there is no ordering between streams. Data within one stream arrives in order, while data on different streams may be received in a different order than it was sent.

In other words, the streams on a connection do not depend on each other and are not required to arrive in order relative to one another. If a packet fails to arrive, it only affects its own stream, not the others.
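The following Python sketch illustrates this behavior with one reassembly buffer per stream. `StreamReassembler` is a hypothetical name, and frames are reduced to (stream id, offset, data) tuples. Real QUIC frames carry more fields.

```python
class StreamReassembler:
    """Delivers one stream's bytes in order, buffering data that arrives early."""

    def __init__(self):
        self.next_offset = 0   # next byte offset we can deliver
        self.pending = {}      # offset -> data waiting for a gap to be filled
        self.delivered = b""   # bytes handed to the application so far

    def receive(self, offset, data):
        self.pending[offset] = data
        # Deliver as long as the next expected chunk is available.
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            self.delivered += chunk
            self.next_offset += len(chunk)


streams = {1: StreamReassembler(), 2: StreamReassembler()}

# The first frame of stream 1 (offset 0) is lost and will arrive later.
arrivals = [
    (1, 5, b"world"),
    (2, 0, b"foo"),
    (2, 3, b"bar"),
]
for stream_id, offset, data in arrivals:
    streams[stream_id].receive(offset, data)

stream1_before = streams[1].delivered  # still empty: waiting for offset 0
stream2_before = streams[2].delivered  # complete: not blocked by stream 1

streams[1].receive(0, b"hello")        # the retransmission arrives
stream1_after = streams[1].delivered

print(stream1_before, stream2_before, stream1_after)
```

Stream 2 is delivered in full while stream 1 is still waiting for its missing first frame. With a single TCP byte stream, everything behind the gap would have had to wait.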

Connection migration

A TCP connection is identified by the IP addresses and ports of the client and server. When the network changes, for example when a mobile phone switches networks, the client’s IP address changes. The previous TCP connection then fails and has to be re-established.

With today’s widespread use of mobile devices, this situation is quite common.

QUIC is optimized for this case.

The QUIC protocol marks each connection with a unique connection ID. When the network environment changes, data transmission can continue without a new handshake as long as the connection ID stays the same.
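The difference between the two lookup strategies can be shown with two small dictionaries. This is only a sketch: the addresses come from documentation ranges, and the identifier `"conn-42"` and the variable names are made up for the example.

```python
# Server-side connection tables (hypothetical, for illustration only).
tcp_connections = {}   # keyed by (client_ip, client_port, server_ip, server_port)
quic_connections = {}  # keyed by the connection identifier carried in each packet

server = ("203.0.113.10", 443)
on_wifi = ("192.0.2.7", 50000)
on_cellular = ("198.51.100.23", 41000)  # same phone after switching networks

# The connection is opened while the phone is on Wi-Fi.
tcp_connections[on_wifi + server] = "session state"
quic_connections["conn-42"] = "session state"

# After the switch, packets arrive from a new source address.
tcp_state = tcp_connections.get(on_cellular + server)  # lookup by address fails
quic_state = quic_connections.get("conn-42")           # lookup by identifier succeeds

print(tcp_state, quic_state)
```

Because the identifier travels inside the packet, the server can find the existing session state no matter which address the packet came from.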

Reliability

TCP is called a reliable protocol not only because of its three-way handshake and four-way close, but also because it provides many reliability guarantees, such as flow control, data retransmission, and congestion control.

This is why TCP has always been the transport protocol underlying HTTP.

If QUIC wants to replace TCP, it needs to provide the same guarantees, since UDP itself has none of these capabilities.

TCP congestion control is the set of algorithms TCP uses to avoid network congestion, and it is the main congestion control mechanism on the Internet. There are many classic algorithms, such as TCP Tahoe and Reno, TCP Vegas, TCP Hybla, TCP New Reno, TCP Westwood and Westwood+, and TCP BIC and CUBIC, to name a few.

The QUIC protocol also implements congestion control. It does not depend on a specific congestion control algorithm and provides a pluggable interface so that users can experiment. By default it uses TCP’s CUBIC congestion control algorithm.
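A pluggable interface of this kind might look like the sketch below. The class names, method names, and default values are all assumptions for illustration. The example algorithm is a simple additive-increase, multiplicative-decrease scheme in the spirit of Reno, used here instead of CUBIC because it fits in a few lines.

```python
from abc import ABC, abstractmethod


class CongestionController(ABC):
    """Hypothetical plug-in interface: the connection only calls these methods."""

    @abstractmethod
    def on_ack(self, acked_bytes): ...

    @abstractmethod
    def on_loss(self): ...

    @abstractmethod
    def window(self): ...


class AimdController(CongestionController):
    """Additive increase, multiplicative decrease (a Reno-like simplification, not CUBIC)."""

    def __init__(self, mss=1200, initial_packets=10):
        self.mss = mss
        self.cwnd = mss * initial_packets  # congestion window in bytes

    def on_ack(self, acked_bytes):
        # Grow by roughly one MSS for each full window of acknowledged data.
        self.cwnd += self.mss * acked_bytes // self.cwnd

    def on_loss(self):
        # Halve the window on loss, but never go below two packets.
        self.cwnd = max(self.cwnd // 2, 2 * self.mss)

    def window(self):
        return self.cwnd


class Connection:
    """Sends only while the bytes in flight fit into the controller's window."""

    def __init__(self, controller):
        self.controller = controller

    def can_send(self, bytes_in_flight, packet_size):
        return bytes_in_flight + packet_size <= self.controller.window()


controller = AimdController()
connection = Connection(controller)
controller.on_ack(12000)   # one full window acknowledged
window_after_ack = controller.window()
controller.on_loss()       # a loss is detected
window_after_loss = controller.window()
print(window_after_ack, window_after_loss, connection.can_send(6000, 1200))
```

Swapping in a different algorithm only requires passing another `CongestionController` implementation to `Connection`.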

For flow control, QUIC provides two levels, stream-level and connection-level: each individual stream is limited, and so is the total across all streams.

QUIC’s connection-level flow control limits the total buffer that a QUIC receiver is willing to allocate to a connection, which prevents a server from allocating an arbitrarily large buffer to one client. Connection-level flow control works in basically the same way as stream-level flow control, except that the limits on sent and received data are computed over the sum of all streams.
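The two levels can be illustrated with a small receiver-side check. This sketch is deliberately simplified: `FlowControlledReceiver` is a hypothetical name, the limits are fixed byte counts, and the window updates that a real implementation sends as the application consumes data are left out.

```python
class FlowControlledReceiver:
    """Accepts data only if both the stream limit and the connection limit allow it."""

    def __init__(self, stream_limit, connection_limit):
        self.stream_limit = stream_limit          # max bytes per stream
        self.connection_limit = connection_limit  # max bytes across all streams
        self.stream_received = {}                 # stream_id -> bytes accepted
        self.connection_received = 0              # bytes accepted on all streams

    def accept(self, stream_id, length):
        used = self.stream_received.get(stream_id, 0)
        if used + length > self.stream_limit:
            return False  # this stream would exceed its own limit
        if self.connection_received + length > self.connection_limit:
            return False  # the connection as a whole would exceed its limit
        self.stream_received[stream_id] = used + length
        self.connection_received += length
        return True


receiver = FlowControlledReceiver(stream_limit=100, connection_limit=150)
results = [
    receiver.accept(1, 80),  # fits both limits
    receiver.accept(1, 30),  # stream 1 would reach 110 > 100
    receiver.accept(2, 60),  # fits: connection total becomes 140
    receiver.accept(3, 20),  # stream 3 is fine, but the total would reach 160 > 150
]
print(results)
```

The last call shows why the connection-level limit matters: a brand-new stream is well under its own limit, yet it is refused because the connection as a whole has used up its budget.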

Disadvantages

So far we have described many advantages of QUIC over TCP, and it is fair to say that the protocol really does improve on TCP in these respects.

QUIC is built on UDP and does not change UDP itself; it only adds enhancements on top. This avoids the problem of ossified intermediate devices, but it does not mean that rolling it out is free of problems.

First, many enterprises, carriers, and organizations block or rate-limit UDP traffic on ports other than 53 (DNS), because such traffic is often abused in attacks.

In particular, some existing UDP protocols and implementations are vulnerable to amplification attacks, in which an attacker makes innocent hosts send large amounts of traffic to a victim.

As a result, UDP-based QUIC traffic may be blocked.

In addition, because UDP has always been positioned as an unreliable protocol, many intermediate devices offer little support or optimization for it, so packet loss can be relatively serious.

Conclusion

The following table summarizes the similarities and differences between HTTP/2 and HTTP/3. Some of the items have been covered in this article. Others, which I do not consider particularly important, have not. You can study them yourself if you are interested.

| Feature | HTTP/2 | HTTP/3 |
|---|---|---|
| Transport layer protocol | TCP | QUIC (based on UDP) |
| Encryption by default | No | Yes |
| Independent data streams | No | Yes |
| Head-of-line blocking | TCP head-of-line blocking exists | None |
| Header compression | HPACK | QPACK |
| Handshake latency | TCP + TLS: 1-3 RTT | 0-1 RTT |
| Connection migration | No | Yes |
| Server push | Yes | Yes |
| Multiplexing | Yes | Yes |
| Flow control | Yes | Yes |
| Data retransmission | Yes | Yes |
| Congestion control | Yes | Yes |

About the author: Hollis, a person with a unique pursuit of coding, is a technical expert at Alibaba, the co-author of “Three Courses for Programmers”, and the author of the article series “How Java Engineers Become Gods”.