Preface

A few days ago, React 18 was released, and my WeChat feed was immediately flooded with posts about it.

Suspense SSR is an interesting new feature: it adapts the existing Suspense implementation to the SSR scenario. Today, let's talk about this feature.

The old implementation

Before we get to Suspense SSR, we need to know what Suspense is.

The Suspense component was added in React 16.6 to gracefully handle components with asynchronous requirements, such as lazily loaded (code-split) routes and components waiting on the result of an asynchronous request.

The following code shows an asynchronous component wrapped in Suspense. While the async component has not finished loading, the loading component given as fallback is rendered first; once the async component finishes loading, its result is rendered in its place.

```jsx
// Lazy-loaded component
const ProfilePage = React.lazy(() => import('./ProfilePage'));

// Show a spinner while the profile is loading
<Suspense fallback={<Spinner />}>
  <ProfilePage />
</Suspense>
```

It's elegant and easy to use. In the SSR scenario, however, the existing renderToString is entirely synchronous. The server has no scheduling mechanism like the browser's requestIdleCallback, so a server application cannot voluntarily yield between tasks in exchange for better throughput.

The biggest problem with a synchronous API is slow rendering. Server-side rendering is CPU-intensive computation, so the TPS of an SSR service is often unsatisfactory. If it also had to wait for asynchronous rendering inside Suspense, the processing time would be unbounded.

React did two things to support Suspense on the server: it not only adapted the feature, it also optimized it.

The new implementation

The new server implementation still renders to a string recursively. Inside Suspense, however, the logic changes.

For a Suspense boundary, the loading component given as fallback is rendered into the initial screen. The child component's promise is pushed onto an asynchronous queue, and when the promise settles, the result is pushed to the browser for replacement via HTTP chunked transfer.

The logic looks roughly like the following. After the promise resolves, the server pushes a replace("1", "2") call that swaps the loading DOM for the DOM produced by the asynchronous result.

```html
<div>
  <!--$?-->
  <div id="1">Loading...</div>
  <!--/$-->
</div>
<div hidden id="2">
  <div>Actual Content</div>
</div>
<script>
  replace("1", "2");
</script>
```
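That server-side flow can be sketched in a few lines. This is only an illustration of the idea, not React's actual internals: renderSuspense, write, and the hard-coded ids "1" and "2" are all hypothetical names.

```javascript
// Sketch of the Suspense SSR flow: ship the fallback immediately,
// then push the real content plus an inline replace() call in a
// later chunk once the component's promise settles.
async function renderSuspense(write, fallbackHtml, contentPromise) {
  // First pass: the fallback goes out with the initial HTML.
  write(`<div id="1">${fallbackHtml}</div>`);
  // When the async data resolves, push the hidden real content...
  const contentHtml = await contentPromise;
  write(`<div hidden id="2">${contentHtml}</div>`);
  // ...and a script that swaps it in for the fallback.
  write(`<script>replace("1", "2")</script>`);
}
```

Each call to write here corresponds to one chunk pushed over the same HTTP response.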

The asynchronous result is not pushed over a new HTTP request; it rides on the original HTTP response via the chunked transfer feature, which creates the illusion of a long-lived connection. Next, let's look at how chunked transfer works.

Chunked transfer

To talk about chunked transfer, we have to start with how an HTTP message ends. In an HTTP message, how do you determine where the request/response body ends? There are usually two options:

  • Content-Length: add a Content-Length header to the request/response whose value is the number of bytes in the body. Once the browser has received Content-Length bytes, it considers the response complete.
  • Transfer-Encoding: chunked: this scheme enables variable-length chunked transfer. A terminator tells the browser that the body has ended. It is this mechanism that Suspense SSR uses to push out rendered Suspense results one by one.

Transfer-Encoding: chunked takes precedence over Content-Length. P.S.: the most rigorous way to delimit a body is still to compute its length in advance rather than rely on a terminator, but in streaming scenarios like this one the length cannot be known up front.

Next, let's run an experiment and watch chunked transfer happen under a packet-capture tool.

The experiment

Let's start a service with Express to simulate the chunking logic. The code does roughly the following:

  1. Push the whole HTML skeleton: the `html` tag, the charset declaration, and the `body` tag.
  2. Push a `<p>` tag every second; after 10 pushes, stop pushing.
  3. Close the stream and declare the response ended.

This simulates, in principle, how Suspense SSR pushes asynchronous results.

```js
const express = require('express');
const app = express();
const port = 3677;

const renderToStream = (res) => {
  return new Promise((resolve) => {
    let count = 0;
    const timeId = setInterval(() => {
      if (count === 10) {
        res.write("<p>already pushed " + count + " times</p>");
        clearInterval(timeId);
        resolve();
        return;
      }
      res.write("<p>Push once every 1 second</p>");
      count++;
    }, 1000);
  });
};

app.get('/stream', (req, res) => {
  res.write("<html>");
  res.write("<head><meta charset='UTF-8'/><title>test stream</title></head>");
  res.write("<body></body>");
  renderToStream(res).then(() => {
    res.end();
  });
});

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`);
});
```

We open Wireshark, start capturing traffic to the service, and refresh the page to kick off the request.

As you can see, when the request starts, the server first pushes us the response headers and the skeleton: the html tag, head, body, and so on. Transfer-Encoding: chunked is already declared at this point.

In each subsequent second, we capture one TCP packet and its ACK: the TCP packet carries the response content, and the ACK is the acknowledgement sent back by the browser. There are exactly 10 pairs.

Each TCP packet contains the data we pushed that second:

  • The chunk starts with its size in hexadecimal: 0x1b converts to 27 bytes.
  • This is followed by the two bytes 0x0D and 0x0A, i.e. carriage return and line feed (CRLF).
  • The chunk also ends with 0x0D 0x0A, declaring the end of the chunk.
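Putting those three points together, framing one chunk is simply: hex byte count, CRLF, payload, CRLF. A minimal sketch (encodeChunk is our own name, not part of any API):

```javascript
// Frame a payload as a single HTTP/1.1 chunk:
// <byte count in hex> CRLF <payload> CRLF
function encodeChunk(payload) {
  const size = Buffer.byteLength(payload, 'utf8').toString(16);
  return size + '\r\n' + payload + '\r\n';
}

// A 27-byte payload is prefixed with the hex size "1b",
// matching the 0x1b seen in the capture.
encodeChunk('x'.repeat(27)); // starts with "1b\r\n"

// The terminating chunk carries zero bytes: "0\r\n\r\n"
encodeChunk('');
```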

We can also verify this layout against Wikipedia, which is, after all, what the specification defines 🤗

Each non-empty chunk starts with the number of bytes of data it contains (expressed in hexadecimal), followed by a CRLF (carriage return and line feed), then the data itself, and finally another CRLF terminating the chunk. In some implementations, white space (0x20) is inserted between the chunk size and the CRLF.

zh.wikipedia.org/wiki/%E5%88…

After the 10 pushes, an HTTP response packet is assembled, containing all the previous chunks. At the very end of the body comes the terminating chunk: a 0 followed by a blank line, encoded as 0x0D 0x0A pairs, which marks the end of the response body:

Why a 0 here? Quite simply, the terminating chunk carries zero bytes of data.

One thing to note: the HTML skeleton from our first push closed the body tag early, so the p tags from the later pushes were all appended after it, at the end of the document. The browser automatically corrected the markup for us, placing them inside the body.

Suspense SSR does this by pushing each render result and replacement function to the front end.

Wrapping up

Did React go to all this effort just for Suspense? Apparently not. If we look at the question from Fiber's point of view, none of this is surprising.

The original Fiber architecture put a lot of effort into turning the component tree from a tree into a linked list, so that traversal could be interrupted midway. A recursive traversal of a tree can only run from the root node to the leaf nodes in one go, with no way to pause in the middle.

If traversal can be interrupted, we can use the browser's scheduling capabilities to check whether our traversal is blocking the browser's rendering; if it is, we wait until the next scheduling slot to continue.
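The difference can be sketched as a loop over a linked list: because control returns to the loop after every unit of work, we can bail out between any two nodes and resume later. The node shape and shouldYield (standing in for a scheduler check) are assumptions for illustration, not React's real data structures.

```javascript
// Walk a linked list of fiber-like nodes one unit at a time.
// Unlike recursion over a tree, the loop can stop between units.
function workLoop(fiber, shouldYield) {
  while (fiber !== null && !shouldYield()) {
    // ...perform this fiber's unit of work here...
    fiber = fiber.next;
  }
  // null means all work is done; otherwise, this is where to resume.
  return fiber;
}
```

A recursive tree walk has no equivalent return point: once it starts, the call stack must unwind all the way back to the root.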

But there is no such scheduling API on the server, and server-side rendering is CPU-intensive computation that eats up a lot of time and resources.

Suspense SSR takes the opportunity of this adaptation to push Suspense's asynchronous rendering into an asynchronous queue. While waiting for the async results, the main thread is free to handle other rendering requests, increasing the number of rendering tasks the service can process. The approach kills two birds with one stone: experience and performance.

Go ahead and upgrade!

References

  • zh.wikipedia.org/wiki/%E5%88…
  • github.com/jinhailang/…