Background

Optimizing page performance and improving user experience has always been our goal. Browser caching, preloading, prerendering, and similar techniques can all improve page performance. In real business scenarios, however, one kind of page has always been hard to optimize: the first-hop page, where the user is visiting the site for the first time. For Web pages, first-hop scenarios (such as SEO or paid traffic) generally perform worse than second-hop scenarios. The main reason is that first-hop users cannot benefit from connection reuse or from locally cached resources. Many client-side optimizations (preloading, pre-execution, prerendering, and so on) also cannot be applied on the first hop. When the client’s cache cannot be used, taking advantage of the CDN’s proximity to users is a possible direction for optimization. The following sections introduce several common performance optimizations and lead up to our proposed edge rendering solution.

Approaches

Idea 1 – SSR

For performance, we usually render the dynamic first-screen content directly on the server through server-side rendering (SSR).


Idea 2 – CSR + CDN

To reduce white-screen time, we can consider the edge caching capability of a CDN: the page HTML can be cached directly on CDN nodes. In most scenarios, however, the main content of the page is dynamic or personalized. Caching the entire HTML on the CDN has too great an impact on the business, and very few scenarios can accept it. What about caching only the static part of the HTML on the CDN? This is in fact a very common practice: cache the static frame of the HTML on the CDN so that users quickly see part of the content, then issue asynchronous requests on the client to fetch the dynamic content and render it (CSR). The rendering timeline in CSR + CDN mode is as follows:


Idea 3 – ESI

The CSR + CDN approach solves the white-screen problem well, but it delays the display of dynamic content. The cause is that the dynamic and static content of the page are split into two serial phases, and JS download and execution sit between them. Is there a way to combine dynamic and static content on the CDN itself?

ESI (Edge Side Includes) points in a useful direction. ESI is a specification originally proposed by CDN service providers. Specific tags are added to the HTML to mark the dynamic parts, so that the static content of the page can be cached on the CDN while dynamic content is assembled in freely. The rendering timeline with ESI is as follows:


Although ESI did not deliver the result we wanted, it gave us a good direction. If ESI could be changed to return static content first, and to return dynamic content as soon as the CDN node has obtained it, we would get both a short white-screen time and no delay in dynamic content. Achieving this streaming-ESI-like effect requires fine-grained control over requests and streamed responses on the CDN. Can a CDN node support such complex operations? The answer is yes: edge computing. Some CDN service providers already offer mature edge computing capabilities (Cloudflare supports it, and AliCDN supports it in an internal beta that will be opened to the public soon). On the CDN we can program requests and responses flexibly, much as a service worker does in the browser.

Based on edge computing, we have a new option: an edge streaming rendering scheme. The details are as follows.

Solution – Edge Streaming Rendering (ESR)

Rendering process

The core idea is to use edge computing to return static content and dynamic content to the user one after the other as a stream. Compared with the origin server, CDN nodes are closer to users and have lower network latency. The CDN node quickly returns the cacheable static part of the page to the user. At the same time, it issues the dynamic content request, and appends the dynamic content to the response stream after the static part. The final page rendering timeline is as follows:

As the figure above shows, the CDN edge node can quickly return the first byte and the static part of the page; the dynamic content is then requested from the server by the CDN node and streamed on to the user. The scheme has the following characteristics:

  1. The first-screen TTFB is short, and static content (such as the page header, basic structure, and skeleton) becomes visible quickly.
  2. Dynamic content is requested by the CDN, earlier than in traditional browser-side rendering, and does not depend on the browser downloading and executing JS. In theory, the final response finishes at the same time as when the complete dynamic page is requested directly from the server.
  3. Once the static content has been returned, the browser can start parsing that part of the HTML and downloading and executing JS and CSS. Moving these page-blocking operations forward lets the page be displayed sooner once the dynamic content has been fully streamed back.
  4. The network between edge nodes and servers has more room for optimization than the network between clients and servers. For example, dynamic acceleration and connection reuse between the edge and the server reduce the TCP connection setup and network transmission overhead of dynamic requests, so the dynamic content can ultimately arrive sooner than if the client accessed the server directly.
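
The flow described above can be sketched roughly as follows. This is a minimal illustration, assuming a service-worker-like edge runtime that exposes the standard ReadableStream and TextEncoder APIs. The names esrStream, staticHead, staticTail, and loadDynamic are hypothetical; loadDynamic stands in for whatever request the edge node sends to the origin.

```javascript
// Minimal sketch of edge streaming rendering (all names are hypothetical).
// The static HTML is flushed first; the dynamic part is requested at the
// same time and appended to the same response stream when it arrives.
function esrStream(staticHead, loadDynamic, staticTail) {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async start(controller) {
      // Start the dynamic request before flushing the static part, so the
      // two overlap instead of running one after the other.
      const dynamicPromise = loadDynamic();
      controller.enqueue(encoder.encode(staticHead));
      try {
        controller.enqueue(encoder.encode(await dynamicPromise));
        controller.enqueue(encoder.encode(staticTail));
        controller.close();
      } catch (err) {
        // Static bytes are already on the wire; surface the failure to the caller.
        controller.error(err);
      }
    },
  });
}
```

In an edge runtime the returned stream would typically be wrapped in a standard Response object, for example new Response(esrStream(head, load, tail), { headers }), so the browser can begin parsing the static part while the dynamic request is still in flight.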

Demo comparison

A demo of the main search page is currently available on AliCDN (edge-routine.m.alibaba.com/) (the demo page may change frequently). Here is how its loading compares with the original page on different networks (throttled with Charles’s Network Throttle):

  1. No speed limit (Wi-Fi):

  2. Throttled to 4G:

  3. Throttled to 3G:

As the results above show, on slower networks the main elements finish rendering earlier with CDN streaming than with the original SSR approach. This is expected: the slower the network, the longer static resources take to load, and the more the browser gains from starting to load them early. In addition, under all network conditions the white-screen time of CDN streaming rendering is much shorter.

Overall architecture

Architecture diagram

Edge streaming rendering

1. Templates

A template uses a syntax similar to ESI blocks. Based on the template, the content that must be requested dynamically is extracted, and the content that can be returned statically is separated out and cached. A template therefore essentially defines which parts of a page are dynamic and which are static.

During streaming rendering, the page template is parsed from top to bottom. Static content is returned directly to the user; when dynamic content is encountered, the fetch logic for that content runs. Static and dynamic content may alternate throughout the process.
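
As a rough illustration of this top-to-bottom pass, the sketch below splits a page into alternating static and dynamic segments. It assumes the esr/snippet/start and esr/snippet/end marker tags used by the raw HTML template in this section, with the type attribute written first. The function names splitTemplate and attr are hypothetical, and a production parser would likely need a real HTML tokenizer rather than regular expressions.

```javascript
// Read one attribute value out of a start marker's attribute text.
function attr(attrs, name) {
  const m = new RegExp('(?:^|\\s)' + name + '="([^"]*)"').exec(attrs);
  return m ? m[1] : null;
}

// Split a template into ordered segments: static HTML that can be flushed
// immediately, and dynamic slots that need a fetch.
function splitTemplate(html) {
  const startRe = /<script\s+type="esr\/snippet\/start"([^>]*)>\s*<\/script>/g;
  const endRe = /<script\s+type="esr\/snippet\/end"\s*>\s*<\/script>/g;
  const segments = [];
  let cursor = 0;
  let start;
  while ((start = startRe.exec(html)) !== null) {
    if (start.index > cursor) {
      segments.push({ kind: "static", html: html.slice(cursor, start.index) });
    }
    // Look for the matching end marker after the start marker.
    endRe.lastIndex = startRe.lastIndex;
    const end = endRe.exec(html);
    if (end === null) throw new Error("esr/snippet/start without a matching end");
    segments.push({
      kind: "dynamic",
      id: attr(start[1], "esr-id"),
      source: attr(start[1], "content"),
    });
    cursor = endRe.lastIndex;
    startRe.lastIndex = cursor; // resume scanning after the end marker
  }
  if (cursor < html.length) {
    segments.push({ kind: "static", html: html.slice(cursor) });
  }
  return segments;
}
```

A renderer could then walk the returned array in order, writing each static segment straight to the response and starting a fetch for each dynamic one.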

Several template types have been designed.

  • First: raw HTML

This template is the least invasive for existing business. It only requires adding certain tags to the existing SSR page to declare the dynamic parts of the page:

<html>
  <head>
    <link rel="stylesheet" type="text/css" href="index.css">
    <script src="index.js"></script>
    <meta name="esr-version" content="0.0.1"/>
  </head>
  <body>
    <div>static content....</div>

    <script
      type="esr/snippet/start"
      esr-id="111"
      content="SLICE"></script>
    <div>
      dynamic content1....
    </div>
    <script type="esr/snippet/end"></script>

    <div>static content....</div>

    <script
      type="esr/snippet/start"
      esr-id="222"
      content="https://test.alibaba.com/snippet/222"></script>
    <div id="222">
      dynamic content2....
    </div>
    <script type="esr/snippet/end"></script>
  </body>
</html>
  • Second: static templates (not yet used in a real scenario)

This template needs to be published to the CDN separately (if the rendering layer is later integrated with the FaaS gateway and SSR, the template content can be shared with them; publishing a template in the workflow will then automatically synchronize a copy to the CDN and clear the CDN cache). Dynamic content can be provided in two ways: the back end produces dynamic HTML fragments through SSR, or the back end provides dynamic data and the edge node renders the dynamic HTML fragments.

The advantage of SSR-generated dynamic HTML fragments is that no HTML template rendering is needed on the edge and developers do not have to write two sets of template logic. The disadvantages are that the back end must have SSR capability and that the dynamic content transferred is larger.

The advantage of rendering dynamic HTML on edge nodes is that the back end only needs to provide dynamic data and does not need SSR capability (although the front end needs CSR capability as a fallback), and the dynamic content transferred is small. The disadvantage is that the dynamic content cannot be streamed through the edge node: it must be downloaded to the edge node in full, processed, and only then returned to the user.

<html>
  <head>
    <link rel="stylesheet" type="text/css" href="index.css">
    <script src="index.js"></script>
  </head>
  <body>
    <div>static content....</div>

    <script
      type="esr/block"
      esr-id="111"
      content="https://test.alibaba.com/snippet/111"></script>

    <div>static content....</div>

    <script
      type="esr/template"
      esr-id="222"
      content="https://test.alibaba.com/api/data">
      <div>
        {$data.name}
      </div>
    </script>
  </body>
</html>
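
To make the second option more concrete, here is a minimal sketch of how an edge node might fill the {$data.name} style placeholders of an esr/template block with JSON returned by the back end. The function names are hypothetical, and the placeholder grammar (dotted paths under $data, with HTML-escaped output) is an assumption; the article does not specify the template language beyond the example above.

```javascript
// Escape text so back-end data cannot inject markup into the page.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace every {$data.some.path} placeholder with the matching value.
// Missing values render as an empty string.
function renderDataTemplate(template, data) {
  return template.replace(/\{\$data\.([A-Za-z_][\w.]*)\}/g, (_, path) => {
    const value = path
      .split(".")
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), data);
    return value == null ? "" : escapeHtml(value);
  });
}
```

Because the whole data payload must be available before rendering, this path cannot be streamed through the edge node, which matches the disadvantage noted above.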

2. Static content presentation

Static content comes from templates, and how it is obtained depends on the template type. For “raw HTML” templates, static content is extracted from the full HTML returned by the first dynamic request, based on the marker tags in the HTML, and stored in the edge cache. For “static templates”, the template file is pulled from the CDN and stored in the edge cache. Static content has a cache expiration time and a version number.
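
A cached static entry with an expiration time and a version number could be modeled as in the sketch below. The class StaticTemplateCache is a hypothetical in-memory stand-in: a real edge platform would provide its own cache API, which is not described in this article.

```javascript
// Hypothetical in-memory stand-in for the edge cache of static content.
class StaticTemplateCache {
  // ttlMs: how long an entry stays valid; now: injectable clock for testing.
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Store the static HTML together with its template version.
  put(key, html, version) {
    this.entries.set(key, { html, version, expiresAt: this.now() + this.ttlMs });
  }

  // Return the cached entry, or null when it is missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry;
  }
}
```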

The static content at the beginning of the template is returned to the user in the response immediately. Subsequent static content (such as the closing body and html tags) can be handled in two ways:

  a. Wait for the dynamic content to return before writing the remaining static content to the response stream. This approach is SEO-friendly, but dynamic content blocks the static content that follows it, and when there are multiple dynamic blocks, a block that returns first cannot be displayed first; blocks can only be displayed in order.
  b. Return all of the static content first, then use scripts to fill the dynamic content into the corresponding placeholders, in a BigPipe-like manner. The advantage is that the static content is displayed in full from the start, and multiple dynamic blocks are displayed in whichever order they arrive. The disadvantage is that it is not SEO-friendly (because the dynamic content is inserted through JS).

3. Dynamic content

Dynamic content is requested on the edge node during rendering, when the parser reaches a region that must be fetched dynamically. Dynamic content requests support dynamic acceleration to the origin server. The interaction between the edge node and the back end can take three forms:

  a. The back end returns the full page, and the dynamic content is extracted from it using marker tags. This has the advantage of intruding little on existing services, but the dynamic content transferred is large, and the dynamic part can only be cut out after the complete HTML has been downloaded.
  b. The back end returns only the content of the dynamic block. The advantage is that the dynamic response can be streamed straight through to the user; the disadvantage is that the page must provide a separate URL that returns only the dynamic block.
  c. The back end returns only data, which the edge node renders with the dynamic rendering template contained in the static template before returning it to the user. The advantages are that little data is transferred from the back end and the back end does not need SSR capability. The disadvantages are that developers must maintain an additional set of template logic, and complex template rendering on edge nodes may run into CPU overhead and limits.

The interaction between the edge node and the user can take two forms:

  a. Waterfall flow (WATER_FALL in the route configuration): dynamic content is returned as a waterfall. The dynamic blocks are loaded in parallel on the edge node, but the page content is displayed to the user from top to bottom. This approach is SEO-friendly and does not change the loading order of page modules. The disadvantage is that with multiple dynamic modules the frame of the whole page is not visible at first: the first dynamic block blocks the display of later ones, and the JS and CSS resources at the bottom of the page cannot be loaded and executed early.
  b. Embedded (ASYNC_INSERT in the route configuration): all static content is returned at once, with placeholders reserved for the dynamic content. Each dynamic block is later inserted into its placeholder as innerHTML. The advantage is that the JS and CSS resources at the bottom of the page can be loaded and executed early, and the user sees the overall page first. The disadvantages are that it is not SEO-friendly, and that the execution order of page modules varies with how quickly each dynamic block returns, so the page logic in the browser needs some extra checks and compatibility handling.
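
The two delivery modes can be sketched as follows. Both helpers are hypothetical illustrations rather than the project’s actual code: waterfall starts all dynamic requests in parallel but yields the results in page order, while asyncInsertScript builds the inline script that fills a placeholder through innerHTML. It is assumed that each placeholder element carries an id, as in the raw HTML template example.

```javascript
// WATER_FALL: all loaders start immediately, results are emitted in page order.
async function* waterfall(loaders) {
  const pending = loaders.map((load) => load());
  // Avoid unhandled-rejection noise for blocks that fail before their turn;
  // the error is still thrown when that block is awaited below.
  pending.forEach((p) => p.catch(() => {}));
  for (const p of pending) {
    yield await p;
  }
}

// ASYNC_INSERT: build a script that fills a placeholder once its block arrives.
function asyncInsertScript(placeholderId, html) {
  // JSON.stringify yields a valid JS string literal; escaping "<" keeps a
  // "</script>" inside the fragment from closing the inline script early.
  const toLiteral = (s) => JSON.stringify(s).replace(/</g, "\\u003c");
  return (
    "<script>document.getElementById(" + toLiteral(placeholderId) +
    ").innerHTML=" + toLiteral(html) + ";</script>"
  );
}
```

With ASYNC_INSERT, the edge node can append each generated script to the response as soon as its block arrives, in any order, because the script locates its own placeholder.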

Edge routing

Route configuration: g.alicdn.com/edgerender/… (a hypothetical URL; the configuration is a JSON resource published to a static CDN)

{
  version: '0.0.1', // Configuration version number
  origin: 'us-proxy.alibaba.com',
  host: 'edge.alibaba.com',
  pages: [
    {
      pageName: 'seo', // Page name identifier
      match: '/abc/efg/.*', // Regular-expression string matched against the page path
      renderConf: { // Render configuration
        renderType: 'ESR', // Edge rendering
        templateType: 'FULL_HTML', // Template type: use the complete HTML generated by SSR as the template
        dynamicMode: 'WATER_FALL|ASYNC_INSERT', // How dynamic content is returned: waterfall flow | asynchronous placeholder filling (innerHTML)
        templateUrl: '' // Template URL
      }
    },
    {
      pageName: 'seo',
      match: '/abc/efg/.*',
      renderConf: {
        renderType: 'ESR',
        templateType: 'STATIC', // Static template, fetched from a CDN URL
        dynamicMode: 'WATER_FALL|ASYNC_INSERT', // How dynamic content is returned: waterfall flow | asynchronous placeholder filling (innerHTML)
        templateUrl: 'https://g.alicdn.com/@g/xxx.html'
      }
    },
    {
      pageName: 'jump',
      match: '/jump/.*',
      renderConf: { renderType: 'REDIRECT_302', rewriteUrl: 'https://jump' } // 302 redirect
    },
    {
      pageName: 'proxy',
      match: '/proxy/.*',
      renderConf: { renderType: 'PROXY_PASS', rewriteUrl: 'https://proxypassurl' } // Reverse proxy
    }
  ]
}

Routing can be seen as the entry point to edge computing: only pages listed in the route configuration go through the corresponding rendering process. Any other page goes straight back to the origin to fetch the full page content. The JSON above is the route configuration file as currently designed. It is published to the assets CDN as a static resource, overwriting the previous version. To support grayscale release of the configuration, two configurations exist online, a grayscale version and a full version, and a fixed proportion set in the routing code decides which of the two is loaded.

Routing currently defines three rendering modes: streaming rendering, redirection, and reverse proxy. Redirection and reverse proxy are simple to configure, much like nginx: only the destination URL needs to be specified.
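
A route lookup over this configuration might look like the sketch below. The function name matchRoute and the trimmed exampleConfig are hypothetical. Anchoring the match pattern to the whole path and taking the first matching page are assumptions, since the article does not state how patterns are evaluated.

```javascript
// Return the first page entry whose `match` pattern covers the whole path,
// or null when the path is not configured (the request then goes to origin).
function matchRoute(config, pathname) {
  for (const page of config.pages) {
    // `match` is a regular-expression string; anchor it to the full path.
    if (new RegExp("^(?:" + page.match + ")$").test(pathname)) {
      return page;
    }
  }
  return null;
}

// Trimmed-down example in the shape of the configuration shown above.
const exampleConfig = {
  version: "0.0.1",
  pages: [
    { pageName: "seo", match: "/abc/efg/.*", renderConf: { renderType: "ESR" } },
    { pageName: "jump", match: "/jump/.*", renderConf: { renderType: "REDIRECT_302", rewriteUrl: "https://jump" } },
  ],
};
```

For example, matchRoute(exampleConfig, "/jump/x") returns the jump entry, while a path such as "/other" returns null and would be passed straight through to the origin.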

Stability

Scope of influence control

  1. CDN switch: traffic for the domain name can be switched back to the unified access layer at any time, by region and by proportion.
  2. Edge computing scope switch: the paths covered by edge computing are configured on the CDN, so that edge computing runs only under some paths.
  3. Edge computing route switch: within edge computing, the route configuration determines which pages are rendered in streaming mode; all other pages are fetched in full through dynamic acceleration.

Exception handling

  1. DNS switch: if a serious CDN problem occurs, DNS is switched directly to the unified access layer.
  2. If the basic functions of edge computing are abnormal, disable edge computing for all paths on the CDN configuration platform and fall back to the default dynamic acceleration.
  3. If an error occurs during edge rendering before any response content has been returned to the client, catch the error and degrade to fetching the full page content.
  4. If edge rendering has already returned the static part of the response to the client and the edge node then fails to load dynamic content (timeout, HTTP error code, or a version number that does not match the static content), return a script tag containing location.reload() and end the response, forcing the page to refresh. A query parameter that bypasses edge computing can be added on refresh to ensure that edge rendering is skipped for the reloaded page.
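
Cases 3 and 4 can be sketched as a single decision. The names onEdgeError and reloadScript, the action labels, and the esr_bypass query parameter are hypothetical. Because a bypass parameter has to be added to the URL, this sketch navigates with location.replace instead of calling location.reload() directly, which is a substitution made for this illustration.

```javascript
// Build a script that reloads the page with a bypass flag in the query string.
function reloadScript(currentUrl, bypassParam = "esr_bypass") {
  const url = new URL(currentUrl);
  url.searchParams.set(bypassParam, "1"); // hypothetical parameter name
  const literal = JSON.stringify(url.toString()).replace(/</g, "\\u003c");
  return "<script>location.replace(" + literal + ");</script>";
}

// Decide how to recover from a failure during edge rendering.
function onEdgeError(bytesSent, currentUrl) {
  if (bytesSent === 0) {
    // Nothing has reached the client yet: degrade to the full origin page.
    return { action: "FALLBACK_TO_ORIGIN" };
  }
  // Static content is already on the wire: append a reload script and end.
  return { action: "APPEND_AND_END", html: reloadScript(currentUrl) };
}
```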

Grayscale release

  1. Edge computing code grayscale: the platform itself supports grayscale publishing of edge computing code.
  2. Route configuration grayscale: the edge computing code loads one of two configuration URLs, the grayscale version or the official version, according to a fixed proportion. A grayscale release publishes only the grayscale configuration, and a full release publishes the full configuration. Clear the CDN cache when publishing.
  3. Give the grayscale page a special template version number; when this version number is encountered, edge rendering is skipped.
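
Choosing between the two route configurations by a fixed proportion could be as simple as the sketch below. The function name pickConfigUrl, the URLs, and the use of a per-request random draw are assumptions; the article only says that a fixed proportion is set in the routing code.

```javascript
// Pick the grayscale or full configuration URL by a fixed proportion.
// grayRatio is the share of requests (0..1) that load the grayscale config;
// `random` is injectable so the choice can be tested deterministically.
function pickConfigUrl(urls, grayRatio, random = Math.random) {
  return random() < grayRatio ? urls.gray : urls.full;
}

// Hypothetical URLs, for illustration only.
const configUrls = {
  gray: "https://example.com/route.gray.json",
  full: "https://example.com/route.full.json",
};
```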

Smooth release

Separating the front end and back end brings a common problem: smooth release. When the static resources of a page (JS, CSS) are not released together with the back end, the HTML returned by the back end may not match the front-end JS and CSS. If the mismatch is not handled compatibly, the result can be style errors or DOM selectors that fail to find their elements.

One way to achieve smooth release is to make the code compatible whenever the front end and back end change together, so that releasing them one after the other does not affect page availability.

Another way is to use a version number, configured manually on the back-end page. For an incompatible release, the front-end resources are published first, and the back end then changes the version number manually, so that only HTML served by back-end machines that have been released successfully references the new static resources.

The smooth release problem has always existed with batch releases and beta releases. In the ESR scenario, however, the static part is cached on the CDN, which makes inconsistency between the front end and back end more likely. To deal with this, business developers need to identify the risk at release time. If the change is compatible, no special handling is needed. If it is not, the version number of the page template must be changed. When the version number of the new content does not match that of the cached static content, streaming rendering is abandoned, so that compatibility problems between dynamic and static content do not appear on the page.
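
The version check can be sketched as below, assuming the version travels in the esr-version meta tag shown in the raw HTML template and that the dynamic response also carries this tag. The function names are hypothetical, and how the version is actually transported for dynamic snippets is not specified in the article.

```javascript
// Read the template version from the esr-version meta tag, or null if absent.
function esrVersion(html) {
  const m = /<meta\s+name="esr-version"\s+content="([^"]*)"\s*\/?>/.exec(html);
  return m ? m[1] : null;
}

// Stream only when the dynamic response declares the same version as the
// cached static content; otherwise streaming rendering is abandoned.
function canStream(cachedStaticVersion, dynamicHtml) {
  const version = esrVersion(dynamicHtml);
  return version !== null && version === cachedStaticVersion;
}
```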

Edge CDN service providers

At present, major CDN service providers support edge computing as follows:

  1. AliCDN
     a. Supports edge computing in a service-worker-like environment, and its features meet our requirements.
     b. Overseas nodes are currently limited. In some regions performance matches or even exceeds Akamai, but in others it is slightly worse than Akamai because there are few nodes.
  2. Akamai
     a. ESI can assemble dynamic and static content but does not support streaming, so dynamic content blocks the first screen.
     b. There are many overseas nodes, which gives a performance advantage over AliCDN in some regions.
  3. Cloudflare
     a. Supports edge computing in a service-worker-like environment, and its features meet our requirements.
     b. We have no experience using it, and the onboarding process may be complicated.

Some details to consider

  1. With streaming rendering, the HTTP headers are returned to the user early, together with the static part. On the one hand, the dynamic parameters that follow the aplus script may become fixed in the cached static content. On the other hand, if the dynamic content response carries Set-Cookie headers, they cannot be passed on to the browser directly. Take care if a scenario depends strongly on aplus dynamic parameters or involves an important Set-Cookie operation. One current solution is to trigger the required Set-Cookie logic through an asynchronous interface on the same domain name as the page. (Alternatively, where permitted, let JS write the cookie based on the Set-Cookie header returned with the dynamic content.)
  2. Dynamic page title and meta tag attributes: these can be handled by writing JS into the page that resets the attributes once the dynamic content has been retrieved.
  3. For SEO pages, inserting dynamic content through JS (including later JS writes of the title and meta tags) can be unfriendly to crawlers. This can be solved by identifying crawler user agents and having the edge node return the complete SSR page content directly.
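
Points 2 and 3 can be illustrated with two small helpers. The crawler pattern is a short, non-exhaustive assumption based on well-known user-agent tokens, and the function names shouldBypassEsr and titleResetScript are hypothetical.

```javascript
// A few well-known crawler tokens; a real list would be longer and maintained.
const CRAWLER_UA = /googlebot|bingbot|baiduspider|yandexbot/i;

// Crawlers get the complete SSR page instead of streamed or JS-inserted content.
function shouldBypassEsr(userAgent) {
  return CRAWLER_UA.test(userAgent || "");
}

// Build a script that resets the page title once dynamic content is known.
function titleResetScript(title) {
  // Escape "<" so the title text cannot close the inline script early.
  const literal = JSON.stringify(title).replace(/</g, "\\u003c");
  return "<script>document.title=" + literal + ";</script>";
}
```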

Project progress

The feasibility of the scheme has been verified with the demo. We are now trying it out in real business scenarios on Alibaba’s international site. In the future we will share more complete and richer solutions (such as React rendering directly on edge nodes), as well as actual online performance results.

References

  1. Cloudflare edge worker
  2. 2016 – the year of web streams
  3. ESI
  4. Async Fragments: Rediscovering Progressive HTML Rendering with Marko
  5. The Lost Art of Progressive HTML Rendering