Front-end performance optimization: client-side offline caching of static files

1. Introduction

Last time, I shared how to push project optimization as far as possible during the webpack build phase. Article link: Optimizing webpack packaging to perfection _20180619

2. Exploring business bottlenecks

For H5 pages, network factors can account for almost 80% of the performance bottleneck. Reducing the output file size, adopting HTTP/2, or using a PWA all reduce the network’s impact on H5 page loading.

Our products are mainly used in Latin American countries, where users share two obvious characteristics:

  1. Poor network conditions (2G and 3G users still exist);
  2. Phones are mainly Android and the models are old, which means the WebView is old as well.

The following figure shows data queried from the company’s H5 performance monitoring platform.

The figure shows that most users’ network environment is actually quite good. However, front-line colleagues often reported that our H5 pages were slow to open on users’ phones. After discussion, we came up with two guesses:

  1. Even after we optimized the file size, does it take a long time to resolve the page and static-file domain names through the local CDN?
  2. Although Nginx on the CDN sets HTTP cache headers (max-age, etc.) for static files, for some reason the files are not cached on users’ phones for as long as expected.

3. Exploration of network factors

With these guesses in mind, we investigated the problems above through technical means. The investigation relies on the following background knowledge:

The diagram above shows the entire life cycle of a page load, starting from the moment we enter the address in the browser’s address bar and press Enter. The browser (or WebView) provides a Performance API that lets us get the start time of each phase.

Based on the Performance API, we wrote a data reporting script. The script link can be found here

I have pulled out the main part of the code, together with the points that need attention when using it:


    // Note: in practice some older Android devices do not have the getEntries method, so the script must guard against that
    const perf = window.performance;
    const entryList = (perf && perf.getEntries && perf.getEntries()) || [];

    // Usually we put CSS in the head tag and JS at the bottom of the body tag. If static files are served
    // from a separate CDN domain name, the entry of the first loaded CSS file is all we need
    let linkPerformance = null;
    for (let i = 0; i < entryList.length; i++) {
        const obj = entryList[i];
        if (obj.initiatorType === 'link') {
            linkPerformance = obj;
            break;
        }
    }

    // Below are the start and end times of the various stages of fetching that file
    const cssDomainlookStart = linkPerformance ? linkPerformance.domainLookupStart : 0;
    const cssDomainlookEnd = linkPerformance ? linkPerformance.domainLookupEnd : 0;

    const connectStart = linkPerformance ? linkPerformance.connectStart : 0;
    const connectEnd = linkPerformance ? linkPerformance.connectEnd : 0;

    const requestStart = linkPerformance ? linkPerformance.requestStart : 0;
    const responseStart = linkPerformance ? linkPerformance.responseStart : 0;
    const responseEnd = linkPerformance ? linkPerformance.responseEnd : 0;

    // Omega is the company's unified data reporting script. Each value reported here is the
    // time difference between the start and end of a phase.
    // country and cityid are defined elsewhere in the full script.
    window.Omega && window.Omega.trackEvent && window.Omega.trackEvent('static_domain_timing', '', {
        country: country,
        lookupTiming: cssDomainlookEnd - cssDomainlookStart,
        connectTiming: connectEnd - connectStart,
        requestTiming: responseStart - requestStart,
        responseTiming: responseEnd - responseStart,
        host: 'static.didiglobal.com',
        cityid
    });
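One caveat when reading timings for a static file served from a separate CDN domain: for cross-origin resources, browsers report the detailed timestamps (domainLookupStart, connectStart, requestStart, responseStart and so on) as 0 unless the response carries a Timing-Allow-Origin header. The sketch below is a minimal illustration of guarding against that case. phaseTimings is a hypothetical helper, not part of the reporting script described here.

```javascript
// Hypothetical helper: derive per-phase durations from a resource timing entry.
// Returns null when the detailed timestamps are unavailable (reported as 0), which is
// what browsers do for cross-origin resources served without Timing-Allow-Origin.
function phaseTimings(entry) {
    if (!entry || !entry.requestStart) {
        return null;
    }
    return {
        lookupTiming: entry.domainLookupEnd - entry.domainLookupStart,
        connectTiming: entry.connectEnd - entry.connectStart,
        requestTiming: entry.responseStart - entry.requestStart,
        responseTiming: entry.responseEnd - entry.responseStart
    };
}
```

Skipping the report when the helper returns null avoids polluting the statistics with rows of zeros that look like instant DNS lookups and connections.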

3.1 Survey results for the HTML (page) domain name

The reporting script first writes the data to a database, and we then query it with SQL. The results are therefore presented as tables.

3.2 Survey results for the CDN domain name

  • _c0: DNS resolution time
  • _c1: TCP handshake time
  • _c2: time from sending the request to the start of the response
  • _c3: time from the start of the response to the end of the response
  • _c4: city ID
  • _c5: country

Conclusion

With the above statistics, we came to the following conclusion:

In general, our page and static-file domains perform well in major Latin American cities. Slow cases exist, but they are negligible relative to the total number of users.

4. Problem conjecture and optimization scheme

This conclusion does not mean that our network environment has no room for optimization, but given the cost and urgency of such optimization, the current state still meets our needs. We did, however, later push the operations department to upgrade our network protocol to HTTP/2.

4.1 Conjecture of the problem

Having ruled out network factors, what is slowing down our page loads? Our projects are 100% single-page applications built with Vue, and one of the biggest disadvantages of single-page applications is a slow “first screen”.

  1. If the user is entering a page for the first time, slowness is understandable, because the JS, CSS and HTML all have to be downloaded (this is what is usually called a “slow first screen”).
  2. If the static resources such as CSS and JS cached by the client have expired, the page will still be slow on the Nth visit.
  3. If a page has been modified and redeployed, it will also be slow. (This case is similar to 1, i.e. a first-screen load.)

Based on the above three cases, we put forward a hypothesis:

If the HTML and static files (JS, CSS) required to render an H5 page are cached on the client in advance in some way, can first-screen rendering be accelerated? By how much? Is it worth promoting on a large scale?

4.2 Data Results

To verify the conjecture in 4.1, we needed data to back it up, so we asked our client-side colleagues to run several experiments with us.

At each client release during that period, the rendering resources of some single-page H5 pages were bundled into the client package and shipped with it, and we then monitored the loading speed of those pages.

Here is some of the data from our statistical comparison of rendering time before and after the pages were pre-cached in the client:

Change formula: (average before optimization - average after optimization) / average before optimization
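As a quick illustration of the formula, here is a minimal sketch. The function name relativeChange and the sample numbers are made up for the example and are not measurements from our experiments.

```javascript
// Relative improvement as defined in the article:
// (average before - average after) / average before
function relativeChange(avgBefore, avgAfter) {
    if (!(avgBefore > 0)) {
        throw new RangeError('avgBefore must be a positive number');
    }
    return (avgBefore - avgAfter) / avgBefore;
}

// Hypothetical numbers: an average of 2000 ms before and 500 ms after
console.log(relativeChange(2000, 500)); // 0.75, i.e. a 75% reduction
```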

With the above data as support, the conclusion is that caching static resources in the client in advance can significantly speed up page rendering.

5. Automatic offline cache solution

In 4.2 we verified by hand the effect of pre-caching static resources in the client on H5 page rendering. Each time, we had to agree on the release date with the client team in advance, manually package the resources to be cached, and send them to our client-side colleagues.

There are many disadvantages to this approach:

  • Efficiency is very low: the resources must be packaged by hand each time, and the client team must embed the cached resources by hand;
  • The set of pages to be cached is inflexible;
  • If an online page has a bug that is fixed and redeployed between client releases, the front-end deployment invalidates the cache;
  • H5’s advantage is that it can go online at any time, but this approach depends on client releases and cannot be updated dynamically.

With so many disadvantages, this approach is clearly not suitable for rolling out across our business lines. Below we design a solution that automatically caches static resources in the client.

5.1 Client

To implement an automated caching solution, the client and the front end must work closely together. Since I am mainly responsible for the front end, I will only give a quick overview of the client side. If you want to discuss it further, please leave me a message.

The basic flow can be seen in the following figure:

  1. When the client starts or the user opens the sidebar, it requests a server API to obtain information about the static files that need to be cached.
  2. When loading H5 page resources, the client adds an interception layer that checks whether the resource requested by the current URL is cached on the client. If it is, the client uses the local copy directly and does not access the network. If it cannot be found locally, the request goes to the network as usual.

The above is only a simplified flow. In actual development, we also need to maintain the mapping between URLs and local cache files, delete stale caches, and be able to take the cache module offline.
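The interception step can be sketched roughly as follows. The real implementation lives in the native Android and iOS clients, so this JavaScript version only illustrates the decision logic. createResourceLoader, cacheIndex, readLocalFile and fetchFromNetwork are hypothetical names, and the loader is synchronous purely to keep the example short.

```javascript
// Hypothetical interception layer: serve a resource from the offline cache when possible.
// cacheIndex: Map from URL to local file path
// readLocalFile(path): returns the file content, or null if the file is missing
// fetchFromNetwork(url): performs the normal network request
function createResourceLoader(cacheIndex, readLocalFile, fetchFromNetwork) {
    return function load(url) {
        const localPath = cacheIndex.get(url);
        if (localPath !== undefined) {
            const body = readLocalFile(localPath);
            if (body !== null) {
                return { source: 'cache', body: body };
            }
            // The mapping is stale (file deleted or unreadable): drop it and fall through
            cacheIndex.delete(url);
        }
        return { source: 'network', body: fetchFromNetwork(url) };
    };
}
```

Falling back to the network whenever the local lookup fails means a broken or missing cache can never make a page worse than it was without offline caching.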

5.2 Front end

A front-end project’s release process basically looks like this:

  1. Build the code locally or on a development machine (most companies have a dedicated build server);
  2. Copy the built code directory to the production server and, if there is a CDN, copy the static files to the CDN.

When designing the front-end offline cache solution, an important consideration was that it should not have much impact on our existing business or on the build and release process. In other words, our older projects should be able to adopt the offline cache solution with only simple modifications.

Finally, our scheme is shown in the figure below:

The scheme relies on the fact that a CDN is designed to offload servers, distribute traffic, serve file downloads and speed up delivery. To reduce the number of requests the client must make at startup for the static resources to be cached, those resources are packaged into zip files during the build phase for the client to download.

  1. Each line of business (Git repository) builds its own separate offline zip.
  2. The offline zip goes online to the CDN together with the static files, taking full advantage of the CDN for downloads (check whether the CDN’s Nginx configuration supports requests for zip files).
  3. Each build stores file information locally so that later builds can zip only the changed resources (reducing the zip file size).

6. Developing a webpack plug-in

As the previous section shows, we want to generate the zip cache package during the build phase. Because all of our projects are built with webpack, we developed a webpack plug-in to do just that.

The plug-in used in our project is called webpack-static-chache-zip. It is compatible with webpack 2.0+ and webpack 4.0.

The function points of the plug-in are as follows:

For details, please refer to the following configuration:

    const AkWebpackPlugin = require('webpack-static-chache-zip');

    // Initialize the plug-in
    new AkWebpackPlugin({
        // Name of the generated offline package. Defaults to 'offline'
        'offlineDir': 'offline',
        // Source of the production code. Defaults to 'output', the directory webpack compiles the production code into
        'src': 'output',
        // Whether to keep the generated offline package folder (the source files of the zip package)
        'keepOffline': true,
        // datatype: [required] 1 (Android passenger), 2 (Android driver), 101 (iOS passenger), 102 (iOS driver)
        'datatype': '',
        // terminal_id: business name, e.g. passenger-side wallet. Must not be repeated.
        // See the wiki at http://wiki.intra.xiaojukeji.com/pages/viewpage.action?pageId=118882082
        'terminal_id': '',
        // If one data_type corresponds to more than one terminal_id, list them as an array of objects as follows
        'terminal_list': [
            {
                data_type: 1,
                terminal_id: 2
            },
            {
                data_type: 1,
                terminal_id: 2
            }
            // ...
        ],
        // File paths to include (fuzzy matching). Higher priority than excludeFile
        'includeFile': [
            'balance_topup',
            'static',
            'pay_history'
        ],
        // File paths to exclude (fuzzy matching). Lower priority than includeFile
        'excludeFile': [
            'repay_qa',
            'test',
            'pay_account',
            'fill_phonenum',
            'balance_qa',
            'select_operator',
            'payment_status'
        ],
        // Other file types to cache, e.g. ['png', 'jpg']
        'cacheFileTypes': [],
        // module: product ID, must be unique. Check http://wiki.intra.xiaojukeji.com/pages/viewpage.action?pageId=272106764
        // to see whether the module name is already in use, and register your own module
        'module': 'passenger-wallet',
        // Page domain name. Multiple domain names can be configured, because one file may be served under several domain names after going online.
        // For example, https://aaaa.com/a.html and https://bbbb.com/a.html access the same file but use different domain names for different business scenarios
        'pageHost': 'https://page.didiglobal.com',
        // urlPath
        'urlPath': '/global/passenger-wallet/',
        // This field and patchCdnPath below are special. Suppose the build output is /xx/xxxx/output/aaa/bb/index.html
        // and the output directory is copied to the server on release. In principle the page URL on the server should be
        // https://page.didiglobal.com/aaa/bb/index.html, but some projects shorten the path through nginx configuration,
        // so the page is actually accessed at https://page.didiglobal.com/index.html. In that case configure patchUrlPath: 'aaa/bb'
        'patchUrlPath': '',
        // CDN domain name for static files (js/css/html). If not configured or set to an empty array, pageHost is used by default
        'cdnHost': 'https://static.didiglobal.com',
        // cdnPath: if not set, urlPath is used by default
        'cdnPath': '',
        // See the patchUrlPath usage above
        'patchCdnPath': '',
        // Domain name of the zip file. Defaults to cdnHost if not set
        'zipHost': '',
        // zipPath: defaults to cdnPath if not set
        'zipPath': '',
        // An H5 page may run in different clients (for example, our Brazilian and global driver apps are two separate clients),
        // and the H5 pages in the two clients differ.
        // Use otherHost to set the page and static-file domain names for the other environment.
        // It can be left unset or set to null
        'otherHost': {
            // Page domain name
            'page': 'page.99taxis.mobi',
            // A separate CDN domain name can be set. If not set, it is the same as the page domain name
            'cdn': 'static.99taxis.mobi'
        },
        // For details about compression parameters, see https://archiverjs.com
        'zipConfig': {zlib: {level: 9}},
        // The callbacks below can use fs (fs-extra), this.success, this.info, this.warn, this.alert
        // Before copying the files to the offline folder
        beforeCopy: function () {},
        // After copying the files to the offline folder
        afterCopy: function () {},
        // Before compressing the offline folder
        beforeZip: function (offlineFiles) {
            // offlineFiles: path information of the files in the offline package folder
        },
        // After the offline folder is compressed
        afterZip: function (zipFilePath) {
            // zipFilePath: path of the generated offline zip package
        }
    });

6.1 Plug-in workflow

Here’s how the plug-in works:

  1. Wait for the point at which webpack compilation ends
  2. Copy files from the compiled output directory to an offline directory (the directory that will be compressed into the zip file), using includeFile and excludeFile to decide whether each file should be copied
  3. If information from a previous online build already exists, diff the new zip contents against the previous version
6.2 Version diff

Version diff means the following: the first release ships the full offline package. To save user traffic and disk space on users’ phones, from the second release through the Nth we should ship only a zip package containing the diff, and we also need to tell the client which old files are obsolete so that it can delete them and free disk space.

Both file information and diff information would ideally be kept in a storage service (database). However, to speed up compilation and diffing, the build does not exchange data with a server through API requests. All information is stored locally as JSON files, and after the build goes online the information is sent to the server for storage.
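A minimal sketch of such a local JSON store is shown below. The file name offline-build-info.json, the record shape and the function names are assumptions made for illustration, since the plug-in’s actual on-disk format is not described in this article.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical file name; the plug-in's real file layout may differ.
const INFO_FILE = 'offline-build-info.json';

// Read the list of previous builds, oldest first. Returns [] on the first build.
function readBuildHistory(dir) {
    const file = path.join(dir, INFO_FILE);
    if (!fs.existsSync(file)) {
        return [];
    }
    return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Append the current build and keep only the most recent `keep` entries.
function appendBuild(dir, build, keep) {
    const history = readBuildHistory(dir).concat([build]).slice(-keep);
    fs.writeFileSync(path.join(dir, INFO_FILE), JSON.stringify(history, null, 2));
    return history;
}
```

Because everything is a local file read, the diff can run during the build without waiting on any network request.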

The basic flow of the diff is shown below:

Our requirement is to keep only five versions of the offline package online. This means each release is diffed against the previous 5 releases. Based on the combinatorial formula we learned in high school, we should cache up to 10 diff versions and 1 full version. That’s 11 versions.
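The calculation is not spelled out here. One reading that reproduces the figure of 10, offered as an assumption rather than as the definitive derivation, is to count the unordered pairs that can be formed among the five retained versions, then add the single full package:

```latex
\binom{5}{2} = \frac{5!}{2!\,(5-2)!} = \frac{5 \times 4}{2} = 10,
\qquad 10 \text{ diff packages} + 1 \text{ full package} = 11 \text{ packages}
```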

6.3 Principle of the diff

The client needs to verify the integrity of the offline files after downloading the offline cache resources. So when generating the zip package, the plug-in computes an MD5 for each file to be cached, and after downloading, the client computes the MD5 again and compares it with the one I generated.

Since the MD5 of each file is already computed during the build phase, I use this MD5 value as the unique identifier of a file during the diff.
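The MD5 step can be sketched with Node’s built-in crypto module. The helper names md5Of, buildManifest and verifyDownload are hypothetical, and the files are passed in as in-memory content to keep the example self-contained. A real build would read them from the offline directory, and the real check runs in the native client.

```javascript
const crypto = require('crypto');

// Hex MD5 digest of a string or Buffer
function md5Of(content) {
    return crypto.createHash('md5').update(content).digest('hex');
}

// Build side: record one MD5 per file. `files` maps a relative path to its content.
function buildManifest(files) {
    return Object.keys(files).map((name) => ({ name: name, md5: md5Of(files[name]) }));
}

// Client side: recompute the MD5 of each downloaded file and compare it with the manifest.
function verifyDownload(manifest, downloaded) {
    return manifest.every((item) =>
        downloaded[item.name] !== undefined && md5Of(downloaded[item.name]) === item.md5);
}
```

The same md5 field can then double as the file identity in the diff, since two files with the same path but different content get different digests.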

The diff in this version can be simplified to a comparison of two arrays, as shown in the diagram below:

The following is a core code example. My implementation may not be the best, so if anyone has a better approach, please leave me a message (thanks in advance).

    // Given two arrays oldArr and newArr: which elements are added when oldArr becomes newArr, and which should be deleted?
    // We add a type field to each element: 0 marks an element to delete, 1 an existing element, 2 a new element

    const oldArr = [
        { tag: 1 },
        { tag: 2 },
        { tag: 3 },
        { tag: 4 }
    ];
    const newArr = [
        { tag: 3 },
        { tag: 4 },
        { tag: 5 },
        { tag: 6 }
    ];
    const newArrTag = [];
    // Elements that need to be deleted
    const delArrList = [];

    for (let i = 0; i < newArr.length; i++) {
        const newItem = newArr[i];

        // By default, each entry in newArr is marked as new
        newItem.type = 2;

        newArrTag.push(newItem.tag);

        for (let m = 0; m < oldArr.length; m++) {
            const oldItem = oldArr[m];

            if (newItem.tag === oldItem.tag) {
                newItem.type = 1;
            }
        }
    }

    for (let n = 0; n < oldArr.length; n++) {
        const oldItem = oldArr[n];

        if (!newArrTag.includes(oldItem.tag)) {
            oldItem.type = 0;
            delArrList.push(oldItem);
        }
    }

    const resultArr = newArr.concat(delArrList);

    console.log(resultArr);

    // After oldArr becomes newArr, each element is marked as new, to be deleted, or existing:
    // [ { tag: 3, type: 1 },
    //   { tag: 4, type: 1 },
    //   { tag: 5, type: 2 },
    //   { tag: 6, type: 2 },
    //   { tag: 1, type: 0 },
    //   { tag: 2, type: 0 } ]

The code above is the flow of the version diff. Of course, it is a little more complicated when combined with webpack and the business logic. If you are interested, check out the plug-in link and read the source code.
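For comparison, here is one possible alternative that uses Set lookups instead of nested loops, so the work grows linearly with the number of files. It is a sketch of the same idea, not the plug-in’s code. diffByTag is a hypothetical name, and unlike the example above it returns copies instead of modifying the input objects.

```javascript
// Linear-time variant of the diff: look tags up in Sets instead of nested loops.
// type: 0 = to delete, 1 = already exists, 2 = new (same convention as the example above).
function diffByTag(oldArr, newArr) {
    const oldTags = new Set(oldArr.map((item) => item.tag));
    const newTags = new Set(newArr.map((item) => item.tag));

    const kept = newArr.map((item) =>
        Object.assign({}, item, { type: oldTags.has(item.tag) ? 1 : 2 }));
    const removed = oldArr
        .filter((item) => !newTags.has(item.tag))
        .map((item) => Object.assign({}, item, { type: 0 }));

    return kept.concat(removed);
}
```

In the plug-in’s setting the tag would be the file’s MD5, so files whose content did not change are marked 1 and can be left out of the diff zip.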

7. Statistics after launch

The following are statistics for our passenger-side wallet for 3 days before and after launch:

The pages in the red box below are the ones connected to the offline cache:

As the comparison above shows, for pages that use offline caching, DOM ready time improved significantly, by almost 90%.

The following figures show statistics from the company’s web page performance monitoring platform:

  1. Data for 3 days before integration:

  2. Data for 3 days after integration:

According to these statistics, the loading time of all passenger-side H5 pages was shortened by 300ms, and DOM ready time was also reduced by nearly 300ms. This comparison is actually not precise, because the data includes pages that are not connected to the offline cache, so it only shows the average across all pages. The statistics at the beginning of this section better reflect the impact of offline caching on page load times.

Conclusion:

To optimize the loading speed of H5 pages in the client, this article described our team’s preliminary research and development work on offline caching of static files, as well as the final engineering practice. The overall improvement in page loading speed is very noticeable. Projects only need to add the webpack plug-in, existing front-end projects do not need much modification, and front-end engineers’ efficiency is greatly improved as well.

Space is limited, so some parts of the article may be rather vague. Readers who are interested, or who want to try this in their own team, are welcome to leave a comment for discussion.