Why use differential updates?

Traditionally, JavaScript resources are stored on the origin server or served through a CDN. By setting max-age, Last-Modified, ETag and so on, the browser can cache a file after the first access and avoid repeated requests. In a product update, however, often only a small part of the content is modified; once the hash changes, the user still has to download the entire JavaScript file. The common approach to incremental updates is to split the code into bundles, but when one bundle is updated the user still has to download that whole new bundle.

At this point, if we borrow the incremental update mechanism used on Android, merge a differential description with the locally cached file, and then run the merged result, the user can be upgraded almost without noticing and without waiting, and we can reduce the number of resource requests.

Implementation approach

Each time we update a file, we need to generate diff information describing the difference between each previous version (any of which may still be cached by some users) and the current version.

In the browser, check whether the file is cached locally. If no cached copy exists (or LocalStorage is not supported), download the full file and cache it locally for next time. If a cached copy exists, obtain the diff data and check whether it contains update information for the cached version. If it does not, the local cache is too old and the version gap is too large, so download the full file and cache it again. If it does, merge the diff information with the local file to produce the new version, execute the JS via eval, cache it locally, and update the hash.
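The decision flow above can be sketched as a small pure function. This is only an illustration: the name planUpdate, the `{ hash, diff }` entry shape, and the returned object are assumptions, not part of the original plugin.

```javascript
// Hypothetical helper: decides between a full download and a local patch.
// `cached` is the parsed LocalStorage entry ({ hash, source }) or null;
// `diffEntries` is an assumed list of { hash, diff } records, one per older version.
function planUpdate(cached, diffEntries, storageSupported) {
  if (!storageSupported || !cached) {
    // Nothing usable is stored locally: download the full file.
    return { action: 'full', reason: 'no-cache' };
  }
  const entry = (diffEntries || []).find((e) => e.hash === cached.hash);
  if (!entry) {
    // The cached version is not covered by the diff data: the gap is too large.
    return { action: 'full', reason: 'cache-too-old' };
  }
  // The cached version is covered: patch it locally.
  return { action: 'patch', diff: entry.diff };
}
```

Keeping the decision separate from the network and eval side effects makes it easy to check each branch on its own.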

How do we generate the diff information?

Several implementations come to mind in this step:

  1. Compute and cache the diff information on the server when resources are requested. This requires server-side cooperation and consumes some computing resources.
  2. Generate the diff information with build tooling. This is the most practical option for front-end developers, so let's start here. Webpack is the most commonly used tool, so we will write a Webpack plugin as the example. I recently wrote diffupdate-webpack-plugin, which is still at the prototype stage. It is a little rough, but the basic idea is the same, and the code below can be found in it.

1. Writing the plugin

The basic idea is to implement a class with an apply method and use compiler.plugin to listen for Webpack lifecycle events, performing the work in the callback function.
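A minimal sketch of that shape, assuming the legacy compiler.plugin hook API used throughout this article. The class name DiffUpdatePlugin and the emitted file name diff-files.json are illustrative and not taken from the author's plugin.

```javascript
// Hypothetical plugin skeleton; the class and asset names are illustrative.
class DiffUpdatePlugin {
  constructor(options) {
    this.options = options || {};
  }

  // Webpack calls apply() once and passes in the compiler instance.
  apply(compiler) {
    // 'emit' fires just before assets are written to the output directory.
    compiler.plugin('emit', (compilation, callback) => {
      // Collect the names of the emitted JS assets.
      const names = Object.keys(compilation.assets).filter((n) => /\.js$/.test(n));
      const body = JSON.stringify(names);
      // Emit an extra asset: an object exposing source() and size().
      compilation.assets['diff-files.json'] = {
        source() { return body; },
        size() { return body.length; },
      };
      // Signal that this asynchronous hook has finished.
      callback();
    });
  }
}
```

In Webpack 4 and later, the same hook would normally be registered through compiler.hooks.emit.tapAsync instead of compiler.plugin.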

2. Cache files

First, Webpack provides the compilation object, which represents a single build of the versioned assets. We get information about the files to be emitted through compilation.chunks and cache their contents for later comparison.

compilation.chunks.forEach((chunk) => {
    const { hash } = chunk;
    chunk.files.forEach((filename) => {
        if (filename.indexOf('.js') !== -1) {
            // Get the file from assets and use source() to get the content
            fileCache[filename] = fileCache[filename] || [];
            // ...
        }
    });
});
// ...
// Webpack emits a file when compilation.assets[filename] is set to an object
// with source() and size() methods (both functions are required)
compilation.assets['filecache.json'] = {
    source() { return JSON.stringify(fileCache); },
    size() { return JSON.stringify(fileCache).length; }
};

3. File comparison

Here we can use either fast-diff or diff. Personally I find the results from the diff library less precise, so choose according to your actual situation.

Extending the code above:

const fastDiff = require('fast-diff');
// ...
const diffJson = {}; // Diff information for each version of a file
// ...
const newFile = compilation.assets[filename].source(); // The new file after compilation
diffJson[filename].forEach((ele, index) => {
    const item = fileCache[filename][index]; // Historical files
    const diff = this.minimizeDiffInfo(fastDiff(item.source, newFile)); // Simplify diff information to reduce unnecessary interference
    ele.diff = diff;
});

On the first build there is no historical version, so no diff information is produced. In subsequent builds, we read the contents of the previous versions from the cache, compare each of them with the latest file to generate diff information, and overwrite the previously generated diff file. As long as the version gap is limited, the user can always get the diff information that brings their cached version up to the latest one.
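One way to keep the version gap bounded is to cap how many historical versions are kept per file. The helper below is a hypothetical sketch: the name recordVersion and the maxVersions limit are assumptions, and the original plugin may manage its history differently.

```javascript
// Hypothetical helper: `maxVersions` is an assumed option, not from the original plugin.
// `history` is the per-file array of cached versions, e.g. [{ hash, source }, ...].
function recordVersion(history, version, maxVersions) {
  // Skip the push when this exact build is already the newest entry.
  const last = history[history.length - 1];
  if (!last || last.hash !== version.hash) {
    history.push(version);
  }
  // Drop the oldest entries beyond the limit.
  while (history.length > maxVersions) {
    history.shift();
  }
  return history;
}
```

A user whose cached hash has already been dropped from the history falls back to the full download described earlier.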

The method I implemented to generate diff information is as follows:
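As a rough sketch of what such a function could look like (not the author's actual implementation), assume the input is the list of `[op, text]` tuples that fast-diff returns, where op is 0 for unchanged text, -1 for deleted text, and 1 for inserted text. The compact output format here is inferred from what the browser-side merge step in this article consumes; any escaping needed to embed the result in HTML is left out.

```javascript
// Assumed compact format: a string = text to insert, a positive number =
// number of characters to keep, a negative number = number of characters to delete.
function minimizeDiffInfo(tuples) {
  const out = [];
  for (const [op, text] of tuples) {
    if (op === 0) {
      out.push(text.length);   // unchanged: only the length is needed
    } else if (op === -1) {
      out.push(-text.length);  // deleted: negative length, the text itself is dropped
    } else {
      out.push(text);          // inserted: the new text must be kept
    }
  }
  return out;
}

// 'abcd' -> 'abxyd': keep 2, delete 1, insert 'xy', keep 1
const compact = minimizeDiffInfo([[0, 'ab'], [-1, 'c'], [1, 'xy'], [0, 'd']]);
console.log(JSON.stringify(compact)); // [2,-1,"xy",1]
```

Only inserted text is stored verbatim, so the diff stays small when most of the file is unchanged.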

What the browser does

Now that we have generated the diff information, the browser needs to fetch it, load the cached JS, and merge the diff.

Loading the JS

First we write a loadScript method that takes the file names of the JS files to load. It checks whether LocalStorage holds a cached copy (and whether LocalStorage is supported at all). If not, it requests the resource and stores it in LocalStorage. If so, it merges the diff information and updates the local cache.

function mergeDiff(str, diffInfo) {
    var p = 0; // current position in str
    for (var i = 0; i < diffInfo.length; i++) {
        var info = diffInfo[i];
        if (typeof info === 'string') {
            // A string is inserted at the current position
            info = info.replace(/\\"/g, '"').replace(/\\'/g, "'");
            str = str.slice(0, p) + info + str.slice(p);
            p += info.length;
        } else if (typeof info === 'number') {
            if (info < 0) {
                // A negative number deletes that many characters
                str = str.slice(0, p) + str.slice(p + Math.abs(info));
            } else {
                // A positive number keeps that many characters
                p += info;
            }
        }
    }
    return str;
}

function loadFullSource(item) {
    // ajaxLoad is a request helper that is not shown here
    ajaxLoad(item, function (result) {
        window.eval(result);
        localStorage.setItem(item, JSON.stringify({
            hash: window.__fileHash,
            source: result,
        }));
    });
}

function loadScript(scripts) {
    for (var i = 0, len = scripts.length; i < len; i++) {
        var item = scripts[i];
        if (localStorage.getItem(item)) {
            var itemCache = JSON.parse(localStorage.getItem(item));
            var _hash = itemCache.hash;
            var diff;
            // Get diff information
            if (diff) {
                var newScript = mergeDiff(itemCache.source || '', diff);
                window.eval(newScript);
                localStorage.setItem(item, JSON.stringify({
                    hash: window.__fileHash,
                    source: newScript,
                }));
            } else {
                loadFullSource(item);
            }
        } else {
            loadFullSource(item);
        }
    }
}

Getting the diff information

We can obtain the diff information by requesting the diff.json generated in the previous step. However, this has a drawback: the diff information for every JS file is fetched, not only for the files the page uses. In addition, as mentioned above, we need to pass in the list of JS files to load. In most cases we use html-webpack-plugin to inject the generated JS files into the template as script tags, which does not serve our purpose, so we need to modify its output. Fortunately, html-webpack-plugin allows us to do that.

Modifying the html-webpack-plugin output

The html-webpack-plugin provides events during HTML generation. The two used here are html-webpack-plugin-before-html-processing and html-webpack-plugin-after-html-processing.

Here is a crude method (I was busy with other things and did not work out how to hook the script injection itself). First, cache the template when the html-webpack-plugin-before-html-processing event fires. Then, in the html-webpack-plugin-after-html-processing event, compare the generated file with the template, replace the injected script tags with the diff information, and insert the minified JS logic for cache handling and merging into the HTML. When the client reads the HTML it gets the latest diff information, and there is no need to fill in the JS list by hand. Bingo!

let oriHtml = '';
// Cache the template before it is modified
compilation.plugin('html-webpack-plugin-before-html-processing', (data) => {
    oriHtml = data.html;
});
// Replace the generated script tags, insert the diff information, and pass the imported JS list to loadScript
compilation.plugin('html-webpack-plugin-after-html-processing', (data) => {
    const htmlDiff = diff.diffLines(oriHtml, data.html);
    const result = UglifyJS.minify(insertScript);
    // ...
    for (let i = 0, len = htmlDiff.length; i < len; i += 1) {
        const item = htmlDiff[i];
        const { added, value } = item;
        // NOTE: the two script-tag patterns below were lost from the original listing;
        // /<script/ and /<script.*?<\/script>/g are assumed reconstructions
        if (added && /<script/.test(value)) {
            // Collect the src of every injected JS file
            const jsList = value.match(/(?<=src=")(.*?\.js)/g) || [];
            // Strip the injected script tags from this chunk of HTML
            const rest = value.replace(/<script.*?<\/script>/g, '');
            // Keep only the diff information for files this page uses
            const insertJson = deepCopy(diffJson);
            for (const name in insertJson) {
                if (jsList.indexOf(name) === -1) delete insertJson[name];
            }
            newHtml += `<script>${result.code}</script>\n<script>window.__fileDiff__='${JSON.stringify(insertJson)}';</script><script>loadScript(${JSON.stringify(jsList)});</script>\n${rest}`;
        } else if (item.removed) {
            // Removed lines are left out of the output
        } else {
            newHtml += value;
        }
    }
});

Results

On the first load there is no local cache, so the full file is read.


Why not use PWA?

  1. Limits of the cache mechanism

    If we update the Service Worker code in a new release, the browser fetches the new file when the page is visited, installs it, and fires install. The old active Service Worker keeps running, and the new one stays in the waiting state after installation. The new Service Worker only takes effect on pages opened after all existing pages have been closed and the old Service Worker has stopped. To update immediately, the new code needs extra handling: call self.skipWaiting() in the install event, then call self.clients.claim() in the activate event so that the open clients are taken over.

    If the browser caches sw.js, it will not get the latest Service Worker code. In practice index.html is also cached, and in our fetch handler a cache hit is served directly from the cache. As a result, even after the index page is updated, the browser keeps getting the index page cached by the previous Service Worker. This is why some Service Worker frameworks let us configure a resource update policy. For the home page, for example, we can try the network first: if the request succeeds, use the new resource and update the cache; if it fails, fall back to the cached resource.

  2. Compatibility

    Service Worker support is still limited, and IE does not support it for now, whereas LocalStorage is more widely supported.
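The immediate-activation and network-first handling described in point 1 can be sketched as follows. This is an illustration under assumptions: the function names, the cache name 'pages-v1', and the choice to apply network-first only to page navigations are all hypothetical. The service worker global is passed in as `scope` so the logic can be exercised outside a browser.

```javascript
// Sketch only: `scope` is the service worker global (pass `self` inside sw.js).
function setupServiceWorker(scope) {
  scope.addEventListener('install', () => {
    // Skip the waiting phase so the new worker activates immediately.
    scope.skipWaiting();
  });
  scope.addEventListener('activate', (event) => {
    // Take control of pages that are already open.
    event.waitUntil(scope.clients.claim());
  });
  scope.addEventListener('fetch', (event) => {
    // Assumption: only page navigations (e.g. index.html) use network-first.
    if (event.request.mode !== 'navigate') return;
    event.respondWith(networkFirst(scope, event.request));
  });
}

// Network first: use and cache the fresh response, fall back to the cache on failure.
function networkFirst(scope, request) {
  return scope.caches.open('pages-v1').then((cache) =>
    scope.fetch(request)
      .then((response) => {
        // Do not overwrite the cache with an error response.
        if (!response.ok) return response;
        return cache.put(request, response.clone()).then(() => response);
      })
      .catch(() => cache.match(request))
  );
}
```

Inside sw.js this would be wired up with setupServiceWorker(self).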


Final words

The above is one approach to implementing differential updates for JavaScript. The write-up is a little rough, but I hope it gives you a new idea. Thank you 🙏