Front-end caching best practices

Preface

Caching is a well-worn topic and a staple of front-end interviews.

This article focuses on how to configure caching in a real project and proposes a reasonable scheme.

Strong and negotiated caching

When we talk about caching, we usually distinguish between the strong cache and the negotiated cache. The main difference is whether the client must check with the server that its local copy is still valid before using it. A negotiated cache, as the name implies, negotiates with the server to decide whether the local copy can be used.

Problems with the two caching schemes

Strong cache

As we know, strong caching is mainly controlled by the Cache-Control and Expires fields in the HTTP response header. Expires comes from the HTTP/1.0 standard, so we can ignore it here and focus on Cache-Control.

Generally, we set Cache-Control to “public, max-age=xxx”, which means that if the resource is requested again within xxx seconds, the browser uses its cached copy and sends no request to the server.
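
To make the max-age rule concrete, here is a minimal sketch of the freshness check a cache performs. It is a simplified model, not the browser's actual implementation; the function names parseMaxAge and isFresh are made up for illustration, and only the max-age directive is considered.

```javascript
// Simplified model of a strong-cache freshness check.
// parseMaxAge and isFresh are illustrative names, not a browser API.

// Extract the max-age value (in seconds) from a Cache-Control header value.
function parseMaxAge(cacheControl) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl || '');
  return match ? Number(match[1]) : null;
}

// storedAtMs: when the response was cached; nowMs: the current time.
function isFresh(cacheControl, storedAtMs, nowMs) {
  const maxAge = parseMaxAge(cacheControl);
  if (maxAge === null) {
    return false; // no max-age: this sketch treats the copy as not fresh
  }
  const ageSeconds = (nowMs - storedAtMs) / 1000;
  return ageSeconds < maxAge; // still fresh: no request is sent
}

const storedAt = Date.now() - 30 * 1000; // cached 30 seconds ago
console.log(isFresh('public, max-age=60', storedAt, Date.now()));
```

While isFresh returns true, the server is never contacted, which is exactly why an update on the server goes unnoticed.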

Obviously, if the resource on the server is updated within those xxx seconds, the client keeps seeing the old content unless the user forces a refresh. If you are in no hurry and can accept that, isn't it perfect? Most of the time, though, it is not that simple. If a new release also changes the back-end API at the same time, it's game over: users with a cached page keep calling the old interface, and that interface has already been removed on the back end. What then?

Negotiated cache

The biggest problem with the negotiated cache is that every use requires checking with the server whether the cached copy is still valid. That seems to save trouble: whatever happens, the server tells me whether my copy is valid. For an ambitious developer, however, this is not acceptable. If every request goes to the server anyway, what is the point of caching?

Best practices

The point of caching is to reduce requests and use local resources as much as possible, which gives users a better experience and takes load off the server. The best practice, therefore, is to hit the strong cache as often as possible and to invalidate the client's cache whenever an update is released.

How do we make users fetch the latest resource files right after an update? Resourceful front-end developers came up with an answer: change the path of the static resources with each update. To the browser, the new path is a resource it has never requested, so no stale cache can be involved.

Conveniently, webpack lets you put a hash value in the file name when bundling.

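const path = require('path');

module.exports = {
    entry: {
        main: path.join(__dirname, './main.js'),
        vendor: ['react', 'antd']
    },
    output: {
        path: path.join(__dirname, './dist'),
        publicPath: '/dist/',
        // [chunkhash] puts the chunk's hash into the emitted file name
        filename: 'bundle.[chunkhash].js'
    }
};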

To sum up, a reasonable caching scheme is:

HTML: use the negotiated cache.
CSS, JS, and images: use the strong cache, with a hash value in the file name.
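
The scheme above can be sketched as a small function that picks a Cache-Control value from the file type. This is only an illustration: the function name cacheControlFor, the list of extensions, and the one-year max-age are assumptions, not values from any particular project.

```javascript
const path = require('path');

// Assumption: files with these extensions are emitted with a hash in their name.
const HASHED_ASSET_EXTENSIONS = new Set(['.js', '.css', '.png', '.jpg', '.svg']);

// cacheControlFor is a hypothetical helper for this sketch.
function cacheControlFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  if (HASHED_ASSET_EXTENSIONS.has(ext)) {
    // Strong cache: the URL changes whenever the content changes,
    // so a long lifetime (one year here, an arbitrary choice) is safe.
    return 'public, max-age=31536000';
  }
  // HTML and everything else: revalidate with the server on every use.
  return 'public, max-age=0';
}

console.log(cacheControlFor('/dist/bundle.3f2a9c.js'));
console.log(cacheControlFor('/index.html'));
```

Keeping HTML on the negotiated cache is what makes the rest work: the HTML is the entry point that references the hashed file names, so it must always be up to date.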

Hashing is also tricky

Webpack provides three ways to compute the hash: hash, chunkhash, and contenthash. What is the difference between them?

hash: tied to the build of the whole project. Every file generated by a build gets the same hash value, and that value changes whenever any file in the project changes.
chunkhash: webpack resolves the dependencies of each entry file, builds the corresponding chunk, and generates a hash value per chunk.
contenthash: generated from the content of the file itself, so the value changes only when that content changes.

Obviously, we are not going to use the first one. Change one file and, after bundling, the hashes of all the other files change too, which invalidates their caches. That is not what we want.

So when do we use chunkhash and when contenthash? In real projects, we usually extract the CSS into separate CSS files and reference those. With chunkhash, changing only the CSS code changes the hash of the CSS file and also the hash of the JS file from the same chunk. This is where contenthash comes in handy: each file is hashed from its own content, so the JS file keeps its name.
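
The difference can be illustrated with a toy model. This is not webpack's real hashing algorithm; shortHash, chunkHashNames, and contentHashNames are invented for this sketch, and a chunk is reduced to one JS string and one CSS string.

```javascript
const crypto = require('crypto');

// Short hex digest of a string (illustrative helper).
function shortHash(text) {
  return crypto.createHash('md5').update(text).digest('hex').slice(0, 8);
}

// chunkhash model: one hash per chunk, shared by every file emitted from it.
function chunkHashNames(chunk) {
  const hash = shortHash(chunk.js + chunk.css);
  return { js: `main.${hash}.js`, css: `main.${hash}.css` };
}

// contenthash model: each emitted file is hashed from its own content.
function contentHashNames(chunk) {
  return {
    js: `main.${shortHash(chunk.js)}.js`,
    css: `main.${shortHash(chunk.css)}.css`,
  };
}

// Two builds that differ only in the CSS.
const build1 = { js: 'console.log(1)', css: 'body { color: red }' };
const build2 = { js: 'console.log(1)', css: 'body { color: blue }' };

// chunkhash model: the JS name changes although the JS did not.
console.log(chunkHashNames(build1).js === chunkHashNames(build2).js);
// contenthash model: the JS name is stable, only the CSS name changes.
console.log(contentHashNames(build1).js === contentHashNames(build2).js);
```

In the contenthash model, the browser can keep using its cached JS file after a CSS-only change.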

ETag calculation

Nginx

Nginx's default ETag has the form "<last-modified time in hexadecimal>-<file length in hexadecimal>". Example: ETag: "59e72c84-2404"
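
A minimal sketch of the same format in Node.js, assuming the modification time is taken in Unix seconds and the length in bytes. The function names nginxStyleEtag and etagForFile are made up for illustration; this imitates the format described above and is not Nginx's source code.

```javascript
const fs = require('fs');

// mtimeSeconds: last-modified time in Unix seconds; sizeBytes: file length.
function nginxStyleEtag(mtimeSeconds, sizeBytes) {
  return '"' + mtimeSeconds.toString(16) + '-' + sizeBytes.toString(16) + '"';
}

// Convenience wrapper that reads both values from the file system.
function etagForFile(filePath) {
  const stat = fs.statSync(filePath);
  return nginxStyleEtag(Math.floor(stat.mtimeMs / 1000), stat.size);
}
```

Because the value depends only on the modification time and the length, it is cheap to compute, but it also changes when a file is rewritten with identical content.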

Express

Express uses the serve-static middleware to configure caching, and under the hood an npm package called etag computes the ETag. Its source code shows two calculation methods:

Method 1: Use the file size and modification time

function stattag (stat) {
  // modification time (ms since epoch) and size, both as hexadecimal strings
  var mtime = stat.mtime.getTime().toString(16)
  var size = stat.size.toString(16)

  return '"' + size + '-' + mtime + '"'
}

Method 2: Use the hash value and content length of the file

var crypto = require('crypto')

function entitytag (entity) {
  if (entity.length === 0) {
    // fast-path empty
    return '"0-2jmj7l5rSw0yVb/vlWAYkK/YBwk"'
  }

  // compute hash of entity
  var hash = crypto
    .createHash('sha1')
    .update(entity, 'utf8')
    .digest('base64')
    .substring(0, 27)

  // compute length of entity
  var len = typeof entity === 'string'
    ? Buffer.byteLength(entity, 'utf8')
    : entity.length

  return '"' + len.toString(16) + '-' + hash + '"'
}

Does ETag or Last-Modified take precedence?

The negotiated cache uses two fields, ETag and Last-Modified. When both are present, which one takes precedence?

Express uses the fresh package to decide whether the client's copy is still up to date. The main source code is as follows:

function fresh (reqHeaders, resHeaders) {
  // fields
  var modifiedSince = reqHeaders['if-modified-since']
  var noneMatch = reqHeaders['if-none-match']

  // unconditional request
  if (!modifiedSince && !noneMatch) {
    return false
  }

  // Always return stale when Cache-Control: no-cache
  // to support end-to-end reload requests
  // https://tools.ietf.org/html/rfc2616#section-14.9.4
  var cacheControl = reqHeaders['cache-control']
  if (cacheControl && CACHE_CONTROL_NO_CACHE_REGEXP.test(cacheControl)) {
    return false
  }

  // if-none-match
  if (noneMatch && noneMatch !== '*') {
    var etag = resHeaders['etag']

    if (!etag) {
      return false
    }

    var etagStale = true
    var matches = parseTokenList(noneMatch)
    for (var i = 0; i < matches.length; i++) {
      var match = matches[i]
      if (match === etag || match === 'W/' + etag || 'W/' + match === etag) {
        etagStale = false
        break
      }
    }

    if (etagStale) {
      return false
    }
  }

  // if-modified-since
  if (modifiedSince) {
    var lastModified = resHeaders['last-modified']
    var modifiedStale = !lastModified || !(parseHttpDate(lastModified) <= parseHttpDate(modifiedSince))

    if (modifiedStale) {
      return false
    }
  }

  return true
}

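We can see that when the request is not a forced refresh and carries both If-None-Match and If-Modified-Since, the ETag is checked first and Last-Modified second. If the ETag does not match, the response is treated as stale right away; if it matches, the Last-Modified check must also pass. Of course, if you don't like this strategy, you can implement your own.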

Addendum: How to set up the backend

The above is mostly about how the front end is bundled, but what about the back end? As we know, the browser decides how to cache based on the relevant fields in the response header, so the key on the back end is to return the right cache fields for each request. Taking Node.js as an example, if we want the browser to use the strong cache, we can set:

res.setHeader('Cache-Control', 'public, max-age=xxx');

If the negotiated cache is needed, we can set:

res.setHeader('Cache-Control', 'public, max-age=0');
res.setHeader('Last-Modified', xxx);
res.setHeader('ETag', xxx);
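
Setting the headers is only half of the negotiation: the server also has to answer the follow-up conditional request. Below is a minimal sketch of a Node.js handler that does both. The in-memory resource object and the handleRequest name are hypothetical, and the comparison is deliberately simplified (exact If-None-Match equality only, no weak ETags, no If-Modified-Since); a real server would use something like the fresh logic shown earlier.

```javascript
// Hypothetical in-memory resource; a real server would read it from disk.
const resource = {
  body: '<h1>hello</h1>',
  etag: '"v1"',
  lastModified: new Date('2020-01-01T00:00:00Z').toUTCString(),
};

function handleRequest(req, res) {
  res.setHeader('Cache-Control', 'public, max-age=0');
  res.setHeader('Last-Modified', resource.lastModified);
  res.setHeader('ETag', resource.etag);

  // Node.js lower-cases incoming header names.
  if (req.headers['if-none-match'] === resource.etag) {
    res.statusCode = 304; // the cached copy is still valid: send no body
    res.end();
    return;
  }

  res.statusCode = 200;
  res.end(resource.body);
}

// To try it out:
// require('http').createServer(handleRequest).listen(3000);
```

With max-age=0 the browser revalidates on every use, and a 304 response lets it reuse the cached body without downloading it again.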

Of course, many existing libraries make this easy to configure. I wrote a simple demo to help readers who want to understand the principle; if you are interested, you can read its source code.

Conclusion

When setting up front-end caching, we make the strong cache last as long as possible and handle version updates by putting a hash in the file name. When splitting code, shared libraries that rarely change should be bundled separately so that their caches stay valid longer.

That's all. If you find any mistakes, corrections are welcome!

@Author: TDGarden