Preface

In front-end development, caching speeds up page loading, and because a cached resource can be reused many times, it also reduces traffic and bandwidth costs.

There are many kinds of cache: CDN caches, database caches, proxy server caches, and browser caches. This article looks at browser caching in web development, a topic that is often asked about and often used in real projects. Understanding how caching works is something every front-end developer needs to learn. If you like this article, comments are welcome, and so are stars on my GitHub blog.

Main text

The browser caching discussed here is primarily HTTP caching, that is, caching at the protocol layer. The storage APIs and client-side databases introduced with HTML5 are application-level caches and are outside the scope of this analysis. Let's begin.

Protocol-level caching can be divided into two kinds: mandatory caching and comparison caching.

Mandatory cache

First, let's look at a sequence diagram for mandatory caching to understand the request pattern in its different cases:

From the figure, it is not hard to see that the browser goes to the server for the latest resource only after the mandatory cache has expired. At the protocol level, two header fields can trigger mandatory caching: Expires and Cache-Control.

From the HTTP/1.0 era: Expires

The earliest of these is the Expires field, which gives the time at which the cached copy expires. The server computes it as its own current time plus the intended cache lifetime and returns it to the client in the response header. The value is therefore an absolute time, for example:

Expires: Fri, 10 Nov 2017 08:45:11 GMT

Setting this field in the response header tells the browser that it does not need to request again until it expires.

However, there are drawbacks to setting this field:

Because the time is absolute, it is judged against the client's clock. If the user changes the local time, the browser may decide that the cache has expired and request the resource again. Likewise, if the client and server clocks disagree, the cache may expire earlier or later than intended.
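The clock problem can be made concrete with a minimal Python sketch. The helper name is_fresh_by_expires and the sample times are illustrative assumptions, not any browser's actual implementation; the point is only that an absolute expiry is judged against whatever the client's clock says.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Return True while the client's clock is still before the Expires time."""
    expires_at = parsedate_to_datetime(expires_header)
    return client_now < expires_at


expires = "Fri, 10 Nov 2017 08:45:11 GMT"
true_now = datetime(2017, 11, 10, 8, 0, 0, tzinfo=timezone.utc)

# With a correct clock the entry is still fresh (about 45 minutes remain).
print(is_fresh_by_expires(expires, true_now))
# With the client clock set one hour ahead, the same entry already looks expired.
print(is_fresh_by_expires(expires, true_now + timedelta(hours=1)))
```

Nothing about the resource changed between the two calls; only the clock the check was made against did.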

Arriving with HTTP/1.1: Cache-Control

Given the shortcomings of Expires, HTTP/1.1 added a new field, Cache-Control, whose max-age directive gives the maximum time for which a cached resource remains valid. During that period the client does not need to send a request to the server.

The difference between the two is that the former is absolute time while the latter is relative time. Let’s take an example to illustrate:

Cache-Control: max-age=2592000

Cache-Control has many directives, and each has a different meaning:

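  1. max-age: the maximum time, in seconds, for which the cached copy is considered fresh, as in the example above

  2. no-cache: the response may be stored, but the cache must revalidate it with the server before every reuse. To forbid storing the response at all, use no-store instead

  3. s-maxage: the same as max-age, but it applies only to shared caches, such as CDN caches, where it takes precedence over max-age

  4. public: the response may be stored by any cache, including caches shared by multiple users

  5. private: the response is meant for a single user and must not be stored by shared caches. Responses to requests that carry HTTP authentication are treated this way by shared caches unless a directive such as public explicitly allows sharing.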

To summarize, Cache-Control has gradually replaced Expires since HTTP/1.1, and when both are present, max-age takes precedence. Because max-age is a relative time, the lifetime itself does not change when the client clock is set differently, so the server and client clocks no longer need to agree. Cache-Control is also far more configurable.
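As a rough illustration of how these fields combine, here is a small Python sketch that works out a response's freshness lifetime. The function names, the plain dict of headers with exact-case keys, and the comma-splitting parser are simplifying assumptions for this example (a real parser must also handle quoted values that contain commas).

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lower-cased directives."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if arg else None
    return directives


def _as_seconds(arg: Optional[str]) -> Optional[float]:
    """Convert a directive argument to seconds, ignoring malformed values."""
    return float(arg) if arg is not None and arg.isdigit() else None


def freshness_lifetime(headers: Dict[str, str], shared_cache: bool = False) -> Optional[float]:
    """Seconds the response stays fresh, or None if no lifetime is given.

    Order of precedence: s-maxage (shared caches only), max-age, then Expires.
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if shared_cache:
        seconds = _as_seconds(cc.get("s-maxage"))
        if seconds is not None:
            return seconds
    seconds = _as_seconds(cc.get("max-age"))
    if seconds is not None:
        return seconds
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return (expires - date).total_seconds()
    return None


modern = {
    "Cache-Control": "public, max-age=2592000, s-maxage=600",
    "Date": "Fri, 10 Nov 2017 08:00:00 GMT",
    "Expires": "Fri, 10 Nov 2017 08:45:11 GMT",
}
print(freshness_lifetime(modern))                     # browser cache: uses max-age
print(freshness_lifetime(modern, shared_cache=True))  # CDN or proxy: uses s-maxage
```

Expires is consulted only when no usable max-age is present, which mirrors the precedence described above.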

Comparison cache

With mandatory caching out of the way, let's look at comparison caching. Recall that with mandatory caching the browser does not need to request the resource at all until the cache expires. So how does comparison caching work?

Without further ado, let's again start with a sequence diagram, this time for comparison caching, as shown below:

With comparison caching, the browser first takes the cached resource's identifier from the cache, then sends a request to the server to check whether the resource has been updated. If it has, the server returns the new data together with a new identifier. Otherwise it returns a 304 status code, telling the client that the cached copy is unchanged and can continue to be used.

This compensates for some of the limitations of mandatory caching. Comparison caching is mainly used for resource files that need to be updated dynamically from time to time.

The header fields the protocol uses for comparison caching are Last-Modified and If-Modified-Since.

A well-matched pair: Last-Modified and If-Modified-Since

Last-Modified: the server tells the client when the resource was last modified, for example:

Last-Modified: Tue, 10 Nov 2015 08:45:11 GMT

If-Modified-Since: on a later request, the client sends back the Last-Modified value it received in this request header. The server compares it with the resource's current last-modified time. If the two are equal, the resource has not been modified and the server responds with 304. Otherwise the resource has been modified, and the server responds with a 200 status code and returns the data.

These fields can be used together with Cache-Control.

But this mechanism has certain flaws:

  1. If the resource changes more than once within a second, the later change goes undetected, because the timestamp has a resolution of one second.

  2. If the file is generated dynamically by the server, its modification time is always the generation time, so the resource looks modified even when its content has not changed, and the cache is never reused.
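The server-side comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: respond_with_last_modified is a hypothetical helper, headers are plain dicts, modified_at must be a timezone-aware UTC datetime, and the check uses "not newer than" rather than strict equality, which is how servers commonly treat the date.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Tuple


def respond_with_last_modified(
    request_headers: Dict[str, str], body: bytes, modified_at: datetime
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 if the client's copy is still current, else 200 with the body."""
    # HTTP dates carry whole seconds only, so sub-second changes are invisible.
    modified_at = modified_at.replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(modified_at, usegmt=True)}

    since_raw = request_headers.get("If-Modified-Since")
    if since_raw is not None:
        try:
            since = parsedate_to_datetime(since_raw)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full response
        if since is not None and since.tzinfo is not None and modified_at <= since:
            return 304, response_headers, b""
    return 200, response_headers, body


mtime = datetime(2015, 11, 10, 8, 45, 11, tzinfo=timezone.utc)
first_status, first_headers, _ = respond_with_last_modified({}, b"hello", mtime)
revalidate = {"If-Modified-Since": first_headers["Last-Modified"]}
second_status, _, _ = respond_with_last_modified(revalidate, b"hello", mtime)
print(first_status, second_status)  # first request gets the body, the second gets 304
```

The first flaw is visible here: a change made half a second after mtime produces the same Last-Modified value, so the client would still receive 304.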

Filling the gaps: ETag

Last-Modified works in most cases, but it still has the flaws described above. For those scenarios we need our other friend, ETag.

The server keeps an ETag value for the file, an identifier that changes whenever the file's content changes, and sends it in the ETag response header. On each later request the client sends that value back in the If-None-Match field, and the server compares the two. If they are equal, the file has not been modified and the server responds with 304. Otherwise the file has been modified, and the server responds with a 200 status code and returns the data.
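The ETag comparison can be sketched the same way. Deriving the tag from a SHA-256 hash of the content is an assumption made for this example (servers are free to generate tags however they like), and make_etag and respond_with_etag are hypothetical names.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the content itself."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(
    request_headers: Dict[str, str], body: bytes
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 when If-None-Match names the current tag, else 200 with the body."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}

    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",")]
    # Weak comparison: a leading W/ marker is ignored when matching.
    candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
    if "*" in candidates or etag in candidates:
        return 304, response_headers, b""
    return 200, response_headers, body


status_1, headers_1, _ = respond_with_etag({}, b"body v1")
revalidation = {"If-None-Match": headers_1["ETag"]}
status_2, _, _ = respond_with_etag(revalidation, b"body v1")
status_3, _, _ = respond_with_etag(revalidation, b"body v2")
print(status_1, status_2, status_3)  # full response, not modified, full response
```

Because the tag depends only on the content, a file that is regenerated with identical bytes still matches, and two changes within the same second produce different tags. Both Last-Modified flaws are avoided.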

Finally, let’s refresh our memory with a schematic:

That covers both types of caching. I hope you now have a general picture in mind and can answer when asked about it. Let's keep making progress together.

Differences caused by browser behavior

Finally, let’s talk about how browser behavior can cause cache changes.

Here is how different user actions change the requests the browser sends:

  1. Ordinary page load => If the cache has not expired, the browser uses it directly. Otherwise, it requests the data from the server.
  2. Manual refresh (F5) => The browser treats the cache as expired, adds Cache-Control: max-age=0 to the request, and asks the server whether the data has been updated.
  3. Forced refresh (Ctrl + F5) => The browser ignores the cache entirely, adds Cache-Control: no-cache to the request, and fetches the file from the server again.
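The three cases can be modeled as a small Python function. This is a simplified model of the list above, not a description of any particular browser: the action names, the function name, and the assumption that the cached entry is still fresh are all hypothetical, and real browsers differ in detail.

```python
from typing import Dict, Optional


def request_headers_for(
    action: str, etag: Optional[str] = None, last_modified: Optional[str] = None
) -> Optional[Dict[str, str]]:
    """Headers sent for a cached, still-fresh resource, or None if no request is made."""
    if action == "load":
        return None  # fresh cache entry: served locally, nothing goes to the server
    if action == "refresh":
        # Treat the entry as expired and ask the server to validate it.
        headers = {"Cache-Control": "max-age=0"}
        if etag is not None:
            headers["If-None-Match"] = etag
        if last_modified is not None:
            headers["If-Modified-Since"] = last_modified
        return headers
    if action == "force_refresh":
        # Ignore the cache: no validators are sent, so a full response comes back.
        return {"Cache-Control": "no-cache"}
    raise ValueError(f"unknown action: {action}")


print(request_headers_for("load"))
print(request_headers_for("refresh", etag='"abc123"'))
print(request_headers_for("force_refresh", etag='"abc123"'))
```

Only the manual refresh carries validators, which is why it is the case that can end in a 304 response.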

Caching on mobile

This caching mechanism may well be sufficient on desktop, where the network is rarely the bottleneck.

Mobile is different: on 2G and 3G networks, every additional request adds noticeably to loading time. So what is wrong with the caching described above? Mandatory caching is not much of a problem, because no request is sent to the server as long as the cache has not expired. With comparison caching, however, the 304 response is a real cost: before every use of the cache, a validation request goes to the server, and that round trip adds network latency even though no content comes back.

A good cache requires two things:

  1. After the data is cached, minimize server requests
  2. If resources are updated, the client resources must be updated as well.

So the usual approach is:

Add an identifier, such as a content hash or a version number, to the resource file name, for example config.f1ec3.js or config.v1.js, and then give the resource a long cache lifetime, such as one year:

Cache-Control: max-age=31536000

This way no 304 round trips occur. When the resource is updated, we change the identifier in its file name, so the new version is published under a new URL instead of overwriting the old file.
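A build step might derive such an identifier from the file content. The sketch below is one possible way to do it, assuming a SHA-256 content hash truncated to five hex characters to resemble config.f1ec3.js; the function name and the default length are illustrative choices, not a convention of any particular build tool.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, length: int = 5) -> str:
    """Insert a short content hash before the extension: config.js -> config.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_name = fingerprinted_name("js/config.js", b"var debug = false;")
new_name = fingerprinted_name("js/config.js", b"var debug = true;")
print(old_name)
print(new_name)
```

Unchanged content always maps to the same name, so it stays cached for the full year, while any edit yields a new name that the browser has never seen and therefore must fetch. The page that references these files has to be revalidated or kept on a short lifetime so that it picks up the new names.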

Conclusion

This article has given an overview of browser caching along with a closer look at each part. The main points are:

  1. Mandatory cache

    • Expires field

    • Cache-Control field

  2. Comparison cache

    • Last-Modified field

    • ETag identifier

  3. Cache changes caused by browser behavior

  4. Caching policies on mobile devices

The section on mobile caching did not go into much detail; it only outlined the caching strategy in common use today. I may write a more detailed article on mobile caching later.

Finally, if you have questions about anything I have written, feel free to discuss them with me, and if there are mistakes, please point them out. If you like my blog, a star is welcome. Let's learn and make progress together. Welcome to my GitHub blog.