Preface

Recently, I was asked about the browser cache strategy in an interview, so I want to refresh my knowledge of caching.

The browser stores recently requested documents locally for the user, and when the user visits the same page again, the browser can load the files directly from the local disk.

Benefits of browser caching:

    1. It avoids redundant data transmission and saves traffic.
    2. It speeds up the user's access to web pages.
    3. It reduces load on the server.

Reading this article, you will learn:

1. Browser cache types
2. Cache priority
3. Strong cache vs. negotiated cache
4. How to configure a cache policy
5. Case analysis: the cache strategy of the Juejin website

I. Types of browser cache

Caching for the Web front end can be roughly divided into database cache, server cache (proxy server cache, CDN cache), and browser cache.

The browser cache itself covers several mechanisms: the HTTP cache, IndexedDB, cookies, localStorage, and so on. We will only discuss the HTTP cache here.

When the browser holds cached data, the cache can be divided into strong cache (also called forced or mandatory cache) and negotiated cache, depending on whether a request has to be sent to the server.

1. Strong cache

When the browser requests a resource and finds a cached copy that has not expired, it does not contact the server and reads the data directly from the local cache.

If the cached copy has expired, the browser requests the resource from the server again and, once the response arrives, stores it in the browser cache according to the cache rules.

So how does the browser determine whether the cached copy has expired?

For the strong cache, two response header fields express the expiration rule: Expires and Cache-Control.

  • 1.1 Expires: The value of Expires is an expiration time returned by the server. On the next request, if the request time is earlier than that expiration time, the cached data is used directly. Expires comes from HTTP 1.0. Browsers now use HTTP 1.1 or later by default, where Cache-Control takes precedence when both fields are present, so Expires mostly serves as a fallback. Another problem is that the expiration time is generated by the server but compared against the client clock, and the two can be out of sync, which can lead to wrong cache hits or misses. HTTP 1.1 therefore introduced Cache-Control.

  • 1.2 Cache-Control: Cache-Control is the most important rule. The common values are private, public, no-cache, max-age, and no-store.

    1. public: the response may be cached by any cache (the client that sent the request, proxy servers, and so on), even if it would not normally be cacheable.
    2. private: the response may be stored only by a single user's private cache, such as that user's local browser, and not by a shared cache (that is, a proxy server must not cache it).
    3. no-cache: the cache must send a request to the origin server for validation (negotiated cache validation) before using a cached copy.
    4. no-store: the cache must not store anything about the client request or the server response, that is, no caching is used.
    5. max-age=<seconds>: the response is considered fresh for the given number of seconds after it was generated.
  • 1.3 The relationship between Expires and Cache-Control:

Similarities:

    1. Both are strong cache rules.

Differences:

    1. Expires is an HTTP 1.0 rule; Cache-Control is an HTTP 1.1 rule.
    2. Expires is an absolute time, which easily causes errors when clocks disagree. Cache-Control max-age is a relative time, so it does not have this problem.
    3. Both can appear in the same response, but only one takes effect. An HTTP 1.0 cache that does not understand Cache-Control uses Expires. An HTTP 1.1 cache uses Cache-Control max-age and ignores Expires when both are present. Since HTTP 1.1 or later is now the norm, Expires exists mainly for backward compatibility.
    4. Cache-Control has more options and is more powerful, so it is the recommended choice. Expires can only express an expiration time and is not recommended on its own.
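The precedence between the two fields can be sketched in a few lines of Python. This is a simplified illustration, not how any particular browser implements it: the function names parse_cache_control and is_fresh are hypothetical, the stored response is assumed to be a plain dict of headers, and details such as the Age and Date headers or heuristic freshness are left out.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control header into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_fresh(headers, stored_at, now):
    """Return True if a stored response may be reused without asking the server."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return False  # no-store: nothing to reuse; no-cache: must revalidate first
    if cc.get("max-age") is not None:
        # Relative lifetime; wins over Expires when both are present.
        return now - stored_at < timedelta(seconds=int(cc["max-age"]))
    if "Expires" in headers:
        try:
            # Absolute time chosen by the server; relies on the clocks agreeing.
            return now < parsedate_to_datetime(headers["Expires"])
        except (TypeError, ValueError):
            return False  # an unparsable date is treated as already expired
    return False


# Example: a response stored at noon UTC with a 60-second lifetime.
stored_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
headers = {"Cache-Control": "max-age=60"}
print(is_fresh(headers, stored_at, stored_at + timedelta(seconds=30)))  # True
```

Because max-age is measured from the moment the response was stored, the result does not depend on the server clock, which is the advantage over Expires described above.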

2. Negotiated cache

When a resource is requested, the browser sends a request to the server, and the server compares its copy with the browser's local copy to verify whether the local copy is still valid.

Negotiated caching typically uses the header pairs Last-Modified / If-Modified-Since and ETag / If-None-Match. The server decides whether the browser cache can be used.

  • Last-Modified / If-Modified-Since

Last-Modified: when the server responds to a request, it tells the browser when the resource was last modified.

If-Modified-Since: when the browser requests the resource again, it sends back the last modification time it received from the server.

If the resource's last modification time on the server is earlier than or equal to If-Modified-Since, the server returns status code 304 with no body, telling the browser to keep using the saved copy. If it is later than If-Modified-Since, the resource has been modified, and the server returns status code 200 with the new content.
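The server-side comparison can be sketched as follows. This is a minimal illustration under stated assumptions: respond_if_modified_since is a hypothetical helper, request headers are a plain dict, and last_modified is a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_if_modified_since(request_headers, last_modified):
    """Return (status, response_headers) for a resource changed at `last_modified`."""
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            if last_modified <= parsedate_to_datetime(since):
                return 304, response_headers  # unchanged: browser reuses its copy
        except (TypeError, ValueError):
            pass  # unparsable date: fall through and send the full resource
    return 200, response_headers


# Example: the first request carries no validator, the second echoes Last-Modified.
changed_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
status, headers = respond_if_modified_since({}, changed_at)
print(status, headers["Last-Modified"])
print(respond_if_modified_since({"If-Modified-Since": headers["Last-Modified"]}, changed_at)[0])
```

Truncating to whole seconds mirrors the format of the header itself, and it is also the reason two changes within the same second cannot be told apart by this mechanism.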

  • ETag / If-None-Match

ETag: when the server responds to a request, it sends a unique identifier for the current version of the resource (the generation rule is determined by the server).

If-None-Match: when the browser requests the resource again, it uses this field to send the identifier of its cached copy. The server compares the If-None-Match value with the identifier of the current resource. If they differ, the resource has changed, and the server returns the entire resource with status code 200. If they match, the resource has not been modified, and the server responds with HTTP 304, telling the browser to keep using the saved copy.
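Since the generation rule is left to the server, the sketch below simply assumes one possible rule: a truncated SHA-256 hash of the response body. The helper names make_etag and respond_if_none_match are hypothetical, and request headers are again a plain dict.

```python
import hashlib


def make_etag(body):
    """Derive an ETag from the response body (one possible generation rule)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_if_none_match(request_headers, body):
    """Return (status, response_headers, payload) for the current `body` bytes."""
    etag = make_etag(body)
    candidates = [t.strip() for t in request_headers.get("If-None-Match", "").split(",")]
    # If-None-Match uses weak comparison, so a leading W/ prefix is ignored.
    candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
    if "*" in candidates or etag in candidates:
        return 304, {"ETag": etag}, b""  # identifiers match: no body is sent
    return 200, {"ETag": etag}, body  # changed or first request: full resource


# Example: the second request echoes the ETag from the first response.
css = b"body { color: red; }"
status, headers, _ = respond_if_none_match({}, css)
print(status)
print(respond_if_none_match({"If-None-Match": headers["ETag"]}, css)[0])
```

Hashing the body on every request is the extra server cost that the comparison with Last-Modified refers to; real servers often cache the value or derive it from file metadata instead.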

ETag vs. Last-Modified:

    1. Accuracy: ETag is better than Last-Modified. Last-Modified is accurate only to the second, so if the resource changes several times within one second, only ETag can detect the change and make the server return the latest resource.
    2. Performance: Last-Modified is better than ETag, because Last-Modified only records a time, whereas ETag typically requires the server to compute a value such as a hash, so it is slightly more expensive.
    3. Priority: ETag takes precedence over Last-Modified, and the two can be used together. After the local copy expires, the browser sends a request whose headers contain If-None-Match and If-Modified-Since, so the server can verify whether the local copy matches its own. The server checks the ETag first: if If-None-Match matches, it returns 304 and the browser loads the local copy. If it does not match, the server returns the requested resource with status 200, together with the new ETag and Last-Modified values. According to the HTTP specification, a server that receives If-None-Match ignores If-Modified-Since, so the modification time is consulted only when no ETag is sent.

3. The difference between strong cache and negotiated cache

When either the strong cache or the negotiated cache hits, the resource is read from the local cache. The difference is that a strong cache hit requires no request to the server, whereas the negotiated cache always sends a request to the server to negotiate, whether or not the cached copy ends up being used.

The two kinds of rules can exist at the same time, and the strong cache has higher priority than the negotiated cache. If the strong cache rule applies, the cached copy is used directly and the negotiated cache rule is not evaluated. If the strong cache rule does not apply (the copy has expired), the browser falls back to the negotiated cache.

II. Cache request process

Browser caching process:
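The process described in the previous section can be sketched as a single function. Everything here is a simplification for illustration: load and the fetch callable are hypothetical, time is a plain number of seconds, the cache is a dict keyed by URL, and the response lifetime is assumed to arrive already parsed under a made-up "max_age" key rather than as a real Cache-Control header.

```python
def load(url, cache, now, fetch):
    """Return (source, body), where source says where the body came from."""
    entry = cache.get(url)
    if entry is not None and now < entry["fresh_until"]:
        return "strong cache", entry["body"]  # no request is sent at all

    validators = {}
    if entry is not None:
        # The copy is stale: ask the server whether it is still valid.
        if entry.get("etag"):
            validators["If-None-Match"] = entry["etag"]
        if entry.get("last_modified"):
            validators["If-Modified-Since"] = entry["last_modified"]

    status, headers, body = fetch(url, validators)
    if status == 304 and entry is not None:
        # Negotiated cache hit: keep the old body, extend its lifetime.
        entry["fresh_until"] = now + headers.get("max_age", 0)
        return "negotiated cache", entry["body"]

    # Full response: store it according to the cache rules.
    cache[url] = {
        "body": body,
        "etag": headers.get("ETag"),
        "last_modified": headers.get("Last-Modified"),
        "fresh_until": now + headers.get("max_age", 0),
    }
    return "server", body
```

The order of the checks reflects the priority rule: the strong cache is consulted first, and the conditional request is built only when the stored copy is no longer fresh.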

III. Case analysis

Let us analyze the cache strategy for some of the files on Juejin (Nuggets):

Note:

  • HTML: the cache lifetime is 0s, so every time the page is loaded the browser must revalidate with the origin server (negotiated cache).
  • CSS: changes infrequently, so local caching is allowed with a strong cache lifetime, which varies between CSS files. The negotiated cache is used only after the strong cache has expired.
  • JS: local caching is allowed with a strong cache lifetime (set per JS file as needed). The negotiated cache is used only after the strong cache has expired.
  • Image: images change infrequently, so local caching is allowed with a strong cache lifetime.
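A policy of this shape could be configured on the server roughly as follows. The lifetimes below are hypothetical values chosen for illustration, not the ones the site actually uses, and cache_control_for is a made-up helper that a server might call when building response headers.

```python
import posixpath

# Hypothetical lifetimes, for illustration only.
POLICIES = {
    ".html": "no-cache",                 # always revalidate with the server
    ".css": "public, max-age=86400",     # one day
    ".js": "public, max-age=86400",      # one day
    ".png": "public, max-age=2592000",   # thirty days
    ".jpg": "public, max-age=2592000",   # thirty days
}


def cache_control_for(path):
    """Pick a Cache-Control value from the file extension of `path`."""
    ext = posixpath.splitext(path)[1].lower()
    # Unknown types fall back to revalidation, the conservative choice.
    return POLICIES.get(ext, "no-cache")


print(cache_control_for("/index.html"))     # no-cache
print(cache_control_for("/static/app.js"))  # public, max-age=86400
```

Sending ETag or Last-Modified alongside these values lets the negotiated cache take over once the strong cache lifetime has passed.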

Conclusion

Generally, the cache policies for the different file types on a page, such as CSS, JS, and images, are roughly the same: both strong cache and negotiated cache rules are present. For the strong cache, max-age gives the lifetime of the local copy, and its value is generally chosen per file type. For the negotiated cache, the server uses the Last-Modified and ETag validators to verify whether the client's copy is still valid. This article has given a brief introduction to the browser caching policies for each type of resource on the site. Some files, however, have special cache settings: for example, many JS, CSS, and image files on a page have a version number added to their URL, so that a new version forces the browser to refresh its cache.