Browser cache rules

What is browser caching?

Every browser implements HTTP caching. When the browser requests resources from a server over HTTP, it stores some of the responses in the memory cache or disk cache. On a later visit to the site, the browser can use the cached resources directly, which improves the user experience.

Advantages of browser caching

  • Reduces pressure on the server, which does not have to read from disk for every request
  • Improves the user experience, since pages load significantly faster
  • Saves traffic and reduces bandwidth consumption

The evolution of the browser cache

Browser caching currently comes in two forms:

  • Strong (forced) caching (Expires, Cache-Control, Pragma fields)
  • Negotiated caching (Last-Modified/If-Modified-Since, ETag/If-None-Match fields)

Walking through the history of browser caching shows why both types exist.

Example of a request:

Size of request header: 1KB
Size of response header: 1KB
Size of response content: 10KB
Complete request: 12KB

  1. Before browser caching

Before caching existed, every request went to the server, and each time the server read the resource from disk and sent it back to the client.

Ten requests cost 10 * 12KB = 120KB of traffic and ten request round trips.

Both traffic and time grow linearly with the number of requests.

Disadvantages:

  • Wastes user traffic
  • Puts pressure on the server
  • Poor user experience

  2. Introducing browser caching

Procedure: the browser first requests a.js and caches it on local disk (1 + 10 + 1 = 12KB). When the browser requests a.js again, it reads the file directly from the browser cache (200, from cache) without sending a request to the server (0KB)…

N requests: 12KB

Advantages:

  • Saves user traffic
  • Reduces server pressure
  • Very good user experience

Disadvantages:

  • The browser cannot tell when a server resource changes, so it keeps using the out-of-date cache
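
The traffic figures for the first two steps can be checked with a small sketch. The sizes come from the example request above; the function names are only illustrative.

```python
# Sizes in KB, taken from the example request above.
REQUEST_HEADER_KB = 1
RESPONSE_HEADER_KB = 1
RESPONSE_BODY_KB = 10
FULL_REQUEST_KB = REQUEST_HEADER_KB + RESPONSE_HEADER_KB + RESPONSE_BODY_KB


def traffic_without_cache(n_requests: int) -> int:
    """Every request goes to the server and transfers the full 12KB."""
    return n_requests * FULL_REQUEST_KB


def traffic_with_naive_cache(n_requests: int) -> int:
    """Only the first request reaches the server; later ones are served locally."""
    return FULL_REQUEST_KB if n_requests > 0 else 0


print(traffic_without_cache(10))     # 120
print(traffic_with_naive_cache(10))  # 12
```
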
  3. The server tells the browser when the resource expires

First request

Browser requests (1KB)

The server returns the resource together with its Expires time, and the browser stores both on disk (10KB + 1KB).

Second request

When the browser requests the resource again, it checks the Expires field, finds that the resource has not expired, and uses the cache directly (0KB).

On a later request, the browser finds that the resource has expired and sends a new request to the server to obtain the new resource (12KB).

N requests: traffic no longer depends on the number of requests, only on how often the cache expires.

Advantages: Saves a lot of user traffic and relieves pressure on the server. After the expiration time, the client obtains the new server resource along with its next expiration time.

Disadvantages: After expiration, the resource is requested again in full whether or not it has actually changed.

Summary: The procedure above is HTTP/1.0 forced caching: within the expiration time, the cached content is always used.
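
The freshness check based on Expires can be sketched as follows. This is a minimal illustration, not how any particular browser implements it; the function name is hypothetical, and treating an unparseable Expires value as already expired is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, now: datetime) -> bool:
    """Return True while the cached copy may be used without contacting the server."""
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Assumption: an unparseable Expires value counts as already expired.
        return False
    if expires_at.tzinfo is None:
        # HTTP dates are always GMT; make the value timezone-aware.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


# `now` must be timezone-aware. Because it is read from the client clock,
# a wrong client clock makes this check wrong as well.
now = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
print(is_fresh_by_expires("Wed, 21 Oct 2015 07:28:00 GMT", now))  # True
```
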
  4. The server tells the browser when the resource was last modified

To address this drawback of forced caching, the server also returns a Last-Modified field indicating when the resource was last modified.

First request

The browser makes the request (1KB). The server returns the resource, its expiration time (Expires), and its last modification time (Last-Modified) to the browser (10KB + 1KB).

Second request

When the browser requests the resource again and finds that it has not expired, it uses the cache directly (0KB).

When the browser requests the resource after it has expired, it sends the request to the server with an If-Modified-Since field, set to the Last-Modified value it received earlier. The server compares the resource's last modification time with the value sent by the client.

If they match, the server returns 304 to tell the client that the resource has not changed and that it should use the cache (1KB + 1KB = 2KB).

If they differ, the server returns 200 to tell the client that the resource has changed, along with the new content and the corresponding Last-Modified and Expires fields (12KB).

The process then repeats…

Advantages: Saves a lot of user traffic and reduces pressure on the server. After expiration, if the server finds that the file has not changed, it does not send a.js again, saving 10KB of traffic. If the file has changed, the server sends the latest a.js, so the browser gets the current version.

Disadvantages: Last-Modified is only accurate to the second, so it cannot capture a file that changes more than once within a second. Also, when a file's modification time changes but its content does not (for example, the file is modified and then changed back), the resource is still sent again.

Summary: The procedure above is HTTP/1.0 negotiated caching: after the cache expires, the client and server negotiate, and the server tells the client whether to use the local cache.

Drawbacks of HTTP/1.0 caching: Expires uses an absolute time, and the user can adjust the client clock, so the cache lifetime is unreliable. Last-Modified is only accurate to the second, and a resource whose modification time changed without its content changing is still sent again.
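
The server side of this negotiation can be sketched as below. The function name is hypothetical and only the status code is returned; a real server would also send headers and a body. The example also shows the one-second blind spot described above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond_with_last_modified(if_modified_since: Optional[str],
                               last_modified: datetime) -> int:
    """Return the status code the server would send: 304 or 200."""
    if if_modified_since is None:
        return 200  # first request: nothing to compare against
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # assumption: ignore a malformed header and send the resource
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so sub-second changes are invisible.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200


mtime = datetime(2024, 1, 1, 12, 0, 0, 250000, tzinfo=timezone.utc)
header = format_datetime(mtime, usegmt=True)  # value the server sent as Last-Modified
same_second_edit = mtime.replace(microsecond=900000)
print(respond_with_last_modified(header, same_second_edit))  # 304, the edit is missed
```
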
  5. The server adds a relative time (Cache-Control) to control the forced cache

The inaccuracy of Expires comes from its use of absolute time, so HTTP/1.1 introduced a relative time: the server tells the browser how many seconds the resource stays fresh, and the browser need not contact the server during that period.

Cache-Control: max-age=100

Other values for the Cache-Control field are covered below.

Cache-Control and Expires can be present at the same time. HTTP/1.1 takes precedence over HTTP/1.0, so when Cache-Control: max-age is present, Expires is ignored.

Advantages: Fixes the problem with HTTP/1.0 forced caching, where the Expires time could be inaccurate.

Disadvantages: Still has the general drawback of forced caching: after expiration, the resource is requested again in full even if the content has not changed.
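
The precedence of max-age over Expires can be sketched as follows. This is a simplified illustration: the function name is hypothetical, and it assumes the lifetime is counted from the moment the response was stored, whereas real caches also account for time spent in transit and in intermediate caches.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_deadline(stored_at: datetime,
                       cache_control: Optional[str],
                       expires: Optional[str]) -> Optional[datetime]:
    """Return the moment the cached copy stops being fresh, or None if unknown."""
    if cache_control:
        match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
        if match:
            # Relative lifetime: independent of the wall-clock time set by the server.
            return stored_at + timedelta(seconds=int(match.group(1)))
    if expires:
        try:
            return parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return None
    return None


stored = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
deadline = freshness_deadline(stored, "max-age=100", "Mon, 01 Jan 2024 13:00:00 GMT")
print(deadline - stored)  # 0:01:40, Expires is ignored
```
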
  6. The server adds ETag/If-None-Match to detect changes in file content

Cache-Control solved the forced caching problem in HTTP/1.0, but the problems of Last-Modified negotiated caching remained, so HTTP/1.1 introduced ETag/If-None-Match.

First request

The browser makes the request (1KB).

The server returns the resource together with its absolute expiration time (Expires), relative lifetime (Cache-Control: max-age=10), last modification time (Last-Modified), and unique identifier (ETag) (10KB + 1KB).

The browser stores these fields and resource contents on disk or in memory.

Second request

Within 10 seconds, the browser requests again, finds that the resource has not expired (Cache-Control takes priority over Expires), and uses the cache (0KB).

After 10 seconds, the browser requests the resource again, finds that it has expired, and sends the request to the server with the If-Modified-Since and If-None-Match fields. The server compares If-None-Match with its own current ETag value for the resource.

If they match, the server returns 304 to tell the client that the resource has not changed and that it should use the cache (2KB).

If they differ, the server returns 200 to tell the client that the resource has changed, along with the new content and the corresponding Last-Modified, Expires, Cache-Control, and ETag fields (12KB).

The process then repeats…

Note: The ETag changes only when the content of the resource itself changes.

Advantages: Fixes the problems with HTTP/1.0 negotiated caching: Last-Modified is only accurate to the second, and resources were sent again even when their content had not changed. With ETag, the resource is sent again only when its content actually changes.

Disadvantages: The browser still cannot learn about server resource changes while the cache is fresh.
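
A content-based ETag and the If-None-Match comparison can be sketched like this. Deriving the ETag from a truncated SHA-256 hash is an assumption of this sketch (servers are free to generate ETags differently), the function names are hypothetical, and only a single ETag value in If-None-Match is handled.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(content: bytes) -> str:
    """Derive an ETag from the content itself (assumption: truncated SHA-256)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond_with_etag(if_none_match: Optional[str],
                      content: bytes) -> Tuple[int, bytes]:
    """Return (status, body): 304 with an empty body when the ETag still matches."""
    if if_none_match is not None and if_none_match == make_etag(content):
        return 304, b""
    return 200, content


v1 = b"console.log(1)"
etag_v1 = make_etag(v1)

# A file that is modified and then changed back has the same content, so the
# ETag is the same and the server can still answer 304.
changed_back = b"console.log(1)"
print(respond_with_etag(etag_v1, changed_back)[0])       # 304
print(respond_with_etag(etag_v1, b"console.log(2)")[0])  # 200
```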

How can browsers proactively learn about server resource changes?

The caching process above works well, but one problem remains: if a server resource is updated before the cache expires, the browser keeps using the old cached copy.

The solution is to not use forced caching for HTML.

Process:

HTML requests use only the negotiated cache, so the HTML is validated with the server every time.

When one of the JS, CSS, or image files referenced in the HTML changes, its file name changes too (for example, a.111.js becomes a.222.js). The content of the HTML file changes as a result, so its ETag changes, and the server returns the new HTML file.

For files referenced in the HTML that have not changed, the local cache is used directly.

For files that have changed, the new file name misses the cache, so the file is requested from the server.

This way, every time the page is requested, the browser picks up changes to each file.
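
One way to produce names like a.111.js and a.222.js is to embed a hash of the file content in the file name, as sketched below. The helper names, the use of SHA-256, the 8-character digest, and the 90-day max-age are all assumptions made for illustration; build tools typically do this step automatically.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the extension: a.js -> a.<hash>.js."""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{path.stem}.{digest}{path.suffix}"


def cache_control_for(filename: str) -> str:
    """HTML is always revalidated; fingerprinted assets can be force-cached."""
    if filename.endswith(".html"):
        return "no-cache"
    return "max-age=7776000"  # 90 days, an arbitrary choice for this sketch


old_name = fingerprinted_name("a.js", b"console.log('v1')")
new_name = fingerprinted_name("a.js", b"console.log('v2')")
print(old_name != new_name)             # True: new content, new URL, cache miss
print(cache_control_for("index.html"))  # no-cache
```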

Cache-Control fields

Cache-Control can also be used in request headers, where the directives have different meanings.

Response headers
  • public

Cache-Control: public

In a response header, this indicates that CDNs, cache servers, and other shared caches may cache the response and serve it to other users.

  • private

Cache-Control: private

Indicates that the response can be cached only by the client (browser) and must not be stored in a shared cache. This prevents shared caches from storing responses that contain personal information.

  • no-cache

Cache-Control: no-cache

In a response header, this tells the client or cache server that it may store the resource but must not use forced caching: it has to use negotiated caching to confirm the freshness of the resource with the server before each use.

Here's a question: is max-age=0 the same as no-cache? I think they achieve the same effect, and I'd welcome other opinions.

  • no-store

Cache-Control: no-store

Indicates that the resource must not be stored by the browser or by cache servers.

  • must-revalidate: the resource can be cached, but once it has expired it must be validated with the origin server before being used.

  • proxy-revalidate: like must-revalidate, but applies only to cache servers (shared caches), which must validate the expired resource with the origin server.

  • max-age

Cache-Control: max-age=30

Tells the browser that if it requests the resource again within 30 seconds, it can use the cache directly.
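
The response directives above can be read with a small parser such as the following sketch. The function names are hypothetical, and the parser is deliberately simple: it does not handle quoted values that contain commas, and it looks only at Cache-Control, not at Expires.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def allows_forced_caching(header: str) -> bool:
    """True if the response may be reused for a while without asking the server."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) > 0


print(parse_cache_control("public, max-age=30"))  # {'public': None, 'max-age': '30'}
print(allows_forced_caching("no-cache"))          # False
```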

Request headers
  • no-cache

The client instructs the cache server not to return a cached resource without checking: the cache server must forward the request to the origin server to confirm the freshness of the resource.

  • no-store

Indicates that caches must not store any part of the request or its response.

  • max-age

Cache-Control: max-age=30

In a request header, this indicates that the client accepts a cached resource only if it was cached no more than 30 seconds ago.

When the response header contains Cache-Control: no-cache, does the request header carry a corresponding Cache-Control field?

  • HTML files

Yes. While studying this, I found that when the page is refreshed, the request for the HTML file carries Cache-Control: max-age=0. This is the browser's own behavior: when it identifies an HTML file, it automatically adds Cache-Control: max-age=0 to the request header to avoid fetching an expired cache.

  • Other files

No. Requests for other files do not automatically carry a Cache-Control field.

Browser Cache Example

After exploring the cache fields, let's put them into practice. Chrome applies some optimizations to requests, so the fields and statuses it shows can be inaccurate; we therefore used Firefox and opened the developer console for the project.

With Disable Cache checked, we refreshed the page. All requests bypassed the cache and returned status 200, and the size of each response was displayed.

First request:

Then, we open a random request and look at its request and response headers:

From the response header we can read:

The server tells the browser to cache the resource locally. Within 7776000 seconds, further requests should use the forced cache; after 7776000 seconds, the browser should use the negotiated cache to confirm with the server.

From the request header we can read:

The browser does not use the forced cache and does not accept a copy from a cache server; it sends the request directly to the server to confirm the freshness of the resource.

Request again:

The request returned 304, which is the result of the negotiated cache.

The request header carries the negotiated cache fields. The server finds that they match the values it sent in the response header of the earlier request and returns 304.
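
The browser side of this exchange, copying the validators from the stored response into the next request, can be sketched as follows. The function name and the sample header values are made up for illustration and are not taken from the captured requests.

```python
from typing import Dict


def conditional_request_headers(stored_response_headers: Dict[str, str]) -> Dict[str, str]:
    """Build the negotiated-cache request headers from a stored response."""
    stored = {name.lower(): value for name, value in stored_response_headers.items()}
    headers: Dict[str, str] = {}
    if "etag" in stored:
        headers["If-None-Match"] = stored["etag"]
    if "last-modified" in stored:
        headers["If-Modified-Since"] = stored["last-modified"]
    return headers


# Hypothetical stored response headers.
stored = {
    "ETag": '"abc123"',
    "Last-Modified": "Mon, 01 Jan 2024 12:00:00 GMT",
    "Cache-Control": "max-age=7776000",
}
print(conditional_request_headers(stored))
```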

Caches for different refresh behaviors

Before introducing the refresh behaviors, let's look at the two common cache locations: memory cache and disk cache.

Memory cache: fast reads, short storage time, released after the tab is closed.
Disk cache: slower reads, long storage time, can store large files, still available after the tab is closed.

Browsers have their own rules about which resources are stored where; the basic rule is to use whichever one is less utilized.

  • Normal page refresh

If the tab has not been closed, both the memory cache and the disk cache can be read, and both forced caching and negotiated caching can be used.

  • Forced refresh

A forced refresh bypasses the cache: resources are requested from the server without using cached copies.

  • Entering the URL

If the tab was closed, the memory cache is gone, but the disk cache is still available.

What do browsers and developers need to do about caching?

Front-end developers care about the user experience, and also about the JS, HTML, image, and other files they maintain. When users load static files, caching should be used appropriately to improve the experience. Non-static resources do not need forced caching.

What does the browser do?

  • When it recognizes certain response headers, it sends the corresponding headers with the next request
  • When the user performs a refresh, the browser adds request headers of its own that determine whether the cache is used
  • Different browsers implement caching differently, so results can differ
