
A page obtains a resource over the network in three steps: the browser initiates the request, the server processes it, and the resource is returned in the response. Reusing previously fetched resources saves both request latency and network traffic, which improves the performance of the website.

Caching is the technique of saving a copy of a resource and serving that copy directly on the next request. It is applied in browsers, gateways, servers, and elsewhere. This article focuses on the browser's resource caching mechanisms.

Cache locations

In descending order of priority, the browser looks for a cached resource in the following locations:

  1. Service Worker
  2. Memory cache
  3. Disk cache (HTTP cache)
  4. Push cache

Service Worker

A Service Worker is a script that the browser runs in the background, independently of the web page. It has its own JS execution environment and helps the front-end page complete tasks that should run quietly in the background. Its cache is persistent: entries are not cleared unless the API is called explicitly or the storage quota is exceeded, in which case the browser may clear them.

A resource fetched through the Service Worker's fetch() method is marked as from ServiceWorker in Chrome's Network panel, even if it missed the Service Worker cache and actually went out to the network.

Memory cache

Almost all resources fetched over the network are automatically added to the memory cache by the browser. In most cases, the memory cache associated with a page is invalidated when the browser tab is closed. The memory cache ignores HTTP headers, with one exception: when Cache-Control: no-store is set, the resource is not cached.

Disk cache

The disk cache's policy is defined by the cache-related fields in the HTTP headers. It is persistent storage that actually exists in the file system, so the same resource can be reused across sessions and even across sites (although browsers that partition the HTTP cache by site limit cross-site reuse). The browser automatically evicts the oldest or most likely outdated entries according to its own algorithms.

Push cache

The push cache is part of HTTP/2 and is consulted only when the other three caches miss. It exists only for the duration of the session and is released once the session ends, keeps entries for a short time (about 5 minutes in Chrome), and does not strictly follow the caching directives in the HTTP headers.

The caching process after a resource is requested

  1. The Service Worker's handler decides whether to store the response in its cache.
  2. The disk cache decides whether to store the response based on the HTTP headers. Strong caching takes precedence over negotiated caching.
  3. The memory cache stores a reference to the resource for future use.

Strong caching (Cache-Control and Expires)

The browser determines whether the cached resource has expired based on the Expires and Cache-Control fields. If it has not expired, the browser uses the cached resource directly without contacting the server. If it has expired, the browser falls back to negotiated caching.

Expires is an HTTP/1.0 field that gives the cache expiration date. It is an absolute time that is compared against the client's local clock, so an inaccurate client clock can make the cache expire too early or too late.

Cache-Control is an HTTP/1.1 field. Its max-age directive gives the maximum lifetime of the cached resource, in seconds, relative to the time of the response.

Cache-Control has a higher priority than Expires. When both are present, Cache-Control: max-age is used.
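
The freshness check described above can be sketched roughly as follows. This is a minimal illustration, not how any particular browser implements it: the `CachedResponse` shape and the helper names `parseMaxAge` and `isFresh` are hypothetical, and directives such as no-cache and no-store are deliberately ignored.

```typescript
// Hypothetical shape of a cached response; field names are illustrative only.
interface CachedResponse {
  storedAt: number; // epoch milliseconds when the response was stored
  headers: Record<string, string>; // header names assumed to be lower-cased
}

// Extract max-age (in seconds) from a Cache-Control value, if present.
function parseMaxAge(cacheControl: string | undefined): number | null {
  if (!cacheControl) return null;
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  return match ? Number(match[1]) : null;
}

// Returns true if the cached copy can be used without contacting the server.
function isFresh(entry: CachedResponse, now: number): boolean {
  const maxAge = parseMaxAge(entry.headers["cache-control"]);
  if (maxAge !== null) {
    // Cache-Control: max-age wins over Expires when both are present.
    return now - entry.storedAt < maxAge * 1000;
  }
  const expires = entry.headers["expires"];
  if (expires !== undefined) {
    const expiresAt = Date.parse(expires);
    // An unparseable Expires value is treated as already expired.
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }
  return false; // no explicit freshness information in this sketch
}
```

With both headers present, for example `max-age=60` together with an Expires value one hour ahead, this sketch treats the entry as expired after 60 seconds, which reflects the priority rule above.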

Negotiated caching

When the strong cache has expired, the browser revalidates its cached copy with the server using one of the following two pairs of fields, and the server decides whether the cached content is still valid.

Last-Modified & If-Modified-Since

Last-Modified is the time the resource file was last changed.

  • The server uses Last-Modified to tell the client when the resource was last modified, and the browser records the value together with the resource in its cache.
  • The next time the same resource is requested, the browser writes the stored Last-Modified value into the If-Modified-Since field of the request header.
  • The server compares the If-Modified-Since value in the request header with the resource's current Last-Modified value. If they are equal, the server returns a 304 status code with no body, indicating that the cached copy can be reused, and the browser takes the resource directly from its cache. Otherwise, the cached copy is out of date, and the server returns a 200 status code with the new resource.
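
The server-side comparison in the last step might look something like the sketch below. The function name `handleIfModifiedSince` and the result shape are hypothetical. The sketch also answers 304 when the file is older than the If-Modified-Since value, which is slightly more lenient than strict equality.

```typescript
// Hypothetical result shape for this sketch.
interface ConditionalResult {
  status: 200 | 304;
  headers: Record<string, string>;
  body?: string;
}

function handleIfModifiedSince(
  ifModifiedSince: string | undefined, // request header value, if any
  lastModifiedMs: number, // file modification time in epoch milliseconds
  body: string
): ConditionalResult {
  // HTTP dates have one-second resolution, so milliseconds are dropped.
  const lastModifiedSec = Math.floor(lastModifiedMs / 1000) * 1000;
  const headers = { "last-modified": new Date(lastModifiedSec).toUTCString() };

  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    if (!Number.isNaN(since) && lastModifiedSec <= since) {
      // Not modified since the client's copy: no body is sent.
      return { status: 304, headers };
    }
  }
  // First request, or the resource changed: send the full resource.
  return { status: 200, headers, body };
}
```

On a first request (no If-Modified-Since) the sketch returns 200 with a Last-Modified header. Passing that header value back while the file is unchanged yields 304.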

ETag & If-None-Match

ETag is a unique identifier generated by the server based on the current file content. It will change as the content changes.

  • The server passes the ETag to the browser in a response header.
  • On the next request, the browser sends this value back to the server in the If-None-Match request header.
  • After receiving If-None-Match, the server compares it with the current ETag of the resource. If they match, it returns a 304 status code; otherwise it returns a 200 status code and the new resource.
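
As a rough sketch of the exchange above, a server could derive a tag from the content and compare it with If-None-Match. The names `computeETag` and `handleIfNoneMatch` are hypothetical, and the FNV-1a-style hash is only an illustrative choice; real servers pick their own scheme, and weak validators (the W/ prefix) are not handled here.

```typescript
// Illustrative content hash (32-bit FNV-1a style); an assumption of this
// sketch, not what any particular server uses.
function computeETag(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return `"${hash.toString(16)}"`; // ETag values are quoted strings
}

function handleIfNoneMatch(
  ifNoneMatch: string | undefined, // request header value, if any
  content: string // current content of the resource on the server
): { status: 200 | 304; etag: string; body?: string } {
  const etag = computeETag(content);
  if (ifNoneMatch !== undefined) {
    // The header may carry several comma-separated tags, or "*".
    const candidates = ifNoneMatch.split(",").map((tag) => tag.trim());
    if (candidates.indexOf("*") !== -1 || candidates.indexOf(etag) !== -1) {
      return { status: 304, etag }; // cached copy is still valid
    }
  }
  return { status: 200, etag, body: content };
}
```

Because the tag depends only on the content, any change to the content produces a different tag and therefore a 200 response on the next revalidation.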

ETag has a higher priority than Last-Modified and is more precise. Last-Modified has one-second resolution, so if the file is modified several times within one second, the Last-Modified value does not change. However, ETag costs the server more, because the value has to be computed from the content.

Heuristic cache

If the response header contains none of Expires, Cache-Control: max-age, or Cache-Control: s-maxage, but Last-Modified is set, the browser falls back to a heuristic algorithm known as heuristic caching. The cache lifetime is usually 10% of the difference between the response's Date header and its Last-Modified value, i.e. (Date - Last-Modified) * 10%.
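
The 10% rule can be written down directly. The function name `heuristicFreshnessMs` is hypothetical, and the fixed 10% factor is the common convention described above rather than something every browser is guaranteed to use.

```typescript
// Heuristic freshness lifetime in milliseconds, or null if it cannot be computed.
function heuristicFreshnessMs(
  dateHeader: string, // value of the Date response header
  lastModifiedHeader: string // value of the Last-Modified response header
): number | null {
  const date = Date.parse(dateHeader);
  const lastModified = Date.parse(lastModifiedHeader);
  if (Number.isNaN(date) || Number.isNaN(lastModified) || lastModified > date) {
    return null; // malformed or inconsistent headers: no heuristic lifetime
  }
  // (Date - Last-Modified) * 10%
  return (date - lastModified) / 10;
}
```

For example, a response dated ten days after its Last-Modified time would be considered fresh for one day under this rule.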

Conclusion

When the browser requests a resource, the caches are consulted in this order:

  1. Call the fetch event of the Service Worker.
  2. Check the memory cache.
  3. Check the disk cache. If the strong cache is still valid, the server is not contacted and the status code is 200. If the strong cache has expired, negotiated caching is used: the server compares the conditional request headers and answers with 304 or 200.
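
The lookup order above can be modeled with a few in-memory maps. This is only a simplified sketch: `CacheLayers`, `DiskEntry`, `LookupResult`, and `resolveSource` are hypothetical names, and a real Service Worker runs arbitrary handler code rather than a plain map lookup.

```typescript
type LookupResult = "service-worker" | "memory" | "disk-fresh" | "revalidate" | "network";

// Hypothetical disk entry: body plus the time until which it is fresh.
interface DiskEntry {
  body: string;
  freshUntil: number; // epoch milliseconds
}

// Hypothetical stand-ins for the three cache layers, keyed by URL.
interface CacheLayers {
  serviceWorker: Map<string, string>;
  memory: Map<string, string>;
  disk: Map<string, DiskEntry>;
}

// Decide where a request for `url` would be answered from.
function resolveSource(url: string, layers: CacheLayers, now: number): LookupResult {
  if (layers.serviceWorker.has(url)) return "service-worker"; // step 1
  if (layers.memory.has(url)) return "memory"; // step 2
  const entry = layers.disk.get(url); // step 3
  if (entry !== undefined) {
    // Strong cache hit: no request. Otherwise a conditional request is
    // sent and the server answers 304 or 200.
    return now < entry.freshUntil ? "disk-fresh" : "revalidate";
  }
  return "network"; // nothing cached: plain network request
}
```

An expired disk entry is not discarded in this model; it is kept so that the browser can revalidate it with the server instead of downloading the resource again.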

After the server responds with the resource:

  1. If the HTTP response headers allow it, the response is stored in the disk cache.
  2. A reference to the response is passed to the memory cache (which ignores all HTTP headers except no-store).
  3. The response is stored in the Service Worker's Cache Storage if the Service Worker script calls the put method.
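
The three storage steps above can be sketched in the same simplified style. The `Stores` shape and the `storeResponse` and `hasNoStore` helpers are hypothetical, and the disk rule used here (store only when a Cache-Control header is present and is not no-store) is a deliberate simplification of what browsers actually do.

```typescript
// Hypothetical in-memory stand-ins for the three stores, keyed by URL.
interface Stores {
  disk: Map<string, string>;
  memory: Map<string, string>;
  serviceWorker: Map<string, string>;
}

// True if the Cache-Control value contains the no-store directive.
function hasNoStore(cacheControl: string | undefined): boolean {
  if (cacheControl === undefined) return false;
  return cacheControl
    .split(",")
    .some((directive) => directive.trim().toLowerCase() === "no-store");
}

function storeResponse(
  url: string,
  body: string,
  cacheControl: string | undefined, // Cache-Control response header, if any
  stores: Stores,
  swCallsPut: boolean // whether the Service Worker script chooses to call put
): void {
  const noStore = hasNoStore(cacheControl);
  // 1. Disk cache: simplified to "caching headers present and not no-store".
  if (cacheControl !== undefined && !noStore) stores.disk.set(url, body);
  // 2. Memory cache: ignores every header except no-store.
  if (!noStore) stores.memory.set(url, body);
  // 3. Service Worker cache: entirely up to the script.
  if (swCallsPut) stores.serviceWorker.set(url, body);
}
```

In this model a response with no caching headers at all still lands in the memory cache, while a no-store response is kept only if the Service Worker script explicitly stores it.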
