HTTP caching mechanism

Summary: Cache requested resources locally and reuse them for subsequent requests as much as possible, reducing the number of HTTP requests and significantly improving site and application performance.

So when are resources cached locally? When do cached resources expire? And when are cached resources actually used? This article starts from these three questions.

HTTP cache mechanism flow

Since browsers generally cache only GET responses, the flow described here refers to ordinary GET resource requests.

Strong cache

With the strong cache, no request is sent to the server; the local cache is used directly. In Chrome, the local strong cache comes in two types: disk cache and memory cache. In the Network panel of DevTools, such requests show status 200, and those marked "from disk cache" or "from memory cache" were served from the strong cache, as shown in the following two figures.

I do not yet fully understand how Chrome chooses between the two kinds of strong cache, so I will not go into detail here to avoid misleading readers; corrections from anyone who knows are welcome. What is documented is that the memory cache is tied to the life cycle of the render process, which roughly corresponds to a tab:

Chrome employs two caches — an on-disk cache and a very fast in-memory cache. The lifetime of an in-memory cache is attached to the lifetime of a render process, which roughly corresponds to a tab. Requests that are answered from the in-memory cache are invisible to the web request API.

Whether the strong cache is used is controlled by three HTTP header fields: Expires, Pragma, and Cache-Control.

Expires

The Expires field comes from HTTP/1.0 and has the lowest priority of the three cache control fields.

As shown in the figure, the Expires value in the response header is a timestamp. When a request is made, if the local system time is before this timestamp, the cache is valid; otherwise, the cache is invalid and the negotiated cache stage is entered. If the response header sets Expires to an invalid date, such as 0, it is treated as a date in the past, i.e. the resource has already expired.
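The Expires check described above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser implements it: the function name `is_fresh_by_expires` is hypothetical, and treating a date without a time zone as UTC is an assumption made here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_value: str, now: datetime) -> bool:
    """Return True if a cached response is still fresh according to Expires.

    `now` must be a timezone-aware datetime.
    """
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # An unparseable value such as "0" is treated as already expired.
        return False
    if expires_at.tzinfo is None:
        # Assumption: a date without a zone is interpreted as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at
```

Because the comparison uses the client's own clock, a wrong local system time changes the result, which is one reason max-age is usually preferred over Expires.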

Cache-Control

Cache-Control is a general header field defined in HTTP/1.1. Its common directives are:

  • no-store: Disables caching entirely; every request goes to the server for the latest resource.
  • no-cache: Does not use the strong cache; goes straight to the negotiated cache stage and asks the server to verify whether the resource is still "fresh".
  • private: Private cache. Intermediate proxy servers must not cache the resource.
  • public: Public cache. Intermediate proxy servers may cache the resource.
  • max-age: The maximum time the cache stays valid, in seconds. The starting point is the Date field of the cached response header, so the cache is valid until responseDate + max-age. If a request is made after that moment, the cache has expired.
  • must-revalidate: Once the cache has expired, it must be revalidated with the server before it is used.
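To make the directives above concrete, here is a minimal Python sketch of how a cache might parse a Cache-Control value and decide whether the strong cache can be used. The function names are hypothetical, and the parser is deliberately simplified: it splits on commas and therefore does not handle quoted arguments that themselves contain commas.

```python
from datetime import datetime, timedelta, timezone


def parse_cache_control(header_value: str) -> dict:
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def strong_cache_usable(header_value: str, response_date: datetime, now: datetime) -> bool:
    """Return True if the cached copy may be used without contacting the server."""
    directives = parse_cache_control(header_value)
    # no-store forbids caching; no-cache forces revalidation with the server.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or not max_age.isdigit():
        # Simplification: without a usable max-age, this sketch always revalidates.
        return False
    # Valid until responseDate + max-age, as described in the list above.
    return now < response_date + timedelta(seconds=int(max_age))
```

A real cache would also fall back to Expires when max-age is absent; that step is left out here to keep the sketch short.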

Pragma

Pragma is a general header field defined in HTTP/1.0 and kept for backward compatibility with caches that only support HTTP/1.0. It has only one value, no-cache, which behaves like Cache-Control: no-cache. Its behavior in HTTP responses is not specified, however, so it cannot fully replace the Cache-Control header defined in HTTP/1.1.

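If both Pragma and Cache-Control are present, Pragma: no-cache is commonly observed to take precedence in browsers. Note, however, that the HTTP/1.1 caching specification (RFC 7234) says a cache that understands Cache-Control ignores Pragma in a request, so the two fields should not be given conflicting values.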

Negotiated cache

When the strong cache has expired, or the header fields disable the strong cache (for example Cache-Control: no-cache or Pragma: no-cache), the negotiated cache stage is entered. It involves two pairs of header fields: Last-Modified/If-Modified-Since and ETag/If-None-Match.

If the request header contains If-Modified-Since or If-None-Match, the request is sent to the server to check whether the resource has changed. If it has changed, the server returns 200 with the new resource; the browser uses it and decides whether to cache it. If it has not changed, the negotiated cache is hit and the server returns 304; the browser updates the cached headers from the response headers, extends the validity period, and uses the cached copy directly.
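The client side of this exchange can be sketched as follows. This is an illustrative sketch only: the function names and the dictionary layout of a cache entry (`headers` and `body` keys) are assumptions made here, and header names are assumed to be stored with their canonical capitalization.

```python
def build_conditional_headers(cached_headers: dict) -> dict:
    """Build the validator headers for a revalidation request."""
    headers = {}
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return headers


def apply_validation_response(cached: dict, status: int, headers: dict, body: bytes) -> dict:
    """Return the cache entry to keep after the server has answered."""
    if status == 304:
        # Cache hit: keep the stored body, refresh the stored headers.
        return {"headers": {**cached["headers"], **headers}, "body": cached["body"]}
    # Resource changed (200): replace the entry with the new response.
    return {"headers": dict(headers), "body": body}
```

On a 304 the body is never transferred again, which is where the saving comes from even though a request still reaches the server.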

Last-Modified/If-Modified-Since

Last-Modified/If-Modified-Since carries the modification time of the resource. When a resource is requested for the first time, the server puts its last modification time in the Last-Modified field of the response header. When the resource is requested again, the browser automatically copies the Last-Modified value from the previous response header into the If-Modified-Since field of the new request header. The server compares the resource's current last modification time with the If-Modified-Since value in the request header. If the resource has not been modified since then, the cache is hit and the server returns 304; otherwise it returns 200.
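A server-side version of this comparison might look like the sketch below. The function names are hypothetical; modification times are assumed to be timezone-aware UTC datetimes, and an unparseable If-Modified-Since value is simply ignored.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def last_modified_header(mtime: datetime) -> str:
    """Format a UTC modification time as an HTTP date (whole seconds only)."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def status_for_if_modified_since(mtime: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200  # first request: no validator was sent
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # ignore an unparseable validator
    if since.tzinfo is None:
        # Assumption: a date without a zone is interpreted as UTC.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so compare whole seconds.
    return 304 if mtime.replace(microsecond=0) <= since else 200
```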

ETag/If-None-Match

The value of ETag/If-None-Match is typically a hash string (the hash algorithm varies between servers). It identifies a version of the resource: when the resource content changes, the value changes too. The process is similar to the one above, except that the server compares the current ETag of the resource with the If-None-Match value in the request header. How the comparison works depends on which of the two kinds of ETag is used:

  • Strong validation: The value identifies the resource content exactly. Any change to the resource changes the value.
  • Weak validation: The value starts with W/. If the resource changes only slightly, the cache may still be hit.

For example:

ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
ETag: W/"0815"
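The two comparison modes can be sketched in Python as follows. The function names are hypothetical, and the If-None-Match parsing is simplified (it splits on commas). The sketch follows the HTTP specification's rule that If-None-Match is evaluated with the weak comparison, under which the W/ prefix is ignored.

```python
def parse_etag(value: str) -> tuple:
    """Split an entity tag into (is_weak, opaque_tag)."""
    value = value.strip()
    is_weak = value.startswith("W/")
    return is_weak, value[2:] if is_weak else value


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the opaque tags must be identical; W/ is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]


def status_for_if_none_match(current_etag: str, if_none_match: str) -> int:
    """Return 304 if any tag listed in If-None-Match matches the current ETag."""
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    candidates = if_none_match.split(",")
    return 304 if any(weak_match(current_etag, c) for c in candidates) else 200
```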

Differences between the two

  1. ETag/If-None-Match has higher priority than Last-Modified/If-Modified-Since.
  2. Last-Modified/If-Modified-Since has a one-second problem: if the server modifies the file again within one second, the modification time does not change at one-second resolution, so the next request incorrectly gets 304.
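The one-second problem can be reproduced with a small Python sketch. The helper names, timestamps, and file contents are made up for illustration, and using SHA-1 of the content as the ETag is just one possible choice.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_header(mtime: datetime) -> str:
    """HTTP dates carry whole seconds only, so sub-second detail is lost."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def etag_for(content: bytes) -> str:
    """A content-based strong ETag (assumption: SHA-1 of the body)."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


# Two writes to the same file within the same second.
first_write = datetime(2024, 1, 1, 12, 0, 0, 100000, tzinfo=timezone.utc)
second_write = datetime(2024, 1, 1, 12, 0, 0, 900000, tzinfo=timezone.utc)

# Last-Modified cannot tell the two versions apart...
same_last_modified = last_modified_header(first_write) == last_modified_header(second_write)
# ...while a content-based ETag can.
different_etag = etag_for(b"version 1") != etag_for(b"version 2")
```

Here `same_last_modified` and `different_etag` both end up true: a client that cached the first version would wrongly be told 304 by a Last-Modified check, while an ETag check would detect the change.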

Proxy service cache

Vary (HTTP/1.1), for example Vary: Accept-Encoding.

Conclusion

In practice, sensible use of caching for resources that are not updated frequently can effectively improve response speed and the user experience.