HTTP cache header fields

Common HTTP cache header fields are:

Expires: response header; the expiration date of the resource
Cache-Control: request/response header; cache directives that give precise control over the caching policy
If-Modified-Since: request header; the time the resource was last modified, sent by the browser to the server
Last-Modified: response header; the time the resource was last modified, sent by the server to the browser
ETag: response header; an identifier for the resource, sent by the server to the browser
If-None-Match: request header; the identifier of the cached resource, sent by the browser to the server

Of these, the strong cache uses:

Expires (HTTP/1.0)
Cache-Control (HTTP/1.1)

The negotiation cache uses:

Last-Modified and If-Modified-Since (HTTP/1.0)
ETag and If-None-Match (HTTP/1.1)

Cache process analysis

The browser communicates with the server in request-response mode: the browser initiates an HTTP request and the server responds to it. When the browser requests a resource for the first time, it decides whether to cache the result based on the cache identifiers in the HTTP headers of the response. If caching is allowed, the result and its cache identifiers are stored in the browser cache. The simple process is as follows:

From the figure above, we can see:

Each time the browser initiates a request, it first looks up the result of the request and its cache identifiers in the browser cache.
Each time the browser receives a result from the server, it stores the result and its cache identifiers in the browser cache.

These two conclusions are the key to the browser cache mechanism: they ensure that every request reads from the cache and every response is written to it. Once the rules the browser applies to its cache are understood, the rest follows.

This article analyzes the mechanism in detail around this point. For convenience, we divide the caching process into two parts according to whether a request must be sent to the server again:

Strong cache: the browser looks up the result in its cache and decides whether to use it based on the caching rules stored with that result, without contacting the server
Negotiation cache: after the strong cache has expired, the browser sends a request to the server carrying the cache identifiers, and the server decides whether the cached copy can still be used
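
The two stages above can be sketched as a single decision function. This is a minimal illustration rather than how any particular browser implements it; the CacheEntry type and the plan_request function are hypothetical names introduced here, and the entry is assumed to already carry a freshness lifetime in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical record of one cached response."""
    body: bytes
    stored_at: float             # seconds since the epoch when the response was stored
    max_age: float               # freshness lifetime in seconds
    etag: Optional[str] = None   # validator to send if revalidation is needed


def plan_request(entry: Optional[CacheEntry], now: float) -> str:
    """Decide what to do for a request, given the cache lookup result."""
    if entry is None:
        return "fetch"        # nothing cached: ordinary request to the server
    if now - entry.stored_at < entry.max_age:
        return "use-cache"    # strong cache hit: no request is sent
    return "revalidate"       # strong cache expired: negotiate with the server
```

For an entry stored at t=100 with max_age=30, a lookup at t=110 yields "use-cache" and a lookup at t=200 yields "revalidate".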

Strong cache (cache control)

The strong cache determines whether a cached result is still valid and can be used directly, or whether an HTTP request must be sent again.

The fields that control strong caching are Expires and Cache-Control, where Cache-Control takes precedence over Expires.

Expires (HTTP / 1.0)

The value is the time, set by the server in the response, at which the cached result expires:

Expires: Mon, 22 Oct 2018 08:41:00 GMT

This indicates that the resource expires after Mon, 22 Oct 2018 08:41:00 GMT and must be requested again after that time.

In addition, Expires is evaluated against the client's clock, so changing the client time can make the cache expire too early or too late.
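
The dependence on the client clock can be illustrated with a small sketch. The function name is_fresh_by_expires is an assumption made for this example, and the check is reduced to a single comparison; real caches also take other headers into account.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Return True while the client's clock is still before the Expires date."""
    expires_at = parsedate_to_datetime(expires_header)  # timezone-aware for "GMT" dates
    return client_now < expires_at


header = "Mon, 22 Oct 2018 08:41:00 GMT"

# A client whose clock is correct, one minute before expiration.
accurate_clock = datetime(2018, 10, 22, 8, 40, 0, tzinfo=timezone.utc)
# The same moment on a client whose clock runs one hour fast.
fast_clock = accurate_clock + timedelta(hours=1)

fresh_on_accurate_client = is_fresh_by_expires(header, accurate_clock)
fresh_on_fast_client = is_fresh_by_expires(header, fast_clock)
```

The same response is treated as fresh on the first client and as expired on the second, although both evaluate it at the same real moment.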

For this reason, HTTP/1.1 introduced Cache-Control.

Cache-Control (HTTP/1.1)

Cache-control: max-age=30

This value indicates that the resource expires after 30 seconds and must then be requested again. If the same request is made again within 30 seconds, the cached copy is used directly and the strong cache takes effect.

Compared with Expires:

The Expires value in an HTTP response is an absolute time, which the client evaluates against its own clock
The max-age value of Cache-Control in an HTTP response, for example max-age=600, is a relative time, so it does not require the client and server clocks to agree

In addition to max-age, Cache-Control accepts other directives, such as public, private, no-cache, and no-store.

Note on no-cache: the resource is still stored in the cache, but the cached copy must be validated with the server before it can be used.

Is max-age=0 equivalent to no-cache?

Read literally, the specification says that once max-age has expired, revalidation SHOULD be done, whereas no-cache means revalidation MUST be done. In practice, depending on the browser implementation, the two mostly behave the same. (max-age=0, must-revalidate is equivalent to no-cache.)
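
The distinction can be made concrete with a simplified directive parser. Both function names are hypothetical, and the parser is an assumption-laden sketch: it splits on commas and therefore does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lowercase directive names and arguments."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-cache) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def must_revalidate_before_use(value: str) -> bool:
    """True when the cached copy may never be used without asking the server."""
    directives = parse_cache_control(value)
    if "no-cache" in directives:
        return True
    # max-age=0 alone only makes the entry stale immediately; combined with
    # must-revalidate it becomes mandatory, like no-cache.
    return directives.get("max-age") == "0" and "must-revalidate" in directives
```

Under this reading, "no-cache" and "max-age=0, must-revalidate" both return True, while "max-age=0" on its own returns False, reflecting the SHOULD versus MUST wording.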

Conclusion

Since HTTP/1.1, Cache-Control has superseded Expires. Its max-age directive is a relative time, so it does not require the client and server clocks to agree the way an absolute Expires date does. Cache-Control also offers many more configuration options.

Cache-Control takes precedence over Expires. To stay compatible with both HTTP/1.0 and HTTP/1.1, real projects usually set both fields.
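
The precedence rule can be sketched as a function that derives a freshness lifetime from response headers. The name freshness_lifetime is an assumption for this example, headers are assumed to be a plain dict with canonical capitalization, and the Expires fallback is computed relative to the Date header of the same response.

```python
import re
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str]) -> Optional[float]:
    """Return the freshness lifetime in seconds, preferring max-age over Expires."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        # Cache-Control wins even when Expires is also present.
        return float(match.group(1))
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0.0, (expires - date).total_seconds())
    return None  # no explicit freshness information


both = {
    "Date": "Mon, 22 Oct 2018 08:40:00 GMT",
    "Expires": "Mon, 22 Oct 2018 08:41:00 GMT",
    "Cache-Control": "max-age=30",
}
expires_only = {k: v for k, v in both.items() if k != "Cache-Control"}
```

With both fields set, the result is 30 seconds from max-age; with only Expires, it is the 60 seconds between Date and Expires.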

Negotiation cache (cache validation)

The cached copy cannot be used directly when:

Neither Cache-Control nor Expires is set
Cache-Control and Expires have expired
no-cache is set

The browser then sends a request to check whether the resource on the server has been updated:

If it has been updated, the server returns 200 with the new resource and the cache is updated
If it has not, the server returns 304 and the browser refreshes the validity period of its cached copy
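
The 200 and 304 branches can be sketched from the client's side as follows. The Cached type and apply_validation_response are hypothetical names used only for illustration; a real cache would also merge updated headers from the 304 response.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Cached:
    """Hypothetical cached response."""
    body: bytes
    etag: str
    stored_at: float  # start of the current validity period


def apply_validation_response(
    cached: Cached,
    status: int,
    now: float,
    body: Optional[bytes] = None,
    etag: Optional[str] = None,
) -> Cached:
    """Update the cache entry according to the server's answer."""
    if status == 304:
        # Not modified: keep the stored body, restart the validity period.
        return replace(cached, stored_at=now)
    if status == 200 and body is not None and etag is not None:
        # Modified: store the new representation and its validator.
        return Cached(body=body, etag=etag, stored_at=now)
    raise ValueError(f"unexpected validation response: {status}")
```

After a 304 the body and ETag stay the same and only stored_at moves forward; after a 200 all three fields are replaced.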

Last-Modified and If-Modified-Since (HTTP/1.0)

Last-Modified (response header)
If-Modified-Since (request header)

Last-Modified gives the date the resource was last modified on the server. If-Modified-Since sends that Last-Modified value back to the server and asks whether the resource has been updated since that date. If it has, the new resource is sent back; otherwise the server returns the 304 status code.
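
On the server side, the comparison might look like the sketch below. The function respond_if_modified_since is a hypothetical helper, and last_modified is assumed to be a timezone-aware datetime in UTC.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond_if_modified_since(
    last_modified: datetime, if_modified_since: Optional[str]
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a resource last changed at last_modified (UTC)."""
    # HTTP dates carry whole seconds only, so compare at that resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: ignore it and send the resource
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # unchanged since the client's copy
    return 200, headers
```

A request without the header, or with a date older than the last modification, gets 200; a request whose date is equal to or later than the last modification gets 304.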

However, this approach has some disadvantages:

On load-balanced servers, the Last-Modified value generated for the same resource may differ from server to server
The date format has a resolution of one second, so a change made within the same second is not detected
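
The one-second resolution is easy to demonstrate. In this small sketch, http_date is a hypothetical helper and the two timestamps are made-up modification times less than a second apart.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(timestamp: float) -> str:
    """Format a POSIX timestamp as an HTTP date; sub-second precision is dropped."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return format_datetime(moment, usegmt=True)


# Two edits to the same file, 0.8 seconds apart (illustrative values).
first_edit = 1540197660.10
second_edit = 1540197660.90

same_header = http_date(first_edit) == http_date(second_edit)
```

Both edits produce the same Last-Modified value, so a client holding the first version would receive 304 even though the content changed.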

ETag and If-None-Match (HTTP/1.1)

To solve the problems above, HTTP/1.1 added this pair of headers:

ETag (response header)
If-None-Match (request header)

An ETag is similar to a file fingerprint: an identifier for a specific version of the resource. When the resource changes, a new ETag is generated. If-None-Match sends the cached ETag to the server and asks whether the ETag of the resource has changed. ETag takes priority over Last-Modified.

Using ETag makes it possible to detect resource changes accurately, even when several updates happen within one second, so the browser can use its cache more effectively.
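
One possible way to produce and check such a fingerprint is to hash the response body. This is only a sketch: servers are free to generate ETags however they like, the names make_etag and respond_with_etag are assumptions, and the matching here is a plain string comparison that ignores weak validators.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content, so any byte change yields a new tag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, str, bytes]:
    """Return (status, etag, payload) for the current body of the resource."""
    etag = make_etag(body)
    if if_none_match is not None:
        # The header may list several tags, or "*" to match any version.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, etag, b""  # the client's copy is current: no body
    return 200, etag, body
```

Because the tag depends only on the bytes, two edits within the same second still produce different ETags, which is the case Last-Modified cannot detect.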

Strong and weak ETags

The ETag mechanism supports both strong and weak validation. The two are distinguished by the presence of W/ at the beginning of the ETag value, as in:

"123456789" - a strong ETag validator
W/"123456789" - a weak ETag validator

A strong ETag requires that resources match exactly at the byte level. A weak ETag carries the W/ prefix in front of the value and only requires that the resource remain semantically unchanged; it may still have minor internal changes (such as reordered tags in HTML or a few extra spaces).
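
The two comparison modes can be written down in a few lines. The helper names are assumptions for this sketch; the rules follow the strong and weak comparison functions defined by the HTTP specification.

```python
def is_weak(etag: str) -> bool:
    """A weak validator is marked by the W/ prefix."""
    return etag.startswith("W/")


def opaque_part(etag: str) -> str:
    """Strip the W/ prefix, leaving the quoted opaque value."""
    return etag[2:] if is_weak(etag) else etag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and identical."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the opaque values must be identical; W/ is ignored."""
    return opaque_part(a) == opaque_part(b)
```

For example, W/"123456789" and "123456789" match under weak comparison but not under strong comparison.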

The Vary response header

By specifying Vary: Accept-Encoding, the server tells a proxy server that it needs to cache two versions of this resource:

Compressed
Uncompressed

This way, old browsers and new browsers that go through the proxy receive the uncompressed and compressed versions respectively, which avoids the problem of both being served the same version.

Vary: Accept-Encoding, User-Agent

With the setting above, the proxy server caches resources by both encoding and browser type, so the same URL can return different cached content for PC and mobile.
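
A proxy can implement this by including the listed request headers in its cache key. The function cache_key below is a hypothetical sketch; real caches typically normalize header values more carefully, and the URL and header values are made up for the example.

```python
from typing import Dict, Tuple


def cache_key(url: str, vary: str, request_headers: Dict[str, str]) -> Tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # A header missing from the request is recorded as an empty string.
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


vary = "Accept-Encoding, User-Agent"
desktop = {"Accept-Encoding": "gzip", "User-Agent": "ExampleDesktop/1.0"}
mobile = {"Accept-Encoding": "gzip", "User-Agent": "ExampleMobile/1.0"}

desktop_key = cache_key("https://example.com/app.js", vary, desktop)
mobile_key = cache_key("https://example.com/app.js", vary, mobile)
```

The two requests share a URL but produce different keys, so the proxy stores and serves a separate copy for each.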
