In-depth understanding of HTTP caching mechanisms and principles

One, foreword

Last week an Alibaba interviewer asked me: “Can you walk us through the 304 process and the header fields that affect caching?” OMG... Until then I only knew in general terms that the 304 status code has something to do with caching. I had never studied it in depth, because I had never run into a caching pitfall in everyday project work, so the question hit me like a blow to the head. Afterwards I reflected on it. I was hired through campus recruitment as a software development engineer and had never planned to do front-end development, so my front-end foundation was close to nonexistent. I have now been working for a little over a year. To do well at work I have put in a lot of energy: staying until 10 PM is routine, and I work overtime at least one day on weekends. In my spare time I often read front-end books to shore up the basics, such as “JavaScript Advanced Programming”, “CSS authoritative Guide”, “Sass practice”, “JavaScript high-performance programming”, “Webpack Practice” and books on Node.js, no fewer than 10 in all. That covers a lot of breadth, but there is a problem: I have not been doing front-end work for long, and front-end knowledge is scattered. Without specific preparation, when someone suddenly asks about a single knowledge point, I roughly know what it is but find it hard to explain clearly and in order.

So the main task for the next stage is to “lay a solid front-end foundation and deeply understand the principles behind the technology stack”. Enough talk: whatever I do not understand, I will figure out! What follows uses “theory + practice” to work through the HTTP caching mechanism and its principles.

Two, cache rules and parsing

To make this easier to follow, assume the browser has a cache database that stores cached information. When the client requests a resource for the first time, the cache database holds no matching data, so the client requests it from the server and, once the server responds, stores the data in the cache database, as shown in the following flow chart:

HTTP caching rules are divided into two broad categories, based on whether a request needs to be sent to the server again: the mandatory cache and the comparison cache. Before going into detail about these two rules, let’s use sequence diagrams to give you a brief understanding of each.

(1) When cached data already exists and only the mandatory cache applies, the data request process is as follows:

(2) When cached data already exists and only the comparison cache applies, the data request process is as follows:

We can see the difference between the two types of caching rules: if the mandatory cache takes effect, there is no need to interact with the server, while the comparison cache requires interaction with the server whether or not it takes effect.

Both types of cache rules can exist at the same time, and the mandatory cache takes priority over the comparison cache. That is, when the mandatory cache rule is evaluated and the cache is still in effect, the cached copy is used directly and the comparison cache rule is not executed.
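
The priority between the two rules can be sketched as a small decision function. This is only an illustration of the flow described above, not how any browser is actually implemented; the function name decideCacheAction and the entry fields (storedAtMs, maxAgeSeconds, etag, lastModified) are made up for this example.

```javascript
// Hypothetical model of one cache entry: when it was stored, how long it
// stays fresh, and which validators came with it.
function decideCacheAction(entry, nowMs) {
  if (!entry) {
    return 'fetch-from-server'; // first request: nothing cached yet
  }
  const ageSeconds = (nowMs - entry.storedAtMs) / 1000;
  if (ageSeconds < entry.maxAgeSeconds) {
    return 'use-cache'; // mandatory cache in effect: no contact with the server
  }
  if (entry.etag || entry.lastModified) {
    return 'revalidate'; // comparison cache: ask the server, expect 304 or 200
  }
  return 'fetch-from-server'; // expired and nothing to compare with
}

const entry = { storedAtMs: 0, maxAgeSeconds: 10, etag: '"abc"' };
console.log(decideCacheAction(entry, 5000));  // within the lifetime
console.log(decideCacheAction(entry, 12000)); // after the lifetime
```

The comparison step is only reached once the freshness check fails, which is exactly the "mandatory cache first" ordering.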

Three, common cache fields

1, Http1.0 caching scheme

Note:

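(1) Pragma: no-cache is an HTTP/1.0 header kept for backward compatibility. It is often described as outranking Expires and Cache-Control, but under the HTTP/1.1 specification, when a request also carries Cache-Control, caches follow Cache-Control and ignore Pragma, and the meaning of Pragma: no-cache in a response is not specified. Do not rely on Pragma alone.

(2) If Expires is set, then when the client needs the data it first compares the current system time with the Expires time. If the Expires time has not passed, it reads the cached data from the local cache and does not send a request. Because the comparison uses the client clock, a wrong local time gives wrong results; if the response also contains Cache-Control: max-age, max-age takes precedence over Expires.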
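
The Expires check in note (2) comes down to one date comparison. Here is a minimal sketch of it, assuming the header value is a normal HTTP date string; isFreshByExpires is a name I made up, and an unparseable value is simply treated as already expired.

```javascript
// Returns true while the cached copy may still be used according to Expires.
function isFreshByExpires(expiresHeader, nowMs) {
  const expiresMs = Date.parse(expiresHeader);
  if (Number.isNaN(expiresMs)) {
    return false; // invalid or missing date: treat as expired
  }
  return nowMs < expiresMs;
}

const expires = 'Tue, 19 Mar 2019 07:26:12 GMT';
const oneMinuteBefore = Date.parse(expires) - 60 * 1000;
console.log(isFreshByExpires(expires, oneMinuteBefore)); // true
console.log(isFreshByExpires(expires, Date.now()));      // false
```

In real use nowMs would be Date.now(), which is why a wrong client clock breaks this scheme.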

2, Http1.1 caching scheme

2.1 Cache-Control field

2.1.1 Cache-Control as the request header field

Cache-Control: no-cache

The purpose of the no-cache directive is to prevent a stale resource from being returned from the cache. If the request sent by the client contains no-cache, the client will not accept a cached response that has not been validated, so every request goes to the server. If the request carries a validator and the resource has not changed, the server returns 304; otherwise it returns 200 with the resource itself.

Cache-Control: no-store

The no-store directive indicates that neither the request nor its response may be stored in a cache. The next request for the resource is therefore fetched from the server again and returns 200 with the resource itself.

2.1.2 Cache-Control as the response header field

Cache-Control: public

The public directive states that the response may be stored by any cache, including shared caches such as proxy servers, so the cached copy can also be used for other users.

Cache-Control: private

The private directive is the opposite of public: the response is intended for one specific user only. That user's own cache (for example the browser) may store it, but a shared cache such as a proxy server must not store it or return it for requests from other users.

Cache-Control: no-cache

If the response from the server contains the no-cache directive, the cache may store the response but must confirm its validity with the server every time before using it. If the resource has not changed, 304 is returned.

Cache-Control: no-store

The response is not cached at all, so the next time the user requests the resource the server returns 200 with the resource itself.

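Cache-Control: max-age=604800 (unit: seconds)

The response stays fresh for 604800 seconds (7 days), and within that time the cache can be used without contacting the server. Once the time is exceeded, the client goes back to the server for the resource.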
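
To make the response directives above concrete, here is a rough sketch that parses a Cache-Control value and summarizes what a cache is allowed to do. It is deliberately naive: it splits on commas, so a quoted value containing a comma would break it, and both function names (parseCacheControl, describePolicy) are hypothetical.

```javascript
// Turn "public, max-age=10" into { public: true, 'max-age': '10' }.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of String(headerValue || '').split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue; // skip empty segments
    const name = rawName.toLowerCase(); // directive names are case-insensitive
    directives[name] = rawValue === undefined ? true : rawValue.replace(/^"|"$/g, '');
  }
  return directives;
}

// Summarize the policy in the same order a cache has to think about it.
function describePolicy(headerValue) {
  const d = parseCacheControl(headerValue);
  if (d['no-store']) return 'do not store';
  if (d['no-cache']) return 'store, but revalidate before every reuse';
  if (d['max-age'] !== undefined) {
    const scope = d['private'] ? 'private caches only' : 'any cache';
    return `fresh for ${Number(d['max-age'])}s in ${scope}`;
  }
  return 'no explicit lifetime';
}

console.log(describePolicy('public, max-age=604800'));
console.log(describePolicy('private, max-age=60'));
console.log(describePolicy('no-cache'));
```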

2.2 Request header Fields & Response header fields

2.2.1. Request header field

2.2.2. Response header fields

Note:

(1) If-None-Match has a higher priority than If-Modified-Since, so if both are present in the same request, the server follows If-None-Match and ignores If-Modified-Since.
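
The priority in the note above can be sketched as the check a server might run before deciding between 304 and 200. This is an assumption-laden illustration, not the code of any real server: isNotModified and the shape of the current object are invented here, ETags are compared weakly (the W/ prefix is ignored), and the special value * is not handled.

```javascript
// requestHeaders uses lower-case keys, as Node.js does for incoming requests.
// current describes the resource as it exists on the server right now.
function isNotModified(requestHeaders, current) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  if (ifNoneMatch !== undefined) {
    // If-None-Match wins: If-Modified-Since is not even looked at.
    const strip = (tag) => tag.trim().replace(/^W\//, '');
    return ifNoneMatch.split(',').some((tag) => strip(tag) === strip(current.etag));
  }
  const ifModifiedSince = requestHeaders['if-modified-since'];
  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    const modified = Date.parse(current.lastModified);
    return !Number.isNaN(since) && !Number.isNaN(modified) && modified <= since;
  }
  return false; // unconditional request: always send the full resource
}

const current = {
  etag: 'W/"95f15b-16994d7ebf6"',
  lastModified: 'Tue, 19 Mar 2019 07:26:12 GMT',
};
// The ETag does not match, so the matching date is ignored and the answer is false.
console.log(isNotModified(
  { 'if-none-match': '"something-else"', 'if-modified-since': current.lastModified },
  current
));
```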

Four, experimental verification

1. Experiment 1: the requested resource is not modified, and both kinds of cache hit are verified

The server uses node.js and the client uses axios to make the requests:

1.1 Request header/response header settings

(1) Server response header setting:

res.setHeader('Cache-Control', 'public, max-age=10');

(2) The client request header uses the default settings
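
For readers who want to reproduce the setup, here is a minimal sketch of a Node.js request handler with the same Cache-Control setting. It is not the server code used in the experiment: the in-memory resource object, the "v1" ETag and the port in the comment are all assumptions for illustration, and only If-None-Match is checked.

```javascript
const http = require('http');

// Hypothetical in-memory resource standing in for the file served in the experiment.
const resource = {
  body: 'hello cache',
  etag: '"v1"',
  lastModified: new Date('2019-03-19T07:26:12Z').toUTCString(),
};

function handleRequest(req, res) {
  // Same caching rule as in the experiment: fresh for 10 seconds.
  res.setHeader('Cache-Control', 'public, max-age=10');
  res.setHeader('ETag', resource.etag);
  res.setHeader('Last-Modified', resource.lastModified);

  // Comparison cache: the client's copy is still current, so send no body.
  if (req.headers['if-none-match'] === resource.etag) {
    res.statusCode = 304;
    res.end();
    return;
  }

  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end(resource.body);
}

// To try it locally (port 3000 is an arbitrary choice):
// http.createServer(handleRequest).listen(3000);
```

Changing resource.body and resource.etag while the server is running gives roughly the situation of Experiment 2.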

1.2. Experimental procedures

(1) Make 3 requests: the first fetches the resource, the second is made within 10 seconds, and the third is made after 10 seconds. (During the experiment, the resource on the server is not changed.)

1.3. Experimental results

The HTTP information of the three requests is shown in the figure below. From it we can conclude that the first request obtains the resource from the server. The second request (within 10 seconds) is served directly from the browser cache, without confirmation from the server. The third request (after 10 seconds) goes to the server because the cache lifetime (10 seconds) has expired; the server determines that the resource has not changed relative to the local cache, so it returns 304 and the client reads the resource from the browser cache. The following describes the three interactions in detail according to the HTTP header information.

1.3.1 First request for resources

The screenshot of the request header and response header of the first request for the resource is shown below. Because the resource is requested for the first time, there is no local cache, so the resource is obtained directly from the server. As the screenshot shows, the response header returned with the resource contains three fields related to caching:

(1) cache-control: public, max-age=10

The cache rule setting. In our example it is set to public, with a cache expiration time of 10 seconds.

(2) etag: W/"95f15b-16994d7ebf6"

A unique identifier for the current version of the resource. The next time the client accesses the resource, it carries this etag value in the request header (as If-None-Match) so the server can verify whether the resource has been modified.

(3) Last-Modified: Tue, 19 Mar 2019 07:26:12 GMT

The time the resource was last modified. The next time the client accesses the resource, it carries this value in the request header (as If-Modified-Since) so the server can check whether the resource has been modified.

1.3.2 Second resource request (within 10 seconds, before cache time expires)

The screenshot of the request header and response header for the second request for the resource is shown below. Since the locally cached copy has not yet expired at the time of the second request, the resource is fetched directly from the browser cache.

1.3.3 Third resource Request (10 seconds later, after cache time expires)
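
For the third request the cached copy has expired, so the browser sends the validators from the first response back to the server. The sketch below shows how those request headers are derived from the stored response headers; buildRevalidationHeaders is a hypothetical helper for illustration, since a real browser does this internally.

```javascript
// cachedResponseHeaders uses lower-case keys, the way axios exposes response headers.
function buildRevalidationHeaders(cachedResponseHeaders) {
  const headers = {};
  if (cachedResponseHeaders['etag']) {
    headers['If-None-Match'] = cachedResponseHeaders['etag'];
  }
  if (cachedResponseHeaders['last-modified']) {
    headers['If-Modified-Since'] = cachedResponseHeaders['last-modified'];
  }
  return headers;
}

// Values taken from the first response in this experiment.
console.log(buildRevalidationHeaders({
  'cache-control': 'public, max-age=10',
  'etag': 'W/"95f15b-16994d7ebf6"',
  'last-modified': 'Tue, 19 Mar 2019 07:26:12 GMT',
}));
```

Since the resource is unchanged in this experiment, the server finds that both values still match and answers 304.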

2. Experiment 2: the requested resource is modified, and both kinds of cache behavior are verified

The server uses node.js and the client uses axios to make the requests:

2.1 Request header/response header settings

(1) Server response header setting:

res.setHeader('Cache-Control', 'public, max-age=20');

(2) The client request header uses the default settings

2.2. Experimental steps

(1) Make 3 requests. The first request fetches the resource, and the resource is then modified on the server. The second request is made within 20 seconds, and the third is made after 20 seconds.

2.3. Experimental results

The HTTP information of the three requests is shown in the figure below. From it we can conclude that the first request obtains the resource from the server. The second request (within 20 seconds) is served directly from the browser cache, without confirmation from the server. The third request (after 20 seconds) goes to the server because the cache lifetime (20 seconds) has expired; the server determines that the resource differs from the locally cached copy and returns the new resource. The following describes the three interactions in detail according to the HTTP header information.

2.3.1 Requesting resources for the first time

The screenshot of the request header and response header for the first resource request is shown below. The details are the same as in Section 1.3.1 and will not be repeated here.

2.3.2 Second request for resources (within 20 seconds, i.e. before cache time expires)

A screenshot of the request header and response header for the second request for the resource is shown below (note: even though the resource on the server has changed at this point, the old resource is returned from the cache because the resource cached in the browser has not expired).

2.3.3 Third Request for resources (20 seconds later, that is, after the cache time expires)

The screenshot of the request header and response header of the third request for the resource is as follows. Because the local cache of the resource has expired by the third request, the If-Modified-Since and If-None-Match fields are added to the request header to ask the server whether the resource has changed. As you can see in the following figure, the etag field in the response header differs from the If-None-Match field in the request header, and the last-modified field in the response header differs from the If-Modified-Since field in the request header, so the server returns the latest version of the resource.
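
Both experiments can be replayed without a network using a small simulation with a fake clock. This is only a toy model built on assumptions: createSimulation is a made-up name, the server resource is reduced to a version number, the ETag is derived from that version, and time is passed in as seconds.

```javascript
function createSimulation(maxAgeSeconds) {
  const server = { version: 1 }; // bump this to "modify the resource"
  let cached = null;             // { etag, body, storedAt } once something is stored

  function request(nowSeconds) {
    // Mandatory cache: still fresh, the server is never asked.
    if (cached && nowSeconds - cached.storedAt < maxAgeSeconds) {
      return { source: 'browser cache', status: 200, body: cached.body };
    }
    // Comparison cache: the server compares ETags.
    const currentEtag = `"v${server.version}"`;
    if (cached && cached.etag === currentEtag) {
      cached.storedAt = nowSeconds; // a 304 makes the stored copy fresh again
      return { source: 'server', status: 304, body: cached.body };
    }
    // No cache yet, or the resource changed: full response.
    cached = { etag: currentEtag, body: `content v${server.version}`, storedAt: nowSeconds };
    return { source: 'server', status: 200, body: cached.body };
  }

  return { server, request };
}

// Experiment 2: max-age=20, resource modified after the first request.
const sim = createSimulation(20);
console.log(sim.request(0));  // from the server, status 200, content v1
sim.server.version = 2;       // the resource changes on the server
console.log(sim.request(5));  // from the browser cache, still content v1
console.log(sim.request(25)); // from the server, status 200, content v2
```

Leaving server.version untouched and using createSimulation(10) with requests at 0, 5 and 12 seconds replays Experiment 1, where the third request ends in a 304.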

Five, summary

1. For the mandatory cache, the server tells the browser a cache lifetime. A request made within that lifetime uses the cache directly; once the lifetime has passed, the comparison cache strategy is applied.

2. For the comparison cache, the Etag and Last-Modified values in the cache information are sent to the server in the request so the server can verify them. When the 304 status code is returned, the browser uses the cache directly.

The summary flow chart is as follows:

If you have any questions, please leave a comment
