Preface

Summary: Caching has always been a pain point for front-end developers. Many of them do not really know what a cache is and end up creating trouble for themselves. As always, this article uses plain language and examples to explain caching, and I hope you get something out of it.

  • Blog address: Cache details

  • Zhihu column && Jianshu topic: Front-end Aggressor (Zhihu)

  • Blog address: Damonare’s personal blog

Blue sky waiting for misty rain, and I am waiting for you.

Main text

Caching is a technique for saving a copy of a resource and using it directly on the next request.

To be honest, I did not really know how to introduce caching at first, so I borrowed the fairly official definition above. I think almost every developer has run into a cache problem, and there are plenty of cases where we say "the problem is fixed, just clear your cache". In this article we will dig into caching in detail.

🦋 Types of cache

Many developers lump cookies, Web Storage, and IndexedDB data together as "caches", because they are all stored on the client and look much the same. Strictly speaking this is not accurate: cookies exist mainly so that the server can tell users apart, while Web Storage and IndexedDB are meant for storing specific data and large amounts of structured data (files/blobs) on the client.

There is really only one kind of cache: a copy of a requested resource. Imagine if our client kept a copy of every resource forever. Clients would blow up and developers would go crazy! So we need a protocol that lets developers control when cache entries are created and removed. Which one? What else but HTTP. HTTP defines a number of request and response header fields for caching, and these are what we will focus on next, to see how they affect caching.

What? You ask why we cache at all? 😱

That is easy to answer. Caching has many benefits:

  1. It relieves pressure on the server (resources do not have to be requested every time);
  2. It improves performance (opening a local copy is certainly much faster than requesting the resource and then opening it);
  3. It reduces bandwidth consumption (I am sure you can understand that).

Then again, if caching is so good, what happens when my request passes through a caching proxy on its way to the server? What if the proxy server has cached my resource and I can no longer get the latest version from the origin server? HTTP has certainly taken this need into account, and we will break it down layer by layer.

At a macro level, caches fall into two categories: private caches and shared caches. A shared cache stores responses that proxies at any level are allowed to cache. A private cache is exclusive to one user, and its contents must not be cached by proxies.

At a micro level 🐜, caches can be divided into the following four categories:

1. Browser cache

If you use a browser (Chrome, Firefox, IE, etc.) a lot, you will know that its settings include a "clear cache" feature. It exists to remove the copies of resources stored on your local disk, that is, to clear the cache.

The point of this cache is to respond faster when a user clicks the back button or returns to a page. On multi-page websites in particular, caching is especially useful when the same image is used on several pages. 😏

2. Proxy server cache

Proxy caches work like browser caches, but on a much larger scale, because they serve thousands of users. Large companies and ISPs often set them up on their firewalls or operate them as standalone devices. (Unless otherwise specified, every cache server mentioned below is a proxy server.)

Because a cache server belongs to neither the client nor the origin server, it sits in the network and only takes effect when requests are routed through it. You can configure the browser's proxy manually, or requests can be forwarded through an intermediate server, in which case users naturally do not notice that the proxy exists.

A proxy server cache is a shared cache: it serves a large number of users rather than just one, so it is very effective at reducing response time and bandwidth usage, because the same cached copy may be reused many times.

3. Gateway cache

Also known as a surrogate cache or reverse proxy cache, a gateway is an intermediate server as well. Gateway caches are usually deployed by the website administrators themselves to give the site better performance. 🙂

CDNs (content delivery networks) distribute gateway caches across the whole Internet (or part of it) and sell caching services to websites that need them, such as Qiniu Cloud and UPYUN in China.

4. Database cache

Database caching is for when an application is extremely complex and has many tables, so that frequent queries may overload the database. A good remedy is to keep query results in memory and serve the next identical query directly from memory. Database caching is not covered in this article. 🙃

🦄 Cache policy of the browser

What can be cached:

  • A successful response to a retrieval request: a GET request answered with status code 200, for example a response containing an HTML document, an image, or a file;
  • A permanent redirect: status code 301;
  • A "not modified" response: status code 304. This one is debatable: Chrome applies the cache settings carried by a 304, Firefox does not;
  • An error response: a page with status code 404;
  • An incomplete response: status code 206, which returns only part of the content;
  • Responses to requests other than GET, if the response matches something defined for use as a cache key.

That gives us an idea of what we can and should cache. 🤗

How the browser handles the cache is determined by the response headers returned when the resource is first requested.

How does the browser decide whether a resource should be cached, and how? Response headers! Response headers! Response headers! It is so important that it has to be said three times. ✌️

Let's look at an example:

Age:23146
Cache-Control:max-age=2592000
Date:Tue, 28 Nov 2017 12:26:41 GMT
ETag:W/"5a1cf09a-63c6"
Expires:Thu, 28 Dec 2017 05:27:45 GMT
Last-Modified:Tue, 28 Nov 2017 05:14:02 GMT
Vary:Accept-Encoding

1. Strong cache phase

The response headers above come from a CSS file on the Baidu home page. I removed the fields that are irrelevant to caching and kept only these. Start with Expires: it is the HTTP/1.0 field for caching, and it specifies an absolute time at which the cache expires. Cache-Control: max-age=2592000 is the HTTP/1.1 field for caching, and it specifies a relative lifetime, here 2592000 seconds (30 days). As for priority, the newer version wins: max-age > Expires.

This is the strong caching phase. When the browser requests the CSS file again and finds a cached copy, it uses the headers of the previous response to decide whether that copy has expired. If it has not, the browser loads the file straight from the cache, and that is it! ✌️
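The expiry check in this phase can be sketched roughly as follows. This is only an illustration under simple assumptions: `freshnessLifetimeSeconds` and `isFresh` are hypothetical helper names, headers are passed as a plain object with lower-case keys, and real browsers apply more rules than this.

```javascript
// Minimal sketch of the strong-cache check (not any browser's actual implementation).
// Assumption: `headers` is a plain object of response headers with lower-case keys.
function freshnessLifetimeSeconds(headers) {
  const cacheControl = headers['cache-control'] || '';
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  if (match) {
    return Number(match[1]); // max-age takes priority over Expires
  }
  if (headers['expires'] && headers['date']) {
    // Expires is absolute, so turn it into a lifetime relative to Date
    const seconds = (Date.parse(headers['expires']) - Date.parse(headers['date'])) / 1000;
    return Number.isNaN(seconds) ? 0 : Math.max(0, seconds);
  }
  return 0; // no explicit lifetime given
}

// `storedAt` and `now` are millisecond timestamps taken from the same clock.
function isFresh(headers, storedAt, now) {
  const ageSeconds = (now - storedAt) / 1000;
  return ageSeconds < freshnessLifetimeSeconds(headers);
}
```

With the sample headers above, max-age is used and Expires is only consulted when max-age is absent.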

Firefox shows this as a gray 200 status code.

Chrome shows one of the following status codes:

200 (from disk cache)
200 OK (from memory cache)

A side note: as for whether a cached resource is read from memory or from disk, I went through a lot of material and found one plausible conclusion: Chrome decides where to store a cached resource based on local memory usage. If memory usage is high, it goes to disk; if memory usage is low, it is kept in memory for the time being. This would explain why the same resource is sometimes served from memory cache and sometimes from disk cache.

So what happens when the cached CSS file expires? This is where ETag and Last-Modified come in.

Last-Modified first: this field is the time the file was last modified.

And ETag? An ETag is an identifier for a file. HTTP does not specify how it should be generated, so in theory any scheme works as long as it does not produce duplicates: a collision-resistant hash of the resource content, a hash of the last-modified timestamp, or even just a version number.
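Two of these options can be sketched in Node.js. Both helpers are hypothetical names made up for illustration, not part of any server's API. As it happens, the first part of the sample ETag above, 5a1cf09a, is exactly the sample Last-Modified time expressed as hexadecimal seconds, so that value is consistent with a "modification time plus size" scheme like the second helper.

```javascript
const crypto = require('crypto');

// Hypothetical helper: a strong ETag derived from a hash of the content.
function contentETag(body) {
  const hash = crypto.createHash('sha256').update(body).digest('hex');
  return '"' + hash.slice(0, 16) + '"';
}

// Hypothetical helper: a weak ETag built from modification time and size,
// both written in hexadecimal (an assumed scheme, for illustration only).
function statETag(sizeBytes, mtimeMs) {
  const seconds = Math.floor(mtimeMs / 1000);
  return 'W/"' + seconds.toString(16) + '-' + sizeBytes.toString(16) + '"';
}
```

The content hash changes only when the bytes change, while the time-and-size variant is cheaper to compute but can change even when the content does not.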

2. Negotiation cache phase

When the browser requests the CSS file again and finds that the cached copy has expired, it adds If-Modified-Since and If-None-Match to the request headers. The server uses these two fields to determine whether the resource has been modified. If it has, the server returns status code 200 and the new content; if it has not, the server returns status code 304. On a 200 the browser handles the response as if it were requesting the file for the first time; on a 304 it reads the file from its cache. Either way, the cache is then updated according to the newly returned response headers. (Browsers differ at this step: Chrome applies the cache settings from a 304, Firefox does not.) 😑

The two fields carry the following values (matching the Last-Modified and ETag values above, respectively):

If-Modified-Since: Tue, 28 Nov 2017 05:14:02 GMT
If-None-Match: W/"5a1cf09a-63c6"

That is the end of the negotiation cache phase.
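On the server side, the decision between 200 and 304 might look like the sketch below. The function name `revalidate` and the shape of `resource` are assumptions made for this example, and request header names are assumed to be lower-cased.

```javascript
// Minimal sketch of server-side revalidation.
// Assumption: resource = { etag, lastModified (HTTP date string), body }.
function revalidate(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];
  const stripWeak = (tag) => tag.trim().replace(/^W\//, '');
  let notModified = false;

  if (ifNoneMatch !== undefined) {
    // If-None-Match takes priority and uses weak comparison
    const tags = ifNoneMatch.split(',').map(stripWeak);
    notModified = ifNoneMatch.trim() === '*' || tags.includes(stripWeak(resource.etag));
  } else if (ifModifiedSince !== undefined) {
    // Unparseable dates give NaN, so the comparison is false and we answer 200
    notModified = Date.parse(resource.lastModified) <= Date.parse(ifModifiedSince);
  }

  return notModified
    ? { status: 304, body: null }           // browser keeps using its cached copy
    : { status: 200, body: resource.body }; // send the full resource again
}
```

A request without either header simply gets the full 200 response.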

3. Heuristic caching phase

Let’s change the response header above:

Age:23146
Cache-Control: public
Date:Tue, 28 Nov 2017 12:26:41 GMT
Last-Modified:Tue, 28 Nov 2017 05:14:02 GMT
Vary:Accept-Encoding

Notice anything? None of the fields the browser uses to determine the cache expiration time is present! So what now? You might say the next request goes straight to the negotiation cache phase and carries If-Modified-Since. Not so: the browser also has a heuristic caching phase. 😎

The browser takes the difference between the two time fields in the response headers, Date and Last-Modified, and uses 10% of that difference as the cache lifetime.

This is the heuristic caching phase. It is easy to overlook, yet it is at work all the time. So if you fall into one of those "default caching" potholes, do not shout and do not get angry: the browser is just following the heuristic caching rules.
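The 10% rule is easy to work through with the sample headers. The helper name below is hypothetical, and the header object with lower-case keys is an assumption of this sketch.

```javascript
// Heuristic lifetime: 10% of (Date - Last-Modified), in whole seconds.
function heuristicLifetimeSeconds(headers) {
  const date = Date.parse(headers['date']);
  const lastModified = Date.parse(headers['last-modified']);
  if (Number.isNaN(date) || Number.isNaN(lastModified) || date <= lastModified) {
    return 0; // nothing sensible to derive a lifetime from
  }
  const diffSeconds = (date - lastModified) / 1000;
  return Math.floor(diffSeconds / 10);
}

const lifetime = heuristicLifetimeSeconds({
  'date': 'Tue, 28 Nov 2017 12:26:41 GMT',
  'last-modified': 'Tue, 28 Nov 2017 05:14:02 GMT'
});
console.log(lifetime); // 2595
```

The two timestamps are 25959 seconds apart (7 h 12 min 39 s), so the file would be treated as fresh for about 43 minutes without any request being sent.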

I drew the following diagram to illustrate the browser’s overall cache strategy:

👌 Now that the caching strategy is covered, let's take a closer look at the individual HTTP header fields and how they relate to each other.

🦀 HTTP header fields related to caching

What is an HTTP message? An HTTP message consists of the following two parts:

  1. Header: contains many fields describing things such as cookies, caching, message size, and message format;
  2. Body: the content that the HTTP request actually transfers, for example an HTML document or a JS file.

We have seen how the browser handles caching and briefly mentioned a few related fields. Next, let's look at these fields in detail:

1. General header fields

| Field | Description |
| --- | --- |
| Cache-Control | Controls the specific behavior of the cache |
| Pragma | Legacy HTTP/1.0 field; the value "no-cache" forces revalidation |
| Date | Date and time the message was created (used in the heuristic caching phase) |

2. Response header fields

| Field | Description |
| --- | --- |
| ETag | Unique identifier of the resource, generated by the server |
| Vary | Cache management information for proxy servers |
| Age | How long the resource has been stored on the caching proxy (depends on the values of max-age and s-maxage) |

3. Request header fields

| Field | Description |
| --- | --- |
| If-Match | Conditional request carrying the ETag of the resource from the previous response; the server uses it to determine whether the file has changed |
| If-None-Match | The opposite of If-Match; the server uses it to determine whether the file has changed |
| If-Modified-Since | Compares the resource's modification time between two accesses |
| If-Unmodified-Since | Compares the resource's modification time between two accesses |

4. Entity header fields

| Field | Description |
| --- | --- |
| Expires | Tells the client the absolute time at which the cached resource expires |
| Last-Modified | The time the resource was last modified |

🦅 Browser cache control

HTTP/1.1 defines 47 header fields, 12 of which are relevant to caching. The next two sections go through them one by one. 🤓

1. Cache-Control

Cache-Control directives tell the client or server how to handle the cache. Of the 12 fields it is also the one with the most directives. Let's look at the request directives first:

| Directive | Parameter | Description |
| --- | --- | --- |
| no-cache | none | Forces revalidation with the origin server |
| no-store | none | Nothing from the request or response is cached |
| max-age=[seconds] | required, in seconds | Cache lifetime, which is also the maximum Age value of the response |
| min-fresh=[seconds] | required | The response is expected to stay fresh for at least the specified time |
| no-transform | none | Proxies must not change the media type |
| only-if-cached | none | Fetch from the cache only |
| cache-extension | - | New directive token |

Response directives:

| Directive | Parameter | Description |
| --- | --- | --- |
| public | none | Any party (client, proxy, etc.) may cache the resource |
| private | optional | Only a specific user may cache the resource |
| no-cache | optional | The cached copy must be validated before it is used |
| no-store | none | Nothing from the request or response is cached |
| no-transform | none | Proxies must not change the media type |
| must-revalidate | none | Cacheable, but must be revalidated with the origin server |
| proxy-revalidate | none | Intermediate cache servers must revalidate the cached response |
| max-age=[seconds] | required, in seconds | Cache lifetime, which is also the maximum Age value of the response |
| s-maxage=[seconds] | required | Maximum Age value for responses on shared cache servers |
| cache-extension | - | New directive token |

Note that the no-cache directive is often misread as "do not cache", which is not accurate. no-cache means the response may be cached, but it must be validated with the server every time before it is used. It is no-store that means nothing is cached. Directives can also be combined, for example:

Cache-Control: max-age=100, must-revalidate, public

This means the cached copy is valid for 100 seconds; after that, it must be revalidated by sending a request to the origin server. Both proxy servers and the client may cache it.
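To make the structure of a combined header concrete, here is a small parser sketch. `parseCacheControl` is a hypothetical helper, and it deliberately ignores one corner case: a quoted argument that itself contains commas would be split incorrectly.

```javascript
// Minimal Cache-Control parser: returns { directive: argument-or-true }.
// Limitation (by assumption): quoted arguments containing commas are not handled.
function parseCacheControl(value) {
  const directives = {};
  for (const part of value.split(',')) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const eq = trimmed.indexOf('=');
    const name = (eq === -1 ? trimmed : trimmed.slice(0, eq)).trim().toLowerCase();
    // Directives without an argument are stored as `true`
    const arg = eq === -1 ? true : trimmed.slice(eq + 1).trim().replace(/^"|"$/g, '');
    directives[name] = arg;
  }
  return directives;
}

const parsed = parseCacheControl('max-age=100, must-revalidate, public');
console.log(parsed['max-age'], parsed['must-revalidate'], parsed.public);
```

Arguments are kept as strings, so a caller that needs the lifetime would convert `parsed['max-age']` with `Number(...)`.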

2. Pragma

This is an HTTP/1.0 field, but it has a high priority: in my tests, Chrome and Firefox gave Pragma precedence over Cache-Control and Expires. It is kept for backward compatibility and may still be used like this: 🤔

<meta http-equiv="Pragma" content="no-cache">

Pragma is a general header field. When it is used on the client side, the convention is to add the meta tag above to the HTML (possibly, as a hack, after the body).

In practice this way of disabling the cache is of limited use:

  1. Only Internet Explorer understands this meta tag; other mainstream browsers only recognize a "Cache-Control: no-store" meta tag (see the source);
  2. When IE recognizes the meta tag it does not necessarily add Pragma to the request headers, but it does make the current page issue a new request every time (only the page itself; resources on the page are not affected). (From "A brief introduction to the browser HTTP caching mechanism")

You can test this yourself with code that simulates the server's decisions.

When the server response includes 'Pragma': 'no-cache', the browser behaves much as it does on a forced refresh.
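One possible way to try this is a tiny Node.js server that adds the header. This is a sketch, not the code the article originally referred to; `handler` and `createServer` are names chosen for this example, and the port in the usage note is arbitrary.

```javascript
const http = require('http');

// Request handler that always answers with Pragma: no-cache.
function handler(req, res) {
  res.writeHead(200, {
    'Content-Type': 'text/html; charset=utf-8',
    'Pragma': 'no-cache'
  });
  // The timestamp makes it easy to see whether the page was fetched again
  res.end('<h1>' + new Date().toISOString() + '</h1>');
}

// Build the server without starting it; the caller decides when to listen.
function createServer() {
  return http.createServer(handler);
}
```

Calling `createServer().listen(3000)` and reloading http://localhost:3000 with the browser's network panel open shows whether each visit triggers a new request.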

3. Expires

This is another HTTP/1.0 field. As stated above, it defines the absolute time at which the cache expires.

We can also use it directly in an HTML file:

<meta http-equiv="expires" content="Thu, 30 Nov 2017 11:17:26 GMT">

What if that time is already in the past? Then refreshing the page sends the request again.

If Pragma disables caching and Expires is also defined, the Pragma field takes priority. 🤖

A big problem with Expires is that the value is based on the server's time, but expiry is judged against the client's time. This leaves Expires at the mercy of the client: the user may change the client's clock, and the cache expiry is then judged incorrectly. This is one of the reasons the Cache-Control: max-age directive was introduced.
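A small sketch of the clock problem, with hypothetical helper names: the server sends an Expires one hour ahead of its own clock, but the client's clock runs a day fast.

```javascript
// Expires is an absolute time, so it is compared against the client's clock.
function isExpiredByExpires(expiresHeader, clientNowMs) {
  return clientNowMs >= Date.parse(expiresHeader);
}

// max-age is relative: only the time elapsed since the response was stored matters.
function isExpiredByMaxAge(maxAgeSeconds, storedAtMs, clientNowMs) {
  return (clientNowMs - storedAtMs) / 1000 >= maxAgeSeconds;
}

const serverNow = Date.parse('Thu, 30 Nov 2017 10:17:26 GMT');
const expires = 'Thu, 30 Nov 2017 11:17:26 GMT';   // one hour after serverNow
const clientNow = serverNow + 24 * 60 * 60 * 1000; // client clock is a day ahead

console.log(isExpiredByExpires(expires, clientNow));        // true
console.log(isExpiredByMaxAge(3600, clientNow, clientNow)); // false
```

With Expires the response looks stale the moment it arrives, while max-age=3600 still gives it its full hour, because the offset between the two clocks never enters the calculation.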

4. Last-Modified

The next few fields are validation fields, that is, fields that take part in the negotiation cache phase. The first is Last-Modified, which not only plays a role in negotiation caching but is also crucial to the heuristic caching phase.

The first time a browser requests a URL, the server returns status code 200 with the requested resource as the response body, plus a Last-Modified header recording when the file was last modified on the server, like this:

Last-Modified: Fri, 12 May 2006 18:53:33 GMT

If-Modified-Since

When the browser requests the URL a second time, the HTTP protocol has it put the Last-Modified value from the first response into If-Modified-Since and send it to the server, to check whether the resource has been modified, like this:

If-Modified-Since: Fri, 12 May 2006 18:53:33 GMT

The server uses If-Modified-Since to determine whether the resource changed between the two accesses and therefore whether to return the full resource. If it changed, the resource is returned with status code 200. If not, only the response headers are returned, with status code 304, telling the browser that its local copy is still usable.

Uses:

  • Verifying that the local cache is still usable

If-Unmodified-Since

Literally this field is the opposite of If-Modified-Since, but the handling is not simply reversed. If the file has not been modified between the two accesses, the server returns 200 and the resource; if it has been modified, the server returns 412 (Precondition Failed).

Uses:

  • Together with range requests that carry an If-Range header, it supports resumable downloads: if the resource has not been modified the download continues, and if it has been modified, resuming is pointless.
  • In POST and PUT requests it provides optimistic concurrency control: when several users edit the same document, a submitted edit is rejected if the resource on the server has been modified in the meantime.
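The 200-or-412 decision described for If-Unmodified-Since can be sketched as a single function. The name and the lower-case header key are assumptions of this example.

```javascript
// Returns the status code the server would answer with.
function checkIfUnmodifiedSince(requestHeaders, lastModified) {
  const since = requestHeaders['if-unmodified-since'];
  if (since === undefined) return 200; // no precondition to check
  const sinceMs = Date.parse(since);
  if (Number.isNaN(sinceMs)) return 200; // an invalid date is ignored
  // Unchanged since the given time: proceed; otherwise Precondition Failed
  return Date.parse(lastModified) <= sinceMs ? 200 : 412;
}
```

For an edit request, a 412 here is what tells the second editor that someone else saved first.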

😈 Last-Modified has a few disadvantages. It cannot always tell accurately whether a resource has changed: if a file is modified several times within one second, the Last-Modified time (which has one-second resolution) cannot reflect that. Conversely, if a file is rewritten but its actual content does not change, Last-Modified still reports a modification. Hence the ETag field introduced in HTTP/1.1. 👇

5. ETag

The server can generate a unique identifier for the resource with an algorithm of its own choosing (an MD5 digest, for example) and send it to the client in a response header the first time the browser requests the URL. The server's status code is 200.

ETag: abc-123456

An ETag value may carry a W/ prefix to indicate that the weak comparison algorithm should be used (this is redundant for If-None-Match, which always uses weak comparison anyway). 🙄

If-None-Match

If If-None-Match and If-Modified-Since are both present, If-None-Match takes priority.

When the browser requests the URL a second time, the HTTP protocol has it put the ETag value from the first response into If-None-Match and send it to the server, to check whether the resource has been modified, like this:

If-None-Match: abc-123456

For a GET request, the server returns the requested resource with status code 200 if and only if no resource on the server has an ETag that matches one of those listed in the header. If a resource's ETag does match, status code 304 is returned.

For POST, PUT, and other requests that change a file, status code 412 is returned if a resource's ETag matches.

If-Match

For GET and HEAD requests, the server returns the resource only if it matches one of the ETags listed in this header. For PUT and other non-safe methods, the resource is uploaded only if the condition is met.

Uses:

  • For GET and HEAD, used together with the Range header, it guarantees that the newly requested range comes from the same resource as the previously requested range. If the ETag does not match, a 416 (Range Not Satisfiable) response is returned.
  • For other methods, especially PUT, If-Match can be used to avoid the lost update problem: it detects whether the update the user wants to upload would overwrite changes made after the original resource was retrieved. If the condition is not met, a 412 (Precondition Failed) response is returned.
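The lost update scenario can be played through with an in-memory document. Everything here is hypothetical: the `doc` shape, the `conditionalPut` name, and the version-based ETags are made up for the illustration.

```javascript
// Assumption: doc = { version, etag, body }. Returns { status, doc }.
function conditionalPut(doc, ifMatch, newBody) {
  if (ifMatch !== undefined && ifMatch.trim() !== '*') {
    const tags = ifMatch.split(',').map((t) => t.trim());
    // If-Match uses strong comparison, so a weak ETag never satisfies it
    const matches = !doc.etag.startsWith('W/') && tags.includes(doc.etag);
    if (!matches) {
      return { status: 412, doc }; // Precondition Failed: document left untouched
    }
  }
  const version = doc.version + 1;
  return { status: 200, doc: { version, etag: '"v' + version + '"', body: newBody } };
}

// Two editors both loaded version 1, then save one after the other.
const original = { version: 1, etag: '"v1"', body: 'draft' };
const firstSave = conditionalPut(original, '"v1"', 'edit by A');
const secondSave = conditionalPut(firstSave.doc, '"v1"', 'edit by B');
console.log(firstSave.status, secondSave.status); // 200 412
```

The second save is rejected instead of silently overwriting the first editor's change.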

Of course, compared with Last-Modified, ETag has drawbacks of its own, such as the performance cost of generating an identifier for each resource. 😕

Strong and weak validation:

| ETag 1 | ETag 2 | Strong comparison | Weak comparison |
| --- | --- | --- | --- |
| W/"1" | W/"1" | no match | match |
| W/"1" | W/"2" | no match | no match |
| W/"1" | "1" | no match | match |
| "1" | "1" | match | match |
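The two comparison rules in the table can be written down directly. The function names are chosen for this sketch: strong comparison requires both tags to be strong and identical, weak comparison ignores the W/ prefix.

```javascript
// Split an ETag into its weak flag and its opaque (quoted) part.
function parseETag(tag) {
  const weak = tag.startsWith('W/');
  return { weak, opaque: weak ? tag.slice(2) : tag };
}

// Strong comparison: neither tag is weak and the opaque parts are identical.
function strongMatch(a, b) {
  const x = parseETag(a);
  const y = parseETag(b);
  return !x.weak && !y.weak && x.opaque === y.opaque;
}

// Weak comparison: only the opaque parts have to be identical.
function weakMatch(a, b) {
  return parseETag(a).opaque === parseETag(b).opaque;
}
```

Running the four rows of the table through these two functions reproduces the match and no-match columns.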

🐝 Server cache control

When Expires and Cache-Control: max-age=xxx are both present, which one applies depends on the HTTP version the cache server implements. An HTTP/1.1 cache server gives priority to max-age and ignores Expires, while an HTTP/1.0 cache server gives priority to Expires and ignores max-age. Now let's look at the two fields that concern cache servers.

6. Vary

What does Vary do? Imagine a web page that serves different content to mobile and desktop clients. How can a cache server tell mobile from PC? In case you have not noticed, the browser sends a User-Agent field with every request to identify itself, so we can use User-Agent to tell different clients apart. It is used like this:

Vary: User-Agent

Another example: the origin server has gzip compression enabled, but a user has an older browser that does not support compression. What should the cache server return? You can set it like this:

Vary: Accept-Encoding

Of course, you can also combine them:

Vary: User-Agent, Accept-Encoding

This means the cache server keeps separate cached versions per User-Agent and Accept-Encoding: those two request header fields decide which content is returned to the client.
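One way to picture this is that the cache server builds its lookup key from the URL plus the request headers named in Vary. The sketch below is an illustration of that idea only; `cacheKey` and the key format are invented for this example, and request header names are assumed to be lower-cased.

```javascript
// Build a cache key from the URL and the request headers listed in Vary.
function cacheKey(url, varyHeader, requestHeaders) {
  const names = varyHeader
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort(); // sort so the key does not depend on the order in Vary
  const parts = names.map((name) => name + '=' + (requestHeaders[name] || ''));
  return [url].concat(parts).join('|');
}

const gzipKey = cacheKey('/app.css', 'Accept-Encoding', { 'accept-encoding': 'gzip' });
const plainKey = cacheKey('/app.css', 'Accept-Encoding', {});
console.log(gzipKey);  // /app.css|accept-encoding=gzip
console.log(plainKey); // /app.css|accept-encoding=
```

Because the two keys differ, the compressed and uncompressed versions are stored and served separately.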

7. Age

Age is the length of time, in seconds, that the resource has been stored on the cache server.

What is this field for? It tells you whether the requested resource came from the origin server or from the cache server's cache.

Its value has to be read together with another field, Date, which is the time the message was created.

Date

If you keep pressing F5 to refresh and the Date in the response does not change, you have hit the cache server's cache. Here is an example response:

Accept-Ranges: bytes
Age: 1016859
Cache-Control: max-age=2592000
Content-Length: 14119
Content-Type: image/png
Date: Fri, 01 Dec 2017 12:27:25 GMT
ETag: "5912bfd0-3727"
Expires: Tue, 19 Dec 2017 17:59:46 GMT
Last-Modified: Wed, 10 May 2017 07:22:56 GMT
Ohc-Response-Time: 1 0 0 0 0 0
Server: BFE/1.0.8.13-SSLPool-patch

The response above comes from an image on the Baidu home page. We can see Age: 1016859, which means the resource has been on the cache server for 1016859 seconds. If the file is modified or replaced, Age starts counting from 0 again.

The Age header is usually close to 0, which indicates that the object has just been fetched from the origin server. Any other value is the difference between the cache server's current system time and the value of the Date general header in the response.

The statement above boils down to one equation:

Static resource Age + static resource Date = Date of the origin server
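Age also tells you how much of the max-age lifetime is left. The helper below is a hypothetical sketch that assumes a header object with lower-case keys. For the sample response, adding the remaining lifetime to Date lands exactly on the Expires value.

```javascript
// Remaining freshness in seconds: max-age minus the time already spent in the cache.
function remainingFreshnessSeconds(headers) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(headers['cache-control'] || '');
  if (!match) return null; // no max-age, nothing to compute
  const age = Number(headers['age'] || 0);
  return Number(match[1]) - age;
}

const imageHeaders = {
  'age': '1016859',
  'cache-control': 'max-age=2592000',
  'date': 'Fri, 01 Dec 2017 12:27:25 GMT',
  'expires': 'Tue, 19 Dec 2017 17:59:46 GMT'
};
const remaining = remainingFreshnessSeconds(imageHeaders);
console.log(remaining); // 1575141

// Date + remaining lifetime, compared with the Expires header
const expiresAt = Date.parse(imageHeaders['date']) + remaining * 1000;
console.log(expiresAt === Date.parse(imageHeaders['expires'])); // true
```

So the image had roughly 18 days of freshness left when this response was served.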

🐲 Impact of user operations on the cache

I searched for a long time without finding an authoritative summary of this topic, and was finally surprised to find one in Baidu Encyclopedia. I added a row for how the browser responds to a forced refresh. Forced refresh: Ctrl+F5 on Windows, Command+Shift+R on Mac.

| Operation | Description |
| --- | --- |
| Open a new window | If Cache-Control is private, no-cache, or must-revalidate, the server is accessed again whenever a new window is opened. If a max-age value is specified, the server is not accessed again within that time. For example, Cache-Control: max-age=5 means the server is not accessed again for 5 seconds after the page is accessed. |
| Press Enter in the address bar | If the value is private or must-revalidate, the server is accessed only the first time and not again. If the value is no-cache, it is accessed every time. If the value is max-age, it is not accessed again until expiration. |
| Press the back button | If the value is private, must-revalidate, or max-age, the server is not accessed again. If the value is no-cache, it is accessed every time. |
| Press the refresh button | Behavior differs between browsers (Chrome, for example, may serve 200 from cache). |
| Press the force refresh button | Re-requested as if visiting for the first time (status code 200 is returned) |

From Baidu Encyclopedia

What if you do not want the browser to send a new validation request when the refresh button is pressed? Load the resource dynamically with a script after the page has loaded:

$(window).on('load', function () {
  var bg = 'http://img.infinitynewtab.com/wallpaper/100.jpg';
  // Set the background image asynchronously, after the load event has fired
  setTimeout(function () {
    $('#bgOut').css('background-image', 'url(' + bg + ')');
  }, 0);
});

From zhihu

🐩 HTML cache

Strictly speaking, this part should be called offline storage. The most common option at present is AppCache, but AppCache has been removed from the web standards; Service Worker is likely to be the suitable solution for the foreseeable future.

1. AppCache

This is a feature introduced with HTML5 that uses offline storage so users can visit pages without a network connection. Documents load from the offline copy even if the user clicks refresh.

To use it, reference the appcache manifest file from the HTML file:


      
<html manifest="manifest.appcache">
<head>
  <meta charset="UTF-8">
  <title>* * *</title>
</head>
<body>
  <div id="root"></div>
</body>
</html>

The manifest attribute can point to the cache manifest file with a relative path or with an absolute URL (an absolute URL must be same-origin with the application). The manifest file may use any extension, but it must be served with the MIME type text/cache-manifest.

Note: on an Apache server, to set the MIME type for manifest (.appcache) files, you can add AddType text/cache-manifest .appcache to an .htaccess file in the root directory or in the same directory as the application.

CACHE MANIFEST
# Files that need to be cached; whether online or not, they are read from the cache
# v1 2017-11-30
# This is another comment
/static/logo.png

# Files that are not cached and are always retrieved from the network
NETWORK:
example.js

# Fallback used when a resource cannot be retrieved, e.g. a request for index.html returns the 404 page
FALLBACK:
index.html 404.html

The above is an example of a complete cache manifest file.

Note: the main page is always cached, because AppCache is mainly meant for offline applications; if the main page were not cached, it could not be viewed offline. So adding index.html to NETWORK has no effect.

This feature has in fact been removed from the web standards, but many browsers still support it, which is why it is mentioned here.

You can test it with the latest Firefox (version 57.0.1); the console shows this message 👉:

AppCache API has been deprecated and will be removed in a few days. For offline support, try using Service Worker.

The latest Chrome(version 62.0.3202.94) doesn’t have this warning. 🐻

Here are a few reasons why AppCache isn’t a favorite:

  1. Once a manifest is in use, there is no programmatic way to clear these caches; they can only be updated, or users have to clear the browser cache themselves;
  2. If any one of the resources fails to update, the whole update fails and the previous version of the cache is used;
  3. The main page (the page that references the manifest) is forcibly cached and cannot be removed;
  4. The appcache file itself may not be updated in time, because browsers handle appcache files differently;
  5. When any of the above goes wrong, it drives users mad and developers mad!

2. Service Worker

Service Worker is still an experimental feature and is not recommended for production environments. Here is a general introduction.

A Service Worker essentially acts as a proxy server sitting between the web application and the browser.

🙂 First, a short story:

We all know that the browser's JS engine is single-threaded. It is like a big Boss who only does one thing at a time. So the W3C (HR) hired a secretary (Web Worker) for the Boss. The Boss hands trivial tasks to the secretary, who sends a WeChat message (postMessage) when they are done, and the Boss picks up the results through onmessage. Evening comes and it is time to go home! The Boss goes home to play with his son, the secretary goes out on a date, and nobody is left to work overtime. That will not do! So the W3C (HR) proposed recruiting a programmer 🐵, and Service Worker was hired. The programmer 🙈 stuck to the post and set off down the endless road of overtime. In broad terms, this is what the programmer does:

  • Background data synchronization
  • Responding to resource requests from other origins
  • Centrally receiving computationally expensive data updates, such as geolocation and gyroscope information, so that multiple pages can use the same set of data
  • Client-side compilation and dependency management of CoffeeScript, LESS, and CJS/AMD modules (for development purposes)
  • Hooks for background services
  • Custom templates based on specific URL patterns
  • Performance enhancements, such as prefetching resources that a user is likely to need

From the Service Worker API

Note: Service Workers are better than earlier attempts at the same thing (such as the AppCache mentioned above), because those attempts could not abort an operation when something went wrong. Service Workers give you finer control over everything. How?

Service Workers make heavy use of Promise, an important ES6 feature, and use the new Fetch API when intercepting requests, because fetch returns a Promise. Events, Promises, and Fetch requests are the three important parts of Service Workers. OK, talk is cheap, show you the code. 🤓

app.js: check whether the service worker API is available and, if it is, tell the browser to register a JavaScript file as the service worker:

// Use ServiceWorkerContainer.register() to register the service worker for the first time.
if (navigator.serviceWorker) {
  	navigator.serviceWorker.register('./sw.js', {scope: '/'})
      	.then(function (registration) {
          	console.log(registration);
      	})
      	.catch(function (e) {
          	console.error(e);
      	});
} else {
  	console.log('This browser does not support Service workers');
}

Now look at sw.js, the service worker file itself. Here is an example:

const CACHE_VERSION = 'v1'; // Version (name) of the cache
const CACHE_FILES = [ // Files to cache up front
  './test.js',
  './app.js',
  'https://code.jquery.com/jquery-3.0.0.min.js'
];

self.addEventListener('install', function (event) { // Listen for the worker's install event
  event.waitUntil( // Delay the end of install until the cache is initialized
    caches.open(CACHE_VERSION).then(function (cache) {
      console.log('Cache open');
      return cache.addAll(CACHE_FILES);
    })
  );
});

self.addEventListener('activate', function (event) { // Listen for the worker's activate event
  event.waitUntil( // Delay the end of activate until old caches are cleared
    caches.keys().then(function (keys) {
      return Promise.all(keys.map(function (key) {
        if (key !== CACHE_VERSION) {
          return caches.delete(key); // Clear caches from old versions
        }
      }));
    })
  );
});

self.addEventListener('fetch', function (event) { // Intercept the page's resource requests
  event.respondWith(
    caches.match(event.request).then(function (res) { // Check whether the cache is hit
      if (res) { // Return the resource from the cache
        return res;
      }
      return _request(event); // Cache miss: fall back to the network
    })
  );
});

function _request(event) {
  var request = event.request.clone();
  return fetch(request).then(function (res) { // Use fetch to request the resource from the network
    // Do not cache failed or non-basic (cross-origin) responses
    if (!res || res.status !== 200 || res.type !== 'basic') {
      return res;
    }

    var response = res.clone(); // Clone the response: one copy for the cache, one for the page

    caches.open(CACHE_VERSION).then(function (cache) { // Cache the resource fetched from the network
      cache.put(event.request, response);
    });
    return res;
  });
}

Unregistering a Service Worker is also simple:

if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js', {scope: '/'}).then(function(registration) {
    // registration worked
    console.log('Registration succeeded.');
    registration.unregister().then(function(boolean) {
      // if boolean = true, unregister is successful
    });
  }).catch(function(error) {
    // registration failed
    console.log('Registration failed with ' + error);
  });
};

Compared with AppCache, the Service Worker API is much larger and more complicated to use. Even so, Service Workers are clearly the future, and they are especially powerful for web apps. In addition to Chrome and Firefox, Safari has recently added Service Worker support. Expect them to shine in the future. 🤗

🦉 Simulating the server-side decision

Here is a simple simulation, written with Node's built-in modules, of how a server sends a response, including how it handles the negotiated cache:

var http = require('http');
var fs = require('fs');
var url = require('url');

let tag = '"123456"'; // Fixed ETag for the simulation (ETag values are quoted strings)

http.createServer(function (request, response) {

    var pathname = url.parse(request.url).pathname;

    console.log('Request for ' + pathname + ' received.');
    const fileMap = {
        'js': 'application/javascript; charset=utf-8',
        'html': 'text/html',
        'png': 'image/png',
        'jpg': 'image/jpeg',
        'gif': 'image/gif',
        'ico': 'image/*',
        'appcache': 'text/cache-manifest'
    };
    fs.readFile(pathname.slice(1), function (err, data) {
        if (err) { // File missing or unreadable
            response.writeHead(404, {'Content-Type': 'text/plain'});
            response.end('Not Found');
            return;
        }
        // HTTP dates must be in GMT format, hence toUTCString()
        var headers = {
            'Content-Type': fileMap[pathname.split('.').pop()] || 'application/octet-stream',
            'Cache-Control': 'max-age=10, public',
            'Expires': new Date(Date.now() + 30000).toUTCString(),
            'ETag': tag,
            'Last-Modified': new Date(Date.now() - 30000).toUTCString(),
            'Vary': 'User-Agent'
        };
        if (request.headers['if-none-match'] === tag) {
            // Negotiated cache hit: the client's copy is still valid, so send no body
            response.writeHead(304, headers);
        } else {
            response.writeHead(200, headers);
            response.write(data);
        }
        response.end();
    });
}).listen(8081);

That is all the code. If you haven't used Node before, install Node, save the code as file.js, and create an index.html file in the same directory. Then run node file.js and open localhost:8081/index.html in a browser to try the simulation. 🤓

🦆 Some Q&A about caching
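The simulation compares If-None-Match with the tag using a plain ===. A real request may carry several ETags, a weak validator prefixed with W/, or the wildcard *. The sketch below shows one way the comparison could be made more tolerant. The helper names etagMatches and statusFor are hypothetical and not part of the code above, and splitting on commas is a simplification that assumes the tags themselves contain no commas.

```javascript
// Hypothetical helper: does an If-None-Match header match the current ETag?
function etagMatches(ifNoneMatch, currentEtag) {
  if (!ifNoneMatch) return false;               // No conditional header sent
  if (ifNoneMatch.trim() === '*') return true;  // Wildcard matches any existing resource
  // Weak comparison: ignore a leading "W/" on either side
  const strip = (tag) => tag.trim().replace(/^W\//, '');
  const current = strip(currentEtag);
  // Simplification: assumes no commas inside the tags themselves
  return ifNoneMatch.split(',').some((tag) => strip(tag) === current);
}

// Status code the simulated server would answer with
function statusFor(requestHeaders, currentEtag) {
  return etagMatches(requestHeaders['if-none-match'], currentEtag) ? 304 : 200;
}

console.log(statusFor({ 'if-none-match': '"123456"' }, '"123456"'));
console.log(statusFor({ 'if-none-match': 'W/"abc", "123456"' }, '"123456"'));
console.log(statusFor({}, '"123456"'));
```

In the server, the condition request.headers['if-none-match'] === tag could then be swapped for a call to etagMatches, if you want to experiment with it.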

1. Problem: The request is cached, so the new code does not take effect

Solution:

  • Add the Cache-Control: no-cache, must-revalidate directives to the server response;
  • Modify the request headers, for example send If-Modified-Since: 0 or an If-None-Match value;
  • Modify the request URL by appending a random value, such as a timestamp or a hash, for example damonare.cn?a=1234
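The URL approach works because a cache keys its entries by URL, so a changed query string looks like a brand-new resource. Here is a minimal sketch, assuming a hypothetical helper withVersion and a parameter name v (both are illustrative choices, not something the article prescribes):

```javascript
// Hypothetical helper: add or replace a version parameter on a URL
function withVersion(rawUrl, version, param = 'v') {
  const url = new URL(rawUrl);          // Built into browsers and Node
  url.searchParams.set(param, version); // Replaces any existing value of the same parameter
  return url.toString();
}

console.log(withVersion('https://damonare.cn/app.js', '1234'));
console.log(withVersion('https://damonare.cn/app.js?lang=en', '1234'));
```

A content hash is usually a better value than a timestamp, because the URL then changes only when the file really changes and unchanged files can keep being served from the cache.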

2. Problem: A server-side cache (proxy or CDN) keeps serving old resources, so the client does not get the updated code

Solution:

  • Set the Cache-Control: s-maxage directive to a sensible value;
  • Set Cache-Control: private to prevent proxy servers from caching the resource;
  • Refresh the CDN cache through the cache refresh interface provided by the administrator.
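To make the first two options concrete, here is a small sketch that builds the header value. The function name cacheControl and its option names are hypothetical. It relies only on the fact that s-maxage applies to shared caches (proxies, CDNs), which is why it is left out when the response is private.

```javascript
// Hypothetical helper: build a Cache-Control header value
function cacheControl({ scope = 'public', maxAge, sMaxAge } = {}) {
  const parts = [scope];
  if (Number.isInteger(maxAge)) parts.push(`max-age=${maxAge}`);
  // s-maxage only concerns shared caches, so it is skipped for "private"
  if (scope === 'public' && Number.isInteger(sMaxAge)) {
    parts.push(`s-maxage=${sMaxAge}`);
  }
  return parts.join(', ');
}

// Browsers may keep the copy for 10 minutes, shared caches only for 1 minute
console.log(cacheControl({ maxAge: 600, sMaxAge: 60 }));
// Only the end user's browser may store the response
console.log(cacheControl({ scope: 'private', maxAge: 600, sMaxAge: 60 }));
```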

3. Problem: What is the difference between Cache-Control: max-age=0 and no-cache?

Answer:

max-age=0 and no-cache differ in strength. max-age=0 makes the cached copy stale immediately, so the client should verify its validity with the server. no-cache tells the client that it must verify the validity of the cache with the server before using it.
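The difference can be shown with a deliberately simplified model of a cache's decision. Everything here is an assumption made for illustration: the function names are hypothetical, Expires and heuristic freshness are ignored, and "server unreachable" stands in for the situations where a cache is allowed to serve a stale copy.

```javascript
// Parse a Cache-Control value into a { directive: value } map (simplified parser)
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [name, value] = part.trim().split('=');
    if (name) directives[name.toLowerCase()] = value === undefined ? true : value;
  }
  return directives;
}

// Simplified model: may a stored response of the given age be reused
// without asking the server first?
function canReuseWithoutValidation(header, ageSeconds, { serverReachable = true } = {}) {
  const cc = parseCacheControl(header);
  // no-cache: validation is mandatory; no-store: nothing should be stored at all
  if (cc['no-store'] || cc['no-cache']) return false;
  const maxAge = typeof cc['max-age'] === 'string' ? Number(cc['max-age']) : NaN;
  if (Number.isFinite(maxAge) && ageSeconds < maxAge) return true; // Still fresh
  // Stale copy: in this model it is reusable only when the server cannot be
  // reached and must-revalidate does not forbid it
  return !serverReachable && !cc['must-revalidate'];
}

console.log(canReuseWithoutValidation('max-age=0', 0));
console.log(canReuseWithoutValidation('max-age=0', 0, { serverReachable: false }));
console.log(canReuseWithoutValidation('no-cache', 0, { serverReachable: false }));
```

In this model the two directives behave identically while the server is reachable. They differ only in the stale case, which is where the "should" versus "must" distinction shows up.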

Afterword.

Reference Documents:

  • Discussion on browser HTTP caching mechanism
  • Cache-Control
  • What's the difference between Cache-Control: max-age=0 and no-cache?
  • Header Field Definitions
  • Caching Tutorial
  • If-Unmodified-Since
  • If-Match