Enter a web address in your browser

DNS cache

DNS stands for Domain Name System.

We usually type a domain name when browsing the Internet, but a TCP connection can only be established to an IP address. The domain name therefore has to be resolved to an IP address first, and this step is DNS resolution.

DNS resolution is a recursive query. The following sources are tried in sequence to obtain the IP address that corresponds to the domain name:

  • Browser cache
  • System cache (the operating system's DNS cache and its hosts file)
  • Router cache
  • DNS cache of the Internet service provider (the caching DNS servers run by ISPs such as China Unicom, China Mobile, and China Telecom)
  • Root DNS server
  • Top-level domain name server
  • Authoritative name server for the domain

Resolution follows the steps above in order. If one step cannot answer the query, the lookup moves on to the next step. If no step returns an address, the browser reports that the page failed to open.
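The "try each source in order" idea can be sketched as a loop over cache layers. This is only an illustration: the layer contents, the domain names and the `resolve` function are made up for the example, and real resolvers also handle TTLs and network queries.

```javascript
// Hypothetical cache layers, listed in the order described above.
// Each layer maps a domain name to an IP address; all data is made up.
const layers = [
  { name: 'browser cache', records: new Map() },
  { name: 'system cache', records: new Map([['example.test', '192.0.2.10']]) },
  { name: 'router cache', records: new Map() },
  { name: 'ISP DNS cache', records: new Map([['docs.example.test', '192.0.2.20']]) },
];

// Try each layer in turn and stop at the first one that knows the answer.
function resolve(domain, lookupLayers) {
  for (const layer of lookupLayers) {
    const ip = layer.records.get(domain);
    if (ip !== undefined) {
      return { ip, source: layer.name };
    }
  }
  // No layer could answer: the browser would report that the page failed to open.
  return null;
}

console.log(resolve('example.test', layers));
console.log(resolve('docs.example.test', layers));
console.log(resolve('missing.example.test', layers));
```

The earlier a layer answers, the less work is done, which is why the closest caches are consulted first.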

CDN cache

CDN stands for Content Delivery Network.

A CDN “inserts itself” into the DNS resolution process. Resolution may return the IP address of a CDN server, so your request reaches the CDN server rather than the target website's own server.

Because the CDN caches most of the website's static resources, such as images and CSS stylesheets, many HTTP requests never need to reach the target website. The CDN responds directly and sends the data to you. Pages generated dynamically by back-end services such as PHP or Java are “dynamic resources”. They generally cannot be cached by the CDN and must be fetched from the target website. In that case your HTTP request begins its long journey across the Internet, passing through numerous routers, gateways, and proxies to reach its destination.
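A rough sketch of that decision on a CDN node follows. Treating a resource as static based on its file extension is an assumption made for the example, as are the function names; real CDNs decide from response headers and configuration.

```javascript
// File extensions treated as static resources in this sketch (an assumption).
const STATIC_EXTENSIONS = new Set(['.png', '.jpg', '.gif', '.css', '.js', '.woff2']);

function isStaticResource(path) {
  const pathname = path.split('?')[0];
  const dot = pathname.lastIndexOf('.');
  return dot !== -1 && STATIC_EXTENSIONS.has(pathname.slice(dot).toLowerCase());
}

// edgeCache: Map of path -> body held by the CDN node.
// fetchFromOrigin: function that asks the target website for the resource.
function handleAtCdn(path, edgeCache, fetchFromOrigin) {
  if (isStaticResource(path)) {
    if (edgeCache.has(path)) {
      return { body: edgeCache.get(path), servedBy: 'cdn' };
    }
    const body = fetchFromOrigin(path);
    edgeCache.set(path, body); // keep a copy for later requests
    return { body, servedBy: 'origin' };
  }
  // Dynamic resources always go to the target website.
  return { body: fetchFromOrigin(path), servedBy: 'origin' };
}
```

The first request for a static file still reaches the origin; only later requests are answered by the CDN node itself.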

The front-end cache

The HTTP cache discussed in this article is the cache applied to HTTP requests. Both the server and the client can influence it, but it is mainly controlled by the server. The browser cache (cookies, Web Storage and so on), by contrast, is set up by front-end code at development time.

HTTP cache

Where cache resources go

Memory Cache

Memory Cache stores resources in memory so that they can be read back from memory instead of being downloaded again. WebKit divides resources into two types: main resources, such as HTML pages or downloads, and derived resources, such as images or scripts embedded in an HTML page. These correspond to two classes in the code: MainResourceLoader and SubresourceLoader. WebKit supports memory caching, but only for derived resources. The corresponding class is CachedResource, which stores raw data (CSS, JS, etc.) as well as decoded image data.

Disk Cache

Disk Cache, as its name implies, caches resources on disk. On the next access, CurlCacheManager reads the resource directly from disk instead of downloading it again.

| | Memory Cache | Disk Cache |
|---|---|---|
| Similarity | Only some derived resource files can be stored | Only some derived resource files can be stored |
| Difference | Data is cleared when the process exits | Data is not cleared when the process exits |
| Stored resources | Scripts, fonts, and images are generally stored in memory | Non-script resources, such as CSS, are generally stored on disk |

CSS files are applied once when they are loaded and are rarely read again, so they are not well suited to the memory cache. Scripts such as JS, however, may be executed at any time. If a script were kept on disk, it would have to be read from disk into memory every time it runs, and the I/O overhead could be high enough to make the browser unresponsive.

HTTP cache usage flow

Strong cache

When a browser loads a resource, it first checks whether the strong cache is hit. This is determined mainly by the Cache-Control and Expires headers.

Cache-Control

The Cache-Control header can take a number of values:

  • max-age=n is the most commonly used directive. It sets the resource's validity period: the response is fresh for n seconds counted from when the response was generated (not from when the client received it). max-age=0 behaves much like no-cache.
  • no-store forbids caching altogether. It is used for data that changes very frequently, such as a flash-sale page.
  • no-cache is easily confused with no-store. Despite its name, it does not forbid caching: the response may be stored, but the cache must check with the server before each use to find out whether it has expired and whether a newer version exists.
  • must-revalidate is similar to no-cache. The cached copy may be used while it is fresh, but once it has expired it must be revalidated with the server.
  • public: the response may be cached by anyone, including the end user's browser and intermediate proxy servers such as a CDN.
  • private: the response may be cached only by the end user's browser, not by intermediate cache servers such as a CDN.
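The directives above can be illustrated with a small parser and a decision function. This is a simplified sketch, not how any particular browser implements it: `parseCacheControl` and `cacheDecision` are names invented for the example, and Expires, heuristic freshness and the public/private distinction are deliberately left out.

```javascript
// Parse a Cache-Control header value into an object of directives.
// "max-age=60, public" -> { 'max-age': '60', public: true }
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (rawName === '') continue;
    directives[rawName.toLowerCase()] =
      rawValue === undefined ? true : rawValue.replace(/"/g, '');
  }
  return directives;
}

// Decide what a browser-like cache may do with a response.
// ageSeconds: seconds elapsed since the response was generated.
function cacheDecision(headerValue, ageSeconds) {
  const d = parseCacheControl(headerValue);
  if (d['no-store']) return 'do not store';
  if (d['no-cache']) return 'revalidate';
  const maxAge = Number(d['max-age']);
  if (Number.isFinite(maxAge) && ageSeconds < maxAge) return 'use cache';
  return 'revalidate';
}

console.log(cacheDecision('max-age=60', 30));
console.log(cacheDecision('no-cache, max-age=60', 1));
console.log(cacheDecision('no-store', 0));
```

Note that no-cache wins over max-age in this sketch: the copy is stored, but it is never used without asking the server first.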

Nginx configuration

location / {
	root   html;
	index  index.html index.htm;
	add_header Cache-Control no-store;
	# add_header Cache-Control no-cache;
	# add_header Cache-Control max-age=31536000;
}

Expires

This field dates from the HTTP/1.0 specification. Its value is an absolute time given as a GMT-formatted string, such as:

Expires: Mon, 26 Apr 2021 16:57:55 GMT

This is the time at which the resource expires; until then the cache is hit. One obvious disadvantage is that, because the expiration time is absolute, the cache can misbehave when the server and client clocks differ significantly. Browsers today use HTTP/1.1 or later by default, so Expires now plays only a minor role. When both are present, Cache-Control max-age takes priority over Expires.
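The priority between the two headers can be sketched as follows. The `expiryTime` function and the convention of passing headers as a plain object with lower-case names are assumptions of this example.

```javascript
// Return the time (ms since epoch) at which a response stops being fresh,
// or null if neither header gives a usable answer.
// headers: plain object with lower-case header names (an assumption of this sketch).
function expiryTime(headers) {
  const generatedAt = Date.parse(headers['date']);
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(headers['cache-control'] || '');
  if (match && !Number.isNaN(generatedAt)) {
    // max-age is relative to when the response was generated, so it does not
    // depend on an absolute timestamp chosen by the server.
    return generatedAt + Number(match[1]) * 1000;
  }
  // Fall back to the absolute time in Expires.
  const expires = Date.parse(headers['expires']);
  return Number.isNaN(expires) ? null : expires;
}

const both = {
  date: 'Mon, 26 Apr 2021 16:00:00 GMT',
  'cache-control': 'max-age=60',
  expires: 'Mon, 26 Apr 2021 16:57:55 GMT',
};
console.log(new Date(expiryTime(both)).toUTCString());
```

With both headers present, the result is one minute after the Date header rather than the later time given in Expires.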

Negotiated cache

When the strong cache is not hit, the browser sends a request to the server. The server decides from the request headers whether the negotiated cache is hit. If it is, the server returns 304 Not Modified, telling the browser that the resource has not changed and the local copy can be used.

Five request header fields are involved: If-Modified-Since, If-None-Match, If-Unmodified-Since, If-Match, and If-Range.

The most commonly used pairs are Last-Modified/If-Modified-Since and ETag/If-None-Match, with the latter taking precedence.

Last-Modified / If-Modified-Since

Last-Modified is the time at which the file was last modified. The server returns it to the client in the response headers.

If-Modified-Since is sent in the request headers when the client requests the resource again. Its value is the Last-Modified value from the previous response, and it tells the server which version the client holds. When the server sees that the request contains If-Modified-Since, it compares that value with the resource's last modification time on the server. If the resource was modified later than the If-Modified-Since value, the server returns the resource with status code 200; otherwise it returns 304, indicating that the resource has not changed and the cache can continue to be used.
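The server-side comparison might look roughly like this. The function name and the shape of the returned object are invented for the example; a real server would plug the same check into its own request handling.

```javascript
// lastModifiedMs: the resource's last modification time on the server (ms since epoch).
// ifModifiedSince: value of the If-Modified-Since request header, or undefined.
function respondToIfModifiedSince(lastModifiedMs, ifModifiedSince, body) {
  // HTTP dates have one-second resolution, so drop the milliseconds first.
  const lastModifiedSec = Math.floor(lastModifiedMs / 1000) * 1000;
  const headers = { 'Last-Modified': new Date(lastModifiedSec).toUTCString() };
  const since = Date.parse(ifModifiedSince);
  if (!Number.isNaN(since) && lastModifiedSec <= since) {
    // Not modified after the client's copy: the cache can be reused.
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body };
}

const modifiedAt = Date.parse('Mon, 26 Apr 2021 16:57:55 GMT');
const first = respondToIfModifiedSince(modifiedAt, undefined, 'page');
const second = respondToIfModifiedSince(modifiedAt, first.headers['Last-Modified'], 'page');
console.log(first.status, second.status);
```

Because the comparison works on whole seconds, a second modification within the same second still yields 304 here.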

Disadvantages of Last-Modified:

  1. If a resource on the server changes periodically and returns to its original content within some period, the cached copy could still be used, but Last-Modified has changed, so the cache is missed. ETag addresses this case.
  2. Last-Modified has one-second resolution, so if a file is modified several times within one second, the versions within that second are indistinguishable.

Etag / If-None-Match

ETag stands for Entity Tag, an identifier for a specific version of a resource, generated by the server. It solves the problem that modification time alone cannot always identify file changes accurately. With ETag, resource changes can be identified precisely, which lets the browser use the cache more effectively.

The server returns the ETag in the response headers. When the client requests the resource again, it sends If-None-Match carrying the ETag value from the previous response. If the request headers contain If-None-Match, the server compares its value with the resource's current ETag. If they match, the server returns 304, indicating that the resource has not changed and the cached file should be used. If they differ, the server returns the resource with status code 200.
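A minimal sketch of this exchange, assuming the ETag is derived from the content itself. The small hash below is only a stand-in for whatever scheme a real server uses, the function names are invented, and If-None-Match is assumed to carry a single tag.

```javascript
// Small FNV-1a-style hash over the string's character codes. It stands in for
// a real content hash; servers are free to build ETags differently.
function contentHash(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Entity tags are quoted strings.
function makeEtag(body) {
  return '"' + contentHash(body) + '"';
}

// ifNoneMatch: value of the If-None-Match request header, or undefined.
function respondToIfNoneMatch(ifNoneMatch, body) {
  const etag = makeEtag(body);
  const headers = { ETag: etag };
  if (ifNoneMatch === etag) {
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body };
}

const firstVisit = respondToIfNoneMatch(undefined, 'hello');
const secondVisit = respondToIfNoneMatch(firstVisit.headers.ETag, 'hello');
console.log(firstVisit.status, secondVisit.status);
```

Since the tag depends only on the content in this sketch, a file that is changed and then changed back produces the same ETag again, which is the case Last-Modified cannot handle.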

ETags are further divided into strong ETags and weak ETags. A strong ETag requires the resource to match exactly at the byte level. A weak ETag, marked with a “W/” prefix, only requires that the resource is semantically unchanged; it may differ internally (for example, reordered tags in HTML, or a few extra spaces).
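The two kinds of tag lead to two ways of comparing them. The sketch below follows the strong and weak comparison rules of the HTTP specification as I understand them; the helper names are invented for the example.

```javascript
// Split an entity tag into its weakness flag and opaque value.
function parseEtag(etag) {
  const weak = etag.startsWith('W/');
  return { weak, value: weak ? etag.slice(2) : etag };
}

// Strong comparison: neither tag is weak and the values are identical.
function strongMatch(a, b) {
  const x = parseEtag(a);
  const y = parseEtag(b);
  return !x.weak && !y.weak && x.value === y.value;
}

// Weak comparison: the values are identical, whether or not either tag is weak.
function weakMatch(a, b) {
  return parseEtag(a).value === parseEtag(b).value;
}

console.log(weakMatch('W/"v1"', '"v1"'));
console.log(strongMatch('W/"v1"', '"v1"'));
```

Weak comparison is enough to answer "can the cached copy be reused", while strong comparison matters when byte-exact content is required, for example when resuming a partial download.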

Browser cache

Cookie

Cookies are mainly used to store user information. They are sent to the server automatically with same-origin requests, and their capacity is about 4 KB. The data is shared within the same website and browser, so different tabs access the same data.

document.cookie // Returns the cookies readable by the current page as one string of "key=value" pairs separated by "; " (not JSON)
document.cookie = "key=value; path=path; domain=domain" // Adds or updates one cookie; only one cookie can be written per assignment
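Because reading document.cookie yields a single string, it usually has to be split before use. The helpers below are a sketch: `parseCookies` and `serializeCookie` are names made up for this example, and values are assumed to be URI-encoded when written.

```javascript
// Turn a cookie string such as "theme=dark; lang=en" into a plain object.
function parseCookies(cookieString) {
  const cookies = {};
  for (const pair of cookieString.split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue; // skip empty or malformed segments
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    cookies[name] = decodeURIComponent(value);
  }
  return cookies;
}

// Build the string to assign to document.cookie for one cookie.
function serializeCookie(name, value, maxAgeSeconds) {
  return `${name}=${encodeURIComponent(value)}; path=/; max-age=${maxAgeSeconds}`;
}

console.log(parseCookies('theme=dark; lang=en'));
console.log(serializeCookie('query', 'a b', 60));
```

In a page, these would be used as parseCookies(document.cookie) and document.cookie = serializeCookie(...).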

LocalStorage

LocalStorage is one type of Web Storage. Its data stays in the browser until the user clears the browser's cached data. The capacity is about 5 MB. The data is shared within the same website and browser, so different tabs access the same data.

localStorage.setItem(key, value);
localStorage.getItem(key);
localStorage.removeItem(key);
localStorage.clear();
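LocalStorage stores only strings and has no built-in expiry, so cached data is often wrapped with a timestamp. The following is one possible sketch; the function names and the record format are assumptions, and the storage object is passed in so that the code also runs outside a browser.

```javascript
// storage: any object with getItem/setItem/removeItem, e.g. window.localStorage.
function setWithExpiry(storage, key, value, ttlMs, now = Date.now()) {
  const record = { value, expiresAt: now + ttlMs };
  storage.setItem(key, JSON.stringify(record)); // Web Storage only holds strings
}

function getWithExpiry(storage, key, now = Date.now()) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  const record = JSON.parse(raw);
  if (now >= record.expiresAt) {
    storage.removeItem(key); // expired: clean up and report a miss
    return null;
  }
  return record.value;
}

// Minimal in-memory stand-in so the sketch also runs outside a browser.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
    removeItem: (key) => { items.delete(key); },
  };
}

const demoStorage = createMemoryStorage();
setWithExpiry(demoStorage, 'user', { id: 1 }, 1000);
console.log(getWithExpiry(demoStorage, 'user'));
```

In a page, window.localStorage can be passed in place of the in-memory stand-in.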

SessionStorage

SessionStorage is the other type of Web Storage. It behaves like LocalStorage except for its life cycle, which is tied to the tab: when the tab is closed, its SessionStorage is cleared. The capacity is about 5 MB. The data is not shared between tabs, even within the same website and browser.

sessionStorage.setItem(key, value);
sessionStorage.getItem(key);
sessionStorage.removeItem(key);
sessionStorage.clear();

Web SQL

The Web SQL Database API is not part of the HTML5 specification. It is a separate specification that introduces a set of APIs for manipulating a client-side database using SQL.

Web SQL was implemented in Safari, Chrome, and Opera, essentially by embedding an SQLite database. Work on the specification has stopped, so it should not be relied on for new projects.

The openDatabase() method opens an existing database or creates a new one

// Parameters: database name, version number, description text, estimated database size in bytes, optional creation callback
var db = openDatabase('mydb', '1.0', 'Test DB', 2 * 1024 * 1024);

The transaction() method controls the execution of a transaction

The executeSql() method executes an SQL statement

db.transaction(function (tx) {  
   tx.executeSql('CREATE TABLE IF NOT EXISTS LOGS (id unique, log)');
});

IndexedDB

IndexedDB is a low-level API for storing large amounts of structured data on the client side, including files and binary large objects (Blobs). The API uses indexes for high-performance searches of data. While Web Storage is useful for storing smaller amounts of data, it is inadequate for larger amounts of structured data. IndexedDB provides a solution for this scenario.

In plain terms, IndexedDB is a local database provided by the browser that web scripts can create and manipulate. It can store large amounts of data, provides a lookup interface, and supports indexes. LocalStorage offers none of these. In terms of database type, IndexedDB is not a relational database (it does not support SQL query statements) and is closer to a NoSQL database.

IndexedDB has the following characteristics.

  1. Key-value pair storage. IndexedDB uses object stores internally to hold data. All types of data can be stored directly, including JavaScript objects. In an object store, data is kept as “key-value pairs”. Each record has a primary key, which must be unique and cannot be duplicated; otherwise an error is thrown.
  2. Asynchronous. IndexedDB operations do not block the browser, so users can keep interacting with the page. This contrasts with LocalStorage, whose operations are synchronous. The asynchronous design prevents large reads and writes from slowing the page down.
  3. Transactions. IndexedDB supports transactions: if any step in a series fails, the entire transaction is cancelled and the database is rolled back to its state before the transaction began, so data is never partially overwritten.
  4. Same-origin restriction. Each database belongs to the domain name that created it. A web page can access only the databases under its own domain name, not those of other domains.
  5. Large storage space. IndexedDB has much more space than LocalStorage, generally no less than 250 MB, and in some cases no fixed upper limit.
  6. Binary storage. IndexedDB can store not only strings but also binary data (ArrayBuffer objects and Blob objects).
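Several of these characteristics (object stores, primary keys, asynchronous requests, transactions) show up even in a very small example. The sketch below is browser-only and illustrative: the database name 'demo-db', the store name 'users' and the function `saveAndLoadUser` are made up, and error handling is kept to a minimum.

```javascript
// Open (or create) a database with one object store, then save and read a record.
// idbFactory is the browser's indexedDB object, passed in as a parameter so that
// merely loading this file outside a browser does nothing.
function saveAndLoadUser(idbFactory, user) {
  return new Promise((resolve, reject) => {
    const openRequest = idbFactory.open('demo-db', 1);

    // Runs when the database is first created or its version number increases.
    openRequest.onupgradeneeded = () => {
      // 'id' is the primary key of every record in this store.
      openRequest.result.createObjectStore('users', { keyPath: 'id' });
    };
    openRequest.onerror = () => reject(openRequest.error);

    openRequest.onsuccess = () => {
      const db = openRequest.result;
      // Both operations below belong to one read-write transaction.
      const tx = db.transaction('users', 'readwrite');
      const store = tx.objectStore('users');
      store.put(user); // insert, or overwrite the record with the same key
      const getRequest = store.get(user.id);
      getRequest.onsuccess = () => resolve(getRequest.result);
      tx.onerror = () => reject(tx.error);
      tx.oncomplete = () => db.close();
    };
  });
}

// In a browser: saveAndLoadUser(indexedDB, { id: 1, name: 'Ada' }).then(console.log);
```

Every step reports its result through a callback rather than a return value, which is the asynchronous behaviour described in point 2.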

For the full API, see “IndexedDB – Web API interface reference | MDN” (mozilla.org).

Application Cache

Application Cache caches an application's resources. Before HTML5, every page visit had to go over the network, so the website requested the same resources from the server again and again, which slowed things down. For PC users the network is relatively stable, so the loading speed is usually acceptable. Mobile is different: mobile devices depend on wireless signals and cell towers, their position is not fixed, and nearby buildings interfere, all of which makes the network unstable. We cannot change our users, and we cannot abandon users on slow networks. In addition, hybrid apps often load HTML pages in a built-in WebView, which runs into the same problems when the connection is slow.

In actual development, Application Cache and LocalStorage are the main techniques used; both were introduced with HTML5.

  • Application Cache: Typically used for caching static resources (static pages).
  • LocalStorage: Commonly used for AJAX request caching to store non-critical AJAX data.

PWA (Progressive Web App)

PWA stands for Progressive Web App. The Cache API is a storage mechanism that, together with the PWA's service worker, makes requested data available offline. Like Application Cache, it caches resources for offline use, but it provides very fine-grained storage control, with the content managed entirely by scripts. It is often used with fetch in service workers, and multiple responses can be stored for the same URL when the request headers differ. Caches are not shared across domains and are completely isolated from the HTTP cache.
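A common way to combine the Cache API with fetch is a cache-first strategy. The sketch below takes the cache and the fetch function as parameters, which is a choice made for this example so that it can be exercised with stand-ins; the cache name 'demo-v1' in the comment is hypothetical.

```javascript
// cache: an object with match(request) and put(request, response), such as the
//        Cache returned by caches.open() inside a service worker.
// fetchFn: a function like fetch that returns a Promise of a response.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached !== undefined) {
    return cached; // answered from the cache, no network request
  }
  const response = await fetchFn(request);
  // A response body can be read only once, so store a copy.
  await cache.put(request, response.clone());
  return response;
}

// In a service worker this could be wired up as follows (not run here):
// self.addEventListener('fetch', (event) => {
//   event.respondWith(
//     caches.open('demo-v1').then((cache) => cacheFirst(event.request, cache, fetch))
//   );
// });
```

Which requests to cache and when to refresh them is left entirely to the script, which is the fine-grained control mentioned above.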

BFCache

The back-forward cache (BFCache), also called the round-trip cache, is a browser strategy for speeding up the display of pages in the session history when the user presses the Forward or Back button. When the user navigates to a new page, the state of the current page, including its DOM, is saved to BFCache. When the user clicks the Back button, the page is restored directly from BFCache, saving the time of network requests.