preface

Caching mechanisms are everywhere, including client side caching, server side caching, proxy server caching, etc. What has caching capabilities in HTTP is the browser cache. As an important means of web performance optimization, HTTP caching is of great significance to friends engaged in Web development. This article will clean up the HTTP cache around the following aspects:

  • Rules for caching
  • Caching scheme
  • Advantages of caching
  • Request execution for different refreshes

Rules for caching

We know that HTTP caches belong to client caches, and we’ll talk about why. So we assume that browsers have a cached database that stores static files (images, CSS, JS, etc.) that don’t change very often. We divide caches into mandatory caches and negotiated caches. I’ll describe the caching rules for each of these caches in detail.

Mandatory cache

When the requested data is already in the cache database. The client retrieves data directly from the cached database. When the requested data is not in the cache database, the client gets the data from the server.

Negotiate the cache

Also known as contrast the cache, the client will be obtained from the cache database to a cached data identification, after get id request to the server to investigate whether failure (fresh), if there is no failure server will return 304, at this time the client directly from the cache access to the requested data, if failure, the server will return the updated data.

Tip:

The two types of cache mechanisms can exist at the same time. The forced cache has a higher priority than the negotiated cache. When the forced cache is executed, if the cache is hit, the database data is directly cached instead of cache negotiation.

Caching scheme

The above gives us an idea of how caching works, but how does the server determine if the cache is invalidated? We know that when a browser interacts with a server, it sends request data and response data, which we call HTTP packets. The packet contains the header and the body. The cache-related rule information is contained in the header. The content in the BOBY is what the HTTP request actually transmits. An example of HTTP header is as follows:

Mandatory cache

For mandatory caching, the header in the server response is indicated by two fields — Expires and cache-Control.

Expires

The value of Exprires is the data expiration time returned by the server. When the request time on a second request is less than the returned time, the cached data is used directly. Because server and client times can be inconsistent, which can lead to errors in Cache hits, on the other hand, Expires is a product of HTTP1.0, so cache-control is now mostly used instead.

Cache-Control

Cache-control has many attributes, each of which has different meanings. Max-age =t: cache content will expire in t seconds no-cache: negotiated cache is used to verify cached data no-store: all content will not be cached.

Negotiate the cache

Negotiation caches require comparison to determine whether caches can be used. The first time the browser requests data, the server responds with the cache id along with the data to the client, which backs it up to the cache. When requested again, the client sends the cached identity to the server, which determines based on this identity. If not, the 304 status code is returned and the browser can use the cached data directly. For negotiated caches, the cache identity needs to be understood, and we’ll focus on two caching schemes for it.

Last-Modified

Last-modified: When the server responds to a request, it tells the browser when the resource was Last Modified.

If-modified-since: The request header contains this field when the browser requests the server again, followed by the last modification time obtained in the cache. The server receives the request with if-modified-since, and compares it with the last Modified time of the requested resource. If the request is consistent, 304 and the response header are returned. The browser only needs to retrieve the information from the cache. Whether a file has been modified since a certain point in time

  1. If it is: then start transmitting the response as a whole, and the server returns: 200 OK
  2. If Not Modified: then simply transmit the response header and the server returns: 304 Not Modified

If-unmodified-since: Indicates whether the file has not been modified Since a point in time

  1. If not modified: Start ‘continue’ transfer file: server returns: 200 OK
  2. If the file is modified: No feed, the server returns 412 Preprocessing failed

The difference between these two is that one is downloaded after modification and the other is downloaded without modification. Last-modified is not particularly good, because if on the server a resource is Modified but its actual contents have not changed at all, the entire entity is returned to the client because the last-Modified time does not match (even if the client has an identical resource in its cache). To address this issue, HTTP1.1 introduced Etag.

Etag

Etag: This field tells the browser the unique identity of the current resource generated by the server when the server responds to a request (the generation rule is determined by the server)

If-none-match: If the browser requests the server again, the header of the request packet contains this field, and the following value is the identifier obtained from the cache. After receiving the secondary packet, the server compares if-none-match with the unique identifier of the requested resource.

  1. If no, the resource has been modified. The status code 200 is returned in response to the entire resource content.
  2. If yes, the resource does not need to be modified. In this case, the browser responds to the header and directly obtains data from the cache. Return status code 304.

However, in practical applications, Etag is seldom used because the calculation of Etag is obtained by algorithm, which will occupy the computing resources of the server side. All the resources of the server side are precious.

Advantages of caching

  1. Redundant data transmission is reduced and broadband traffic is saved
  2. Reduce the server burden, greatly improve the performance of the site
  3. This makes it faster for clients to load web pages, which is why HTTP caches are client caches.

Request execution for different refreshes

  1. I’m going to write the URL in the browser’s address bar, and I’m going to press Enter and I’m going to find this file in the cache. (the) fastest
  2. F5, F5 is basically telling the browser, don’t be lazy, go to the server and see if this file is out of date. So the browser boldly sends a request with if-modify-since.
  3. Ctrl+F5 tells the browser to delete this file from your cache and then go to the server to request a full resource file. The client then completes the force-update operation.