Qi Yunlei, front-end engineer on the Wedoctor Cloud service team. Focuses on building the foundational Node.js ecosystem and distilling reusable solutions for Web applications.

Server-side rendering (SSR) is resource-intensive, so it needs caching to withstand higher traffic and respond faster.

CDN caching is the mainstay for static resources, but is it also suitable for SSR pages?

Before we start

You’re probably familiar with CDNs. If you’re not, the following questions and answers will walk you through the basics.

Why connect to a CDN?

A simplified request chain makes it easy to see where the CDN sits.

Before connecting:


User -> Nginx -> App Server

After connecting:


User -> CDN -> Nginx -> App Server

This looks like an extra hop that adds transmission cost, but in practice it does not.

A CDN uses its vast server resources to optimize access routes dynamically, serve users from nearby nodes, and fetch data from the origin server with lower latency and higher bandwidth, improving the user experience at the network level.

Why enable CDN caching?

Before enabling: Browser -> CDN -> Nginx -> App Server1 -> App Server2 ->…

After enabling: Browser <-> CDN

A CDN can cache the resources users request, and it can cache the HTTP response headers along with them. The next time any user requests the same resource, the CDN responds directly from the cache, skipping all the subsequent steps that the origin server would otherwise have handled.

Put more intuitively, caching cuts the request chain short.


How do I enable CDN caching?

Setting aside self-built CDNs, the steps to enable CDN caching are very simple:

  1. Connect the domain name to the CDN service and enable caching for the desired paths
  2. Optionally, set the Cache-Control response header at the origin server for more flexible control over the cache rules

Which services can enable CDN caching?

Most websites are suitable for connecting to a CDN, but CDN caching can be enabled for SSR pages only when they meet certain conditions:

  • No user state
  • Low timeliness requirements; a delay of at least a few minutes is acceptable

How do I determine whether the cache was hit?

Detection methods differ slightly between CDN platforms, but they all come down to an identifying field in the response header. Taking Tencent CDN as an example, the X-Cache-Lookup response header takes one of the following values:

  • Hit From MemCache: hit in the memory of a CDN node
  • Hit From Disktank: hit on the disk of a CDN node
  • Hit From Upstream: cache miss

If the field does not exist, either the page is not served through the CDN or caching is disabled.
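As a rough illustration of the check described above, the following sketch maps an X-Cache-Lookup value to a coarse status. The function name classifyCacheLookup and the returned labels are hypothetical, and prefix matching is an assumption made in case a node appends extra text to the value; other CDN platforms use different headers.

```javascript
// Hypothetical helper: interpret the X-Cache-Lookup values listed above.
// The returned labels are made up for this sketch.
function classifyCacheLookup(headerValue) {
  if (headerValue === undefined || headerValue === null) {
    // Header absent: not served through the CDN, or caching is disabled
    return 'not-configured'
  }
  const value = String(headerValue).trim().toLowerCase()
  // Assumption: match by prefix in case the node appends extra text
  if (value.startsWith('hit from memcache')) return 'hit-memory'
  if (value.startsWith('hit from disktank')) return 'hit-disk'
  if (value.startsWith('hit from upstream')) return 'miss'
  return 'unknown'
}
```

In practice the header value would come from a real response, for example the output of a request made with curl or the browser's network panel.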

CDN cache optimization

The key metric for measuring cache effectiveness is the cache hit ratio. Before actually configuring the CDN cache, let's look at several ways to improve the hit ratio. These points also serve as criteria for evaluating whether a system should use CDN caching.

Extend the cache time

Increasing the lifetime set in Cache-Control is the most effective measure. The longer a cache entry lives, the less likely it is to have expired when the next request arrives.

This can significantly improve the cache hit ratio even when page views are low.

Note that Cache-Control only tells the CDN the upper limit of the cache time; it does not prevent the CDN from evicting the entry earlier. Resources with low traffic are cleaned up quickly. The CDN uses a tiered, step-by-step caching mechanism to keep its own resources from being wasted.
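To make the effect of a longer cache time concrete, here is a small simulation for a single URL. It is a deliberately simplified model: the function hitRatio and the request pattern are invented for this sketch, and it ignores the early eviction mentioned above, so real hit ratios will be lower.

```javascript
// Simplified model of one cached URL: a request is a hit when it arrives
// within maxAgeSeconds of the last fetch from the origin server.
// Early eviction by the CDN is NOT modeled.
function hitRatio(requestTimes, maxAgeSeconds) {
  let cachedAt = null
  let hits = 0
  for (const t of requestTimes) {
    if (cachedAt !== null && t - cachedAt < maxAgeSeconds) {
      hits += 1
    } else {
      cachedAt = t // miss: fetch from the origin and refresh the cache
    }
  }
  return requestTimes.length === 0 ? 0 : hits / requestTimes.length
}

// Low traffic: one request every 4 minutes for an hour (15 requests)
const requestTimes = Array.from({ length: 15 }, (_, i) => i * 240)

const shortLived = hitRatio(requestTimes, 300)  // max-age=300  -> 7 of 15 hits
const longLived = hitRatio(requestTimes, 3600)  // max-age=3600 -> 14 of 15 hits
```

With a 5-minute lifetime every other request misses, while a 1-hour lifetime leaves only the first request going back to the origin.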

Ignore URL parameters

The full URL a user visits may contain various parameters. By default the CDN treats each variant as a separate resource with its own cache entry.

Some parameters are clearly unwanted. For example, after a page link is shared through WeChat and other channels, each channel appends its own tracking parameters to the end. The average number of visits per cached resource drops significantly, which reduces cache effectiveness.

The CDN console offers a parameter-filtering option that ignores everything after the ? in a URL. With it enabled, URLs that differ only in their parameters are treated as the same resource.

In Tencent CDN, parameter filtering cannot be applied to a specific URL, only to the whole domain name, which makes it an extremely risky operation. Enabling it is not recommended unless the domain name is dedicated to CDN caching. Even if none of the resources under the domain currently depend on URL parameters, there is no guarantee this will not cause trouble in the future.
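The fragmentation described above can be sketched with a toy model of the cache key. The cacheKey function, the example.com URLs, and the from/ts parameters are all made up for illustration; a real CDN computes its keys internally.

```javascript
// Toy model of a CDN cache key (hypothetical; real CDNs do this internally).
// With ignoreQuery = true, everything after "?" is dropped from the key.
function cacheKey(rawUrl, ignoreQuery) {
  const url = new URL(rawUrl)
  const base = url.origin + url.pathname
  return ignoreQuery ? base : base + url.search
}

// The same page shared through different channels (made-up parameters)
const sharedUrls = [
  'https://example.com/foo?from=wechat',
  'https://example.com/foo?from=weibo&ts=1',
  'https://example.com/foo',
]

// Count how many distinct cache entries the URLs produce
const countEntries = (urls, ignoreQuery) =>
  new Set(urls.map((u) => cacheKey(u, ignoreQuery))).size

const entriesByDefault = countEntries(sharedUrls, false) // 3 separate entries
const entriesFiltered = countEntries(sharedUrls, true)   // 1 shared entry
```

The same model also shows the risk: any resource on the domain that genuinely depends on its parameters would collapse into a single entry as well.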

Active caching

Switching from passive to active caching makes a 100% cache hit ratio possible.

The most common form of active caching is resource preheating, which suits static files with a fixed URL path. Dynamic routes cannot be handed to the CDN for automatic preheating unless each specific address is pushed one by one.

Evolution of the code

The optimization points above show that the CDN console configuration needs to be handled with caution. In practice I went through several rounds of adjustment, but since the specific configuration depends on the CDN service provider, this article will not discuss it in depth.

Now, let’s turn our attention to how the code evolved.

1. Take control of the cache

Configuring the cache in code has one prerequisite: the CDN console must be set to honor the Cache-Control header from the origin server.

Then, simply by adding the response header, operations and maintenance can take over the initiative in setting CDN cache rules.

Taking Node.js Koa middleware as an example, the initial, global version looks like this (app is a Koa application instance):

app.use(async (ctx, next) => {
  // Let the CDN cache every response for 5 minutes
  ctx.set('Cache-Control', 'max-age=300')
  await next()
})

Of course, the code above has many gaps. An SSR application does not need to cache every page, so a path condition has to be added.

2. Control the path

The CDN console can also configure paths, but its configuration options and the number of paths are limited, so it is less flexible than code.

If we only need to cache the /foo page, add an if condition:

app.use(async (ctx, next) => {
  // Only the /foo page is cacheable
  if (ctx.path === '/foo') {
    ctx.set('Cache-Control', 'max-age=300')
  }
  await next()
})

This runs into the first pitfall. Pay attention to how the router handles paths: in general, ‘/foo’ and ‘/foo/’ are two separate paths, so the check ctx.path === ‘/foo’ misses requests for /foo/.

3. Supplement the paths

The pseudocode is as follows:

app.use(async (ctx, next) => {
  // Match the path with and without the trailing slash
  if (['/foo', '/foo/'].includes(ctx.path)) {
    ctx.set('Cache-Control', 'max-age=300')
  }
  await next()
})

The CDN console configuration must avoid the same problem. In Tencent CDN, directory rules and file rules apply to different page paths.
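As the list of cached pages grows, enumerating both spellings of every path gets error-prone. One possible alternative, sketched here with hypothetical helper names (normalizePath, isCacheablePath, CACHEABLE_PATHS), is to normalize the trailing slash before comparing. This assumes the router serves the same page for both spellings.

```javascript
// Hypothetical list of cacheable pages, written without trailing slashes
const CACHEABLE_PATHS = new Set(['/foo'])

// Strip trailing slashes so '/foo' and '/foo/' compare equal,
// while keeping the root path '/' intact
function normalizePath(path) {
  const trimmed = path.replace(/\/+$/, '')
  return trimmed === '' ? '/' : trimmed
}

function isCacheablePath(path) {
  return CACHEABLE_PATHS.has(normalizePath(path))
}
```

Inside the middleware, the condition would then read isCacheablePath(ctx.path) instead of the array lookup.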

4. Ignore degraded pages

When server rendering fails, we improve fault tolerance by returning a degraded page that falls back to client-side rendering. If the CDN caches a degraded page caused by an occasional network fluctuation, it will keep affecting the user experience for a while.

Here we introduce a custom variable, ctx._degrade, to indicate whether the page was degraded:

app.use(async (ctx, next) => {
  if (['/foo', '/foo/'].includes(ctx.path)) {
    ctx.set('Cache-Control', 'max-age=300')
  }

  await next()

  // Cancel the cache when the page is degraded
  if (ctx._degrade) {
    ctx.set('Cache-Control', 'no-cache')
  }
})

And no, that is not the last trap.

5. Cookies and state management

As mentioned above, the CDN can optionally cache HTTP response headers, but this option applies to the entire domain name and usually has to be enabled.

The new problem comes from a response header that must not be cached.

An application sets cookies through the Set-Cookie field in the response header. If Set-Cookie is cached, every user's cookie is overwritten with the same value.

There are several solutions: one is for the page not to set any cookies; another is for the proxy layer to filter out the Set-Cookie field. Unfortunately, Tencent CDN does not support response header filtering, so you have to handle this safeguard yourself.

app.use(async (ctx, next) => {
  const enableCache = ['/foo', '/foo/'].includes(ctx.path)

  if (enableCache) {
    ctx.set('Cache-Control', 'max-age=300')
  }

  await next()

  if (ctx._degrade) {
    // Cancel the cache when the page is degraded
    ctx.set('Cache-Control', 'no-cache')
  } else if (enableCache) {
    // Cached pages must not carry Set-Cookie
    ctx.res.removeHeader('Set-Cookie')
  }
})

The code added above removes Set-Cookie before the page responds, but the load order of middleware is hard to control. Some middleware plug-ins, in particular, create cookies implicitly, which makes cleaning them up cumbersome. If a later maintainer is unaware of this, Set-Cookie can easily find its way back into the response header. Therefore, this kind of cleanup should be handled in the proxy layer whenever possible, not in application code.

Besides cookies, you may face other state management issues. For example, the login state of the requesting user may be stored in Vuex's renderState, in which case the user information is embedded in the HTML page. If that page is cached by the CDN, the client runs into the same kind of problem as with an uncleared Set-Cookie. There are many similar cases and they are all solved in much the same way: before enabling CDN caching, do a comprehensive check of all state information.

6. Customize cache paths

The feature now works, but the cache rules keep getting more complex. Every additional page needs its own cache time, so this code has to change constantly.

For example, we only want to cache /foo/:id, not /foo/foo, /foo/bar, etc.

To do so, ctx.set('Cache-Control', 'no-cache') has to be set as the default on the first line of the middleware.

Or, if we want to cache the /foo page for 5 minutes and the /bar page for 1 day, we need to introduce a cache-time configuration file.
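A cache-time configuration of this kind might look like the sketch below. The rule format, the name cacheControlFor, and the 5-minute lifetime for /foo/:id are assumptions made for illustration; only the /foo and /bar durations and the excluded paths come from the examples above.

```javascript
// Hypothetical rule table; the first matching rule wins.
// maxAge is in seconds; 0 means "never cache".
const cacheRules = [
  { pattern: /^\/foo\/(foo|bar)\/?$/, maxAge: 0 },  // excluded: /foo/foo, /foo/bar
  { pattern: /^\/foo\/?$/, maxAge: 300 },           // /foo: 5 minutes
  { pattern: /^\/foo\/[^/]+\/?$/, maxAge: 300 },    // /foo/:id (assumed 5 minutes)
  { pattern: /^\/bar\/?$/, maxAge: 86400 },         // /bar: 1 day
]

// Return the Cache-Control value for a path, defaulting to no-cache
function cacheControlFor(path) {
  const rule = cacheRules.find((r) => r.pattern.test(path))
  if (!rule || rule.maxAge <= 0) return 'no-cache'
  return `max-age=${rule.maxAge}`
}
```

The middleware would call ctx.set('Cache-Control', cacheControlFor(ctx.path)) on its first line. Note that the order of the rules matters: the exclusions must come before the /foo/:id pattern, which is exactly the kind of detail that makes such a table hard to maintain.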

This middleware and its configuration become more and more difficult to maintain.

So we changed our approach: cache rules are no longer handled by the middleware but moved into the Vue SSR entry-server, where page-level configuration can be achieved through metadata. Because SSR setups differ, the specific implementation is not described here.

7. Cache invalidation

Cache invalidation is a neutral term. How to deal with CDN cache invalidation has to be weighed carefully.

On the one hand, invalidation adds intermittent load on the service and, for Serverless applications, computing costs. On the other hand, in many scenarios invalidation has to be triggered actively for resources to actually update.

The dark side of CDN caching cannot be ignored. While caching is transparent to users, it can be a hindrance to product and engineering work.

If not handled properly, it delays the release of new features, blocks tracking points in every service behind the CDN, makes risks harder to detect, and cannot guarantee consistency, which makes online problems harder to troubleshoot.

Therefore, it is necessary to set up a trigger service that refreshes and warms up the cache to improve the developer experience. Even so, CDN caches offer very little control, and a refresh does not take effect in fully real time.

For pages that change frequently, it is best to enable CDN caching only after they enter a stable period. Even for stable, high-traffic pages, precautions against CDN cache penetration need to be considered.

Once an SSR architecture relies heavily on CDN caching, be prepared to keep adjusting it over the long term.

Conclusion

CDN caching is a powerful tool. Under heavy traffic it can intercept almost all requests to the origin server and give the service very elastic capacity.

So, are SSR applications suitable for CDN caching? Let's list the issues mentioned above once more:

  • Path control
  • Page degradation
  • State management
  • Cache invalidation

The answer is up to you.

In fact, only a few SSR scenarios, such as portal home pages, require CDN caching.

Ordinary services with low traffic and scattered paths only need dynamic CDN acceleration and static file caching, which basically covers the optimization needs of the CDN proxy layer.