A Content Delivery Network (CDN) distributes the content of the source site to nodes at the edge of the network, close to users, so that users can fetch the data they need from a nearby node. This reduces network congestion, speeds up responses to requests, and lowers the load on the source site.

Many people are familiar with what a CDN does but, as I once did, may not understand how it works. This article therefore briefly describes the working principle and core components of a CDN.

1. Process of accessing the source site

To show how a CDN works more clearly, let’s first review how data is requested directly from the source site when no cache is involved:

As shown in the preceding figure, if the domain name to be accessed is “join.qq.com”, the client first looks for the corresponding IP address in its hosts file and local DNS cache. If the host has no entry, the client asks the local DNS for the IP address. If the local DNS does not have the IP address either, it queries the root DNS, the top-level DNS, and the authoritative DNS in sequence, and finally returns the IP address to the client. The client then sends an HTTP request to the remote source server at that IP address and receives the requested content.

The above describes how a domain name is resolved to an IP address through iterative DNS resolution and how the HTTP request is then sent. The provider of the source site binds the site’s domain name to its service host by configuring the authoritative DNS. Clients can then obtain the IP address for the source site’s domain name through the DNS service and communicate with the source site using that IP address.
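The lookup order described above can be sketched as a small simulation. This is a toy model under stated assumptions, not a real resolver: the server tables, the names `ROOT`, `TLD_COM`, and `AUTH_QQ`, the helper functions, and the address 203.0.113.10 (a documentation-only address) are all made up for illustration.

```python
# Toy model of iterative DNS resolution. All server names and addresses
# below are illustrative assumptions, not real DNS data.

# Each "server" maps a domain suffix to either a referral (the next server
# to ask) or a final answer (an IP address).
ROOT = {"com": ("referral", "TLD_COM")}
TLD_COM = {"qq.com": ("referral", "AUTH_QQ")}
AUTH_QQ = {"join.qq.com": ("answer", "203.0.113.10")}

SERVERS = {"ROOT": ROOT, "TLD_COM": TLD_COM, "AUTH_QQ": AUTH_QQ}


def query(server_name, domain):
    """Ask one server; return its entry for the longest matching suffix."""
    table = SERVERS[server_name]
    labels = domain.split(".")
    # Try "join.qq.com", then "qq.com", then "com".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return None


def resolve_iteratively(domain, hosts=None, cache=None):
    """Resolve a domain and record which servers were asked along the way."""
    hosts = hosts or {}
    cache = cache if cache is not None else {}
    # 1. The hosts file and the local cache are checked first.
    if domain in hosts:
        return hosts[domain], ["hosts"]
    if domain in cache:
        return cache[domain], ["cache"]
    # 2. Otherwise walk root -> top-level -> authoritative.
    path, server = [], "ROOT"
    while True:
        path.append(server)
        result = query(server, domain)
        if result is None:
            raise LookupError(f"{server} has no record for {domain}")
        kind, value = result
        if kind == "answer":
            cache[domain] = value  # remember the answer for later lookups
            return value, path
        server = value  # follow the referral to the next server


ip, path = resolve_iteratively("join.qq.com")
print(ip, path)
```

Passing a `hosts` or `cache` dictionary that already contains the domain short-circuits the walk, which mirrors the "check locally first" step in the text.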

DNS record types

For the sake of subsequent discussion, you need to understand how DNS responds to query requests.

In the DNS system, the most common resource records are Internet-class records, each consisting of four fields: Name, Value, Type, and TTL. Name and Value can be read as a key-value pair whose meaning depends on the Type field. TTL specifies how long the record may be kept in a cache before it must be discarded. The most common and important record types are:

  1. A Record (Address)

    An A record maps a domain name to an IP address. The target domain name is matched against the Name field of the A records, and the Value field (the IP address) of the matching record is returned in the DNS response packet.

  2. NS Record (Name Server)

    An NS record maps a domain name to the DNS server responsible for resolving it. The target domain name is matched against the Name field of the NS records, and the Value field of the matching record (the name server that resolves the target domain, whose IP address is supplied by an accompanying A record) is returned in the DNS response packet.

  3. CNAME Record (Canonical Name)

    A CNAME record maps an alias to another domain name. Whereas an A record converts a domain name to the IP address of the corresponding host, a CNAME record converts one domain name (the alias) to another domain name. If several CNAME records point to the same domain name, requests for different domain names can all be directed to the same server host. A CNAME record is usually accompanied by an A record that provides the IP address of the domain name it points to.
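The way the Type field changes the meaning of Name and Value can be illustrated with a minimal sketch. The `ResourceRecord` class, the `lookup` and `resolve_address` helpers, and every domain name and address below are hypothetical placeholders chosen for illustration; a real DNS server is considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    name: str   # the key being looked up
    value: str  # meaning depends on rtype
    rtype: str  # "A", "NS", or "CNAME"
    ttl: int    # seconds the record may be cached


# Hypothetical records; names and addresses are placeholders.
RECORDS = [
    ResourceRecord("www.example.com", "web.example.net", "CNAME", 300),
    ResourceRecord("shop.example.com", "web.example.net", "CNAME", 300),
    ResourceRecord("web.example.net", "192.0.2.7", "A", 60),
    ResourceRecord("example.com", "ns1.example.com", "NS", 86400),
]


def lookup(name, rtype):
    """Return the first record matching both the Name and Type fields."""
    for record in RECORDS:
        if record.name == name and record.rtype == rtype:
            return record
    return None


def resolve_address(name, max_hops=8):
    """Follow CNAME records until an A record is found."""
    for _ in range(max_hops):
        a_record = lookup(name, "A")
        if a_record is not None:
            return a_record.value
        cname = lookup(name, "CNAME")
        if cname is None:
            raise LookupError(f"no A or CNAME record for {name}")
        name = cname.value  # continue with the name the alias points to
    raise LookupError("CNAME chain too long")


print(resolve_address("www.example.com"))
print(resolve_address("shop.example.com"))
```

Both aliases end up at the same address, which is the "multiple domain names, one server host" case mentioned above. The `max_hops` limit is a defensive assumption to stop a misconfigured CNAME loop.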

2. Process of obtaining cached content through CDN

The previous section described how the source site is accessed directly through the DNS service. A CDN, by contrast, directs requests intended for the source site to a cache node closer to the user instead of to the source site itself.

The figure shows how a request is served through a CDN. As the figure shows, a global load balancing (GSLB) system is added to the domain name resolution process. The main job of the GSLB is to estimate the user’s location from the IP address of the user’s local DNS, select a local load balancing (SLB) system close to the user, and return that SLB’s IP address to the local DNS as the result. The SLB is mainly responsible for determining whether a cache server in its cluster holds the resource the user requested. If the resource exists on a cache server, the SLB selects the optimal cache node based on factors such as the health, load, and number of connections of the nodes in the cluster, and redirects the HTTP request to that node.
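The two selection steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the network-to-region table, the region names, the SLB addresses, the node dictionaries, and the "fewest connections" rule are all hypothetical, and a production GSLB or SLB would weigh many more signals.

```python
import ipaddress

# Hypothetical mapping from local DNS networks to regions, and from regions
# to the address of that region's SLB. All values are placeholders.
RESOLVER_REGIONS = {
    "198.51.100.0/24": "south-china",
    "203.0.113.0/24": "north-china",
}
REGION_SLB = {"south-china": "192.0.2.10", "north-china": "192.0.2.20"}
DEFAULT_REGION = "south-china"


def gslb_pick_slb(local_dns_ip):
    """GSLB: map the local DNS IP to a region and return that region's SLB."""
    addr = ipaddress.ip_address(local_dns_ip)
    for network, region in RESOLVER_REGIONS.items():
        if addr in ipaddress.ip_network(network):
            return REGION_SLB[region]
    return REGION_SLB[DEFAULT_REGION]  # unknown resolver: fall back


def slb_pick_node(nodes, resource):
    """SLB: pick the healthy node with the fewest connections,
    preferring nodes that already cache the requested resource."""
    healthy = [n for n in nodes if n["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy cache node available")
    holders = [n for n in healthy if resource in n["cached"]]
    candidates = holders or healthy
    return min(candidates, key=lambda n: n["connections"])["ip"]


NODES = [
    {"ip": "10.0.0.1", "healthy": True, "connections": 40, "cached": {"/video.php"}},
    {"ip": "10.0.0.2", "healthy": True, "connections": 5, "cached": set()},
    {"ip": "10.0.0.3", "healthy": False, "connections": 0, "cached": {"/video.php"}},
]

print(gslb_pick_slb("198.51.100.53"))
print(slb_pick_node(NODES, "/video.php"))
```

In this sketch the unhealthy node is never chosen even though it holds the resource, and a node that already caches the resource wins over a less loaded node that does not.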

To explain the working principle of a CDN more concretely, the following walks through an HTTP request to “join.qq.com/video.php” initiated by the client:

  1. When a user initiates an HTTP request to “join.qq.com/video.php”, the local DNS obtains the IP address of domain name “join.qq.com” through “iterative resolution”.
  2. If the local DNS cache does not contain a record for the domain name, the local DNS sends a DNS query packet to the root DNS.
  3. The root DNS sees that the domain name ends in “com” and replies with the IP address of the top-level DNS responsible for “com”.
  4. The local DNS sends a DNS query packet to the top-level DNS.
  5. The top-level DNS sees that the domain name ends in “qq.com”, looks up the authoritative DNS for that suffix in its local records, and replies with its IP address.
  6. The local DNS sends a DNS query packet to the authoritative DNS.
  7. The authoritative DNS finds a CNAME record (configured by the service provider) whose Name field is “join.qq.com” and whose Value field is “join.qq.cdn.com”. It also finds another record whose Name field is “join.qq.cdn.com” and whose Value field is the IP address of the GSLB.
  8. The local DNS sends a DNS query packet to the GSLB.
  9. Based on the IP address of the local DNS, the GSLB determines that the user is roughly in Shenzhen, selects the IP address of the best overall SLB in South China, and puts it in the DNS reply packet as the final result of the DNS query.
  10. The local DNS answers the client’s DNS request with the IP address obtained in the previous step as the final result.
  11. The client sends the HTTP request for “join.qq.com/video.php” to the SLB at that IP address.
  12. The SLB takes the resource constraints, health, and load of each node in the cache server cluster into consideration, selects the optimal cache node, and replies to the client’s HTTP request with a redirect (the status code is 302, and the redirection address is the IP address of the optimal cache node).
  13. After receiving the HTTP reply from the SLB, the client follows the redirect to the cache node.
  14. The cache node checks whether the requested resource is cached and has not expired. If so, it returns the cached resource to the client directly. Otherwise, it fetches fresh data from the source site, updates its cache, and then replies.
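The last step of the walkthrough, where the cache node either serves a fresh copy or goes back to the source site, can be sketched as below. The `CacheNode` class, its TTL-based expiry rule, the `fetch_from_origin` callable, and the "HIT"/"MISS" labels are assumptions made for illustration; real cache nodes typically also honor HTTP caching headers and revalidation.

```python
import time


class CacheNode:
    """Toy cache node: serve fresh entries, otherwise fetch from the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds=60, clock=time.monotonic):
        self._fetch = fetch_from_origin  # callable: path -> body
        self._ttl = ttl_seconds          # how long an entry stays fresh
        self._clock = clock              # injectable so expiry can be tested
        self._store = {}                 # path -> (body, expires_at)

    def get(self, path):
        """Return (body, status), where status is "HIT" or "MISS"."""
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"
        # Missing or expired: go back to the source site and refresh the cache.
        body = self._fetch(path)
        self._store[path] = (body, now + self._ttl)
        return body, "MISS"


def fake_origin(path):
    """Stand-in for the source site; returns a placeholder body."""
    return "content of " + path


node = CacheNode(fake_origin, ttl_seconds=60)
print(node.get("/video.php"))  # first request goes to the origin
print(node.get("/video.php"))  # second request is served from the cache
```

Injecting the clock is a design choice for this sketch only: it lets expiry be exercised without waiting for real time to pass.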

The key steps are 6 to 9. Unlike the ordinary DNS process, the service provider (the source site) must change the records in its authoritative DNS: the A record that pointed directly at the source site is replaced by a CNAME record and a corresponding A record. The CNAME record maps the target domain name to the alias of the GSLB, and the A record maps that alias to the IP address of the GSLB. Through these changes, the authority to resolve the source site’s domain name is handed over to the GSLB, so that the GSLB can steer users’ requests to the nearest cache node according to geographical location, which relieves the load on the source site and reduces network congestion.

The above describes the most common CDN working mode today. By using a CNAME record, this mode decouples the domain name from the target IP address and delegates resolution of the target IP to the GSLB, which makes it easier to implement customized features in a more flexible way.