In performance optimization, a common recommendation is to deploy resources on a CDN. So what is a CDN, and what do you gain from it?

DNS

Let’s start with the Domain Name System (DNS).

It is a distributed database that maps domain names to IP addresses. Each mapping is called a record, and records come in several types:

  • A: Address. The IP address that a domain name points to. A domain name can have multiple A records.
  • NS: Name Server. The address of the server that holds the information for the next-level domain.
  • MX: Mail eXchange. The address of the server that receives e-mail for the domain.
  • CNAME: Canonical Name. Returns another domain name, so the queried name is resolved as an alias of that name. This lets multiple domain names map to the same server.
  • PTR: Pointer Record. Used only for reverse lookups, from an IP address to a domain name.
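
To make the record types concrete, here is a minimal sketch of a record table and a lookup that follows a CNAME until it reaches A records. The `Record` class, the `lookup` and `resolve_a` helpers, and every name and address in `RECORDS` are made up for illustration (the addresses come from the reserved documentation range); a real DNS server stores and answers records quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str   # the domain name (or reverse name) being described
    rtype: str  # "A", "NS", "MX", "CNAME" or "PTR"
    value: str  # an IP address or another domain name, depending on rtype


# Hypothetical records, for illustration only.
RECORDS = [
    Record("www.example.com", "CNAME", "cdn.example.net"),
    Record("cdn.example.net", "A", "192.0.2.10"),
    Record("cdn.example.net", "A", "192.0.2.11"),
    Record("example.com", "MX", "mail.example.com"),
    Record("10.2.0.192.in-addr.arpa", "PTR", "cdn.example.net"),
]


def lookup(name, rtype):
    """Return the values of all records matching a name and a type."""
    return [r.value for r in RECORDS if r.name == name and r.rtype == rtype]


def resolve_a(name, max_hops=8):
    """Follow CNAME aliases until A records are found (or give up)."""
    for _ in range(max_hops):
        addresses = lookup(name, "A")
        if addresses:
            return addresses
        aliases = lookup(name, "CNAME")
        if not aliases:
            return []
        name = aliases[0]  # continue with the canonical name
    return []


print(resolve_a("www.example.com"))  # ['192.0.2.10', '192.0.2.11']
```

Note how `www.example.com` has no A record of its own: the CNAME hands the query over to another name, which is the same mechanism a CDN relies on.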

DNS consists of the following three parts

  • Name Resolver
  • Domain Name Space
  • Name Server

If you want to visit baidu.com, its IP address, such as 220.181.57.216, first has to be looked up through DNS.

DNS query process

So how does DNS find the IP address for a domain name? Take typing www.example.com into a browser as an example:

  1. Check the browser cache
  2. Check the operating system cache and the hosts file
  3. Check the router cache
  4. If the previous steps all miss, query the ISP's Local DNS (LDNS) server
  5. If the LDNS server has no cached answer, it asks a root server to start the resolution:

    1. The root server returns the address of the top-level domain (TLD) server for .com, .cn, .org and so on. There are only 13 root server addresses in the world, each served by many machines. In this example, the address of the .com server is returned
    2. The LDNS then sends the request to the TLD server, which returns the address of the second-level domain (SLD) name server, in this example the one for example.com
    3. The LDNS then queries the SLD name server for the target domain name, which in this example returns the IP address of www.example.com
    4. The Local DNS server caches the result and returns it to the user.
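
The lookup chain above can be sketched as a small simulation. Everything here is an assumption made for illustration: the `ROOT`, `TLD_SERVERS` and `AUTHORITATIVE` tables stand in for real servers, the server names are invented, and no network traffic is involved.

```python
# Hypothetical delegation data; every server name and address is made up.
ROOT = {"com": "tld-com"}                                  # root: TLD -> TLD server
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}   # TLD server: SLD -> name server
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.1"}}


class LocalDNS:
    """A toy Local DNS server that resolves step by step and caches results."""

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name):
        # Cache hit: no need to contact any upstream server.
        if name in self.cache:
            return self.cache[name]

        labels = name.split(".")
        tld = labels[-1]               # "com"
        sld = ".".join(labels[-2:])    # "example.com"

        # 1. Ask the root server which server handles the TLD.
        tld_server = ROOT[tld]
        self.upstream_queries += 1

        # 2. Ask the TLD server which name server handles the SLD.
        name_server = TLD_SERVERS[tld_server][sld]
        self.upstream_queries += 1

        # 3. Ask that name server for the address of the full name.
        ip = AUTHORITATIVE[name_server][name]
        self.upstream_queries += 1

        # 4. Cache the result before returning it to the user.
        self.cache[name] = ip
        return ip


ldns = LocalDNS()
print(ldns.resolve("www.example.com"), ldns.upstream_queries)  # 192.0.2.1 3
print(ldns.resolve("www.example.com"), ldns.upstream_queries)  # 192.0.2.1 3
```

The second call costs no extra upstream queries, which is why the caches in the earlier steps matter so much.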

DNS security

  1. DNS reflection/amplification attack

    The attacker sends DNS queries that produce large responses to a large number of open DNS servers, forging the source IP address of each query to be the IP address of the attack target. Because each request is much smaller than its response, the attacker can use this technique to amplify the bandwidth and attack traffic under their control.

  2. DDoS attacks

    A DDoS attack on DNS servers may cause domain name resolution to fail.

  3. DNS/domain name hijacking

    Within the hijacked network, domain name resolution requests are intercepted and the requested domain name is analyzed, and then either a false IP address is returned or the request is left unanswered. DNS hijacking is implemented by tampering with the data on the DNS server so that it returns an incorrect query result to the user.

  4. DNS pollution

    DNS pollution prevents ordinary users from communicating with a host by making them obtain a false destination IP address. When a user accesses an address, a domestic server (not a DNS server) notices the access to a marked address and, disguised as a DNS server, sends back an incorrect address. The difference between DNS hijacking and DNS pollution is that hijacking modifies the resolution results on the DNS server, while pollution does not go through the DNS server at all and returns a forged answer.

  5. DNS records modified by an attacker
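
To see why the reflection/amplification attack in item 1 is effective, a little arithmetic helps. The sketch below only computes the ratio between response size and request size; the function names and the byte sizes in the example are assumptions chosen for illustration, not measurements.

```python
def amplification_factor(request_bytes, response_bytes):
    """How many bytes reach the target for each byte the attacker sends."""
    return response_bytes / request_bytes


def reflected_bandwidth_mbps(attacker_mbps, request_bytes, response_bytes):
    """Traffic arriving at the target, given the attacker's own bandwidth."""
    return attacker_mbps * amplification_factor(request_bytes, response_bytes)


# Illustrative sizes only: a 60-byte query that triggers a 3000-byte response.
print(amplification_factor(60, 3000))          # 50.0
print(reflected_bandwidth_mbps(10, 60, 3000))  # 500.0
```

With these assumed numbers, 10 Mbps of forged queries turns into 500 Mbps aimed at the target.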

DNS optimization

As you can see, DNS resolution is a long process. How can it be optimized?

  1. DNS prefetching

    Before the user follows a link, the browser tries to resolve the link's domain name and caches the result. This removes the DNS resolution step when the actual request is made. On the server, you can turn on pre-resolution by sending the response header X-DNS-Prefetch-Control with the value on.

Or in HTML:

<meta http-equiv="x-dns-prefetch-control" content="on">

To pre-resolve a specific domain name:

<link rel="dns-prefetch" href="//fonts.googleapis.com">
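
If a page pulls resources from several hosts, the prefetch tags can be generated rather than written by hand. This is only a sketch: `prefetch_tags` is a made-up helper and the URLs in the example are placeholders.

```python
from html import escape
from urllib.parse import urlsplit


def prefetch_tags(resource_urls):
    """Build one dns-prefetch tag per distinct external host, in first-seen order."""
    hosts = []
    for url in resource_urls:
        host = urlsplit(url).hostname  # None for relative URLs such as "/local.js"
        if host and host not in hosts:
            hosts.append(host)
    return [f'<link rel="dns-prefetch" href="//{escape(host)}">' for host in hosts]


# Placeholder resource list for illustration.
tags = prefetch_tags([
    "https://fonts.googleapis.com/css",
    "https://fonts.googleapis.com/icon",
    "https://cdn.example.com/app.js",
    "/local.js",
])
for tag in tags:
    print(tag)
```

Relative URLs are skipped because they live on the page's own domain, which the browser has already resolved.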

  2. Domain convergence

    It is recommended to serve static resources from as few domain names as possible, ideally just one, to reduce the number of DNS lookups.

  3. HTTPDNS

    HTTPDNS sends domain name resolution requests to an HTTPDNS server over HTTP, instead of the traditional approach of sending them to the carrier's Local DNS over the DNS protocol. This avoids domain name hijacking and allows more precise scheduling. It is a two-step process:

    1. The client calls the HTTPDNS interface directly to obtain the IP address with the best access latency configured for the domain name. (For disaster recovery, keep the carrier's Local DNS as a fallback for resolution.)

    2. The client sends its business request directly to the IP address it obtained. Taking HTTP as an example, it sends a standard HTTP request to the IP returned by HTTPDNS and sets the Host header to the original domain name.
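
The two steps above can be sketched as follows. The `httpdns_lookup` function is a stand-in with a hard-coded table, since the real call depends on whichever HTTPDNS provider you use; the domain and address are placeholders. The request is only built, never sent.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request


def httpdns_lookup(domain):
    """Stand-in for a call to an HTTPDNS provider's HTTP interface (hypothetical)."""
    table = {"www.example.com": "192.0.2.20"}  # made-up answer
    return table.get(domain)


def build_request(url):
    """Step 1: resolve via HTTPDNS. Step 2: target the IP and keep the Host header."""
    parts = urlsplit(url)
    domain = parts.hostname
    ip = httpdns_lookup(domain)
    if ip is None:
        # Disaster recovery: leave the URL alone so normal Local DNS is used.
        return Request(url)
    netloc = ip if parts.port is None else f"{ip}:{parts.port}"
    ip_url = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return Request(ip_url, headers={"Host": domain})


req = build_request("http://www.example.com/index.html")
print(req.full_url)            # http://192.0.2.20/index.html
print(req.get_header("Host"))  # www.example.com
```

The Host header is what lets the server at that IP know which site the request is for, since the domain name no longer appears in the URL.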

CDN

What is a CDN

With DNS out of the way, we can move on to the CDN, which stands for Content Delivery Network. A CDN redirects a user's request in real time to the nearest service node, based on network traffic and on each node's connectivity, load, distance to the user, and response time. Its purpose is to let users fetch the content they need from somewhere nearby, relieve congestion on the Internet, and improve how quickly websites respond to users.

A typical CDN system consists of the following three parts

  • Distribution service system

    Its most basic unit of work is the cache device. The cache (edge cache) responds directly to end users' requests and quickly serves them locally cached content. The cache also synchronizes content with the origin site: content that has been updated, or that is not available locally, is fetched from the origin site and saved locally. The number, scale, and total service capacity of cache devices are the most basic measures of a CDN's service capacity.

  • Load balancing system

    Its main job is to schedule access for every user who initiates a request and to determine the actual address each user is finally given. It is a two-level scheduling system, divided into global load balancing (GSLB) and local load balancing (SLB). GSLB applies the proximity principle across service nodes to determine the physical location of the cache that will serve the user. SLB handles load balancing among the devices inside a node.

  • Operation management system

    It is divided into an operation management subsystem and a network management subsystem. It handles the collection, sorting, and delivery work needed for business-level interaction with external systems, including customer management, product management, billing management, statistical analysis, and other functions.
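
The edge cache behaviour described under the distribution service system can be sketched like this. The `Origin` and `EdgeCache` classes are invented for illustration, and the fixed time-to-live used to decide when content counts as out of date is an assumption: real CDNs use more elaborate freshness and invalidation rules.

```python
class Origin:
    """Stand-in for the origin site; counts how often it is asked for content."""

    def __init__(self, content):
        self.content = content
        self.hits = 0

    def fetch(self, path):
        self.hits += 1
        return self.content[path]


class EdgeCache:
    """Serves local copies and goes back to the origin on a miss or after expiry."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin
        self.ttl = ttl_seconds
        self.store = {}  # path -> (body, expires_at)

    def get(self, path, now):
        entry = self.store.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]                      # fresh local copy
        body = self.origin.fetch(path)           # missing or expired: ask the origin
        self.store[path] = (body, now + self.ttl)
        return body


origin = Origin({"/app.js": "v1"})
edge = EdgeCache(origin, ttl_seconds=60)
print(edge.get("/app.js", now=0), origin.hits)   # v1 1
print(edge.get("/app.js", now=30), origin.hits)  # v1 1
origin.content["/app.js"] = "v2"
print(edge.get("/app.js", now=61), origin.hits)  # v2 2
```

The time is passed in as `now` so the example stays deterministic; only the first and the expired request ever reach the origin.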

How a CDN works

Using a CDN is very simple: you only need to modify your DNS records and set a CNAME that points to the CDN service provider.

Without a CDN, a user accesses a resource as follows:

  1. The browser resolves the domain name to obtain the IP address corresponding to the domain name.
  2. The browser uses the obtained IP address to send a data access request to the service host of the domain name.
  3. The server returns response data to the browser

With a CDN, the process becomes:

  1. When a user clicks a content URL on a web page, the local DNS system resolves the domain name and, following the CNAME, hands resolution over to the CDN's dedicated DNS server.
  2. The CDN's DNS server returns the IP address of the CDN's global load balancing device to the user.
  3. The user sends the content URL request to the CDN's global load balancing device.
  4. Based on the user's IP address and the URL of the requested content, the global load balancing device selects a regional load balancing device in the user's region and forwards the request to it.
  5. The regional load balancing device selects a suitable cache server to serve the user. The selection criteria are:

    • Which server is closest to the user, judged by the user's IP address.
    • Which server holds the content the user needs, judged by the content name in the requested URL.
    • Which server still has capacity, judged by the current load of each server.

    Based on this analysis, the regional load balancing device returns the IP address of a cache server to the global load balancing device.

  6. The global load balancing device returns the IP address of that server to the user.

  7. The user sends the request to the cache server, which responds and delivers the requested content to the user. If the cache server does not have the content but the regional load balancing device assigned it to the user anyway, the server requests the content from its upper-level cache server, and so on up the chain until the site's origin server is reached, then pulls the content down and stores it locally.
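
The selection in steps 4 and 5 can be sketched as a small function. The `CacheServer` fields, the `select_server` helper, the 0.8 load threshold, and the tie-breaking order (content first, then lowest load) are all assumptions for illustration; a real CDN weighs these criteria in its own way.

```python
from dataclasses import dataclass, field


@dataclass
class CacheServer:
    ip: str
    region: str
    load: float                       # 0.0 idle .. 1.0 saturated
    paths: set = field(default_factory=set)  # content currently cached


def select_server(servers, user_region, path, max_load=0.8):
    """Pick a cache server by region, remaining capacity, and cached content."""
    # Global step: narrow down to the user's region, or all servers if none match.
    in_region = [s for s in servers if s.region == user_region] or servers
    # Regional step: drop servers that have no capacity left.
    available = [s for s in in_region if s.load < max_load]
    if not available:
        return None
    # Prefer a server that already holds the content, then the least loaded one.
    best = min(available, key=lambda s: (path not in s.paths, s.load))
    return best.ip


# Made-up servers using documentation address ranges.
servers = [
    CacheServer("198.51.100.1", "east", 0.9, {"/a.js"}),
    CacheServer("198.51.100.2", "east", 0.5, {"/a.js"}),
    CacheServer("198.51.100.3", "east", 0.1),
    CacheServer("203.0.113.1", "west", 0.2, {"/a.js"}),
]
print(select_server(servers, "east", "/a.js"))  # 198.51.100.2
print(select_server(servers, "east", "/b.js"))  # 198.51.100.3
```

In the second call no server in the region holds `/b.js`, so the least loaded one is chosen and would have to pull the content from upstream, as step 7 describes.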

That is a lot of text, and it is a little convoluted. Put simply: the resources users access originally live on your own server. By modifying DNS, users are instead directed to a suitable CDN cache server, chosen by their IP address and other conditions, and fetch the resources from there.

The advantages of a CDN

So what are the benefits?

  1. Local cache acceleration: speeds up access
  2. Mirroring service: eliminates bottlenecks between carriers and ensures good access quality for users on different networks
  3. Remote acceleration: automatically selects a cache server for the user
  4. Bandwidth optimization: shares network traffic and reduces load
  5. Cluster defense
  6. Cost savings

Finally

This post is part of the Advanced Front End series. Feel free to star this blog or follow me on GitHub.
