
🔥 Hi, I'm Chouchou. This article is included in my GitHub repository Android-Notebook. You are welcome to learn and grow together with Chouchou Peng. (Contact information is on GitHub.)

Preface

  • DNS resolution is often the first step of a network request. In computer network interviews, DNS is one of the most important topics after HTTP and TCP.
  • In this article, I walk through the principles of DNS and HTTPDNS with diagrams. If it helps you, please like and follow; it really means a lot to me.

Series

  • "Computer Networks | DNS & HTTPDNS Principles, Illustrated"

Related articles

  • The cryptography | sense!, signature and digital certificates are?”



1. How DNS works

1.1 DNS overview

A domain name is a name that identifies a host or a group of hosts on the Internet. It acts as an alias for an IP address and is much easier to remember than the numeric address itself.

The Domain Name System (DNS) is a basic Internet service that resolves domain names to IP addresses. A server that provides this service is called a domain name server (DNS server).

1.2 DNS Resolution Process

The domain name system on the Internet is a distributed system organised as a four-level tree hierarchy, as shown in the following figure:

  • Local name server (Local DNS): when the client is configured through DHCP, the Local DNS is normally assigned by the Internet service provider (ISP), such as China Unicom or China Telecom.

  • Root name server: if the Local DNS cannot answer from its own data, it queries a root name server first and obtains the address of the relevant top-level domain name server. There are 13 root name servers (not counting their mirrors). They do not resolve domain names directly; they only point to the top-level domain name servers that can be queried next. The 13 root name servers are listed at: www.internic.net/domain/name… ;

  • Top-level domain name server: manages the second-level domain names registered under its top-level domain. For example, the .com top-level domain name server knows the authoritative name server of baidu.com, because baidu.com is registered under .com.

  • Authoritative name server: the sole authority for a specific zone, responsible for maintaining the mapping between domain names and IP addresses in that zone. In a DNS reply, the AA flag bit indicates whether the record comes from an authoritative name server; if it is not set, the record may come from a cache.

DNS resolution uses two kinds of query: recursive and iterative. The query between the client and the Local DNS server is recursive, while the queries between the Local DNS server and the other DNS servers are iterative.

Tip: if DNS servers queried each other recursively, the load on the root name servers would be too heavy. If clients queried the Local DNS iteratively, each client would have to carry out the follow-up queries itself, so the resolution process would no longer be hidden from the client.

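  • Recursive query:

In a recursive query, if the DNS server cannot find the domain name itself, it sends query requests to other DNS servers, acting as their client. The original client only needs to wait for the final result. This is probably easier to understand in code:

```kotlin
// Made-up data for illustration: the records each server holds,
// and the server it forwards to when it has no answer.
val records = mapOf("authoritative" to mapOf("www.baidu.com" to "192.0.2.1"))
val upstream = mapOf("local" to "authoritative")

// Recursive query: the server answers from its own records, or queries the
// next server itself (as that server's client) and passes the result back.
fun dns(client: String, server: String, domain: String): String {
    records[server]?.get(domain)?.let { return it }
    val next = upstream[server] ?: error("$server cannot resolve $domain")
    return dns(client = server, server = next, domain = domain)
}
```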
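  • Iterative query:

In an iterative query, if the DNS server cannot find the domain name, it does not carry out the follow-up queries on behalf of the client. Instead, it replies with the DNS server to ask next, and the client sends its query to that server. This is probably easier to understand in code:

```kotlin
// Made-up data for illustration: a server either knows the answer
// or refers the client to another server.
val answers = mapOf("authoritative" to mapOf("www.baidu.com" to "192.0.2.1"))
val referrals = mapOf("root" to "tld", "tld" to "authoritative")

// Iterative query: a server that cannot answer only names the next server;
// the same client sends every follow-up request itself.
fun dns(client: String, server: String, domain: String): String {
    var current = server
    while (true) {
        answers[current]?.get(domain)?.let { return it }
        // The client continues the query against the referred server.
        current = referrals[current] ?: error("$current cannot resolve $domain")
    }
}
```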

The following uses www.baidu.com as an example to describe the DNS resolution process:

  • 0. The client checks its DNS caches first (covered in Section 1.6). If no entry matches, or the entry has expired, the client sends a query to the Local DNS.

  • 1. The client sends a query for www.baidu.com to the Local DNS server. The Local DNS server checks its own cache. If there is no entry, or the entry has expired:

  • 2. The Local DNS sends a query for www.baidu.com to a root name server. The root name server has no record for that name, so it returns the address of the .com top-level domain name server.

  • 3. The Local DNS sends a query for www.baidu.com to the .com top-level domain name server. It has no record for that name either, so it returns the address of the authoritative name server for baidu.com.

  • 4. The Local DNS sends a query for www.baidu.com to the baidu.com authoritative name server, obtains the IP address, stores it in its cache, and returns it to the client.
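
The steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the server names, the `LocalDns` class, and the address 192.0.2.1 (a documentation-only address) are made up, and the "servers" are entries in a map rather than real network endpoints.

```kotlin
// Illustrative simulation of steps 1 to 4. Server names and the IP address
// are invented for this sketch; no network traffic is involved.

// A server replies either with the final address or with a referral.
sealed interface Reply
data class Answer(val ip: String) : Reply
data class Referral(val nextServer: String) : Reply

// Hypothetical replies of each server. For simplicity every server gives the
// same reply regardless of the name asked, as if only www.baidu.com existed.
val serverReplies: Map<String, Reply> = mapOf(
    "root" to Referral("com-tld"),
    "com-tld" to Referral("baidu-authoritative"),
    "baidu-authoritative" to Answer("192.0.2.1"),
)

class LocalDns {
    private val cache = mutableMapOf<String, String>()
    val contacted = mutableListOf<String>() // servers queried so far, in order

    // Step 1: answer from the cache if possible. Otherwise walk the hierarchy
    // iteratively (steps 2 to 4) and cache the result before returning it.
    fun resolve(domain: String): String {
        cache[domain]?.let { return it }
        var server = "root"
        while (true) {
            contacted += server
            when (val reply = serverReplies.getValue(server)) {
                is Answer -> {
                    cache[domain] = reply.ip
                    return reply.ip
                }
                is Referral -> server = reply.nextServer
            }
        }
    }
}

fun main() {
    val localDns = LocalDns()
    println(localDns.resolve("www.baidu.com")) // walks root, TLD, authoritative
    println(localDns.contacted)
    println(localDns.resolve("www.baidu.com")) // served from the cache
    println(localDns.contacted.size)           // unchanged by the second call
}
```

The client only ever calls `resolve` once and waits (the recursive side), while `LocalDns` follows the referrals itself (the iterative side).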

1.3 DNS message format

In Section 1.5 we will capture DNS packets, so this section introduces the format of DNS messages. DNS defines three types of message: query, reply, and update. Their structures are largely the same.

  • Header
    • 1. Transaction ID: associates a DNS query with its reply. The client uses a different ID for each query it sends, and the server copies that ID into its response
    • 2. Flags: the flag field of the message. For details, see the following figure
    • 3. Question Count: the number of entries in the Question section
    • 4. Answer Resource Record Count: the number of resource records in the Answer section
    • 5. Authority Resource Record Count: the number of resource records in the Authority section
    • 6. Additional Resource Record Count: the number of resource records in the Additional section
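
To make the header layout concrete, here is a minimal sketch that writes and reads the six fields listed above. Each field is 16 bits wide, so the header is 12 bytes in network byte order. The `DnsHeader` type and the two function names are assumptions made for this example, not part of any library.

```kotlin
import java.nio.ByteBuffer

// The six 16-bit header fields, in wire order (12 bytes in total).
data class DnsHeader(
    val transactionId: Int,
    val flags: Int,
    val questionCount: Int,
    val answerCount: Int,
    val authorityCount: Int,
    val additionalCount: Int,
)

// Serialises the header in network byte order (ByteBuffer is big-endian by default).
fun encodeHeader(h: DnsHeader): ByteArray =
    ByteBuffer.allocate(12)
        .putShort(h.transactionId.toShort())
        .putShort(h.flags.toShort())
        .putShort(h.questionCount.toShort())
        .putShort(h.answerCount.toShort())
        .putShort(h.authorityCount.toShort())
        .putShort(h.additionalCount.toShort())
        .array()

// Reads the first 12 bytes of a DNS message back into a DnsHeader.
fun decodeHeader(message: ByteArray): DnsHeader {
    require(message.size >= 12) { "A DNS header is 12 bytes long" }
    val buf = ByteBuffer.wrap(message)
    fun u16() = buf.short.toInt() and 0xFFFF // read one field as an unsigned value
    return DnsHeader(u16(), u16(), u16(), u16(), u16(), u16())
}

fun main() {
    // A query with one question; the transaction ID is an arbitrary example value.
    val query = DnsHeader(0xBEEF, 0x0100, 1, 0, 0, 0)
    val bytes = encodeHeader(query)
    println(bytes.size)
    println(decodeHeader(bytes) == query)
}
```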

  • Question

The Question section holds the queries. The number of entries equals the Question Count field in the header. Note that DNS lookups are either forward or reverse: a forward lookup resolves a domain name to an IP address, and a reverse lookup resolves an IP address to a domain name. The query type (QTYPE) is the most important field of a question entry. Only the five most commonly used types are listed here:

| QTYPE | Description |
| --- | --- |
| A (1) | Resolves a domain name to an IPv4 address |
| NS (2) | Name server for the domain |
| CNAME (5) | Canonical name |
| PTR (12) | Resolves an IP address to a domain name |
| AAAA (28) | Resolves a domain name to an IPv6 address |

NS and CNAME are harder to understand, so here is an explanation:

CNAME (Canonical Name): a CNAME record makes one domain name an alias of another, the canonical name. With aliases in place, when the IP address changes you do not need to update the mapping of every domain name. You only change the A record of the canonical name, and every alias that points to it resolves to the new IP address automatically. You will see a CNAME in action in Section 1.5.

NS (Name Server): specifies the DNS server that is responsible for resolving the domain name.
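
As an illustration of how a question entry looks on the wire, the sketch below encodes a domain name and a query type. The name is written as length-prefixed labels ending in a zero byte, followed by the 16-bit QTYPE and the 16-bit QCLASS. QCLASS is not discussed in this article; the value 1 (IN, the Internet class) is used here. The `QType` enum and `encodeQuestion` function are names invented for this example.

```kotlin
import java.io.ByteArrayOutputStream

// Query types from the table above.
enum class QType(val code: Int) { A(1), NS(2), CNAME(5), PTR(12), AAAA(28) }

// Encodes one question entry: QNAME as length-prefixed labels ending in a
// zero byte, then QTYPE and QCLASS (1 = IN) as 16-bit big-endian values.
fun encodeQuestion(domain: String, type: QType): ByteArray {
    val out = ByteArrayOutputStream()
    for (label in domain.trimEnd('.').split('.')) {
        val bytes = label.toByteArray(Charsets.US_ASCII)
        require(bytes.size in 1..63) { "Each label must be 1 to 63 bytes long" }
        out.write(bytes.size) // length prefix of the label
        out.write(bytes)      // the label itself
    }
    out.write(0)                  // empty root label terminates the name
    out.write(type.code shr 8)    // QTYPE, high byte
    out.write(type.code and 0xFF) // QTYPE, low byte
    out.write(0)                  // QCLASS, high byte
    out.write(1)                  // QCLASS, low byte: IN
    return out.toByteArray()
}

fun main() {
    val question = encodeQuestion("www.baidu.com", QType.A)
    // Prints the bytes in hex: 03 'w' 'w' 'w' 05 'b' 'a' 'i' 'd' 'u' 03 'c' 'o' 'm' 00 ...
    println(question.joinToString(" ") { "%02x".format(it) })
}
```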

  • Resource Record

Answer, authority, and additional resource records all share the same format. The TTL (Time to Live, in seconds) field gives the lifetime of the resource record, that is, how long it may be cached. A TTL of 0 means the record may only be used for the current response and must not be cached.

1.4 DNS transport protocol

DNS uses both TCP and UDP at the transport layer, on port 53. When is each protocol used?

  • TCP is used for zone transfers

A zone transfer copies zone data between DNS servers so that several servers can share the load. It carries far more data than a simple query and reply. UDP does not provide reliable delivery, so TCP is used to transfer this larger amount of data reliably.

  • UDP is used for domain name resolution

Resolving one domain name can involve several DNS servers. With TCP, every DNS request would pay the delay of a three-way handshake, which would make DNS slow.

The DNS protocol limits a DNS message carried over UDP to 512 bytes. A longer message is truncated: the TC (Truncation) bit in the DNS header is set to 1 and the excess data is discarded. This is because UDP is connectionless, so there is no way to tell which UDP datagrams belong to the same DNS message.
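
A client can detect truncation by inspecting the TC bit of the reply. The sketch below checks that bit and, as resolvers commonly do (a behaviour this article does not describe), chooses TCP for the retry. It also shows the 2-byte length prefix that DNS messages carry over TCP. All names are assumptions made for this example.

```kotlin
const val MAX_UDP_DNS_SIZE = 512 // classic size limit for DNS over UDP
const val TC_MASK = 0x0200       // position of the TC bit in the 16-bit flags field

enum class Transport { UDP, TCP }

// True when the TC flag is set. The flags field occupies bytes 2 and 3
// of the header, right after the 2-byte transaction ID.
fun isTruncated(message: ByteArray): Boolean {
    require(message.size >= 12) { "A DNS header is 12 bytes long" }
    val flags = ((message[2].toInt() and 0xFF) shl 8) or (message[3].toInt() and 0xFF)
    return (flags and TC_MASK) != 0
}

// Decides how to continue after a UDP reply arrives: a truncated reply
// is incomplete, so the query is repeated over TCP.
fun nextTransport(udpReply: ByteArray): Transport =
    if (isTruncated(udpReply)) Transport.TCP else Transport.UDP

// Over TCP, each DNS message is preceded by its length as a 16-bit big-endian value.
fun frameForTcp(message: ByteArray): ByteArray {
    require(message.size <= 0xFFFF) { "Message too long for a 16-bit length prefix" }
    return byteArrayOf((message.size shr 8).toByte(), message.size.toByte()) + message
}

fun main() {
    val completeReply = ByteArray(12)                          // all flags cleared
    val truncatedReply = ByteArray(MAX_UDP_DNS_SIZE).also { it[2] = 0x02 } // TC bit set
    println(nextTransport(completeReply))
    println(nextTransport(truncatedReply))
    println(frameForTcp(completeReply).size)
}
```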

1.5 DNS resolution in practice

Computer networking is a discipline that grew out of practice, so studying theory alone is not enough. Let's look at DNS resolution in action. We use Wireshark to capture the DNS request for www.baidu.com. The steps are as follows:

  • Step 1: Set the Wireshark display filter

Enter the condition icmp || dns in the filter bar, as shown below:

  • Step 2: Ping www.baidu.com from a terminal

Enter ping www.baidu.com in the terminal, as shown below:

  • Step 3: View the DNS query and reply messages

Return to Wireshark and view the captured packets. Two DNS messages are displayed, one query and one reply, as shown in the following figure:

Now let's look at the two DNS messages in detail. With the background from the previous sections, they should be easy to read. DNS protocol layer:

  • Query message:

The message queries the IPv4 address (A record) of www.baidu.com. The flag bits say the following: this is a query message; it is a forward lookup; the message is not truncated; the server is asked to perform a recursive query.

  • Reply message:

  • Transport layer & network layer:

The capture also shows that DNS uses UDP for domain name resolution, on port 53, which matches the analysis in the previous section. You can also see that the first hop of the IP packet is the LAN router rather than the Local DNS server itself, which is expected when the Local DNS server is outside the local network.
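
The flag bits read from the query message can also be decoded by hand. The following sketch splits the 16-bit Flags field into its parts. The sample values 0x0100 (a standard query that requests recursion) and 0x8180 (a typical successful reply) are assumptions about what such a capture usually shows, not values taken from the figures, and `DnsFlags` and `decodeFlags` are invented names.

```kotlin
// The parts of the 16-bit Flags field that matter for reading a capture.
data class DnsFlags(
    val isResponse: Boolean,         // QR: false = query, true = reply
    val opcode: Int,                 // 0 = standard (forward) query
    val authoritative: Boolean,      // AA: answer comes from an authoritative server
    val truncated: Boolean,          // TC: message was cut off
    val recursionDesired: Boolean,   // RD: client asks for a recursive query
    val recursionAvailable: Boolean, // RA: server supports recursive queries
    val responseCode: Int,           // RCODE: 0 = no error
)

fun decodeFlags(flags: Int) = DnsFlags(
    isResponse = (flags and 0x8000) != 0,
    opcode = (flags shr 11) and 0xF,
    authoritative = (flags and 0x0400) != 0,
    truncated = (flags and 0x0200) != 0,
    recursionDesired = (flags and 0x0100) != 0,
    recursionAvailable = (flags and 0x0080) != 0,
    responseCode = flags and 0xF,
)

fun main() {
    println(decodeFlags(0x0100)) // query, not truncated, recursion desired
    println(decodeFlags(0x8180)) // reply, recursion desired and available, no error
}
```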

1.6 DNS cache

A complete DNS query has to contact several DNS servers before it gets the final result, which inevitably adds delay. To reduce this latency, DNS does not go through the full query for every request; records are cached after they are first resolved. Specifically, DNS caching has multiple levels, checked in this order:

Browser cache > Operating system cache > Router cache > Local DNS cache > DNS query

A cached record is not valid forever. As mentioned earlier, the TTL (Time to Live) value in the DNS reply determines how long a record may stay in a cache. Note, however, that the TTL is only advisory: the lifetime a cache actually applies may differ from it, and some caches simply use a fixed value. This leads to some "side effects" of DNS caching, which are discussed in Section 2.
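
The effect of the TTL, and of a cache that overrides it, can be sketched with a small in-memory cache. This is a simplified model, not how any particular resolver is implemented: `DnsCache`, `minTtl`, and `maxTtl` are hypothetical names, and the clock is injected so that the example is deterministic.

```kotlin
// A DNS cache whose entries expire after their TTL. The optional clamp models
// a cache that overrides the TTL carried in the record.
class DnsCache(
    private val now: () -> Long = { System.currentTimeMillis() / 1000 }, // clock in seconds
    private val minTtl: Long = 0,
    private val maxTtl: Long = Long.MAX_VALUE,
) {
    private data class Entry(val ip: String, val expiresAt: Long)
    private val entries = mutableMapOf<String, Entry>()

    fun put(domain: String, ip: String, ttlSeconds: Long) {
        val effectiveTtl = ttlSeconds.coerceIn(minTtl, maxTtl)
        if (effectiveTtl == 0L) return // TTL 0: use once, never cache
        entries[domain] = Entry(ip, now() + effectiveTtl)
    }

    // Returns the cached address, or null if there is none or it has expired.
    fun get(domain: String): String? {
        val entry = entries[domain] ?: return null
        if (now() >= entry.expiresAt) {
            entries.remove(domain)
            return null
        }
        return entry.ip
    }
}

fun main() {
    var clock = 0L // fake clock, in seconds
    val honest = DnsCache(now = { clock })
    val fixedTime = DnsCache(now = { clock }, minTtl = 600) // ignores shorter TTLs
    honest.put("www.baidu.com", "192.0.2.1", ttlSeconds = 60)
    fixedTime.put("www.baidu.com", "192.0.2.1", ttlSeconds = 60)
    clock = 120
    println(honest.get("www.baidu.com"))    // null: the record expired after 60 s
    println(fixedTime.get("www.baidu.com")) // 192.0.2.1: still served, possibly stale
}
```

The second cache keeps serving the record after the TTL has passed, which is exactly the kind of behaviour that can leave clients on an old IP address.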


2. DNS problems

After the theory and practice of the previous section, you should now have a working understanding of DNS. So is DNS a flawless service, and what problems does it have in practice? That is the topic of this section.

2.1 DNS query latency

As Section 1 showed, a complete DNS query has to contact several DNS servers before it gets the final result, which inevitably adds delay. In practice, this delay is not negligible.

Tip: the Youzan technical team points out that DNS resolution latency fluctuates widely. In good cases it takes a few milliseconds to a dozen or so milliseconds; in bad cases it can take much longer. Source: "Youzan WebView Accelerated Platform Exploration and Construction", Youzan Mobile group

2.2 Cache consistency

DNS caching reduces latency at the expense of consistency. Specifically, Local DNS servers are deployed per region and per carrier, so their caching policies for resolved domain names differ. Sometimes the result returned by the Local DNS is not the nearest, optimal node. Sometimes the Local DNS ignores the TTL and applies a fixed cache time instead. As a result, some clients may keep accessing the old cached IP address even after the domain name has been pointed at a new one.

Besides carrier caching policies, cache poisoning also reduces DNS availability. An attacker can trick a DNS server into caching forged DNS records with a large TTL, which deceives clients for a long time.

2.3 DNS hijacking (man-in-the-middle attack)

DNS lacks security mechanisms for encryption, authentication, and integrity protection, which leaves it open to tampering. The most common domain name hijacking attack forges the transaction ID in the DNS header. Because a client matches a reply to its query by transaction ID, a fake DNS server can send the client a forged reply carrying the same transaction ID. If the forged reply arrives before the legitimate one, the client accepts it, and the domain name of the target website is resolved to the wrong IP address.

Tip: the transaction ID is mainly obtained by network sniffing or by guessing the sequence number. For details, see "Principles of Computer Network Security" (Chapter 8) by Wu Lifa
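
The matching rule that the attack exploits can be sketched as follows. This is a deliberately simplified model: real resolvers also check the source address and port of a reply, and the `Query` and `Response` types, the function names, and the addresses (taken from documentation-only ranges) are all made up for this example.

```kotlin
import java.security.SecureRandom

data class Query(val transactionId: Int, val domain: String)
data class Response(val transactionId: Int, val domain: String, val ip: String)

private val random = SecureRandom()

// Each query gets a fresh 16-bit transaction ID, so there are only 65536 possible values.
fun newQuery(domain: String) = Query(random.nextInt(0x10000), domain)

// A plain DNS client accepts a reply when the transaction ID and the name match.
fun accepts(query: Query, response: Response): Boolean =
    response.transactionId == query.transactionId &&
        response.domain.equals(query.domain, ignoreCase = true)

// The first acceptable reply wins; anything that arrives later is ignored.
fun firstAccepted(query: Query, arrivals: List<Response>): Response? =
    arrivals.firstOrNull { accepts(query, it) }

fun main() {
    val query = newQuery("www.baidu.com")
    val genuine = Response(query.transactionId, "www.baidu.com", "192.0.2.1")
    // The attacker has sniffed or guessed the transaction ID.
    val forged = Response(query.transactionId, "www.baidu.com", "203.0.113.66")
    val wrongGuess = forged.copy(transactionId = (query.transactionId + 1) and 0xFFFF)

    println(firstAccepted(query, listOf(forged, genuine))?.ip)     // forged reply arrived first
    println(firstAccepted(query, listOf(wrongGuess, genuine))?.ip) // wrong ID is rejected
}
```

Nothing in the reply proves who sent it, which is why a correct transaction ID that arrives first is enough.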

2.4 Inaccurate scheduling

Because of caching, forwarding, and NAT, the authoritative DNS server may misjudge the client's location and carrier. As a result, it may return an IP address on a different carrier's network, and the cross-carrier access slows users down.


3. How HTTPDNS works

DNS has its problems, but the answer is not to throw the baby out with the bathwater and abandon the whole system. The answer is to obtain IP addresses in a different way: HTTPDNS.

3.1 HTTPDNS overview

Unlike traditional DNS resolution, HTTPDNS relies on resolution servers that the service provider runs itself. When a client needs to resolve a domain name, it does not send a DNS query to the Local DNS. Instead, it calls the HTTPDNS interface with an HTTP request. The server returns the nearest IP address based on the client's location and carrier.

For disaster recovery, when HTTPDNS is unavailable a degradation policy kicks in and the carrier's Local DNS is used for domain name resolution.
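
The degradation policy can be sketched as a resolver that tries HTTPDNS first and falls back to the system resolver. This is a minimal sketch under stated assumptions: `Resolver` and `DegradingResolver` are invented names, and the HTTPDNS call itself is injected as a function because the request format depends on the provider. Only `InetAddress.getAllByName` is a real JDK API.

```kotlin
import java.net.InetAddress

// Abstraction over "domain name in, IP addresses out".
fun interface Resolver {
    fun resolve(domain: String): List<String>
}

// The system resolver, which normally ends up asking the carrier's Local DNS.
val systemResolver = Resolver { domain ->
    InetAddress.getAllByName(domain).map { it.hostAddress }
}

// Tries HTTPDNS first and degrades to the fallback when it fails or returns nothing.
class DegradingResolver(
    private val httpDns: Resolver,
    private val fallback: Resolver = systemResolver,
) : Resolver {
    override fun resolve(domain: String): List<String> {
        val fromHttpDns = try {
            httpDns.resolve(domain)
        } catch (e: Exception) {
            emptyList() // HTTPDNS unreachable or returned an error
        }
        return fromHttpDns.ifEmpty { fallback.resolve(domain) }
    }
}

fun main() {
    // Stand-ins for the real services, using documentation-only addresses.
    val localDns = Resolver { listOf("192.0.2.20") }
    val healthy = DegradingResolver(httpDns = Resolver { listOf("192.0.2.10") }, fallback = localDns)
    val broken = DegradingResolver(httpDns = Resolver { error("HTTPDNS unreachable") }, fallback = localDns)
    println(healthy.resolve("www.baidu.com")) // answered by HTTPDNS
    println(broken.resolve("www.baidu.com"))  // degraded to Local DNS
}
```

Injecting both resolvers keeps the policy testable without any network access; in an app, the `httpDns` argument would wrap the provider's HTTP API.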

3.2 Advantages of HTTPDNS

The main advantages of HTTPDNS over DNS are as follows:

  • Lower latency

The query path is shorter: unlike a DNS query, there is no need to contact several DNS servers to get the final result;

  • Protection against domain name hijacking

Resolution requests go directly to the HTTPDNS server, bypassing the carrier's Local DNS and avoiding domain name hijacking;

  • Precise scheduling

The HTTPDNS server sees the real client IP address rather than the IP address of the Local DNS, so it can return the most accurate resolution result for the client's location and carrier, and the client can access the nearest service node;

  • Fast propagation

When the resolution result of a domain name changes, HTTPDNS is not affected by the multi-level caches of traditional DNS, so the update quickly reaches all clients.

3.3 Reported benefits of HTTPDNS

Tencent, Alibaba, and Baidu all offer their own HTTPDNS solutions. I have collected the benefits they report publicly:

  • Tencent

    • Official documentation: serves more than 400 million users, reduces access failures caused by domain name hijacking by more than 60%, and reduces average latency by 22%;
    • Official blog: average user access latency decreased by more than 10%, and the access failure rate decreased by more than one fifth;
  • Baidu

    • Official blog: the iOS hijacking rate decreased from 0.12% to 0.0002%, and the Android hijacking rate decreased from 0.25% to 0.05%. The second benefit was not significant, because the main target audience of the Feed business is in China, where Baidu has a relatively dense node layout and the overall quality of service is already high;
  • Alibaba: not checked yet…


4. Summary

For interviews and exams, you should master the four-level DNS resolution process and how HTTPDNS works, understand the problems of DNS, and be familiar with the DNS message format, the TTL field, and the common query types.


References

  • Domain Name System, Wikipedia
  • How DNS Works, Microsoft documentation
  • Principles of Computer Network Security (Chapter 8), by Wu Lifa
  • TCP/IP Protocols and Their Applications (Chapter 8), by Cheng Yu Lin
  • A New Approach to Global Precise Traffic Scheduling: HttpDNS Service in Detail, by Liao Weijian
  • Baidu App In-Depth Network Optimization Series (1): DNS Optimization, by Cai Rui
  • "10,000 Words! The Inevitable DNS Interview Questions" (Section 2), by I'm a programmer bitch
  • Why does DNS use UDP instead of TCP? (answer by Che Xiaopang), Zhihu Q&A
  • TCP/IP Illustrated, Volume 1: The Protocols, by Kevin R. Fall and W. Richard Stevens
  • What Really Happens When You Navigate to a URL (Section 2), by Igor Ostrovsky

Practical resources

  • Online DNS query tool
  • HTTPDNS · Alibaba Cloud (Aliyun)
  • HTTPDNS · Tencent Cloud
  • HTTPDNS · Baidu Cloud
