References

  • “Computer Networking: A Top-Down Approach”
  • How the Web Is Connected

This article summarizes what I learned about DNS. It combines the DNS chapters of both books, so it draws on both of them. The content is mostly theory, but it is actually quite an interesting read.

Preface

We identify people in many ways. In everyday life we identify people by name; we can also identify them by ID number, or even by a nickname (hahaha). In a given situation, though, one method of identification may be more appropriate than another. For example, a train ticket booking system is likely to use the ID card number as the unique identifier, rather than the name printed on the ID card.

DNS: Directory service of the Internet

Hosts on the Internet, like us, can be identified in many ways. One way to identify a host is by its host name, such as baidu.com or google.com. These names are easy to remember, so people like them. However, a host name provides little information about where the host is located on the Internet. Moreover, because a host name can consist of a variable number of alphanumeric characters, and nowadays even Chinese characters, it is difficult for routers to process. For these reasons, hosts are also identified by IP addresses.

Reasons for using domain names and IP addresses

TCP/IP identifies the communication peer by its IP address, so a message cannot be sent without knowing the other party's IP address. This is similar to making a phone call: you need the number to call someone. You can think of a domain name as the name saved in your address book for that number.

In fact, we could enter the IP address directly and it would work fine. But an IPv4 address is four bytes of pure numbers which, like a phone number, is hard to remember. For people, it is therefore better to identify a server by a domain name than by an IP address.

Could we then drop IP addresses and use domain names directly? In principle, yes. But consider what that would mean. An IPv4 address is 32 bits long, that is, 4 bytes. A domain name, by contrast, varies in length: it is often a dozen bytes or more and can be as long as 255 bytes. In other words, with an IP address a router only needs to process four bytes of numbers, whereas with a domain name it would need to process up to 255 bytes of characters. That adds to the burden on routers and makes data transfer take longer. So using domain names to determine who to communicate with is not a wise choice.

So the solution is to let people use names and routers use IP addresses.
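The size difference is easy to see in a few lines of Python. This is only a small sketch: the helper names `ipv4_wire_size` and `name_size` are made up for illustration, and 192.0.2.1 is a documentation-only address, not the address of any real server.

```python
import ipaddress

def ipv4_wire_size(address: str) -> int:
    """Return the number of bytes an IPv4 address occupies in packed form."""
    return len(ipaddress.IPv4Address(address).packed)

def name_size(domain: str) -> int:
    """Return the number of bytes in a domain name written as ASCII text."""
    return len(domain.encode("ascii"))

# 192.0.2.1 is a documentation-only address, used here purely as an example.
print(ipv4_wire_size("192.0.2.1"))   # 4
print(name_size("www.baidu.com"))    # 13
```

Every IPv4 address packs into exactly four bytes, while the size of a name depends on what the name is.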

DNS basic services

DNS services

As we have just seen, there are two ways to identify a host: by host name or by IP address. People prefer host names, which are easy to remember, while routers prefer fixed-length, hierarchically structured IP addresses. To reconcile these preferences, we need a directory service that translates host names into IP addresses. That service is the Domain Name System (DNS).

DNS process

DNS is typically used by other application-layer protocols, including HTTP, SMTP, and FTP, to resolve user-supplied host names into IP addresses. DNS itself is also an application-layer protocol.

As a simple example, what happens when we type baidu.com/index.html into the browser's address bar? For the user's host to send an HTTP request to the Web server baidu.com, it must first obtain the IP address of baidu.com, so it goes through the following steps:

  • The user host runs the DNS client
  • The browser extracts the host name baidu.com from the URL and passes the host name to the DNS client
  • The DNS client sends a request containing the host name to the DNS server
  • The DNS client eventually receives a reply message containing the IP address of the host name and notifies the browser
  • Once the browser receives the IP address from DNS, it can initiate a TCP connection to the HTTP server process on port 80 at that IP address
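The steps above can be sketched with the Python standard library. This is a minimal sketch, not how any particular browser is implemented: `extract_host` and `resolve_ipv4` are hypothetical helper names, and the system resolver reached through `socket.getaddrinfo` stands in for the DNS client.

```python
import socket
from urllib.parse import urlsplit

def extract_host(url: str) -> str:
    """Pull the host name out of a URL, as the browser does before calling DNS."""
    if "://" not in url:
        url = "http://" + url  # address-bar input often omits the scheme
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host name in {url!r}")
    return host

def resolve_ipv4(host: str, port: int = 80) -> list[str]:
    """Ask the system resolver (the DNS client) for the host's IPv4 addresses."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})

print(extract_host("baidu.com/index.html"))   # baidu.com
```

Calling `resolve_ipv4(extract_host("baidu.com/index.html"))` needs a working network connection and returns whatever addresses the resolver reports at that moment, so no output is shown for it here.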

Basic work of the DNS server

The basic job of the DNS server is to receive the query message from the client and then return the response according to the content of the message.

The query message from the client contains the following three types of information:

  • Domain name: The name of the server, mail server, and so on
  • Class: When DNS was first designed, its use on networks other than the Internet was also taken into account, and the Class field identifies which network is meant. In practice the Internet is essentially the only one in use, so the value of Class is almost always IN, which stands for the Internet
  • Record type: Identifies the type of record associated with the domain name. For example, if the type is A (Address), the domain name corresponds to an IP address. If the type is MX (Mail eXchange), the domain name corresponds to a mail server. The information the server returns to the client varies with the record type.
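To make the three fields concrete, here is a sketch of how they are laid out in the question part of a DNS query message, following the encoding in RFC 1035. The function name `encode_question` is made up for illustration, only the A and MX types are included, and a complete query would also need a message header, which is left out here.

```python
import struct

# Numeric codes as assigned in the DNS specification (RFC 1035).
RECORD_TYPES = {"A": 1, "MX": 15}
CLASSES = {"IN": 1}

def encode_question(domain: str, record_type: str = "A", dns_class: str = "IN") -> bytes:
    """Encode the (domain name, type, class) triple as a DNS question entry."""
    encoded_name = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"invalid label length in {domain!r}")
        encoded_name += bytes([len(raw)]) + raw   # length byte, then the label
    encoded_name += b"\x00"                       # zero-length label marks the end
    # Type and class follow the name as two 16-bit big-endian integers.
    return encoded_name + struct.pack("!HH", RECORD_TYPES[record_type], CLASSES[dns_class])

print(encode_question("baidu.com"))
# b'\x05baidu\x03com\x00\x00\x01\x00\x01'
```

Note that on the wire the type comes before the class, which is the reverse of the order used in the list above.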

The DNS server stores records corresponding to the preceding three types of information. The DNS server searches for the content that meets the query request and responds to the client based on the records.

For example, to query the IP address of the baidu.com domain name, the client sends the DNS server a query message containing the following information:

  • Domain name = baidu.com
  • Class = IN
  • Record type = A

The DNS server then searches its stored records for those that match all three fields: domain name, class, and record type.
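The matching step can be sketched as a simple filter over a record table. Everything in the table below is hypothetical: the addresses come from a documentation-only range and the mail server name is invented, so none of it reflects the real records for baidu.com.

```python
from typing import NamedTuple

class Record(NamedTuple):
    name: str
    dns_class: str
    record_type: str
    data: str

# Hypothetical records; the addresses come from documentation-only ranges.
RECORDS = [
    Record("baidu.com", "IN", "A", "192.0.2.10"),
    Record("baidu.com", "IN", "MX", "10 mail.baidu.com"),
    Record("mail.baidu.com", "IN", "A", "192.0.2.25"),
]

def lookup(name, dns_class, record_type, records=RECORDS):
    """Return the data of every record matching all three query fields."""
    name = name.rstrip(".").lower()   # domain names compare case-insensitively
    return [r.data for r in records
            if (r.name, r.dns_class, r.record_type) == (name, dns_class, record_type)]

print(lookup("baidu.com", "IN", "A"))    # ['192.0.2.10']
print(lookup("baidu.com", "IN", "MX"))   # ['10 mail.baidu.com']
```

The same domain name returns different data depending on the record type, which is why the type has to be part of the query.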

DNS provides other important services

In addition to translating host names into IP addresses, DNS provides some other important services:

  • Host alias: A program can call DNS to get the canonical host name corresponding to the host alias and the IP address of the host
  • Mail server alias: A mail program can invoke DNS to resolve the provided mail server alias to get the canonical hostname and its IP address for that host
  • Load distribution: DNS is also used to distribute load between redundant servers
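One common way DNS distributes load is to return the same set of addresses in a rotated order on each reply, so that successive clients start with different servers. The sketch below assumes that rotation scheme; the class name `RoundRobinAnswers` and the replica addresses are hypothetical.

```python
from collections import deque

class RoundRobinAnswers:
    """Hand out the same address set in a rotated order on each query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        current = list(self._addresses)
        self._addresses.rotate(-1)   # next reply starts with the next server
        return current

# Hypothetical replica addresses from a documentation-only range.
pool = RoundRobinAnswers(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
print(pool.answer())   # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
print(pool.answer())   # ['192.0.2.2', '192.0.2.3', '192.0.2.1']
```

Because clients usually connect to the first address in a reply, rotating the order spreads requests across the redundant servers.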

Hierarchy of domain names

So far we have assumed that the information being queried is already stored in the DNS server's records. In a limited environment, such as a school's internal network with only a few Web and mail servers, all the information can be stored on a single DNS server, and the lookup is simple: query the server and return the result. However, there are countless servers on the Internet, and it is impossible to store information about all of them on one DNS server. So the information being queried will often not be found on the DNS server that receives the query.

In practice, the information is distributed across many DNS servers, and these servers relay queries to one another to find the requested information. To understand how this works, we first need to look at how information is registered and stored on DNS servers.

All information on DNS servers is stored in a hierarchical structure based on domain names. The hierarchy is similar to the structure of a school: a person belongs to a class, the class belongs to a grade, and the grade belongs to the school. A domain name such as www.baidu.com is separated by periods, and the periods mark the boundaries between levels, just like the divisions in the school structure. The further to the right a label is, the higher its level.
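As a small illustration, a domain name can be split at its periods and read from right to left to list the levels it belongs to. This sketch assumes the usual convention that the rightmost label is the highest level; the function name `domain_levels` is made up for this example.

```python
def domain_levels(name: str) -> list[str]:
    """List the domains a name belongs to, from the highest level down."""
    labels = name.rstrip(".").split(".")
    # Build suffixes from right to left: com, baidu.com, www.baidu.com.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]

print(domain_levels("www.baidu.com"))   # ['com', 'baidu.com', 'www.baidu.com']
```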

Domain information is registered on DNS servers according to this hierarchy. Each domain is handled as a single unit: the information for one domain is stored on a DNS server as a whole and cannot be split across multiple DNS servers. However, the relationship between DNS servers and domains is not one-to-one, and one DNS server can store information about multiple domains.

Finding the corresponding DNS server and obtaining the IP address

Ideally the client would send its request straight to the DNS server that holds the record and get the IP address back. But there are tens of thousands of DNS servers on the Internet, and the client certainly cannot try them one by one, so the following method is used to find the right one quickly. First, the IP address of each DNS server that manages a lower-level domain is registered with the DNS server of the domain one level up. That server's IP address is in turn registered with the DNS server of the next level up, and so on. In this way, we can ask an upper-level DNS server for the IP address of a lower-level DNS server and then send the query to that lower-level server.

A bit of trivia: in the domain hierarchy described above, domains like com and cn look like the topmost level, but there is actually one more level above them, called the root domain. Unlike com and cn, the root domain has no name of its own, so it is usually omitted when a domain name is written. To specify the root domain explicitly, write the name as www.baidu.com. with a period at the end; that trailing period represents the root domain. I tried entering this form in the browser and the site failed to load. If you are interested, you can look into why.

The process of obtaining IP addresses

  • The client first accesses the nearest DNS server
  • If the nearest DNS server does not have the domain name information in its cache, it forwards the query to a root DNS server
  • The root DNS server does not store the record itself, so the search continues down the hierarchy one level at a time. For www.baidu.com, the root DNS server points to the com DNS server; if the com DNS server does not have the record either, it points to the baidu.com DNS server, and so on until the DNS server that manages the name is reached
  • Once the DNS server that manages the domain name has been located, it returns the IP address for the domain name, and that answer is passed back to the client
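The walk down the hierarchy can be simulated with an in-memory table. This is a rough sketch under simplifying assumptions: the server names, the delegation table, and the address are all hypothetical, and no real DNS traffic is sent.

```python
# Hypothetical delegation data; every address is from a documentation-only range.
# Each server maps a domain either to a lower-level server ("refer") or to a
# final address ("answer").
SERVERS = {
    "root": {"com": ("refer", "com-server")},
    "com-server": {"baidu.com": ("refer", "baidu-server")},
    "baidu-server": {"www.baidu.com": ("answer", "192.0.2.80")},
}

def resolve(name, servers=SERVERS, start="root", max_hops=10):
    """Follow referrals from the root down; return (address, servers visited)."""
    name = name.rstrip(".").lower()
    labels = name.split(".")
    # Candidate domains, longest first: www.baidu.com, baidu.com, com.
    candidates = [".".join(labels[i:]) for i in range(len(labels))]
    server, visited = start, []
    for _ in range(max_hops):
        visited.append(server)
        zone = servers[server]
        match = next((c for c in candidates if c in zone), None)
        if match is None:
            return None, visited           # this server knows nothing about the name
        kind, value = zone[match]
        if kind == "answer":
            return value, visited
        server = value                     # go one level down and ask again
    raise RuntimeError("too many referrals")

print(resolve("www.baidu.com"))
# ('192.0.2.80', ['root', 'com-server', 'baidu-server'])
```

The list of visited servers shows the path from the root domain down to the server that manages the name.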

DNS Cache service

A query does not always have to start from the root domain at the top, because DNS servers have a caching function that remembers the results of earlier queries. If the domain name and its related information are already in the cache, the answer can be returned directly from the cache. If only part of the path is cached, the search can start from that point and work down from there. Compared with starting from the root domain every time, caching reduces query time.

A response saying that a domain name does not exist is cached as well. That way, the next query for the same nonexistent name can also be answered quickly.

Like any cache, this mechanism has a freshness problem: after information is cached, the original registration may change, and the cached copy may then be wrong. For this reason, information registered in DNS carries a validity period, and when that period expires the data is deleted from the cache. Also, when responding to a query, the DNS server tells the client whether the response came from its cache or from the DNS server that manages the domain name.
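The caching behavior described in this section can be sketched as a small class. It is a simplified illustration rather than how any real DNS server is written: `DnsCache` and `fake_upstream` are hypothetical names, and the 60-second validity period and the address are invented for the example.

```python
import time

class DnsCache:
    """Cache answers, including "does not exist", until their validity period ends."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup      # callable: name -> (address or None, ttl_seconds)
        self._clock = clock
        self._entries = {}         # name -> (address or None, expiry time)

    def resolve(self, name):
        """Return (address or None, from_cache)."""
        entry = self._entries.get(name)
        if entry is not None and self._clock() < entry[1]:
            return entry[0], True               # still valid: answer from cache
        address, ttl = self._lookup(name)       # expired or never seen: ask again
        self._entries[name] = (address, self._clock() + ttl)
        return address, False

def fake_upstream(name):
    # Hypothetical authoritative data with a 60-second validity period.
    return ({"www.baidu.com": "192.0.2.80"}.get(name), 60)

cache = DnsCache(fake_upstream)
print(cache.resolve("www.baidu.com"))       # ('192.0.2.80', False)
print(cache.resolve("www.baidu.com"))       # ('192.0.2.80', True)
print(cache.resolve("no-such.baidu.com"))   # (None, False)
print(cache.resolve("no-such.baidu.com"))   # (None, True)
```

The second element of each result plays the role of the flag that tells the client whether the answer came from the cache. The `clock` parameter is there only so that expiry can be exercised without waiting.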

DNS Security

DNS is a critical part of the network infrastructure that services such as the Web and email depend on. If DNS fails to work properly, every service that depends on it fails too. So, how can DNS servers be attacked?

DDoS attacks on DNS servers

The first thing that comes to mind is a DDoS attack on DNS servers. For example, an attacker could send a flood of packets to every DNS root server, leaving most legitimate DNS requests unanswered. A large-scale DDoS attack of this kind against the DNS root servers actually took place on October 21, 2002. In that attack, the attacker used a botnet to send a large number of ICMP ping packets to each of the 13 DNS root servers. Although the attack succeeded in directing a large volume of packets at the root servers, many of the root servers were protected by packet filters, so there was little or no impact on users' web browsing.

It also helped that most local DNS servers cache the IP addresses of the top-level domain DNS servers, so queries can often bypass the DNS root servers altogether.

Man-in-the-middle attack

In a man-in-the-middle attack, an attacker intercepts a query from a host and returns a forged answer. In the related DNS poisoning attack, the attacker sends forged answers to a DNS server, tricking the server into accepting forged records into its cache. Either attack can be used to redirect unsuspecting users to a site of the attacker's choosing. However, these attacks are difficult to carry out, because they require intercepting packets or throttling servers.

Reflection attack

Another important DNS attack is not an attack on the DNS service itself, but a DDoS attack that uses the DNS infrastructure to attack a target host. In this attack, the attacker sends DNS queries to many authoritative DNS servers, each query carrying a spoofed source address, namely the address of the target host. The DNS servers then send their answers directly to the target host.

If the responses are much larger than the queries, the attacker can flood the target host without having to generate much traffic of their own.
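The effect comes down to a simple ratio between response size and query size. The sketch below only illustrates that arithmetic: the function name `amplification_factor` is made up, and the byte counts are hypothetical, not measurements.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Ratio of traffic arriving at the target to traffic the attacker sends."""
    if request_bytes <= 0:
        raise ValueError("request size must be positive")
    return response_bytes / request_bytes

# Hypothetical sizes chosen only to illustrate the arithmetic.
print(amplification_factor(60, 3000))   # 50.0
```

With these example numbers, every byte the attacker sends turns into fifty bytes arriving at the target.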