Hi, I’m Hui Ye and I’m so cute. The story begins with a classic interview question: what happens between the moment you enter a URL and the moment the page finishes loading? I’m sure you’ve run into this one before. This series follows on from the second series in my “article to understand JS” series. It is designed to help you understand the loading process well enough that both the simple answer and the in-depth answer come fluently!

Preface

First of all, computers on the Internet identify and communicate with each other by IP address.

So when we visit the Baidu website through its domain name, for example https://www.baidu.com/, the domain name is first resolved behind the scenes to find Baidu’s IP address. Only then does the browser fetch the required resource files (HTML + CSS + JS, or image and video files…) from that server to load the page.

So, contrary to what you might think, machines do not communicate by domain name directly.

IP addresses are hard for people to remember, so domain names are used in their place to identify a site’s address.

Domain name resolution can be divided into two stages: the caches are checked first, and only if nothing is found there is a DNS server queried. The rest of this article looks at these two stages in turn.

Cache lookup

Browser cache

When a user types a URL into a browser and hits Enter, our domain name resolution begins.

First, the browser checks the cache to see if there is an IP address for the domain name, and if there is, the resolution is over.

Of course, we’re not always lucky enough to have one in the cache.

That is because the browser’s domain name cache is limited, both in size and in how long entries are kept. An entry usually lives for anywhere from a few minutes to a few hours.

A cache time that is too long or too short is a problem either way. If it is too short, the name has to be resolved again on the next visit, which wastes resources. If it is too long and the IP address bound to the domain name has changed, the cache keeps pointing to the old server address and the page never updates. And if the other party has simply switched servers, the page will not open at all. (People have moved and you keep going to their old address. Isn’t that silly!)
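The trade-off above can be sketched as a small cache whose entries expire. This is only an illustration, not how any particular browser implements its cache: the class name `TtlDnsCache`, the 60-second default and the "evict the oldest entry" rule are all assumptions made for the example.

```python
import time


class TtlDnsCache:
    """Toy browser-style DNS cache: bounded size, entries expire after a TTL."""

    def __init__(self, ttl_seconds=60.0, max_entries=1000, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds    # assumed lifetime of one entry
        self.max_entries = max_entries    # assumed size limit
        self.clock = clock                # injectable so expiry can be tested
        self._entries = {}                # domain -> (ip, expires_at)

    def put(self, domain, ip):
        if domain not in self._entries and len(self._entries) >= self.max_entries:
            # Cache is full: evict the oldest inserted entry
            # (dicts preserve insertion order).
            oldest = next(iter(self._entries))
            del self._entries[oldest]
        self._entries[domain] = (ip, self.clock() + self.ttl_seconds)

    def get(self, domain):
        entry = self._entries.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            # Stale entry: drop it so the caller resolves the name again.
            del self._entries[domain]
            return None
        return ip
```

A short TTL means `get` returns `None` more often and the name is resolved again; a long TTL means an old address can keep being returned after the site has moved.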

Operating system cache

If nothing is found in the browser cache, the operating system has its own cache to fall back on, so the design is quite thorough.

In Windows you can configure this through the C:\Windows\System32\drivers\etc\hosts file, which lets you map any domain name to any IP address you choose. But it is exactly this little feature that hackers have taken advantage of.

Domain name hijacking occurs when a program modifies the system’s domain name resolution settings so that the domain name you want to visit resolves to an IP address chosen by the attacker.

Therefore, in later versions of Windows the hosts file was protected so that ordinary programs cannot modify it (changing it requires administrator privileges), which helps avoid the problem above.
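To make the hosts file less abstract, here is a minimal sketch that parses hosts-style text into a domain-to-IP mapping. The function name `parse_hosts` and the sample entries are made up for illustration, and keeping the first mapping when a name appears twice is an assumption of this sketch rather than a documented rule.

```python
def parse_hosts(text):
    """Parse hosts-file style text into a {hostname: ip} mapping."""
    mapping = {}
    for line in text.splitlines():
        # Everything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no hostname: nothing to map
        ip, names = fields[0], fields[1:]
        for name in names:
            # Assumption: if a name appears twice, keep the first mapping.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Hypothetical sample content using documentation-only addresses.
SAMPLE_HOSTS = """\
# sample entries
127.0.0.1    localhost
192.0.2.10   www.example.test example.test   # two names, one address
"""
```

Calling `parse_hosts(SAMPLE_HOSTS)` gives a dictionary with three names. It also shows why hijacking is so easy once the file can be written to: whoever adds a line decides where a name points.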

Router cache

If the operating system has no cached entry either, the lookup moves on to the router, which may hold a cached resolution result of its own.

It is important to note that these caches only store the results of earlier resolutions; none of them can actually resolve a domain name itself.
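The order of the three caches can be sketched as a simple chain. This is a simplified model, not real browser or OS code: the caches are plain dictionaries, and `query_dns` is a hypothetical stand-in for the real DNS request described in the next section. Writing the fresh answer back into every layer is also an assumption of the sketch.

```python
def lookup_in_caches(domain, caches):
    """Return (layer_name, ip) from the first cache holding the domain, else None."""
    for layer_name, cache in caches:
        ip = cache.get(domain)
        if ip is not None:
            return layer_name, ip
    return None


def resolve(domain, caches, query_dns):
    """Check each cache in order; call query_dns only if every cache misses."""
    hit = lookup_in_caches(domain, caches)
    if hit is not None:
        return hit
    ip = query_dns(domain)
    # Assumption: store the fresh answer in every layer for next time.
    for _, cache in caches:
        cache[domain] = ip
    return "dns", ip
```

With `caches = [("browser", {}), ("os", {}), ("router", {})]`, the first call for a name falls through to `query_dns`, and a second call for the same name is answered by the browser layer without any DNS request.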

DNS

Concept

The Domain Name System (DNS) is one of the core services of the Internet’s basic infrastructure. It is used to translate Internet domain names into IP addresses. Its main function is the equivalent of a phone book for the Internet: you look up a name and get back an address.

LDNS

If none of the caches above holds a resolution result, a real request is made to a DNS server. The first server asked is the LDNS, that is, the local DNS. These servers are usually located close to the user, perform well, and cache resolution results themselves, so about 80% of resolutions are completed at this step. The LDNS therefore does most of the domain name resolution work.

When the LDNS has no cached result either, the request is passed on to a root DNS server.

The DNS server

DNS servers are classified into three types:

Root DNS server --> Top-level DNS server --> Authoritative DNS server

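There are 13 root DNS server addresses in the world (named a.root-servers.net through m.root-servers.net), each of which is served by many machines. A root DNS server does not resolve domain names directly; instead, it refers each resolution request to the appropriate server at the next level down.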

Take https://www.baidu.com for example:

After receiving the resolution request from the local DNS, the root DNS server sees that the suffix is .com and returns the IP address of the top-level domain name server responsible for .com to the local DNS.

The local DNS then sends the request to the .com top-level domain name server, which checks whether it has a resolution result for the domain name. If it doesn’t, it returns the IP address of the authoritative DNS server for baidu.com to the local DNS, which then asks that server about www.baidu.com, and so on, working down the levels of the domain name one at a time.

Root DNS server --> .com top-level DNS server --> baidu.com authoritative DNS server --> www.baidu.com authoritative DNS server
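The walk from the root down to the authoritative server can be simulated in a few lines. Everything in this sketch is hypothetical: the `SERVERS` table, the function names, and all the addresses (taken from documentation-only ranges) are made up, and example.com stands in for a real site. No real DNS packets are sent.

```python
# Hypothetical referral data: each "server" either knows the answer
# or knows which server to ask next for a given suffix.
ROOT = "198.51.100.1"
SERVERS = {
    "198.51.100.1": {"referrals": {"com": "198.51.100.2"}, "answers": {}},          # root
    "198.51.100.2": {"referrals": {"example.com": "198.51.100.3"}, "answers": {}},  # .com
    "198.51.100.3": {"referrals": {}, "answers": {"www.example.com": "192.0.2.80"}},  # authoritative
}


def ask(server_ip, domain):
    """Simulate one query: ('answer', ip), ('referral', next_ip) or ('nxdomain', None)."""
    server = SERVERS[server_ip]
    if domain in server["answers"]:
        return "answer", server["answers"][domain]
    # Look for the longest suffix of the domain this server can refer us for.
    labels = domain.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in server["referrals"]:
            return "referral", server["referrals"][suffix]
    return "nxdomain", None


def resolve_iteratively(domain, root_ip=ROOT, max_steps=10):
    """Follow referrals from the root down, the way the local DNS does."""
    path = [root_ip]
    server_ip = root_ip
    for _ in range(max_steps):
        kind, value = ask(server_ip, domain)
        if kind == "answer":
            return value, path
        if kind == "nxdomain":
            return None, path
        server_ip = value          # a referral: ask the next server down
        path.append(server_ip)
    raise RuntimeError("too many referrals")
```

`resolve_iteratively("www.example.com")` returns the final address together with the list of servers that were asked, which in this toy data is the root, then the .com server, then the authoritative server.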

As you can see, the structure of a domain name closely mirrors the hierarchy of DNS servers, which is how the vast web of the Internet is divided into layers.
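To see the layers hidden inside a name, a domain can be split from right to left. The helper `zone_chain` below is a hypothetical name for this sketch; it only manipulates strings and does not contact any server.

```python
def zone_chain(domain):
    """List the levels of a domain name from the root down to the full name."""
    name = domain.strip(".")
    chain = ["."]  # the root sits above every domain name
    if not name:
        return chain
    labels = name.split(".")
    # Walk the labels from right to left, adding one level each time.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain
```

For www.baidu.com this gives the root, then com., then baidu.com., then www.baidu.com., which is the same order in which the DNS servers are asked.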