This is the 13th article in the CXuan Computer Network series for programmers.

So far we have covered the application layer, the transport layer, the network layer, and the data link layer, so it is time to string them all together in a comprehensive review. I am going to walk through how these protocols cooperate in a computer network and how packets are sent and received, from typing the URL and pressing Enter to the final page appearing in front of you.

First, I open the browser and enter maps.google.com in the Chrome address bar.

Then…

Find the DNS cache

The browser checks four caches at this stage. The first is the browser's own DNS cache.

The browser keeps DNS records for the sites you visit for a fixed period of time, so this is the first place a DNS lookup is attempted. The browser checks whether it already holds a DNS record for the host name in the URL, which would give it the IP address of the target directly.

I use Chrome, and on a Mac I cannot use chrome://net-internals/#dns to find the corresponding IP address. It can be found on Windows.

So how do you query DNS records on a Mac? You can use the nslookup command, but that is not our topic here.

DNS (Domain Name System) is a distributed database that maintains the mapping from domain names to IP addresses. On the Internet, an IP address is the kind of address computers understand, while a domain name is the alias we humans can understand and remember. DNS is responsible for translating the name a human remembers into the address a computer can use. Each domain name resolves to at least one IP address.

For example, Google’s official website is www.google.com, and one IP address it resolves to is 216.58.200.228. You can type either one into the address bar, but the IP address is hard to remember, while google.com is simple and clear. DNS is like the household phone book we used years ago: if you want to call CXuan but do not remember the number, you look CXuan up in the phone book.

The second place the browser checks is the operating system cache. If the DNS record is not in the browser cache, the browser asks the operating system to resolve the name; on Windows this is a call to gethostbyname.

On Linux and most UNIX systems, the operating system may have no DNS cache unless nscd is installed.

nscd is the name service cache daemon on Linux systems.

The third place to check is the router cache. If the DNS record is not on your computer, the browser turns to the router it is connected to, which maintains its own DNS cache.

If the router also has no matching DNS record, the browser checks whether the ISP has one cached. The ISP cache belongs to your local Internet service provider. The ISP runs its own DNS servers, and it caches DNS records for the same reason: to cut request time and respond faster. Once you have visited a site, your ISP may keep its record for quick access next time. Shocked that your ISP can tell which sites you visit often?

You may wonder why so many caches are checked in the first place, and you may be uncomfortable that caches can reveal something about our privacy, but they are crucial for regulating network traffic and reducing data transfer times.

So the DNS cache lookup described above runs in this order: browser cache, operating system cache, router cache, ISP cache.

If none of the four steps above yields a DNS record, there is no cached answer, and a DNS query must be initiated to find the IP address of the target host (in this case, maps.google.com).
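The four-step cache check can be sketched as an ordered chain of lookups. This is only an illustration: the `lookup_cached` function, the dictionary layout, and the expiry times are assumptions made for the example, and the address is simply the sample value used earlier in this article, not how any particular browser or operating system stores its records.

```python
import time

def lookup_cached(name, caches, now=None):
    """Return (source, ip) from the first cache holding an unexpired record, else None.

    `caches` is an ordered list of (label, dict) pairs; each dict maps a
    host name to (ip, expires_at). This structure is hypothetical.
    """
    now = time.time() if now is None else now
    for label, cache in caches:
        record = cache.get(name)
        if record is None:
            continue
        ip, expires_at = record
        if expires_at > now:
            return label, ip
        del cache[name]  # expired record: drop it and keep looking
    return None

caches = [
    ("browser", {}),
    ("os", {"maps.google.com": ("216.58.200.228", 50.0)}),   # expired at now=100
    ("router", {}),
    ("isp", {"maps.google.com": ("216.58.200.228", 500.0)}),
]

result = lookup_cached("maps.google.com", caches, now=100.0)
miss = lookup_cached("example.org", caches, now=100.0)
print(result, miss)
```

When `lookup_cached` returns `None`, every cache has missed and a real DNS query is needed.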

Initiate a DNS query

As mentioned above, if I want my computer to connect to maps.google.com and communicate with it, I need to know the IP address of maps.google.com. Because of how DNS is designed, the local DNS server may not be able to give me the answer itself. It then has to ask multiple DNS servers across the Internet to find the correct IP address for the website.

Here’s a question: why do I need to search multiple DNS servers to find the IP address of a website? Wouldn’t one server be enough?

Because DNS is a distributed system of name servers: each server maintains only part of the mapping from domain names to IP addresses, and no single server holds the whole mapping.

The simplest design for DNS would be a single DNS server containing all the DNS mappings. This centralized design is not suitable for today’s Internet, with its large and growing number of hosts. It has several problems:

  • A single point of failure: if the DNS server crashes, the entire network stops working.
  • Traffic volume: a single DNS server would have to handle every DNS query, which can number in the millions or tens of millions, more than one server can handle.
  • Distant centralized database: a single DNS server cannot be close to all users. If the server is in the United States, queries from Australia must cross slow and congested links, causing serious delays.
  • Maintenance: a single server holding every record is costly to maintain and needs frequent updates.

Therefore DNS cannot be designed centrally for today’s network, because such a design does not scale at all. It adopts a distributed design instead, with the following characteristics:

Distributed, hierarchical database.

The first problem the distributed design solves is the scalability of DNS. DNS uses a large number of servers, organized hierarchically and distributed around the world. No single DNS server holds the mappings for all hosts on the Internet; instead, the mappings are spread across all DNS servers.

Roughly speaking, there are three types of DNS servers: root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers. The hierarchical model of these servers is shown in the figure below.

  • Root DNS servers: there are more than 400 root name server instances around the world, grouped under 13 root server names (a through m). A list of root name servers and the organizations that operate them is available at https://root-servers.org/. The root name servers provide the IP addresses of the TLD servers.
  • Top-level domain (TLD) DNS servers: there is a TLD server or server cluster for every top-level domain such as com, org, net, edu, and gov, and for every country-code domain such as uk, fr, ca, and jp. For a list of all top-level domains see https://tld-list.com/. The TLD servers provide the IP addresses of the authoritative DNS servers.
  • Authoritative DNS servers: every organization with publicly accessible hosts on the Internet, such as web servers and mail servers, must provide accessible DNS records that map the names of those hosts to IP addresses. The organization’s authoritative DNS server houses these records.

With the design of the DNS servers in mind, we return to the steps of the DNS lookup.

Three types of queries occur in a DNS lookup. By combining them, an optimized DNS resolution process can shorten the distance traveled. Ideally, cached record data is available, allowing a DNS name server to answer directly with a non-recursive query.

  • Recursive query: a DNS client requires that the DNS server (typically a DNS recursive resolver) respond with either the requested resource record or an error message if the resolver cannot find the record.

  • Iterative query: if the queried DNS server has no match for the query name, it returns a referral to a DNS server that is authoritative for a lower level of the domain namespace. The DNS client then queries the referral address. This continues down the query chain until an answer is returned or an error or timeout occurs.

  • Non-recursive query: this typically occurs when a DNS resolver client queries a DNS server for a record the server already has access to, either because it is authoritative for the record or because the record is in its cache. DNS servers typically cache DNS records and return cached results directly, which avoids extra bandwidth consumption and load on upstream servers.

The component responsible for starting the DNS lookup is the DNS resolver, usually a DNS server maintained by the ISP. Its main job is to ask other DNS servers on the network for the correct IP address.

If you want to know more about DNS, see CXuan’s ten-thousand-word article on the DNS protocol!

So for maps.google.com, if the DNS server maintained by the ISP has no cached record, it sends a query to a root DNS server, which refers it to the .com top-level domain server. The .com TLD server refers it to the google.com authoritative server. The google.com name server finds the IP address matching maps.google.com in its DNS records and returns it to your DNS resolver, which sends it back to your browser.

It is important to note that DNS query packets pass through many routers and devices before reaching servers such as the root name servers. Each router or device along the way uses its routing table to decide which path is the fastest for the packet to reach its destination. This involves routing algorithms; if you want to know more about them, take a look at this article: https://www.cisco.com/c/en/us…
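The referral chain from root to TLD to authoritative server can be imitated with a toy in-memory model. This is a sketch under stated assumptions: the server labels (`root`, `tld-com`, `ns-google`), the `query` and `resolve` helpers, and the address are placeholders for illustration, and no real DNS traffic is involved.

```python
# Toy zone data: each "server" either knows the answer or refers us onward.
# All server names and the address are illustrative placeholders.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"google.com": "ns-google"}},
    "ns-google": {"records": {"maps.google.com": "216.58.200.228"}},
}

def query(server, name):
    """Ask one server: returns ("answer", ip), ("referral", next_server) or ("error", None)."""
    data = SERVERS[server]
    if name in data.get("records", {}):
        return "answer", data["records"][name]
    for suffix, next_server in data.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    return "error", None

def resolve(name, start="root", max_hops=10):
    """Follow referrals from the root until an answer or an error is returned."""
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "error":
            return None, path
        server = value  # a referral: ask the next server down the hierarchy
    return None, path

ip, path = resolve("maps.google.com")
print(ip, path)
```

The `path` list records which servers were asked, which mirrors the root, TLD, authoritative order described in the text.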

ARP request

I’ve read a lot of articles on this topic that never mention this step: the ARP request.

When do you need to send an ARP request?

There is a condition here:

  • If the DNS server and our host are in the same subnet, the system sends an ARP query for the DNS server, following the ARP procedure below.
  • If the DNS server and our host are on different subnets, the system sends an ARP query for the default gateway, following the ARP procedure below.
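The same-subnet test in the two cases above can be sketched with Python's standard `ipaddress` module. The `arp_target` helper and all addresses here are assumptions chosen for illustration.

```python
import ipaddress

def arp_target(dest_ip, interface, gateway_ip):
    """Pick the IP address to resolve with ARP: the destination if local, else the gateway.

    `interface` is the host's own address with its prefix, e.g. "192.168.1.2/24".
    """
    network = ipaddress.ip_interface(interface).network
    if ipaddress.ip_address(dest_ip) in network:
        return dest_ip
    return gateway_ip

# A DNS server on the local subnet is resolved directly;
# one elsewhere is reached through the default gateway.
local = arp_target("192.168.1.53", "192.168.1.2/24", "192.168.1.1")
remote = arp_target("8.8.8.8", "192.168.1.2/24", "192.168.1.1")
print(local, remote)
```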

ARP stands for Address Resolution Protocol. It is the protocol that maps an IP address to a MAC address, that is, it asks for the MAC address corresponding to a target IP.

In short, ARP is an address resolution protocol that uses an IP address as the clue to locate the MAC address of the next host that should receive the packet. If the target host is not on the same link, the MAC address of the next-hop router is looked up instead.

For an overview of why we need MAC addresses when we already have IP addresses, see this answer on Zhihu:
https://www.zhihu.com/questio…

The general workflow of ARP is as follows

Suppose hosts A and B are on the same link, so no router is needed between them. Host A wants to send an IP packet to host B. Host A’s address is 192.168.1.2 and host B’s address is 192.168.1.3, and neither knows the other’s MAC address. Hosts C and D are other hosts on the same link.

To obtain host B’s MAC address, host A broadcasts an ARP request packet to all hosts on the Ethernet. The ARP request contains the IP address of host B, whose MAC address host A wants to learn.

The ARP request sent by host A is received and parsed by every host and router on the same link. Each one checks the request, and if the target IP address in it matches its own, it writes its own MAC address into an ARP response packet and returns it to host A.

Thus ARP obtains the MAC address from the IP address, which makes communication within the same link possible.
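To make the exchange between host A and host B concrete, here is a minimal sketch that packs the 28-byte ARP payload for Ethernet and IPv4 with `struct`. The MAC addresses and the `build_arp` helper are made up for the example, and the surrounding Ethernet frame header (whose destination would be the broadcast address for a request) is left out.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"  # 28-byte ARP payload for Ethernet + IPv4

def mac_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))

def build_arp(oper, sender_mac, sender_ip, target_mac, target_ip):
    """Pack an ARP payload; oper is 1 for a request and 2 for a reply."""
    return struct.pack(
        ARP_FORMAT,
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # MAC address length, IPv4 address length
        oper,
        mac_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_bytes(target_mac), socket.inet_aton(target_ip),
    )

# Host A asks "who has 192.168.1.3?"; the target MAC is unknown, so it is zeroed.
request = build_arp(1, "aa:aa:aa:aa:aa:aa", "192.168.1.2",
                    "00:00:00:00:00:00", "192.168.1.3")
# Host B answers with its own (hypothetical) MAC as the sender.
reply = build_arp(2, "bb:bb:bb:bb:bb:bb", "192.168.1.3",
                  "aa:aa:aa:aa:aa:aa", "192.168.1.2")

fields = struct.unpack(ARP_FORMAT, reply)
answer_mac = ":".join(f"{b:02x}" for b in fields[5])
print(len(request), answer_mac)
```

Host A reads the sender hardware address of the reply (`fields[5]`) to learn host B's MAC address.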

So, in order to send an ARP broadcast, we need to have a target IP address, and we also need to know the MAC address of the interface used to send the ARP broadcast.

This is where the concept of an ARP cache comes in.

Now you know that the MAC address can be determined by sending an ARP request before sending an IP packet. But does every packet have to go through the whole sequence of broadcast, ARP response, and return to the host?

Think about how a browser handles this. Browsers have built-in caches that hold the addresses you have used most recently, and the same is true for ARP. The key to efficient ARP operation is the ARP cache (or ARP table) maintained on each host and router, which holds the IP-to-MAC mappings. The MAC address obtained by the first ARP exchange is stored in the table as an IP-to-MAC mapping, so the next time a datagram is sent to that address there is no need to send another ARP request; the MAC address in the table is used directly. Each time an ARP exchange is performed, the corresponding entry in the table is replaced with the new result.

The ARP cache reduces network traffic and, to a certain extent, keeps ARP broadcasts off the network.

In general, once an IP datagram has been sent to a host, more datagrams to the same host are likely to follow, so the ARP cache reduces the number of ARP packets sent. In addition, it is not only the sender of the ARP request that can cache the receiver’s MAC address: the receiver can also cache the IP and MAC addresses of the requester, as shown below.

However, MAC addresses are cached for a certain period of time, after which the cached contents are cleared.
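An ARP cache with expiring entries might look like the following sketch. The `ArpCache` class, the 60-second lifetime, and the MAC address are assumptions for illustration; real operating systems use their own timeouts and data structures.

```python
import time

class ArpCache:
    """IP-to-MAC table whose entries expire after `ttl` seconds (the value is an assumption)."""

    def __init__(self, ttl=60.0):
        self.ttl = ttl
        self._entries = {}  # ip -> (mac, stored_at)

    def store(self, ip, mac, now=None):
        self._entries[ip] = (mac, time.monotonic() if now is None else now)

    def lookup(self, ip, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, stored_at = entry
        if now - stored_at >= self.ttl:
            del self._entries[ip]   # too old: the caller must send a new ARP request
            return None
        return mac

cache = ArpCache(ttl=60.0)
cache.store("192.168.1.3", "bb:bb:bb:bb:bb:bb", now=0.0)
hit = cache.lookup("192.168.1.3", now=10.0)
expired = cache.lookup("192.168.1.3", now=61.0)
print(hit, expired)
```

The explicit `now` argument exists only so the example behaves the same on every run; normal callers would omit it.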

For a deeper understanding of the ARP protocol, see this article by CXuan.

Arp, the man behind the net


So the system first queries the ARP cache for the target IP, and on a cache hit it returns the result: target IP = MAC.

If the cache is not hit:

  • Look at the routing table to see whether the target IP address is in one of the subnets in the local routing table. If it is, use the interface connected to that subnet; otherwise use the interface connected to the default gateway.
  • Look up the MAC address of the selected network interface.
  • Send a data link layer (layer 2) ARP request:

Depending on the type of hardware between the host and the router, there are the following cases:

Direct:

  • If we are directly connected to the router, the router responds with an ARP Reply (see below).

Hub:

  • If we are connected to a hub, the hub broadcasts the ARP request out of all its other ports. If the router is connected to the same hub, it responds with an ARP Reply.

Switch:

  • If we are connected to a switch, the switch checks its local CAM/MAC table to see which port has the MAC address we are looking for. If it finds no entry, it rebroadcasts the ARP request to all other ports.
  • If the switch has an entry in its MAC/CAM table, it sends the ARP request only to the port that has the MAC address we are looking for.
  • If the router is on the same link, it responds with an ARP Reply.
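The switch case can be modelled as a small learning switch that remembers which port each source MAC address was seen on and floods frames whose destination it does not know. This is a simplified sketch: the `Switch` class, port numbers, and MAC addresses are hypothetical, and real switches also age out entries and handle VLANs.

```python
class Switch:
    """Toy learning switch: remembers which port each source MAC was seen on."""

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port          # learn the sender's port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == self.BROADCAST or out_port is None:
            return [p for p in self.ports if p != in_port]   # flood to all other ports
        return [out_port]

switch = Switch(ports=[1, 2, 3])
# A broadcast ARP request arrives from our host on port 1 and is flooded.
flooded = switch.receive(1, "aa:aa:aa:aa:aa:aa", Switch.BROADCAST)
# The router on port 3 replies directly to our host; the switch now knows port 1.
direct = switch.receive(3, "cc:cc:cc:cc:cc:cc", "aa:aa:aa:aa:aa:aa")
print(flooded, direct)
```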

ARP Reply:

Now that we have the MAC address of the DNS server or the default gateway, we can proceed with the DNS request:

  • The client sends a UDP request packet to port 53 on the DNS server. If the response is too large, TCP is used instead.
  • If the local/ISP DNS server does not have the result, it sends a recursive query to the DNS servers above it, one level at a time, until the authoritative server is reached, and returns the result if one is found.

(these are from: https://github.com/skyline754.)
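For reference, the UDP request sent to port 53 is a small binary message. The sketch below builds a query for an A record using the standard DNS message layout; the `build_dns_query` name, the transaction ID, and the resolver address in the comment are placeholders, and nothing is sent on the network.

```python
import struct

def build_dns_query(name, query_id=0x1234):
    """Build a DNS query message for an A record with recursion desired."""
    header = struct.pack(
        "!HHHHHH",
        query_id,   # transaction ID (arbitrary value chosen for the example)
        0x0100,     # flags: standard query, RD (recursion desired) set
        1, 0, 0, 0  # one question, no answer/authority/additional records
    )
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question

packet = build_dns_query("maps.google.com")
print(len(packet), packet[12:17])
# To actually send it (not done here; the resolver address is a placeholder):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.sendto(packet, ("192.0.2.53", 53))
```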


Encapsulating TCP packets

After the browser gets the IP address of the target server, it determines the port number from the URL (the default port is 80 for HTTP and 443 for HTTPS) and prepares the TCP packet. The data is wrapped layer by layer on its way down the stack, and when it arrives at the target host, the target host unwraps it layer by layer. The complete encapsulation and parsing process is as follows.

The details are not covered here; readers can refer to CXuan’s article on TCP/IP basics for a fuller explanation.
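How the port is derived from the URL can be sketched with the standard `urllib.parse` module. The `target_port` helper and the `:8443` example URL are assumptions for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def target_port(url):
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

ports = [target_port(u) for u in (
    "http://maps.google.com/",
    "https://maps.google.com/",
    "https://maps.google.com:8443/maps",   # hypothetical explicit port
)]
print(ports)
```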

The browser establishes a TCP connection to the target server

After the DNS and ARP lookups above, the browser has the target server’s IP address and the host has the MAC address of the next hop, so the browser can establish a connection with the target server to transfer information. Many Internet protocols could be used here, but the transport-layer protocol that HTTP uses to establish the connection is TCP. So this step is the process of establishing a TCP connection between the browser and the target server.

Establishing a TCP connection requires a three-way handshake, in which the browser and the server exchange SYN (synchronize) and ACK (acknowledgement) segments.

Suppose the client host is on the left and the server host is on the right. At the beginning, both ends are in the CLOSED state.

  1. The server process prepares to accept TCP connections from outside and enters the LISTEN state, waiting for client connection requests.
  2. The client sends a connection request to the server with the synchronization bit SYN = 1 and a chosen initial sequence number, seq = x. A SYN segment is not allowed to carry data, but it consumes one sequence number. The client then enters the SYN-SENT state.
  3. After receiving the connection request, the server acknowledges the client’s segment. In the acknowledgement segment both the SYN and ACK bits are set to 1, the acknowledgement number is ack = x + 1, and the server chooses its own initial sequence number, seq = y. This segment cannot carry data either, but it also consumes one sequence number. The server then enters the SYN-RECEIVED state.
  4. After receiving the server’s response, the client must acknowledge it in turn. In this segment the ACK bit is set to 1, the sequence number is seq = x + 1, and the acknowledgement number is ack = y + 1. TCP specifies that this segment may or may not carry data. If it carries no data, the next segment still uses seq = x + 1. The client then enters the ESTABLISHED state.
  5. The server also enters the ESTABLISHED state after receiving the client’s acknowledgement.

This completes the three-way handshake, the connection is established, and the two parties can communicate directly.
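The sequence and acknowledgement numbers in the handshake can be traced with a small simulation. This is only a bookkeeping sketch: the `three_way_handshake` function and the initial sequence numbers 1000 and 5000 are made up for the example, and no real sockets are involved.

```python
def three_way_handshake(x, y):
    """Return the three segments exchanged and the final state of each side.

    x and y are the initial sequence numbers chosen by client and server.
    """
    client_state, server_state = "CLOSED", "LISTEN"

    # 1) Client -> server: SYN with the client's initial sequence number.
    syn = {"SYN": 1, "seq": x}
    client_state = "SYN-SENT"

    # 2) Server -> client: SYN + ACK, acknowledging x and announcing y.
    syn_ack = {"SYN": 1, "ACK": 1, "seq": y, "ack": syn["seq"] + 1}
    server_state = "SYN-RECEIVED"

    # 3) Client -> server: ACK acknowledging y.
    ack = {"ACK": 1, "seq": x + 1, "ack": syn_ack["seq"] + 1}
    client_state = "ESTABLISHED"
    server_state = "ESTABLISHED"  # once the server receives the final ACK

    return [syn, syn_ack, ack], (client_state, server_state)

segments, states = three_way_handshake(x=1000, y=5000)
for segment in segments:
    print(segment)
```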

The browser sends an HTTP request to the Web server

Once the TCP connection is established, it is time to start transferring data. The browser sends a GET request asking the target server for a page from maps.google.com. If you submit a form, it sends a POST request instead. GET and POST are the two most common HTTP request methods, together accounting for more than 90 percent of all HTTP requests.

Besides the request method, an HTTP request carries many other pieces of information, the most common being the Host, Connection, User-Agent, and Accept-Language headers.

Host names the host on which the requested object resides. Connection: close tells the server that the browser wants a non-persistent connection, so the server should close the connection after sending the requested object. User-Agent tells the web server what kind of browser is making the request, for example Mozilla/5.0 for Firefox. Accept-Language tells the web server which language version of the object the browser prefers, for example French; if the server has no such version, it sends its default version. For details on individual header fields, refer to the MDN documentation at https://developer.mozilla.org…

HTTP headers fall into four types: general headers, request headers, response headers, and entity headers.

There is a lot to say about each of the four; if you want to learn more about HTTP headers, refer to this article by CXuan:
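Put together, a request carrying these four headers is just a few lines of text ending in a blank line. A minimal sketch, assuming a hypothetical `build_get_request` helper and the header values mentioned in the text:

```python
def build_get_request(host, path="/", language="fr", close=True):
    """Assemble an HTTP/1.1 GET request as the bytes written to the TCP connection."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Connection: {'close' if close else 'keep-alive'}",
        "User-Agent: Mozilla/5.0",
        f"Accept-Language: {language}",
    ]
    # Each line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_get_request("maps.google.com")
print(request.decode("ascii"))
```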

Deep understanding of HTTP headers

The server processes the request and sends back a response

The target machine runs a web server, such as Apache, which receives the request from the browser, passes it to a request handler, and generates a response.

The request handler is a program, typically written in a language such as .NET, PHP, or Ruby, that reads the request, checks its content and cookies, and updates information on the server if necessary. It then composes a response in a specific format such as JSON, XML, or HTML.
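As a rough idea of what a request handler does, here is a toy version written in Python rather than the languages named above. The `handle_request` function, the `/api/greeting` route, and the `user` cookie are all hypothetical.

```python
import json

def handle_request(method, path, cookies):
    """Toy request handler: returns (status, headers, body). Routes are hypothetical."""
    if method != "GET":
        return 405, {"Allow": "GET"}, b""
    if path == "/api/greeting":
        user = cookies.get("user", "guest")      # read a value from the cookies
        body = json.dumps({"message": f"hello, {user}"}).encode("utf-8")
        return 200, {"Content-Type": "application/json"}, body
    return 404, {"Content-Type": "text/html"}, b"<h1>Not Found</h1>"

status, headers, body = handle_request("GET", "/api/greeting", {"user": "cxuan"})
missing = handle_request("GET", "/nope", {})
print(status, headers, body)
```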

The server sends back an HTTP response

The server response contains the requested page along with the status code, the compression type (Content-Encoding), how the page may be cached (Cache-Control), any cookies to set, privacy information, and so on.

For example, here is a response body
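As a hedged illustration (this is a hand-written sample, not a response captured from maps.google.com), the sketch below shows what a small HTTP response looks like on the wire and how the status code, headers, and body can be separated. The `parse_response` helper and all header values are assumptions.

```python
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Cache-Control: max-age=3600\r\n"
    b"Set-Cookie: session=abc123\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)

def parse_response(raw):
    """Split a raw HTTP/1.x response into status code, header dict and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return int(code), headers, body

status_code, response_headers, response_body = parse_response(RAW_RESPONSE)
print(status_code, response_headers["cache-control"], response_body)
```

A plain dictionary keeps only the last value of a repeated header, so a real parser would need a different structure for headers such as Set-Cookie that may appear several times.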

For an in-depth understanding of HTTP requests and responses, see this article

After reading this HTTP article, you should be able to argue with the interviewer

The browser displays the relevant HTML content

The browser displays the HTML content in stages. First it renders the bare HTML skeleton. Then it examines the HTML markup and sends GET requests for the other elements on the page, such as images, CSS stylesheets, and JavaScript files. These static files are cached by the browser, so the next time you visit the page they do not have to be requested again. Finally, the content of maps.google.com appears in your browser.
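Finding the extra resources a page refers to can be sketched with the standard `html.parser` module. The `ResourceCollector` class and the sample page are assumptions for illustration; a real browser's parser and loading order are far more involved.

```python
from html.parser import HTMLParser

class ResourceCollector(HTMLParser):
    """Collect URLs of images, stylesheets and scripts referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])

# A made-up page with one stylesheet, one script and one image.
PAGE = """<html><head>
<link rel="stylesheet" href="/static/site.css">
<script src="/static/app.js"></script>
</head><body><img src="/static/logo.png"><p>Hello</p></body></html>"""

collector = ResourceCollector()
collector.feed(PAGE)
resources = collector.resources
print(resources)
```

Each collected URL would then be fetched with its own GET request unless a cached copy is still valid.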
