
This question is a well-worn classic that every front-end or back-end developer should know by heart. The process can be described in broad strokes or in great detail, and it touches on many networking concepts, so it is worth organizing carefully.

Generally speaking, the process can be divided into the following steps:

  • Entering the URL

  • DNS resolution

  • Establishing the TCP connection (three-way handshake, connection reuse)

  • Sending the HTTP request (request line, headers, blank line, body)

  • The server returns the HTTP response

  • Closing the TCP connection (four-way wave)

  • The browser parses and renders the page

1. Entering the URL

First we enter a URL in the browser. URL stands for Uniform Resource Locator: a concise representation of where a resource is located on the Internet and how to access it. It is the standard address for resources on the Internet. Every file on the Internet has a unique URL, which contains information indicating where the file is and how the browser should handle it.

Components: protocol://hostname[:port]/path/[;parameters]

Note that browsers follow the same-origin policy, which is why front-end code often runs into cross-origin problems when calling APIs. An origin is the combination of protocol, domain name, and port number. Two URLs are same-origin only if all three are identical; if any one of them differs, the request is cross-origin.
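
The same-origin comparison can be sketched in a few lines. This is only an illustration, not how a browser implements it: the helper names `origin` and `same_origin` and the example.com URLs are made up here, and the default-port table covers only HTTP and HTTPS.

```python
from urllib.parse import urlsplit

# Default ports, so that https://example.com and https://example.com:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (protocol, host, port) triple that defines the URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a, b):
    """Two URLs are same-origin only if protocol, host, and port all match."""
    return origin(a) == origin(b)

print(same_origin("https://example.com/a", "https://example.com:443/b"))  # True
print(same_origin("https://example.com", "http://example.com"))           # False: protocol differs
print(same_origin("https://example.com", "https://api.example.com"))      # False: host differs
```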

2. DNS resolution

DNS resolution is the process of finding out which machine holds the resource you need. When you type an address such as www.google.com into the browser, that name is not the machine’s real address. Every computer on the Internet is uniquely identified by its IP address, but IP addresses are hard to remember, so users prefer easy-to-remember names such as Baidu’s domain name. Internet designers therefore had to make a tradeoff between what is convenient for users and what machines can use, and the result is the translation from domain name to IP address, a process known as DNS resolution. DNS acts as a translator from domain names to IP addresses. How does this translation work?

2.1 DNS resolution process

DNS lookup order: browser cache -> operating system cache -> local hosts file -> router cache -> ISP DNS cache -> root DNS server / top-level DNS server.

Take the lookup of the IP address of www.google.com as an example: . -> .com. -> google.com. -> www.google.com. Some readers may wonder about the extra dot at the end. It is not a typo: the trailing dot stands for the root domain. Every domain name ends with the root domain, but it is omitted by default, so www.google.com is really www.google.com., and resolution starts from the root domain name servers.
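
To make the root-first order concrete, here is a small sketch that lists the zones visited for a name. The function name `resolution_chain` is made up for this illustration; it only manipulates strings and performs no real DNS query.

```python
def resolution_chain(hostname):
    """List the zones consulted for a name, starting from the root domain "."."""
    labels = hostname.rstrip(".").split(".")
    chain = ["."]
    # Walk from the rightmost label (the top-level domain) to the full name.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain

print(" -> ".join(resolution_chain("www.google.com")))
# . -> com. -> google.com. -> www.google.com.
```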

2.2 Two DNS query methods: Recursive query and iterative query

  1. Recursive resolution

    When the local DNS server cannot answer a query by itself, it has to ask other DNS servers. There are two ways to do this; the first is recursive resolution. The local DNS server queries other DNS servers on the client’s behalf: generally it asks the root server for the domain first, and the query is then passed down level by level until the name is found. The result is returned to the local DNS server, which returns it to the client.

  2. Iterative resolution

    The local DNS server can also handle the query iteratively. Instead of querying other DNS servers itself, it returns to the client’s DNS program the IP addresses of other DNS servers that can resolve the name. The client’s DNS program then queries those servers in turn until it obtains the result. In other words, iterative resolution only points you to the relevant server; it does not do the lookup for you. It is as if the server said: “The server for baidu.com is at 192.168.4.5. Go and ask it yourself; I’m busy, so this is as far as I can help.”
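
The difference between the two methods can be shown with a toy model. Everything in this sketch is an assumption made for illustration: the three in-memory "servers", their names, and the 192.0.2.10 address (a documentation-only range). No network traffic is involved.

```python
# Toy zone data: each "server" either knows the answer or refers to another server.
SERVERS = {
    "root": {"referral": {"com.": "tld-com"}},
    "tld-com": {"referral": {"example.com.": "ns-example"}},
    "ns-example": {"answer": {"www.example.com.": "192.0.2.10"}},
}

def ask(server, name):
    """One query to one server: returns ("answer", ip) or ("referral", next_server)."""
    zone = SERVERS[server]
    if name in zone.get("answer", {}):
        return ("answer", zone["answer"][name])
    for suffix, next_server in zone.get("referral", {}).items():
        if name.endswith(suffix):
            return ("referral", next_server)
    raise LookupError(name)

def resolve_iterative(name, server="root"):
    """Iterative: the querier follows every referral itself."""
    hops = []
    while True:
        hops.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, hops
        server = value  # "go and ask that server yourself"

def resolve_recursive(name, server="root"):
    """Recursive: each server asks the next one and hands the answer back."""
    kind, value = ask(server, name)
    if kind == "answer":
        return value
    return resolve_recursive(name, value)

print(resolve_iterative("www.example.com."))
print(resolve_recursive("www.example.com."))
```

Both functions reach the same answer; what differs is who does the walking. In the iterative version the caller sees every hop, while in the recursive version it only sees the final result.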

2.3 DNS Load Balancing

When a site has enough users, serving every request from the same machine means that machine could crash at any moment. The solution is DNS load balancing. The principle is to configure multiple IP addresses for the same hostname on the DNS server. When answering queries, the DNS server returns a different result for each query, following the order of the IP addresses recorded for that host, so different clients are directed to different servers and the load is balanced. The choice can also be based on, for example, the load on each machine or the machine’s geographic distance from the user.
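
A minimal sketch of the rotation idea, assuming plain round-robin over the configured addresses. The class name `RoundRobinDns`, the hostname, and the 192.0.2.x addresses are invented for this example; real DNS servers may weigh load or geography instead.

```python
from itertools import cycle

class RoundRobinDns:
    """Hands out the configured addresses for a hostname in rotating order."""

    def __init__(self, records):
        # One endless iterator per hostname, cycling through its addresses.
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        return next(self._rotations[name])

dns = RoundRobinDns({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
for _ in range(4):
    print(dns.resolve("www.example.com"))  # the fourth answer wraps around to 192.0.2.1
```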

2.4 DNS cache

To make access more efficient, computers cache domain names. After a site is visited and its IP address obtained, the domain name and IP address are cached; on the next visit the cached IP address is used directly instead of asking the DNS server again, which improves response time. The cache entry has a limited lifetime: once it expires, the next request to the site requires a fresh DNS lookup.

However, DNS caching can also cause problems. For example, if a site’s IP address has changed, access fails while the cached address is still in use. Likewise, the same domain name may map to different IP addresses on the intranet and on the Internet: if a computer first accesses the domain from the Internet and then from the intranet, the DNS cache still returns the Internet address, and access fails. Depending on the situation, you can manually clear the DNS cache or disable DNS caching. Entering chrome://dns/ in Chrome’s address bar shows Chrome’s DNS cache. On Linux, the local hosts file is /etc/hosts.
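
The expiry and manual-clear behaviour can be sketched as a small time-to-live cache. The class `DnsCache` and its method names are hypothetical; the clock is injectable only so that the example can be exercised without waiting.

```python
import time

class DnsCache:
    """Caches name -> IP mappings until their time-to-live (in seconds) runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, ip, ttl):
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name):
        """Return the cached IP, or None if missing or expired (a fresh lookup is needed)."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None
        return ip

    def clear(self):
        """Manual flush, e.g. after the site's IP address has changed."""
        self._entries.clear()

cache = DnsCache()
cache.put("www.example.com", "192.0.2.10", ttl=60)
print(cache.get("www.example.com"))
```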

3. TCP connection (three-way handshake)

Once DNS resolution has returned the server’s IP address, a connection is established. This is done with the TCP protocol, mainly through the three-way handshake.

  1. The client sends a connection request: a SYN packet (seq = J) to the server. The client then enters the SYN_SENT state and waits for the server’s acknowledgment.
  2. After receiving the SYN packet, the server must acknowledge the client’s SYN (ack = J + 1) and send its own SYN (seq = K), that is, a SYN+ACK packet. The server then enters the SYN_RECV state.
  3. After receiving the SYN+ACK packet, the client sends an ACK packet (ack = K + 1) to the server and allocates resources. Once this packet has been sent and received, both client and server are in the ESTABLISHED state, the three-way handshake is complete, and the TCP connection is established.
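
The sequence and acknowledgment numbers in the three steps above can be traced with a short sketch. It only models the numbers as dictionaries; the function name and the sample values 100 and 300 are arbitrary, and a real TCP stack picks its initial sequence numbers itself.

```python
def three_way_handshake(j, k):
    """Return the three segments exchanged, given initial sequence numbers J and K."""
    syn = {"from": "client", "flags": "SYN", "seq": j}
    syn_ack = {"from": "server", "flags": "SYN+ACK", "seq": k, "ack": syn["seq"] + 1}
    ack = {"from": "client", "flags": "ACK", "seq": j + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]

for segment in three_way_handshake(j=100, k=300):
    print(segment)
```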

4. Send an HTTP request

After the TCP connection is established, the browser sends an HTTP request. A typical HTTP request includes a request method, such as GET or POST, or the less commonly used PUT, DELETE, HEAD, OPTIONS, and TRACE. The request tells the server what we need in the form of a message. A complete HTTP request has three parts: the request line, the request headers, and the request body, with an empty line separating the headers from the body.

Supplement (1): GET vs. POST

Reference: W3Schools, GET vs. POST

If you look at the names of the other request methods, you can get a general idea of when to use each one; this is a good example of semantics. GET and POST are two HTTP request methods. HTTP is an application-layer protocol built on TCP/IP, and GET and POST use the same transport-layer protocol, so there is no difference in how they are transmitted. For messages without parameters, the biggest difference is the method name on the first line; they differ only in the first few characters of the message.

The first line of a POST request looks like this: POST /uri HTTP/1.1\r\n

The first line of a GET request looks like this: GET /uri HTTP/1.1\r\n

With parameters, for example name=qiming.c and age=22, a simplified GET message looks like this:

    GET /index.php?name=qiming.c&age=22 HTTP/1.1
    Host: localhost

A simplified POST message looks like this:

    POST /index.php HTTP/1.1
    Host: localhost
    Content-Type: application/x-www-form-urlencoded

    name=qiming.c&age=22

To summarize:

  • GET: requests data from a specified resource. It has no side effects, is idempotent, and is cacheable.
  • POST: submits data to be processed by a specified resource. It has side effects, is not idempotent, and is not cached by default.
  • Parameters: GET parameters go in the URL’s query string; POST parameters (data) go in the body of the request message.
  • Security: POST is safer than GET, but only relatively. GET parameters are visible in the URL, browser history, and server logs, and neither method encrypts data on its own.
  • Length: a GET URL is limited in length. The HTTP specification sets no limit, but browsers and servers do, commonly around 2,000 characters. POST can carry much more data; the protocol sets no limit on the body, although servers usually cap it by configuration (for example 4-10 MB).
  • Usage: GET is for reading data and POST is for writing data. (Idempotent means that sending the same request any number of times produces the same result.)
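
The parameter placement described above can be reproduced with a short sketch that builds both messages as raw text. The helper names `build_get` and `build_post` are made up for this illustration, and the Content-Length header is an addition that the simplified messages above leave out.

```python
from urllib.parse import urlencode

def build_get(path, host, params):
    """GET: the parameters travel in the URL's query string; there is no body."""
    return f"GET {path}?{urlencode(params)} HTTP/1.1\r\nHost: {host}\r\n\r\n"

def build_post(path, host, params):
    """POST: the same parameters travel in the message body instead."""
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )

params = {"name": "qiming.c", "age": 22}
print(build_get("/index.php", "localhost", params))
print(build_post("/index.php", "localhost", params))
```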

Supplement (2): the difference between HTTPS and HTTP

The main differences between HTTP and HTTPS are as follows:

  1. HTTPS requires a certificate issued by a CA. Free certificates are relatively few, so there is usually some cost.

  2. HTTP is the Hypertext Transfer Protocol and transmits information in plain text; HTTPS is a secure transfer protocol encrypted with SSL/TLS.

  3. HTTP and HTTPS use different connection methods and different default ports: 80 for HTTP and 443 for HTTPS.

  4. An HTTP connection is simple and stateless. HTTPS is built from SSL/TLS plus HTTP, providing encrypted transmission and identity authentication, and is more secure than HTTP.

5. The server returns an HTTP response

After receiving the HTTP request from the browser, the server wraps the received HTTP message in an HTTP Request object, which the web server then processes (different web servers handle this differently). The result is returned as an HTTP Response object containing the status code, response headers, and response body.

Supplement: common status codes

We all know the basics: 404 page not found, 500 server error, 301 permanent redirect, 302 temporary redirect, 200 OK, 401 unauthorized, and so on.

This section focuses on three status codes and the knowledge around them: 304 negotiated cache, 101 protocol upgrade, and 307 HSTS redirect.
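
To see the three parts of a response, here is a minimal sketch that splits a raw response into status line, headers, and body. The function `parse_response` and the sample message are illustrative only; real clients also handle chunked bodies, repeated headers, and binary data, none of which is covered here.

```python
def parse_response(raw):
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")       # an empty line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(code),
        "reason": reason,
        "headers": headers,
        "body": body,
    }

raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
print(parse_response(raw))
```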

  • Let’s start with the 304 negotiated cache, which is the basics. As soon as you mention it, the interviewer will want to ask: what is a negotiated cache? The browser cache is divided into the mandatory (strong) cache and the negotiated cache, and the mandatory cache is checked first. With the mandatory cache, the browser does not contact the server at all: the resource is read directly from the browser cache and the status shown is 200. With the negotiated cache, the browser must ask the server. If the cached copy is still valid (a hit), the server returns 304, the cache headers stored in the browser are updated, and the content is read from the browser cache. If it is not (a miss), the resource is fetched from the server with a 200. The mandatory cache is controlled by Expires or Cache-Control: Expires is an absolute point in time and is the older standard, while Cache-Control (max-age) is a relative duration, is newer, and has higher priority. The negotiated cache uses ETag and Last-Modified: Last-Modified is based on the resource’s last modification time, while ETag is a value computed from the resource’s content, so ETag has higher priority.

  • The 101 protocol upgrade is mainly used for WebSocket, and can also be used to upgrade to HTTP/2. WebSocket’s features and uses are familiar enough that they are not detailed here. HTTP/2 offers multiplexing of many requests over one connection, binary framing, header compression, server push, and more; the details are left for you to look up. How HTTPS, HTTP, HTTP/2, and its predecessor SPDY differ, what their strengths and weaknesses are, and how they relate to one another are also left for the reader to research.

  • 307 HSTS redirect. This one is more advanced. 307 was originally defined to redirect a POST request as a new POST request, but it is also used for HSTS redirects. HSTS stands for HTTP Strict Transport Security. It requires the browser to use HTTPS for subsequent visits to the site, rather than HTTP first and then HTTPS. This prevents SSL stripping attacks, in which an attacker intercepts the user’s initial HTTP request and impersonates the user to the server: the attacker talks to the server over HTTPS while the user talks to the attacker over HTTP. To enable HSTS, the server adds Strict-Transport-Security to its response headers, with a max-age value that sets how long the policy lasts. Speaking of SSL stripping, you may be curious what other attacks work against supposedly secure HTTPS. The one I know of is SSL hijacking, which relies on the user trusting a third-party certificate; this is how proxy software inspects HTTPS traffic. If you know of more, additions are welcome.

Just three status codes lead to this much knowledge. Do not simply memorize status codes and their meanings; dig deeper and use HTTP status codes to build up your own understanding of networking.
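
The lookup order for the 304 case (mandatory cache first, then negotiated cache, then a full fetch) can be sketched as a single decision function. This is a simplified model under stated assumptions: the function `fetch`, the `cached` dictionary with its `etag` and `expires_at` keys, and the returned strings are all invented here, and only ETag comparison is modelled, not Last-Modified.

```python
def fetch(cached, now, server_etag):
    """Decide how a resource is served.

    cached: None, or a dict with "etag" and "expires_at" (a timestamp).
    server_etag: the ETag the server currently computes for the resource.
    """
    # 1. Mandatory cache: still fresh, so the server is not contacted at all.
    if cached is not None and now < cached["expires_at"]:
        return "200 (from cache, no request sent)"
    # 2. Negotiated cache: the browser sends its ETag and the server compares it.
    if cached is not None and cached["etag"] == server_etag:
        return "304 Not Modified (body read from cache)"
    # 3. Miss: the server sends the full resource.
    return "200 OK (full response from server)"

entry = {"etag": "abc", "expires_at": 100}
print(fetch(entry, now=50, server_etag="abc"))
print(fetch(entry, now=150, server_etag="abc"))
print(fetch(entry, now=150, server_etag="xyz"))
```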

Common status codes: 200, 204, 301, 302, 303, 304, 400, 401, 403, 404, 500, 503

  • 200 OK: the client request succeeded
  • 204 No Content: the request succeeded, but the response has no entity body
  • 301 Moved Permanently: permanent redirect. The Location header of the response should contain the resource’s new URL
  • 302 Found: temporary redirect. The URL in the Location header of the response is where the resource is temporarily located
  • 303 See Other: the requested resource is available at another URI, and the client should use the GET method to fetch it
  • 304 Not Modified: the content on the server has not been updated and can be read directly from the browser cache
  • 400 Bad Request: the client request has a syntax error and cannot be understood by the server
  • 401 Unauthorized: the request is not authorized. This status code must be used together with the WWW-Authenticate header field
  • 403 Forbidden: the server received the request but refuses to serve it. The reason is usually given in the response body
  • 404 Not Found: the requested resource does not exist, for example because an incorrect URL was entered
  • 500 Internal Server Error: an unexpected error occurred on the server, so the client request could not be completed
  • 503 Service Unavailable: the server currently cannot process the client’s request and may recover after a period of time
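
The codes listed above fall into classes by their first digit. As a small illustration, Python’s standard `http.HTTPStatus` enum can supply the reason phrase; the `describe` helper and its class labels are made up for this sketch.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def describe(code):
    """Return the code, its standard reason phrase, and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({STATUS_CLASSES[code // 100]})"

for code in (200, 204, 301, 304, 404, 503):
    print(describe(code))
```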

6. Close the TCP connection (four-way wave)

  1. The client sends a request to close the connection, that is, a FIN packet, which tells the server: “I have no more data to send to you.” The client enters the FIN_WAIT_1 state. If the server still has data of its own to send, it does not need to close the socket yet and can continue sending.

  2. The server sends an ACK, saying: “I have received your request, but I am not ready yet. Please keep waiting for my message.” The client then enters the FIN_WAIT_2 state and waits for the server’s FIN packet.

  3. When the server has finished sending its data, it sends a FIN packet to the client to say that the data has been sent and that it is ready to close the connection.

  4. After receiving the FIN packet, the client knows the connection can be closed. However, it still does not trust the network and worries that the server may not learn that it should close, so it sends an ACK and enters the TIME_WAIT state; if the server does not receive the ACK, it can retransmit its FIN. When the server receives the ACK, it knows it can disconnect. If the client waits for 2MSL without receiving anything further, the server has closed normally, and the client can close the connection too. The TCP connection is now closed.
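
The client-side states in the four steps above can be traced with a small table-driven sketch. The event labels are informal names chosen for this illustration; only the state names follow TCP’s usual terminology, and the server side is not modelled.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

def run(events, state="ESTABLISHED"):
    """Apply the events in order and return every state passed through."""
    history = [state]
    for event in events:
        state = CLIENT_TRANSITIONS[(state, event)]
        history.append(state)
    return history

print(run(["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"]))
```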

7. The browser parses and renders the page

The browser receives the HTML, CSS, and JS files:

  • Parse HTML
  • Download CSS (or read it from cache)
  • Parse CSS
  • Download JS (or read it from cache)
  • Parse JS
  • Download images
  • Decode images
  • Render the DOM tree
  • Render the style tree
  • Execute JS

Specifically (taking WebKit as an example), the HTML document is parsed by the HTML parser to build the DOM tree, and the CSS in the HTML is parsed by the CSS parser to build the style rules; the two are attached together to form the render tree. Once the render tree is built, the layout stage (layout/reflow) begins, in which each node is assigned the exact coordinates where it should appear on the screen. Finally, all nodes are traversed and painted, and the page is displayed.

The process is complicated and involves two concepts, reflow and repaint, which are discussed below.

When a JS file is encountered while the document is loading, the browser suspends the rendering thread (loading, parsing, and rendering are synchronous). It waits not only for the JS file to load but also for it to be parsed and executed before resuming rendering of the HTML document. This is because JS may modify the DOM, document.write being the classic example, which means that subsequent resources may turn out to be unnecessary until JS execution completes. This is the root cause of JS blocking the download of subsequent resources.

JS is parsed by the browser’s JS engine. JS runs on a single thread: only one thing can be done at a time, and all tasks must queue so that one finishes before the next starts. Some tasks, such as I/O reads and writes, are time-consuming, so a mechanism is needed that lets later tasks run first. This is why tasks are divided into synchronous and asynchronous ones.

JS execution can be thought of as one main thread plus one task queue. Synchronous tasks run on the main thread, forming an execution stack; asynchronous tasks are handled through the task queue. When an asynchronous task has a result, it places an event in the task queue. When a script runs, the execution stack runs first; then events are taken from the task queue one by one and their tasks are run. This process repeats continuously, which is why it is called the event loop.
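
The "main thread plus task queue" model can be imitated in a few lines. This is a toy model written in Python rather than a description of any real JS engine: `run_script`, `set_timeout`, and the log strings are invented here, and timers, microtasks, and rendering are all ignored.

```python
from collections import deque

def run_script(script):
    """Run the synchronous code first, then drain the task queue in order."""
    log = []
    task_queue = deque()

    def set_timeout(callback):
        # An asynchronous task only queues its callback; it does not run it now.
        task_queue.append(callback)

    script(log, set_timeout)          # the execution stack runs to completion
    while task_queue:                 # the loop: take queued tasks one by one
        task_queue.popleft()()
    return log

def script(log, set_timeout):
    log.append("sync 1")
    set_timeout(lambda: log.append("async callback"))
    log.append("sync 2")

print(run_script(script))  # ['sync 1', 'sync 2', 'async callback']
```

Even though the callback is registered between the two synchronous statements, it runs only after both have finished, which is the behaviour the paragraph above describes.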

Supplement: reflow and repaint

Every element in the DOM exists in the form of a box model, and the browser has to calculate its position and size, a process called reflow. Once the position, size, and other properties of the box model, such as color and font, are determined, the browser starts drawing the content, a process called repaint. A page inevitably goes through reflow and repaint when it first loads. Both are expensive, especially on mobile devices, and can hurt the user experience, sometimes making the page stutter. So we should keep reflow and repaint to a minimum.

Reflow, also known as layout, usually means that the content, structure, position, or size of an element has changed, so styles and the render tree must be recalculated. Repaint happens when a change affects only an element’s appearance (for example its background color, border color, or text color), so the new style simply needs to be applied to the element. The cost of reflow is therefore much higher than the cost of repaint. Every node in the DOM tree has a reflow method, and reflowing one node is likely to trigger reflow of its children, or even of its parent and sibling nodes.

The following actions are likely to be costly:

  1. Adding, deleting, or modifying DOM nodes, which causes reflow or repaint.
  2. Moving DOM elements or running an animation.
  3. Changing content.
  4. Modifying CSS styles.
  5. Resizing the window (not an issue on mobile) or scrolling.
  6. Changing the page’s default font.

Reflow is triggered for the following reasons:

  1. Initial: the page is laid out for the first time.
  2. Incremental: JS operates on the DOM tree.
  3. Resize: some components change size.
  4. StyleChange: CSS properties change.

Summary: this article walks through the process from URL input to page display in detail. The process involves many knowledge points, and I have summarized only some of them; if anything is missing, please point it out.

Reference: What happens from URL input to page display