From entering the URL until the page loads

The process from entering a URL to the page finishing loading consists of the following steps:

  • DNS resolution
  • Establishing a TCP connection
  • Sending an HTTP request
  • The server processes the request and returns an HTTP response
  • The browser parses and renders the page

DNS resolution

DNS resolution is the translation of domain names into IP addresses. The address we type into a browser, such as http://www.baidu.com, is not the actual address of the website. On the Internet, each machine is uniquely identified by its IP address, but IP addresses are hard to remember, so we use a domain name in their place. Turning that domain name back into the IP address is the job of DNS resolution.

When we enter a URL for the first time, the browser asks the local DNS server for a matching IP address. If the local DNS server has no match, it sends a request to a root DNS server. The root server does not hold the record itself; it refers the local DNS server to the top-level domain server (for a .com name, the COM server), which in turn refers it to the next level down, and so on until a server returns the matching address. The browser then caches the result for subsequent requests.

If the domain name has been requested before, the system looks for it in the local cache first and only then in the higher-level domain name servers, and so on.
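The lookup order above can be sketched as a toy model. This is only an illustration: the server tables, the names tld-com and ns-example, the host www.example.com, and the address 192.0.2.10 are all made-up placeholders, and a real resolver talks to these servers over the network instead of reading dictionaries.

```python
# Toy model of the lookup order: local cache first, then
# root server -> top-level domain server -> authoritative server.
# Every table entry below is a hypothetical placeholder.

ROOT_SERVER = {"com": "tld-com"}                           # root knows who handles each TLD
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}   # TLD server knows the authoritative server
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}

cache = {}  # answers remembered from earlier lookups


def resolve(hostname):
    """Return (ip, source); source tells whether the cache answered."""
    if hostname in cache:
        return cache[hostname], "cache"

    labels = hostname.split(".")
    tld = labels[-1]                  # e.g. "com"
    zone = ".".join(labels[-2:])      # e.g. "example.com"

    tld_server = ROOT_SERVER[tld]                  # ask the root server
    auth_server = TLD_SERVERS[tld_server][zone]    # ask the TLD server
    ip = AUTHORITATIVE[auth_server][hostname]      # ask the authoritative server

    cache[hostname] = ip              # remember the answer for next time
    return ip, "lookup"
```

Calling resolve("www.example.com") twice walks the hierarchy the first time and answers from the cache the second time. In a real program we would normally leave all of this to the operating system, for example through Python's socket.getaddrinfo.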

Establishing a TCP connection

The TCP three-way handshake works as follows:

The TCP process on the server first creates a transmission control block (TCB) and prepares to accept connection requests from client processes. The server process is then in the LISTEN state, waiting for a connection request from a client. When one arrives, the server process responds.

  1. The TCP process on the client also creates a transmission control block (TCB) first, and then sends a connection request segment to the server. The segment header has SYN=1 and ACK=0, and carries an initial sequence number seq=I chosen by the client. According to TCP, a SYN=1 segment cannot carry data, but it does consume one sequence number. At this point the TCP client process enters the SYN-SENT state. This is the first handshake of the TCP connection.

  2. After receiving the request segment from the client, the server, if it agrees to establish the connection, sends an acknowledgement segment back. This segment has SYN=1 and ACK=1, the acknowledgement number ack=I+1, and an initial sequence number seq=J chosen by the server. It is also a SYN=1 segment, so it cannot carry data but consumes one sequence number. At this point the TCP server process enters the SYN-RCVD state. This is the second handshake of the TCP connection.

  3. After receiving the acknowledgement from the server, the TCP client process sends an acknowledgement of its own. This segment has ACK=1, the acknowledgement number ack=J+1, and the sequence number seq=I+1. According to the TCP standard, an ACK segment may carry data; if it carries none, it does not consume a sequence number, so the next segment still uses seq=I+1. The TCP connection is now established and the client enters the ESTABLISHED state. This is the third handshake of the TCP connection. Note that from this point the client can already send segments that carry data.

After receiving this acknowledgement, the server also enters the ESTABLISHED state.
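To make the sequence and acknowledgement numbers easier to follow, here is a small simulation of the three segments. It is a sketch, not a network program: the function name three_way_handshake and the dictionary layout are our own assumptions, and no real packets are sent.

```python
def three_way_handshake(i, j):
    """Simulate the handshake for client ISN i and server ISN j.

    Returns a list of (segment, client_state, server_state) tuples,
    where the states are those reached once the segment is delivered.
    """
    steps = []

    # Handshake 1: client -> server. SYN only, carries no data.
    syn = {"SYN": 1, "ACK": 0, "seq": i}
    steps.append((syn, "SYN-SENT", "LISTEN"))

    # Handshake 2: server -> client. The SYN consumed one sequence
    # number, so the server acknowledges i + 1 and picks its own seq j.
    syn_ack = {"SYN": 1, "ACK": 1, "seq": j, "ack": syn["seq"] + 1}
    steps.append((syn_ack, "SYN-SENT", "SYN-RCVD"))

    # Handshake 3: client -> server. Pure ACK with seq = i + 1 and
    # ack = j + 1; both sides end up in ESTABLISHED.
    ack = {"SYN": 0, "ACK": 1, "seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}
    steps.append((ack, "ESTABLISHED", "ESTABLISHED"))

    return steps
```

For example, three_way_handshake(100, 300) produces a SYN with seq=100, a SYN+ACK with seq=300 and ack=101, and a final ACK with seq=101 and ack=301.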

Sending an HTTP request

Sending an HTTP request means constructing an HTTP request message and sending it over TCP to the designated port on the server (80 by default for HTTP, with 8080 a common alternative; 443 for HTTPS). The HTTP request message consists of three parts: the request line, the request header, and the request body.

The request line

Format: Method Request-URI HTTP-Version CRLF (the three fields are separated by single spaces)

e.g. GET /index.html HTTP/1.1

Common request methods:

  • GET: requests the resource identified by the Request-URI
  • POST: submits data to be processed by the resource identified by the Request-URI, for example appending new data to it
  • HEAD: requests only the response headers for the resource identified by the Request-URI
  • PUT: requests that the server store a resource and identify it by the Request-URI
  • DELETE: requests that the server delete the resource identified by the Request-URI; the opposite of PUT
  • TRACE: asks the server to echo back the request it received, so the client can trace the path the request message took
  • CONNECT: asks a proxy to open a tunnel to the target server, so that TCP traffic can be relayed through the proxy; mainly used for SSL/TLS connections

The request header

The request header lets the client pass additional information about the request, and about the client itself, to the server.

Note: the client is not necessarily a browser. It can also be the Linux curl command or an HTTP client testing tool.

Common request headers include Accept, Accept-Charset, Accept-Encoding, Accept-Language, Content-Type, Authorization, Cookie, User-Agent, and so on.

The request body

With POST, PUT, and similar methods, the client usually needs to pass data to the server. This data is carried in the request body, and the request header contains information that describes it. For example, web applications today commonly use a REST architecture with JSON as the request data format, in which case you need to set Content-Type: application/json.
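The three parts of a request message can be assembled by hand, which shows how the request line, header, and body fit together. The sketch below is only illustrative: build_request, the User-Agent value example-client/0.1, the host www.example.com, and the path /api/items are hypothetical names chosen for the example.

```python
import json


def build_request(method, path, host, payload=None):
    """Assemble a raw HTTP/1.1 request: request line, headers, body."""
    headers = {
        "Host": host,
        "User-Agent": "example-client/0.1",  # hypothetical client name
        "Accept": "*/*",
    }
    body = b""
    if payload is not None:
        # The body is JSON, so the headers must describe it.
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
        headers["Content-Length"] = str(len(body))

    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]

    # Each line ends with CRLF; an empty line separates headers from body.
    head = "\r\n".join([request_line, *header_lines]) + "\r\n\r\n"
    return head.encode("ascii") + body
```

build_request("GET", "/index.html", "www.example.com") yields a message with no body, while passing a dictionary as payload adds the JSON body together with the Content-Type and Content-Length headers.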

The server processes the request and returns an HTTP response

The back end listens on a fixed port and receives the TCP segments there. It handles the TCP connection, parses the HTTP message, and wraps it in an HTTP request object according to the message format.

An HTTP response message consists of a status line carrying the status code, the response header, and the response body.
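As a rough idea of what the parsing step on the server might look like, the sketch below splits a raw request into a request object. The HttpRequest class and parse_request function are assumptions made for this example; real servers and frameworks do much more validation.

```python
from dataclasses import dataclass, field


@dataclass
class HttpRequest:
    method: str
    target: str
    version: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def parse_request(raw: bytes) -> HttpRequest:
    """Split raw request bytes into request line, headers, and body."""
    # An empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: Method Request-URI HTTP-Version
    method, target, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive

    return HttpRequest(method, target, version, headers, body)
```

Header names are stored in lower case here so that Content-Type and content-type are found under the same key.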

Status code: consists of three digits, the first of which gives the class of the response

  • 1xx: informational – The request has been received and processing continues.
  • 2xx: success – The request was successfully received, understood, and accepted.
  • 3xx: redirection – Further action must be taken to complete the request.
  • 4xx: client error – The request has a syntax error or cannot be fulfilled.
  • 5xx: server error – The server failed to fulfill a valid request.

Common status codes are 200, 204, 301, 302, 304, 400, 401, 403, 404, 422, and 500 (look up what each of them means).
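Since the first digit determines the class, a status code can be classified with integer division. The sketch below uses HTTPStatus from Python's standard library for the reason phrases; the describe function and the STATUS_CLASSES table are names made up for this example.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a string such as '404 Not Found (client error)'."""
    category = STATUS_CLASSES[code // 100]   # first digit selects the class
    phrase = HTTPStatus(code).phrase         # standard reason phrase
    return f"{code} {phrase} ({category})"
```

For instance, describe(200) gives "200 OK (success)" and describe(404) gives "404 Not Found (client error)".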

Response header:

Common response header fields include Server, Connection, ...

Response body:

The content returned by the server to the browser. HTML, CSS, JS, images, and other files are usually carried in this part.

The browser parses and renders the page

Browser engine

  • Rendering engine: takes the page content, combines the page information, computes how the page should be displayed, lays it out, and outputs it to the monitor or printer
  • JS engine: parses and executes JS code to give web pages their dynamic behavior

Browser kernel types

  • Trident: IE
  • Gecko: Netscape, Firefox
  • Presto: Opera (older versions)
  • WebKit: Safari, older versions of Chrome (Chrome now uses Blink, a fork of WebKit)

Rendering principle:

  1. Parse the HTML to build the DOM tree: the rendering engine begins by parsing the HTML document into a tree of DOM nodes

  2. Build the render tree: parse the CSS, compute each node's style according to the CSS selectors, and create the render tree

  3. Lay out the render tree: starting from the root node, recursively calculate the size and position of each element, giving the exact coordinates where each node should appear on the screen

  4. Paint the render tree: walk through the render tree and draw each node using the UI back end
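The four steps can be mimicked with a very small toy pipeline. This is a heavily simplified sketch and not how any real engine is implemented: the Node class, the tag-keyed css table, the pixel heights, and the purely vertical layout are all assumptions made for the example, and step 1 is skipped by writing the DOM tree out by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)


# Step 1 (assumed done): the DOM tree an HTML parser would produce.
dom = Node("body", [Node("h1"), Node("script"), Node("p")])

# Hypothetical stylesheet keyed by tag name; heights are in pixels.
css = {
    "body": {"height": 0},
    "h1": {"height": 40},
    "p": {"height": 20},
    "script": {"display": "none"},
}


def build_render_tree(node):
    """Step 2: attach styles and drop nodes that are not displayed."""
    style = css.get(node.tag, {})
    if style.get("display") == "none":
        return None
    children = [r for r in map(build_render_tree, node.children) if r is not None]
    return {"tag": node.tag, "style": style, "children": children}


def layout(render_node, y=0):
    """Step 3: recursively assign a y position; return the next free y."""
    render_node["y"] = y
    y += render_node["style"].get("height", 0)
    for child in render_node["children"]:
        y = layout(child, y)
    return y


def paint(render_node, out=None):
    """Step 4: walk the tree and emit one draw command per node."""
    out = [] if out is None else out
    out.append(f"draw <{render_node['tag']}> at y={render_node['y']}")
    for child in render_node["children"]:
        paint(child, out)
    return out


tree = build_render_tree(dom)
layout(tree)
commands = paint(tree)
```

Note that the script node never reaches the render tree, so it takes up no space and is never painted.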

Reflow and repaint

Reflow: part of the page has changed in a way that affects the layout, so the layout must be recalculated and the page re-rendered. Any change to the area an element occupies, its positioning mode, or its margins causes reflow. One example is collapsing and expanding a tree directory, which is essentially showing or hiding elements.

Repaint: changing an element's background color, text color, or border color does not affect the layout of the surrounding elements, so it only causes a repaint.
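The distinction can be summed up as a simple lookup. The sketch below only covers the properties mentioned above plus width and height as stand-ins for the occupied area; the function name effect_of_change and both sets are assumptions for illustration, and real browsers decide this with far more detail.

```python
# Properties whose change alters the layout (assumed, non-exhaustive).
LAYOUT_PROPERTIES = {"display", "position", "margin", "width", "height"}

# Properties whose change only alters appearance (assumed, non-exhaustive).
PAINT_ONLY_PROPERTIES = {"background-color", "color", "border-color"}


def effect_of_change(css_property):
    """Say whether changing a CSS property causes reflow or only repaint."""
    if css_property in LAYOUT_PROPERTIES:
        return "reflow"    # layout is recalculated, then the page is repainted
    if css_property in PAINT_ONLY_PROPERTIES:
        return "repaint"   # layout stays as it is, only pixels are redrawn
    return "unknown"       # not covered by this sketch
```

So effect_of_change("margin") reports a reflow, while effect_of_change("color") reports only a repaint.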