What happens from the time you enter the URL to the time the page loads?

The process can be divided into six steps:

  • DNS resolution
  • Establishing a TCP connection
  • Sending the HTTP request
  • Receiving the HTTP response
  • Closing the TCP connection
  • The browser parses and renders the page

DNS resolution

DNS is a distributed database that translates domain names such as www.google.com into IP addresses, so that requests can be routed to the right remote server. In other words, DNS records domain names (not full URLs) and their corresponding IP addresses. A domain name like www.google.com is resolved to an IP address such as 197.251.230.45.

DNS query process:

The operating system first looks up the IP address in its local cache. If there is no entry, it asks the DNS server configured on the system. If that server has no cached answer either, it queries a DNS root server, which points to the server responsible for the com domain. That server in turn points to the server responsible for the google.com domain. Third-level domains are configured by the domain owner: you can assign one IP address to the www subdomain and other IP addresses to other third-level domains.
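The lookup chain can be sketched as a toy resolver. This is only an illustration: the zone tables and the server names (`tld-com`, `ns-google`) are invented, and no real DNS traffic is involved.

```python
# Toy model of the lookup chain: local cache -> root -> TLD -> authoritative.
# All zone data and server names below are made up for illustration.
ROOT_ZONE = {"com": "tld-com"}                        # root knows who serves .com
TLD_ZONES = {"tld-com": {"google.com": "ns-google"}}  # .com knows google.com's name server
AUTH_ZONES = {"ns-google": {"www.google.com": "197.251.230.45"}}


def resolve(name, cache):
    """Return (ip, steps) for `name`, recording each place that was consulted."""
    steps = []
    if name in cache:
        steps.append("cache")
        return cache[name], steps

    labels = name.split(".")
    tld = labels[-1]                 # "com"
    domain = ".".join(labels[-2:])   # "google.com"

    steps.append("root")
    tld_server = ROOT_ZONE[tld]

    steps.append(tld_server)
    auth_server = TLD_ZONES[tld_server][domain]

    steps.append(auth_server)
    ip = AUTH_ZONES[auth_server][name]

    cache[name] = ip  # later lookups are answered from the cache
    return ip, steps


cache = {}
first = resolve("www.google.com", cache)
second = resolve("www.google.com", cache)
print(first)
print(second)
```

The second call never leaves the cache, which is why repeated visits to the same site skip most of the lookup work.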

A TCP connection

TCP three-way handshake

The client sends a segment with SYN=1, seq=X to the server port (the first handshake, initiated by the browser, tells the server “I want to open a connection”).

The server replies with SYN=1, ACK=1, ack=X+1, seq=Y as confirmation (the second handshake, sent by the server, tells the browser “I am ready to accept, go ahead”).

The client replies with ACK=1, ack=Y+1, seq=X+1, which completes the handshake (the third handshake, sent by the browser, tells the server “I am about to send, get ready to receive”).
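The sequence and acknowledgment numbers are easier to follow with concrete values. The sketch below only models the three segments as dictionaries; the function name and the initial sequence numbers 100 and 300 are arbitrary choices for illustration. Note that the final segment carries seq = X+1, because the SYN itself consumes one sequence number.

```python
def three_way_handshake(x, y):
    """Return the three segments, given initial sequence numbers x (client) and y (server)."""
    syn = {"from": "client", "SYN": 1, "seq": x}
    # The server acknowledges the client's SYN with ack = x + 1.
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1, "seq": y, "ack": syn["seq"] + 1}
    # The client acknowledges the server's SYN with ack = y + 1.
    ack = {"from": "client", "ACK": 1, "seq": x + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


segments = three_way_handshake(x=100, y=300)
for segment in segments:
    print(segment)
```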

Sending HTTP requests

After the TCP connection is established, the browser can send requests to the server over HTTP or HTTPS. The server receives the request and parses the request headers. If the headers contain cache validators, such as If-None-Match or If-Modified-Since, the server checks whether the cached copy is still valid. If it is, the server returns status code 304 (Not Modified) without a response body.
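A minimal sketch of the cache check on the server side, assuming the validator is an ETag derived from a hash of the body. The `handle_get` function and the way the ETag is computed are illustrative assumptions, not how any particular server does it.

```python
import hashlib


def handle_get(body, request_headers):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    # Hypothetical ETag scheme: a short hash of the current content.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still valid: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


page = b"<h1>hello</h1>"
status1, headers1, body1 = handle_get(page, {})
# The browser repeats the request and presents the ETag it cached.
status2, headers2, body2 = handle_get(page, {"If-None-Match": headers1["ETag"]})
print(status1, status2)
```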

Receiving the HTTP response

The browser receives the response and parses it. It first checks the status code: a 4xx or 5xx code is reported as an error. A 3xx code triggers a redirect; the browser keeps a redirect counter to avoid redirect loops and reports an error once the limit is exceeded. A 200 code means the browser can start parsing the file. If the body is gzip-compressed, the browser decompresses it first and then decodes it according to the character encoding of the file.
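The decision flow can be sketched roughly as follows. This is a simplification under stated assumptions: `MAX_REDIRECTS = 5` is an arbitrary limit, and the `charset` key is a hypothetical shortcut for the encoding that a real browser would extract from the Content-Type header or the document itself.

```python
import gzip

MAX_REDIRECTS = 5  # assumed limit; real browsers choose their own


def handle_response(status, headers, body, redirect_count=0):
    """Classify a response and, for a 200, return the decoded text."""
    if 400 <= status < 600:
        return ("error", status)
    if status in (301, 302, 303, 307, 308):
        if redirect_count >= MAX_REDIRECTS:
            return ("error", "too many redirects")
        return ("redirect", headers["Location"])
    if status == 200:
        if headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)        # decompress first
        charset = headers.get("charset", "utf-8")  # hypothetical simplified key
        return ("ok", body.decode(charset))     # then decode the bytes to text
    return ("other", status)


compressed = gzip.compress("héllo".encode("utf-8"))
print(handle_response(200, {"Content-Encoding": "gzip"}, compressed))
print(handle_response(302, {"Location": "/new"}, b""))
print(handle_response(404, {}, b""))
```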

TCP Disconnection

Four-way wave

Once data transfer is complete, the TCP connection that the client and server established through the three-way handshake is closed with an exchange of four segments.

  • First wave: host 1 (which can be the client or the server) sends a FIN segment (FIN, seq = A) to host 2 and enters the FIN_WAIT_1 state. This means host 1 has no more data to send to host 2.
  • Second wave: host 2 receives the FIN segment and replies with an ACK segment (ack = A+1), then enters the CLOSE_WAIT state. When host 1 receives this ACK it enters the FIN_WAIT_2 state; host 2 has agreed to the close request.
  • Third wave: host 2 sends a FIN segment (FIN, seq = B) to host 1 to close its side of the connection and enters the LAST_ACK state.
  • Fourth wave: host 1 receives the FIN segment, replies with an ACK segment (ack = B+1), and enters the TIME_WAIT state. Host 2 closes the connection when it receives this ACK. If host 1 receives no retransmitted FIN after waiting 2MSL, host 2 has closed normally and host 1 can close as well.
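The four segments and the state each sender moves into can be traced with a small sketch. It only lists the exchange; the function name and the sequence numbers 500 and 900 are arbitrary, and timers such as the 2MSL wait are not modeled.

```python
def four_way_close(a, b):
    """Trace (sender, segment, sender's next state) when host 1 closes first."""
    return [
        # 1. host 1 has nothing more to send.
        ("host1", {"FIN": 1, "seq": a}, "FIN_WAIT_1"),
        # 2. host 2 acknowledges; host 1 moves to FIN_WAIT_2 on receipt.
        ("host2", {"ACK": 1, "ack": a + 1}, "CLOSE_WAIT"),
        # 3. host 2 closes its own direction.
        ("host2", {"FIN": 1, "seq": b}, "LAST_ACK"),
        # 4. host 1 sends only an ACK (no FIN), then waits 2MSL.
        ("host1", {"ACK": 1, "ack": b + 1}, "TIME_WAIT"),
    ]


trace = four_way_close(a=500, b=900)
for sender, segment, state in trace:
    print(sender, segment, "->", state)
```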

The browser parses and renders the page

After the file is decoded successfully, rendering starts. The DOM tree is built from the HTML, and the CSSOM tree is built from any CSS. When a script tag is encountered, the browser checks for the async or defer attribute. With async, the script is downloaded in parallel with HTML parsing and executed as soon as the download finishes. With defer, the script is also downloaded in parallel, but it is executed in document order only after HTML parsing completes.

If neither attribute is present, HTML parsing is blocked until the script has been downloaded and executed. Other resources referenced by the page are downloaded as they are encountered; using HTTP/2 here greatly improves the efficiency of downloading many images.
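The three script-loading cases can be summarized in a small lookup function. This is a descriptive sketch for classic external scripts only; the function name and the wording of the returned fields are made up for illustration.

```python
def script_loading(attrs):
    """Describe how a classic <script src=...> with the given attributes is handled."""
    if "async" in attrs:
        # Downloaded alongside parsing, run whenever it arrives.
        return {"download_blocks_parsing": False, "executes": "as soon as downloaded"}
    if "defer" in attrs:
        # Downloaded alongside parsing, run in order once parsing is done.
        return {"download_blocks_parsing": False, "executes": "after parsing, in document order"}
    # No attribute: the parser stops until the script has been fetched and run.
    return {"download_blocks_parsing": True, "executes": "immediately, before parsing continues"}


for attrs in (["async"], ["defer"], []):
    print(attrs, script_loading(attrs))
```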

After the CSSOM and DOM trees are built, they are combined into the render tree, which is used to work out the layout, style, and other aspects of the page elements.

Once the render tree is generated, the browser invokes the GPU to paint, composites the layers, and displays the content on the screen.

HTTP

HTTP is short for HyperText Transfer Protocol. It is used to transfer hypertext from a World Wide Web server to the local browser.

HTTP is a TCP/IP-based communication protocol for transferring data (HTML files, image files, query results, etc.).

HTTP features

Connectionless

Connectionless means that each connection handles only one request. The server closes the connection after it has processed the request and the client has acknowledged the response. This saves transmission time because idle connections are not kept open. (This is the original HTTP/1.0 behavior; HTTP/1.1 keeps connections open by default so they can be reused.)

Simple and fast

Each resource, such as an image or a page, has a fixed address (a URI, Uniform Resource Identifier), and requests are very simple to process: when a client requests a service from the server, it only needs to pass the request method and the path.
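To show how little a request needs, here is a sketch that builds a minimal HTTP/1.1 request as text. The helper name `build_request` and the host www.example.com are placeholders; the Host header is included because HTTP/1.1 requires it.

```python
def build_request(method, path, host):
    """Build a minimal HTTP/1.1 request: request line, Host header, blank line."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines)  # HTTP/1.1 lines end with CRLF


request = build_request("GET", "/index.html", "www.example.com")
print(request)
```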

Stateless

HTTP is a stateless protocol. Stateless means that the protocol has no memory for transaction processing. The lack of state means that if the previous information is needed for subsequent processing, it must be retransmitted, which can result in an increase in the amount of data transferred per connection. On the other hand, the server responds faster when it doesn’t need the previous information.

HTTP/2

HTTP/2 significantly improves web page performance compared to HTTP/1.

In HTTP/1, we introduced techniques such as sprite images, inlining small images, and using multiple domain names for performance reasons. All of this is because browsers limit the number of concurrent connections per domain (Chrome typically allows six). When a page requests many resources, head-of-line blocking means that once the limit is reached, the remaining requests have to wait for earlier ones to complete before they can be sent.

HTTP/2 introduces multiplexing, which allows all requests and responses to be carried over a single TCP connection. Multiplexing neatly solves the problem of browsers limiting the number of connections to the same domain. It also makes it easier to reach full transmission speed, because every new TCP connection has to ramp up its speed gradually (TCP slow start).

Binary transmission

The heart of all performance enhancements in HTTP/2 lies here. In previous versions of HTTP, we transferred data as text. A new encoding mechanism was introduced in HTTP/2, where all transmitted data is split and encoded in binary format.

Multiplexing

In HTTP/2, there are two very important concepts: frame and stream. A frame represents the smallest unit of data, and each frame identifies which stream it belongs to. A stream is a data stream composed of multiple frames.

Multiplexing means that multiple streams can coexist on one TCP connection. In other words, several requests can be in flight at once, and the peer can tell which request each frame belongs to from the stream identifier in the frame. This technique avoids the head-of-line blocking problem of older versions of HTTP and greatly improves transmission performance.
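The idea of frames tagged with a stream identifier can be sketched with plain tuples. This is not the real HTTP/2 frame format: the `(stream_id, chunk)` tuples, the 4-byte frame size, and the round-robin scheduling are simplifications chosen for illustration.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id, payload, size):
    """Split a payload into (stream_id, chunk) frames of at most `size` bytes."""
    return [(stream_id, payload[i:i + size]) for i in range(0, len(payload), size)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Rebuild each stream's payload using the identifier carried by every frame."""
    streams = defaultdict(bytes)
    for stream_id, chunk in wire:
        streams[stream_id] += chunk
    return dict(streams)


wire = interleave(
    to_frames(1, b"<html>hello</html>", 4),
    to_frames(3, b"body{}", 4),
)
result = reassemble(wire)
print(wire)
print(result)
```

Because frames from stream 3 travel between frames of stream 1, the short stylesheet finishes without waiting for the whole HTML response.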

Header compression

In HTTP/1, headers are transmitted as text, and when the headers carry a cookie, hundreds to thousands of bytes may have to be resent with every request.

In HTTP/2, the transmitted headers are encoded using the HPACK compression format, which reduces their size. Both ends maintain an index table of the headers they have already seen. For a header that is already in the table, only its index needs to be sent in later requests, and the peer looks up the corresponding name and value by that index.
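The index-table idea can be illustrated with a toy encoder. This is not real HPACK (there is no static table, Huffman coding, or eviction); the `HeaderTable` class is a hypothetical stand-in that only shows how a repeated header shrinks to a small integer.

```python
class HeaderTable:
    """Toy index table kept in sync on both ends of a connection."""

    def __init__(self):
        self.entries = []  # (name, value) pairs; the list position is the index

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in self.entries:
                out.append(self.entries.index(pair))  # already known: send the index
            else:
                self.entries.append(pair)
                out.append(pair)                      # first time: send it literally
        return out

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])    # look the header up by index
            else:
                self.entries.append(item)             # remember the new header
                headers.append(item)
        return headers


sender, receiver = HeaderTable(), HeaderTable()
headers = [("cookie", "session=abc123"), ("user-agent", "demo")]

first = sender.encode(headers)    # literals on the first request
second = sender.encode(headers)   # indexes on the second request
print(first, second)
print(receiver.decode(first) == headers, receiver.decode(second) == headers)
```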

Server push

In HTTP/2, a server can send multiple responses to a single request from a client. For example, when the client requests index.html, the server can respond with index.html together with logo.jpg and the CSS and JS files, because it knows the client will need them. The effect is similar to bundling all the resources into one HTML file.

HTTPS

HTTP is transmitted in plaintext, which is a major security risk. If packets are intercepted on the Internet, their contents are fully exposed to others, with no privacy at all. HTTPS was introduced to address this risk.

HTTPS is HTTP over a secure channel. It builds on HTTP and secures the transmission through encryption and identity authentication. HTTPS adds an SSL layer beneath HTTP, and its security rests on SSL.

The relationship and differences between HTTPS, SSL, and TLS

SSL is a protocol layer that sits on top of TCP and underneath HTTP. It encrypts the data that HTTP sends through TCP. HTTPS is therefore short for HTTP + SSL over TCP. TLS is the successor to SSL (Secure Sockets Layer); TLS 1.0 is in effect SSL version 3.1.

HTTPS Communication Process

  • The browser sends a request
  • The server returns its public key together with a signed certificate
  • The browser checks whether the certificate was issued by a trusted CA (certificate authority) and is still valid
  • The verification result is returned
  • The browser uses the public key to encrypt a symmetric key
  • The server decrypts the encrypted symmetric key with its own private key and starts communicating with it
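The last two steps, encrypting a symmetric key with the public key and recovering it with the private key, can be sketched with textbook RSA and tiny numbers. This is deliberately insecure and only illustrative: the primes 61 and 53, the one-byte session key, and the XOR "cipher" are assumptions chosen so the arithmetic is easy to follow, and certificate verification is left out entirely.

```python
# Textbook RSA with tiny primes: insecure, for illustration only.
p, q = 61, 53
n = p * q    # public modulus (3233)
e = 17       # public exponent, sent to the browser with the certificate
d = 2753     # private exponent, kept by the server; (e * d) % ((p-1)*(q-1)) == 1


def xor_cipher(data, key):
    """Toy symmetric cipher: XOR every byte with a one-byte key."""
    return bytes(byte ^ key for byte in data)


# Browser: pick a symmetric key and encrypt it with the server's public key.
session_key = 42
encrypted_key = pow(session_key, e, n)

# Server: recover the symmetric key with the private key.
recovered_key = pow(encrypted_key, d, n)

# Both sides now share the same key and use it for the actual traffic.
ciphertext = xor_cipher(b"GET /account", session_key)
plaintext = xor_cipher(ciphertext, recovered_key)
print(recovered_key == session_key, plaintext)
```

The asymmetric step is used only to agree on the key; the bulk of the data is protected with the much cheaper symmetric cipher.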

