HTTP series: Front-end performance optimization for the HTTP process and network layer

There are several ways to communicate information between the client and the server:

XMLHttpRequest / $.ajax / axios / fetch for data requests
Cross-domain solutions: Ajax, fetch, JSONP, postMessage…
Resource file requests: HTML / CSS / JS / images / audio / video…
WebSocket
…

An HTTP transaction consists of:

Request: the client sends information to the server.
Response: the server receives the request and returns the relevant content to the client.
Request + Response = one HTTP transaction

Everything transmitted between the client and the server is called an HTTP message. An HTTP message contains the following parts:

Start line: basic information (including the HTTP version).
- Request start line: 'GET{request method} /res-min/themes/marxico.css{request address} HTTP/1.1{HTTP version}'
- Response start line: 'HTTP/1.1{HTTP version} 200{HTTP status code} OK{status description}'
Headers: request headers (client -> server), response headers (server -> client)
Body: request body (client -> server), response body (server -> client)

Data transfer between the client and the server relies on the network (communication mode: TCP/UDP…; transfer protocol: HTTP/HTTPS/FTP…). So what are the details of this process?

What happens between entering the URL and seeing the page? How can this process be optimized?

This is a classic question. The main steps are:

URL parsing (identifying the URL)
Cache check (strong cache and negotiated cache for resource file requests; local storage for data requests)
DNS resolution (resolving the domain name to the public IP address of the server)
TCP three-way handshake (establishing a connection between the client and the server)
Data transfer between the client and the server over HTTP/HTTPS or another transfer protocol
TCP four-way wave (releasing the established connection)
Client rendering (rendering the page and its effects)

Here’s a closer look at each step.

URL parsing (identifying the URL)

URI: Uniform Resource Identifier, which includes

URL: Uniform Resource Locator
URN: Uniform Resource Name

Take the URL ‘http://user:pass@www.baidu.cn:80/st/index.html?xxx=xxx&xxx=xxx#video’ as an example. Its structure is analyzed below:

Transfer protocol

Transfer protocol: HTTP/HTTPS/FTP…

HTTP: Hypertext Transfer Protocol. "Hypertext" means that besides plain text (such as strings) it can also carry other content (file streams, binary or Buffer data, BASE64 data, and so on).
HTTPS = HTTP + SSL/TLS: a more secure HTTP in which the transmitted content is encrypted. It is typically used by products that involve payments.
FTP: File Transfer Protocol. It is typically used to upload built files to a server with an FTP tool such as FileZilla.
…

Login Authentication Information

User name and password: user:pass. This part is rarely used.

Domain name

Domain name: www.baidu.cn

Top-level domain: baidu.cn
First-level domain: www.baidu.cn
Second-level domain: video.baidu.cn
Third-level domain: student.video.baidu.cn
…

Once you have bought a top-level domain name, you can create second-level and third-level domain names under it yourself.

A domain name is an easy-to-remember alias for the public IP address of the server.

After buying the domain name and the server, you must create a resolution record on a DNS server so that later DNS lookups can resolve the domain name.

Suffixes such as .com / .cn / .net / .org / .gov also carry different meanings.

If the protocol, the domain name, or the port number differs, the request is cross-domain (cross-origin). Cross-domain issues will be covered in a separate article. The following pairs are cross-domain:

'www.baidu.com' VS 'www.qq.com': cross-domain (different domain names)
'www.baidu.com' VS 'video.baidu.com': cross-domain (same primary domain, but different subdomains)
'www.baidu.com:80' VS 'www.baidu.com:443': cross-domain (different ports)
'http://www.baidu.com' VS 'https://www.baidu.com': cross-domain (different protocols)

The following pair is same-origin:

‘http://www.baidu.com:80/st.html’ VS ‘http://www.baidu.com:80/index.html’: same origin (same protocol, domain name, and port)
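The comparison rule above can be sketched with the standard URL class. This is only an illustration: the helper name isSameOrigin is made up for this example, and it assumes both inputs are absolute URLs that include a protocol.

```javascript
// Hypothetical helper: returns true when two absolute URLs share
// protocol, domain name, and port (that is, the same origin).
function isSameOrigin(a, b) {
  const left = new URL(a);
  const right = new URL(b);
  // URL#origin is "protocol//hostname[:port]"; a default port
  // (80 for http, 443 for https) is left out of the string.
  return left.origin === right.origin;
}

console.log(isSameOrigin('http://www.baidu.com:80/st.html', 'http://www.baidu.com:80/index.html')); // true
console.log(isSameOrigin('http://www.baidu.com', 'https://www.baidu.com')); // false (protocol differs)
console.log(isSameOrigin('http://www.baidu.com', 'http://video.baidu.com')); // false (subdomain differs)
console.log(isSameOrigin('http://www.baidu.com:80', 'http://www.baidu.com:443')); // false (port differs)
```

Because the default port is normalized away, 'http://www.baidu.com' and 'http://www.baidu.com:80' count as the same origin in this sketch.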

Port number

Port number: 80. Ports distinguish the different projects hosted on one server (each project is actually a service). The value ranges from 0 to 65535.

Default port numbers: when we type an address into the browser address bar without a port number, the browser adds the default one for us and sends the request to that port. HTTP -> 80, HTTPS -> 443, FTP -> 21.

The path name of the requested resource

The requested resource path name is /st/index.html. The server uses this path to find the resource file the client needs.

The URL the user sees may have been rewritten (URL rewriting). For example, for an Ajax data request to the interface address /api/list, no file exists at that path; the back end returns the corresponding content based on the address.

Question mark parameter information (query string)

Query string: ?xxx=xxx&xxx=xxx

It passes information to the server. GET-style requests typically pass parameters this way. The xxx=xxx&xxx=xxx format is called the x-www-form-urlencoded format.
When navigating between pages, the query string can also pass information to the target page.
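As a small illustration of the x-www-form-urlencoded format, the standard URLSearchParams class can build and parse a query string. The helper names buildQuery and parseQuery and the sample values are assumptions made for this sketch.

```javascript
// Hypothetical helpers built on the standard URLSearchParams class.

// Turn a plain object into "key=value&key=value".
function buildQuery(values) {
  return new URLSearchParams(values).toString();
}

// Turn a query string (with or without the leading "?") back into an object.
function parseQuery(query) {
  return Object.fromEntries(new URLSearchParams(query));
}

const query = buildQuery({ lx: '1', name: 'hello world' });
console.log(query); // lx=1&name=hello+world
console.log(parseQuery('?' + query)); // { lx: '1', name: 'hello world' }
```

Note that this format writes a space as "+", which URLSearchParams decodes again when parsing.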

Fragment identifier (HASH value)

HASH value: #video

Anchor positioning
Hash routing

Other issues (URL encoding)

Suppose a URL looks like this:

let url = 'http://www.xxx.com/index.html?lx=1&from=http://www.qq.com/';

The value of the query parameter from is itself a complete URL, which can cause problems when the URL is parsed: http://www.xxx.com/index.html?lx=1&from= may be treated as one URL and http://www.qq.com/ as another.

So we encode the entire URL, or just the values in the query string, so that the browser recognizes a single URL.

Encoding methods:

encodeURI & decodeURI: encode spaces and Chinese characters; generally used to encode an entire URL.

let url = 'http://www.xxx.com/s t/index.html?x=1&name=你好&from=http://www.qq.com';
console.log(encodeURI(url));
// http://www.xxx.com/s%20t/index.html?x=1&name=%E4%BD%A0%E5%A5%BD&from=http://www.qq.com

Spaces and Chinese characters are encoded; symbols such as : / ? & = are left alone.

encodeURIComponent & decodeURIComponent: encode spaces, Chinese characters, and also some special symbols (such as : / ? & =). Therefore they are generally used only on the values being passed, not on the entire URL. This solves the problems of URLs that cannot be parsed and of passed information turning into gibberish.
```
    let url = `http://www.xxx.com/st/index.html?x=1&name=${encodeURIComponent('你 好')}&from=${encodeURIComponent('http://www.qq.com')}`;
    console.log(url);
```
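To make the difference concrete, here is a minimal sketch that encodes only the parameter values and then reads one back. The helper name buildUrl is hypothetical; the URLs are the placeholder addresses used earlier in this section.

```javascript
// Hypothetical helper: append encoded query parameters to a base URL.
function buildUrl(base, values) {
  const pairs = Object.keys(values).map(
    (key) => `${encodeURIComponent(key)}=${encodeURIComponent(values[key])}`
  );
  return `${base}?${pairs.join('&')}`;
}

const target = buildUrl('http://www.xxx.com/index.html', {
  lx: '1',
  from: 'http://www.qq.com/',
});
console.log(target);
// http://www.xxx.com/index.html?lx=1&from=http%3A%2F%2Fwww.qq.com%2F

// The nested URL survives a round trip through the query string.
console.log(new URL(target).searchParams.get('from')); // http://www.qq.com/

// encodeURI, by contrast, keeps ":" and "/" and only escapes the space.
console.log(encodeURI('http://www.xxx.com/s t/index.html')); // http://www.xxx.com/s%20t/index.html
```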

escape & unescape: used for passing information between client-side pages or for encoding some information, for example Chinese content in cookies. Both functions are deprecated, so prefer encodeURIComponent in new code.
```
console.log(escape('你好')); // %u4F60%u597D
console.log(encodeURIComponent('你好')); // %E4%BD%A0%E5%A5%BD
```

Check the cache

Caching is a very important optimization at the HTTP network layer (it targets resource file requests).

Two kinds of cache are checked:

Strong cache and negotiated cache (for resource file requests)
Local storage (for data requests)

Cache location

Memory Cache: cached in memory
Disk Cache: cached on the hard disk

How they are used when opening, refreshing, and force-refreshing a page:

Opening a web page: the browser first looks for a match in the Disk Cache. If there is one, it is used; otherwise a network request is sent.
Normal refresh (F5): because the tab has not been closed, the Memory Cache is still available. The browser uses the Memory Cache first and falls back to the Disk Cache.
Forced refresh (Ctrl + F5): the browser does not use the cache. Requests are sent with the header Cache-Control: no-cache, and the server returns 200 with the latest content.

Strong cache

Strong cache: Expires / Cache-Control

A strong cache works like this:

On the first request for a resource, the server returns the result together with a cache identifier (the response header Expires / Cache-Control), and the browser stores both in its cache.
How the cache identifiers work:
- Expires: the time at which the resource expires (HTTP/1.0)
- Cache-Control: the server returns a response header such as 'Cache-Control: max-age=2592000'. For 2592000 seconds (30 days) after the resource is first fetched, repeated requests read the cached copy (HTTP/1.1)
When the resource is requested again, the browser first checks the cached copy and its Expires / Cache-Control identifier. If the copy exists and has not expired, the client reads it directly and sends no request to the server.
Note that Cache-Control takes priority over Expires when both are present.
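The freshness check described above can be sketched as a small function. This is a simplified model, not how any particular browser implements it: the name isFresh, the lower-cased header object, and the millisecond timestamps are assumptions, and directives such as no-store or no-cache are ignored.

```javascript
// Hypothetical sketch of the strong-cache decision.
// headers: response headers stored with the cached copy (lower-cased names).
// storedAt / now: timestamps in milliseconds.
function isFresh(headers, storedAt, now) {
  const cacheControl = headers['cache-control'] || '';
  const match = /max-age=(\d+)/.exec(cacheControl);
  if (match) {
    // Cache-Control: max-age wins over Expires when both are present.
    return now - storedAt < Number(match[1]) * 1000;
  }
  if (headers['expires']) {
    const expiresAt = Date.parse(headers['expires']);
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }
  // No freshness information: the copy cannot be used without asking the server.
  return false;
}

const day = 24 * 60 * 60 * 1000;
console.log(isFresh({ 'cache-control': 'max-age=2592000' }, 0, 29 * day)); // true
console.log(isFresh({ 'cache-control': 'max-age=2592000' }, 0, 31 * day)); // false
```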

Problem: the file is strongly cached locally, but the resource file on the server has been updated. How do we make sure we get the latest content?

Add a timestamp when requesting the resource file

For example, the first request carries one timestamp in its URL. When the server resource is updated, the page requests it with a different timestamp, so the local strong cache is bypassed and the latest resource is pulled.

Name the file after a hash of its content

For example, when the server resource is updated, the build (webpack) generates a new hashed file name, so the local strong cache is not used.

For either approach to work, the HTML file itself must never be strongly cached.

The first time:

<html>
    <link href='dasdasd43546.css'>
</html>

The second time:

<html>
   <link href='75675675fsdff6.css'>
</html>
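The idea behind hashed file names can be sketched as follows. This is not what webpack does internally: a tiny FNV-1a hash stands in for whatever hash the bundler uses, and the names fnv1a and hashedFileName are made up for this illustration.

```javascript
// Stand-in hash (32-bit FNV-1a over UTF-16 code units), used here only
// to show that the name changes when the content changes.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical helper: "index.css" + content -> "index.<hash>.css".
function hashedFileName(name, content) {
  const dot = name.lastIndexOf('.');
  const base = dot === -1 ? name : name.slice(0, dot);
  const ext = dot === -1 ? '' : name.slice(dot);
  return `${base}.${fnv1a(content)}${ext}`;
}

console.log(hashedFileName('index.css', 'body { color: red; }'));
console.log(hashedFileName('index.css', 'body { color: blue; }'));
```

The same content always produces the same name, so unchanged files keep hitting the strong cache, while any edit yields a new URL the browser has never cached.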

Negotiated cache

Negotiated caching is the process in which, after the strong cache has become invalid, the browser sends a request to the server carrying the cache identifier, and the server decides from that identifier whether the cached copy may be used.

This is how the negotiated cache works (using ETag as an example):

On the first request, the response contains a response header (say ETag: S35B56F) and the response body, and the browser caches both.
On the second request, the browser sends the cache identifier along (If-Modified-Since / If-None-Match). For example, the request header is If-None-Match: S35B56F, whose value is the ETag returned the first time.
The server uses the ETag to determine whether the file has been updated:
- Not updated: it returns 304 with no body, telling the client to read its local cache
- Updated: it returns 200 together with the latest resource and the new Last-Modified / ETag value
The browser receives the response:
- If it is 200: the content is the latest; render it directly and cache the new result and identifier locally
- If it is 304: get the content from the local cache and render it
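The server-side half of this exchange can be sketched as a pure function. It is a simplified model under stated assumptions: the name respond, the lower-cased header objects, and the resource shape { body, etag } are invented for the example, and only a single exact ETag match is handled.

```javascript
// Hypothetical sketch of a server deciding between 304 and 200.
// requestHeaders: lower-cased request header names.
// resource: { body, etag } as currently stored on the server.
function respond(requestHeaders, resource) {
  if (requestHeaders['if-none-match'] === resource.etag) {
    // Not updated: no body, the client reuses its local copy.
    return { status: 304, headers: { etag: resource.etag }, body: null };
  }
  // First request or updated file: send the content with the current ETag.
  return { status: 200, headers: { etag: resource.etag }, body: resource.body };
}

const resource = { body: 'body { color: red; }', etag: 'S35B56F' };
console.log(respond({}, resource).status); // 200
console.log(respond({ 'if-none-match': 'S35B56F' }, resource).status); // 304
console.log(respond({ 'if-none-match': 'OLD' }, resource).status); // 200
```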

Meaning of Last-Modified / ETag:

Last-Modified: records the time when the resource file on the server was last updated
ETag: an identifier that changes whenever the resource file on the server changes

When the negotiated cache applies:

When the strong cache is invalid or absent (for example, for HTML files), the negotiated cache is checked.
Every request asks the server whether the resource has been updated.

As an example, you can inspect the cache settings of a CSS file on Baidu.

Note:

Strong caching and negotiated caching are configured on the server side. The client browser handles everything based on the returned headers, so nothing needs to be set up separately on the front end.

Note: with the strong cache no request is sent; the local copy is used if it is found before the request would go out. With the negotiated cache a request is always sent (the browser asks the server whether the cached copy is still usable), and the local cache is used only if 304 comes back.

Local storage cache

Caching at this level has to be implemented in JS with our own logic.

There are two kinds of local caching:

While the page stays open, infrequently updated data is read from a cache that generally lives in memory; once the page is refreshed, the stored data is gone.
Data that can still be read after the page is closed and reopened is called persistent storage, and we can give it an expiration time.

Several ways to store data on the client:

(Global) variable storage such as vuex / redux: the stored data is lost when the page is refreshed or closed and reopened (the memory is released)
cookie
Web Storage: localStorage & sessionStorage
IndexedDB
Cache
Manifest offline storage

localStorage VS sessionStorage

Both are APIs introduced with HTML5 (not supported in IE7 and below).

localStorage: persistent local storage with no expiration time. The content is still there after the page is closed; it has to be deleted manually (or by uninstalling the browser).
sessionStorage: session storage. The stored information disappears when the page is closed, but it survives a page refresh.

localStorage VS cookie

Both are restricted by the same-origin policy: a page can only read content that was stored under its own origin.

A cookie can hold only about 4KB, so it cannot store much data. localStorage can hold about 5MB under one origin.
A cookie has an expiration time after which it becomes invalid, and it is also restricted by path. localStorage is persistent with no expiration time unless you implement your own expiration handling.
Cookies are less stable: some browsers' built-in cleanup operations may clear them, and in incognito or private browsing mode cookie information may not be kept.
Cookies are supported by older browsers.
Cookies are not purely local storage; they are closely tied to the server. When the client sends a request to the server, it sends the matching local cookies along by default, and if the response carries a Set-Cookie header, the browser stores that information in a local cookie by default. localStorage is purely local storage and has no connection to the server by default.

The following code demonstrates how local caching works

Storing cached data in a global variable (the cache disappears on refresh):

let submit = document.querySelector('#submit'),
     runing = false;

let serverData;
submit.onclick = function () {
   if (runing) return; // Simple guard against repeated clicks while a request is pending
   runing = true;

   if (serverData) {
      // If there is data, use the cached data instead of fetching it from the server
      console.log('The data requested back is:', serverData);
      runing = false;
      return;
   }

   // Pull the data from the server and store it in a global variable
   axios.get('http://127.0.0.1:8888/home_banner').then(response => {
      console.log('The data requested back is:', response.data);
      serverData = response.data;
      runing = false;
   });
};

sessionStorage:

submit.onclick = function () {
   if (runing) return;
   runing = true;

   let data = sessionStorage.getItem('@A');
   if (data) {
      // If there is data, use the cached data instead of fetching it from the server
      console.log('The data requested back is:', JSON.parse(data));
      runing = false;
      return;
   }

   // Pull the data from the server and store it in sessionStorage
   axios.get('http://127.0.0.1:8888/home_banner').then(response => {
      console.log('The data requested back is:', response.data);
      sessionStorage.setItem('@A', JSON.stringify(response.data));
      runing = false;
   });
};

localStorage: because the storage is persistent, an expiration check has to be added by hand.

submit.onclick = function () {
   if (runing) return;
   runing = true;

   let data = localStorage.getItem('@A');
   if (data) {
      data = JSON.parse(data);
      // Treat the cached data as valid for 1 hour
      if (new Date() - data.time <= 60 * 60 * 1000) {
         console.log('The data requested back is:', data.data);
         runing = false;
         return;
      }
   }

   // Pull the data from the server and store it in localStorage together with a timestamp
   axios.get('http://127.0.0.1:8888/home_banner').then(response => {
      console.log('The data requested back is:', response.data);
      localStorage.setItem('@A', JSON.stringify({
         time: +new Date(),
         data: response.data
      }));
      runing = false;
   });
};
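The expiration logic above can be pulled out into a reusable wrapper. The sketch below is one possible shape, not an established API: createExpiringCache and createMemoryStorage are hypothetical names, and the in-memory storage only stands in for localStorage so that the example also runs outside a browser.

```javascript
// Stand-in for localStorage with the same getItem/setItem/removeItem methods.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => { map.set(key, String(value)); },
    removeItem: (key) => { map.delete(key); },
  };
}

// Hypothetical wrapper: stores { time, data } and drops entries older than ttlMs.
// The clock is injectable so the expiration can be exercised without waiting.
function createExpiringCache(storage, ttlMs, now = () => Date.now()) {
  return {
    get(key) {
      const raw = storage.getItem(key);
      if (raw === null) return undefined;
      const entry = JSON.parse(raw);
      if (now() - entry.time > ttlMs) {
        storage.removeItem(key); // expired: clean up and report a miss
        return undefined;
      }
      return entry.data;
    },
    set(key, data) {
      storage.setItem(key, JSON.stringify({ time: now(), data }));
    },
  };
}

// Usage with a fake clock; in a browser, pass localStorage instead.
let clock = 0;
const cache = createExpiringCache(createMemoryStorage(), 60 * 60 * 1000, () => clock);
cache.set('@A', { banners: [1, 2, 3] });
console.log(cache.get('@A')); // { banners: [ 1, 2, 3 ] }
clock = 60 * 60 * 1000 + 1;
console.log(cache.get('@A')); // undefined
```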

DNS Server resolution

Two query methods

Recursive query
Iterative query

If resources are deployed on multiple servers, they are accessed through many domain names. More DNS resolutions are then required, which takes more time.

Using multiple servers has its own benefits:

Reasonable use of resources: high-performance servers handle data, and lower-performance servers store static resources, which reduces cost
Better resistance to load: the pressure is spread across multiple servers
Higher HTTP concurrency: multiple servers raise the total number of concurrent requests. Baidu, for example, puts different resources on different servers.

DNS resolution optimization:

Each DNS resolution takes roughly 20 to 120 milliseconds. DNS Prefetch resolves domain names ahead of time, which reduces the time later requests spend waiting on DNS.

For example:

<meta http-equiv="x-dns-prefetch-control" content="on">

<link rel="dns-prefetch" href="//static.360buyimg.com"/>

<link rel="dns-prefetch" href="//misc.360buyimg.com"/>

<link rel="dns-prefetch" href="//img10.360buyimg.com"/>

<link rel="dns-prefetch" href="//d.3.cn"/>

<link rel="dns-prefetch" href="//d.jd.com"/>

How it works: a link tag is handled on a separate thread while HTML parsing and rendering continue. So DNS resolution for all of these domain names can be done on a separate thread and cached. When one of them is encountered again, for example in the src of an img, it does not need to be resolved again.
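If the list of third-party hosts is long, the link tags can be generated instead of written by hand. The following is only a sketch: the helper dnsPrefetchTags, the sample resource paths, and the page host www.jd.com are assumptions made for the example.

```javascript
// Hypothetical helper: collect the distinct external host names used by a
// page's resources and emit one dns-prefetch tag per host.
function dnsPrefetchTags(resourceUrls, pageHost) {
  const hosts = new Set();
  for (const raw of resourceUrls) {
    // Relative and protocol-relative URLs are resolved against the page.
    const { hostname } = new URL(raw, `https://${pageHost}`);
    if (hostname !== pageHost) hosts.add(hostname); // the page's own host is already resolved
  }
  return [...hosts].map((host) => `<link rel="dns-prefetch" href="//${host}"/>`);
}

const tags = dnsPrefetchTags(
  [
    '//static.360buyimg.com/a.js',
    '//misc.360buyimg.com/b.css',
    '//static.360buyimg.com/c.js',
    '/local.css',
  ],
  'www.jd.com'
);
console.log(tags.join('\n'));
// <link rel="dns-prefetch" href="//static.360buyimg.com"/>
// <link rel="dns-prefetch" href="//misc.360buyimg.com"/>
```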

TCP Three-way handshake

The purpose of the TCP three-way handshake is to establish a stable, reliable transmission channel between the client and the server and to confirm that both sides can send and receive messages.

UDP needs no three-way handshake, so it is faster, but it is unreliable.
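The three packets of the handshake can be modeled as plain objects to show how the sequence and acknowledgment numbers line up. This is a teaching sketch, not a network implementation: the function name threeWayHandshake and the initial sequence numbers 100 and 300 are arbitrary assumptions.

```javascript
// Hypothetical model of the three handshake packets.
// clientIsn / serverIsn: initial sequence numbers chosen by each side.
function threeWayHandshake(clientIsn, serverIsn) {
  // 1. Client asks to open a connection.
  const syn = { from: 'client', flags: ['SYN'], seq: clientIsn };
  // 2. Server agrees and acknowledges the client's SYN (a SYN uses up one sequence number).
  const synAck = { from: 'server', flags: ['SYN', 'ACK'], seq: serverIsn, ack: syn.seq + 1 };
  // 3. Client acknowledges the server's SYN; both directions are now confirmed.
  const ack = { from: 'client', flags: ['ACK'], seq: syn.seq + 1, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}

for (const packet of threeWayHandshake(100, 300)) {
  console.log(packet);
}
```

After the second packet the client knows that it can both send and receive; after the third the server knows the same, which is why three packets are the minimum.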

Data transfer

HTTP request and response.

For this part, check out my previous post

HTTP series: AJAX basic combing, Axios basic use combing (juejin.cn)

TCP four-way wave

Release the established connection

Why closing takes four steps while opening takes three:

During the handshake, after receiving the client's SYN packet the server can reply with a single SYN+ACK packet.
When closing, however, the server may still have data in transit when it receives the client's FIN, so it cannot close immediately. It can only reply with an ACK first, telling the client: "I received your FIN."
Only after all the data being transmitted has been sent does the server send its own FIN, so the ACK and the FIN cannot be sent together.

So closing the connection takes four steps (a four-way wave).
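The ordering argument above can be illustrated with a tiny message log. It is a simplified model with invented names (closeConnection, the chunk labels); it always shows the ACK and the FIN as separate packets, as the explanation above describes.

```javascript
// Hypothetical model: the client closes while the server still has data to send.
// pendingServerData: chunks the server must finish sending before it can close.
function closeConnection(pendingServerData) {
  const log = [];
  log.push('client -> server: FIN'); // 1. client has nothing more to send
  log.push('server -> client: ACK'); // 2. server confirms it received the FIN
  for (const chunk of pendingServerData) {
    log.push(`server -> client: DATA ${chunk}`); // server finishes its own transfer
  }
  log.push('server -> client: FIN'); // 3. server is done as well
  log.push('client -> server: ACK'); // 4. client confirms, connection released
  return log;
}

console.log(closeConnection(['chunk-1', 'chunk-2']).join('\n'));
```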

TCP connection optimization:

Keep a persistent connection: do not close the channel after each HTTP request, by setting the Connection: keep-alive header

Client rendering

Client rendering optimization can be seen in my previous two articles

Browser Rendering process and CRP optimization 1: Rendering Process (juejin.cn)

Browser rendering process and CRP optimization ii: CRP optimization

Summary of network layer performance optimization

Network optimization is a key part of front-end performance optimization, because most of the time is spent at the network layer, especially on the first page load, so reducing the wait time matters a lot. Performance tuning at the HTTP level can reduce both the occurrence and the duration of the white screen.

The walk through the HTTP process above mentioned several optimization methods along the way. Here is a summary, with a few additions:

Use caching
- Apply strong and negotiated caching to static resource files (extension: how to make sure updated files are refreshed in time)
- Cache the data of infrequently updated interfaces in local storage (extension: differences between cookie / localStorage / vuex | redux)
DNS optimization
- Deploy on multiple servers to increase HTTP concurrency (at the cost of slower DNS resolution)
- DNS Prefetch
TCP three-way handshake and four-way wave
- Connection: keep-alive
Data transfer
- Reduce the size of data transfers
  - Compress content or data (webpack, etc.)
  - Make sure GZIP compression is enabled on the server (it typically reduces size by about 60%)
  - Request data in batches (e.g. pull-to-refresh or paging, so that the first load requests less data)
- Reduce the number of HTTP requests
  - Merge resource files
  - Font icons: icons that can be drawn with a font need no image request (e.g. Alibaba's iconfont)
  - CSS sprites
  - BASE64 images
CDN servers (geographically distributed)
- Deploy servers in multiple regions so that access speed is consistent across regions
Use HTTP/2.0

Reducing the white screen:

Loading indicators for a friendlier experience
Skeleton screen: client-side skeleton screen (a kind of loading indicator) + server-side skeleton screen
Lazy loading of images

Reducing the white screen is a topic worth exploring on its own; I will write a detailed article about it later.

Differences between HTTP versions

Some differences between HTTP/1.0 and HTTP/1.1

Caching: HTTP/1.0 mainly uses Last-Modified and Expires as the caching criteria, while HTTP/1.1 introduces more cache control strategies, such as ETag and Cache-Control.
Bandwidth optimization and use of network connections: HTTP/1.1 supports resumable (range) transfers, i.e. status code 206 (Partial Content).
Error notification: HTTP/1.1 adds 24 error status codes. For example, 409 (Conflict) means the request conflicts with the current state of the resource, and 410 (Gone) means a resource on the server has been permanently deleted…
Host header: HTTP/1.0 assumes that each server is bound to a unique IP address, so the request message does not carry the host name. With the development of virtual hosting, one physical server can host multiple virtual hosts (multi-homed web servers) that share one IP address. HTTP/1.1 request and response messages should support the Host header field, and a request message without it gets an error (400 Bad Request).
Persistent connections: HTTP/1.1 enables Connection: keep-alive by default, which partly makes up for HTTP/1.0 having to create a connection for every request.

New features of HTTP/2.0 compared with HTTP/1.x

New binary format: HTTP/1.x parses messages as text. Text-based protocol parsing has an inherent weakness: text comes in many forms, so a robust parser has to handle many scenarios. Binary is different, since it only involves combinations of 0 and 1. For this reason HTTP/2.0 uses a binary format, which is convenient to implement robustly.
Header compression: HTTP/1.x headers carry a lot of information and are sent again with every request. HTTP/2.0 uses an encoder to reduce the size of the headers that need to be transmitted.
Server push: for example, my web page requests style.css. When the client receives style.css, the server also pushes style.js to the client, so the client does not have to send another request for it.
```
// Set the Link header in the HTTP response generated by the application
Link: </styles.css>; rel=preload; as=style, </example.png>; rel=preload; as=image
```
Multiplexing

In HTTP/1.x, concurrent requests mean opening multiple connections, not sending multiple HTTP requests over a single connection.
- HTTP/1.0: each request/response pair opens a TCP connection and closes it when done
- HTTP/1.1: persistent connections, but requests on one connection are queued and processed serially; a later request runs only after the previous one returns. If one request times out, all later requests are blocked. This is the commonly mentioned head-of-line blocking.
- HTTP/2.0: multiplexing. Multiple requests can run concurrently on one connection, and a time-consuming request does not hold up the others.
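The effect of head-of-line blocking can be shown with a little arithmetic. The model below is deliberately idealized: the function names and the durations are made up, each request is assumed to take a fixed time, and bandwidth sharing between multiplexed streams is ignored.

```javascript
// Hypothetical model of one connection handling several requests.
// durations: time each request takes on its own, in milliseconds.

// HTTP/1.1 style: requests on the connection are answered one after another,
// so each one finishes only after everything queued before it.
function serialCompletionTimes(durations) {
  let elapsed = 0;
  return durations.map((duration) => (elapsed += duration));
}

// HTTP/2 style (idealized): requests run concurrently on the connection,
// so each one finishes after its own duration.
function multiplexedCompletionTimes(durations) {
  return durations.slice();
}

const durations = [300, 50, 50]; // one slow request followed by two fast ones
console.log(serialCompletionTimes(durations)); // [ 300, 350, 400 ]
console.log(multiplexedCompletionTimes(durations)); // [ 300, 50, 50 ]
```

In the serial case the two fast requests are held back by the slow one in front of them; with multiplexing they complete as soon as their own work is done.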