A summary of questions and answers I have come across while studying. If you find any mistakes, corrections are welcome.

What do these status codes mean, and in what situations do they occur

200 indicates that the request was processed successfully. This is the status code returned in the normal case.
204 indicates that the request was executed successfully but there is no content to return. The browser does not need to refresh the page or navigate to a new one. For example, when a form is submitted or an <a> link is clicked, the page does not navigate if the target returns 204.

Usage scenario: if you only need to report whether an operation succeeded, consider returning 204 to avoid transmitting redundant data, for example in response to the preflight request (OPTIONS) sent for cross-origin requests.

301 indicates a permanent redirect. The requested resource has been assigned a new URL, and that URL should be used from now on.

Usage scenarios: 1. You do not want to renew an expiring domain name (or have found a more suitable one) and need to move the site to a new domain. 2. Search results show the domain without www while the www version is not indexed; a 301 redirect tells the search engine which domain is the target one. 3. The server space has been reallocated.

302 indicates a temporary redirect. The requested resource has temporarily been assigned a new URL, which is used for this request only; the redirect target may change in future requests.

Usage scenarios: a user who is not logged in visits the user center and is redirected to the login page; a user who hits a 404 page is redirected to the home page; ad pages; and so on. Using 302 may lead to URL hijacking.

304 indicates that the requested resource has not been modified since the last request. When the server returns this response, it does not return the content. This status is tied to the browser cache.

It works together with the Last-Modified and ETag response headers and the If-Modified-Since and If-None-Match request headers to negotiate caching.
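The cache negotiation behind 304 can be sketched as a small server-side check. This is a simplified illustration, not a full implementation: the function name `conditional_status` and its parameters are hypothetical, ETags are compared as exact strings, and weak validators and lists of ETags are ignored.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    # If-None-Match is checked first; it takes precedence over If-Modified-Since.
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None:
        return 304 if client_etag == etag else 200

    # Fall back to the modification time when no ETag validator was sent.
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304

    # No validator at all: send the full response.
    return 200


resource_time = datetime(2015, 12, 26, 17, 30, tzinfo=timezone.utc)
print(conditional_status({"If-None-Match": '"v1"'}, '"v1"', resource_time))
```

A 304 response carries no body, so the browser keeps using the copy it already has.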

What are the common HTTP status codes and what do they mean

1XX Informational: an intermediate state of protocol processing

101 When HTTP is upgraded to WebSocket, the server sends status code 101 if it agrees to the change

2XX Success

200 The request from the client was processed correctly by the server.
204 The request succeeded, but the response message contains no entity body.
206 The server successfully handled a range request from the client and returns partial content.

3XX Redirection

301 Permanent redirect: the resource has been assigned a new URL.
302 Temporary redirect: the resource has temporarily been assigned a new URL.
303 The resource is available at another URL, and the client should retrieve it with a GET request.
304 When the client holds a cached copy that may have expired, it asks the server whether the copy is still usable by sending information such as the ETag cache identifier and the modification time. 304 tells the client to keep using the cache.
307 Temporary redirect: the client sends the same request, with the same method, to the new URL.

4XX Client error

403 The server refuses access to the requested resource, for example because the request does not satisfy the server's access conditions.
404 The requested resource was not found on the server.
405 The request method is not allowed for the resource.
415 Unsupported media type; check the Content-Type.

5XX Server error

501 The server does not support a function required by the request.
502 Bad Gateway: a server acting as a gateway or proxy received an invalid response from the upstream server.
503 The server is temporarily overloaded or down for maintenance and cannot process requests.
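As a quick way to look these codes up, Python's standard library ships the registered reason phrases in `http.HTTPStatus`. The helper `status_class` below is a hypothetical name used only for this sketch; it groups a code by its first digit.

```python
from http import HTTPStatus

# Status classes keyed by the first digit of the code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Map a status code to its class using the hundreds digit."""
    return CLASSES[code // 100]


for code in (101, 204, 304, 415, 502):
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```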

Five-layer model

The data is encapsulated as follows

What is the difference between HTTP and HTTPS

HTTP is the Hypertext Transfer Protocol. It transmits information in plaintext, which is a security risk. HTTPS addresses this by adding SSL/TLS between TCP and HTTP so that packets are encrypted.
Establishing an HTTP connection is relatively simple: HTTP messages can be sent as soon as the TCP three-way handshake completes. With HTTPS, an SSL/TLS handshake must follow the TCP three-way handshake before encrypted data can be sent. HTTP pages therefore respond faster than HTTPS ones, and HTTPS costs the server more resources than HTTP.
The default HTTP port is 80 and the default HTTPS port is 443.
HTTPS requires the server to obtain a digital certificate from a certificate authority (CA), which ensures that the server's identity can be trusted.

What are the advantages and disadvantages of SSL

Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS), are security protocols that provide security and data integrity for network communication. TLS and SSL encrypt the connection between the transport layer and the application layer.

Advantages:

Authenticity: authenticates the user and the server to ensure that data is sent to the correct client and server;
Confidentiality: encrypts data to prevent it from being stolen in transit;
Integrity: protects data so that it cannot be changed during transmission without detection.

Disadvantages:

The technical barrier is relatively high, and applying for a certificate is cumbersome
Information transfer is more complex than with plain HTTP
Encryption, decryption, and key negotiation take time and add latency

What are the principles of symmetric and asymmetric encryption, and how do they differ

Symmetric encryption: encryption and decryption use the same key. The commonly used AES algorithm comes in AES-128, AES-192, and AES-256 variants, which differ in key length.
Asymmetric encryption: encryption and decryption use different keys, one public and one private. The most widely used algorithm is RSA.

General process

Suppose A wants to send a message to B. Both A and B generate a pair of public and private keys for encryption and decryption.
A keeps its private key secret and gives its public key to B. B keeps its private key secret and gives its public key to A.
When A wants to send a message to B, A encrypts it with B's public key, which A knows.
A sends the encrypted message to B.
After receiving the message, B decrypts it with its own private key. Nobody else who receives the message can decrypt it, because only B holds B's private key.
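The exchange above can be illustrated with textbook RSA on deliberately tiny numbers. This is a toy sketch only: the primes 61 and 53, the exponent 17, and the message value 65 are assumptions chosen for readability, and real RSA uses very large primes together with padding.

```python
# Toy key pair for B, built from tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                    # modulus, shared by the public and private key
phi = (p - 1) * (q - 1)
e = 17                       # public exponent: B's public key is (e, n)
d = pow(e, -1, phi)          # private exponent: B's private key is (d, n); needs Python 3.8+


def encrypt(message, public_key):
    """A encrypts a small integer message with B's public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """B recovers the message with its own private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, (e, n))
recovered = decrypt(ciphertext, (d, n))
print(ciphertext, recovered)
```

Anyone may know `(e, n)`, but only the holder of `d` can undo the encryption, which is the property the steps above rely on.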

Which phase of HTTPS uses asymmetric encryption and which phase uses symmetric encryption, and why

Asymmetric encryption is used to exchange the session key before communication is established
During communication, the session key is used with symmetric encryption to encrypt the plaintext data
Symmetric encryption is far more efficient, and a large amount of data is transmitted during communication, so symmetric encryption is used for the data itself. Asymmetric encryption is used only to keep the key exchange secure

Which algorithms are commonly used for asymmetric encryption

RSA: its security rests on the difficulty of factoring large integers; the product of two very large primes is the basis for generating the keys
ECC: based on the elliptic curve discrete logarithm problem; public and private keys are generated from a specific curve equation and base point
ElGamal: based on the discrete logarithm problem

What is the WebSocket connection process

The browser establishes a TCP connection to the WebSocket server through the three-way handshake. If this fails, the following steps are not executed and the web application receives an error.
Once the TCP connection is established, the browser/UA sends the server an HTTP request carrying the supported WebSocket version, the protocol version, the origin, the host address, and so on.
When the WebSocket server receives the handshake request, it checks that the data and format are correct and that the client and server protocol versions match. If so, it accepts the handshake and replies, again over HTTP.
When the browser receives the server's reply, it checks the content and format. If they are correct, the connection is established and the onopen event fires, after which the web developer can send data to the server through the send interface. Otherwise the handshake fails and the web application receives an onerror event.
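One concrete check in this handshake is defined in RFC 6455: the server answers the client's `Sec-WebSocket-Key` header with a `Sec-WebSocket-Accept` value derived from it. The sketch below shows that derivation only; the function name `websocket_accept` is hypothetical, and the sample key is the one used in the RFC's own example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

The browser repeats the same computation and treats the handshake as failed if the server's value does not match.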

See: www.ruanyifeng.com/blog/2017/0…

Differences between HTTP/1 and HTTP/2

HTTP/1 can compress only the body, not the headers. In HTTP/1 headers are sent as text, and when they carry cookies, hundreds or even thousands of bytes may have to be sent with every request. In HTTP/2, headers are encoded with the HPACK compression format, which reduces their size. Both ends maintain an index table of headers that have already been sent. Later in the connection, only the index (key) of a recorded header needs to be transmitted, and the peer looks up the corresponding header in its own table.
HTTP/2 uses multiplexing: multiple streams are carried over a single TCP connection, each stream consisting of all the frames of one request or response. This avoids the head-of-line blocking of older HTTP versions, including the pipelining blocking problem of HTTP/1.1, and greatly improves transfer performance.
HTTP/2 supports server push. The server is no longer purely reactive and can proactively send resources to the client.
HTTP/2 messages use a binary format, which makes transmission more efficient.

A remaining problem in HTTP/2: multiple HTTP requests share one TCP connection, but TCP at the transport layer does not know how many HTTP requests it is carrying. When a packet is lost and retransmission is triggered, every HTTP request on that TCP connection has to wait for the lost packet.

The relationship between Ajax and HTTP

An AJAX request is simply an HTTP request made by the browser under the same-origin policy. The AJAX request objects are what the browser exposes to JS for making HTTP requests
AJAX requests are restricted by the browser's same-origin policy, so cross-origin problems arise
When AJAX makes a non-simple request, the browser first issues an OPTIONS preflight (HTTP itself has no preflight)
In terms of usage, AJAX is simpler: it hides low-level details and adds browser features (such as automatically attaching same-origin cookies)
AJAX requests are sent via XHR or fetch, while other HTTP requests are made by entering a URL and navigating to it, submitting a form, or loading an external file
Many Ajax libraries set the X-Requested-With request header to XMLHttpRequest
Ajax is more interactive: JS can read the response data directly

The difference between a request made by entering a URL in a browser and ajax

Entering a URL

The page refreshes
JS cannot get the response data
Only GET requests can be sent, and parameters must be placed in the URL
There are no cross-origin restrictions

Sending an Ajax request

The page does not refresh
JS can get the response data
Requests with various methods can be sent, and parameters can be placed in the URL or the body
Subject to cross-origin restrictions

OSI seven layer model

What is the unit of transmission at the physical layer

Bits. The physical layer transmits binary data as electrical or optical signals

Can a GET request carry content in the body

Yes. The HTTP specification does not forbid a request body on any method, and at the protocol level a GET body is sent the same way as any other. In browsers, however, Ajax requests go through XMLHttpRequest, and the XMLHttpRequest specification ignores the body of GET requests, so carrying a body in a GET request is not recommended

Does the HTTP request body have a size limit

For POST requests, the browser imposes no limit on the body size, but the web server does, and the limit can be configured on the web server

How does an HTTP message get to the server

Parse the address: protocol, host name (resolved to the host's IP address by DNS), port, and object path
Build the HTTP request message
Wrap it in TCP segments and establish the TCP connection (three-way handshake)
The client sends the request
The server responds
Close the TCP connection (four-way wave)

To keep the connection open, add Connection: keep-alive to the request/response headers
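The address-parsing step at the start of this sequence can be sketched with the standard library's `urllib.parse.urlsplit`. The helper `describe` and the `example.com` URLs are placeholders for illustration; DNS resolution and the connection itself are left out.

```python
from urllib.parse import urlsplit

# Default ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url):
    """Split a URL into the pieces a client needs before connecting."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "path": parts.path or "/",
        "query": parts.query,
    }


print(describe("https://example.com/a/b?x=1"))
```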

How TLS works

Confidentiality: asymmetric encryption protects the session key, and symmetric encryption with that session key protects the transmitted data
Integrity: the data is sent encrypted together with its digest. After decryption, the receiver computes the digest of the data and compares it with the digest that was sent
Digital certificate: the client uses the CA's public key to verify that the public key certificate sent by the server can be trusted

The four steps of the TLS 1.2 handshake

The client sends a request that includes: a random value (used to generate the session key), the supported protocol versions, a list of cipher suites (each naming a key exchange algorithm, a symmetric encryption algorithm, and a hash digest algorithm), and the supported compression methods
The server receives the request and replies with: a random value (used to generate the session key), the protocol version chosen, the cipher suite chosen, and the server's certificate
The client receives the server's certificate and verifies it. If the certificate is valid, the client sends: a newly generated random value (encrypted with the public key from the certificate), a change cipher spec notification (subsequent messages will use the agreed encryption method and key), and a client handshake finished notification (a hash of all previous handshake content, for the server to verify)
The server decrypts the third random value from the client, combines it with the two random values exchanged earlier in the clear, and computes the session key for this session. Its final response includes: a change cipher spec notification (subsequent messages will use the agreed encryption method and key), and a server handshake finished notification (a hash of all previous handshake content, for the client to verify)

Improvements in TLS 1.3

TLS 1.3 removes a large number of the algorithms allowed in TLS 1.2 to improve security. It also saves the time of generating keys again by resuming sessions, and achieves 0-RTT connections with a PSK (pre-shared key): on resumption, the client sends application data together with its first handshake message

How does the client know that the certificate sent by the server has not been hijacked or tampered with

In short, the client uses the CA's public key to decrypt the digital signature in the certificate sent by the server, then compares the result with the digest of the server public key in the certificate. If they match, the certificate has not been tampered with

How does a client verify that a certificate is valid

The server operator registers its public key with the CA
The CA digitally signs the server's public key with its own private key and issues a certificate. Digital signature: the server's public key is hashed into a digest, and the digest is encrypted with the CA's private key
The server sends the public key certificate to the client, and the client verifies it
The client uses the CA's public key to decrypt the digital signature in the certificate, obtaining the digest of the server's public key
The client computes the digest of the server's public key itself and compares it with the digest from the previous step
Once the public key is verified, the client encrypts its message with it and sends it. The server decrypts the message with its own private key
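The sign-then-compare idea in these steps can be sketched with textbook RSA over a SHA-256 digest. Everything here is a toy under stated assumptions: the CA key is built from the two Mersenne primes 2^61 - 1 and 2^31 - 1, the digest is reduced modulo `n`, and the byte string standing in for the server's public key is made up. Real certificates use far larger keys and padded signature schemes.

```python
import hashlib

# Toy CA key pair (illustration only, far too small to be secure).
p, q = 2**61 - 1, 2**31 - 1
n = p * q
phi = (p - 1) * (q - 1)
e = 65537                    # CA public exponent
d = pow(e, -1, phi)          # CA private exponent; needs Python 3.8+


def digest_as_int(data):
    """Hash the data and reduce the digest so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data):
    """CA side: encrypt the digest with the CA private key."""
    return pow(digest_as_int(data), d, n)


def verify(data, signature):
    """Client side: decrypt with the CA public key and compare digests."""
    return pow(signature, e, n) == digest_as_int(data)


server_public_key = b"hypothetical server public key bytes"
signature = sign(server_public_key)
print(verify(server_public_key, signature))
```

If a middleman swaps in a different public key, the recomputed digest no longer matches the one recovered from the signature, and verification fails.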

Common digest algorithms: SHA-384 and SHA-256. The number is the length of the hash value in bits
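The relation between the algorithm name and the digest length is easy to confirm with Python's `hashlib`. The helper name `digest_bits` is an assumption made for this short sketch.

```python
import hashlib


def digest_bits(name):
    """Return the digest length of a hashlib algorithm in bits."""
    return hashlib.new(name).digest_size * 8


for name in ("sha256", "sha384"):
    print(name, digest_bits(name), "bits")
```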

Differences between GET and POST

GET is idempotent and has no side effects, whereas POST modifies resources on the server
GET passes parameters by appending them to the URL; POST passes parameters in the body
GET responses are cached and POST responses are not, so navigating back re-sends a POST but not a GET
URLs have a length limit, which affects GET requests. The limit is set by the browser, while the body length limit of a POST is usually set by the server
POST is slightly more secure than GET because GET parameters appear in the URL (although they can also be written in the body). Against packet capture the two are the same, since both are plain HTTP
It is often said that GET produces one TCP packet while POST produces two: for GET the browser sends headers and data together and the server responds with 200, whereas for POST the browser sends the headers, the server responds with 100 Continue, the browser sends the data, and the server responds with 200 OK. This is not part of the HTTP specification. The 100 Continue exchange happens only when the client sends an Expect: 100-continue header, and whether headers and body travel in separate TCP segments depends on the client implementation.
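The difference in where parameters travel can be seen by building, without sending, the two kinds of request with Python's `urllib`. The URL `http://example.com/search` and the parameter names are placeholders for this sketch.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "http", "page": "1"}

# GET: parameters are appended to the URL as a query string; there is no body.
get_request = Request("http://example.com/search?" + urlencode(params))

# POST: the same parameters are carried in the request body.
post_request = Request(
    "http://example.com/search",
    data=urlencode(params).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

`urllib` switches the method to POST as soon as `data` is supplied, which mirrors the usual convention that GET carries no body.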

What is the HTTP request URL length limit

The HTTP protocol has no limit on URL length, but different browsers have different length limits

What are the HTTP request methods

GET: obtains data from the server
POST: submits data to the server for processing
HEAD: retrieves only the response headers of a document from the server
PUT: Stores the body of the request on the server
DELETE: deletes data from the server
OPTIONS: preflight request used to ask which request methods and header fields are available

When is the OPTIONS request method used

In the W3C standard, the standard solution for cross-origin access is CORS (Cross-Origin Resource Sharing)
In CORS, there are two types of requests: simple and non-simple
A simple request satisfies both of the following conditions
- The request method is one of HEAD/GET/POST
- The HTTP headers do not go beyond the following fields:
  - Accept
  - Accept-Language
  - Content-Language
  - Last-Event-ID
  - Content-Type, limited to the values application/x-www-form-urlencoded, multipart/form-data, and text/plain
Any request that does not satisfy both conditions is a non-simple request
When a non-simple cross-origin request is sent, an extra HTTP query request, known as a preflight request, is made before the actual communication
The preflight request uses the OPTIONS method and asks the server which request methods and which headers it accepts
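The two conditions can be written down as a small check. This is a sketch of the rules as listed here, not the browser's actual algorithm: the function name `needs_preflight` is hypothetical, and only headers set by the page's own code are considered, since headers the browser adds itself (such as User-Agent) do not count.

```python
SIMPLE_METHODS = {"HEAD", "GET", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language", "last-event-id", "content-type"}
SIMPLE_CONTENT_TYPES = {"application/x-www-form-urlencoded", "multipart/form-data", "text/plain"}


def needs_preflight(method, headers):
    """Return True if a cross-origin request would trigger an OPTIONS preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SIMPLE_HEADERS:
            return True
        if lowered == "content-type":
            # Ignore parameters such as "; charset=utf-8" when comparing.
            media_type = value.split(";")[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False


print(needs_preflight("POST", {"Content-Type": "application/json"}))
```

A JSON POST is the most common case that fails the second condition and therefore causes a preflight.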

Data: www.ruanyifeng.com/blog/2016/0…

How does HTTP/2 multiplexing work

HTTP/2 splits the information originally carried by an HTTP message into smaller frames, encodes them in binary, and maps each frame to a message belonging to a particular stream.
On one TCP connection the two ends can send frames to each other continuously. The stream identifier of each frame says which stream the frame belongs to, and the receiver concatenates all frames of a stream, by stream identifier, into one complete block of data.
Each request or response is treated as a stream, so multiple requests become multiple streams. The data of these streams is divided into frames that are interleaved on a single connection. This is multiplexing in HTTP/2.
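The interleaving and reassembly can be modeled in a few lines. This is a conceptual sketch only: frames are represented as `(stream_id, payload)` tuples rather than real HTTP/2 binary frames, and the function names and the 4-byte frame size are assumptions made for illustration.

```python
from itertools import zip_longest


def split_into_frames(stream_id, message, frame_size):
    """Cut one message into frames tagged with its stream identifier."""
    return [(stream_id, message[i:i + frame_size]) for i in range(0, len(message), frame_size)]


def interleave(*frame_lists):
    """Send frames from several streams in turn over one connection."""
    mixed = []
    for group in zip_longest(*frame_lists):
        mixed.extend(frame for frame in group if frame is not None)
    return mixed


def reassemble(frames):
    """Receiver side: concatenate payloads per stream identifier."""
    streams = {}
    for stream_id, payload in frames:
        streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


wire = interleave(
    split_into_frames(1, b"GET /index.html", 4),
    split_into_frames(3, b"GET /style.css", 4),
)
print(reassemble(wire))
```

Because every frame names its stream, the receiver can rebuild both messages even though their frames arrived mixed together.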

Binary frames in HTTP/2

Structure is as follows

Stream characteristics:

Concurrency. Unlike HTTP/1, an HTTP/2 connection can carry frames from multiple streams at the same time. This is the basis of multiplexing.
Monotonic IDs. Stream IDs are not reused; they only increase, and when the upper limit is reached a new TCP connection is opened to start over.
Bidirectionality. Both client and server can create streams without interfering with each other, and both can act as sender or receiver.
Priority. Streams can be given priorities so that the server handles important resources first, improving the user experience.

What is the stream ID in a frame and what does it do

The stream identifier in the frame header indicates which stream the frame belongs to
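For reference, the HTTP/2 frame header is 9 bytes: a 24-bit payload length, an 8-bit type, an 8-bit flags field, one reserved bit, and the 31-bit stream identifier. The sketch below packs and parses that layout; the function names are hypothetical, and the sample values (type 0x1 for HEADERS, flag 0x4 for END_HEADERS, stream 3) are chosen only as an example.

```python
def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte HTTP/2 frame header."""
    return (
        length.to_bytes(3, "big")
        + bytes([frame_type, flags])
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")   # top bit is reserved
    )


def parse_frame_header(header):
    """Read length, type, flags, and stream identifier from a frame header."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


header = pack_frame_header(16, 0x1, 0x4, 3)
print(parse_frame_header(header))
```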

If a request takes the server a long time to process, how can the client receive the result as soon as the server finishes

The client keeps polling until the server has finished processing and returns the result
With WebSocket, the server notifies the client when the processing is complete

Can the request timeout be set to 24 hours

Yes. Both the server and the front end need to be configured accordingly

TCP and UDP

TCP: Transmission Control Protocol. UDP: User Datagram Protocol

TCP is connection-oriented (like dialing to establish a connection before talking on the phone); UDP is connectionless, meaning no connection needs to be established before data is sent
TCP provides a reliable service: data sent over a TCP connection arrives error-free, without loss or duplication, and in order. UDP is best-effort and does not guarantee reliable delivery
TCP is byte-stream oriented: it treats data as an unstructured stream of bytes. UDP is message (datagram) oriented
A TCP connection is strictly point-to-point. UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication
The TCP header costs at least 20 bytes; the UDP header is small, only 8 bytes
TCP's logical channel is a full-duplex reliable channel; UDP's is an unreliable channel
TCP has congestion control and flow control; UDP does not

The four-way wave

Once data transmission is complete, either party can release the connection. Suppose A's application process sends a connection release segment through its TCP, stops sending data, and actively closes the TCP connection. A sets FIN = 1 in the header of the release segment, with sequence number seq = u, and waits for B's acknowledgment
B sends an acknowledgment with ACK = 1 and ack = u + 1; the segment's own sequence number is seq = v. B's TCP notifies its upper-layer application process. The A-to-B direction of the connection is now released and the TCP connection is half-closed. If B sends data, A still receives it
When B has no more data to send to A, its application tells TCP to release the connection. B sends a segment with FIN = 1, ACK = 1, seq = w, and ack = u + 1
On receiving B's release segment, A must send an acknowledgment with ACK = 1, ack = w + 1, and seq = u + 1.
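The four segments and their numbers can be laid out as plain data to make the arithmetic visible. This is a bookkeeping sketch, not a TCP implementation: the function name `four_way_close` is hypothetical, and `u`, `v`, and `w` are the sequence numbers A and B happen to be at when each segment is sent.

```python
def four_way_close(u, v, w):
    """List the four segments exchanged when A closes first."""
    return [
        {"from": "A", "FIN": 1, "seq": u},                          # A asks to close
        {"from": "B", "ACK": 1, "seq": v, "ack": u + 1},            # B acknowledges; half-closed
        {"from": "B", "FIN": 1, "ACK": 1, "seq": w, "ack": u + 1},  # B has no more data
        {"from": "A", "ACK": 1, "seq": u + 1, "ack": w + 1},        # A acknowledges B's FIN
    ]


for segment in four_way_close(u=100, v=300, w=320):
    print(segment)
```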

The three-way handshake and the purpose of each step

The three-way handshake

A's TCP sends a connection request segment to B. In the header the synchronization bit is SYN = 1 and the sequence number is seq = x, which is A's initial sequence number
After receiving the connection request segment, B's TCP sends an acknowledgment if it agrees to the connection.
- In the acknowledgment segment, B sets SYN = 1, ACK = 1, the acknowledgment number ack = x + 1, and its own sequence number seq = y.
After receiving this segment, A sends B an acknowledgment with ACK = 1 and acknowledgment number ack = y + 1. A's TCP notifies its upper-layer application process that the connection is established.
After receiving A's acknowledgment, B's TCP also notifies its upper-layer application process that the TCP connection is established.
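The sequence and acknowledgment numbers of the handshake follow a fixed pattern, which the sketch below writes out as data. It only models the numbering: the function name `three_way_handshake` is hypothetical, and `x` and `y` stand for the initial sequence numbers chosen by A and B.

```python
def three_way_handshake(x, y):
    """List the three segments exchanged when A opens a connection to B."""
    return [
        {"from": "A", "SYN": 1, "seq": x},                          # connection request
        {"from": "B", "SYN": 1, "ACK": 1, "seq": y, "ack": x + 1},  # B agrees and acknowledges x
        {"from": "A", "ACK": 1, "seq": x + 1, "ack": y + 1},        # A acknowledges y
    ]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm that their numbers were received.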

The purpose of each handshake

First handshake: the client sends a packet and the server receives it. The server can conclude that the client's sending ability and the server's receiving ability are normal.
Second handshake: the server sends a packet and the client receives it. The client can conclude that the sending and receiving abilities of both the server and the client are normal.
Third handshake: the client sends a packet and the server receives it. The server can now conclude that the client's receiving ability and the server's own sending ability are normal as well.

The purpose of the three-way handshake

The client and server must establish a connection before communicating, and the handshake confirms that the sending and receiving abilities of both the client and the server are normal
It prevents a stale connection request segment that suddenly arrives at the server from causing errors
It exchanges the initial sequence numbers through seq and ack

Why can't the TCP handshake take only one or two steps

If only two handshakes were required, the client would still wait for the server's response before entering the ESTABLISHED state, but the server would enter ESTABLISHED as soon as it received the connection request. Suppose the network is congested and the client's connection request is delayed; the client times out and re-sends the request. The server receives the re-sent request and acknowledges it, the two sides communicate, and the connection is released afterwards. If the delayed request then finally reaches the server, the server enters ESTABLISHED again on receiving it, because only two handshakes are needed, and waits to receive data or starts sending data. The client, however, is already CLOSED, so the server waits forever and wastes connection resources
If only one handshake were required, the client could not tell whether the connection had been established, because it would receive no response from the server after sending the request

Features of HTTP/2

Multiplexing, built on binary framing
Header compression, using the HPACK algorithm with an index table
Server push
Request prioritization

How to use HTTP/2 and what it requires

How to use
- Set up an HTTP/2 server with Nginx
Requirements
- Both the server and the client support HTTP/2
- The site must be served over HTTPS (an encrypted connection over TLS 1.2 or above)
- The server's OpenSSL version must be 1.0.2 or later

HTTP/3 uses the QUIC protocol, which is based on UDP

Both head-of-line blocking in HTTP/1 and the HTTP/2 problem where one lost packet makes every other multiplexed request wait come from TCP, and TCP also takes considerable time to establish a connection

Is an HTTP request a thread or a process

Each HTTP request needs to be controlled using one thread

Does the browser limit the number of concurrent HTTP requests, and what is its policy

Yes. Browsers open at most about six to eight TCP connections per domain name

How does HTTPS ensure data security

The CA certificate is used to ensure server authenticity
Asymmetric encryption is used to ensure the confidentiality of session key
Symmetric encryption is used to ensure the confidentiality of sessions
Digital signature is used to ensure data integrity

How does HTTP/2 eliminate head-of-line blocking

Each request or response is called a message, and each message is split into several frames for transmission. Each frame carries the identifier of the stream it belongs to, and multiple streams can exist on one connection. Frames of different streams are sent independently and interleaved on the connection, then reassembled into messages on arrival, so one slow request or response does not block the others at the HTTP layer.

Which OSI layers do TCP and IP belong to

TCP belongs to the transport layer and handles communication between applications (ports distinguish different applications). IP belongs to the network layer and handles routing and forwarding of packets (IP addresses make it possible to choose a forwarding path), that is, communication between hosts

Common HTTP headers (request headers, response headers)

Common request headers

Request header	Description	Example
Accept	Acceptable response Content Types	Accept: text/plain
Accept-Charset	Acceptable character set	Accept-Charset: utf-8
Accept-Encoding	The encoding of acceptable response content.	Accept-Encoding: gzip, deflate
Accept-Language	List of acceptable response content languages.	Accept-Language: en-US
Authorization	Authentication information of the resource to be authenticated in HTTP	Authorization: Basic OSdjJGRpbjpvcGVuIANlc2SdDE==
Cache-Control	Used to specify whether caching is used in the current request/reply.	Cache-Control: no-cache
Connection	The type of connection that the client (browser) wants to use preferentially	Connection: keep-alive
Cookie	An HTTP cookie previously set by the server via Set-Cookie	Cookie: $Version=1; Skin=new;
Content-Length	The length of the request body in octets (8-bit bytes)	Content-Length: 348
Content-Type	MIME type of the request body (for POST and PUT requests)	Content-Type: application/x-www-form-urlencoded
If-None-Match	Allows 304 Not Modified to be returned if the content has not been modified; used together with ETag	If-None-Match: "9jd00cdj34pss9ejqiw39d82f20d0ikd"
If-Modified-Since	Allows 304 Not Modified to be returned if the resource has not been modified since the given time	If-Modified-Since: Sat, 26 Dec 2015 17:30:00 GMT
Origin	Initiates a cross-origin resource sharing request (the server must include an Access-Control-Allow-Origin header in the response to indicate which origins are allowed)	Origin: www.itbilu.com
User-Agent	The browser's identification string	User-Agent: Mozilla/…
Referer	Represents the previous page visited by the browser, which can be considered to have been brought to the current page by links from previous visits	Referer: itbilu.com/nodejs

Common response headers

Response header	Description	Example
Access-Control-Allow-Origin	Specifies which web sites may take part in cross-origin resource sharing	Access-Control-Allow-Origin: *
Age	The duration, in seconds, of the response object in the proxy cache	Age: 12
Allow	Valid actions for a particular resource;	Allow: GET, HEAD
Cache-Control	Tells every cache between the server and the client whether it may cache the object and for how long, in seconds	Cache-Control: max-age=3600
Connection	The options expected for the connection	Connection: close
ETag	An identifier for a particular version of a resource, usually a message hash	ETag: "737060cd8c284d8af7ad3082f209582d"
Expires	Specify a date/time after which this response is considered expired	Expires: Thu, 01 Dec 1994 16:00:00 GMT
Last-Modified	The last modification date of the requested object	Last-Modified: Sat, 26 Dec 2015 17:30:00 GMT
Location	Used when redirecting, or when a new resource is created.	Location: www.itbilu.com/nodejs

More: www.cnblogs.com/honghong87/…

TCP/IP reference model

The application layer, which handles specific application protocols such as HTTP, FTP, SMTP, and DNS
The transport layer, which handles communication between applications, with protocols such as TCP and UDP
The internet layer, which handles forwarding packets across networks, with protocols such as IP
The network interface layer, which handles the concrete signal representation (physical layer) and the delivery of data from one node to the next (data link layer)

Compared with the OSI seven-layer reference model, the presentation and session layers between the application and transport layers are dropped, and the data link and physical layers are merged into the network interface layer

Why does TCP need three handshakes to establish a connection and four waves to close it

Three handshakes

With only one handshake, the client cannot tell whether the connection was created successfully
With only two handshakes, the server would unilaterally create a connection when a connection request arrives after being delayed in the network for some time
With the third handshake, the server does not establish the connection unless the client confirms the server's reply with an ACK

Four waves

After the second wave the server may still have data to transmit. The server therefore first acknowledges the client's FIN, after which the client sends no more data, and only sends its own FIN as the third wave once its data transfer is complete
In the fourth wave, the client receives the server's FIN and replies with an ACK, and the connection is closed

How does the client determine that the public key delivered by the server has not been tampered with by a man in the middle

The server sends a public key certificate, which contains the public key and a digest of the public key encrypted (signed) with the CA's private key
After obtaining the certificate, the client uses its locally installed CA public key to decrypt the signed digest, and computes a digest of the public key contained in the certificate itself
If the decrypted digest matches the computed one, the public key has not been tampered with by a man in the middle
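
A minimal sketch of this check with Node's built-in crypto module. The CA key pair is generated on the spot and the "certificate" is reduced to a key plus a signature, both assumptions made for illustration; real certificates are X.509 structures validated by the TLS stack.

```javascript
const crypto = require('crypto');

// Hypothetical CA key pair; a real client ships with trusted CA public keys.
const ca = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// The "certificate" here is just the server's public key plus the CA's signature over it.
const serverPublicKey = Buffer.from('server-public-key-bytes');
const signature = crypto.sign('sha256', serverPublicKey, ca.privateKey);

// Client side: hash the received key and check it against the signature
// using the CA public key.
function isUntampered(receivedKey, sig, caPublicKey) {
  return crypto.verify('sha256', receivedKey, caPublicKey, sig);
}

console.log(isUntampered(serverPublicKey, signature, ca.publicKey));              // true
console.log(isUntampered(Buffer.from('attacker-key'), signature, ca.publicKey)); // false
```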

Which protocols does the transport layer have

TCP: a reliable byte-stream transport protocol
UDP: an unreliable datagram transport protocol

What is the function of the IP protocol at the network layer

It adds an IP header, which contains the source and destination addresses, to the data and encapsulates it into an IP datagram. Routers use this header to forward the datagram, which makes it possible to send data from host A to host B across networks

What fields does the TCP header contain and what do they represent

Source port and destination port fields: 2 bytes each. Ports are the service interface between the transport layer and the application layer. The transport layer's multiplexing and demultiplexing are implemented through ports
Sequence number field (seq): 4 bytes. Every byte in the data stream carried over a TCP connection is numbered. The value of this field is the sequence number of the first byte of data in this segment
Acknowledgment number field (ack): 4 bytes. It is the sequence number of the first byte of data expected in the next segment from the other party
Data offset (header length): 4 bits. It indicates how far the start of the data in the TCP segment is from the start of the segment. The unit is a 32-bit word (4 bytes)
Reserved field: 6 bits, reserved for future use and currently set to 0
URG (urgent): when URG = 1, the urgent pointer field is valid. It tells the system that this segment contains urgent data that should be delivered as soon as possible (equivalent to high-priority data)
ACK (acknowledgment): the acknowledgment number field is valid only when ACK = 1. When ACK = 0, the acknowledgment number is invalid
PSH (push): when the receiving TCP gets a segment with PSH = 1, it delivers the data to the receiving application as soon as possible instead of waiting until the whole buffer fills up
RST (reset): when RST = 1, a serious error has occurred on the TCP connection (for example, a host crash), and the connection must be released and then re-established
SYN (synchronize): SYN = 1 indicates that this is a connection request or connection accept segment
FIN (finish): used to release a connection. FIN = 1 indicates that the sender of this segment has finished sending data and asks to release the connection
Window field: 2 bytes. It tells the peer how many bytes the sender of this segment is currently able to receive, and the peer uses it as the basis for setting its send window
Checksum: 2 bytes. The checksum covers both the header and the data. When it is calculated, a 12-byte pseudo-header is prepended to the TCP segment
Urgent pointer field: 16 bits. It gives the number of bytes of urgent data in this segment (urgent data is placed at the front of the segment's data). It is meaningful only when URG = 1
Options field: variable length. TCP initially defined only one option, the maximum segment size (MSS). The MSS tells the peer: “The maximum length of the data field in a segment that my buffer can receive is MSS bytes.”

The maximum segment size (MSS) is the maximum length of the data field in a TCP segment. The data field plus the TCP header equals the whole TCP segment, so the MSS is the TCP segment length minus the TCP header length

Padding field: pads the header so that its total length is a multiple of 4 bytes.
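
To make the layout concrete, here is a small sketch that reads the fixed 20-byte part of a header with a DataView. parseTcpHeader and the sample bytes are made up for illustration; options and checksum validation are left out.

```javascript
// Parse the fixed 20-byte part of a TCP header from a Uint8Array.
function parseTcpHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const offsetAndFlags = view.getUint16(12); // 4-bit data offset, reserved bits, flag bits
  return {
    sourcePort: view.getUint16(0),
    destinationPort: view.getUint16(2),
    sequenceNumber: view.getUint32(4),
    acknowledgmentNumber: view.getUint32(8),
    headerLength: (offsetAndFlags >> 12) * 4, // data offset is counted in 4-byte words
    flags: {
      URG: Boolean(offsetAndFlags & 0x20),
      ACK: Boolean(offsetAndFlags & 0x10),
      PSH: Boolean(offsetAndFlags & 0x08),
      RST: Boolean(offsetAndFlags & 0x04),
      SYN: Boolean(offsetAndFlags & 0x02),
      FIN: Boolean(offsetAndFlags & 0x01),
    },
    window: view.getUint16(14),
    checksum: view.getUint16(16),
    urgentPointer: view.getUint16(18),
  };
}

// Hypothetical SYN segment: port 50000 -> 80, seq 1, header length 20 bytes.
const sampleHeader = new Uint8Array([
  0xc3, 0x50, 0x00, 0x50, // source port 50000, destination port 80
  0x00, 0x00, 0x00, 0x01, // sequence number 1
  0x00, 0x00, 0x00, 0x00, // acknowledgment number 0
  0x50, 0x02, 0xff, 0xff, // data offset 5, SYN flag, window 65535
  0x00, 0x00, 0x00, 0x00, // checksum (not computed here), urgent pointer 0
]);
const parsed = parseTcpHeader(sampleHeader);
console.log(parsed.sourcePort, parsed.destinationPort, parsed.headerLength, parsed.flags.SYN);
// 50000 80 20 true
```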

How does TCP detect packet loss

Checksum: the checksum in the header reveals segments that were corrupted in transit
Sequence number: TCP uses cumulative acknowledgment. The ACK that TCP sends carries the sequence number of the next byte it expects to receive. If the sequence number of an arriving segment is not the expected one, data has been lost
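
A rough sketch of cumulative acknowledgment from the receiver's point of view. cumulativeAck and the { seq, length } segment shape are assumptions for illustration; real TCP also handles wrap-around and selective acknowledgment.

```javascript
// Receiver-side sketch: compute the cumulative ACK number from the segments
// received so far. Each segment is { seq, length } in bytes (hypothetical shape).
function cumulativeAck(initialSeq, segments) {
  let expected = initialSeq; // next byte the receiver is waiting for
  const sorted = [...segments].sort((a, b) => a.seq - b.seq);
  for (const { seq, length } of sorted) {
    if (seq > expected) break; // gap: an earlier segment is missing
    expected = Math.max(expected, seq + length);
  }
  return expected;
}

// Bytes 0-99 and 200-299 arrived, bytes 100-199 did not: the ACK stays at 100.
console.log(cumulativeAck(0, [{ seq: 0, length: 100 }, { seq: 200, length: 100 }])); // 100
```

A sender that keeps seeing the same ACK number can conclude that the segment starting at that number was lost.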

What do the status codes 304 and 403 stand for

304: When the client holds a cached copy that may have expired, it asks the server whether the copy can still be used, sending the copy's ETag, modification time, and similar information. 304 tells the client that the cached copy can be reused. This is the negotiated cache
403: The server refuses access to the requested resource. It understood the request, but the request does not satisfy the conditions for access (for example, insufficient permissions)

How to set cookies

Setting a cookie on the server

The server adds a Set-Cookie field to the response header

`Set-Cookie: name=value; Path=/; Domain=.sugarat.top; HttpOnly; Secure; SameSite=Lax`

On the client, a cookie is set in the form of a key-value pair

document.cookie = 'key=value'

What are the attributes of a cookie

Common attributes

Cookie attribute	Description	Notes
name	key	—
value	value	—
Expires	Expiration time	A fixed point in time. If omitted, the cookie is a session cookie, that is, it expires when the browser is closed
Max-Age	Expiration time	Lifetime in seconds. If the value is greater than 0, the cookie is persisted until that many seconds have passed. If the value is 0 or negative, the cookie expires immediately. Max-Age takes precedence over Expires when both are set
Domain	Host name (domain) the cookie is sent to	If not specified, it defaults to the host of the current document, and subdomains are not included. If it is set to .a.com, then a.com, a.a.com, b.a.com, and so on can access the cookie, that is, a.com and any of its subdomains
Path	Effective path	The path of the requested URL must begin with this path for the cookie to be sent
Secure	Security flag	The cookie is sent only over HTTPS
HttpOnly	Security flag	Scripts cannot read or modify the cookie, which helps mitigate XSS attacks
SameSite	Security flag	Controls whether the cookie is sent with cross-site requests. A cookie can be set so that it is not sent on cross-site requests, which helps prevent cross-site request forgery (CSRF) attacks
Priority	Priority	Three priorities are defined: Low, Medium, and High. When the number of cookies exceeds the browser's limit, cookies with lower priority are cleared first

Values of the SameSite attribute

Strict: Cookies are sent only in first-party (same-site) requests, that is, only when the site of the current page is the same as the site of the request target
Lax: Cookies are also sent with some third-party requests, namely GET requests that navigate to the target URL
None: Cookies are sent with both same-site and cross-site requests (the Secure attribute must also be set)
The default used to be None; since Chrome Stable 80 the default is Lax
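
As a rough sketch, a Set-Cookie value with these attributes could be assembled like this. serializeCookie and its option names are hypothetical, not a library API, and the Secure check mirrors what current browsers require for SameSite=None.

```javascript
// Build a Set-Cookie header value from a name, a value, and optional attributes.
function serializeCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (options.maxAge !== undefined) parts.push(`Max-Age=${options.maxAge}`);
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.httpOnly) parts.push('HttpOnly');
  if (options.secure) parts.push('Secure');
  if (options.sameSite) {
    // A SameSite=None cookie without Secure would be rejected, so fail early.
    if (options.sameSite === 'None' && !options.secure) {
      throw new Error('SameSite=None requires Secure');
    }
    parts.push(`SameSite=${options.sameSite}`);
  }
  return parts.join('; ');
}

const cookieHeader = serializeCookie('token', 'abc 123', {
  maxAge: 3600, path: '/', httpOnly: true, secure: true, sameSite: 'Lax',
});
console.log(cookieHeader);
// token=abc%20123; Max-Age=3600; Path=/; HttpOnly; Secure; SameSite=Lax
```

On a Node server the result would be passed to res.setHeader('Set-Cookie', cookieHeader).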

What is MSL

MSL stands for Maximum Segment Lifetime: the longest time a segment can exist in the network before it is discarded. RFC 793 defines it as two minutes
The difference between MSL and the TTL in the IP header: MSL is measured in time, while TTL is the number of router hops. MSL should therefore be greater than or equal to the time it takes for the TTL to reach 0, to ensure that the segment has died naturally.

Why does TCP have to wait 2MSL after the last wave before entering the CLOSED state

To let segments from the old connection die out in the network
To make sure the final ACK reaches the server. If that ACK is lost, the server retransmits its FIN after a timeout; the client, still waiting, can reply with the ACK again (and restart the 2MSL timer), which lets the server enter the CLOSED state.

Application scenarios of TCP and UDP

TCP: suits applications that require reliable transmission, such as file transfer
UDP: suits real-time applications (IP telephony, video conferencing, live streaming, and so on)

How to work around the browser's limit on the number of concurrent HTTP requests to the same domain name

With HTTP/2, only one TCP connection needs to be established, and multiple requests and responses are transmitted over it as frames
Bind multiple domain names to the same server and use different domain names for front-end requests. The browser's limit on concurrent TCP connections (typically 6 to 8) applies per domain name, so the total number of concurrent requests increases

Can the three-way handshake carry data

The first and second handshakes ([SYN] and [SYN, ACK]) do not normally carry data (see RFC 793)
After receiving the SYN + ACK segment from the server, the client allocates the resources for the TCP connection. At this point the client has established the connection and knows that the server is reachable, so the third segment (the ACK) can carry data

UDP header format

Source port: This field occupies the first 16 bits of the UDP header and usually contains the UDP port used by the application that sends the datagram. The receiving application uses its value as the destination port when sending a response. The field is optional, so the sending application may leave its port number out, in which case the field is set to 0 and the receiving application cannot send a response.
Destination port: a 16-bit field holding the port used by the UDP software on the receiving computer.
Length: a 16-bit field giving the length of the UDP datagram, including the UDP header and the UDP data. Because the UDP header is 8 bytes long, the minimum value is 8.
Checksum: a 16-bit field used to check whether the data was corrupted in transit.
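
The four fields can be read from the first 8 bytes of a datagram, as in the sketch below. parseUdpHeader and the sample bytes are made up for illustration, and the checksum is read but not verified.

```javascript
// Parse the 8-byte UDP header from a Uint8Array holding a whole datagram.
function parseUdpHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const length = view.getUint16(4);
  if (length < 8) throw new RangeError('UDP length must be at least 8');
  return {
    sourcePort: view.getUint16(0),      // 0 means "no source port given"
    destinationPort: view.getUint16(2),
    length,                             // header plus data, in bytes
    checksum: view.getUint16(6),
    payloadLength: length - 8,
  };
}

// Hypothetical datagram: port 12345 -> 53, 4 bytes of payload.
const datagram = new Uint8Array([0x30, 0x39, 0x00, 0x35, 0x00, 0x0c, 0x00, 0x00, 1, 2, 3, 4]);
const udp = parseUdpHeader(datagram);
console.log(udp.sourcePort, udp.destinationPort, udp.length, udp.payloadLength);
// 12345 53 12 4
```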

How does TCP ensure reliable transmission

Reliable: guarantees that the byte stream the receiving process reads from its buffer is the same as the byte stream the sender sent

Checksum: covers the TCP header and data. If a transmission error is detected, the segment is discarded without being acknowledged, and the sender retransmits it
Retransmission: when the sender does not receive an ACK from the receiver before the timer expires, it treats the segment as lost and retransmits it. It also retransmits when it receives three duplicate acknowledgments for a segment (fast retransmit)
Sequence numbers: every segment carries a sequence number, and cumulative acknowledgment is used so that the receiver can put the data in order

HTTP caching mechanism

First, in general only resources obtained with a GET request are cached. To sum up:

When loading a resource, the browser first uses the Expires and Cache-Control fields in the response header of the cached copy to decide whether the strong cache is hit. If it is, the browser reads the resource directly from the cache without sending a request to the server
If the strong cache is not hit, the browser must send a request to the server, which uses Last-Modified and ETag to check whether the negotiated cache is hit. If it is, the server responds with 304 and no resource data, and the browser reads the resource from its cache
If neither is hit, the browser loads the resource from the server

Cache-Control

First, the Cache-Control and Pragma header fields

Pragma: no-cache is an HTTP/1.0 field kept for backward compatibility, while Cache-Control was introduced in HTTP/1.1. Therefore Pragma: no-cache works with both HTTP/1.0 and HTTP/1.1, while Cache-Control: no-cache works only with HTTP/1.1.

The following describes the main values of the Cache-Control field

Value	Effect
public	The resource can be cached by any node along the way (the client and proxy servers)
private	Only the client may cache the resource; shared caches such as proxy servers may not
max-age=<seconds>	The cached content expires after the given number of seconds
s-maxage=<seconds>	Like max-age, but it takes effect only on shared caches such as proxy servers and CDN caches, where it has a higher priority than max-age
no-cache	The resource may be cached, but the cache must check with the server every time before using it, that is, go straight to the negotiated cache phase. Freshness is then controlled by the ETag or Last-Modified field rather than by Cache-Control. Roughly equivalent to max-age=0, must-revalidate
no-store	Nothing is cached: neither the strong cache nor the negotiated cache is used
must-revalidate	The resource may be cached, but once it becomes stale it must be revalidated with the origin server before use
proxy-revalidate	Like must-revalidate, but it applies only to intermediate (shared) cache servers
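
A small sketch of how such a header value breaks down into directives. parseCacheControl is a hypothetical helper, and quoted arguments are not handled.

```javascript
// Split a Cache-Control value into an object of directives.
// Flags such as "public" become true; numeric arguments become numbers.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    const name = rawName.toLowerCase();
    if (rawValue === undefined) {
      directives[name] = true;
    } else {
      const number = Number(rawValue);
      directives[name] = Number.isNaN(number) ? rawValue : number;
    }
  }
  return directives;
}

const directives = parseCacheControl('public, max-age=3600, s-maxage=600');
console.log(JSON.stringify(directives)); // {"public":true,"max-age":3600,"s-maxage":600}
```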

Strong cache

When the strong cache is hit, the data is read from the browser's cache only

When loading a resource, the browser first uses the Expires and Cache-Control fields in the response header of the cached copy to decide whether the strong cache is hit. If it is, the browser reads the resource directly from the cache without sending a request to the server

The main steps

Check the max-age directive of Cache-Control and compare it with the Date header to see whether the cache has expired
If there is no max-age, check whether there is an Expires header, and compare its value with the Date header to determine whether the cache is still valid.

If neither max-age nor Expires is present, look for Last-Modified in the header. If it is there, the lifetime of the cache equals the Date value minus the Last-Modified value, divided by 10
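
The three steps can be sketched as one function. freshnessLifetime, isFresh, and the lower-case header object are assumptions for illustration; a real browser cache applies more rules than this.

```javascript
// Freshness lifetime in seconds, following the three steps above.
// `headers` is a plain object with lower-case header names (hypothetical shape).
function freshnessLifetime(headers) {
  const match = /max-age=(\d+)/.exec(headers['cache-control'] || '');
  if (match) return Number(match[1]);

  const date = Date.parse(headers['date']);
  if (headers['expires']) {
    return (Date.parse(headers['expires']) - date) / 1000;
  }
  if (headers['last-modified']) {
    // Heuristic: one tenth of the time since the resource was last modified.
    return (date - Date.parse(headers['last-modified'])) / 1000 / 10;
  }
  return 0; // nothing to go on: treat the copy as already stale
}

// The strong cache is hit while the copy's age is below its freshness lifetime.
function isFresh(headers, nowMs) {
  const ageSeconds = (nowMs - Date.parse(headers['date'])) / 1000;
  return ageSeconds < freshnessLifetime(headers);
}

const cachedHeaders = {
  'date': 'Sat, 26 Dec 2015 17:30:00 GMT',
  'last-modified': 'Sat, 26 Dec 2015 07:30:00 GMT',
};
console.log(freshnessLifetime(cachedHeaders)); // 3600 (10 hours / 10, in seconds)
```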

Negotiated cache

A strong cache miss triggers the following steps

Look for the If-None-Match field in the request header; its value is the ETag response header that the server returned last time
If there is no If-None-Match in the request header, look for the If-Modified-Since field; its value is the Last-Modified response header that the server returned last time
If neither If-None-Match nor If-Modified-Since is present, request the data from the server directly.
If the request header contains If-None-Match or If-Modified-Since, the origin server validates the cached copy. If the resource on the origin server has not changed, it returns 304; if it has changed, it returns 200.

If-None-Match is compared with ETag, and If-Modified-Since is compared with Last-Modified
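
On the server side, the comparison might look like the sketch below. negotiate and the { etag, lastModified } record are hypothetical, and the ETag check is a plain string comparison that ignores weak validators and lists of tags.

```javascript
// Server-side sketch: decide between 304 and 200 for a request.
// `resource` is a hypothetical { etag, lastModified } record (lastModified in ms).
function negotiate(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];

  if (ifNoneMatch !== undefined) {
    // If-None-Match is checked first and compared with the current ETag.
    return ifNoneMatch === resource.etag ? 304 : 200;
  }
  if (ifModifiedSince !== undefined) {
    // HTTP dates have one-second resolution, so compare whole seconds.
    const since = Date.parse(ifModifiedSince);
    const modified = Math.floor(resource.lastModified / 1000) * 1000;
    return modified <= since ? 304 : 200;
  }
  return 200; // no validators: send the full response
}

const page = { etag: '"v1"', lastModified: Date.parse('Sat, 26 Dec 2015 17:30:00 GMT') };
console.log(negotiate({ 'if-none-match': '"v1"' }, page)); // 304
console.log(negotiate({ 'if-none-match': '"v2"' }, page)); // 200
```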

How do I control HTTP caching

Making good use of the HTTP caching mechanism reduces requests, uses more local resources, gives users a better experience, and also reduces pressure on the server. The best practice is to hit the strong cache as often as possible and to invalidate the client's cache when resources are updated
A common current strategy is to use the negotiated cache for HTML and the strong cache for CSS, JS, and images, with a hash value in the file names.
The reason is that, because HTML goes through the negotiated cache, the latest page is always obtained after an update. For CSS, JS, and images, the file name contains a hash added at build time, so if a file has been modified the requested URL is different, which amounts to a first request, and the latest resource is fetched from the server again
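
A sketch of this strategy as a single function. cacheControlFor, the extension list, and the one-year max-age are assumptions chosen for illustration.

```javascript
// Pick a Cache-Control value by file type, following the strategy above.
function cacheControlFor(filePath) {
  if (filePath.endsWith('.html')) {
    return 'no-cache'; // always revalidate, so a new page is picked up at once
  }
  if (/\.(css|js|png|jpe?g|gif|svg|webp)$/.test(filePath)) {
    // Safe to cache for a long time because a content change alters the hashed URL.
    return 'public, max-age=31536000';
  }
  return 'no-cache';
}

console.log(cacheControlFor('/index.html'));           // no-cache
console.log(cacheControlFor('/static/app.3f9a1c.js')); // public, max-age=31536000
```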

Setting the strong cache

res.setHeader('Cache-Control', 'public, max-age=xxx');

Setting the negotiated cache

res.setHeader('Cache-Control', 'public, max-age=0');	// The strong cache expires at once, so the negotiated cache is used; alternatively set 'public, no-cache' directly
res.setHeader('Last-Modified', xxx);
res.setHeader('ETag', xxx);

HTTP request composition

Both the request and the response are made up of four parts: the start line, the headers, an empty line, and the body (entity)

The general structure is shown in the figure
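
In text form, the four parts can be put together and taken apart like this. buildRequest and parseMessage are hypothetical helpers, and the host and path are placeholders.

```javascript
// Assemble a raw HTTP/1.1 request from its four parts.
function buildRequest(method, path, headers, body = '') {
  const startLine = `${method} ${path} HTTP/1.1`;
  const headerLines = Object.entries(headers).map(([key, value]) => `${key}: ${value}`);
  // Lines end with CRLF; an empty line separates the headers from the body.
  return [startLine, ...headerLines, '', body].join('\r\n');
}

// Split a raw message back into start line, header lines, and body.
function parseMessage(raw) {
  const separator = raw.indexOf('\r\n\r\n');
  const [startLine, ...headerLines] = raw.slice(0, separator).split('\r\n');
  return { startLine, headerLines, body: raw.slice(separator + 4) };
}

const rawRequest = buildRequest('POST', '/login', { Host: 'example.com', 'Content-Length': '3' }, 'a=1');
console.log(JSON.stringify(rawRequest));
// "POST /login HTTP/1.1\r\nHost: example.com\r\nContent-Length: 3\r\n\r\na=1"
```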

What is CDN and how does it work

A CDN (content delivery network, also called a content distribution network) is designed to reduce transmission delay by serving users from the nearest node

There are two core functions:

Caching: the process of copying resources to a CDN server
Back-to-origin (source retrieval): the process in which the CDN finds that it does not have the resource (usually because the cached data has expired) and turns to the origin server (or its upper-layer server) to get it
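
The two functions can be sketched together in a toy edge node. EdgeCache, its fetchFromOrigin callback, and the fixed TTL are assumptions for illustration; a real CDN derives lifetimes from response headers and fetches asynchronously.

```javascript
// Sketch of a CDN edge node: serve from cache, and go back to the origin
// on a miss or when the cached copy has expired.
class EdgeCache {
  constructor(fetchFromOrigin, ttlMs) {
    this.fetchFromOrigin = fetchFromOrigin; // hypothetical callback: url -> body
    this.ttlMs = ttlMs;
    this.store = new Map(); // url -> { body, expiresAt }
  }

  get(url, nowMs) {
    const entry = this.store.get(url);
    if (entry && entry.expiresAt > nowMs) {
      return { body: entry.body, source: 'cache' };
    }
    // Back-to-origin: fetch the resource, then cache the copy.
    const body = this.fetchFromOrigin(url);
    this.store.set(url, { body, expiresAt: nowMs + this.ttlMs });
    return { body, source: 'origin' };
  }
}

const edge = new EdgeCache((url) => `content of ${url}`, 1000);
console.log(edge.get('/a.js', 0).source);    // origin
console.log(edge.get('/a.js', 500).source);  // cache
console.log(edge.get('/a.js', 1500).source); // origin
```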

The basic principle of a CDN is to deploy many cache servers in the regions or networks where user visits are relatively concentrated. When users visit the website, global load balancing directs each user to the nearest cache server that is working normally, and that cache server responds to the user's request directly.

Without CDN acceleration, the general process for a user visiting a website through a browser is as follows:

The user enters the domain name to visit in the browser.
The browser asks the DNS server to resolve the domain name.
The DNS server returns the IP address of the domain name to the browser.
The browser uses this IP address to send a request to the server.
The server returns the requested content to the browser.

When the user visits a website that uses a CDN, the process looks like this:

The user enters the domain name www.processon.com in the browser. On the first visit the browser finds no local DNS cache for the domain name, so it sends a request to the website's DNS server.
The browser asks the DNS server to resolve the domain name. Because the domain name has been set up for the CDN, the DNS server hands resolution over, through a CNAME record, to the CDN's dedicated DNS server.
The CDN's DNS load-balancing system resolves the domain name and returns the IP address that responds fastest for the user.
The user sends a request to that IP address (a CDN server).
The CDN's load-balancing device selects a suitable cache server to serve the user.
The user sends a request to the cache server.
The cache server responds to the request and returns the content the user needs.

What happens from entering the URL to displaying the page

Parsing the URL

- A GET request carries its parameters in the URL, and a URL can contain only ASCII characters
- Other characters are percent-encoded, normally as UTF-8 (Chinese characters have also been encoded as GB2312 in some cases)
- encodeURIComponent can be used for encoding
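
A short sketch of both steps using the built-in URL class and encodeURIComponent, which are available in browsers and in Node. The address is a placeholder.

```javascript
// Break a URL into its components.
const parsedUrl = new URL('https://example.com:8080/search?q=hello#top');
console.log(parsedUrl.protocol, parsedUrl.hostname, parsedUrl.port, parsedUrl.pathname, parsedUrl.search, parsedUrl.hash);
// https: example.com 8080 /search ?q=hello #top

// Percent-encode non-ASCII and reserved characters as UTF-8 bytes.
const encoded = encodeURIComponent('中文 a&b');
console.log(encoded);                     // %E4%B8%AD%E6%96%87%20a%26b
console.log(decodeURIComponent(encoded)); // 中文 a&b
```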

Build the request line

GET / HTTP/1.1
Method, request path, protocol/version

Find the strong cache

- Check whether there is a strong cache. If there is, the cached resource is used and parsed directly

DNS resolution

- The browser first checks its own cache for an IP address for the domain name. If there is one, resolution ends
- Otherwise, check whether the operating system cache has a result (for example, the hosts file on Windows)
- Otherwise, ask the local domain name server (LDNS) to resolve the domain name. If it has no result, go to the next step
- Query the DNS root server

Establishing a TCP Connection

- Three-way handshake

Check whether an HTTPS request is made

- If yes, a TLS handshake is performed

The client sends an HTTP request, and the server processes the response and returns an HTTP packet

- 200: go on to the next step. 4xx or 5xx: report an error. 3xx: follow the redirect and start again from URL parsing
- If the body is gzip-compressed, decompress it first, then decode the file according to its character encoding

The browser parses and renders the page

If the browser determines from the Content-Type field that the response is a download, the request is handed to the browser's download manager and navigation for that URL ends. If it is HTML, the browser continues with the navigation process.
The browser assigns the page to a renderer process, which begins parsing the HTML file
- Lexical analysis: tokenization
- Syntax analysis: build a DOM tree
- Process allocation strategy: opening a new page normally creates a new renderer process. If the new page belongs to the same site as an existing page, the existing renderer process is reused; if not, a new process is created
Parse HTML (Hypertext Markup Language) --> DOM (Document Object Model) tree
- When a script tag is encountered, check whether it has the async or defer attribute
- async: the file is downloaded in parallel, and the JS is executed as soon as the download finishes
- defer: the file is downloaded in parallel, and executed in order after HTML parsing is complete
- Neither: HTML parsing is blocked until the JS has been downloaded and executed
- When a link is encountered, download and parse CSS (Cascading Style Sheets) --> CSSOM (CSS Object Model) tree
- Sources of style: stylesheets referenced by link tags, styles in style tags, and the inline style attribute of elements
- Property values are standardized

DOM tree + CSSOM tree --> layout tree: after the CSSOM tree and the DOM tree are built, the layout tree is generated (the concrete style of each DOM node is computed, applying the inheritance and cascade rules).

- The layout tree determines the size and position of the page elements (reflow happens here), and the pixel information of the elements is drawn (repaint happens here)
- The layout tree is split into layers, producing a layer tree. (Pages contain many complex effects; to make them easier to implement, the rendering engine generates dedicated layers for specific nodes and builds a corresponding layer tree)
- A paint list is generated for each layer and submitted to the rendering engine's compositor thread.
- The compositor thread divides the layers into tiles and converts the tiles into bitmaps in the raster thread pool. Rasterization is normally GPU-accelerated (the renderer process communicates with the GPU process)
- When all the tiles have been rasterized, the compositor thread sends a DrawQuad command to the browser process.
- The browser process generates the page from the DrawQuad message and displays it on the screen.

Reflow and repaint

Reflow: based on the generated render tree, reflow obtains the geometry (position and size) of the nodes. Computing the exact position and size of each visible DOM node within the device viewport is the reflow phase; to do this, the browser traverses the render tree starting from its root
Repaint: using the render tree and the geometry obtained from reflow, the nodes are turned into absolute pixels. Once the render tree has been generated and reflow is done, the geometry and style of every visible node are known, and each node of the render tree is converted into actual pixels on the screen. This phase is called repaint

From this relationship it follows that reflow always causes repaint, while repaint does not necessarily cause reflow

Disconnecting the TCP connection

- Four waves

TCP traffic control

Flow control is implemented with a sliding window. There are two main approaches: selective repeat and Go-Back-N

See: www.bilibili.com/video/BV19E…

Watch it and you will understand
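
For a rough idea without the video, the sketch below shows the window arithmetic and how the two approaches differ on retransmission. usableWindow and packetsToResend are hypothetical helpers, and packets are numbered by index rather than by byte to keep the example short.

```javascript
// Flow control: how many more bytes may the sender put on the wire?
// rwnd is the window advertised by the receiver (the Window header field).
function usableWindow(rwnd, lastByteSent, lastByteAcked) {
  const inFlight = lastByteSent - lastByteAcked;
  return Math.max(0, rwnd - inFlight);
}

// Retransmission: which packets are resent when packet `lost` times out?
// `nextSeq` is the number of the next packet that has not been sent yet.
function packetsToResend(protocol, lost, nextSeq) {
  if (protocol === 'go-back-n') {
    // The receiver keeps nothing out of order, so everything from `lost` on is resent.
    const packets = [];
    for (let seq = lost; seq < nextSeq; seq += 1) packets.push(seq);
    return packets;
  }
  // Selective repeat: the receiver buffers out-of-order packets,
  // so only the lost one is resent.
  return [lost];
}

console.log(usableWindow(4096, 3000, 1000));                // 2096
console.log(packetsToResend('go-back-n', 3, 7));            // [ 3, 4, 5, 6 ]
console.log(packetsToResend('selective-repeat', 3, 7));     // [ 3 ]
```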

TCP congestion control

There are four main algorithms

Slow start
Congestion avoidance
Fast retransmit
Fast recovery

See: juejin.cn/post/684490…
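
How the four algorithms move the congestion window can be sketched with a simplified, Reno-style model. nextWindow, the event names, and the starting values are assumptions for illustration; real stacks count bytes and differ in detail.

```javascript
// Simplified Reno-style congestion window, counted in segments.
function nextWindow(state, event) {
  let { cwnd, ssthresh } = state;
  if (event === 'ack-round') {
    // Slow start doubles cwnd each round trip (capped at ssthresh);
    // congestion avoidance adds one segment per round trip.
    cwnd = cwnd < ssthresh ? Math.min(cwnd * 2, ssthresh) : cwnd + 1;
  } else if (event === 'triple-dup-ack') {
    // Fast retransmit and fast recovery: halve, then continue from ssthresh.
    ssthresh = Math.max(Math.floor(cwnd / 2), 2);
    cwnd = ssthresh;
  } else if (event === 'timeout') {
    // Timeout: halve ssthresh and restart from slow start.
    ssthresh = Math.max(Math.floor(cwnd / 2), 2);
    cwnd = 1;
  }
  return { cwnd, ssthresh };
}

let windowState = { cwnd: 1, ssthresh: 8 };
const trace = [];
const events = ['ack-round', 'ack-round', 'ack-round', 'ack-round', 'triple-dup-ack', 'ack-round', 'timeout'];
for (const event of events) {
  windowState = nextWindow(windowState, event);
  trace.push(windowState.cwnd);
}
console.log(trace.join(' ')); // 2 4 8 9 4 5 1
```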

Some common front-end network knowledge points (constantly updated)