Preface

JS, Vue, React, Node?

No!

The obvious answer is that the front end has become really powerful and complex...

  • First you learn HTML, CSS, and JS
  • Then come frameworks like jQuery, Bootstrap, AngularJS 1.x, and Angular 2.x
  • Then Vue and React go mainstream
  • Mini programs are quietly born
  • Want cross-platform? React Native, Flutter
  • The server side is not spared either: Node pops up
  • ...

Add to that the various UI frameworks and build tools, and the front end is heading in every direction. My god, the pain!

But one body of knowledge is essential in the front-end field: the browser. Browser knowledge is especially important in today's front end and is frequently asked about in interviews: local storage, caching, networking, HTTP, TCP/IP, security, and so on. Today, let's learn the browser-related knowledge together.

Browser kernel and rendering engine

What are the common browser kernels?

  • Trident kernel: IE, 360, Sogou browser, etc. [also known as MSHTML]
  • Presto kernel: Opera 7 and above. [Opera's kernel was Presto; it is now Blink]
  • WebKit kernel: Safari, Chrome, etc. [Chrome now uses Blink, a fork of WebKit]

Why do we need to know the basics of browser kernels? Because every web browser, e-mail client, and other application that displays web content needs a kernel.

Browser rendering engine

Some people say the kernel consists of a CSS engine and a JS engine, but it is more accurate to say rendering engine and JS engine. Why? Because when a browser renders a web page, there are generally three steps:

  1. The browser uses its HTML parser to parse the HTML into a DOM tree.
  2. The browser uses its CSS parser to parse the CSS into a CSS rule tree (commonly known as the CSSOM tree).
  3. The browser uses the JS engine to parse and run the JS code, which works on the page through the DOM API or CSSOM API. It then constructs the render tree from the DOM tree and the CSSOM tree.

The final render tree is an abstract representation of the document structure of the entire page, and the browser renders the result from it. Taken literally, then, a "CSS engine" cannot fully cover the first two steps, so the kernel is better described as two parts:

  • Rendering engine
  • Js engine

Rendering engine: mainly responsible for fetching the content of the web page (HTML, XML, images, etc.), computing how the page should be displayed, and outputting the result to the screen. Different browser kernels interpret page syntax differently, so the rendered result can differ.

JS engine: responsible for parsing and executing JavaScript code to achieve the dynamic effects of web pages.

Local storage

  • cookie
  • sessionStorage
  • localStorage

Let's compare them in a table:

| Feature | cookie | localStorage | sessionStorage |
| --- | --- | --- | --- |
| Life cycle | Usually generated by the server; an expiration time can be set | Persistent until explicitly cleared | Lasts for the current session only |
| Storage size | About 4 KB | About 5 MB | About 5 MB |
| Cross-domain | Carried in same-origin HTTP requests. Cross-domain sending is not allowed by default: the client must set withCredentials = true and the server must allow it (CORS) | Cross-domain access is not allowed by default; postMessage can work around it | Cross-domain access is not allowed by default |
| Storage location | Stored by the browser and carried to the server in a header on every request | Hard disk | Memory |

One more thing to note: several cookie attributes matter for security

  • Value: should be encrypted if it is used to store the user's login state
  • HttpOnly: the cookie cannot be accessed through JS, which reduces the damage of XSS attacks
  • Secure: the cookie is only carried in requests that use HTTPS
  • SameSite: restricts the browser from carrying the cookie in cross-site requests, which reduces CSRF attacks
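To make the attributes above concrete, here is a minimal sketch of how a server might assemble a Set-Cookie header. The `CookieOptions` interface and the `buildSetCookie` helper are names made up for this example, not part of any library, and the cookie value is only URL-encoded here, not encrypted.

```typescript
// Hypothetical option names for this sketch; they mirror the attributes listed above.
interface CookieOptions {
  maxAgeSeconds?: number;
  path?: string;
  httpOnly?: boolean;
  secure?: boolean;
  sameSite?: "Strict" | "Lax" | "None";
}

// Build the value of a Set-Cookie response header.
function buildSetCookie(name: string, value: string, options: CookieOptions = {}): string {
  // URL-encode the value so characters such as ";" cannot break the header.
  const parts: string[] = [`${name}=${encodeURIComponent(value)}`];
  if (options.maxAgeSeconds !== undefined) parts.push(`Max-Age=${options.maxAgeSeconds}`);
  if (options.path !== undefined) parts.push(`Path=${options.path}`);
  if (options.httpOnly) parts.push("HttpOnly"); // not readable from JS
  if (options.secure) parts.push("Secure"); // only sent over HTTPS
  if (options.sameSite !== undefined) parts.push(`SameSite=${options.sameSite}`);
  return parts.join("; ");
}

const sessionCookie = buildSetCookie("sid", "abc123", {
  maxAgeSeconds: 3600,
  path: "/",
  httpOnly: true,
  secure: true,
  sameSite: "Lax",
});
// sessionCookie is "sid=abc123; Max-Age=3600; Path=/; HttpOnly; Secure; SameSite=Lax"
```

A login cookie would normally set all three of HttpOnly, Secure, and SameSite, as in the example.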

Browser cache

Simply put, browser caching means the browser stores resources obtained over HTTP locally.

Cache priority

  1. First look for the resource in memory
  2. If it is not in memory, look for it on the hard disk
  3. If it is not on the hard disk either, make a network request
  4. The requested resource is then cached to disk and memory
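The lookup order above can be sketched in a few lines. This is only an illustration: two Maps stand in for the browser's memory and disk caches, and `fetchFromNetwork` is a hypothetical stand-in for a real network request.

```typescript
// Two Maps stand in for the browser's memory cache and disk cache (a simplification).
const memoryCache = new Map<string, string>();
const diskCache = new Map<string, string>();
let networkRequests = 0;

// Hypothetical stand-in for a real network request.
function fetchFromNetwork(url: string): string {
  networkRequests += 1;
  return `body of ${url}`;
}

type ResourceSource = "memory" | "disk" | "network";

function loadResource(url: string): { body: string; source: ResourceSource } {
  // 1. Memory first.
  const inMemory = memoryCache.get(url);
  if (inMemory !== undefined) return { body: inMemory, source: "memory" };

  // 2. Then the hard disk.
  const onDisk = diskCache.get(url);
  if (onDisk !== undefined) return { body: onDisk, source: "disk" };

  // 3. Finally the network, and 4. cache the result to disk and memory.
  const body = fetchFromNetwork(url);
  diskCache.set(url, body);
  memoryCache.set(url, body);
  return { body, source: "network" };
}
```

Calling `loadResource` twice for the same URL hits the network only the first time; the second call is answered from memory.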

Classification of cache

  • Strong cache
  • Negotiated cache

Let's start with the basic logic

  1. When a client requests a resource, it first checks the HTTP headers of the cached resource to see whether the strong cache is hit. If it is, the client reads the resource directly from the local cache and does not send a request to the server
  2. When the strong cache is not hit, the client sends a request to the server. The server uses the request headers to check whether the negotiated cache is hit. If it is, the server returns 304 and tells the client to use its cached copy
  3. When the negotiated cache is not hit either, the server returns the resource to the client
  • A forced refresh with Ctrl + F5 loads directly from the server, skipping both the strong cache and the negotiated cache
  • A normal refresh with F5 skips the strong cache but still checks the negotiated cache
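The whole decision flow fits in one small function. This is a sketch under simplifying assumptions: the two booleans are assumed to be computed elsewhere, and the names `LoadMode`, `CacheOutcome`, and `resolveRequest` are invented for this example.

```typescript
type LoadMode = "navigate" | "refresh" | "hardRefresh"; // normal visit, F5, Ctrl + F5
type CacheOutcome = "strong-cache" | "304" | "200";

// strongCacheFresh: the cached copy has not expired yet.
// validatorsMatch: the server confirms the cached copy is still current.
function resolveRequest(mode: LoadMode, strongCacheFresh: boolean, validatorsMatch: boolean): CacheOutcome {
  // Ctrl + F5 skips both caches and always fetches the full resource.
  if (mode === "hardRefresh") return "200";
  // Only a normal visit may use the strong cache; F5 skips it.
  if (mode === "navigate" && strongCacheFresh) return "strong-cache";
  // Otherwise ask the server: 304 if the negotiated cache is hit, 200 if not.
  return validatorsMatch ? "304" : "200";
}
```

Here "strong-cache" means no request reaches the server at all, "304" means the server answered without a body, and "200" means the full resource was sent.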

Strong cache

  • Expires (an HTTP/1.0 header: a time string in GMT format giving the absolute time at which the cached resource expires)
  • Cache-Control: max-age (an HTTP/1.1 header: the strong cache uses the max-age value as the maximum lifetime of the cached resource, in seconds)

Cache-Control has several other commonly used directives:

  1. no-cache: the negotiated cache is required; a request is sent to the server to confirm whether the cached copy can be used.
  2. no-store: disables caching; the data is requested again every time.
  3. public: can be cached by everyone, including end users and intermediate proxy servers such as CDNs.
  4. private: can only be cached by the end user's browser, not by intermediate proxy servers such as CDNs.

The server can set Cache-Control and Expires at the same time. When both are set, Cache-Control has the higher priority.
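As a rough sketch of the strong cache check, the function below decides whether a stored response is still fresh, letting Cache-Control win over Expires. The `CachedResponse` shape and the function names are assumptions for this example, and real browsers apply more rules than this.

```typescript
// Hypothetical shape of a cache entry for this sketch.
interface CachedResponse {
  storedAt: number; // time the response was stored, in ms since the epoch
  cacheControl?: string; // raw Cache-Control header value
  expires?: string; // raw Expires header value (GMT date string)
}

// Extract max-age (in seconds) from a Cache-Control value, if present.
function parseMaxAge(cacheControl: string): number | undefined {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  return match ? Number(match[1]) : undefined;
}

// True when the entry can be used without contacting the server.
function isFresh(entry: CachedResponse, now: number): boolean {
  if (entry.cacheControl !== undefined) {
    const directives = entry.cacheControl.toLowerCase();
    // no-store forbids caching; no-cache always requires asking the server first.
    if (/\bno-store\b/.test(directives) || /\bno-cache\b/.test(directives)) return false;
    const maxAge = parseMaxAge(directives);
    // Cache-Control takes priority: when max-age is present, Expires is ignored.
    if (maxAge !== undefined) return now - entry.storedAt < maxAge * 1000;
  }
  if (entry.expires !== undefined) {
    const expiresAt = Date.parse(entry.expires);
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }
  return false;
}
```

Note how an entry with both headers is judged by max-age alone, even if its Expires time has already passed.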

Strong cache disadvantages

After the strong cache expires, the resource is requested again regardless of whether it has changed. What we actually want is that, as long as the resource file has not changed, it is not downloaded again even after it expires and the old copy keeps being used. Hence the negotiated cache.

Negotiated cache

  • Last-Modified / If-Modified-Since

The Last-Modified value is the time the resource was last updated, in GMT format, and is returned with the server's response. When the browser requests the resource again, the request header contains If-Modified-Since, whose value is the Last-Modified value returned earlier. After receiving If-Modified-Since, the server decides whether the negotiated cache is hit based on the resource's last modified time.

  • ETag / If-None-Match

ETag is a string that uniquely identifies the content of the resource and is returned with the server's response. The server compares If-None-Match in the request header with the ETag of the current resource to determine whether the resource has been modified between the two requests. If it has not changed, the negotiated cache is hit.

Why do you need ETag/If-None-Match when you already have Last-Modified/If-Modified-Since?

If a file is opened and saved without its content changing, or is changed and then changed back within a certain period, its Last-Modified time still changes and the server can no longer hit the cache, even though the content is the same. ETag identifies the content itself, so it avoids this problem.
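Here is a minimal server-side sketch of the negotiated cache, assuming the resource is described by a small `ResourceState` object (a made-up shape for this example). When the request carries If-None-Match, the ETag comparison is used in preference to If-Modified-Since.

```typescript
// Hypothetical shapes for this sketch.
interface ResourceState { etag: string; lastModified: string; body: string }
interface ConditionalHeaders { ifNoneMatch?: string; ifModifiedSince?: string }

function respond(resource: ResourceState, headers: ConditionalHeaders): { status: 200 | 304; body?: string } {
  if (headers.ifNoneMatch !== undefined) {
    // The ETag identifies the content itself, so it is checked first.
    if (headers.ifNoneMatch === resource.etag) return { status: 304 };
    return { status: 200, body: resource.body };
  }
  if (headers.ifModifiedSince !== undefined) {
    const since = Date.parse(headers.ifModifiedSince);
    const modified = Date.parse(resource.lastModified);
    // Not modified after the time the client already has: negotiated cache hit.
    if (!Number.isNaN(since) && modified <= since) return { status: 304 };
  }
  return { status: 200, body: resource.body };
}

// The file was re-saved on 2 Jan with identical content: the time changed, the ETag did not.
const logo: ResourceState = {
  etag: '"v1-abc"',
  lastModified: "Tue, 02 Jan 2024 00:00:00 GMT",
  body: "<svg/>",
};
const byTime = respond(logo, { ifModifiedSince: "Mon, 01 Jan 2024 00:00:00 GMT" }); // status 200
const byETag = respond(logo, { ifNoneMatch: '"v1-abc"' }); // status 304
```

The two calls at the end show the case described above: the time-based check misses the cache, while the ETag check still hits it.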

Conclusion

  • The strong cache has a higher priority than the negotiated cache
  • Whenever either cache is hit, the server does not return the resource itself
  • The strong cache does not send a request to the server
  • The negotiated cache does send a request to the server

HTTP request methods

  • GET: sends a request to obtain data from the server
  • POST: submits data to the resource specified by the URL
  • PUT: submits data to the server to modify a resource
  • HEAD: requests only the headers of a page, to obtain meta information about the resource
  • DELETE: deletes a resource on the server
  • CONNECT: establishes a connection tunnel through a proxy server
  • OPTIONS: lists the request methods that can be used on a resource, often used in cross-domain requests

Common differences between GET and POST

  • GET puts the parameters in the URL, joined with the & symbol, while POST passes the parameters in the request body
  • GET requests are cached by the browser by default, while POST requests are not unless this is set up manually
  • POST is somewhat safer than GET because the parameters are not exposed in the URL. Going back in the browser is harmless for GET, while POST submits the request again
  • GET request parameters are kept in full in the browsing history, while POST parameters are not
  • The parameters a GET request passes in the URL are limited in length, while POST parameters are not
  • GET parameters can only be URL-encoded, while POST can use other encodings
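The first difference is easy to see in code. The sketch below builds a description of each request without sending anything; `buildGet`, `buildPost`, and the example URL are invented for illustration, and JSON is just one of the encodings a POST body may use.

```typescript
type Params = { [key: string]: string };

// URL-encode parameters and join them with "&".
function encodeParams(params: Params): string {
  return Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
}

// GET: the parameters become part of the URL.
function buildGet(url: string, params: Params): { method: "GET"; url: string } {
  return { method: "GET", url: `${url}?${encodeParams(params)}` };
}

// POST: the URL stays clean and the parameters travel in the request body.
function buildPost(url: string, params: Params): { method: "POST"; url: string; contentType: string; body: string } {
  return { method: "POST", url, contentType: "application/json", body: JSON.stringify(params) };
}

const search = { q: "a b", page: "2" };
const getRequest = buildGet("https://example.com/search", search);
// getRequest.url is "https://example.com/search?q=a%20b&page=2"
const postRequest = buildPost("https://example.com/search", search);
// postRequest.body is '{"q":"a b","page":"2"}'
```

Because the GET parameters live in the URL, they also end up in the browsing history and are subject to URL length limits, which matches the other differences in the list.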

HTTP status codes

There are five basic categories

  • 1XX (informational) The request has been received and is being processed
  • 2XX (success) The request was processed properly
  • 3XX (redirection) Additional action is needed to complete the request
  • 4XX (client error) The request contains an error, so the server cannot process it
  • 5XX (server error) The server failed while processing the request
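Since the category is decided by the first digit alone, a small helper is enough to classify any status code. The name `classifyStatus` is just for this sketch.

```typescript
type StatusClass = "informational" | "success" | "redirection" | "client error" | "server error" | "unknown";

// Map a status code to its category using only the hundreds digit.
function classifyStatus(code: number): StatusClass {
  switch (Math.floor(code / 100)) {
    case 1: return "informational";
    case 2: return "success";
    case 3: return "redirection";
    case 4: return "client error";
    case 5: return "server error";
    default: return "unknown";
  }
}
```

For example, `classifyStatus(404)` gives "client error" and `classifyStatus(503)` gives "server error".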

Common status codes:

  • 200 The request succeeded and the result is returned
  • 301 Permanent redirection, which is cached: the requested URL has changed permanently and the new URL should be used from now on
  • 302 Temporary redirection, not cached: the requested URL has changed temporarily
  • 304 The negotiated cache is hit: the GET request carried a condition, the condition is satisfied, and the server tells the client to use its cached copy
  • 400 The request is malformed
  • 401 Authentication is required; generally means no permission, common when a token is needed
  • 403 The server refuses access
  • 404 No resource was found matching the requested URL
  • 500 A general server-side error
  • 503 The server is overloaded and cannot process requests

Differences between HTTP 1.0, HTTP 1.1, and HTTP 2

HTTP 1.0

  • HTTP 1.0 specifies that the browser keeps only a short-lived connection with the server: each request opens a new TCP connection, and the server closes it once the request has been processed. A long-lived connection can also be forced on, for example by setting the Connection: keep-alive field

HTTP 1.1

  • Pipelining: allows the client to send multiple requests within the same TCP connection (HTTP pipelining is a technique for submitting multiple HTTP requests in a batch, without waiting for the server's response in between; only methods such as GET and HEAD can be pipelined)
  • Cache handling: introduces Cache-Control, ETag/If-None-Match, etc.
  • Adds some error status codes

HTTP 2

  • Multiplexing: within a single connection, both the client and the server can send multiple requests or responses at the same time, and they do not have to be matched up in order
  • Allows the server to actively push resources to the client

HTTP and HTTPS

HTTP is the Hypertext Transfer Protocol, which transfers data on top of the TCP/IP protocols

  • Request information is transmitted in plain text, so it is easy to eavesdrop on
  • Data integrity is not verified, so the data is easy to tamper with
  • The other party's identity is not verified, so there is a security risk

HTTPS can be understood as HTTP + SSL (Secure Sockets Layer). The SSL certificate authenticates the identity of the server, and the data transmitted between the browser and the server is encrypted (using both symmetric and asymmetric encryption).

So what's the difference?

  • Data encryption: HTTP is transmitted in plaintext and HTTPS in ciphertext
  • Default port: the default HTTP port is 80 and the default HTTPS port is 443
  • Resource consumption: HTTPS consumes more CPU and memory than HTTP because of the encryption and decryption work
  • Security: HTTP is not secure, while HTTPS is relatively secure

What is the HTTPS process? SSL encrypts and decrypts the data, and HTTP transfers the data (as ciphertext)

  • When the user enters an HTTPS URL in the browser, the browser connects to the server on port 443 by default
  • The server must have a digital certificate (an SSL certificate), which is essentially built around a pair of public and private keys
  • The server returns its digital certificate (including the public key) to the client
  • After receiving the digital certificate from the server, the client verifies it. If the certificate is valid, the client generates a key (for symmetric encryption) and encrypts it with the public key from the certificate
  • The client sends a second request over HTTPS, carrying the encrypted client key to the server
  • After receiving the ciphertext from the client, the server decrypts it with its own private key (asymmetric decryption) and obtains the client key. It then uses the client key to symmetrically encrypt the data it returns, so the data becomes ciphertext
  • The server returns the ciphertext to the client
  • After receiving the ciphertext from the server, the client decrypts it symmetrically with its own key (the client key) to obtain the data returned by the server

TCP three-way handshake and four-way wave

Before we talk about the TCP protocol, let's first understand the TCP packet. I found a picture on the Internet; please look at it in detail

I have marked in the diagram the packet fields that the TCP three-way handshake and four-way wave rely on. A brief understanding of them is enough, and then I will use two diagrams to explain the three-way handshake and the four-way wave

Three-way handshake

  1. The first handshake is initiated by the client, which asks the server to create a TCP connection. It sets the SYN flag to 1 and carries a sequence number seq = x. The sequence number x is 32 bits and randomly generated (on the client side)
  2. When the server receives the client's request to create a connection, it must respond, so it sets the ACK flag to 1 and generates the acknowledgment number ack = x + 1, that is, the received sequence number plus 1. At the same time, the server also sends its own connection request to the client: it sets the SYN flag to 1 and carries a sequence number seq = y. The sequence number y is 32 bits and randomly generated (on the server side)
  3. At this point the client knows that the server has received the request and agreed to create the connection, so it sends a response back to the server, setting the ACK flag to 1 and generating ack = y + 1

OK, those 3 steps complete the TCP three-way handshake. Why not two? The reason is simple. After the first two steps, the client knows that the connection can be created, but the server does not yet know whether the client received its reply. So the last step gives the server a response, telling it that the client is ready to create the connection
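The numbers exchanged in the three steps can be traced with a tiny simulation. This is only a sketch of the bookkeeping, not a real TCP implementation; the `Segment` shape and `threeWayHandshake` are names made up here, and x and y are passed in rather than generated randomly so that the result is easy to follow.

```typescript
// Hypothetical, heavily simplified view of a TCP segment.
interface Segment {
  from: "client" | "server";
  SYN: boolean;
  ACK: boolean;
  seq: number;
  ack?: number;
}

// Sequence numbers are 32 bits wide, so they wrap around at 2^32.
const SEQ_MODULO = 4294967296;

function threeWayHandshake(x: number, y: number): Segment[] {
  // Step 1: the client asks to connect, SYN = 1, seq = x.
  const syn: Segment = { from: "client", SYN: true, ACK: false, seq: x };
  // Step 2: the server acknowledges (ack = x + 1) and sends its own SYN, seq = y.
  const synAck: Segment = { from: "server", SYN: true, ACK: true, seq: y, ack: (x + 1) % SEQ_MODULO };
  // Step 3: the client acknowledges the server's SYN, ack = y + 1.
  // Its own sequence number has advanced to x + 1 because the SYN used up one number.
  const ack: Segment = { from: "client", SYN: false, ACK: true, seq: (x + 1) % SEQ_MODULO, ack: (y + 1) % SEQ_MODULO };
  return [syn, synAck, ack];
}

const handshake = threeWayHandshake(1000, 5000);
// handshake[1].ack is 1001 and handshake[2].ack is 5001
```

Only after the third segment do both sides hold proof that the other received their sequence number, which is why two steps are not enough.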

Four-way wave

Once you understand the three-way handshake, the four-way wave is a little easier. It's basically the same principle

  1. In this example the four-way wave is also initiated by the client, which sends a disconnect request to the server with the FIN flag set to 1 and a sequence number seq = x
  2. When the server receives the disconnect request from the client, it also has to respond, so it sets the ACK flag to 1 and generates ack = x + 1
  3. At this point the client has received the response (the server knows the client wants to disconnect), so it sits and waits for the disconnection. The server, however, may still have other things to do, such as returning data. When the server is done with the task at hand, it sends its own disconnect request to the client: it sets the FIN flag to 1 and generates a sequence number seq = y
  4. The client receives the disconnect request from the server and replies with an ACK (ack = y + 1): OK, you have finished processing, so let's disconnect

OK, those 4 steps complete the TCP four-way wave. The principle is very simple: a question and an answer in each direction
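The sketch below traces the same four steps and shows why the server's ACK and FIN are separate: the server may still have data to send in between. It is a simplified model with made-up names (`WaveStep`, `fourWayWave`); in particular, y is simply taken to be the sequence number of the server's FIN after any remaining data has gone out.

```typescript
// Hypothetical, simplified record of one step in the exchange.
type WaveStep = {
  from: "client" | "server";
  kind: "FIN" | "ACK" | "DATA";
  seq?: number;
  ack?: number;
  payload?: string;
};

function fourWayWave(x: number, y: number, pendingServerData: string[]): WaveStep[] {
  const steps: WaveStep[] = [];
  // Step 1: the client asks to disconnect, FIN = 1, seq = x.
  steps.push({ from: "client", kind: "FIN", seq: x });
  // Step 2: the server acknowledges the request, ack = x + 1.
  steps.push({ from: "server", kind: "ACK", ack: x + 1 });
  // The server finishes whatever it was still sending.
  for (const payload of pendingServerData) {
    steps.push({ from: "server", kind: "DATA", payload });
  }
  // Step 3: the server is done and sends its own FIN, seq = y.
  steps.push({ from: "server", kind: "FIN", seq: y });
  // Step 4: the client acknowledges, ack = y + 1, and the connection closes.
  steps.push({ from: "client", kind: "ACK", ack: y + 1 });
  return steps;
}

const closing = fourWayWave(2000, 8000, ["last chunk"]);
```

With an empty `pendingServerData` the trace has exactly four steps; any remaining data shows up between the server's ACK and its FIN.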

Differences between TCP and UDP

  • TCP is a connection-oriented protocol, meaning a reliable connection (the legendary three-way handshake) must be established before data is sent or received, while UDP is connectionless and only needs the destination address and port number to send data
  • TCP supports only point-to-point communication, while UDP supports one-to-one, one-to-many, many-to-many, and many-to-one
  • TCP is relatively inefficient because connections have to be created and torn down, whereas UDP has no such overhead, so it is relatively fast
  • TCP is byte-stream oriented and UDP is packet oriented
  • TCP ensures that data arrives correctly, while UDP may lose packets

Common front-end attacks and defenses

XSS

CSRF

SQL injection