What happens when you enter the URL in the browser

Bi li bi

  • 1. Build a URL from the input and start the network request
  • 2. DNS domain name resolution
  • 3. Establish a TCP connection with a three-way handshake
  • 4. The browser sends an HTTP request; the server processes it and returns the response
  • 5. Close the TCP connection with a four-way wave; the network request ends
  • 6. The browser process takes the response, parses the response headers, prepares the renderer process, and submits the document
  • 7. The renderer process parses the HTML and renders the page

1. Synthesizing the URL

The browser UI thread checks the input against URL rules. If the input is a valid URL, the browser completes it into a full URL by adding the protocol and asks the network process to start the request. If the input is a search term, the default search engine is used to search for it.
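
The decision described above can be sketched in a few lines of Python. This is only an illustration, not how any browser actually implements it: the `synthesize_url` function, the `SEARCH_URL` constant, and the "contains a dot and no spaces" rule are all assumptions made for the example.

```python
from urllib.parse import quote_plus, urlparse

# Hypothetical default search engine endpoint, used only for illustration.
SEARCH_URL = "https://www.google.com/search?q="

def synthesize_url(user_input: str) -> str:
    """Return a full URL for address-bar input, or a search URL otherwise."""
    text = user_input.strip()
    parsed = urlparse(text)
    # Input that already carries a scheme and a host is used as-is.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return text
    # A bare host such as "www.google.com/" has no spaces and contains a dot:
    # prepend a default protocol.
    if " " not in text and "." in text:
        return "https://" + text
    # Anything else is treated as a search query.
    return SEARCH_URL + quote_plus(text)

print(synthesize_url("www.google.com/"))
print(synthesize_url("tcp three-way handshake"))
```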

2. DNS domain name resolution

When we type the URL www.google.com/ into the browser, the browser first looks up the IP address that corresponds to the domain name:

  • 1. Search the browser cache
  • 2. Search the hosts file on the local machine
  • 3. Search the local DNS cache (for example, the router cache)
  • 4. Query the local DNS server
  • 5. If none of these has the record, a DNS root server is queried, which points to the server responsible for the com domain
  • 6. That server is queried for the second-level domain google.com
  • 7. The google.com server is then queried for the address of www.google.com
  • 8. The result is returned to the DNS client and cached

The steps above describe a recursive DNS query. There is also an iterative query. The difference is that in a recursive query the DNS server configured by the system performs the lookups and returns the final result to the client, while in an iterative query the requester follows each referral itself.

Client to local DNS server: this part is a recursive query.

Local DNS server to external servers: this part is an iterative query.

  • Recursion: the client sends a single request and asks the other party for the final result.

  • Iteration: the client sends a request; if the other party is not authoritative for the name, it returns a list of other name servers that can answer the query. The client then queries the servers on that list, and so on, until it reaches the name server that is ultimately responsible for the domain and gets the final result from it.

  • Authoritative answer: when the DNS server being queried is itself responsible for the domain name, its result is an authoritative answer.
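
As a rough illustration of the iterative part (root, then the com server, then the google.com server), here is a toy resolver over an in-memory hierarchy. Everything in it is made up for the example: the server names, the `resolve` function, and the address, which comes from the reserved documentation range rather than from a real lookup.

```python
# Toy name-server hierarchy; every name and address here is hypothetical.
ROOT = {"com": "tld-com"}                              # root knows the com server
SERVERS = {
    "tld-com": {"google.com": "ns-google"},            # com knows google.com's server
    "ns-google": {"www.google.com": "192.0.2.1"},      # authoritative answer (fake IP)
}
cache: dict = {}

def resolve(name: str) -> tuple:
    """Resolve `name` iteratively, returning (address, list of servers asked)."""
    if name in cache:                     # a cache hit ends the lookup early
        return cache[name], []
    labels = name.split(".")
    asked = ["root"]
    server = ROOT[labels[-1]]             # root refers us to the TLD server
    asked.append(server)
    server = SERVERS[server][".".join(labels[-2:])]    # TLD refers us onward
    asked.append(server)
    address = SERVERS[server][name]       # the authoritative server answers
    cache[name] = address                 # cache the result for next time
    return address, asked

print(resolve("www.google.com"))   # walks root -> tld-com -> ns-google
print(resolve("www.google.com"))   # second call is served from the cache
```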

3. TCP connection: the three-way handshake

The browser first checks whether the protocol is HTTPS. HTTPS is HTTP plus SSL/TLS, that is, an encryption layer added to HTTP. Both the server and the client encrypt through TLS, so the transmitted data is encrypted.

TCP connection handshake

Both the server and the client start in the CLOSED state. Each side creates a TCB (transmission control block). After creating it, the server enters the LISTEN state and waits for the client to connect.

First handshake

The client sends a SYN segment to the server. After sending it, the client enters the SYN_SENT state and waits for confirmation from the server.

The SYN flag is set to 1 and the sequence number is x.

Second handshake

After receiving the SYN segment, the server must acknowledge it. If it accepts the connection, it sends a SYN+ACK segment with both the SYN and ACK flags set to 1. The response acknowledges x + 1 and also carries the server’s own initial sequence number. After sending it, the server enters the SYN_RCVD state.

Third handshake

The client receives the SYN+ACK segment from the server and sends an ACK segment back. The client enters the ESTABLISHED state after sending the ACK, and the server enters it when the ACK arrives.
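
The sequence and acknowledgment numbers in the three segments can be written out explicitly. The sketch below only models the numbers, not a real socket; the function name, the dictionary layout, and the symbol y for the server's initial sequence number are assumptions made for the example.

```python
def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged, given initial sequence numbers x and y."""
    # First handshake: the client sends SYN with its initial sequence number x.
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": x}
    # Second handshake: the server acknowledges x + 1 and sends its own seq y.
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1, "seq": y, "ack": syn["seq"] + 1}
    # Third handshake: the client acknowledges y + 1; SYN is no longer set.
    ack = {"from": "client", "SYN": 0, "ACK": 1, "seq": x + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]

for segment in three_way_handshake(x=100, y=300):
    print(segment)
```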

SSL handshake procedure

  • Phase 1: establish security capabilities, including the protocol version, session ID, cipher suite, compression method, and initial random numbers
  • Phase 2: the server sends its certificate, its key exchange data, and a certificate request, and finally sends a message that signals the end of this phase
  • Phase 3: if the server requested a certificate, the client sends its certificate; the client then sends its key exchange data and a certificate verification message
  • Phase 4: both sides change the cipher suite and finish the handshake protocol
  • When this is done, the client and server can begin transferring data.

Note

ACK (acknowledgment): indicates whether the acknowledgment number field is valid, that is, whether the TCP segment carries an acknowledgment number. It has two values, 0 and 1. The acknowledgment number is valid only when ACK=1, and TCP requires every segment sent after the connection is established to have ACK=1.

SYN (synchronization): used to synchronize sequence numbers when a connection is set up. SYN=1 with ACK=0 indicates a connection request segment. If the peer agrees to establish the connection, the response segment has SYN=1 and ACK=1. A SYN value of 1 therefore indicates a connection request or a connection accept segment.

Question

Why does TCP require three handshakes instead of two?

This prevents a stale connection request segment from reaching the server and causing an error. The third handshake is the client’s acknowledgment to the server that the connection is really wanted.

Suppose the client sends connection request A and it is delayed by network problems. TCP’s timeout retransmission mechanism then sends request B, which establishes the connection. If request A reaches the server after both ends have closed that connection, a two-way handshake would make the server assume the client wants a new connection, respond, and enter the ESTABLISHED state. The client, however, is in the CLOSED state, so the server would wait for a long time and waste resources.

PS: During connection establishment, if a segment is lost on either side, TCP retransmits the SYN segment, generally retrying five times. SYN flood attacks may occur during connection establishment. In that case you can either lower the number of retries or simply reject requests that cannot be processed.

When a connection is established and the request completes, the TCP connection is not closed immediately. Thanks to keep-alive in HTTP/1.1, the connection can be kept open for a short period. When it is actually closed depends on the settings of the server or the client.

4. The HTTP request is sent; the server processes it and returns the response

Sending an HTTP request

  • After the TCP connection is established, the browser can use HTTP/HTTPS to send requests to the server. A complete HTTP request consists of the request line, the request headers, and the request body

The request line

Method Request-URL HTTP-Version CRLF
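
To make the request-line format concrete, here is a minimal sketch that serializes a request by hand. The `build_request` helper and the choice of headers are assumptions for illustration; a real browser sends many more headers.

```python
def build_request(method: str, path: str, host: str, body: str = "") -> bytes:
    """Serialize a minimal HTTP/1.1 request: request line, headers, blank line, body."""
    payload = body.encode("utf-8")
    lines = [
        f"{method} {path} HTTP/1.1",        # Method, request target, HTTP version
        f"Host: {host}",
        f"Content-Length: {len(payload)}",
    ]
    # Every line ends with CRLF; an empty line separates the headers from the body.
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + payload

print(build_request("GET", "/", "www.google.com").decode("ascii"))
```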

The server processes the request and returns the response

  • The server receives the request and parses the request headers.
  • If the headers contain cache validators such as If-None-Match and If-Modified-Since, the server checks whether the cached copy is still valid.
  • If it is valid, the server returns status code 304 without a body. If not, it returns the resource with status code 200.

An HTTP response message also consists of three parts: a status line, response headers, and a response body.

5. Close the TCP connection with four waves

  • First wave: client A sends a FIN segment to server B to request release of the connection and enters the FIN_WAIT_1 state.

  • Second wave: after receiving the release request, server B sends an ACK segment to approve it and enters the CLOSE_WAIT state. From this point A no longer sends data to B, but B can still send data to A because a TCP connection is bidirectional.

  • Third wave: server B sends a FIN segment to A to close its side of the connection and enters the LAST_ACK state.

  • Fourth wave: after receiving B’s release request, A sends an ACK to B and enters the TIME_WAIT state. This state lasts for 2MSL (MSL, the maximum segment lifetime, is the longest a segment can live in the network before it is discarded). If B does not resend its FIN within this period, A enters the CLOSED state. B enters the CLOSED state as soon as it receives the ACK.

Why does A enter the TIME_WAIT state and wait 2MSL before entering the CLOSED state?

To ensure that B receives A’s ACK. If A entered the CLOSED state directly after sending the ACK and the ACK were lost due to network problems, B could not close normally.
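
The four waves can also be read as a small state machine. The table below is a simplified sketch: the event names and the `run` helper are invented for the example, and it includes FIN_WAIT_2, a standard TCP state that the description above skips over.

```python
# (state, event) -> next state, for the active closer (A) and the passive closer (B).
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",   # A, first wave
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",   # B, replies with ACK (second wave)
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",    # A waits for B's FIN
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",      # B, third wave
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",     # A sends ACK (fourth wave)
    ("LAST_ACK", "recv ACK"): "CLOSED",          # B closes as soon as the ACK arrives
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",     # A closes after waiting 2MSL
}

def run(events: list, state: str = "ESTABLISHED") -> str:
    """Apply the events in order and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["send FIN", "recv ACK", "recv FIN"]))   # client A after the fourth wave
print(run(["recv FIN", "send FIN", "recv ACK"]))   # server B after the fourth wave
```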

6. Parse the response header and submit the document

  • If the response header is Content-Type: text/html, the browser prepares a renderer process. If it is application/octet-stream, the data is a byte stream and is handed to the browser’s download manager.

  • The browser assigns renderer processes according to rules: usually one per tab, but tabs opened from the same site share one renderer process.

  • Submit the document, where the document is the response data returned

    1. The browser process sends a “submit document” message to the renderer process. On receiving it, the renderer process sets up a data pipe to the network process.

    2. When the data transfer is complete, the renderer process returns a message to the browser process confirming the submission.

    3. After receiving this confirmation, the browser process updates the browser UI state, including the security state, the URL in the address bar, and the back/forward history, and the web page is updated.

    4. This explains why, when you type an address into your browser’s address bar, the previous page does not disappear immediately: the new page takes a while to load before the view is updated.
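
The Content-Type check at the start of this step amounts to a small dispatch. The function below is a sketch with invented return labels; it only covers the two media types named above and lumps everything else together.

```python
def dispatch(content_type: str) -> str:
    """Decide what the browser process does with a response, by Content-Type."""
    media_type = content_type.split(";")[0].strip().lower()   # drop "; charset=..."
    if media_type == "text/html":
        return "render"      # prepare a renderer process
    if media_type == "application/octet-stream":
        return "download"    # hand the byte stream to the download manager
    return "other"           # cases not covered by the text above

print(dispatch("text/html; charset=utf-8"))
print(dispatch("application/octet-stream"))
```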

7. Browser rendering

Rendering is a process of parsing the page and then drawing it.

In chronological order, the rendering pipeline can be divided into the following stages: DOM tree building, style calculation, page layout, layering, painting, rasterization, and display

  • 1. The renderer process parses the HTML content into a DOM tree that the browser can work with
  • 2. The rendering engine converts the CSS into styleSheets that the browser can understand (browsers also have default style sheets) and computes the style of each DOM node.
  • 3. Create the layout tree and compute the layout information of the elements: traverse the DOM tree and the computed styles to build a layout tree that contains only visible elements
  • 4. Divide the layout tree into layers and generate a layer tree (LayerTree).
  • 5. Paint: generate a paint list for each layer and submit it to the compositor thread. All the steps above run on the main thread, as do repaint and reflow

    A paint list only records the paint order and the paint instructions; the actual drawing is done by the compositor thread in the rendering engine.

  • 6. The compositor thread splits each layer into tiles because layers can be very large, usually 256×256 or 512×512, and rasterizes the tiles into bitmaps.
  • 7. The compositor thread collects the tile information as DrawQuads (location in memory and position on the page), builds a compositor frame from it, and sends the frame to the browser process over IPC. It is then passed to the GPU process and drawn on the screen.

Build a DOM tree

Bytes -> characters -> tokens -> nodes -> object model

After the browser gets the HTML bytes from the network or the disk, it parses them into a DOM tree in several steps:

  • The raw HTML bytes are decoded into characters using the file’s encoding.
  • The browser then converts the character string into tokens according to the HTML specification.
  • Each token is then turned into a node object that defines its properties and rules.
  • Most importantly, the parent, child, and sibling relationships between the nodes are established, producing a tree-shaped object model: the DOM tree.
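
The token-to-node step can be imitated with Python's standard `html.parser`, which emits start-tag, end-tag, and text events. This is a simplified sketch, not a real HTML parser: the `Node` and `TreeBuilder` classes are invented for the example, and it assumes well-formed markup in which every tag is closed explicitly.

```python
from html.parser import HTMLParser

class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children, self.text = tag, parent, [], ""

class TreeBuilder(HTMLParser):
    """Turn start-tag, end-tag, and text tokens into a tree of Node objects."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)        # token -> node
        self.current.children.append(node)    # link parent and child
        self.current = node                   # descend into the new node

    def handle_endtag(self, tag):
        if self.current.parent is not None:   # climb back to the parent
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()

def build_dom(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root

dom = build_dom("<html><body><p>hi</p><p>there</p></body></html>")
body = dom.children[0].children[0]
print([child.tag for child in body.children])
```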

Style calculation

The rendering engine converts the CSS into styleSheets that the browser can understand and computes the style of each DOM node.

All values are converted into standardized computed values that the rendering engine can work with. This is called property value normalization. After that, style inheritance and cascading are resolved, a process some articles call building the CSSOM.

Page layout tree

The layout process excludes functional, non-visual nodes such as script and meta, excludes nodes with display: none, computes the position of each element, and builds a layout tree that contains only visible elements.
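
The filtering part of this step is easy to show on a toy tree. The sketch below uses plain dictionaries as stand-ins for DOM nodes; the dictionary layout, the `NON_VISUAL_TAGS` set, and the function name are assumptions for the example, and no positions are computed.

```python
from typing import Optional

# Tags treated as non-visual in this sketch (an illustrative, incomplete set).
NON_VISUAL_TAGS = {"head", "script", "meta", "style", "link", "title"}

def build_layout_tree(node: dict) -> Optional[dict]:
    """Copy a DOM-like dict tree, dropping non-visual and display:none nodes."""
    if node["tag"] in NON_VISUAL_TAGS:
        return None
    if node.get("style", {}).get("display") == "none":
        return None                      # the whole subtree is skipped
    children = [build_layout_tree(child) for child in node.get("children", [])]
    return {"tag": node["tag"], "children": [c for c in children if c is not None]}

dom = {"tag": "body", "children": [
    {"tag": "script"},
    {"tag": "div", "style": {"display": "none"}, "children": [{"tag": "p"}]},
    {"tag": "p", "children": [{"tag": "span"}]},
]}
print(build_layout_tree(dom))
```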

Layering and rasterization

  • The main thread traverses the layout tree to generate the layer tree, which is then passed to the compositor thread. The compositor thread splits the layers into tiles and sends them to the raster threads, which rasterize them and store the results in GPU memory

    (The raster threads are still in the renderer process)

  • When rasterization is complete, the compositor thread generates the DrawQuad commands

The layer tree

The rendering engine also needs to generate specific layers for specific nodes and a corresponding LayerTree. If a node has no corresponding layer, the node is subordinate to the parent node’s layer.

Compute the style result for each layer node

Fill each node into the layer’s bitmap (Paint)


Display

Finally, the compositor thread sends the draw-tile commands to the browser process. The browser process generates the page from these commands and displays it on the monitor. The rendering process is complete.

Second, the computer network model and TCP

Four-layer model (TCP/IP)

Application layer (FTP, HTTP), transport layer (TCP/UDP), network layer (IP), and network interface layer (link layer)

  • The link layer

Handles the hardware side of connecting to the network, including the operating system, hardware device drivers, network adapters, optical fiber, and other physically visible parts.

The process of sending an HTTP request

When the client sends a request over HTTP, each protocol layer in turn wraps the request and adds its own header. At the link layer the data becomes an Ethernet frame, which travels over the physical medium (such as optical fiber) to the other host. The receiving host unwraps the frame layer by layer with the corresponding protocols and finally hands the application-layer data to the application for processing.

Differences between TCP and UDP

TCP

Characteristics

  • Connection-oriented and full-duplex: data can flow in both directions
  • Byte-stream based: data is split into segments with no limit on the total size, delivered in order, and duplicate segments are discarded automatically
  • Reliable transmission: delivery is guaranteed, and lost segments are retransmitted.
  • Flow control: resolves the mismatch between the two sides’ processing capacity (sliding window). It is driven by the receiver, to make sure the receiver can keep up with the incoming data.
  • Congestion control: prevents too much data from being injected into the network and overloading it

Flags used in connection segments

  • SYN: synchronizes sequence numbers to establish a connection. The sender sends seq = x and the peer acknowledges with x + 1. The seq values determine the order in which segments are accepted.
  • ACK: acknowledgment segment.
  • FIN: segment that closes the connection

Byte stream transmission

TCP decides how many bytes a segment should contain based on the peer’s window size and the current network congestion. The default maximum segment size is 536 bytes.

ARQ (automatic repeat request) is the timeout-and-retransmission protocol.

  • Stop-and-wait ARQ
  • Continuous ARQ

Differences

To summarize: TCP is a connection-oriented, reliable, byte-stream-based transport layer protocol.

UDP is a connectionless transport layer protocol. (It is that simple; it has none of the other TCP features.)

TCP is used in finance, for example the FIX protocol is TCP-based, while UDP is widely used in games, live entertainment, voice calls, and domain name queries.

  • 1. Connection: TCP is a connection-oriented protocol. UDP is a connectionless protocol.
  • 2. Reliability: TCP is more reliable. Data lost in transit is retransmitted. UDP gives no guarantee: it sends what it is given, keeps no copy, and does not care whether the other side receives it.
  • 3. Ordering: TCP delivers messages in order even if they arrive out of order. UDP does not guarantee ordering.
  • 4. Efficiency: UDP has a small header of only 8 bytes, while the TCP header is at least 20 bytes and at most 60 bytes
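
The 8-byte and 20-byte figures can be checked by packing the fixed header fields with `struct`. The helpers below are a sketch: the function names are invented, the checksum is left at zero, and the window value is a placeholder.

```python
import struct

def udp_header(src_port: int, dst_port: int, payload_len: int) -> bytes:
    """Pack the four 16-bit UDP header fields: ports, length, checksum."""
    length = 8 + payload_len          # the length field counts header plus payload
    checksum = 0                      # left at zero to keep the sketch simple
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

def tcp_header(src_port: int, dst_port: int, seq: int, ack: int, flags: int) -> bytes:
    """Pack a TCP header without options (data offset = 5 32-bit words)."""
    offset_and_flags = (5 << 12) | flags     # 4-bit data offset, reserved bits, flags
    window, checksum, urgent = 65535, 0, 0   # placeholder values
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, checksum, urgent)

print(len(udp_header(5000, 53, 0)))             # size of the UDP header in bytes
print(len(tcp_header(5000, 80, 1, 0, 0x02)))    # size of a TCP header with SYN set
```

The TCP header grows up to 60 bytes when options are present, because the 4-bit data offset field can count at most 15 words of 4 bytes each.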

Why TCP is reliable

  • Data ordering
  • Acknowledgment and retransmission: lost segments are retransmitted
  • Flow control with a sliding window: uses windows and timers. The TCP window indicates the maximum amount of data each side can send or receive
  • Congestion control

Why TCP stays in TIME_WAIT before returning to the CLOSED state

2MSL is twice the maximum segment lifetime

  • If the client released the connection right after sending the last ACK of the fourth wave and that ACK were lost, the server would remain waiting
  • If the server does not receive the last ACK, it resends its FIN; the client then sends the ACK again and restarts the wait. This is why 2MSL is required.

Which protocols use TCP and UDP

  • FTP, HTTP, and HTTPS use TCP.
  • DNS, DHCP, and TFTP use UDP (DNS also falls back to TCP, for example for large responses)

How to prevent clickjacking? What is X-Frame-Options?

Clickjacking

An attacker overlays an invisible frame on a page to trick the victim into clicking on it.

X-Frame-Options

This response header tells the browser whether a page may be displayed in a <frame>, <iframe>, or <object>. Sites can use it to ensure that their content is not embedded in someone else’s site, which avoids clickjacking attacks.

  • DENY: the page cannot be displayed in a frame at all
  • SAMEORIGIN: the page can be framed only by pages of the same origin
  • ALLOW-FROM http://whsir.com/: the page can be framed only by the specified origin
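
On the server side, applying the policy just means adding one header to the response. The helper below is a framework-independent sketch; the function name and the dictionary representation of headers are assumptions, and it deliberately accepts only the two values that do not need an origin argument.

```python
def with_frame_protection(headers: dict, policy: str = "SAMEORIGIN") -> dict:
    """Return a copy of the response headers with X-Frame-Options set."""
    if policy not in ("DENY", "SAMEORIGIN"):
        raise ValueError("unsupported policy: " + policy)
    protected = dict(headers)                 # do not mutate the caller's dict
    protected["X-Frame-Options"] = policy
    return protected

print(with_frame_protection({"Content-Type": "text/html"}))
```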

Principles of a Content Delivery Network (CDN)

  • When a client requests a resource, it first sends a domain name resolution request to the local DNS server (LDNS). If the LDNS has the record cached, it returns it directly; if not, it queries the authoritative DNS
  • The authoritative DNS resolves the name and returns the IP address that the domain’s CNAME points to

HTTP protocol

Characteristics of HTTP

  • Stateless
  • Plaintext transmission: messages (mainly the headers) are sent as text rather than as binary data
  • Head-of-line blocking: when HTTP uses persistent connections, a shared TCP connection can process only one request at a time. If the current request takes too long, the requests behind it are blocked.

Say something about caching in HTTP (why many sites open quickly the second time)

Browser cache

Also known as the HTTP cache, it is divided into the strong cache and the negotiated cache. The strong cache has higher priority; the negotiated cache is used only when the strong cache misses.

1. Strong cache: 200 (from cache)

The strong cache is controlled by the Expires and Cache-Control fields in the HTTP headers. When a request is made again, the browser uses Expires and Cache-Control to decide whether the resource hits the strong cache. If it does, the browser takes the resource directly from the cache without contacting the server.

1. Step 1: Check whether the cached copy has expired. If it has not, the browser does not contact the server and uses the cached file directly

2. Step 2: If the cached copy has expired, the browser sends a request to ask whether the file has been modified. If not, the server returns 304 and the browser continues to use the cached file

Expires

  • Expires is the older way to implement the strong cache. When the server returns a response, it writes the expiration time in the Expires field of the response headers, for example expires: Wed, 12 Sep 2019 06:12:18 GMT
  • When the resource is requested again, the browser compares the local time with the Expires time. If the local time is earlier, it takes the resource directly from the cache.
  • Because Expires is compared with the local clock, it is unreliable when the client’s clock is wrong

Cache-Control

  • Because of the limitations of Expires, HTTP/1.1 added the Cache-Control field to take over its job.
  • In cache-control: max-age=31536000, max-age is not a timestamp but a duration in seconds
  • Cache-Control is more accurate than Expires and has higher priority. When both are present, Cache-Control is used.
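
The priority rule between the two fields can be sketched as a freshness check. This is a simplified model rather than a browser's real algorithm: the `is_fresh` function and its arguments are invented for the example, only max-age is parsed out of Cache-Control, and both datetimes are assumed to be timezone-aware.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def is_fresh(headers: dict, stored_at: datetime, now: datetime) -> bool:
    """Return True if a cached response can be used without contacting the server."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:                                   # Cache-Control takes priority
        age = (now - stored_at).total_seconds()
        return age < int(match.group(1))
    if "Expires" in headers:                    # fall back to the absolute expiry time
        return now < parsedate_to_datetime(headers["Expires"])
    return False                                # no strong-cache information at all

stored = datetime(2019, 9, 1, tzinfo=timezone.utc)
later = stored + timedelta(days=30)
print(is_fresh({"Cache-Control": "max-age=31536000"}, stored, later))
```

Because max-age is a duration measured from when the response was stored, the result does not depend on the client clock agreeing with the server clock, which is the limitation of Expires mentioned above.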

Cache-Control: the difference between no-cache and max-age=0

  • no-store: disables caching completely; the resource is fetched from the server every time.
  • no-cache: the response may be cached locally or on a proxy server, but the cache must be validated with the server before it is used; the server may answer 304.
  • max-age > 0: the resource is read directly from the browser cache while it is still fresh
  • max-age <= 0: the browser sends an HTTP request to the server, which checks ETag/Last-Modified. If the resource has been modified, the server returns 200; if not, it returns 304

2. Negotiated cache

The negotiated cache relies on communication between the client and the server. The client asks the server whether the cached resource is still valid. If the server says it has not changed, it returns status code 304 and the browser uses its cached copy. The negotiated cache is decided by Last-Modified and ETag.

Last-Modified

  • Response headers after the first request: Last-Modified: Fri, 27 Oct 2017 06:35:57 GMT
  • Subsequent requests carry a timestamp field called If-Modified-Since, whose value is the Last-Modified value from the previous response.
  • The server compares this timestamp with the resource’s last modification time on the server to determine whether it has changed.

Disadvantages

  • 1. If a file is edited but its content stays the same, the server still sees a new modification time. It cannot tell whether the content really changed, so it treats the resource as new and sends a full response when none is needed.
  • 2. If-Modified-Since can only detect differences with one-second granularity. If a file changes again within the same second (for example after 100 ms), the change is not detected and the resource is not fetched again when it should be.

ETag

An ETag is a unique identifier string that the server generates for each resource. It can be derived from the file content, so the ETag differs whenever the content differs and stays the same when the content is the same. ETag can therefore detect file changes precisely. On the next request, the browser sends the last cached ETag value in the If-None-Match header, and the server compares that value with the resource’s current ETag.

Disadvantages

  • Generating the ETag costs the server extra computation and affects its performance

ETag detects file changes more accurately than Last-Modified and has higher priority. When both ETag and Last-Modified are present, ETag prevails.
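
The ETag comparison on the server can be sketched as follows. The hashing scheme is only one possible way to derive an ETag and is an assumption here, as are the `make_etag` and `respond` names; real servers often use other inputs such as size and modification time.

```python
import hashlib

def make_etag(content: bytes) -> str:
    """Derive an ETag from the file content (one possible scheme, assumed here)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'

def respond(content: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""       # unchanged: no body is sent
    return 200, {"ETag": etag}, content       # first request or changed content

status, headers, body = respond(b"body { color: red }", {})
print(status)
print(respond(b"body { color: red }", {"If-None-Match": headers["ETag"]})[0])
```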

Caching scheme

  • HTML: negotiated cache;
  • CSS, JS, images: strong cache, with a hash in the file name.

Impact of refreshing on the cache

  1. Ctrl + F5 forces a refresh: the page is loaded directly from the server, skipping both the strong cache and the negotiated cache.

  2. F5 refreshes the page: the strong cache is skipped, but the negotiated cache is checked.

  3. Typing the URL in the address bar and pressing Enter: the browser looks for the file in the cache first. This is the fastest.

Talk about the differences between HTTP1.0, HTTP1.1, HTTP2, and HTTP3

1. HTTP1.x

Defect: requests block each other, because the number of simultaneous requests to the same domain name is limited

2. HTTP1.0

Disadvantages: the browser keeps only a short-lived connection to the server, so every request needs a new TCP connection (creating a TCP connection is expensive because the client and server must complete a three-way handshake).

The server closes the TCP connection as soon as the request is processed. It does not track clients or record past requests. Solution: add the non-standard header field Connection: keep-alive

3. HTTP1.1

  • Persistent connections: a TCP connection is not closed by default and can be reused by multiple requests, without declaring Connection: keep-alive (most browsers allow up to six persistent connections per domain name).
  • Added the request methods PUT, DELETE, and OPTIONS
  • Problem: browsers limit the number of simultaneous requests to the same domain name. Once the limit is reached, the remaining resources must wait for earlier ones to finish loading before their requests are sent

4. HTTP2

Header compression, multiplexing, server push

  • Binary format instead of text format;
  • Full multiplexing: solves HTTP head-of-line blocking and removes the limit of six simultaneous requests per domain name.
  • Header compression: HTTP is a stateless protocol, so every request repeats headers such as cookies. HTTP2 optimizes this with header compression (HPACK): header values are Huffman-encoded before being sent, and both the client and the server maintain a header table in which fields are stored under an index number, so a field that was already sent is replaced by its index number.
  • Server push: the server no longer only receives and answers requests passively; it can also open streams to send resources to the client (since deprecated by Chrome).

5. HTTP3 (QUIC)

Built on UDP, with multiplexing, TLS 1.3 encryption, flow control, ordered delivery, retransmission, and other features added on top.

  • Multiplexing: each data stream is delivered in order on its own, without affecting other data streams.

Roughly, HTTP3 is HTTPS + HTTP2.0 with TCP replaced by UDP. It solves the TCP head-of-line blocking problem.

TCP head-of-line blocking happens at the packet level. If an earlier packet has not been received, later packets are not passed up to HTTP, so a single lost packet stalls everything behind it.

Talk about the difference between HTTP and HTTPS, and how HTTPS connects

/ / juejin. Cn/post / 695576…

1. What is HTTPS?

  • HTTPS is the secure version of HTTP: it adds an encryption layer to HTTP and encrypts the transmitted data. HTTPS = HTTP + SSL/TLS
  • TLS (Transport Layer Security) is the more secure successor to SSL. It sits above the transport layer and below the application layer

2. Why use HTTPS

  • HTTP exchanges data with the server in plain text, which is easy to eavesdrop on and tamper with
  • HTTP cannot authenticate the other party during communication, so phishing is likely

3. What is the encryption principle of HTTPS

Symmetric encryption

  • Both ends hold the same key, and both use it to encrypt and decrypt

Asymmetric encryption

  • Public key and private key: the public key is used to encrypt data, and the private key is used to decrypt it. The private key is known only to the party that distributes the public key

4. HTTPS connection process

  • The browser makes a request to the server;
  • When the server receives the request, it returns its digital certificate to the browser (containing the server’s public key B+ and a digital signature).
  • After receiving the certificate, the browser verifies that it is valid. If it is, the process continues. (This step verifies identity and prevents hijacking)
  • The browser extracts the server’s public key B+ from the certificate, generates a key X locally, encrypts X with the public key, and sends it to the server.
  • The server receives the data, decrypts it with its private key B-, and obtains the key X;
  • From then on, both sides encrypt their communication with the key X.
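
The key exchange in the last three steps can be mimicked with toy numbers. The sketch below is deliberately insecure and only shows the data flow: it uses a textbook RSA key pair built from tiny primes in place of B+ and B-, an XOR in place of a real symmetric cipher, and it skips certificate verification entirely. All names and values are assumptions for the example.

```python
# Toy RSA key pair built from tiny primes (p=61, q=53); insecure, illustration only.
N, PUBLIC_E, PRIVATE_D = 3233, 17, 2753

def rsa_encrypt(value: int) -> int:
    """What the browser does with the server's public key."""
    return pow(value, PUBLIC_E, N)

def rsa_decrypt(value: int) -> int:
    """What the server does with its private key."""
    return pow(value, PRIVATE_D, N)

def xor_cipher(data: bytes, key: int) -> bytes:
    """Stand-in symmetric cipher: XOR every byte with the shared key."""
    return bytes(b ^ key for b in data)

session_key = 42                              # key X, generated by the browser
wire_key = rsa_encrypt(session_key)           # sent to the server in encrypted form
server_key = rsa_decrypt(wire_key)            # the server recovers key X
ciphertext = xor_cipher(b"GET / HTTP/1.1", session_key)   # browser encrypts with X
plaintext = xor_cipher(ciphertext, server_key)            # server decrypts with X
print(server_key == session_key, plaintext)
```

The asymmetric step is used only once, to move the key X safely; the bulk of the traffic then uses the cheaper symmetric key, which is the design the list above describes.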

Status codes

2XX Success

  • 200 OK: the request from the client was processed correctly by the server
  • 204 No Content: the request succeeded, but the response message contains no body
  • 205 Reset Content: the request succeeded, but the response message contains no body. Unlike 204, the requester is required to reset the content
  • 206 Partial Content: the response to a range request

3XX Redirection

  • 301 Moved Permanently: permanent redirect; the resource has been assigned a new URL
  • 302 Found: temporary redirect; the resource has temporarily been assigned a new URL
  • 303 See Other: another URL exists for the resource, and the GET method should be used to obtain it
  • 304 Not Modified: the request was conditional and the resource has not changed, so the client can use its cached copy
  • 307 Temporary Redirect: temporary redirect; the client is expected to keep the request method unchanged when sending the request to the new address
4XX Client error

  • 400 Bad Request: the request message contains syntax errors
  • 401 Unauthorized: the request requires HTTP authentication
  • 403 Forbidden: the server refuses access to the requested resource
  • 404 Not Found: the requested resource was not found on the server

5XX Server error

  • 500 Internal Server Error: an error occurred while the server was executing the request
  • 501 Not Implemented: the server does not support a function required by the request
  • 502 Bad Gateway: the gateway or proxy server received an invalid response while executing the request
  • 503 Service Unavailable: the server is temporarily overloaded or down for maintenance and cannot process requests
  • 504 Gateway Timeout: the gateway or proxy server did not receive a response in time
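
The grouping by first digit is easy to express in code, and Python's standard `http.HTTPStatus` table already knows the reason phrases. The `describe` helper and the `CLASSES` labels are assumptions made for this sketch.

```python
from http import HTTPStatus

# Class labels keyed by the first digit of the status code.
CLASSES = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}

def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    status = HTTPStatus(code)                       # raises ValueError for unknown codes
    return f"{status.value} {status.phrase} ({CLASSES.get(code // 100, 'other')})"

for code in (200, 304, 404, 504):
    print(describe(code))
```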

Talk about status codes in HTTP

200 means the request succeeded. Codes starting with 3 are redirects, such as 301 for a permanent redirect and 302 for a temporary redirect. 304 means the cached copy can be used.

Talk about request methods in HTTP

  • GET: usually used to obtain resources
  • POST: submits data
  • PUT: modifies data
  • HEAD: obtains the meta information of a resource
  • DELETE: deletes resources
  • OPTIONS: used for cross-origin preflight requests
  • TRACE: echoes the received request back, for diagnostics

GET and POST

  • Semantics: GET is usually used to get data, and POST is usually used to submit data
  • Back button: GET is harmless when the browser goes back, while POST resubmits the request.
  • Cache: GET requests are actively cached by the browser. POST is not cached by default
  • Encoding: GET parameters can only be URL-encoded and accept only ASCII characters, while POST has no such limitation
  • History: GET parameters are kept in the browser history. POST parameters are not

What’s the difference between 304, 302, and 301

  • 301 is a permanent redirect, used for example when a company changes its domain name
  • 302 is mainly used for temporary jumps, such as redirecting to a login page; the target is given in the Location field
  • 304 indicates that the requested resource has not been modified and the cached copy can be used.

Talk about HTTP headers

HTTP headers are divided into request headers and response headers

General headers

  • Cache-Control: controls the caching behavior
  • Connection: the type of connection the browser prefers, such as keep-alive or close
  • Date: the time when the message was created
  • Transfer-Encoding: the transfer encoding used for the message body

Request header

  • Accept: the media types the client can handle
  • Accept-Charset: the character sets the client can handle
  • Accept-Encoding: the list of content encodings the client accepts
  • Accept-Language: the list of languages the client accepts
  • Host: the domain name of the server
  • User-Agent: the client information
  • Referer: the previous page visited by the browser
  • If-Modified-Since: the last-modified time of the local copy; the server returns 304 if the resource has not been modified since then
  • If-None-Match: the identifier (ETag) of the local copy; the server returns 304 if it still matches
  • Cookie: the cookies of the current site

Response headers

  • Server: the name of the server
  • Location: the URL the client should redirect to
  • ETag: the resource identifier
  • Age: how long the resource has been in the proxy cache
  • Content-Type: the media type of the returned content
  • Content-Encoding: the encoding format of the content
  • Content-Language: the language of the content
  • Content-Length: the length of the response body
  • Expires: when the content expires
  • Last-Modified: when the content was last modified

Common values of the Content-Type header

  • application/x-www-form-urlencoded
  • multipart/form-data: form submission that can be used to upload files
  • application/json
  • text/xml
  • text/html
  • text/javascript
  • text/css
  • image/jpeg, etc.

4. How the browser works

1. Which processes does Chrome start to open a page

  • Browser main process: mainly responsible for interface display, user interaction, child-process management, and storage
  • Network process: mainly responsible for loading the network resources of the page
  • GPU process: web pages and Chrome's own UI are both drawn on the GPU, which makes the GPU process a common requirement of the browser
  • Rendering process: its core task is to turn HTML, CSS, and JavaScript into web pages that users can interact with. Both the layout engine Blink and the JavaScript engine V8 run in this process
  • Plug-in process: mainly responsible for running plug-ins. Plug-ins are prone to crashing, so they are isolated in their own process to ensure that a plug-in crash does not affect the browser or the page

When more tab pages are opened later, the browser process, network process, and GPU process are shared and are not started again. If two pages belong to the same site and page B is opened from page A, they share the same rendering process; otherwise, a new rendering process is created.

2. Browser cache: when is a resource stored on disk and when is it stored in memory

from memory cache or from disk cache

The exact mechanism is not well documented; a resource is probably written to memory when it is not too large and there is plenty of memory available

  • 1. Look in memory first; if the resource exists there, load it from memory
  • 2. If it is not found in memory, look on the hard disk; if it exists there, load it from the hard disk
  • 3. If it is not found on the hard disk, make a network request
  • 4. The loaded resource is cached to the hard disk and to memory
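
The lookup order above can be sketched as follows. This is a simplified model rather than Chrome's actual implementation: both caches are plain Map objects, and `createLoader`, `fetchFromNetwork`, and `clearMemory` are hypothetical names.

```javascript
// Simplified model of the lookup order: memory -> disk -> network.
function createLoader(fetchFromNetwork) {
  const memoryCache = new Map();
  const diskCache = new Map(); // stand-in for the on-disk cache

  function load(url) {
    if (memoryCache.has(url)) {
      return { source: "memory", body: memoryCache.get(url) };
    }
    if (diskCache.has(url)) {
      return { source: "disk", body: diskCache.get(url) };
    }
    // Not cached anywhere: request it, then store it in both caches.
    const body = fetchFromNetwork(url);
    diskCache.set(url, body);
    memoryCache.set(url, body);
    return { source: "network", body };
  }

  // Simulates the memory cache being released, e.g. when the tab is closed.
  function clearMemory() {
    memoryCache.clear();
  }

  return { load, clearMemory };
}

let networkCalls = 0;
const loader = createLoader((url) => {
  networkCalls += 1;
  return `content of ${url}`;
});

const firstLoad = loader.load("/app.js").source;  // "network"
const secondLoad = loader.load("/app.js").source; // "memory"
loader.clearMemory();
const thirdLoad = loader.load("/app.js").source;  // "disk"
console.log(firstLoad, secondLoad, thirdLoad, networkCalls);
```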

3. The browser is blocked when parsing documents

4. Why is DOM manipulation slow?

Because the DOM belongs to the rendering engine, while JS belongs to the JS engine. Manipulating the DOM through JS means crossing the boundary between the two engines, which always costs some performance. Manipulating the DOM many times means crossing that boundary over and over, and DOM changes can also trigger repaint and reflow, which leads to performance issues.

6. Blocking behavior during browser loading.

  • CSS does not block DOM parsing, but CSS loading blocks rendering of the DOM tree and blocks execution of the JS that follows it. This is because the browser builds the DOM tree and the CSS tree separately, so parsing is not blocked. However, the DOM tree and the CSS tree are merged into a render tree before rendering begins, so rendering is blocked.
  • JS blocks DOM parsing and is itself affected by CSS loading
  • External JS with defer: the JS files are downloaded in parallel, and are executed in order after HTML parsing finishes, before the DOMContentLoaded event.
  • External JS with async: downloading the JS file does not block HTML parsing, and the script is executed as soon as it finishes loading, so execution order is not guaranteed.

7. Why does JS affect DOM parsing

  • The HTML parser pauses when it encounters JS, and the JS engine takes over.
  • Because a script may read or modify the CSSOM, the rendering engine finishes downloading and parsing any pending CSS files before it executes a JavaScript script, regardless of whether the script actually manipulates the CSSOM.

8. Repaint and Reflow

  • Repaint (redraw) happens when a node needs to change its appearance without affecting its layout, such as changing its color
  • Reflow (backflow) happens when layout or geometry properties need to change.
  • A reflow always causes a repaint, but a repaint does not necessarily cause a reflow.
  • Compositing (for example transform): the animation is composited directly on a non-main thread, bypassing the reflow and repaint phases. This is the most efficient option

9. How to insert tens of thousands of DOM nodes

  • First of all, we can't insert tens of thousands of DOM nodes all at once, because that would certainly cause jank, so the real question is how to render the DOM in batches. Most people think of using requestAnimationFrame to insert the nodes batch by batch, but there is another way to solve this problem: a virtualized scroller.
  • The principle of this technique is to render only the content in the visible area and nothing outside it, and to replace the rendered content in real time as the user scrolls
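
The core of a virtualized scroller is working out which items fall inside the visible area. Below is a minimal sketch, assuming every row has the same fixed height; the function name `visibleRange` and the `overscan` parameter (extra rows rendered above and below to hide gaps while scrolling) are assumptions, not part of any particular library.

```javascript
// Computes which items to render for a list of fixed-height rows.
// Returns a half-open range [start, end) of item indexes.
function visibleRange(scrollTop, viewportHeight, itemHeight, totalItems, overscan = 2) {
  const first = Math.floor(scrollTop / itemHeight);           // first visible row
  const visibleCount = Math.ceil(viewportHeight / itemHeight); // rows that fit on screen
  const start = Math.max(0, first - overscan);
  const end = Math.min(totalItems, first + visibleCount + overscan);
  return { start, end };
}

// 50,000 rows of 30px in a 600px viewport: only a couple of dozen are rendered.
console.log(visibleRange(0, 600, 30, 50000));    // { start: 0, end: 22 }
console.log(visibleRange(3000, 600, 30, 50000)); // { start: 98, end: 122 }
```

In a real page this function would be called from a scroll handler, and only the items in the returned range would be created as DOM nodes.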

Operations that cause reflow

To see which areas are repainted, enable More tools -> Rendering -> Paint flashing in Chrome DevTools

  • The browser window size changes
  • The size or position of an element changes
  • Element content changes, such as one more or one fewer line of text
  • The font size changes
  • Elements are deleted or added
  • A CSS pseudo-class is activated, such as :hover when the mouse moves over an element
  • Scrolling
  • Common geometry properties change: width, height, padding, margin, left, top, border, and so on.
  • The most overlooked case: reading properties that must be computed in real time, such as offsetTop, offsetLeft, offsetWidth, offsetHeight, scrollTop, scrollLeft, scrollWidth, scrollHeight, clientTop, clientLeft, clientWidth, and clientHeight. The browser performs a reflow in order to return up-to-date values.
  • Reflow is also triggered when we call the getComputedStyle method, or currentStyle in IE. The reason is the same: the values must be immediate and accurate.

How to avoid reflow

  • Avoid table layouts. A change to any element in a table can affect the whole table and cause the elements before it to be laid out again.
  • When changing styles from JS, do not change them one property at a time. Change them all together by switching a class name
  • When implementing animation, make sure the parent element is positioned first, and prefer transform: translate.
  • Some browsers also promote elements that use transform to a separate layer, which skips the reflow and repaint calculations and uses GPU acceleration to make animation updates more efficient. A common trick is to set translateZ(0)

9. Cross domain

Why

  • Browsers enforce the same-origin policy for security reasons. The browser does not necessarily block the cross-site request itself: the request may be sent normally while the returned result is blocked by the browser
  • If the protocol, domain name, or port differs, the request is cross-domain
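
The same-origin comparison can be sketched with the standard URL class. The function name `isSameOrigin` is hypothetical; the URL class normalizes default ports (80 for http, 443 for https) to an empty string, so an explicit default port still compares as equal.

```javascript
// Two URLs share an origin when protocol, host name, and port all match.
function isSameOrigin(a, b) {
  const urlA = new URL(a);
  const urlB = new URL(b);
  return (
    urlA.protocol === urlB.protocol &&
    urlA.hostname === urlB.hostname &&
    urlA.port === urlB.port
  );
}

console.log(isSameOrigin("https://example.com/a", "https://example.com/b"));      // true
console.log(isSameOrigin("https://example.com", "https://example.com:443"));      // true
console.log(isSameOrigin("http://example.com", "https://example.com"));           // false
console.log(isSameOrigin("https://example.com", "https://api.example.com"));      // false
console.log(isSameOrigin("https://example.com", "https://example.com:8080"));     // false
```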

Solutions

1. JSONP: The principle is to exploit the fact that the script tag is not subject to the cross-domain restriction. The interface is requested through a script tag, and a callback function is provided to receive the data

  • JSONP only supports GET requests; XMLHttpRequest has a better error handling mechanism than JSONP
<script type="text/javascript">
  function dosomething(data) {
    // Process the obtained data
  }
</script>
<script src="http://example.com/data.php?callback=dosomething"></script>

2.CORS

Server-side setting: Access-Control-Allow-Origin: *

  • Simple request: the method is GET, HEAD, or POST, and the Content-Type is one of text/plain, multipart/form-data, or application/x-www-form-urlencoded

  • Complex request

Before the actual communication, an extra HTTP request using the OPTIONS method, called a "preflight" request, is sent to find out whether the server allows the cross-domain request.
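
The distinction between simple and complex requests can be sketched as a small classifier. This is a simplification that only looks at the method and the Content-Type, as described above; real browsers also inspect other request headers. The function name `needsPreflight` is hypothetical.

```javascript
const SIMPLE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SIMPLE_CONTENT_TYPES = new Set([
  "text/plain",
  "multipart/form-data",
  "application/x-www-form-urlencoded",
]);

// Returns true when the browser would send an OPTIONS preflight first.
// Simplified: only the method and Content-Type are considered.
function needsPreflight(method, contentType) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  if (contentType === undefined) return false;
  // Ignore parameters such as "; charset=utf-8".
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  return !SIMPLE_CONTENT_TYPES.has(mediaType);
}

console.log(needsPreflight("GET"));                               // false
console.log(needsPreflight("POST", "text/plain; charset=utf-8")); // false
console.log(needsPreflight("POST", "application/json"));          // true
console.log(needsPreflight("PUT", "text/plain"));                 // true
```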

3. postMessage (supported in Internet Explorer 8+)

4. Node proxy

5. Nginx reverse proxy

10. How to ensure that page files are delivered to the browser completely

1. The data packet is sent to the host

2. The host forwards the data packet to the application

3. Complete delivery of data packets to applications

Security issues

XSS cross-site scripting attacks

Malicious scripts are executed in the browser. There are three types: stored, reflected, and DOM-based. The defense comes down to one principle and two things to leverage:

  • Escape user input
  • Leverage the browser's Content Security Policy (CSP)
  • Leverage the HttpOnly attribute of the cookie
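
Escaping user input can be sketched as below. This is a minimal example for text placed inside HTML content or attribute values; the name `escapeHtml` is hypothetical, and other contexts (URLs, inline scripts, CSS) need their own escaping rules.

```javascript
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces the characters that have special meaning in HTML with entities.
function escapeHtml(input) {
  return String(input).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<script>alert("x")</script>'));
// &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

After escaping, the browser displays the input as text instead of executing it as a script.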

CSRF cross-site request forgery

Hackers lure users into clicking an external link. The browser then sends the user's local cookie information along with the request, so the attacker can forge requests in the user's name.

  • Set the SameSite attribute of the cookie to Strict, which completely prevents third-party requests from carrying the cookie
  • Verify the source site (can be forged, not recommended)
  • Use a token
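
The cookie attributes mentioned for XSS and CSRF can be combined in a single Set-Cookie header value. This is a minimal sketch: the helper name `buildSessionCookie` and the cookie name `sid` are hypothetical, and the Path and Secure attributes are additions that are assumed here and not discussed above.

```javascript
// Builds a Set-Cookie header value for a session cookie.
function buildSessionCookie(name, value) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "Path=/",
    "HttpOnly",        // not readable from document.cookie
    "Secure",          // assumption: only sent over HTTPS
    "SameSite=Strict", // not sent with third-party requests
  ].join("; ");
}

const sessionCookie = buildSessionCookie("sid", "abc123");
console.log(sessionCookie);
// sid=abc123; Path=/; HttpOnly; Secure; SameSite=Strict
```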

Local storage for the browser

  • cookie, localStorage, sessionStorage, indexedDB

Cookie

  • The storage size is relatively small, at most about 4 KB
  • Every request under the same domain name carries the cookie
  • Cookies are plaintext and can be hijacked, so set HttpOnly

Web Storage

  • The storage capacity is relatively large, at most about 5 MB
  • The API is simple: setItem, getItem.
  • It does not take part in communication with the server
  • localStorage is persistent storage, while sessionStorage is session-level storage that disappears once the current session page is closed.

indexedDB

  • A non-relational database that runs in the browser; in theory it has no upper storage limit
  • Operations are asynchronous, because database reads and writes are I/O operations
  • Subject to same-origin restrictions

Fourth, Git

1. Switch branches

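# List all local and remote branches
git branch -a
# Create a local dev branch that tracks origin/dev and switch to it
git checkout -b dev origin/dev
# Merge the dev branch into the current branch
git merge dev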