The first book I read before switching careers was Illustrated HTTP. HTTP itself comes in several versions: HTTP/1, HTTP/2, and HTTP/3. I used to hear about people being harmed by phishing sites without knowing that a CSRF attack was behind it, and I understood even less about the same-origin policy or the security sandbox. This fourth part of the browser principles series takes an in-depth look at browser networking and security.

29. HTTP/1

HTTP/0.9

HTTP/0.9 was proposed in 1991. It was mainly used for academic exchange, transferring HTML hypertext content across the network. The complete process is as follows:

  1. HTTP is based on TCP. The client establishes a TCP connection with the server based on the IP address and port.
  2. After the connection is established, the client sends a GET request line, such as GET /index.html.
  3. After receiving the request, the server reads the corresponding HTML file and returns the data to the client as an ASCII character stream.
  4. After the HTML document is transferred, the connection is closed.

HTTP/0.9 implementation features:

  • The request has only a request line: no request headers and no request body.
  • The response contains only data, with no headers.
  • The file contents are returned as an ASCII character stream.
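
The exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a connected socket pair stands in for the TCP connection, and `DOCUMENTS`, `send_request`, `handle_request`, and `read_document` are hypothetical names, not part of any real server.

```python
import socket

# Hypothetical document store for the demo.
DOCUMENTS = {"/index.html": b"<html>hello</html>"}


def send_request(sock: socket.socket, path: str) -> None:
    # Step 2: the entire request is one line, with no headers and no body.
    sock.sendall(f"GET {path}\r\n".encode("ascii"))


def handle_request(conn: socket.socket) -> None:
    # Steps 3 and 4: read the request line, send the raw document, disconnect.
    with conn, conn.makefile("rb") as reader:
        request_line = reader.readline().decode("ascii").strip()
        method, _, path = request_line.partition(" ")
        if method == "GET":
            # No status line and no headers precede the document.
            conn.sendall(DOCUMENTS.get(path, b"<html>not found</html>"))


def read_document(sock: socket.socket) -> bytes:
    # The document ends when the server closes the connection.
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    # Assumption: a connected socket pair stands in for the TCP connection of step 1.
    browser, server = socket.socketpair()
    with browser:
        send_request(browser, "/index.html")
        handle_request(server)
        return read_document(browser)
```

Because there is no Content-Length header, the only way the client knows the document is complete is that the server closes the connection.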

HTTP/1.0

Dial-up Internet access appeared in 1994, the same year Netscape launched its web browser. The growth of the World Wide Web led to the creation of HTTP/1.0.

Its core requirement was support for downloading multiple types of files.

New features:

  • To support multiple file types, the client and server negotiate through request headers and response headers.
  • Introduced status codes so the server can report how it handled the request.
  • Provided a caching mechanism (driven by response headers such as Expires; Cache-Control arrived later with HTTP/1.1) so that downloaded data can be cached, relieving pressure on the server.
  • Added User-Agent to the request headers so the server can collect basic client information.
Example request headers:

accept: text/html
accept-encoding: gzip, deflate, br
accept-charset: ISO-8859-1,utf-8
accept-language: zh-CN,zh

Example response headers:

content-encoding: br
content-type: text/html; charset=UTF-8
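
As a rough sketch of how a server might use these headers for negotiation, the following Python picks a content encoding from an accept-encoding value. The function names and the server preference list are hypothetical, and real servers handle more cases (wildcards, `identity`, malformed values).

```python
from typing import Dict, List, Optional


def parse_accept(header_value: str) -> Dict[str, float]:
    """Map each offered token to its q-value (1.0 when no q is given)."""
    offered: Dict[str, float] = {}
    for part in header_value.split(","):
        token, _, params = part.strip().partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q" and value:
                quality = float(value)
        if token:
            offered[token.strip().lower()] = quality
    return offered


def choose_encoding(accept_encoding: str, server_preference: List[str]) -> Optional[str]:
    """Return the first encoding the server prefers that the client accepts."""
    offered = parse_accept(accept_encoding)
    for encoding in server_preference:
        if offered.get(encoding, 0.0) > 0:
            return encoding
    return None  # fall back to sending the body unencoded
```

With the request headers above and a server that prefers `br` over `gzip`, `choose_encoding("gzip, deflate, br", ["br", "gzip"])` selects `br`, which matches the `content-encoding: br` response header.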

HTTP/1.1

  • Added persistent connections (enabled by default), which allow multiple HTTP requests to be sent over one TCP connection.
  • Pipelining (since abandoned by browsers) was an attempt to mitigate head-of-line blocking: multiple HTTP requests are sent to the server in a batch, but the server still has to return the responses in order.
  • Added the Host request header to support virtual hosts (one physical host carries multiple virtual hosts that share an IP address, each with its own domain name).
  • Introduced chunked transfer encoding to handle dynamically generated content whose length is not known in advance.
  • Introduced the client cookie mechanism and security mechanisms.
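
To make the chunked transfer mechanism concrete, here is a minimal Python sketch of the wire format: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names are hypothetical, and trailers and chunk extensions are ignored.

```python
from typing import Iterable


def encode_chunked(pieces: Iterable[bytes]) -> bytes:
    """Encode pieces of a body whose total length is not known up front."""
    out = bytearray()
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # the last chunk has size 0
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a complete chunked message."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]  # drop chunk extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)
```

The server can start sending as soon as the first piece is ready, which is what makes this suitable for dynamically generated pages.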

30. HTTP/2

The HTTP/2 protocol specification was officially published in 2015.

Major problems with HTTP/1.1

The main causes of low bandwidth utilization are as follows:

  • TCP slow start (a strategy for reducing network congestion).
  • When multiple TCP connections are open at the same time, they compete with each other for a fixed amount of bandwidth.
  • Head-of-line blocking.

HTTP/2 new features

Multiplexing

Each domain uses a single long-lived TCP connection, and resources are requested in parallel over it, which eliminates head-of-line blocking at the HTTP level. HTTP/2 adds a binary framing layer to implement multiplexing.

The request and response process in detail:

  1. The browser prepares the request data.
  2. The request data is processed by the binary framing layer and converted into frames tagged with a request ID, which are sent to the server through the protocol stack.
  3. After receiving all the frames, the server merges the frames with the same ID into one complete request message. (The server can suspend earlier requests and prioritize requests for critical resources.)
  4. The server processes the request and hands the response line, response headers, and response body to the binary framing layer.
  5. The binary framing layer converts the response data into frames tagged with the request ID and sends them to the browser through the protocol stack.
  6. The browser receives the response frames and delivers each frame's data to the matching request based on its ID.
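
The idea of splitting messages into ID-tagged frames and merging them again can be sketched as follows. This is a simplified model, not the real HTTP/2 frame format (which has a binary header with length, type, and flags); the function names and the frame size are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List, Tuple

Frame = Tuple[int, bytes]  # (request ID, payload slice)


def to_frames(request_id: int, payload: bytes, frame_size: int) -> List[Frame]:
    """Cut one message into frames tagged with its request ID."""
    return [
        (request_id, payload[i:i + frame_size])
        for i in range(0, len(payload), frame_size)
    ]


def interleave(*frame_lists: List[Frame]) -> List[Frame]:
    """Send frames of several requests round-robin over one connection."""
    return [
        frame
        for group in zip_longest(*frame_lists)
        for frame in group
        if frame is not None
    ]


def reassemble(frames: List[Frame]) -> Dict[int, bytes]:
    """Merge frames with the same ID back into complete messages."""
    messages: Dict[int, bytearray] = defaultdict(bytearray)
    for request_id, chunk in frames:
        messages[request_id] += chunk
    return {request_id: bytes(buf) for request_id, buf in messages.items()}
```

Because every frame carries its ID, the frames of different requests can share one connection in any interleaving and still be reassembled correctly on the other side.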

Other features

  • Request priority can be set. For example, the PRIORITY frame type is used to specify or re-specify the priority of a referenced resource.
  • Server push.
  • Header compression.

31. HTTP/3

The main problems with HTTP/2

  • TCP head-of-line blocking. HTTP/2 only solves head-of-line blocking at the application layer. During TCP transmission, the loss of a single packet due to network faults or other reasons blocks all requests on that TCP connection.
  • TCP connection setup delay. Establishing a TCP connection takes 1.5 RTT, and the HTTPS TLS handshake takes another 1-2 RTT. When the browser and server are physically far apart, the connection delay is significant.
  • TCP ossification. TCP traffic passes through many intermediate devices, such as routers, firewalls, NATs, and switches. These devices run rarely upgraded software that depends on many TCP features. TCP itself is implemented in the operating system kernel, and operating systems are updated slowly.
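
A quick piece of arithmetic shows why the setup delay matters. The sketch below uses the RTT counts quoted above; the function name and the 100 ms round-trip time are assumptions chosen for illustration.

```python
def connection_setup_ms(rtt_ms: float, tcp_rtts: float = 1.5, tls_rtts: float = 2.0) -> float:
    """Time spent on handshakes before the first HTTP request can be sent."""
    return rtt_ms * (tcp_rtts + tls_rtts)


# Assumed example: a 100 ms round trip between browser and server.
worst_case = connection_setup_ms(100)                # TLS takes 2 RTT
best_case = connection_setup_ms(100, tls_rtts=1.0)   # TLS takes 1 RTT
```

Under these assumptions, between 250 ms and 350 ms pass before any page data is requested.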

HTTP/3 and the QUIC protocol

QUIC (Quick UDP Internet Connections)

  • Built on UDP, it implements flow control and reliable transmission similar to TCP.
  • It integrates TLS encryption to achieve a fast handshake (encryption setup and connection setup are combined), reducing the number of RTTs.
  • It supports multiple independent logical data streams on the same physical connection, which solves TCP head-of-line blocking.
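
The difference between one ordered byte stream and independent streams can be illustrated with a toy model. This is not real TCP or QUIC behaviour (there is no retransmission here); it only shows which packets can be handed to the application right after a loss. All names are hypothetical.

```python
from typing import List, Set, Tuple

Packet = Tuple[int, str]  # (stream ID, data)


def deliverable_single_stream(packets: List[Packet], lost: Set[int]) -> List[Packet]:
    """TCP-like: one ordered stream, so nothing after a lost packet is delivered."""
    delivered = []
    for index, packet in enumerate(packets):
        if index in lost:
            break
        delivered.append(packet)
    return delivered


def deliverable_independent_streams(packets: List[Packet], lost: Set[int]) -> List[Packet]:
    """QUIC-like: ordering is per stream, so a loss only stalls its own stream."""
    stalled: Set[int] = set()
    delivered = []
    for index, (stream_id, data) in enumerate(packets):
        if index in lost:
            stalled.add(stream_id)
        elif stream_id not in stalled:
            delivered.append((stream_id, data))
    return delivered
```

With packets for streams 1 and 2 interleaved and one packet of stream 1 lost, the single-stream model stops at the loss, while the independent-stream model keeps delivering stream 2.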

HTTP/3 challenges:

  • At the time of writing, neither browsers nor servers provide complete support for HTTP/3.
  • Operating system kernels optimize UDP far less than TCP.
  • Intermediate devices are ossified, which leads to a high packet loss rate.

32. Same-origin Policy (Page Security)

Same-origin policy restrictions

  1. DOM level: scripts are restricted from manipulating the DOM of a page from a different origin.
  2. Data level: access to Cookie, IndexedDB, and LocalStorage data of a different origin is restricted.
  3. Network level: sending a site's data to a different origin via XMLHttpRequest is restricted.
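
Two URLs have the same origin when their scheme, host, and port all match. A minimal Python sketch of that comparison follows; the function names and the default-port table are assumptions for illustration, and browsers handle more cases (for example opaque origins).

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def same_origin(url_a: str, url_b: str) -> bool:
    return origin_of(url_a) == origin_of(url_b)
```

Note that a different subdomain or a different scheme is enough to make two pages cross-origin, even when they belong to the same site.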

Security vs. Convenience

  1. Pages can reference third-party resources, but this openness exposes many security problems such as XSS, so CSP was introduced to limit that freedom.
  2. Cross-origin requests cannot be made directly with XMLHttpRequest or Fetch, so browsers introduced cross-origin resource sharing (CORS) on top of this strict policy to make cross-origin requests safe.
  3. The DOMs of two different origins cannot manipulate each other, so browsers implement cross-document messaging (window.postMessage) to let them communicate safely.
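
On the server side, cross-origin resource sharing comes down to deciding whether to echo the request's Origin back in an Access-Control-Allow-Origin response header. The sketch below assumes a hypothetical allow-list and function name; it covers only simple requests, not preflight or credentials.

```python
from typing import Dict, Set

# Hypothetical allow-list of origins that may read this server's responses.
ALLOWED_ORIGINS: Set[str] = {"https://app.example.com"}


def cors_response_headers(request_origin: str, allowed: Set[str] = ALLOWED_ORIGINS) -> Dict[str, str]:
    """Return the extra response headers for a cross-origin request."""
    if request_origin in allowed:
        return {
            "Access-Control-Allow-Origin": request_origin,
            "Vary": "Origin",  # the response differs per requesting origin
        }
    # Without the header, the browser refuses to expose the response to the page.
    return {}
```

The enforcement happens in the browser: the server only states which origins it trusts.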

33. XSS cross-site scripting attack (Page security)

What malicious scripts can do

  • Steal cookie information, then impersonate the user's login on another computer and operate the user's account.
  • Monitor user behavior, such as listening to keyboard events to capture sensitive information the user types (bank card numbers, account passwords, and so on).
  • Modify the DOM to forge a fake login window that tricks users into entering their user name and password.
  • Generate floating ads on the page.

Malicious script injection methods

  1. Stored XSS attack. (For example, a stored XSS vulnerability in Ximalaya was disclosed in 2015.)
    1. The hacker exploits a vulnerability in the site to submit malicious JS code to the site's database.
    2. A user requests a page that contains the malicious JS script.
    3. While the user browses the page, the malicious script uploads user information and other data to a malicious server.
  2. Reflected XSS attack.
    1. The malicious script is part of the request the user sends to the website, such as the value of the xss parameter in http://localhost:3000/?xss=
    2. The site returns the malicious JS script to the user.
    3. When the malicious JS script runs in the user's page, the hacker can use it to perform malicious operations.
  3. DOM-based XSS attack. Hackers inject malicious scripts into the user's page by modifying the HTML content in transit, for example through network hijacking (Wi-Fi router hijacking or local malware).

Preventing XSS attacks

  1. The server filters or escapes scripts in user input.
  2. CSP (Content Security Policy): restricts loading resources from other domains, prohibits submitting data to third-party domains, prohibits executing inline scripts and unauthorized scripts, and provides a reporting mechanism that helps detect XSS attacks.
  3. The server sets the HttpOnly flag on cookies so that they are only used in HTTP requests and cannot be read by scripts, which prevents XSS from stealing cookies.
  4. Add a captcha to prevent scripts from posing as the user to submit dangerous operations, and limit the length of untrusted input.
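
Measures 1 and 3 can be sketched with the Python standard library. The helper names are hypothetical, and a real application would normally rely on its template engine and web framework for both.

```python
import html
from http.cookies import SimpleCookie


def render_comment(untrusted_text: str) -> str:
    """Escape user input so the browser shows it as text instead of running it."""
    return "<p>" + html.escape(untrusted_text) + "</p>"


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie value that page scripts cannot read."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()
```

After escaping, a submitted `<script>` tag arrives in the page as `&lt;script&gt;`, and a cookie marked HttpOnly is not visible through `document.cookie`.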

Ways to enable CSP:

  1. HTTP header. Example: Content-Security-Policy: default-src https:; connect-src https:; ...
  2. The meta tag. For an example, see: www.ruanyifeng.com/blog/2016/0…
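
A CSP header value is a list of directives separated by semicolons, each followed by its allowed sources. The sketch below assembles one from a dictionary; `build_csp` is a hypothetical helper, and the directives are the two shown in the example above.

```python
from typing import Dict, List


def build_csp(directives: Dict[str, List[str]]) -> str:
    """Join directives into a Content-Security-Policy header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


policy = build_csp({
    "default-src": ["https:"],   # load resources over HTTPS only
    "connect-src": ["https:"],   # XHR/fetch targets must use HTTPS
})
headers = {"Content-Security-Policy": policy}
```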

34. CSRF cross-site request forgery (page security)

A CSRF attack exploits the user's login state together with a server-side vulnerability.

CSRF attack conditions

  1. The target site has a CSRF vulnerability.
  2. The user has logged in to the target site, and the login state is kept in the browser.
  3. The user opens a third-party site, such as a hacker's site or a forum.

CSRF attack methods

  1. Automatically initiating a GET request (for example, through the src of an img tag).
  2. Automatically initiating a POST request (an automatically submitted form).
  3. Enticing the user to click (for example, an a tag link placed below a picture).

Preventing CSRF attacks

  1. The SameSite attribute of cookies stops the browser from sending those cookies with requests initiated from third-party sites.
  2. The server verifies which site a request came from by checking the Referer and Origin request headers.
  3. CSRF Token. The server generates a token and returns it to the page. Requests from the page carry the token, but a request from a third-party site cannot obtain the CSRF Token.
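
The CSRF Token approach can be sketched as follows. The session is modeled as a plain dictionary and the function names are hypothetical; a real framework would store the token in its own session mechanism.

```python
import hmac
import secrets
from typing import Dict, Optional


def issue_csrf_token(session: Dict[str, str]) -> str:
    """Generate a token, remember it server-side, and return it for the page."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session: Dict[str, str], submitted: Optional[str]) -> bool:
    """Accept a request only if it carries the token issued to this session."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

A third-party site can make the browser send the user's cookies, but it cannot read the target page, so it has no way to learn the token it would need to include.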

35. Security Sandbox (System Security)

Modern browsers, built on a multi-process architecture, introduce a security sandbox that isolates the renderer process from the operating system. The renderer process is the most vulnerable because it performs DOM parsing, CSS parsing, decoding of images from the network, and so on. The smallest unit the security sandbox can protect is a process.

The renderer process cannot interact directly with the operating system. Functions such as persistent storage, network access, and user interaction are therefore implemented in the browser kernel, which interacts with the operating system and communicates with the renderer process through IPC.

Window handles

A window handle is an interface provided by the operating system that allows an application to draw on a window.

To limit the renderer's ability to monitor user input events, the renderer process is not allowed to operate on window handles.

  • The renderer process renders a bitmap and sends it to the browser kernel, which copies the bitmap to the screen.
  • User input events are passed by the operating system to the browser kernel, which dispatches them based on the current focus. (When the focus is on the address bar, input events are handled inside the browser kernel; when the focus is on the page, the browser kernel forwards the events to the renderer process.)

Site isolation

Chrome runs related pages from the same site in the same renderer process.

Giving iframes from other sites their own renderer process prevents a malicious iframe from using Spectre/Meltdown to break into the process of the embedding page and attack the operating system.

36. HTTPS (Network Security)

At the protocol stack level, HTTPS inserts an SSL/TLS security layer between HTTP and TCP. All data passing through the security layer is encrypted or decrypted.

Encryption

Symmetric encryption

Encryption and decryption use the same key.

Implementation process:

  1. The browser sends its cipher suite list (the encryption methods it supports) and client-random to the server.
  2. The server returns the cipher suite it selected from the list and service-random to the browser.
  3. The browser and the server use the same method to mix the two random values and generate the master secret.
  4. Both sides can then use the key and the cipher suite to encrypt data.

The problem: the cipher suite and the random numbers are transmitted in plaintext, so a hacker can generate the same key and forge or tamper with the data.

❓❓ question: what does the “same method” of step 3 refer to specifically?
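
The weakness is easy to demonstrate. In the sketch below, SHA-256 over the two random values is only a stand-in for the "same method" (real TLS uses its own key-derivation function); `derive_key` is a hypothetical name. Anyone who sees both plaintext random values ends up with the same key.

```python
import hashlib
import secrets


def derive_key(client_random: bytes, service_random: bytes) -> bytes:
    """Stand-in key derivation: any fixed public method behaves the same way."""
    return hashlib.sha256(client_random + service_random).digest()


# Both values cross the network in plaintext.
client_random = secrets.token_bytes(32)
service_random = secrets.token_bytes(32)

browser_key = derive_key(client_random, service_random)
server_key = derive_key(client_random, service_random)

# An eavesdropper who captured both values runs the same public method.
eavesdropper_key = derive_key(client_random, service_random)
```

Since the method is public and every input is visible on the wire, the key is not secret.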

Asymmetric encryption

There are two keys, A and B. Data encrypted with A can only be decrypted with B.

Implementation process:

  1. The browser sends its cipher suite list to the server.
  2. The server returns to the browser the cipher suite it selected from the list and the public key the browser should encrypt with.
  3. After both parties confirm, the browser can send data encrypted with the public key to the server.

Existing problems:

  1. Asymmetric encryption is slow.
  2. Data sent by the server to the browser cannot be kept secret: it can only be encrypted with the private key, and anyone holding the public key can decrypt it.

Mixed encryption

Symmetric encryption is used in data transmission, and symmetric encryption keys are transmitted using asymmetric encryption.

Implementation process:

  1. The browser sends a list of symmetric cipher suites, a list of asymmetric cipher suites, and client-random to the server.
  2. The server returns to the browser the symmetric cipher suite and asymmetric cipher suite it selected, service-random, and its public key.
  3. The browser generates a random pre-master (it is not derived from the two random values, which travel in plaintext), encrypts the pre-master with the public key, and sends the encrypted data to the server.
  4. The server decrypts the pre-master with its private key and returns a confirmation message.
  5. The browser and the server apply the same method to client-random, service-random, and pre-master to generate the key used for symmetrically encrypted transmission.

❓❓ question: Can’t the hacker get the data in step 1 and step 2 and generate a pre-master?
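
The sketch below walks through the exchange with toy numbers, assuming (as in RSA key exchange in TLS) that the pre-master is a fresh random value chosen by the browser, so an eavesdropper who sees steps 1 and 2 cannot reproduce it. The RSA here is textbook RSA without padding on deliberately small primes: it is insecure and only shows the data flow. SHA-256 again stands in for the real key-derivation function, and all names are hypothetical.

```python
import hashlib
import secrets

# Toy server key pair (two Mersenne primes; far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the server


def derive_session_key(client_random: bytes, service_random: bytes, pre_master: int) -> bytes:
    material = client_random + service_random + pre_master.to_bytes(32, "big")
    return hashlib.sha256(material).digest()


# Steps 1 and 2: random values (and the public key) travel in plaintext.
client_random = secrets.token_bytes(32)
service_random = secrets.token_bytes(32)

# Step 3: the browser picks a secret pre-master and encrypts it with the public key.
pre_master = secrets.randbelow(N)
encrypted_pre_master = pow(pre_master, E, N)

# Step 4: only the holder of the private key can recover it.
recovered_pre_master = pow(encrypted_pre_master, D, N)

# Step 5: both sides derive the same symmetric key.
browser_key = derive_session_key(client_random, service_random, pre_master)
server_key = derive_session_key(client_random, service_random, recovered_pre_master)
```

The eavesdropper sees both random values and the encrypted pre-master, but without the private key cannot obtain the third input to the key derivation.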

Digital certificates

A hacker can use DNS hijacking to replace the legitimate IP address with the hacker's own IP address and then complete the subsequent mixed encryption process in the server's place. Therefore, a digital certificate issued by an authority is required.

Functions of digital certificates:

  1. Prove the identity of the server to the browser.
  2. The digital certificate also contains the server public key.

A digital certificate contains the following information: the public key, organization information, CA information, validity period, certificate serial number, and the CA's digital signature. (Taking Geek Time as the applicant: the plaintext information it submits is run through a hash function to produce a message digest, and the digest encrypted with the CA's private key is the digital signature the CA issues to Geek Time.)

The operating system ships with the certificate information (including public keys) of trusted top-level CAs. If following the certificate chain does not lead to one of these built-in top-level CAs, the certificate is judged to be invalid.
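
The sign-and-verify step can be sketched with the same kind of toy RSA: textbook RSA without padding on small primes, which is insecure and serves only to show the flow. Real certificates use standardized signature schemes; the key values and function names here are assumptions for illustration.

```python
import hashlib

# Toy CA key pair (two Mersenne primes; far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # CA public modulus, built into the OS/browser
E = 65537                           # CA public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # CA private exponent


def digest_as_int(certificate_info: bytes) -> int:
    """Hash the plaintext certificate information into a number below N."""
    return int.from_bytes(hashlib.sha256(certificate_info).digest(), "big") % N


def ca_sign(certificate_info: bytes) -> int:
    """CA side: encrypt the digest with the private key to form the signature."""
    return pow(digest_as_int(certificate_info), D, N)


def browser_verify(certificate_info: bytes, signature: int) -> bool:
    """Browser side: decrypt with the CA public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(certificate_info)
```

If a hacker changes any field of the certificate (for example the public key), the recomputed digest no longer matches the one recovered from the signature, and verification fails.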

Series

Browser principles, part 1: The big picture

Browser principles, part 3: Pages

Browser principles, part 2: JS and V8