Browser security is mainly divided into three parts: page security, system security, and network security. The topics below (XSS, CSRF, the browser security sandbox, and HTTPS) are notes from my recent study and review. They are only a personal learning record; if anything is wrong, please forgive me and point it out.
XSS stands for Cross-Site Scripting. It is abbreviated as XSS rather than CSS to avoid confusion with Cascading Style Sheets. An XSS attack is one in which a hacker injects a malicious script into an HTML file or the DOM, so that the script attacks users when they browse the page.
Stored XSS attack
Reflected XSS attack
For example, an attacker appends a malicious script to a link that points at the website. When someone clicks the link, the script is sent to the website, reflected back in the response, and executed in that person's browser.
Unlike in a stored XSS attack, the web server does not store the malicious script in a reflected XSS attack.
DOM based XSS attack
Hackers inject malicious scripts into the user's page by various means, for example by hijacking the network and modifying the HTML while the page is in transit. There are many kinds of hijacking, including WiFi router hijacking and hijacking by local malware. What they have in common is that they modify the page's data while the web resource is being transferred or while the user is using the page.
- Don't trust any input from the user. Check, filter, and escape the user's input.
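The escaping step can be illustrated with a minimal sketch. It assumes a hypothetical `render_comment` helper that places user text inside a paragraph element; real applications usually rely on a template engine that escapes by default.

```python
import html


def render_comment(user_input: str) -> str:
    """Return an HTML fragment with the user's text escaped (hypothetical helper)."""
    # html.escape turns &, < and > (and, by default, both quote characters)
    # into HTML entities, so the browser displays them as text instead of
    # interpreting them as markup.
    return "<p>" + html.escape(user_input) + "</p>"


payload = '<script>alert("xss")</script>'
print(render_comment(payload))
```

Escaping must match the context in which the value is placed: this sketch only covers text placed in an HTML element body, not attributes, URLs, or inline scripts.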
Second, CSRF
CSRF stands for cross-site request forgery. In a CSRF attack, the attacker uses the victim's cookie to gain the server's trust and sends forged requests to the attacked server in the victim's name, without the victim's knowledge, so as to perform operations that are protected by permissions without being authorized.
One question: given the browser's same-origin policy, how can a request sent from a bogus site affect the user's session on the legitimate site? In a CSRF attack, the hacker indeed cannot read the data of the victim's site. But from his own site A, the hacker can call an HTTP interface of the victim's site B, for example to transfer money, delete posts, or change settings. Note that when the hacker's site A calls the HTTP interface of site B, the browser by default still sends the victim's cookies and other information to site B (not to the hacker's site A). If site B has a flaw, the attack succeeds, for example by transferring away the victim's money!
- The SameSite attribute of cookies. When a cookie is set through the Set-Cookie field of an HTTP response header, the SameSite option can be added as follows:
set-cookie: 1P_JAR=2019-10-20-06; expires=Tue, 19-Nov-2019 06:36:21 GMT; path=/; domain=.google.com; SameSite=none
The SameSite option usually takes one of three values: Strict, Lax, or None. Strict is the strictest: the browser never sends the cookie on a cross-site request. For example, if you access InfoQ resources from a Geektime page and some InfoQ cookies are set to SameSite=Strict, those cookies are not sent to InfoQ's server; they are carried only when an InfoQ page requests an InfoQ resource. Lax is a little looser. In a cross-site scenario, the cookie is carried when the user opens a link from a third-party site or submits a GET form from a third-party site. It is not carried when the third-party site uses the POST method or loads the URL through tags such as img or iframe. None sends the cookie in every case, although current browsers generally accept SameSite=None only on cookies that are also marked Secure.
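As a minimal sketch of setting this attribute on the server, the following uses Python's standard `http.cookies` module (the `samesite` key needs Python 3.8 or later). The cookie name `session_id` and the helper `build_session_cookie` are assumptions made for illustration.

```python
from http.cookies import SimpleCookie


def build_session_cookie(value: str, samesite: str = "Lax") -> str:
    """Build the value of a Set-Cookie header (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie["session_id"] = value                 # hypothetical cookie name
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True      # hide the cookie from page scripts
    cookie["session_id"]["secure"] = True        # send over HTTPS only
    cookie["session_id"]["samesite"] = samesite  # "Strict", "Lax" or "None"
    return cookie["session_id"].OutputString()


header_value = build_session_cookie("abc123", "Strict")
print("Set-Cookie:", header_value)
```

Lax is a reasonable default for a session cookie; Strict is safer but also drops the cookie when the user arrives through a link from another site.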
- Verify the source site of the request
The server validates the source site of the current request by checking the Referer and Origin headers of the HTTP request.
The Origin header contains only the scheme, host, and port of the source, not the specific URL path, which is the major difference between Origin and Referer.
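A source check along these lines could be sketched as follows. The allowlist entry `https://bank.example`, the helper names, and the plain `dict` of headers are all assumptions; a real framework exposes headers through its own case-insensitive request object.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allowlist of origins that may call sensitive endpoints.
ALLOWED_ORIGINS = {"https://bank.example"}


def origin_of(url: str) -> Optional[str]:
    """Reduce a URL to scheme://host[:port], or None if it is not a full URL."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}".lower()


def is_trusted_source(headers: dict) -> bool:
    """Prefer Origin; fall back to Referer; reject when both are missing."""
    origin = headers.get("Origin")
    if origin:
        return origin_of(origin) in ALLOWED_ORIGINS
    referer = headers.get("Referer")
    if referer:
        return origin_of(referer) in ALLOWED_ORIGINS
    return False


print(is_trusted_source({"Origin": "https://evil.example"}))
```

Rejecting requests that carry neither header is the stricter choice; some deployments relax it because certain clients strip both headers.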
- CSRF Token
Step 1: When the browser makes a request to the server, the server generates a CSRF Token and embeds it in the page it returns.
Step 2: When the browser initiates a transfer or another sensitive operation, the request must carry the CSRF Token from the page, and the server verifies that the Token is legitimate. A request made from a third-party site cannot obtain the CSRF Token, so even if it is sent, the server rejects it because the Token is incorrect.
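The two steps can be sketched with the standard library. The `session` dictionary stands in for whatever server-side session store is in use, and both function names are hypothetical.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Step 1: create a random token, remember it server-side, return it for the page."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session: dict, submitted: str) -> bool:
    """Step 2: compare the token sent with the request against the stored one."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of a guess were correct.
    return hmac.compare_digest(expected.encode(), submitted.encode())


session = {}
page_token = issue_csrf_token(session)
print(is_valid_csrf_token(session, page_token))
print(is_valid_csrf_token(session, "guessed-by-attacker"))
```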
Third, the security sandbox
Modern browsers are split into multiple processes to improve stability. In this multi-process architecture, the way the modules communicate with each other is somewhat complicated, and you may have the following questions:
1. Why must resources be requested through the browser kernel and the data then forwarded to the renderer process, instead of the renderer process requesting network resources directly?
2. Why is the renderer process only responsible for generating the page image, which it then hands to the browser kernel through IPC so that the browser kernel displays it?
Doesn't this increase the complexity of the project?
First of all, the browser is divided into two core modules: the browser kernel and the rendering kernel. The browser kernel consists of the network process, the GPU process, and the browser main process. All operations that touch the system or perform I/O are carried out in the browser kernel, such as cookie storage, cache storage, file downloads, file reads, and network requests. The rendering kernel is the renderer process, which handles HTML and CSS parsing, JavaScript execution, layout, and drawing.
The smallest unit that a security sandbox protects is the process, and the process it protects is the renderer process. A malicious program downloaded from the Internet does no harm until it is executed. In the browser, parsing and executing downloaded page content is done by the renderer process, which is why that process runs inside the sandbox.
Site isolation means that Chrome puts related pages from the same site (same root domain and same protocol) into the same renderer process.
Chrome originally divided renderer processes by tab, meaning that a whole tab was handled by one renderer process. This is problematic because a tab may contain several iframes from different sites, so content from multiple sites ends up running in the same renderer process.
For example, a page embeds three iframes, A, B, and C, and one of them contains a vulnerability or a malicious program. Because all three run in the same renderer process, the malicious iframe may affect the other sites: the sandbox only isolates the renderer process from the system, not the contents of one renderer process from each other.
With site isolation, a malicious iframe is confined to its own renderer process, so it cannot access the content of the processes that host the other iframes and therefore cannot attack other sites.
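The grouping rule (same protocol plus same root domain) can be sketched roughly as below. Taking the last two host labels as the root domain is a simplifying assumption that is wrong for suffixes such as `co.uk`, and the example URLs are made up.

```python
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Return 'scheme://root-domain' for a URL (naive root-domain rule)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Naive assumption: the root domain is the last two labels of the host.
    root_domain = ".".join(host.split(".")[-2:])
    return f"{parts.scheme}://{root_domain}"


def group_by_site(urls):
    """Group frame URLs so that each group could share one renderer process."""
    groups = {}
    for url in urls:
        groups.setdefault(site_key(url), []).append(url)
    return groups


frames = [
    "https://www.example.com/page",
    "https://ads.example.com/banner",
    "https://widgets.example.net/chat",
]
for site, members in group_by_site(frames).items():
    print(site, members)
```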
HTTPS protocol stack: HTTP - security layer (SSL/TLS) - TCP - IP - data link layer. HTTP protocol stack: HTTP - TCP - IP - data link layer. The security layer does two things: it encrypts the HTTP requests being sent and decrypts the HTTP content being received.
First edition: Symmetric encryption
- First, the browser sends a list of cipher suites (encryption suites) and a random number, client-random, to the server. A cipher suite is an encryption method, and the list contains the encryption methods that the browser supports.
- The server receives the list, selects a cipher suite, generates its own random number, service-random, and returns both to the browser.
- At this point, the browser and the server have agreed on a cipher suite and each holds both random numbers. They use the same method to generate a key from client-random and service-random, and then use that key to encrypt and decrypt the content they send.
Problem: The browser and the server exchange the cipher suite and the random numbers in clear text. If this exchange is intercepted, the hacker can generate the same key.
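The weakness of the first edition can be shown in a few lines. The derivation below (SHA-256 over the two random numbers) is an assumption chosen for illustration, not the key derivation TLS actually uses.

```python
import hashlib
import secrets


def derive_key(client_random: bytes, service_random: bytes) -> bytes:
    """Hypothetical key derivation: hash the two public random numbers."""
    return hashlib.sha256(client_random + service_random).digest()


# Both values cross the network in clear text.
client_random = secrets.token_bytes(32)
service_random = secrets.token_bytes(32)

browser_key = derive_key(client_random, service_random)
server_key = derive_key(client_random, service_random)

# An eavesdropper who copied both values off the wire repeats the same step.
eavesdropper_key = derive_key(client_random, service_random)

print(browser_key == server_key == eavesdropper_key)
```

Every input to the key is public, so the key itself is effectively public.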
Second edition: Asymmetric encryption
- First, the browser sends the list of cipher suites to the server.
- The server receives it and returns the chosen cipher suite and a public key to the browser, while keeping the private key to itself.
- The browser then encrypts its data with the public key and sends it to the server, and the server decrypts it with the private key. The server encrypts its reply with the private key and sends it to the browser, which decrypts it with the public key, completing the exchange. Because the private key exists only on the server, a hacker cannot decrypt the data sent by the browser even if he captures it.
Problems: 1. A hacker indeed cannot decrypt messages that the browser encrypts with the public key, but he can obtain the public key during the initial plaintext exchange and then decrypt the messages that the server encrypts with the private key. 2. Asymmetric encryption is too inefficient; it seriously slows down encryption and decryption, and therefore the speed at which users can open pages.
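The first problem can be demonstrated with textbook RSA. This is only a toy under loud assumptions: the key is built from two well-known Mersenne primes and is trivially breakable, there is no padding, and the message contents are invented. `pow(E, -1, m)` requires Python 3.8 or later.

```python
# Toy RSA key (NOT secure): two well-known Mersenne primes.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q
E = 65537                          # public exponent, sent in clear text
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, stays on the server


def apply_public_key(value: int) -> int:
    return pow(value, E, N)


def apply_private_key(value: int) -> int:
    return pow(value, D, N)


request = int.from_bytes(b"transfer 100", "big")
response = int.from_bytes(b"balance: 42", "big")

# Browser -> server: encrypted with the public key, only the server can read it.
wire_request = apply_public_key(request)
server_view = apply_private_key(wire_request)

# Server -> browser: encrypted with the private key, so anyone holding the
# public key (including an eavesdropper) can read it.
wire_response = apply_private_key(response)
leaked = apply_public_key(wire_response)

print(leaked.to_bytes((leaked.bit_length() + 7) // 8, "big"))
```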
Third edition: Symmetric encryption and asymmetric encryption used together
- First, the browser generates a random number, client-random, and sends the list of symmetric cipher suites, the list of asymmetric cipher suites, and client-random to the server.
- After receiving this information, the server generates service-random, selects a symmetric cipher suite and an asymmetric cipher suite (generating a public key and a private key), and sends service-random, the public key, and the chosen symmetric cipher suite to the browser.
- The browser saves the public key, generates a random number called pre-master, encrypts pre-master with the public key, and sends the encrypted result to the server.
- The server decrypts pre-master with the private key. At this point, the server and the browser both hold the same three random numbers: client-random, service-random, and pre-master. They use these three numbers and the chosen symmetric algorithm to generate a symmetric key, and then use that key to encrypt and decrypt information. (A hacker can intercept the encrypted pre-master, but cannot decrypt it without the private key.)
To put it simply, symmetric encryption is still used in the data transmission stage, but the material for the symmetric key is transmitted with asymmetric encryption. Because pre-master is protected this way, the inputs needed to build the symmetric key are never all exposed in plaintext.
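The four steps above can be sketched end to end. As before, this is an illustration under assumptions: a toy textbook-RSA key made of two Mersenne primes (insecure, no padding) and a made-up key derivation based on SHA-256.

```python
import hashlib
import secrets

# Toy RSA key pair for the server (NOT secure).
P, Q = 2**127 - 1, 2**521 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private key, never leaves the server
KEY_BYTES = (N.bit_length() + 7) // 8


def derive_session_key(client_random: bytes, service_random: bytes,
                       pre_master: int) -> bytes:
    """Hypothetical derivation: hash the three random values together."""
    material = client_random + service_random + pre_master.to_bytes(KEY_BYTES, "big")
    return hashlib.sha256(material).digest()


# Steps 1 and 2: both random numbers travel in clear text.
client_random = secrets.token_bytes(32)
service_random = secrets.token_bytes(32)

# Step 3: the browser picks pre-master and encrypts it with the public key.
pre_master = secrets.randbelow(N - 2) + 2
encrypted_pre_master = pow(pre_master, E, N)  # this is what crosses the wire

# Step 4: only the server can recover pre-master, using the private key.
server_pre_master = pow(encrypted_pre_master, D, N)

browser_key = derive_session_key(client_random, service_random, pre_master)
server_key = derive_session_key(client_random, service_random, server_pre_master)
print(browser_key == server_key)
```

An eavesdropper sees client-random, service-random, and the encrypted pre-master, but without the private key cannot compute the third input to the key.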
Problem: Suppose a hacker uses DNS hijacking to replace Geektime's IP address with his own, so that I am actually visiting the hacker's server. The hacker can generate a public and private key pair on his own server, and the browser has no idea that it is visiting the hacker's site. So the question is: how does a server prove that it is who it claims to be?
Fourth edition: Digital certificates
The website administrator applies to a CA and submits the site's public key and website information, including the domain name and validity period. The CA creates a certificate from this information, signs it with the CA's own private key, and returns the signed certificate to the website administrator.
From then on, when the browser accesses the server, the server returns this certificate, and the browser verifies its authenticity. If the certificate itself is valid, the browser goes on to verify that the issuing CA is legitimate. Normally the operating system has the certificate information (including the public keys) of the top-level trusted CAs built in. If no built-in top-level CA is found in the CA chain, the certificate is ruled invalid. If verification succeeds, the site is proven to be real and not forged by someone else.
This main process differs little from the third edition, with two major changes:
1. The server does not return the public key to the browser directly, but returns the digital certificate, which contains the public key.
2. A certificate verification step is added on the browser side, and the rest of the process continues only after the certificate is verified.
How does a browser validate a digital certificate
After the browser receives the digital certificate, it verifies it. First, the browser reads the plaintext information in the certificate and applies the same hash function that the CA used when signing, which gives information digest A. Then it uses the CA's public key to decrypt the signature data, which gives information digest B. If digest A and digest B are identical, the certificate is confirmed to be valid.
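The digest A / digest B comparison can be sketched with a toy textbook-RSA signature. The CA key below (two Mersenne primes, no padding) and the certificate text are assumptions for illustration only; real certificates use padded signature schemes and a structured encoding.

```python
import hashlib

# Toy CA key pair (NOT secure).
P, Q = 2**127 - 1, 2**521 - 1
CA_N, CA_E = P * Q, 65537             # CA public key, known to the browser
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))  # CA private key


def digest_as_int(cert_info: bytes) -> int:
    """Hash the plaintext certificate information and read it as an integer."""
    return int.from_bytes(hashlib.sha256(cert_info).digest(), "big")


def ca_sign(cert_info: bytes) -> int:
    """CA side: sign the digest with the CA's private key."""
    return pow(digest_as_int(cert_info), CA_D, CA_N)


def browser_verify(cert_info: bytes, signature: int) -> bool:
    digest_a = digest_as_int(cert_info)     # recomputed by the browser
    digest_b = pow(signature, CA_E, CA_N)   # recovered with the CA's public key
    return digest_a == digest_b


cert_info = b"domain=example.com;public_key=...;valid_until=2030-01-01"
signature = ca_sign(cert_info)

print(browser_verify(cert_info, signature))
print(browser_verify(cert_info.replace(b"example.com", b"evil.example"), signature))
```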
How does the browser verify the CA authority
There are many CAs around the world, which raises the question: how do these CAs prove that they are trustworthy? If a malicious company sets up its own CA and then certifies itself, that is very dangerous, so there must be a mechanism by which a CA proves that it is safe and harmless.
This is where the digital certificate chain comes in.
To talk about digital certificate chains, we need to understand that there are two types of CA: intermediate CAs and root CAs. Applicants usually apply for certificates from intermediate CAs, and root CAs serve to certify intermediate CAs. A root CA certifies many intermediate CAs, which in turn can certify other intermediate CAs.
For example, open the website in Chrome and click the little lock at the left of the address bar. You can see that the certificate for *.geekbang.org was issued by the intermediate CA GeoTrust RSA CA 2018, which in turn was certified by the root CA DigiCert Global Root CA. The certificate chain is therefore: *.geekbang.org -- GeoTrust RSA CA 2018 -- DigiCert Global Root CA
Therefore, when the browser validates the certificate, it first validates the certificate of *.geekbang.org. If that is valid, it validates the certificate of the intermediate CA. If the intermediate CA's certificate is also valid, the browser goes on to validate the certificate of the root CA that certified the intermediate CA.
How do you prove that the root certificate is valid?
The browser simply looks the root certificate up in the operating system. If the root certificate is installed in the operating system, the browser considers it valid; if it is not, the root certificate is invalid.
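The walk up the chain can be sketched as a lookup loop. This sketch models only the "who issued whom" relationship and the built-in trust store; signature checks at each level are omitted, and the `evil.example` entries are invented for the negative case.

```python
# Root certificates assumed to be built into the operating system.
TRUSTED_ROOTS = {"DigiCert Global Root CA"}

# subject -> issuer, as presented by the server (illustrative data).
ISSUED_BY = {
    "*.geekbang.org": "GeoTrust RSA CA 2018",
    "GeoTrust RSA CA 2018": "DigiCert Global Root CA",
    "evil.example": "Evil Root CA",
    "Evil Root CA": "Evil Root CA",  # self-signed, not in the trust store
}


def chain_is_trusted(subject: str, issued_by: dict, trusted_roots: set,
                     max_depth: int = 10) -> bool:
    """Follow issuers upward until a built-in root is reached or the chain ends."""
    current = subject
    for _ in range(max_depth):
        if current in trusted_roots:
            return True
        issuer = issued_by.get(current)
        if issuer is None or issuer == current:
            # Unknown issuer, or a self-signed certificate that is not built in.
            return False
        current = issuer
    return False


print(chain_is_trusted("*.geekbang.org", ISSUED_BY, TRUSTED_ROOTS))
print(chain_is_trusted("evil.example", ISSUED_BY, TRUSTED_ROOTS))
```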
The root certificates built into the operating system are not installed arbitrarily. A root CA must pass the WebTrust international security audit before its certificate is included.
So what is WebTrust authentication?
WebTrust certification is a widely recognized international audit standard in the electronic authentication service industry. It examines a CA's systems and business practices with respect to information privacy, transaction integrity, and security. WebTrust is supported by the major browser vendors, Microsoft, and other large manufacturers, and serves as an international standard for regulating how CAs operate. To have its root certificate embedded by browser and operating system vendors, a CA must pass WebTrust certification.
Currently, WebTrust-certified root CAs include Comodo, GeoTrust, RapidSSL, Symantec, Thawte, DigiCert, and others. The root certificates of these CAs are built into the major operating systems, and a browser treats a site's certificate as valid as long as it can trace the certificate chain back to one of these root certificates.