A Brief introduction to Network Protocols: Web Security

This is the 11th day of my participation in Gwen Challenge

The difference between cross domain and cross site

When it comes to Web security, there is a concept of cross-site, which is different from cross-domain.

Cross domain

Cross-domain (cross-origin) is the opposite of same-origin. Two URLs are same-origin only when their protocol, domain name, and port number are all the same; if any of the three differs, a request between them is cross-domain.

Cross-site

Cross-site is the opposite of same-site. Same-site places no requirement on the protocol or port number: as long as the eTLD+1 of two URLs is the same, they are same-site. So what is an eTLD?

eTLD stands for effective top-level domain. For example, the eTLD of http://juejin.cn is .cn, the eTLD of http://test.org is .org, and the eTLD of http://chorer.github.io is github.io. eTLD+1 is the effective top-level domain plus the one label to its left, such as juejin.cn for http://juejin.cn or test.org for http://test.org.

PS: There are actually two kinds of same-site. The loose definition above is scheme-less same-site. The other is schemeful same-site, which additionally requires the protocols to match.
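
As a rough illustration of the two definitions, the sketch below compares URLs in JavaScript. The function names and the tiny KNOWN_ETLDS list are assumptions made for this example; real code would consult the full Public Suffix List instead of a hard-coded array.

```javascript
// Hypothetical, tiny stand-in for the Public Suffix List (illustration only).
const KNOWN_ETLDS = ["github.io", "cn", "org", "com"];

// Same-origin: protocol, domain name, and port must all match.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// eTLD+1: the longest matching eTLD plus the one label to its left.
function registrableDomain(hostname) {
  const etld = KNOWN_ETLDS
    .filter((s) => hostname === s || hostname.endsWith("." + s))
    .sort((x, y) => y.length - x.length)[0];
  if (!etld || hostname === etld) return null;
  const labels = hostname.slice(0, -(etld.length + 1)).split(".");
  return labels[labels.length - 1] + "." + etld;
}

// Same-site: eTLD+1 must match; schemeful same-site also compares protocols.
function isSameSite(a, b, { schemeful = false } = {}) {
  const ua = new URL(a);
  const ub = new URL(b);
  const da = registrableDomain(ua.hostname);
  const db = registrableDomain(ub.hostname);
  if (da === null || da !== db) return false;
  return !schemeful || ua.protocol === ub.protocol;
}
```

With this sketch, http://a.juejin.cn and https://b.juejin.cn are cross-domain but scheme-less same-site, and they stop being same-site once the schemeful option is enabled.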

XSS

XSS stands for Cross-Site Scripting: a hacker injects malicious code into a page, and the code executes as soon as a user opens the page. XSS attacks can lead to cookie theft, disclosure of personal information, and traffic hijacking through malicious redirects.

Classification

XSS falls into two basic categories: reflected XSS (non-persistent XSS) and stored XSS (persistent XSS).

Reflected XSS

The hacker gets malicious code into the page by inducing users to click a URL with specially crafted parameters. For example, a normal request URL is http://test.com?name=jack. When the server receives the parameter jack, it returns the response Hello jack directly. The HTML looks like this:

<div>Hello jack</div>

This is fine, but if the user clicks the hacker's URL http://test.com?name=<script>alert(1)</script>, and the server takes the name parameter and returns it in the response without processing it, the HTML will look like this:

<div>
	<script>alert(1)</script>    
</div>

So when the browser parses the HTML, it actually executes the script in the middle. alert(1) is just an example: any script operation, including stealing the user's cookie through document.cookie or redirecting through window.location, can pose a significant security risk.

Stored XSS

Stored XSS is persistent and more dangerous, because the malicious code is stored in the database and affects every user who visits the page. For example, a hacker may leave a comment on an article, write <script>alert(1)</script> in it, and submit the form to the server. The server stores the comment in the database without any processing. The next time any user opens the article, the server retrieves the comment from the database and returns it to the browser, <script>alert(1)</script> included, and the popup appears for every user as soon as the script executes. Again, the popup stands in for any script operation that could compromise users' information.

Defensive measures

1) HTML escape

< is used to define the start of a tag, and if we want the browser to actually display the < character itself, rather than parsing it as a tag, we must escape the character, writing character entities instead of characters.

Similarly, to be safe, we should not let <script> be parsed as a tag but treat it as a plain string, so consider escaping HTML on the server side:

&lt;script&gt;alert(1)&lt;/script&gt;

The result of this escape is eventually returned to the browser, and is displayed as a string on the page rather than as an executable script.
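
A minimal sketch of such an escaping step is shown below. The function name escapeHtml is an assumption for this example, and in practice a well-tested template engine or library would usually do this for you.

```javascript
// Replace the characters that are significant in HTML with character entities.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// The injected script becomes harmless text:
// escapeHtml("<script>alert(1)</script>") === "&lt;script&gt;alert(1)&lt;/script&gt;"
```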

2) User input verification

Escaping handles special characters such as < and >. If the injected malicious script is wrapped in a <script> tag, HTML escaping does prevent the XSS attack, but there are other ways to inject scripts. For example, suppose a community site lets users add their blog address to their profile and displays it as <a href="URL">my blog address</a>. A malicious hacker can then enter javascript:alert(1); as the address. HTML escaping leaves this value unchanged, so the hacker's blog link ends up like this:

<a href="javascript:alert(1);">This is the address of the user's blog</a>

Whenever someone clicks on his blog address in his profile, a popup pops up.

Similarly, suppose the site also lets users set an avatar by entering an image URL and displays it as <img src="URL">. The hacker could enter xxx" onerror="alert(1), which the escaping of < and > does not catch, so the final img tag looks like this:

<img src="xxx" onerror="alert(1)">

The src value is clearly invalid, so an error event fires and the popup appears.

Therefore, escaping HTML tags alone does not prevent all XSS attacks; we must also validate the data that users enter.
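
One possible form of such validation, sketched under the assumption that blog and avatar addresses should only ever be http or https URLs, is to parse the input and check its protocol. The function name isSafeHttpUrl is hypothetical.

```javascript
// Accept only absolute URLs whose protocol is http: or https:.
function isSafeHttpUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // not a parseable absolute URL
  }
  return url.protocol === "http:" || url.protocol === "https:";
}
```

Both payloads from the examples above are rejected: javascript:alert(1); parses with the javascript: protocol, and xxx" onerror="alert(1) is not an absolute URL at all.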

3) CSP

A CSP, or Content Security Policy, prohibits certain third-party scripts from running by providing a whitelist that tells browsers to load only code from a specific source.

CSP can be used in two ways:

The server includes a Content-Security-Policy header field in the response to constrain the browser's loading behavior:

Content-Security-Policy: script-src 'self'; style-src cdn.example.org third-party.org; child-src https:

The HTML uses a meta tag to constrain the browser's loading behavior:

<meta http-equiv="Content-Security-Policy" content="script-src 'self'; style-src cdn.example.org third-party.org; child-src https:">

The two forms differ, but the directives mean the same thing:

script-src: restricts which sources scripts may be loaded from. 'self' means only scripts from the page's own origin can be loaded. Note that this also blocks inline scripts and inline event handlers, so the onerror handler from the earlier example would be blocked and reported as a CSP violation
style-src: restricts which sources style sheets may be loaded from; here only cdn.example.org and third-party.org are allowed
child-src: set to https:, meaning iframes must be loaded over HTTPS
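
For readers who generate the policy on the server, here is a small sketch that assembles the same header value from a plain object. The helper name buildCsp is an assumption for illustration, not part of any framework.

```javascript
// Join each directive with its sources, then join directives with "; ".
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

const policy = buildCsp({
  "script-src": ["'self'"],
  "style-src": ["cdn.example.org", "third-party.org"],
  "child-src": ["https:"],
});

// The resulting string can be sent as the Content-Security-Policy header,
// e.g. with Node's response.setHeader("Content-Security-Policy", policy).
```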

4) HttpOnly

As mentioned earlier, a hacker can inject a script to steal a user's cookie. This works because cookies can be read through document.cookie. The server can therefore declare HttpOnly in the Set-Cookie response header to make the cookie inaccessible to scripts.

Set-Cookie: id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Secure; HttpOnly

CSRF

Example

CSRF stands for cross-site request forgery. The hacker exploits the fact that requests automatically carry cookies to impersonate the user, sending requests to a normal website and performing unauthorized operations. It works like this:

The user logs in to http://article.com; the server authenticates the user and returns a cookie, which the browser saves
While the cookie is still valid, the hacker induces the user to visit a malicious site, http://evil.com, which contains this code:

<img src="http://article.com/delete?id=1">

While rendering the malicious page, the browser sends this request to http://article.com with the cookie attached. The server's check passes, and it deletes the article with ID 1

A hacker can launch a CSRF attack because several conditions hold:

User: logged in to a normal website without logging out (the cookie is still valid), and then visited a malicious website
Hacker: knows the URL of the request and all of its parameters
Server: relies on the cookie alone for authentication and takes no defensive measures against CSRF
img: the tag can make cross-domain requests. A hacker could also try sending an AJAX request, but the same-origin policy and CORS restrict what a script on http://evil.com can do against the different-origin http://article.com, so the hacker uses the img tag, which makes cross-domain requests natively

XSS + CSRF

The example above exploits the user's cookie without stealing it. A hacker can also combine the two attacks: use XSS to obtain the user's cookie, then use CSRF to forge the request.

Defensive measures

To design defensive measures, we can start from the causes of CSRF:

CSRF mostly originates from third-party websites. If the server can tell where a request comes from and restrict it accordingly, the attack can be avoided to a certain extent. The SameSite attribute, the Origin header field, and the Referer header field relate to this approach
The key to CSRF is that third-party websites can also send requests that carry cookies, so the server cannot tell whether a request comes from a malicious website or from a normal user. We can therefore have normal users send requests with a token that a malicious website cannot obtain. By checking whether a request carries the correct token, the server can distinguish normal requests from attack requests and thus defend against CSRF. The CSRF token relates to this approach

Same-site restriction – SameSite

First-party cookie: when a request to http://bank.com/xxx is initiated from a page on http://bank.com, the cookie it carries is a first-party cookie (carried by the first party).

Third-party cookie: when a request to http://bank.com/xxx is initiated from a page on http://evil.com, the cookie it carries is a third-party cookie (carried by a third party).

If the website declares the SameSite attribute in the Set-Cookie header it returns, the cookie becomes a same-site cookie and is not sent as a third-party cookie. In other words, declaring SameSite prevents the cookie from being carried when a page on http://evil.com makes a request to http://bank.com/xxx, which avoids the CSRF attack.

1) Set-Cookie: SameSite=Strict

This is the strictest mode: once declared, the cookie is not carried in any cross-site request. Third-party use of the cookie is completely disabled, which blocks CSRF attacks that depend on it. The drawback is a poorer user experience. For example, when the current page contains a link to the target website, clicking it usually lands the user on the target site already logged in, because the request from the current page to the target site carries the target site's cookie. With third-party cookies completely disabled, this login state cannot be maintained, and the user has to log in to the target website again.

2) Set-Cookie: SameSite=Lax

This is Chrome's default when a cookie does not declare SameSite. The mode is more relaxed: the cookie is not carried in most cross-site requests, which preserves security, but GET requests that navigate to the target site can carry it, which preserves usability (for example, staying logged in). The details are as follows:

Request type	Example	Without SameSite restriction	Lax
Link	`<a href="..."></a>`	Send cookie	Send cookie
Preload	`<link rel="prerender" href="..." />`	Send cookie	Send cookie
GET form	`<form method="GET" action="...">`	Send cookie	Send cookie
POST form	`<form method="POST" action="...">`	Send cookie	Don't send
iframe	`<iframe src="..."></iframe>`	Send cookie	Don't send
AJAX	`$.get("...")`	Send cookie	Don't send
Image	`<img src="...">`	Send cookie	Don't send

The first three types of the table are all GET requests that navigate to the target site. These requests are cross-site but can carry cookies — especially the first one, which allows us to be logged in directly after reaching the target site via an external link.
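
The table can be condensed into a small decision function. This is a deliberately simplified model for illustration: the function name and the shape of the request object (crossSite, method, topLevelNavigation) are assumptions, and real browsers apply more detailed rules.

```javascript
// Decide whether a cookie with the given SameSite value accompanies a request.
function sendsCookie(sameSite, request) {
  if (!request.crossSite) return true; // same-site requests always carry it
  switch (sameSite) {
    case "None":
      return true; // no restriction on cross-site requests
    case "Strict":
      return false; // never sent on cross-site requests
    case "Lax":
      // Only GET requests that navigate to the target site carry the cookie.
      return request.topLevelNavigation === true && request.method === "GET";
    default:
      throw new Error("unknown SameSite value: " + sameSite);
  }
}

// A cross-site link click versus a cross-site image request under Lax:
const linkClick = { crossSite: true, method: "GET", topLevelNavigation: true };
const imageLoad = { crossSite: true, method: "GET", topLevelNavigation: false };
```

Here sendsCookie("Lax", linkClick) is true while sendsCookie("Lax", imageLoad) is false, which is why the img-based CSRF example earlier fails against a Lax cookie.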

3) Set-Cookie: SameSite=None; Secure

This mode turns the SameSite restriction off, so the cookie can be carried in cross-site requests without limit. You must also declare Secure, however, so that the cookie is only carried in HTTPS requests.

Why set SameSite only to turn it off, instead of simply not setting it? Because Chrome applies SameSite=Lax by default, the restriction can only be lifted by explicitly setting SameSite=None.
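
The cookie attributes discussed so far (HttpOnly, Secure, SameSite) can be combined when building the header value. The helper below is a hypothetical sketch, not a library API; it also enforces the rule that SameSite=None must be accompanied by Secure.

```javascript
// Build a Set-Cookie header value from a name, a value, and a few attributes.
function serializeCookie(name, value, { httpOnly = false, secure = false, sameSite } = {}) {
  if (sameSite === "None" && !secure) {
    throw new Error("SameSite=None requires Secure");
  }
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (secure) parts.push("Secure");
  if (httpOnly) parts.push("HttpOnly");
  if (sameSite) parts.push(`SameSite=${sameSite}`);
  return parts.join("; ");
}

const sessionCookie = serializeCookie("id", "a3fWa", {
  httpOnly: true,
  secure: true,
  sameSite: "Lax",
});
```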

PS: At the time of writing, Chrome planned to phase out third-party cookies completely by 2022.

Same-origin detection: Origin and Referer

In general, the source of a request can be determined from the Origin or Referer request header. The difference is that Origin gives only the origin (protocol, domain name, and port), while Referer also includes the specific path:

Origin: https://developer.mozilla.org

Referer: https://developer.mozilla.org/en-US/docs/Web/JavaScript

So which one should be used? Origin is not sent in IE11 CORS requests or in requests that follow a 302 redirect, so Referer is the more dependable of the two. Even so, be aware that when an HTTPS page navigates to an HTTP page, no Referer is sent, for security reasons.
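
A server-side check along these lines might look like the sketch below. The function name, the lowercase header keys (as Node's http module exposes them), and the choice to reject requests that carry neither header are all assumptions of this example.

```javascript
// Return true when the request's Origin (or, failing that, Referer)
// belongs to the list of trusted origins.
function isTrustedSource(headers, trustedOrigins) {
  const origin = headers["origin"];
  if (origin) return trustedOrigins.includes(origin);

  const referer = headers["referer"];
  if (!referer) return false; // neither header present: rejected in this sketch
  try {
    // Referer includes a path, so reduce it to its origin before comparing.
    return trustedOrigins.includes(new URL(referer).origin);
  } catch {
    return false; // malformed Referer
  }
}

const trusted = ["https://developer.mozilla.org"];
```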

CSRF token

One of the necessary conditions for a malicious site to carry out a CSRF attack is to know the request format and parameters. Therefore, if the request is required to carry a token known only to normal users as a parameter, the malicious site will not be able to construct a complete request, and thus cannot attack.

Pattern 1: hidden form field + session

The server generates a random CSRF token, stores it in the server-side session, and sends the token to the user's front-end page. There are two ways to deliver it.

One is to inject the token into a hidden input field in each form:

<input type="hidden" name="csrf-token" value="CIwNZNlR4XbisJF39I8yWnWX9wX4WFoz">

The other is to inject the token into a meta tag:

<meta name="csrf-token" content="CIwNZNlR4XbisJF39I8yWnWX9wX4WFoz">

To make a GET request, the front end reads the token from the meta tag with JS and appends it to the request URL as a parameter, for example http://test.com?csrftoken=CIwNZNlR4XbisJF39I8yWnWX9wX4WFoz. To make a POST request, it simply submits the form, and the token previously injected into the form is automatically sent as a parameter in the request body
The server receives the token parameter of the GET or POST request and compares it with the token stored in the session. If they match, the request is considered to come from a legitimate user; otherwise it is considered to come from a malicious website (which cannot obtain the token and so cannot construct a complete request)
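
The server side of pattern 1 can be sketched as follows, using Node's built-in crypto module. The session is modeled as a plain object, and the function names are assumptions for illustration; a real application would rely on its framework's session store and CSRF middleware.

```javascript
const nodeCrypto = require("crypto");

// Generate a random token and remember it in the user's session.
function issueCsrfToken(session) {
  session.csrfToken = nodeCrypto.randomBytes(32).toString("hex");
  return session.csrfToken; // to be embedded in the hidden field or meta tag
}

// Compare the submitted token with the one stored in the session.
function verifyCsrfToken(session, submitted) {
  if (typeof submitted !== "string" || typeof session.csrfToken !== "string") {
    return false;
  }
  const expected = Buffer.from(session.csrfToken);
  const actual = Buffer.from(submitted);
  // timingSafeEqual requires equal lengths, so check that first.
  return expected.length === actual.length && nodeCrypto.timingSafeEqual(expected, actual);
}
```

A request forged by a malicious site arrives without the token (or with a guessed one), so verifyCsrfToken returns false and the server can reject it.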

Pattern 2: hidden form field + cookie

The server generates a pair of associated CSRF tokens. One is delivered to the user's front-end page in a hidden form field, and the other is set through the Set-Cookie header field
The front end submits the form as a POST request. The token in the hidden form field automatically becomes a parameter in the request body, and the token set by Set-Cookie is sent in the Cookie request header field
The server verifies the two tokens. If verification succeeds, the request is from a legitimate user

PS: This mode does not require the server to maintain a large number of tokens through the session. Although a malicious website can still carry a Cookie (containing token) in its request, since it can’t get the hidden form field (containing token) returned to the user by the server, its request parameters are missing and cannot actually pass the server’s verification.
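
The article does not say how the two tokens are associated. One common way, assumed here purely for illustration, is to make the cookie token an HMAC of the form token under a server-side secret, so the server can check the pair without storing anything per user. All names below are hypothetical.

```javascript
const nodeCrypto = require("crypto");

// Hypothetical server-side secret; in practice it would come from configuration.
const SERVER_SECRET = nodeCrypto.randomBytes(32);

function sign(token) {
  return nodeCrypto.createHmac("sha256", SERVER_SECRET).update(token).digest("hex");
}

// One token goes into the hidden form field, the other into Set-Cookie.
function issueTokenPair() {
  const formToken = nodeCrypto.randomBytes(16).toString("hex");
  return { formToken, cookieToken: sign(formToken) };
}

// The pair is valid only if the cookie token is the HMAC of the form token.
function verifyTokenPair(formToken, cookieToken) {
  if (typeof formToken !== "string" || typeof cookieToken !== "string") return false;
  const expected = Buffer.from(sign(formToken));
  const actual = Buffer.from(cookieToken);
  return expected.length === actual.length && nodeCrypto.timingSafeEqual(expected, actual);
}
```

A forged request still carries the cookie token automatically, but without the matching form token verifyTokenPair fails.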

CAPTCHA

Sensitive operations such as deleting data are risky if they are performed without any verification. CAPTCHAs can therefore be considered, but they should only be used at critical points in the business flow, because overusing them hurts the user experience. From this point of view, CAPTCHAs are better suited as a supplementary defense against CSRF attacks.

ClickJacking

ClickJacking refers to hijacking a user’s click behavior to perform some action.

For example, if there is a malicious website http://evil.com and a normal website http://funnyvideo.com, the page of the malicious website has a transparent iframe that references the normal website. The hacker induces users to go to a malicious site and click on a web page. It looks like users are only clicking on a malicious site, but they are actually clicking on a normal site.

Defensive measures

X-Frame-Options:

Clickjacking is possible because a malicious website can embed a normal website in an iframe. If we forbid embedding the normal website in iframes, or allow it only for certain trusted websites, clickjacking can be avoided. The X-Frame-Options response header does this through the following values:

deny: no website may embed the normal website in an iframe
sameorigin: only same-origin websites may embed the normal website in an iframe
allow-from: only the specified website may embed the normal website in an iframe; for example, allow-from http://test.com means http://test.com is trusted and may embed it (modern browsers no longer support allow-from and use the CSP frame-ancestors directive for this purpose)
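
To make the three values concrete, the sketch below models the decision a browser makes when one page tries to embed another. It is an illustrative simplification with a hypothetical function name, not how any browser is actually implemented.

```javascript
// Decide whether a page at framerOrigin may embed a page at pageOrigin
// that was served with the given X-Frame-Options value.
function mayFrame(xFrameOptions, pageOrigin, framerOrigin) {
  const [mode, arg] = xFrameOptions.trim().split(/\s+/);
  switch (mode.toLowerCase()) {
    case "deny":
      return false;
    case "sameorigin":
      return framerOrigin === pageOrigin;
    case "allow-from":
      return arg !== undefined && new URL(arg).origin === framerOrigin;
    default:
      return true; // unknown value: treated as no restriction in this sketch
  }
}
```

For the example above, mayFrame("sameorigin", "http://funnyvideo.com", "http://evil.com") is false, so the transparent iframe on the malicious page would not load.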

JS:

For older browsers that do not support the X-Frame-Options header, JS can be used as a fallback.

When website A embeds website B in an iframe, B can access its own window object through self and A's window object through top, so B can use top == self to determine whether it is being embedded by another site.
B could also try to read top.location.href to get the URL of the embedding site, then use pattern matching to allow only trusted websites to embed it. Note that browsers block reading top.location.href when the embedding site is of a different origin, so this check only works for same-origin parents.

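if (top !== self) {
    // This page is being embedded by another site: hide all of its content...
    const style = document.createElement('style')
    style.innerHTML = 'html{display:none !important;}'
    document.head.appendChild(style)
    // ...and navigate the top-level window to this page
    top.location = self.location
}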

The code above is crude: if top is not self, the page is being embedded by another site, so it hides all of its content and navigates the top-level window to itself. Of course, you can adapt the code to mimic the different X-Frame-Options values.

Man-in-the-middle attack

In a man-in-the-middle attack, the attacker sits between the two parties, establishing a separate connection with each end in order to hijack and tamper with the transmitted data. The whole conversation is controlled by the middleman, yet both ends believe they are talking directly to each other. A man-in-the-middle attack is possible because the communicating parties do not verify each other's identity by means such as digital signatures and digital certificates.

Taking the hybrid encryption process described above as an example, if a man-in-the-middle attack occurs, the process would be as follows:

The client sends a request to obtain the server's public key. The middleman intercepts the request and forwards it to the server
The server receives the request and generates a public/private key pair. It keeps the private key (server) and sends the public key (server) to the middleman (the server thinks the middleman is the client)
The middleman obtains the public key (server). It also generates its own public key (middle) and private key (middle), and impersonates the server to send the public key (middle) to the client
The client receives the public key (middle), generates the session key (client), encrypts it with the public key (middle), and sends it to the middleman (the client thinks the middleman is the server)
The middleman decrypts it with the private key (middle) and obtains the session key (client). It also generates its own session key (middle), encrypts it with the public key (server), and sends it to the server
The server receives it, decrypts it with the private key (server), and obtains the session key (middle), believing it to be the session key sent by the client. It encrypts data XXX with this session key and sends it to the middleman
The middleman receives it, decrypts it with the session key (middle), obtains data XXX, and tampers with it to produce data YYY. It then encrypts data YYY with the session key (client) and sends it to the client
The client receives data YYY, encrypts data ZZZ with the session key (client), and sends it to the middleman
The middleman receives it, decrypts it with the session key (client), and tampers with data ZZZ to produce data WWW. It then encrypts data WWW with the session key (middle) and sends it to the server
The server receives it, decrypts it with the session key (middle), and obtains data WWW
…
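
The key-substitution part of these steps can be simulated locally with Node's built-in crypto module. This is a sketch for illustration only: all three parties live in one script, there is no real network, and the variable names are assumptions.

```javascript
const { generateKeyPairSync, publicEncrypt, privateDecrypt, randomBytes } = require("crypto");

// The server and the middleman each own an RSA key pair.
const server = generateKeyPairSync("rsa", { modulusLength: 2048 });
const middleman = generateKeyPairSync("rsa", { modulusLength: 2048 });

// The client asks for the server's public key, but the middleman
// substitutes its own, and the client cannot tell the difference.
const keySeenByClient = middleman.publicKey;

// The client encrypts its session key with the key it received.
const clientSessionKey = randomBytes(32);
const clientMessage = publicEncrypt(keySeenByClient, clientSessionKey);

// The middleman decrypts it and now knows the client's session key.
const stolenKey = privateDecrypt(middleman.privateKey, clientMessage);

// The middleman sends its own session key to the real server instead.
const middleSessionKey = randomBytes(32);
const forwardedMessage = publicEncrypt(server.publicKey, middleSessionKey);
const keySeenByServer = privateDecrypt(server.privateKey, forwardedMessage);
```

After this exchange the client and the server hold different session keys, and the middleman holds both, so it can decrypt, alter, and re-encrypt traffic in each direction.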

As you can see, the middleman acts as the server in front of the client and as the client in front of the server, hijacking and tampering with data in both directions. The following diagram shows the process more clearly:

DNS pollution and hijacking

DNS hijacking

DNS hijacking means taking over a DNS server to gain control of a domain name's resolution records, modify the resolution results, and return an incorrect IP address to the client. Because the data on the DNS server itself is tampered with, users either cannot reach the website or reach a fake clone of it, which leads to leaks of personal information.

Example: Visit Google but open Baidu

Solution: Since the DNS server is faulty, you can manually replace the DNS server with a public DNS server

DNS pollution

DNS pollution is a DNS cache poisoning attack: the attacker poses as a DNS server and points the domain names that users visit to incorrect IP addresses.

Case: XX firewall prevents access to foreign websites

Solutions: VP* etc

HTTP hijacking

HTTP hijacking may be carried out by carriers (ISPs), LANs, or free public Wi-Fi. Traffic must pass through them, and HTTP itself is transmitted in plain text, which gives them the opportunity to hijack and tamper with data.

When browsing some websites, we often see a pop-up advertisement in the lower right corner. It is not necessarily the website's own advertisement; it is often injected by the carrier through HTTP hijacking. The solution is simple: use HTTPS, which is encrypted.
