XSS attacks

Introduction to XSS

An XSS (cross-site scripting) attack exploits vulnerabilities left in a web page during development to inject malicious code into it, so that the user's browser loads and executes a malicious program written by the attacker. These programs are usually JavaScript, but they can also be Java, VBScript, ActiveX, Flash, or even plain HTML. After a successful attack, the attacker may obtain, among other things, elevated permissions (for example, the ability to perform certain operations), private page content, sessions, and cookies. (Excerpted from Baidu Baike)

XSS attacks can be divided into two types: reflected and stored. As the names imply, a stored attack is saved to the database, while a reflected attack is triggered directly from the URL and is never saved. Each is described below.

Reflected XSS

URL parameters are injected directly

Scenario: the attack script is entered into the site's search box

// Searching through the search box
http://localhost/?keyWord=
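
To make the scenario concrete, here is a minimal sketch of a page that echoes the keyWord parameter without escaping it. The function name renderSearchPage and the sample payload are hypothetical; they only illustrate the mechanism and are not taken from a real application.

```javascript
// Hypothetical, deliberately vulnerable renderer: echoes the query parameter as-is.
function renderSearchPage(keyWord) {
  return '<h1>Results for: ' + keyWord + '</h1>';
}

// Parse the keyword out of a crafted link, the way a server would.
const link = new URL('http://localhost/?keyWord=' +
  encodeURIComponent('<script>alert(1)</script>'));
const keyWord = link.searchParams.get('keyWord');

// The payload lands in the markup untouched, so a browser would execute it.
const page = renderSearchPage(keyWord);
console.log(page);
```

Nothing is stored on the server here: the attack lives entirely in the link, which is why the victim has to be tricked into opening it.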

Stored XSS

Stored in the database, then injected when read back

Scenario: the attack script is submitted as message content and saved by the back end. When the page is refreshed, the back end returns the attack script to the front end

// Message content inserted into the database by the message board
<script>alert(222)</script>
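
The save-then-read flow can be sketched as follows. The in-memory array stands in for a real database, and the names saveMessage and renderMessageBoard are hypothetical.

```javascript
// Hypothetical in-memory "database" standing in for real storage.
const messages = [];

// Save step: the raw message content is stored without any escaping.
function saveMessage(content) {
  messages.push(content);
}

// Read step: every stored message is concatenated into the page markup.
function renderMessageBoard() {
  return messages
    .map(function (m) { return '<div>' + m + '</div>'; })
    .join('\n');
}

saveMessage('Nice article!');
saveMessage('<script>alert(222)</script>');

// Every visitor who loads the board now receives the stored script.
const board = renderMessageBoard();
console.log(board);
```

Unlike the reflected case, the victim does not need to follow a crafted link; loading the page is enough.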
  • HTML node content

The attack script is stored in the database through the API. When the page is refreshed, the stored script is substituted into the content of an HTML node and executes immediately, triggering the attack

<!-- The content variable is replaced by the attack script -->
<div>
    {{content}}
</div>
<div>
    <script>
    </script>
</div>

  • HTML attributes

The attack is triggered by modifying or adding HTML attributes

<!-- The error event fires when the image address is invalid -->
<img src="#{image}"/>
<img src="1" onerror="alert(1)" />
  • JavaScript code

User input, or any other stored value, is written into a variable inside a script block

<script>
    var data = "#{params}";
    console.log(data)
    // Input: hello";alert(1);"
    // Result: var data = "hello";alert(1);"";
</script>
  • Rich text

Rich text has to keep its HTML, and that HTML carries the risk of XSS attacks

Handling methods

The overall idea: escape

  • HTML -> HTML entities
  • JavaScript -> JSON encoding (JSON_encode)

Escape timing:

  1. Escape when saving to the database
  2. Escape when returned to the front end

The first option is generally preferable: a value is saved once but output many times, so escaping on every output wastes work. For ease of demonstration, however, the sample code in this article uses the second option and escapes on return. Readers should keep this in mind.

HTML node content

Escape < and > so that the content can never be saved, or returned to the front end, as an HTML tag

// Escape function
var escapeHtml = function(str){
    str = str.replace(/</g, '&lt;');
    str = str.replace(/>/g, '&gt;');
    return str
}

// Before escaping
<script>alert(111)</script>
// After escaping
&lt;script&gt;alert(111)&lt;/script&gt;
// Page DOM rendering: displayed as a plain string
<script>alert(111)</script>
              ||
<span>&lt;script&gt;alert(111)&lt;/script&gt;</span>

HTML attributes

Approach: escape quotes, so that the attribute value cannot be closed early and no event handler can be injected

// Escape function
var escapeHtmlProperty = function(str){
    if(!str) return '';
    str = str.replace(/"/g, '&quot;'); // Replace double quotes
    str = str.replace(/'/g, '&#39;');  // Replace single quotes
    str = str.replace(/ /g, '&#32;');  // Replace spaces
    return str;
}

// Before escaping
<img src="1" onerror="alert(1)" />
// After escaping
<img src="1&quot;&#32;onerror=&quot;alert(1)" />
// Page DOM rendering
<img src="1&quot; onerror=&quot;alert(1)" />

Does escaping the greater-than and less-than signs in an HTML attribute have any other effect on the element? Does escaping single quotes, double quotes, and spaces in HTML node content have any other effect? The answer is no, and readers can verify this themselves. The two functions above can therefore be merged into one that handles both HTML nodes and attributes:

// Escape function
var escapeHtml = function(str){
    str = str.replace(/&/g, '&amp;');  // & must be escaped too, and it must come first
    str = str.replace(/</g, '&lt;');
    str = str.replace(/>/g, '&gt;');
    str = str.replace(/"/g, '&quot;'); // Replace double quotes
    str = str.replace(/'/g, '&#39;');  // Replace single quotes
    // Escaping spaces adds little and can be omitted
    // str = str.replace(/ /g, '&#32;'); // Replace spaces
    return str
}
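
As a quick check of the merged function, the sketch below applies the same five replacements in both contexts. It is a self-contained restatement rather than the author's exact code; the empty-input guard and the sample payloads are assumptions added for the demonstration.

```javascript
// Same replacements as the merged function; & is handled first so that the
// entities produced by the later steps are not escaped a second time.
function escapeHtml(str) {
  if (!str) return ''; // assumption: treat null/undefined/empty as empty text
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Node-content context
const nodeHtml = '<div>' + escapeHtml('<script>alert(111)</script>') + '</div>';
// Attribute context (the attribute value itself must be wrapped in quotes)
const imgHtml = '<img src="' + escapeHtml('1" onerror="alert(1)') + '" />';

console.log(nodeHtml);
console.log(imgHtml);
```

Note that the attribute case is only safe because the template wraps the value in quotes; an unquoted attribute would need spaces escaped as well.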

JavaScript code

Escape double quotes, or use JSON encoding (JSON_encode)

// Escape function
var escapeForJs = function(str){
	if(!str) return '';
	str = str.replace(/"/g, '\\"'); // Double quotes are escaped differently in JavaScript than in HTML
	return str;
}

// Template
<script>
var str = "!{keyWord}";
console.log(str)
</script>

// Input: hello world"; alert(1);"
// Before escaping: the string is closed early, so alert(1) runs
var str = "hello world"; alert(1);"";

// After escaping: the whole input stays inside the string
var str = "hello world\"; alert(1);\"";

But is this safe? Definitely not! Single quotes, script tags, backslashes, and so on can all still trigger an attack. A safer approach is JSON encoding (JSON_encode).
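
Two ways around quote-only escaping can be sketched as follows. The helper renderInlineScript and both payloads are hypothetical; escapeForJs reproduces the single replacement used in this section.

```javascript
// The quote-only escape from this section, reproduced for the demonstration.
function escapeForJs(str) {
  if (!str) return '';
  return str.replace(/"/g, '\\"');
}

// Builds the inline script the way the template would.
function renderInlineScript(keyWord) {
  return '<script>var str = "' + escapeForJs(keyWord) + '";</script>';
}

// Payload 1: the attacker supplies a backslash that cancels the one we add.
const viaBackslash = renderInlineScript('\\"; alert(1);//');
// Payload 2: a closing script tag ends the block whatever the string contains.
const viaCloseTag = renderInlineScript('</script><script>alert(1)</script>');

console.log(viaBackslash);
console.log(viaCloseTag);
```

In the first output the string literal ends after a single backslash and alert(1) follows as code; in the second, the HTML parser closes the script block before the JavaScript parser ever sees the string.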

JavaScript code: JSON encoding

JSON encoding with JSON.stringify, demo code:

// Escape function
JSON.stringify(params)

// Before escaping
hello world"; alert(1);"
// After escaping
"hello world\"; alert(1);\""
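
A minimal sketch of using JSON.stringify for values placed inside an inline script block is shown below. The helper name toScriptLiteral is hypothetical, and the extra replacement of "<" is an addition of this sketch rather than part of the original demo: JSON.stringify on its own leaves "</script>" intact.

```javascript
// Hypothetical helper: serialize a JSON-serializable value for an inline <script>.
function toScriptLiteral(value) {
  // JSON.stringify adds the surrounding quotes and escapes quotes,
  // backslashes and newlines inside the value.
  // Escaping "<" as well keeps "</script>" from closing the block early.
  return JSON.stringify(value).replace(/</g, '\\u003c');
}

const literal = toScriptLiteral('hello world"; alert(1);"');
console.log('<script>var str = ' + literal + ';</script>');

// The escaped literal still decodes back to exactly the original input.
const roundTrip = JSON.parse(toScriptLiteral('</script>'));
console.log(roundTrip === '</script>');
```

Because the helper emits its own quotes, the template must not wrap the placeholder in quotes a second time.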

Rich text

Approach: filter

  1. Blacklist filtering

Script tags, onerror attributes, and similar constructs are all filtered out

  • Pro: simple to implement; regular expressions are enough
  • Con: HTML has too many tags and attributes to enumerate, so holes are easily left
  2. Whitelist filtering

Only the tags and attributes listed in the whitelist are kept; everything else is removed

  • Pro: more thorough, since only the specified tags and attributes are allowed
  • Con: more work to implement, because the HTML must be parsed completely into a data structure, the structure filtered, and the HTML reassembled

Blacklist filtering:

Example attack code:

<!-- Attack scripts that might be entered -->
<font color="red">This is red text</font><script>alert('Rich text')</script>
<a href="javascript:alert(1)"></a>
<img src="abc" onerror="alert(1)">
<!-- ... also onfocus, onmouseover, oncontextmenu, ... -->
// Filter function
var xssFilter = function (html) {
	if (!html) return '';
	html = html.replace(/<\s*\/?script\s*>/g, '');
	html = html.replace(/javascript:[^'"]*/g, '');
	html = html.replace(/onerror\s*=\s*['"]?[^'"]*['"]?/g, '');
	// ...
	return html
}
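
To illustrate how easily a blacklist leaves holes, the sketch below runs a cleaned-up copy of the filter against three inputs its patterns do not anticipate. The function name blacklistFilter and the payloads are hypothetical examples.

```javascript
// Cleaned-up copy of the blacklist filter: script tags, javascript: URLs, onerror.
function blacklistFilter(html) {
  if (!html) return '';
  html = html.replace(/<\s*\/?script\s*>/g, '');
  html = html.replace(/javascript:[^'"]*/g, '');
  html = html.replace(/onerror\s*=\s*['"]?[^'"]*['"]?/g, '');
  return html;
}

// Inputs the patterns do not anticipate.
const bypasses = {
  // The regular expressions are case-sensitive.
  upperCase: '<SCRIPT>alert(1)</SCRIPT>',
  // Only onerror is listed; every other event handler passes through.
  otherEvent: '<img src="x" onload="alert(1)">',
  // Removing the inner tags re-forms the outer ones.
  nested: '<scr<script>ipt>alert(1)</scr</script>ipt>'
};

Object.keys(bypasses).forEach(function (name) {
  console.log(name + ': ' + blacklistFilter(bypasses[name]));
});
```

Each hole can be patched individually, but every patch only covers the cases someone has already thought of.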

Any HTML attribute that can fire an event may become an entry point for an attack. How can this be defended against? Whitelist filtering is described next.

Whitelist filtering:

Principle: list every tag and attribute that rich text legitimately needs, let only those through, and reject everything else. Method: parse the HTML into a tree, much as a browser does, then traverse its elements, keeping what is on the whitelist and removing what is not

Example attack code:

<!-- Attack scripts that might be entered -->
<font color="red">This is red text</font><script>alert('Rich text')</script>
<a href="javascript:alert(1)"></a>
<img src="abc" onerror="alert(1)">
<!-- ... also onfocus, onmouseover, oncontextmenu, ... -->
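// Filter code
var xssFilter = function (html) {
	if (!html) return '';
	// Whitelist: tag name -> allowed attributes
	var whiteList = {
		'img': ['src'],
		'font': ['color', 'size'],
		'a': ['href']
	};
	var cheerio = require('cheerio');
	// Note: newer cheerio versions wrap fragments in <html>/<head>/<body> by default;
	// cheerio.load(html, null, false) parses the input as a fragment instead
	var $ = cheerio.load(html);
	$('*').each(function (index, elem) {
		// Remove any tag that is not on the whitelist
		if (!whiteList[elem.name]) {
			$(elem).remove();
			return;
		}
		// Remove any attribute that is not allowed for this tag
		for (var attr in elem.attribs) {
			if (whiteList[elem.name].indexOf(attr) === -1) {
				$(elem).attr(attr, null);
			}
		}
	});
	return $.html()
}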
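
Allowing an attribute by name does not make its value safe: href is on the whitelist, yet the sample input carries a javascript: URL in it. One possible complement, sketched here under stated assumptions, is to check the scheme of href and src values as well. The helper isSafeUrl and the list of accepted schemes are hypothetical, and the helper expects the already-decoded attribute value.

```javascript
// Hypothetical allowlist of URL schemes for href/src values.
const SAFE_SCHEMES = ['http', 'https', 'mailto'];

function isSafeUrl(value) {
  // URL parsers strip leading spaces and control characters and drop tabs and
  // newlines anywhere, so conservatively remove all of them before looking
  // at the scheme.
  const cleaned = String(value).replace(/[\u0000-\u0020]/g, '').toLowerCase();
  const match = /^([a-z][a-z0-9+.-]*):/.exec(cleaned);
  if (!match) return true; // no scheme: a relative URL such as /path or #anchor
  return SAFE_SCHEMES.indexOf(match[1]) !== -1;
}

console.log(isSafeUrl('https://example.com/a.png'));
console.log(isSafeUrl('javascript:alert(1)'));
```

Inside the attribute loop of a whitelist filter, a value that fails this check would be removed just like a disallowed attribute.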

Introduction to cheerio

cheerio is a fast, flexible, and lean implementation of core jQuery, designed specifically for the server

// Basic usage
const cheerio = require('cheerio');
const $ = cheerio.load('<h2 class="title">Hello world</h2>');

$('h2.title').text('Hello there!');
$('h2').addClass('welcome');

$.html();
//=> HTML containing <h2 class="title welcome">Hello there!</h2>

Installation, syntax, and so on are not covered here; readers who need them can consult the official website. In the filter code, cheerio is used mainly to convert the HTML into a data structure that can be worked with directly, which is then looped over, compared against the whitelist, and pruned. Are there existing third-party libraries that can be used directly instead? The answer is yes.

js-xss

xss is a module that filters user input to protect against XSS attacks. It is mainly used in forums, blogs, online stores, and similar scenarios where users are allowed to enter HTML for page layout and formatting. The module controls the allowed tags and attributes through a whitelist and also provides a set of interfaces for extension, which makes it more flexible than other similar modules. Sample code:

// Filter function
var xssFilter = function (html) {
	if (!html) return '';
	var xss = require('xss');
	var ret = xss(html);
	return ret;
}

It is that simple. In practice it still needs some tuning, which is not covered here. What follows is an introduction to the xss module's features and basic usage.

Features

  • Whitelist controls the allowed HTML tags and the attributes of each tag
  • By customizing a handler function, you can process any tag and its attributes

Usage:

Used in Node.js

var xss = require("xss");
var html = xss('');
console.log(html);

Use it on the browser side

<script src="https://rawgit.com/leizongmin/js-xss/master/dist/xss.js"></script>
<script>
// Use the function name filterXSS; the usage is the same
var html = filterXSS('<script>alert("xss");</scr' + 'ipt>');
alert(html);
</script>

For other usage, please refer to the official website

If development needs to be simple, fast, and safe, a third-party library is of course the better choice. In use, though, it may run into one problem or another, or fail to meet business requirements. By comparison, a whitelist filter you build yourself is easier to control, its customization is explicit, and it may give rise to fewer problems. Readers can choose according to their actual situation.

CSP

Content Security Policy (CSP) is an additional layer of security that helps detect and mitigate certain types of attacks, including cross-site scripting (XSS) and data injection. These attacks are the main means used for data theft, site defacement, and malware distribution.

  • Content Security Policy (CSP)
  • Used to specify what content is allowed to execute

Usage:

Configuring a Content Security Policy involves adding the Content-Security-Policy HTTP header to a page and configuring the corresponding values to control which resources the user agent (browser, etc.) can obtain for that page.

Common use cases:

Example 1

A website administrator wants all content to come from the site's own origin (excluding subdomains)

Content-Security-Policy: default-src 'self'

Example 2

The administrator of an online mailbox wants to allow HTML to be included in emails, as well as images that can be loaded from anywhere, but not JavaScript or other potentially dangerous content.

Content-Security-Policy: default-src 'self' *.mailsite.com; img-src *
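
As a rough sketch of how such a header could be attached in Node.js, the example below builds the policy from Example 2 and sets it on every response using the built-in http module. The helper buildCsp, the handler, and the port number are hypothetical choices for illustration.

```javascript
const http = require('http');

// Hypothetical helper: turn { directive: [sources] } into a header value.
function buildCsp(directives) {
  return Object.keys(directives)
    .map(function (name) { return name + ' ' + directives[name].join(' '); })
    .join('; ');
}

// The policy from Example 2.
const policy = buildCsp({
  'default-src': ["'self'", '*.mailsite.com'],
  'img-src': ['*']
});

// Attach the policy to every response.
function handler(req, res) {
  res.setHeader('Content-Security-Policy', policy);
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<h1>Hello</h1>');
}

const server = http.createServer(handler);
// server.listen(3000); // uncomment to serve; the port is an arbitrary choice
console.log(policy);
```

With a policy like this in place, inline scripts are blocked by default, so an injected script tag would not execute even if escaping were missed somewhere.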

For more examples and usage, see MDN-CSP

Conclusion

As a protective barrier for the system, security is of great importance, and defending against XSS attacks is an important part of it. This article records the problems the author encountered during development and the approaches taken to handle them, as a reference for readers facing similar problems. Further articles on other aspects of security will follow; comments and suggestions are welcome.