Tree shaking uses static analysis to find code in the source that is never used and removes it, which reduces the size of the bundled output.

For JS, tree shaking is done with Webpack and Terser; for CSS, it is done with PurgeCSS.

PurgeCSS analyzes how CSS selectors are used in HTML or other code and removes the CSS that is unused.

Curious how PurgeCSS finds unused CSS? Today we’ll explore this by writing a simple version of PurgeCSS.

Approach

You tell PurgeCSS which HTML files the CSS applies to. PurgeCSS then analyzes the selectors that appear in that HTML and removes the unused CSS based on the result:

const { PurgeCSS } = require('purgecss')
const purgeCSSResult = await new PurgeCSS().purge({
  content: ['**/*.html'],
  css: ['**/*.css']
})

What we have to do can be divided into two parts:

  • Extract the selectors that may appear in the HTML, including ID, class, tag, and so on
  • Analyze the rules in the CSS and delete the parts whose selectors are not used by the HTML

The part that extracts information from HTML is called an HTML extractor.

We can implement the HTML extractor with PostHTML, which can parse, analyze, and transform HTML. Its API is similar to that of PostCSS.

For the CSS we use PostCSS, whose AST lets us analyze each rule.

We iterate over the CSS rules and check whether each selector of a rule is among the selectors extracted from the HTML. If it is not, the selector is unused, so we delete it.

If all of a rule’s selectors are deleted, delete the rule.
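
The filtering idea can be sketched without any libraries. The following is only a minimal illustration, not PurgeCSS's actual code: `purgeRules`, the shape of the `rules` array, and the `used` set are hypothetical names introduced here, and each selector is treated as an opaque string.

```javascript
// Hypothetical sketch: rules are plain objects, not a real PostCSS AST.
function purgeRules(rules, used) {
    return rules
        .map(rule => ({
            ...rule,
            // Keep only the selectors that were found in the HTML.
            selectors: rule.selectors.filter(selector => used.has(selector))
        }))
        // A rule with no selectors left is unused, so drop it entirely.
        .filter(rule => rule.selectors.length > 0);
}

const used = new Set(['.aaa', '#ccc', 'span']);
const rules = [
    { selectors: ['.aaa', 'ee', 'ff'], declarations: 'color: red' },
    { selectors: ['.bbb'], declarations: 'color: red' },
    { selectors: ['span'], declarations: 'color: red' }
];

const purged = purgeRules(rules, used);
console.log(purged.map(rule => rule.selectors.join(', '))); // [ '.aaa', 'span' ]
```

The real plugin does the same two things on the PostCSS AST: it shrinks the selector list of each rule, then removes rules whose list became empty.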

This is how PurgeCSS is implemented. Now let’s write the code.

Code implementation

Let’s write a PostCSS plugin to do this. The PostCSS plugin does CSS analysis and transformation based on the AST.

const purgePlugin = (options) => {

    return {
        postcssPlugin: 'postcss-purge',
        Rule (rule) {}
    }
}

module.exports = purgePlugin;

A PostCSS plugin takes the form of a function that receives the plugin’s configuration parameters and returns an object. That object declares listeners such as Rule, AtRule, and Declaration, which are called for the matching AST nodes.

We import the plugin as purge, and it can be used like this:

const postcss = require('postcss');
const purge = require('./src/index');
const fs = require('fs');
const path = require('path');
const css = fs.readFileSync('./example/index.css');

postcss([purge({
    html: path.resolve('./example/index.html'),
})]).process(css).then(result => {
    console.log(result.css);
});

The HTML path passed in is available inside the plugin as options.html.

Let’s implement this plug-in.

As previously analyzed, the overall implementation process is divided into two steps:

  • Extract ID, class, tag from HTML through PostHTML
  • Walk through the AST of the CSS and delete the parts that are not used by the HTML

We encapsulate an htmlExtractor to do the extraction:

const purgePlugin = (options) => {
    const extractInfo = {
        id: [],
        class: [],
        tag: []
    };

    htmlExtractor(options && options.html, extractInfo);

    return {
        postcssPlugin: 'postcss-purge',
        Rule (rule) {}
    }
}

module.exports = purgePlugin;

The htmlExtractor reads the contents of the HTML file, parses the HTML, generates an AST, iterates the AST, and records the ID, class, and tag.

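const fs = require('fs');
const posthtml = require('posthtml');

function htmlExtractor(html, extractInfo) {
    const content = fs.readFileSync(html, 'utf-8');

    // PostHTML plugin: walk every node of the AST and record tag, id, and class
    const extractPlugin = options => tree => {
        return tree.walk(node => {
            extractInfo.tag.push(node.tag);
            if (node.attrs) {
                extractInfo.id.push(node.attrs.id);
                // A class attribute may contain several space-separated names
                extractInfo.class.push(...(node.attrs.class || '').split(/\s+/));
            }
            return node;
        });
    }

    // Run in sync mode so that extractInfo is filled before we return
    posthtml([extractPlugin()]).process(content, { sync: true });

    // Filter out null values
    extractInfo.id = extractInfo.id.filter(Boolean);
    extractInfo.class = extractInfo.class.filter(Boolean);
    extractInfo.tag = extractInfo.tag.filter(Boolean);
}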

A PostHTML plugin takes a form similar to a PostCSS plugin. Inside ours, we walk through the AST and record the information we need.

Finally, filter out null values in ID, class, and tag to complete the extraction.
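
To make the shape of the extracted data concrete, here is a dependency-free approximation of the extractor. It is only a sketch under loud assumptions: `extractFromHtml` is a hypothetical name, and a regular expression stands in for PostHTML, which is a simplification since a regex is not a real HTML parser.

```javascript
// Hypothetical, dependency-free approximation of the extractor.
// A regular expression is NOT a real HTML parser; PostHTML is the robust choice.
function extractFromHtml(html) {
    const result = { id: [], class: [], tag: [] };
    // Match opening tags such as <div class="aaa"> and capture name and attributes.
    const tagPattern = /<([a-zA-Z][\w-]*)([^>]*)>/g;
    let match;
    while ((match = tagPattern.exec(html)) !== null) {
        const [, tag, attrs] = match;
        result.tag.push(tag.toLowerCase());
        const id = /\sid="([^"]*)"/.exec(attrs);
        const cls = /\sclass="([^"]*)"/.exec(attrs);
        if (id) result.id.push(id[1]);
        // A class attribute may hold several space-separated names.
        if (cls) result.class.push(...cls[1].split(/\s+/));
    }
    // Filter out empty values.
    result.id = result.id.filter(Boolean);
    result.class = result.class.filter(Boolean);
    return result;
}

const info = extractFromHtml('<div class="aaa bbb"></div><div id="ccc"></div><span></span>');
console.log(info);
```

The result has the same three arrays (id, class, tag) that the PostHTML-based extractor fills in.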

Before we rush to the next step, let’s test the current features.

We prepare an HTML file like this:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <div class="aaa"></div>

    <div id="ccc"></div>

    <span></span>
</body>
</html>

Running the extractor on this file shows that the ID (ccc), the class (aaa), and the tags (html, body, div, span, and so on) are all correctly extracted from the HTML.

Next, we move on to the next step: delete unused parts of the CSS AST.

We declare a Rule listener, which receives the AST node of each rule. The part to analyze is the selector, which needs to be split on commas so that each selector can be processed individually.

Rule (rule) {
    const newSelector = rule.selector.split(',').map(item => {
        // Convert each selector (trim the whitespace around it first)
    }).filter(Boolean).join(', ');

    if (newSelector === '') {
        rule.remove();
    } else {
        rule.selector = newSelector;
    }
}

Selectors can be parsed, analyzed, and transformed using postcss-selector-parser.

If every selector of a rule has been deleted, the rule’s styles are useless, so we delete the rule. Otherwise only some of the selectors were deleted, and the styles are still in use.

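// selectorParser comes from: const selectorParser = require('postcss-selector-parser');
const newSelector = rule.selector.split(',').map(item => {
    const selector = item.trim();
    const transformed = selectorParser(transformSelector).processSync(selector);
    // If the transform removed anything, the selector is not fully used: drop it
    return transformed !== selector ? '' : selector;
}).filter(Boolean).join(', ');

if (newSelector === '') {
    rule.remove();
} else {
    rule.selector = newSelector;
}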

The next step is to analyze and transform the selector, which is the job of the transformSelector function.

Its logic is to check whether each part of a selector is among the selectors extracted from the HTML and, if not, to delete it.

const transformSelector = selectors => {
    selectors.walk(selector => {
        selector.nodes && selector.nodes.forEach(selectorNode => {
            let shouldRemove = false;
            switch (selectorNode.type) {
                case 'tag':
                    if (extractInfo.tag.indexOf(selectorNode.value) === -1) {
                        shouldRemove = true;
                    }
                    break;
                case 'class':
                    if (extractInfo.class.indexOf(selectorNode.value) === -1) {
                        shouldRemove = true;
                    }
                    break;
                case 'id':
                    if (extractInfo.id.indexOf(selectorNode.value) === -1) {
                        shouldRemove = true;
                    }
                    break;
            }

            if (shouldRemove) {
                selectorNode.remove();
            }
        });
    });
};

After extracting the selector information from the HTML and removing the useless rules from CSS based on the information extracted from the HTML, the plugin’s functionality is complete.
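
The per-selector check can also be illustrated without postcss-selector-parser. This is a rough sketch, assuming selectors consist only of tag, .class, and #id parts: `isSelectorUsed` is a hypothetical helper, and its regular expression is a stand-in for the real selector AST.

```javascript
// Hypothetical helper: decide whether one selector is fully covered by the
// extracted info. Only tag, .class, and #id parts are understood.
function isSelectorUsed(selector, extractInfo) {
    const parts = selector.trim().match(/[#.]?[\w-]+/g) || [];
    return parts.every(part => {
        if (part.startsWith('#')) return extractInfo.id.includes(part.slice(1));
        if (part.startsWith('.')) return extractInfo.class.includes(part.slice(1));
        return extractInfo.tag.includes(part);
    });
}

const extractInfo = { id: ['ccc'], class: ['aaa'], tag: ['div', 'span'] };

// Mirror the Rule listener: filter the comma-separated list of selectors.
const kept = '.aaa, ee , ff'.split(',')
    .map(item => item.trim())
    .filter(item => isSelectorUsed(item, extractInfo));

console.log(kept.join(', '));                        // .aaa
console.log(isSelectorUsed('div#ccc', extractInfo)); // true
console.log(isSelectorUsed('p', extractInfo));       // false
```

A selector is kept only when every one of its parts was seen in the HTML, which matches the "any part removed means the selector is dropped" behavior of the plugin.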

Let’s test the effect:

CSS:

.aaa, ee , ff{
    color: red;
    font-size: 12px;
}
.bbb {
    color: red;
    font-size: 12px;
}

#ccc {
    color: red;
    font-size: 12px;
}

#ddd {
    color: red;
    font-size: 12px;
}

p {
    color: red;
    font-size: 12px;
}
span {
    color: red;
    font-size: 12px;
}

HTML:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <div class="aaa"></div>

    <div id="ccc"></div>

    <span></span>
</body>
</html>

We expect the p, #ddd, and .bbb rules to be removed together with their styles, and the ee and ff selectors to be removed from the first rule.

Let’s use this plugin:

const postcss = require('postcss');
const purge = require('./src/index');
const fs = require('fs');
const path = require('path');
const css = fs.readFileSync('./example/index.css');

postcss([purge({
    html: path.resolve('./example/index.html'),
})]).process(css).then(result => {
    console.log(result.css);
});

The test confirms that the output is correct.

This is how PurgeCSS works. We’ve implemented tree shaking for CSS!

The code is on GitHub: github.com/QuarkGluonP…

Of course, this is only a simplified implementation, and some parts are incomplete:

  • Only the HTML extractor is implemented. PurgeCSS also has JSX, Pug, and TSX extractors (but the idea is the same)
  • Only a single file is handled, not multiple files (supporting them just takes another loop)
  • Only id, class, and tag selectors are handled, not attribute selectors (attribute selectors are a little more complicated to handle)
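
As a rough idea of what the extra loop for multiple files might look like, the sketch below merges the info extracted from several HTML files into one object. `mergeExtractInfo` is a hypothetical helper that is not part of the original code, and it assumes the extractor has been run once per file into separate objects.

```javascript
// Hypothetical helper: merge the info extracted from several HTML files.
function mergeExtractInfo(infos) {
    const sets = { id: new Set(), class: new Set(), tag: new Set() };
    for (const info of infos) {
        for (const key of Object.keys(sets)) {
            // Sets remove duplicates that appear in more than one file.
            (info[key] || []).forEach(value => sets[key].add(value));
        }
    }
    // Convert back to arrays so the rest of the plugin can keep using indexOf.
    return {
        id: [...sets.id],
        class: [...sets.class],
        tag: [...sets.tag]
    };
}

const merged = mergeExtractInfo([
    { id: ['ccc'], class: ['aaa'], tag: ['div', 'span'] },
    { id: [], class: ['aaa', 'bbb'], tag: ['div', 'p'] }
]);
console.log(merged);
```

A selector then counts as used if it appears in any of the files, which is the conservative choice for shared stylesheets.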

It’s not perfect, but the idea behind PurgeCSS is pretty clear now, isn’t it?

Conclusion

Tree shaking for JS uses Webpack and Terser, while tree shaking for CSS uses PurgeCSS.

We implemented a simplified version of PurgeCSS to clarify how it works:

The HTML extractor collects the selector information from the HTML, and then the CSS AST is filtered: depending on whether a Rule’s selectors are used, the unused Rules are deleted, which achieves tree shaking.

In the process of implementing this tool, we learned how to write PostCSS and PostHTML plugins. They are similar in form, except that one analyzes and transforms CSS and the other does the same for HTML.

PostCSS can analyze and transform CSS, and removing unused CSS is a good application of it. What other great PostCSS use cases have you seen? Let’s discuss them.