Author: ChengCyber, The intelligent front end team

While optimizing compilation speed for business projects, the Smart Front-end infrastructure team studied the webpack-based compilation process. This article uses the [email protected] version to explain how webpack compiles, at the source level.

Understand Tapable

Webpack's source code is highly abstract: almost every operation goes through a plugin mechanism in which callbacks are registered dynamically via Tapable. So let's first look at what Tapable is.

In short, Tapable is an EventEmitter that varies along the dimensions below. It lets consumers register event callbacks and gives the event producer the flexibility to choose how to execute them.

  • Execution mode: synchronous, asynchronous serial, asynchronous parallel

  • Process control: basic (sequential execution), waterfall (sequential execution, passing each callback the previous result, similar to reduce), bail (sequential execution that stops early once a callback returns a non-undefined value), loop (re-runs the callbacks until they all return undefined)

A simple example 🌰:

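const { SyncHook } = require('tapable');

class Person {
  constructor(name) {
    this.name = name;
    // Expose an `eat` hook that plugins can tap into
    this.hooks = {
      eat: new SyncHook(),
    };
  }

  eat() {
    console.log(`${this.name} has eaten!`);
    // Run every callback registered on the hook
    this.hooks.eat.call();
  }
}

const zhangsan = new Person('zhangsan');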

Now we have a Person class that exposes an eat hook, and zhangsan is an instance of it. Next, I build a FitnessPlugin whose job is to shout "Be self-disciplined!!" whenever zhangsan eats.

// FitnessPlugin
zhangsan.hooks.eat.tap("FitnessPlugin", () => console.log('Be self-disciplined!!'));

All it takes to register an event callback is the hooks.eat.tap method.

zhangsan.eat();
// zhangsan has eaten!
// Be self-disciplined!!  (by FitnessPlugin)

When eat is called later, the callback registered a moment ago runs and the console prints "Be self-disciplined!!". With Tapable understood, you can already write a FitnessPlugin for webpack 😉
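The process-control modes mentioned earlier (basic, early termination or "bail", and streaming or "waterfall") differ only in how the producer walks the list of registered callbacks. The sketch below is a simplified stand-in written for illustration, not tapable's actual implementation; the Mini* class names are made up.

```javascript
// Simplified stand-ins for tapable's synchronous hooks (hypothetical Mini* names).
// They only mirror the observable behavior, not tapable's real implementation.
class MiniSyncHook {
  constructor() {
    this.taps = [];
  }
  tap(name, fn) {
    this.taps.push({ name, fn });
  }
  // basic: run every callback in registration order, ignore return values
  call(...args) {
    for (const { fn } of this.taps) fn(...args);
  }
}

class MiniSyncBailHook extends MiniSyncHook {
  // bail: stop at the first callback that returns something other than undefined
  call(...args) {
    for (const { fn } of this.taps) {
      const result = fn(...args);
      if (result !== undefined) return result;
    }
    return undefined;
  }
}

class MiniSyncWaterfallHook extends MiniSyncHook {
  // waterfall: feed each callback the previous result, similar to reduce
  call(value, ...rest) {
    let current = value;
    for (const { fn } of this.taps) {
      const result = fn(current, ...rest);
      if (result !== undefined) current = result;
    }
    return current;
  }
}

const bail = new MiniSyncBailHook();
bail.tap("A", () => undefined); // no opinion, keep going
bail.tap("B", () => "handled by B"); // stops the chain
bail.tap("C", () => "never reached");
const bailResult = bail.call();

const waterfall = new MiniSyncWaterfallHook();
waterfall.tap("double", (n) => n * 2);
waterfall.tap("increment", (n) => n + 1);
const waterfallResult = waterfall.call(5);

console.log(bailResult, waterfallResult);
```

The producer picks the hook class, so the same tap call on the consumer side can mean "observe", "possibly short-circuit", or "transform a value".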

Compiler instantiation

The Compiler is the instance webpack creates from the combination of webpack.config.js and the CLI arguments passed in. Source: github.com/webpack/web…

The main functions of Compiler

  1. Choose watchRun or run depending on whether watch mode is enabled

  2. Execute compile to create a Compilation

  3. Use the Compilation to carry out the compilation process

  4. Emit the output

The life cycle

Webpack defines many lifecycle hooks along the Compiler's run. Let's start with an overview (excluding watch mode and error handling):

Because so many lifecycle hooks are exposed, webpack's plugin mechanism is very powerful and can step into every part of compilation. The flip side is that the source code is hard to read, since everything is registered dynamically. Next, we look at several major lifecycle hooks and their roles to help understand the Compiler.
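To make the plugin side concrete, here is a minimal sketch. The plugin shape (an object with apply(compiler) that taps compiler.hooks) follows webpack's convention, but FakeCompiler and createHook are stand-ins invented for this example. The sketch fires only a handful of hook names synchronously, whereas webpack's real Compiler has many more hooks and most of them are asynchronous.

```javascript
// A tiny synchronous hook, standing in for tapable (hypothetical helper).
function createHook() {
  const taps = [];
  return {
    tap: (name, fn) => taps.push(fn),
    call: (...args) => taps.forEach((fn) => fn(...args)),
  };
}

// FakeCompiler is NOT webpack's Compiler; it only fires a few hook names
// in the order a single (non-watch) run goes through them.
class FakeCompiler {
  constructor() {
    this.hooks = {};
    for (const name of ["beforeRun", "run", "compilation", "make", "emit", "done"]) {
      this.hooks[name] = createHook();
    }
  }
  run() {
    this.hooks.beforeRun.call(this);
    this.hooks.run.call(this);
    const compilation = { assets: {} }; // placeholder for a Compilation
    this.hooks.compilation.call(compilation);
    this.hooks.make.call(compilation);
    this.hooks.emit.call(compilation);
    this.hooks.done.call();
  }
}

// A plugin is an object with apply(compiler); it taps the hooks it cares about.
class LifecycleLoggerPlugin {
  constructor(log) {
    this.log = log;
  }
  apply(compiler) {
    for (const name of Object.keys(compiler.hooks)) {
      compiler.hooks[name].tap("LifecycleLoggerPlugin", () => this.log.push(name));
    }
  }
}

const lifecycleLog = [];
const fakeCompiler = new FakeCompiler();
new LifecycleLoggerPlugin(lifecycleLog).apply(fakeCompiler);
fakeCompiler.run();
console.log(lifecycleLog.join(" -> "));
```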

compiler.hooks.compilation

This hook fires once the Compilation has been created. The Compilation has many lifecycle hooks of its own, through which plugins step in and steer the compilation process. For example:

  • JavascriptModulesPlugin registers the parser and generator for JS files

  • EntryPlugin sets the factory for EntryDependency, telling the compilation how to handle entry modules

compiler.hooks.make

Once the Compilation has been created (Compilation is explained below), the make hook is called. The EntryPlugin built into webpack registers its make handler here to locate the entry modules, that is, the entry in webpack.config.

// webpack.config.js
module.exports = {
  entry: {
    main: './index.js',
  },
}

Processing the entry named main, whose file is ./index.js, eventually produces an EntryDependency instance, which is added to the compilation via compilation.addEntry.

Compilation

The Compiler is built from the configuration, while a Compilation instance represents one complete compilation pass: loading the entry modules, resolving dependencies, parsing ASTs, creating chunks, generating the output, and so on.

Compilation main process

A Module corresponds to a source file, or to a virtual module generated while parsing source files. A virtual module is something the compiler treats as a module even though it does not map directly to a source file in the file system; the compiler uses it to handle specific module types, for example require.context('./a').

handleModuleCreation

After EntryPlugin adds the entry to the compilation, handleModuleCreation takes over. It instantiates the module through NormalModuleFactory, builds it, collects the module's dependencies, and recurses into each of them.
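The recursion can be sketched as follows. This is a simplified illustration, not webpack's code: fakeSources, createModule, buildModule, and handleModuleCreationSketch are hypothetical names, and the real implementation is asynchronous and queue-based.

```javascript
// In-memory "file system": module path -> the requests it depends on.
// Hypothetical data, standing in for what resolving and parsing would discover.
const fakeSources = {
  "./index.js": ["./a.js", "./b.js"],
  "./a.js": ["./shared.js"],
  "./b.js": ["./shared.js"],
  "./shared.js": [],
};

// Stand-in for the module factory: turns a request into a module object.
function createModule(request) {
  return { request, dependencies: [] };
}

// Stand-in for buildModule: "parses" the module and records its dependencies.
function buildModule(mod) {
  mod.dependencies = fakeSources[mod.request];
}

// Create the module, build it, then recurse into each dependency.
function handleModuleCreationSketch(request, moduleGraph = new Map()) {
  if (moduleGraph.has(request)) return moduleGraph; // each module is built once
  const mod = createModule(request);
  moduleGraph.set(request, mod);
  buildModule(mod);
  for (const dependency of mod.dependencies) {
    handleModuleCreationSketch(dependency, moduleGraph);
  }
  return moduleGraph;
}

const moduleGraph = handleModuleCreationSketch("./index.js");
const buildOrder = [...moduleGraph.keys()];
console.log(buildOrder);
```

Note that ./shared.js is reached from two modules but built only once, which is why the graph is keyed by request.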

buildModule

The source is parsed with the Parser. Every node in the AST is visited, and the corresponding Tapable hooks are called along the way. Plugins influence the parse result by registering callbacks for the syntax they care about. In effect, Tapable is used to imitate the visitor pattern and open the syntax-handling logic to the outside.

For example 🌰: webpack has the built-in syntax require.context (webpack.js.org/guides/depe…)

const allFiles = require.context('./a', false, /\.js$/);

While traversing the AST, once require.context is encountered, the callback registered by RequireContextDependencyParserPlugin is invoked. It creates a RequireContextDependency from the arguments and calls addDependency on the current NormalModule to add the dependency. The call is eventually replaced with:

const allFiles = __webpack_require__("./a sync \\.js$");

"./a sync \\.js$" is the name of the ContextModule generated from the require.context arguments. The generated module looks like this:

var map = {
  "./index.js": "./a/index.js"
};

function webpackContext(req) {
  var id = webpackContextResolve(req);
  return __webpack_require__(id);
}
function webpackContextResolve(req) {
  if (!__webpack_require__.o(map, req)) {
    var e = new Error("Cannot find module '" + req + "'");
    e.code = 'MODULE_NOT_FOUND';
    throw e;
  }
  return map[req];
}
webpackContext.keys = function webpackContextKeys() {
  return Object.keys(map);
};
webpackContext.resolve = webpackContextResolve;
module.exports = webpackContext;
webpackContext.id = "./a sync \\.js$";
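The parser-hook mechanism from this section can be sketched without webpack. In the sketch below the AST is hand-written in ESTree shape (webpack gets it from acorn), and tapCall, getCalleeName, and walk are hypothetical helpers; the tapped callback plays the role that RequireContextDependencyParserPlugin plays in webpack.

```javascript
// Hand-written ESTree-style AST for: require.context('./a', false, /\.js$/)
const ast = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: {
          type: "MemberExpression",
          object: { type: "Identifier", name: "require" },
          property: { type: "Identifier", name: "context" },
        },
        arguments: [
          { type: "Literal", value: "./a" },
          { type: "Literal", value: false },
          { type: "Literal", value: /\.js$/ },
        ],
      },
    },
  ],
};

// Hypothetical registry: callee name -> callbacks, playing the role of parser hooks.
const callHooks = new Map();
function tapCall(name, fn) {
  if (!callHooks.has(name)) callHooks.set(name, []);
  callHooks.get(name).push(fn);
}

function getCalleeName(node) {
  if (node.type === "Identifier") return node.name;
  if (node.type === "MemberExpression") {
    return `${getCalleeName(node.object)}.${node.property.name}`;
  }
  return "";
}

// Visit every node; when a call expression is found, fire the matching callbacks.
function walk(node, currentModule) {
  if (!node || typeof node.type !== "string") return;
  if (node.type === "CallExpression") {
    for (const fn of callHooks.get(getCalleeName(node.callee)) || []) {
      fn(node, currentModule);
    }
  }
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) value.forEach((child) => walk(child, currentModule));
    else if (value && typeof value === "object") walk(value, currentModule);
  }
}

// The "plugin": record a dependency on the current module when the syntax is seen.
tapCall("require.context", (node, currentModule) => {
  const [request, recursive, regExp] = node.arguments.map((arg) => arg.value);
  currentModule.dependencies.push({
    type: "RequireContextDependency",
    request,
    recursive,
    regExp,
  });
});

const currentModule = { dependencies: [] };
walk(ast, currentModule);
console.log(currentModule.dependencies);
```

The walker itself knows nothing about require.context; all syntax-specific behavior lives in the tapped callback, which is the point of the design.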

NormalModuleFactory, the module factory 👷

A competent module factory mainly does the following:

  • create: creates the module

  • resolve: resolves module dependencies

  • parse: generates the AST from the module code

create: creating the module

Runs resolve and uses the result to instantiate a NormalModule, which records the module's dependencies, file, source code, and so on. The module.rules configuration is also processed here (see RuleSetCompiler below).

resolve: resolving module dependencies

Obtains the normalResolver via getResolver (hence the name NormalModule 😁) and resolves ./index.js. This yields a set of information, such as resource, the file's absolute path in the file system, and descriptionFilePath, the absolute path of the current project's package.json. It then runs the RuleSet to get the processing rules, finds the matching loaders, and creates the corresponding parser and generator.

parse: generating the AST from the module code

getParser returns the parser that matches the file type. The parser mapping is registered by plugins (described below); for example, JS files map to JavascriptParser. The parse function is called while the NormalModule is being built: Acorn generates the AST, which is then processed internally (see Parser below).

NormalResolver

In fact, there is no NormalResolver in the code. All resolvers are created by ResolverFactory and distinguished by type. "NormalResolver" is simply the resolver that ResolverFactory creates for type === 'normal'.

Why?

All resolving in webpack is done by enhanced-resolve under different configurations, because different targets need different resolution strategies. For example, the normal resolver handles module dependencies found in source code, while resolving a loader needs a loader resolver to locate code such as ts-loader. When the different resolvers are created, plugins can inject different resolverOptions to control them.

For example 🌰: suppose I need to resolve a NormalModule of ESM type. That calls for the normal resolver, so ResolverFactory is asked for type 'normal'. At creation time, the option logic is injected through WebpackOptionsApply, and the result is the set of options enhanced-resolve needs for ESM modules:

{
  conditionNames: ['import', 'module', 'webpack', 'development', 'browser'],
  aliasFields: ['browser'],
  mainFields: ['browser', 'module', 'main'],
  modules: ['node_modules'],
  mainFiles: ['index'],
  extensions: ['.ts', '.js', '.json'],
  exportsFields: ['exports'],
}

In short, the resolver consults package.json, applies the aliases in its browser field, and tries the .ts, .js, and .json extensions.
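To give a feel for how two of these options drive the lookup, here is a minimal sketch. It covers only extensions and mainFiles against a hypothetical in-memory file list; conditionNames, aliasFields, exportsFields, and node_modules lookup are ignored, and resolveSketch is not enhanced-resolve's algorithm.

```javascript
// Hypothetical in-memory file system; real resolving is done by enhanced-resolve.
const existingFiles = new Set([
  "/project/src/index.ts",
  "/project/src/utils/index.js",
  "/project/src/data.json",
]);

const resolveOptions = {
  mainFiles: ["index"],
  extensions: [".ts", ".js", ".json"],
};

// Try "<path><ext>" first, then "<path>/<mainFile><ext>", in configured order.
function resolveSketch(context, request, options) {
  const base = `${context}/${request.replace(/^\.\//, "")}`;
  const candidates = [];
  for (const ext of options.extensions) candidates.push(base + ext);
  for (const mainFile of options.mainFiles) {
    for (const ext of options.extensions) {
      candidates.push(`${base}/${mainFile}${ext}`);
    }
  }
  return candidates.find((file) => existingFiles.has(file)) || null;
}

const resolvedIndex = resolveSketch("/project/src", "./index", resolveOptions);
const resolvedUtils = resolveSketch("/project/src", "./utils", resolveOptions);
const resolvedMissing = resolveSketch("/project/src", "./nope", resolveOptions);
console.log(resolvedIndex, resolvedUtils, resolvedMissing);
```

The order of the extensions array matters: with both index.ts and index.js present, the .ts file would win because it is listed first.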

RuleSetCompiler

RuleSetCompiler combines the user's module.rules with webpack's built-in default rules and compiles them into a function that later stages can call directly. That function determines which loaders a module goes through and which parameters are used to parse its code.

An example 🌰:

// webpack.config.js
module: {
 rules: [
  {
   test: /\.tsx?$/,
   loader: "ts-loader",
   options: {
    transpileOnly: true
   }
  }
 ]
},

After compilation, the rule becomes:

{
  conditions: [
    {
      property: 'resource',
      matchWhenEmpty: false,
      fn: (v) => /\.tsx?$/.test(v),
    },
  ],
  effects: [
    {
      loader: 'ts-loader',
      options: {
        transpileOnly: true,
      },
      ident: 'ruleSet[1].rules[0]',
    },
  ],
}

Here’s how this result came about.

CompileRule process

First, BasicMatcherRulePlugin injects the logic that handles test: it calls ruleSetCompiler.compileCondition to turn the regular expression into the matching function (v) => /\.tsx?$/.test(v), applied to the resource property. This is conditions[0] in the result above. Next, UseEffectRulePlugin injects the logic that handles loader. The result is as expected: use ts-loader with the loader options { transpileOnly: true }. If you are curious where these plugins are injected, look at the injection logic in the source.

How do I use this result?

A CompiledRule contains two arrays: conditions and effects. When a module is resolved, every compiled rule is run against it. Whenever the conditions of a compiled rule are met, its effects are accumulated for later processing (how the resulting effects are used is explained below).

CompiledRule.exec

Example 🌰: processing the file src/index.tsx yields:

effects: [
  {
    type: 'type',
    value: 'javascript/auto',
  },
  {
    type: 'use',
    value: {
      loader: 'ts-loader',
      options: {
        transpileOnly: true,
      },
    },
  },
]

type: 'use' adds ts-loader to this module's loaders. type: 'type' sets a parameter on the module; here it sets the module type to 'javascript/auto', which tells webpack which parser to use.
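The compile-then-exec flow can be sketched in a few lines. This is a simplified illustration that handles only test, loader, and options; compileRule and execRules are hypothetical names, and webpack's RuleSetCompiler supports far more rule properties through its plugins.

```javascript
// Compile one rule into { conditions, effects }, in the spirit of CompiledRule.
function compileRule(rule) {
  const conditions = [];
  const effects = [];
  if (rule.test) {
    conditions.push({
      property: "resource",
      fn: (value) => rule.test.test(value),
    });
  }
  if (rule.loader) {
    effects.push({
      type: "use",
      value: { loader: rule.loader, options: rule.options },
    });
  }
  return { conditions, effects };
}

// Run every compiled rule against the module data and accumulate matching effects.
function execRules(compiledRules, data) {
  const effects = [];
  for (const rule of compiledRules) {
    const matches = rule.conditions.every((c) => c.fn(data[c.property]));
    if (matches) effects.push(...rule.effects);
  }
  return effects;
}

const compiledRules = [
  { test: /\.tsx?$/, loader: "ts-loader", options: { transpileOnly: true } },
  { test: /\.css$/, loader: "css-loader" },
].map(compileRule);

const tsxEffects = execRules(compiledRules, { resource: "/project/src/index.tsx" });
const cssEffects = execRules(compiledRules, { resource: "/project/src/app.css" });
console.log(tsxEffects, cssEffects);
```

Compiling once up front means the per-module work is reduced to calling a few predicate functions, which matters when thousands of modules pass through the rules.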

Parser

Webpack ships with some default parsers:

  • JavascriptParser

      • javascript/auto

      • javascript/esm

      • javascript/dynamic

  • JsonParser

      • json

  • WebAssemblyParser

      • webassembly/sync

      • webassembly/async

  • AssetParser

      • asset

      • asset/inline

      • asset/source

      • asset/resource

JavascriptParser is the focus here.

JavascriptParser

The JavascriptModulesPlugin plugin registers the parser and generator for JS files.

There are three types of JS:

  • javascript/auto

      • The default JS type in webpack@3 and earlier; it can handle CommonJS, AMD, and ESM

  • javascript/esm

      • The JS type introduced in webpack@4 for tree shaking; it can only handle ESM

      • It is stricter: dynamic references must go through default, not the namespace

  • javascript/dynamic

      • Can only handle CommonJS

Put simply, Acorn knows only two sourceTypes: module and script.

javascript/dynamic is sourceType: script

javascript/esm is sourceType: module

javascript/auto tries to parse as module first and falls back to script if that fails 😉
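The fallback for javascript/auto can be sketched like this. fakeParse is a hypothetical stand-in for a real parser such as acorn, kept trivial so the example runs on its own: it only knows that a `with` statement is illegal in a module (modules are always strict) but legal in a script.

```javascript
// Hypothetical stand-in for a parser: rejects `with` when sourceType is "module".
function fakeParse(source, { sourceType }) {
  if (sourceType === "module" && /\bwith\s*\(/.test(source)) {
    throw new SyntaxError("'with' is not allowed in strict mode");
  }
  return { type: "Program", sourceType };
}

// javascript/auto: try module first, fall back to script when that fails.
function parseAuto(source, parse = fakeParse) {
  try {
    return parse(source, { sourceType: "module" });
  } catch (moduleError) {
    return parse(source, { sourceType: "script" });
  }
}

const esmAst = parseAuto("import { cube } from './math';");
const legacyAst = parseAuto("with (Math) { module.exports = max(1, 2); }");
console.log(esmAst.sourceType, legacyAst.sourceType);
```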

AssetParser

There are four types of asset:

  • asset

      • Webpack decides automatically between asset/inline (file size < 8KB) and asset/resource

  • asset/inline

      • Inlined in the code as a Data URI

  • asset/resource

      • Emitted as a separate resource file and referenced from the code by URL

  • asset/source

      • Similar to asset/inline, but what gets inlined is the file's source text; mostly used for .txt-like files

See the Asset Modules section of the official documentation for more details.
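As a configuration sketch, the four types could be assigned like this. The file patterns are purely illustrative; in a real webpack.config.js the object would be assigned to module.exports. The explicit parser.dataUrlCondition.maxSize is included only to show where the inline threshold for type asset can be tuned.

```javascript
// Sketch of a webpack.config.js fragment (illustrative file patterns).
const assetConfig = {
  module: {
    rules: [
      {
        test: /\.png$/,
        type: "asset",
        // inline as a Data URI below this size, emit a file otherwise
        parser: { dataUrlCondition: { maxSize: 8 * 1024 } },
      },
      { test: /\.svg$/, type: "asset/inline" },
      { test: /\.woff2$/, type: "asset/resource" },
      { test: /\.txt$/, type: "asset/source" },
    ],
  },
};

console.log(assetConfig.module.rules.map((rule) => rule.type));
```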

Generator

Mirroring the parsers, there are naturally corresponding generators:

  • JavascriptGenerator

  • JsonGenerator

  • WebAssemblyGenerator

JavascriptGenerator

Here’s an overview of the generate behavior for js files.

1. References are rewritten, for example 🌰:

require("./index");
// is replaced with
require(/*! ./index */ "./index.ts");

2. require is replaced with __webpack_require__

require("./index");
// becomes
__webpack_require__("./index");

3. The required runtime functions are recorded

Because step 2 introduces __webpack_require__, this runtime function has to be injected, so the generator records that the module needs __webpack_require__. The runtime code is then added at the final renderManifest stage. See the source code for details.
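The three steps can be imitated with a small string-based sketch. This is only an illustration under strong simplifications: generateSketch is a hypothetical function, and webpack rewrites code through dependency objects and source ranges, not through regular expressions.

```javascript
// Rewrite require calls and note which runtime helpers the module now needs.
function generateSketch(source, resolvedPaths) {
  const runtimeRequirements = new Set();
  const code = source.replace(
    /\brequire\((["'])(.+?)\1\)/g,
    (match, quote, request) => {
      // step 3: remember that this module depends on the runtime function
      runtimeRequirements.add("__webpack_require__");
      // step 1: use the resolved path, keeping the original request as a comment
      const resolved = resolvedPaths[request] || request;
      // step 2: swap require for __webpack_require__
      return `__webpack_require__(/*! ${request} */ ${JSON.stringify(resolved)})`;
    }
  );
  return { code, runtimeRequirements };
}

const generated = generateSketch('const index = require("./index");', {
  "./index": "./index.ts",
});
console.log(generated.code);
console.log([...generated.runtimeRequirements]);
```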

Tree Shaking

Tree shaking is the process of removing unused code from the output. The name and the concept come from another bundler: rollup.

For example 🌰:

// index.js
import { cube } from './math';
console.log(cube(2));

// math.js
export function square(x) {
  return x * x;
}
export function cube(x) {
  return x * x * x;
}

Reading the code, it is immediately clear that the function math#square is never used and could be dropped from the output to reduce its size. Now look at the output built in development mode (with optimization: { usedExports: true }):

/***/ "./math.js":
/*!*****************!*\
  !*** ./math.js ***!
  \*****************/
/***/ ((__unused_webpack_module, __webpack_exports__, __webpack_require__) => {

/* harmony export */ __webpack_require__.d(__webpack_exports__, {
/* harmony export */   "cube": () => /* binding */ cube
/* harmony export */ });
/* unused harmony export square */
function square(x) {
  return x * x;
}
function cube(x) {
  return x * x * x;
}

/***/ })

As the output shows, the function math#square is still there; it has not been shaken off. That is because dead code removal is not triggered in development mode and only takes effect in production mode. You can see, however, that the exports are annotated with harmony comments, and square carries an unused harmony export line.

First, several plugins have to cooperate: HarmonyExportDependencyParserPlugin, HarmonyImportDependencyParserPlugin, and FlagDependencyUsagePlugin.

  • When import { cube } from './math' is parsed, a HarmonyImportSpecifierDependency is added to the current module. This dependency records the names of the exports that are used; here, cube is recorded as used.

  • In the optimize stage, FlagDependencyUsagePlugin reads the reference information from HarmonyImportSpecifierDependency and marks every export as used or unused

  • At code generation time, the unused exports (which include square) are collected and the unused harmony export square comment is added

That covers the development environment. How does the code finally get removed in production?

  • When HarmonyExportDependencyParserPlugin sees export syntax, it adds a HarmonyExportHeaderDependency to the current module

  • This dependency strips the export keyword during code generation (as in the development output above)

  • Finally, an uglify-style minifier (terser in current versions) removes the dead code, including math#square.
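The bookkeeping behind these steps can be sketched as follows. The data structures and the function names flagUsedExports and unusedExportComments are hypothetical; they only mimic the idea of recording referenced export names, flagging each export as used or not, and emitting a comment for the unused ones.

```javascript
// Which exports each module declares (what the export-side parsing would find).
const moduleExports = { "./math.js": ["square", "cube"] };

// What the import-side dependencies recorded while parsing index.js.
const importSpecifiers = [{ from: "./math.js", name: "cube" }];

// Flag every export as used or unused, in the spirit of FlagDependencyUsagePlugin.
function flagUsedExports(exportsByModule, specifiers) {
  const usage = {};
  for (const [modulePath, names] of Object.entries(exportsByModule)) {
    usage[modulePath] = {};
    for (const name of names) {
      usage[modulePath][name] = specifiers.some(
        (s) => s.from === modulePath && s.name === name
      );
    }
  }
  return usage;
}

// At code generation, unused exports only get a comment; the function body stays
// in place until a minifier removes it as dead code.
function unusedExportComments(usageForModule) {
  return Object.entries(usageForModule)
    .filter(([, used]) => !used)
    .map(([name]) => `/* unused harmony export ${name} */`);
}

const usage = flagUsedExports(moduleExports, importSpecifiers);
const comments = unusedExportComments(usage["./math.js"]);
console.log(usage, comments);
```

Splitting "flag" from "remove" is what lets the development build show the analysis result as comments while leaving the actual deletion to the production minifier.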

Conclusion

Congratulations on making it this far 👏. There is a lot more in webpack that could be explained.