Author: Concave-man – wind demon small jiro

Background

By the time Webpack reached version 4.x, its source code had grown huge and heavily abstracted to cover all kinds of development scenarios, which makes it expensive to read. To understand how it works inside, let’s start from a simple Webpack configuration and, from a tool designer’s point of view, build a stripped-down version of Webpack ourselves.

Developer perspective

Suppose one day we receive a request to develop a React single-page application. The page contains a line of text and a button, and each click on the button changes the text. So we create a new project and add our JS files under [root]/src. To exercise Webpack’s tracking of module dependencies, we create three React components with a simple dependency chain between them.

// index.js root component
import React from 'react'
import ReactDom from 'react-dom'
import App from './App'
ReactDom.render(<App />, document.querySelector('#container'))
// App.js page component
import React from 'react'
import Switch from './Switch.js'
export default class App extends React.Component {
  constructor(props) {
    super(props)
    this.state = {
      toggle: false
    }
  }
  handleToggle() {
    this.setState(prev => ({
      toggle: !prev.toggle
    }))
  }
  render() {
    const { toggle } = this.state
    return (
      <div>
        <h1>Hello, { toggle ? 'NervJS' : 'O2 Team'}</h1>
        <Switch handleToggle={this.handleToggle.bind(this)} />
      </div>
    )
  }
}
// Switch.js button component
import React from 'react'

export default function Switch({ handleToggle }) {
  return (
    <button onClick={handleToggle}>Toggle</button>
  )
}

Next we need a configuration file that tells Webpack how we expect it to work, so we create webpack.config.js in the project root and write some basic configuration into it. (If the configuration options are unfamiliar, read the Chinese webpack documentation first.)

// webpack.config.js
const HtmlWebpackPlugin = require('html-webpack-plugin')
const resolve = dir => require('path').join(__dirname, dir)

module.exports = {
  // Entry file path
  entry: './src/index.js',
  // Output file path
  output: {
    path: resolve('dist'),
    // Spelled `fileName` because our mini Compiler below reads output.fileName (real Webpack uses `filename`)
    fileName: 'bundle.js'
  },
  // loader
  module: {
    rules: [
      {
        test: /\.(js|jsx)$/,
        // Only compile files that match the include paths
        include: [
          resolve('src')
        ],
        use: 'babel-loader'
      }
    ]
  },
  plugins: [
    new HtmlWebpackPlugin()
  ]
}

The module field tells Webpack to compile a file with the corresponding loader whenever the file name matches a rule’s test field. Webpack itself only understands .js and .json files; through loaders we can also process other formats such as CSS.
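The matching described here can be sketched in a few lines. This is only an illustration: matchLoaders is a hypothetical helper, not a Webpack API, and it ignores the include field for brevity.

```javascript
// Hypothetical helper: return the loaders whose rule matches a file path.
function matchLoaders(rules, filePath) {
  const matched = []
  for (const rule of rules) {
    if (rule.test.test(filePath)) {
      // `use` may be a single loader or an array of loaders
      matched.push(...[].concat(rule.use))
    }
  }
  return matched
}

const rules = [
  { test: /\.(js|jsx)$/, use: 'babel-loader' },
  { test: /\.css$/, use: ['style-loader', 'css-loader'] }
]

console.log(matchLoaders(rules, 'src/App.jsx'))   // [ 'babel-loader' ]
console.log(matchLoaders(rules, 'src/index.css')) // [ 'style-loader', 'css-loader' ]
console.log(matchLoaders(rules, 'logo.png'))      // []
```

A file that matches no rule is passed through untouched, which is why a plain .js file needs no loader at all.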

For React files, we need to convert JSX syntax into plain JS, namely React.createElement calls, so that the browser can understand the code. We usually do this with babel-loader, configured with the React parsing rules.

After this processing, the code the browser actually reads for the button component looks something like this.

...
function Switch(_ref) {
  var handleToggle = _ref.handleToggle;
  return _nervjs["default"].createElement("button", {
    onClick: handleToggle
  }, "Toggle");
}

Plugins, on the other hand, register handlers on Webpack’s lifecycle hooks and do some work on the compiled result before the final files are generated. For example, in most scenarios we need to insert the generated JS file into an HTML file. That is the job of html-webpack-plugin, which we add to the configuration like this.

const path = require('path');
const HtmlWebpackPlugin = require('html-webpack-plugin');

const webpackConfig = {
  entry: 'index.js',
  output: {
    path: path.resolve(__dirname, './dist'),
    filename: 'index_bundle.js'
  },
  // Pass an instance of the HtmlWebpackPlugin to the plugins array
  plugins: [new HtmlWebpackPlugin()]
};

This registers html-webpack-plugin for the stage where packaging completes. There it receives the path of the final bundled entry JS file, generates a script tag that points to it, and inserts the tag into the HTML, so that the browser can render the page from the HTML file.
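To make the idea concrete, here is a rough sketch of inserting a script tag into an HTML string. injectScript is a hypothetical helper written for this article; it is not how html-webpack-plugin is actually implemented.

```javascript
// Hypothetical helper, not html-webpack-plugin's real implementation.
function injectScript(html, bundlePath) {
  const tag = `<script src="${bundlePath}"></script>`
  // Place the tag just before </body>; append it if there is no body tag.
  // A replacer function is used so that `$` in the path is never treated specially.
  return html.includes('</body>')
    ? html.replace('</body>', () => `${tag}</body>`)
    : html + tag
}

const page = '<html><body><div id="container"></div></body></html>'
console.log(injectScript(page, 'bundle.js'))
// <html><body><div id="container"></div><script src="bundle.js"></script></body></html>
```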

OK, that is it for this side. From the developer’s point of view, all the configuration and all the project code to be packaged are ready. The next step is to hand the work over to the packaging tool, Webpack, which turns the code into the form that we and the browser want.

Tool perspective

First, we need to understand the Webpack packaging process

As the Webpack workflow shows, we need to implement a Compiler class that collects all the configuration passed in by the developer and directs the overall compilation process. Think of the Compiler as the boss of a company: it oversees everything and holds the global information (the customer’s requirements). Once it has all the information, it creates an instance of another class, Compilation, and entrusts it with that information and the whole workflow. The Compilation is like the boss’s secretary, who has to mobilize every department to get the work done as required. Loaders and plugins are the departments: each one handles only its own specialty (JS, CSS, SCSS, JPG, PNG…) and steps in when a matching file shows up.

To reproduce Webpack’s packaging behavior while keeping only the core code, let’s simplify the process a little.

First we create a webpack function as the exposed entry point. It takes two parameters: a configuration object and an error callback.

const Compiler = require('./compiler')

function webpack(config, callback) {
  // There should be parameter verification here
  const compiler = new Compiler(config)
  // Start compiling
  compiler.run()
}

module.exports = webpack

1. Collect the configuration

First we collect the user-supplied configuration in the constructor of the Compiler class.

class Compiler {
  constructor(config, _callback) {
    const {
      entry,
      output,
      module,
      plugins
    } = config
    // Entry file
    this.entryPath = entry
    // Output file path
    this.distPath = output.path
    // Output the file name
    this.distName = output.fileName
    // Loader to use
    this.loaders = module.rules
    // The plugin to mount
    this.plugins = plugins
    // Root directory
    this.root = process.cwd()
    // The Compilation helper instance that does the compiling
    this.compilation = {}
    // The relative path of the entry file in module, which is also the id of the module
    this.entryId = getRootPath(this.root, entry, this.root)
  }
}

In the constructor we also mount all plugins onto the instance’s hooks property. Webpack’s lifecycle management is built on a library called Tapable, which makes it easy to create publish-subscribe hooks and attach functions to them (hook callbacks can be synchronous, asynchronous, or even chained), and then trigger the registered handlers at the right moment. We declare some lifecycle hooks on hooks:

const { AsyncSeriesHook } = require('tapable') // Here we create some asynchronous hooks
constructor(config, _callback) {
  ...
  this.hooks = {
    // Lifecycle events
    // ['compiler'] means the hook passes one argument, named compiler, to its callbacks
    beforeRun: new AsyncSeriesHook(['compiler']),
    afterRun: new AsyncSeriesHook(['compiler']),
    beforeCompile: new AsyncSeriesHook(['compiler']),
    afterCompile: new AsyncSeriesHook(['compiler']),
    emit: new AsyncSeriesHook(['compiler']),
    failed: new AsyncSeriesHook(['compiler']),
  }
  this.mountPlugin()
}
// Register all plugins
mountPlugin() {
  for (let i = 0; i < this.plugins.length; i++) {
    const item = this.plugins[i]
    if ('apply' in item && typeof item.apply === 'function') {
      // Register publish-subscribe listeners on the lifecycle hooks
      item.apply(this)
    }
  }
}
// Logic that runs before the rest of the run method
run() {
  // Publish the event at this point in the lifecycle, triggering the subscribed handlers.
  // `this` is passed as the argument and corresponds to the compiler parameter declared above.
  // callAsync also takes a completion callback as its last argument.
  this.hooks.beforeRun.callAsync(this, err => {
    if (err) throw err
  })
}

Fun fact: every plugin class must implement an apply method that receives the compiler instance and then mounts the actual hook function onto one of the lifecycle hooks in compiler.hooks. If we declare a hook but mount nothing on it, an error is reported when the hook is called. In real Webpack, every lifecycle hook has at least one of Webpack’s own plugins mounted in addition to the user-configured ones, so the problem never shows up. For more on using tapable, see tapable.
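To see the publish-subscribe idea without installing anything, here is a small stand-in. MiniAsyncSeriesHook and LogPlugin are hypothetical names invented for this sketch; the class only imitates the tapAsync / callAsync pair and is not tapable’s implementation (for example, it simply calls the final callback when nothing is subscribed).

```javascript
// MiniAsyncSeriesHook is a hypothetical stand-in, not tapable's implementation.
class MiniAsyncSeriesHook {
  constructor() {
    this.taps = []
  }
  // Subscribe: fn receives (arg, done) and must call done(err) when finished
  tapAsync(name, fn) {
    this.taps.push({ name, fn })
  }
  // Publish: run subscribers one after another, then call the final callback
  callAsync(arg, callback) {
    let index = 0
    const next = err => {
      if (err || index === this.taps.length) return callback(err)
      const { fn } = this.taps[index++]
      fn(arg, next)
    }
    next()
  }
}

// A plugin subscribes to hooks inside its apply method
class LogPlugin {
  constructor(log) {
    this.log = log
  }
  apply(compiler) {
    compiler.hooks.beforeRun.tapAsync('LogPlugin', (c, done) => {
      this.log.push('beforeRun')
      done()
    })
    compiler.hooks.emit.tapAsync('LogPlugin', (c, done) => {
      this.log.push('emit')
      done()
    })
  }
}

const log = []
const compiler = {
  hooks: { beforeRun: new MiniAsyncSeriesHook(), emit: new MiniAsyncSeriesHook() }
}
new LogPlugin(log).apply(compiler)

compiler.hooks.beforeRun.callAsync(compiler, () => {
  compiler.hooks.emit.callAsync(compiler, () => {
    log.push('done')
  })
})
console.log(log) // [ 'beforeRun', 'emit', 'done' ]
```

The compiler decides when each hook fires, and the plugin decides what happens at that moment; neither needs to know the other’s internals.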

2. Compile

Next we declare a Compilation class, which does the actual compiling. In its constructor we first receive the information handed down by the boss, the Compiler, and store it on our own properties.

class Compilation {
  constructor(props) {
    const {
      entry,
      root,
      loaders,
      hooks
    } = props
    this.entry = entry
    this.root = root
    this.loaders = loaders
    this.hooks = hooks
  }
  // Start compiling
  async make() {
    await this.moduleWalker(this.entry)
  }
  // DFS traversal function
  moduleWalker = async () => {}
}


Because every file referenced during packaging has to be compiled into the final bundle, we declare a depth-first traversal function, moduleWalker (the name is my own, not an official Webpack one). As the name suggests, it starts from the entry file, runs the first and second compilation steps on it, collects the other modules it references, and recursively does the same for each of them.

Compilation is divided into two steps:

  1. The first step compiles the file with every loader that matches it and returns the compiled source code.
  2. The second step corresponds to Webpack’s own compilation, whose goal is to build the dependency relationships between modules. What we need to do is replace every require call with Webpack’s own __webpack_require__ function. All compiled modules are stored by Webpack in a moduleMap object held in a closure, and __webpack_require__ is the only function that can access moduleMap.

In one sentence: __webpack_require__ replaces the file path -> file content relationship between modules with a key -> value (file content) lookup on an object.

When the second step finishes, the references found in the current module are collected and returned to the Compilation, so that moduleWalker can compile those dependencies recursively. Circular and repeated references are very likely, so we generate a unique key from the path of each referenced file and skip any key we have already seen.
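The skip-on-repeat rule is what keeps the recursion from looping forever. A minimal sketch, assuming a hypothetical in-memory dependency graph in place of real files and real compilation:

```javascript
// Hypothetical in-memory "file system": module key -> keys it requires
const graph = {
  './src/index.js': ['./src/App.js'],
  './src/App.js': ['./src/Switch.js', './src/util.js'],
  './src/Switch.js': ['./src/util.js'], // repeated reference
  './src/util.js': ['./src/App.js']     // circular reference
}

function walk(entry) {
  const moduleMap = {} // key -> "compiled" module
  const order = []
  function visit(key) {
    if (key in moduleMap) return
    // Register before recursing so that a cycle hits the check above
    moduleMap[key] = `compiled(${key})`
    order.push(key)
    for (const dep of graph[key]) visit(dep)
  }
  visit(entry)
  return order
}

console.log(walk('./src/index.js'))
// [ './src/index.js', './src/App.js', './src/Switch.js', './src/util.js' ]
```

Each module is compiled exactly once, in the order the depth-first search first reaches it.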

i. The moduleWalker traversal function

// Map that stores the processed module code
moduleMap = {}
// Map that stores the content hash of each module (written to in moduleWalker)
assets = {}

// Compile all referenced files according to dependencies
async moduleWalker(sourcePath) {
  if (sourcePath in this.moduleMap) return
  // To read a file, we need the full.js file path
  sourcePath = completeFilePath(sourcePath)
  const [ sourceCode, md5Hash ] = await this.loaderParse(sourcePath)
  const modulePath = getRootPath(this.root, sourcePath, this.root)
  // Get the compiled code of the module and the dependency array in the module
  const [ moduleCode, relyInModule ] = this.parse(sourceCode, path.dirname(modulePath))
  // Put the module code into the ModuleMap
  this.moduleMap[modulePath] = moduleCode
  this.assets[modulePath] = md5Hash
  // Then parse the dependencies in the module
  for (let i = 0; i < relyInModule.length; i++) {
    await this.moduleWalker(relyInModule[i], path.dirname(relyInModule[i]))
  }
}

If we log the path visited by the DFS, we can watch this process unfold.

ii. First compilation step: the loaderParse function

async loaderParse(entryPath) {
  // Read the file contents in utf8 format
  let [ content, md5Hash ] = await readFileWithHash(entryPath)
  // Get the user-injected loader
  const { loaders } = this
  // Pass all loaders in sequence
  for (let i = 0; i < loaders.length; i++) {
    const loader = loaders[i]
    const { test : reg, use } = loader
    if (entryPath.match(reg)) {
      // Determine whether the re or string requirements are met
      // If this rule requires multiple Loaders to be applied, proceed from the last loader
      if (Array.isArray(use)) {
        // Iterate over a reversed copy so the user's config array is not emptied by pop()
        for (const cur of [...use].reverse()) {
          const loaderHandler = 
            typeof cur.loader === 'string' 
            // loader may also come from package packages such as babel-loader
              ? require(cur.loader)
              : (
                typeof cur.loader === 'function'
                ? cur.loader : _ => _
              )
          content = loaderHandler(content)
        }
      } else if (typeof use.loader === 'string') {
        const loaderHandler = require(use.loader)
        content = loaderHandler(content)
      } else if (typeof use.loader === 'function') {
        const loaderHandler = use.loader
        content = loaderHandler(content)
      }
    }
  }
  return [ content, md5Hash ]
}
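
The last-to-first order used for an array of loaders can be shown with two toy loaders. This is only a sketch: upperCase, exclaim and runLoaders are hypothetical names, and real loaders receive more context than a plain string.

```javascript
// Two toy loaders: each takes source text and returns transformed text
const upperCase = source => source.toUpperCase()
const exclaim = source => source + '!'

// Apply loaders from the last entry to the first without mutating the array
function runLoaders(loaders, source) {
  return loaders.reduceRight((content, loader) => loader(content), source)
}

const use = [upperCase, exclaim]
console.log(runLoaders(use, 'hello')) // HELLO!
console.log(use.length)               // 2
```

Leaving the array intact matters because the same rule is applied again to every other file that matches it.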

There is one small catch, though: the babel-loader we normally use does not seem to work outside of Webpack. Its documentation says:

This package allows transpiling JavaScript files using Babel and webpack.

@babel/core, however, has nothing to do with Webpack, so I took the trouble of writing a small loader function on top of it to handle JS and ES6 syntax.

const babel = require('@babel/core')

module.exports = function BabelLoader (source) {
  const res = babel.transform(source, {
    sourceType: 'module' // Compile the ES6 import and export syntax
  })
  return res.code
}

Of course, the compilation rules could be passed in as options, but to mimic a real development setup we configure them in a babel.config.js file.

module.exports = function (api) {
  api.cache(true)
  return {
    "presets": [
      ['@babel/preset-env', {
        targets: {
          "ie": "8"
        }
      }],
      '@babel/preset-react' // Compile JSX
    ],
    "plugins": [
      ["@babel/plugin-transform-template-literals", {
        "loose": true
      }]
    ],
    "compact": true
  }
}

Thus, after obtaining the code handled by the Loader, any module can theoretically be used directly in a browser or unit test. But our code is a whole, and we need a way to organize our code references.

This also explains why we need the __webpack_require__ function. The code we have at this point is still a string; for convenience we use eval to turn the string into executable code. That is only the quick route: in an interpreted language like JS, evaluating modules one string at a time is slow. A real production build instead wraps each module’s contents in an IIFE (immediately invoked function expression).
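The two ways of running a module that is still a string can be compared side by side. In this sketch fakeRequire and the module source are made up for illustration, and new Function is only a stand-in for emitting the module as a real function in the bundle file.

```javascript
const moduleCode = 'module.exports = __webpack_require__("./answer.js") + 1'
// Hypothetical require function that knows a single module
const fakeRequire = id => (id === './answer.js' ? 41 : undefined)

// Variant 1: eval inside a wrapper function, so the parameters are in scope
function runWithEval(code) {
  const module = { exports: {} }
  ;(function (module, exports, __webpack_require__) {
    eval(code)
  })(module, module.exports, fakeRequire)
  return module.exports
}

// Variant 2: compile the string into a function with new Function
function runWithFunction(code) {
  const module = { exports: {} }
  const fn = new Function('module', 'exports', '__webpack_require__', code)
  fn(module, module.exports, fakeRequire)
  return module.exports
}

console.log(runWithEval(moduleCode))     // 42
console.log(runWithFunction(moduleCode)) // 42
```

Either way, the module only ever sees the three names it is handed, which is what lets the bundler control how dependencies are resolved.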

To summarize, all that the second compilation step, the parse function, really has to do is rename the require calls in every module to __webpack_require__. We use the Babel toolchain for this step. Babel, as a leading JS compiler, analyzes code in two stages: lexical analysis and syntax analysis. Put simply, it reads a code snippet token by token, builds up a context from the tokens seen so far, and then uses that context to decide what role the next token plays.

Note that in this step we can also “incidentally” collect the module’s dependency array and return it together (for DFS recursion)

const parser = require('@babel/parser')
const traverse = require('@babel/traverse').default
const types = require('@babel/types')
const generator = require('@babel/generator').default
...
// Parse the source code and replace the require method to build the ModuleMap
parse(source, dirpath) {
  const inst = this
  // Parse the code into an AST
  const ast = parser.parse(source)
  const relyInModule = [] // Get all the modules that the file depends on
  traverse(ast, {
    // Retrieves all lexical analysis nodes, executes when a function call expression is encountered, and overwrites the AST tree
    CallExpression(p) {
      // Some require are wrapped by _interopRequireDefault
      // You need to find the _interopRequireDefault node first
      if (p.node.callee && p.node.callee.name === '_interopRequireDefault') {
        const innerNode = p.node.arguments[0]
        if (innerNode.callee.name === 'require') {
          inst.convertNode(innerNode, dirpath, relyInModule)
        }
      } else if (p.node.callee.name === 'require') {
        inst.convertNode(p.node, dirpath, relyInModule)
      }
    }
  })
  // Reassemble the rewritten AST tree into a new piece of code and return it with the dependencies
  const moduleCode = generator(ast).code
  return [ moduleCode, relyInModule ]
}
/**
 * Converts the name and arguments of a node into the new node we want
 */
convertNode = (node, dirpath, relyInModule) => {
  node.callee.name = '__webpack_require__'
  // Argument string names, such as 'react', './ myname.js'
  let moduleName = node.arguments[0].value
  // Generate the dependency module relative to the project root directory path
  let moduleKey = completeFilePath(getRootPath(dirpath, moduleName, this.root))
  // Collect module array
  relyInModule.push(moduleKey)
  // Replace the __webpack_require__ parameter string, which is also the moduleKey of the corresponding module and needs to be consistent
  // Since every element in the AST tree is a Babel node, '@babel/types' is required to generate
  node.arguments = [ types.stringLiteral(moduleKey) ]
}

3. emit: generating the bundle file

At this point the compilation work is done. A bundle produced by Webpack is an immediately invoked function whose body is a closure: the closure caches the modules that have already been loaded in installedModules and defines a __webpack_require__ function, and it finally returns the exports of the entry module. The argument passed to the function is the key-value map of all modules.

Here we assemble the output with an EJS template, iterating over the moduleMap object collected earlier and injecting it into the template string.

Template code

// template.ejs
(function(modules) { // webpackBootstrap
  // The module cache
  var installedModules = {};
  // The require function
  function __webpack_require__(moduleId) {
      // Check if module is in cache
      if(installedModules[moduleId]) {
          return installedModules[moduleId].exports;
      }
      // Create a new module (and put it into the cache)
      var module = installedModules[moduleId] = {
          i: moduleId,
          l: false,
          exports: {}
      };
      // Execute the module function
      modules[moduleId].call(module.exports, module, module.exports, __webpack_require__);
      // Flag the module as loaded
      module.l = true;
      // Return the exports of the module
      return module.exports;
  }
  // Load entry module and return exports
  return __webpack_require__(__webpack_require__.s = "<%-entryId%>");
})({
  <% for (let key in modules) { %>
     "<%-key%>":
         (function(module, exports, __webpack_require__) {
             eval(
                 `<%-modules[key]%>`
             );
         }),
  <% } %>
});

Generating bundle.js

/**
 * Emits the files, generating the final bundle.js
 */
emitFile() { // Emit the packaged output file
  // Compare the cache first to see if the file has changed
  const assets = this.compilation.assets
  const pastAssets = this.getStorageCache()
  if (loadsh.isEqual(assets, pastAssets)) {
    // If the hash value of the file does not change, there is no need to rewrite the file
    // Check whether each corresponding file exists
    // This step is omitted!
  } else {
    // The cache failed to hit
    // Get the output file path
    const outputFile = path.join(this.distPath, this.distName);
    // Get the output file template
    // const templateStr = this.generateSourceCode(path.join(__dirname, '..', "bundleTemplate.ejs"));
    const templateStr = fs.readFileSync(path.join(__dirname, '..', "template.ejs"), 'utf-8');
    // Render output file template
    const code = ejs.render(templateStr, {entryId: this.entryId, modules: this.compilation.moduleMap});
    
    this.assets = {};
    this.assets[outputFile] = code;
    // Write the rendered code to the output file
    fs.writeFile(outputFile, this.assets[outputFile], function(e) {
      if (e) {
        console.log('[Error] ' + e)
      } else {
        console.log('[Success] compile successfully')
      }
    });
    // Write the cache information to the cache file
    fs.writeFileSync(resolve(this.distPath, 'manifest.json'), JSON.stringify(assets, null, 2))
  }
}

In this step we compare the MD5 hashes generated from the file contents with the previous cache to speed up packaging. If you look closely at real projects, you will notice that many Webpack setups also write a manifest.json file on every build.
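The comparison itself needs nothing more than Node’s built-in crypto module. In the sketch below, md5, hashAssets and changedModules are hypothetical helpers, and the module sources are placeholder strings rather than real files.

```javascript
const crypto = require('crypto')

// Hypothetical helper: MD5 digest of a module's source text
function md5(content) {
  return crypto.createHash('md5').update(content).digest('hex')
}

// Build a { modulePath: hash } record for a set of module sources
function hashAssets(sources) {
  const assets = {}
  for (const [modulePath, content] of Object.entries(sources)) {
    assets[modulePath] = md5(content)
  }
  return assets
}

// Return the module paths whose hash is new or differs from the cached record
function changedModules(assets, pastAssets) {
  return Object.keys(assets).filter(key => assets[key] !== pastAssets[key])
}

const previous = hashAssets({ './src/App.js': 'v1', './src/Switch.js': 'v1' })
const current = hashAssets({ './src/App.js': 'v2', './src/Switch.js': 'v1' })

console.log(changedModules(current, previous)) // [ './src/App.js' ]
console.log(changedModules(current, current))  // []
```

An empty result means the bundle can be left as it is. Note that this simple check does not notice modules that were deleted since the last build.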

{
  "main.js": "./js/main7b6b4.js",
  "main.css": "./css/maincc69a7ca7d74e1933b9d.css",
  "main.js.map": "./js/main7b6b4.js.map",
  "vendors~main.js": "./js/vendors~main3089a.js",
  "vendors~main.css": "./css/vendors~maincc69a7ca7d74e1933b9d.css",
  "vendors~main.js.map": "./js/vendors~main3089a.js.map",
  "js/28505f.js": "./js/28505f.js",
  "js/28505f.js.map": "./js/28505f.js.map",
  "js/34c834.js": "./js/34c834.js",
  "js/34c834.js.map": "./js/34c834.js.map",
  "js/4d218c.js": "./js/4d218c.js",
  "js/4d218c.js.map": "./js/4d218c.js.map",
  "index.html": "./index.html",
  "static/initGlobalSize.js": "./static/initGlobalSize.js"
}

The same kind of check is commonly used for resumable file transfers, which we will not go into here.


Verification

Now we are pretty much done (not true, if you count the mind-numbing debugging process), so let’s configure the build script in package.json.

"scripts": {
  "build": "node build.js"
}

Run yarn build

(@ο@) Wow ~ The exciting moment has arrived.

However…

Looking at the odd error this bundle throws, I could not help wanting to laugh a little. On inspection, a backtick inside a comment in the module code collided with the backticks wrapping it in the template, so the template string ended prematurely. Fine: I added a few lines to the Babel traverse step to strip all comments from the code. But then other problems turned up.
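Stripping comments removes this particular backtick, but template literals inside the module code would trigger the same failure. One alternative, which is not what this article’s code does, is to embed the module source with JSON.stringify so that no backticks are needed at all. A sketch, with a made-up module string and a hypothetical tryRun helper:

```javascript
const moduleCode = 'var s = `hi`; // a stray ` in a comment\nmodule.exports = s + "!"'

// Fragile: wrapping in backticks breaks as soon as the code contains one
const fragile = 'eval(`' + moduleCode + '`)'

// Safer: JSON.stringify escapes quotes and line breaks and never needs backticks
const safe = 'eval(' + JSON.stringify(moduleCode) + ')'

// Hypothetical helper: run a wrapper string as a module body and report the outcome
function tryRun(wrapper) {
  const module = { exports: {} }
  try {
    new Function('module', wrapper)(module)
    return module.exports
  } catch (e) {
    return e.name
  }
}

console.log(tryRun(fragile)) // SyntaxError
console.log(tryRun(safe))    // hi!
```

In the EJS template this would presumably amount to emitting the stringified module inside eval( ) instead of wrapping it in backticks.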

The real production build of React probably involves some extra steps, but that is not today’s topic. At this point I was somehow reminded of NervJS, JD’s own high-performance, highly compatible React-like framework, which closely follows React releases. Perhaps NervJS’s more approachable code could be handled by this humble packaging tool.

So we configure an alias in babel.config.js to swap out the React dependencies. (Migrating a React project to NervJS is easy.)

module.exports = function (api) {
  api.cache(true)
  return {
    ...
    "plugins": [
      ...
      [
        "module-resolver", {
          "root": ["."],
          "alias": {
            "react": "nervjs",
            "react-dom": "nervjs",
            // Not necessary unless you consume a module using `createClass`
            "create-react-class": "nerv-create-class"
          }
        }
      ]
    ],
    "compact": true
  }
}

Run yarn build

(@ο@) Wow! The code finally runs successfully. There are still plenty of problems, but at least this simple design of a mini Webpack already has the potential to support most JS frameworks. Interested readers can try writing their own, or clone the code from here and read it.

Webpack is without doubt an excellent module bundler (even though its website carries no slogan saying so). A great tool keeps its own identity while giving other developers the ability to build on it and extend it beyond what was originally conceived. Studying such tools in depth will greatly improve our understanding of code engineering.


Welcome to the Aotu Lab blog: AOtu.io

Or follow the Aotu Lab public account (AOTULabs), which publishes articles from time to time.