Preface

Node.js introductory series:

  • The NodeJS module mechanism is simple
  • NPM 7.0 source code analysis (2) NPM install execution logic
  • NPM 7.0 source code analysis (3) NPM config

The Node.js module mechanism was analyzed in a previous article; if you are not familiar with it, read that article first. NPM is the package management tool for Node.js: it is used to add, remove, update, and look up CommonJS modules in the Node.js stack. NPM itself is also a CommonJS module. It can be added, removed, updated, and looked up with the npm command, and it complies with the module specification. All of the following analysis is based on version 7.0; compared with previous versions, the overall execution logic has been substantially refactored, and the code is clearer and easier to maintain and extend. Looking at its package.json, the bin field shows that npm is an executable command whose entry point is bin/npm-cli.js. Without further ado, we start from this entry point.

Core starting principle

To analyze the whole NPM startup logic, the most direct approach is to debug it in VS Code. I may need to change code inside the npm package while debugging, and I would rather not do that in the globally installed package, so I installed npm locally; debugging against the global installation also works. The command used as the running example is npm i nopt --no-package-lock. Create a launch.json, set a breakpoint, and press F5 to start debugging.

You can also follow along in the debugger while reading this article, which gives a better feel for the flow.

{
  "configurations": [
    {
      "type": "node",
      "runtimeExecutable": "/usr/local/bin/node",
      "request": "launch",
      "name": "Launch Program",
      "skipFiles": [
        "<node_internals>/**"
      ],
      "program": "index.js",
      "args": ["i", "nopt", "--no-package-lock"]
    }
  ]
}

NPM logical entry

Since the logical entry of the npm command line is bin/npm-cli.js, we go straight to that file. Its source is very short and simply delegates to lib/cli.js.

#!/usr/bin/env node
require('../lib/cli.js')(process)

So let’s skip ahead and take a look at what cli.js does, starting with the core code logic:

checkForBrokenNode()
...
checkForUnsupportedNode()
...
const npm = require('../lib/npm.js')
...
npm.load(async er => {
  ...
  const cmd = npm.argv.shift()
  const impl = npm.commands[cmd]
  if (impl)
    impl(npm.argv, errorHandler)
  else {
    npm.config.set('usage', false)
    npm.argv.unshift(cmd)
    npm.commands.help(npm.argv, errorHandler)
  }
})

To sum up, cli.js does three important things:

  1. Checks the Node.js version and prints compatibility warnings
  2. Loads the core module lib/npm.js to obtain the npm instance
  3. Calls npm.load; once loading has finished and process.argv has been parsed, it looks up the implementation of the requested command and executes it
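
The third step, command dispatch, can be tried out in isolation. The following is only a minimal sketch, not npm's code: the commands table, the dispatch function, and the strings it returns are hypothetical stand-ins. Only the pattern of shifting the command name, looking it up, and falling back to help comes from the excerpt above.

```javascript
// Hypothetical command table standing in for npm.commands.
const commands = {
  install: (args) => `install ${args.join(' ')}`,
  help: (args) => `help ${args.join(' ')}`,
}

// Take the first positional argument as the command name. If there is no
// implementation for it, put the name back and hand everything to help.
function dispatch (argv) {
  const rest = argv.slice() // work on a copy, leave the caller's array alone
  const cmd = rest.shift()
  const impl = Object.prototype.hasOwnProperty.call(commands, cmd)
    ? commands[cmd]
    : undefined
  if (impl) return impl(rest)
  rest.unshift(cmd)
  return commands.help(rest)
}

console.log(dispatch(['install', 'nopt']))
console.log(dispatch(['frobnicate', 'x']))
```

An unknown command ends up as an argument to help, which is how the real cli.js can print help for whatever the user typed.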

The key point is that lib/npm.js provides the npm instance, and most of the core processing logic lives on that instance, so we need to analyze the whole process of NPM instantiation.

NPM instantiation

As the source shows, the npm instance is created from an anonymous class that extends EventEmitter. Here is its constructor, with some unimportant code omitted:

const npm = module.exports = new class extends EventEmitter {
  constructor () {
    super()
    ...
    this.command = null
    this.commands = proxyCmds(this.commands)
    this.version = require('../package.json').version
    this.config = new Config({
      npmPath: dirname(__dirname),
      types,
      defaults,
      shorthands,
    })
    this[_title] = process.title
    this.updateNotification = null
  }

Two important things happen here:

  1. proxyCmds loads and proxies all of the defined executable commands
  2. The required configuration, config, is obtained (see the detailed npmrc configuration). The configuration is maintained in config.data, a Map on the Config instance

The Map has one entry for each of ['default','builtin','global','user','project','env','cli']:

  • default: the default values of all command-line options
  • builtin: runtime configuration from the built-in npmrc
  • global: runtime configuration from the global npmrc
  • user: runtime configuration from the user's npmrc
  • project: runtime configuration from the current project's npmrc
  • env: all environment variables whose names start with npm_config_
  • cli: all options of the currently executing CLI command (e.g. --save), parsed by nopt according to the types and shorthands of all command-line options (e.g. --save and its shorthand -S). The result is an object keyed by option name, such as {save: true, 'save-dev': false}
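
To make the env layer more concrete, the sketch below collects npm_config_ variables from an environment object. It is a simplification under assumptions: the helper name envLayer is hypothetical, values are kept as raw strings, and the key normalization (lowercase, underscores to hyphens) is an assumed convention for illustration, not a transcription of npm's own loader.

```javascript
// Hypothetical helper: build a plain object from every environment
// variable whose name starts with npm_config_ (case-insensitive).
function envLayer (env) {
  const prefix = /^npm_config_/i
  const layer = {}
  for (const [name, value] of Object.entries(env)) {
    if (!prefix.test(name)) continue
    // Assumed normalization: strip the prefix, lowercase, '_' becomes '-'.
    const key = name.replace(prefix, '').toLowerCase().replace(/_/g, '-')
    layer[key] = value
  }
  return layer
}

// Sample environment; the registry URL is a placeholder.
const sample = {
  npm_config_registry: 'https://registry.example.test/',
  NPM_CONFIG_SAVE_DEV: 'true',
  PATH: '/usr/bin',
}
console.log(envLayer(sample))
```

In a real process you would pass process.env instead of the sample object.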

Take a look at the implementation logic one by one, starting with proxyCmds:

const proxyCmds = (npm) => {
  const cmds = {}
  return new Proxy(cmds, {
    get: (prop, cmd) => {
      if (hasOwnProperty(cmds, cmd))
        return cmds[cmd]

      const actual = deref(cmd)
      if (!actual) {
        cmds[cmd] = undefined
        return cmds[cmd]
      }
      if (cmds[actual]) {
        cmds[cmd] = cmds[actual]
        return cmds[cmd]
      }
      // create the implementation under the real name, then alias it
      cmds[actual] = makeCmd(actual)
      cmds[cmd] = cmds[actual]
      return cmds[cmd]
    },
  })
}

const makeCmd = cmd => {
  const impl = require(`./${cmd}.js`)
  const fn = (args, cb) => npm[_runCmd](cmd, impl, args, cb)
  Object.assign(fn, impl)
  return fn
}

proxyCmds returns a Proxy instance:

  • All commands defined in npm (e.g. install) are kept in the cmds object, and a Proxy instance is created with cmds as its target
  • The command name is passed through deref. deref first converts the name from camelCase to kebab-case and then resolves the final real name through the alias-to-real-name mapping (for example, the i in npm i resolves to install); the module file is then required by that real name (in this case lib/install.js)
  • In makeCmd, the module file for the command is required by its real kebab-case name, and every required command is wrapped uniformly by the instance method npm[_runCmd]
  • After the implementation for a command has been created, cmds also assigns the implementation stored under the real name to the alias. This is why npm commands can be invoked through many aliases. The following figure shows that the alias of install also exists in cmds.
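
The lazy, alias-aware lookup described above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the alias table, the known set, and the string-returning makeCmd are hypothetical stand-ins (the real makeCmd requires a module file), and deref here skips the camelCase conversion.

```javascript
// Hypothetical alias table; npm keeps its own, much larger one.
const aliases = new Map([['i', 'install'], ['add', 'install'], ['rm', 'uninstall']])
const known = new Set(['install', 'uninstall'])

// Resolve a name or alias to the real command name, or undefined.
const deref = (name) => (known.has(name) ? name : aliases.get(name))

let created = 0
// Stand-in for makeCmd: counts how many implementations get built.
const makeCmd = (cmd) => {
  created++
  return (args) => `${cmd}(${args.join(', ')})`
}

const cmds = {}
const commands = new Proxy(cmds, {
  get (target, name) {
    if (Object.prototype.hasOwnProperty.call(target, name)) return target[name]
    const actual = deref(name)
    if (!actual) return (target[name] = undefined) // remember unknown names
    if (!target[actual]) target[actual] = makeCmd(actual)
    return (target[name] = target[actual]) // cache under the alias as well
  },
})

console.log(commands.i(['nopt']))
console.log(commands.i === commands.install)
console.log(created)
```

Nothing is built until a command is first accessed, and an alias and its real name share one function object, so the implementation is created only once.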

Let’s look at how configuration information is obtained:

Take a look at the Config constructor. The types, shorthands, and defaults configurations are available in the source code.

constructor ({
  types,
  shorthands,
  defaults,
  npmPath,
  ...
}) {
  ...
  this.data = new Map()
  let parent = null
  for (const where of wheres) {
    // wheres = ['default','builtin','global','user','project','env','cli']
    this.data.set(where, parent = new ConfigData(parent))
  }
}

class ConfigData {
  constructor (parent) {
    this[_data] = Object.create(parent && parent.data)
    this[_source] = null
    this[_loadError] = null
    this[_raw] = null
    this[_valid] = true
  }

  get data () {
    return this[_data]
  }

The ConfigData constructor shows that, following the order ['default','builtin','global','user','project','env','cli'], the data object of each ConfigData instance has the data object of the previous instance as its prototype, and together the layers form the merged config.data. Consequently, assigning a field on a layer where hasOwnProperty is false for that field creates an own property there, which takes precedence over the parent's value. config.get(key, where = 'cli') can therefore be used to read the configuration item for a key: if the key is not found on the cli layer, the lookup walks up the prototype chain step by step. The ConfigData instances are populated during npm.load, when npm.config.load runs. The image above shows what data looks like before config is loaded, with every ConfigData.data empty.
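
The prototype chaining is easy to verify with plain objects. The sketch below is an illustration only: it uses bare objects instead of ConfigData instances, and the option values (including the registry URLs) are made up.

```javascript
const wheres = ['default', 'builtin', 'global', 'user', 'project', 'env', 'cli']

// One plain object per layer; each layer's prototype is the previous
// layer, mirroring Object.create(parent && parent.data).
const layers = new Map()
let parent = null
for (const where of wheres) {
  parent = Object.create(parent)
  layers.set(where, parent)
}

// Made-up values on two of the layers.
layers.get('default').registry = 'https://registry.example.test/'
layers.get('default').save = true
layers.get('user').registry = 'https://mirror.example.test/'

const cli = layers.get('cli')
console.log(cli.registry) // resolved from the user layer
console.log(cli.save)     // falls through to the default layer

// Writing on the cli layer shadows the value; lower layers are untouched.
cli.save = false
console.log(cli.save, layers.get('default').save)
```

Reading from the last layer therefore gives the highest-priority value without any explicit merge step.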

At this point NPM instantiation is complete, and the next step is to execute npm.load, whose core logic is in the [_load] method. It first resolves the executable behind process.argv[0] (here /usr/local/bin/node). This is the node command, because npm actually runs npm-cli.js through node. It then calls config.load.

async [_load] () {
  const node = await which(process.argv[0]).catch(er => null)
  if (node && node.toUpperCase() !== process.execPath.toUpperCase()) {
    log.verbose('node symlink', node)
    process.execPath = node
    this.config.execPath = node
  }

  await this.config.load()
  this.argv = this.config.parsedArgv.remain
  ...
}

Here, parsedArgv.remain from the nopt parse is assigned to npm.argv; the remain field holds the command-line arguments left over after nopt has parsed out the options. In this case, remain is ['i', 'nopt'].

config.load:
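
To make the role of remain concrete, the sketch below uses a toy parser. It is not nopt and ignores types and shorthands entirely; the function parseArgs is hypothetical and handles only --flag, --no-flag, and --key=value. Only the result shape, with leftover arguments under argv.remain, imitates nopt.

```javascript
// Toy parser, NOT nopt: separates options from positional arguments.
function parseArgs (args) {
  const result = { argv: { remain: [] } }
  for (const arg of args) {
    if (!arg.startsWith('--')) {
      result.argv.remain.push(arg) // positional: command name, package name...
    } else if (arg.startsWith('--no-')) {
      result[arg.slice(5)] = false // --no-foo turns foo off
    } else {
      const [key, value] = arg.slice(2).split('=')
      result[key] = value === undefined ? true : value
    }
  }
  return result
}

const parsed = parseArgs(['i', 'nopt', '--no-package-lock'])
console.log(parsed.argv.remain)
console.log(parsed['package-lock'])
```

The options go into the cli layer of the configuration, while the leftover arguments become npm.argv: the command name followed by its arguments.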

 async load () {
    if (this.loaded)
      throw new Error('attempting to load npm config multiple times')

    this.loadDefaults()
    await this.loadBuiltinConfig()
    this.loadCLI()
    this.loadEnv()
    await this.loadProjectConfig()
    await this.loadUserConfig()
    await this.loadGlobalConfig()
    
    ...
    this.validate()
   
    this[_loaded] = true

    this.globalPrefix = this.get('prefix')   
      ...
  }

config.load loads every layer in ['default','builtin','global','user','project','env','cli']; the completed config.data is shown below. default holds the default command-line option values defined in the source code.

When npm.load finishes, it invokes the callback that cli.js passed to it.

npm.load(async er => {
    if (er)
      return errorHandler(er)
    if (npm.config.get('version', 'cli')) {
      console.log(npm.version)
      return errorHandler.exit(0)
    }

    if (npm.config.get('versions', 'cli')) {
      npm.argv = ['version']
      npm.config.set('usage', false, 'cli')
    }

    npm.updateNotification = await updateNotifier(npm)

    const cmd = npm.argv.shift()
    const impl = npm.commands[cmd]
    if (impl)
      impl(npm.argv, errorHandler)
    else {
      npm.config.set('usage', false)
      npm.argv.unshift(cmd)
      npm.commands.help(npm.argv, errorHandler)
    }
  })

In the callback, npm.argv.shift() yields the name of the npm command being run, i in this case, and the remaining npm.argv is ['nopt']. As mentioned above, npm.commands proxies the execution logic of all defined commands. Calling impl(npm.argv, errorHandler) enters the concrete command logic, in this case lib/install.js. As mentioned earlier for makeCmd, every command's real entry point converges on the npm instance method npm[_runCmd](cmd, impl, args, cb). Here is the source of [_runCmd]; its core logic is very simple:

 [_runCmd] (cmd, impl, args, cb) {
    ...
    
    if (this.config.get('usage')) {
      console.log(impl.usage)
      cb()
    } else {
      impl(args, er => {
        process.emit('timeEnd', `command:${cmd}`)
        cb(er)
      })
    }
  }

If the usage option is true, as with npm install --usage, the usage text of the current command is printed instead of running it. The usage of install is shown below. In particular, the install target can take many forms besides a module name: a folder path, a git address, or a tarball address. Its command-line options are also fairly rich and cover a wide range of module installation and configuration needs.

const usage = usageUtil(
  'install',
  'npm install (with no args, in package dir)' +
  '\nnpm install [<@scope>/]<pkg>' +
  '\nnpm install [<@scope>/]<pkg>@<tag>' +
  '\nnpm install [<@scope>/]<pkg>@<version>' +
  '\nnpm install [<@scope>/]<pkg>@<version range>' +
  '\nnpm install <alias>@npm:<name>' +
  '\nnpm install <folder>' +
  '\nnpm install <tarball file>' +
  '\nnpm install <tarball url>' +
  '\nnpm install <git:// url>' +
  '\nnpm install <github username>/<github project>',
  '[--save-prod|--save-dev|--save-optional|--save-peer] [--save-exact] [--no-save]'
)

If --usage is not set, the real install logic runs, namely the function exported by the lib/install.js module. As mentioned for proxyCmds above, the module file for a command is required by its real command name. This is also one of the larger refactorings in version 7.0 compared with previous versions: the code organization is easy to maintain and extend, which makes adding new commands straightforward. Every command module defined in the npm source exports the following shape:

const cmd = (args, cb) => install(args).then(() => cb()).catch(cb)

Object.assign(cmd, { completion, usage })
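
The export shape and the usage switch in [_runCmd] fit together as sketched below. This is an illustration under assumptions, not npm's code: greet is a hypothetical command, runCmd is a simplified stand-in for npm[_runCmd] that takes the usage flag as a plain option instead of reading it from the config, and the timing events are left out.

```javascript
// Hypothetical async implementation of a command.
const greet = async (args) => {
  if (args.length === 0) throw new Error('nothing to greet')
  return `hello ${args.join(' ')}`
}

// Same shape as the export above: a callback-style function with
// metadata attached as properties.
const cmd = (args, cb) => greet(args).then(() => cb()).catch(cb)
Object.assign(cmd, { usage: 'greet <name>' })

// Simplified stand-in for npm[_runCmd]: print usage or run the command.
function runCmd (impl, args, { usage = false } = {}, cb) {
  if (usage) {
    console.log(impl.usage)
    return cb()
  }
  impl(args, cb)
}

runCmd(cmd, ['npm'], {}, (er) => console.log('done, error:', er))
runCmd(cmd, [], {}, (er) => console.log('failed:', er.message))
runCmd(cmd, [], { usage: true }, () => console.log('usage printed'))
```

Because every module exposes the same callable-plus-metadata shape, one generic runner can serve all commands. One property of the then/catch chain is worth knowing: an exception thrown inside cb itself is also routed to cb.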

Conclusion

At this point, with the npm instance created and loaded, we have all the configuration needed by every command at run time; what remains is for each command to carry out its own logic based on that configuration. We will analyze the execution logic of each npm command one by one in later articles. With this foundation, you should be able to step into any npm command and see what it does. If you would like to discuss anything after reading, you are welcome to leave a comment so we can make progress together.