About Yarn

Yarn and npm are both JavaScript package managers, and there are others as well, such as cnpm and pnpm. If one of them were enough, why have so many wheels been reinvented?

Why Yarn? What makes it different from the other tools?

Tip: in this article, npm refers to npm 2.

Differences from npm

  • Yarn downloads and installs dependency packages in parallel, while npm processes them one at a time, which opens up a speed gap.
  • Yarn reads packages that have already been downloaded from its local cache and only requests the remote registry when the cache has no copy. npm, in contrast, requests everything again, which widens the gap further.
  • Yarn lays all dependencies out on the same level, which reduces duplicate downloads of the same dependency, speeds up installation and shrinks node_modules. npm, in contrast, downloads strictly according to the dependency tree and places each package at its position in that tree, so the same package is downloaded several times and node_modules grows large.

Differences from cnpm

  • cnpm uses a mirror hosted in China, which is faster there (other tools can also change their registry address).
  • cnpm gathers all the packages downloaded for a project into its own cache folder and places them in the project's node_modules through soft links.

Differences from pnpm

  • Like Yarn, pnpm has a single directory for managing dependencies.
  • pnpm keeps the original dependency tree structure of npm 2, but every dependency package in node_modules is stored as a soft link.
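
The soft-link approach mentioned for cnpm and pnpm can be illustrated with Node's built-in fs module. This is only a minimal sketch and not how either tool is implemented: the store layout, the temporary directories and the helper name linkFromStore are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: expose a package kept in a shared store inside a
// project's node_modules through a symbolic link instead of a copy.
function linkFromStore(storeDir, projectDir, name) {
  const target = path.join(storeDir, name);
  const nodeModules = path.join(projectDir, 'node_modules');
  fs.mkdirSync(nodeModules, { recursive: true });
  const linkPath = path.join(nodeModules, name);
  // The 'junction' type only matters on Windows; other platforms ignore it.
  fs.symlinkSync(target, linkPath, 'junction');
  return linkPath;
}

// Demo in a throwaway temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'store-demo-'));
const storeDir = path.join(root, 'store');
const projectDir = path.join(root, 'project');
fs.mkdirSync(path.join(storeDir, 'lodash'), { recursive: true });
fs.writeFileSync(path.join(storeDir, 'lodash', 'package.json'), '{"name":"lodash"}');

const linkPath = linkFromStore(storeDir, projectDir, 'lodash');
const linkedManifest = JSON.parse(
  fs.readFileSync(path.join(linkPath, 'package.json'), 'utf8')
);
console.log(linkedManifest.name, fs.lstatSync(linkPath).isSymbolicLink());
```

The package content exists once in the store, while the project only holds a link to it.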

Getting to know Yarn by building a simple one

Step 1 – Download

JavaScript package managers use package.json as the entry point for describing dependencies.

{
    "dependencies": {
        "lodash": "4.17.20"
    }
}

For the package.json above, we can read the file and download the listed package directly.

import fetch from 'node-fetch';

// Download the tarball of every dependency listed in package.json
function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      const url = `https://registry.yarnpkg.com/${key}/-/${key}-${version}.tgz`;
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`Couldn't fetch package "${key}"`);
      }
      return await response.buffer();
    })
  );
}
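
The URL pattern used in step 1 works for unscoped packages. As a small sketch, the helper below (tarballUrl is a hypothetical name) also covers scoped packages, following the shape of the resolved URL that appears in the yarn.lock excerpt in the Q&A section of this article; other registries may lay out their tarballs differently.

```javascript
// Build the tarball URL for a package on registry.yarnpkg.com.
// For a scoped package such as "@babel/code-frame", the file name part
// uses only the name after the scope.
function tarballUrl(name, version) {
  const baseName = name.startsWith('@') ? name.split('/')[1] : name;
  return `https://registry.yarnpkg.com/${name}/-/${baseName}-${version}.tgz`;
}

console.log(tarballUrl('lodash', '4.17.20'));
console.log(tarballUrl('@babel/code-frame', '7.0.0-beta.55'));
```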

Now let’s look at another situation:

{
    "dependencies": {
        "lodash": "4.17.20",
        "customer-package": "../../customer-package"
    }
}

The entry "customer-package": "../../customer-package" no longer works with our code, so we need to change it:

import fetch from 'node-fetch';
import fs from 'fs-extra';

function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      // A file path is read directly from disk
      if (['/', './', '../'].some(prefix => version.startsWith(prefix))) {
        return await fs.readFile(version);
      }
      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}

Step 2 – Flexible matching rules

So far our code can download dependencies that are pinned to an exact version or given as a file path. A range such as "react": "^15.6.0" is not supported yet. This expression stands for every version from 15.6.0 up to, but not including, 16.0.0, and we should install the latest version of the package within that range. So we add a new method:

import fetch from 'node-fetch';
import semver from 'semver';

async function getPinnedReference(name, version) {
  let reference = version;
  // Only a valid range that is not already an exact version needs resolving
  if (semver.validRange(version) && !semver.valid(version)) {
    // Fetch all published versions of the package
    const response = await fetch(`https://registry.yarnpkg.com/${name}`);
    const info = await response.json();
    const versions = Object.keys(info.versions);
    // Pick the highest version that satisfies the range
    const maxSatisfying = semver.maxSatisfying(versions, version);
    if (maxSatisfying === null)
      throw new Error(
        `Couldn't find a version matching "${version}" for package "${name}"`
      );
    reference = maxSatisfying;
  }
  return { name, reference };
}

function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([name, version]) => {
      // A file path is read directly from disk
      // ... old code
      let realVersion = version;
      // For a version starting with ~ or ^, resolve the latest matching version
      if (version.startsWith('~') || version.startsWith('^')) {
        const { reference } = await getPinnedReference(name, version);
        realVersion = reference;
      }
      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}

This lets users ask for the latest version of a package within a given range.
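
To make the caret rule concrete without pulling in the semver library, here is a deliberately simplified stand-in. It is an assumption-laden sketch: it only understands plain "x.y.z" versions and "^x.y.z" ranges with a major version above 0, and the list of published versions is made up for illustration. Real code should keep using semver.maxSatisfying.

```javascript
// Simplified stand-in for semver.maxSatisfying. Handles only plain
// "x.y.z" versions and "^x.y.z" ranges with x > 0.
function parse(version) {
  return version.split('.').map(Number);
}

function compare(a, b) {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// "^15.6.0" means: same major version, and at least 15.6.0.
function satisfiesCaret(version, range) {
  const min = range.slice(1); // drop the leading "^"
  const sameMajor = parse(version)[0] === parse(min)[0];
  return sameMajor && compare(version, min) >= 0;
}

function maxSatisfyingCaret(versions, range) {
  const matches = versions.filter(v => satisfiesCaret(v, range)).sort(compare);
  return matches.length > 0 ? matches[matches.length - 1] : null;
}

// Illustrative list of published versions.
const published = ['15.5.4', '15.6.0', '15.6.2', '15.7.0', '16.0.0'];
console.log(maxSatisfyingCaret(published, '^15.6.0')); // 15.7.0
```

Here 15.5.4 is too old and 16.0.0 changes the major version, so 15.7.0 is the highest match.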

Step 3 – Dependencies have dependencies too

It's not as simple as we thought. Our dependencies have dependencies of their own, so we need to recurse through every layer to download all of them.

// Get the dependencies of a dependency package
async function getPackageDependencies(packageJson) {
  const packageBuffer = await fetchPackage(packageJson);
  // Read the package.json inside the downloaded archive
  const archiveJson = await readPackageJsonFromArchive(packageBuffer);
  const dependencies = archiveJson.dependencies || {};
  return Object.keys(dependencies).map(name => {
    return { name, version: dependencies[name] };
  });
}

Now we can get all the dependency packages in the dependency tree from the user project package.json.
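
The recursion can be sketched without any network access by replacing the registry with an in-memory table. Everything here is hypothetical: the registry object, the package names and the helper buildDependencyTree are assumptions made for illustration, and versions are already pinned to keep the example short.

```javascript
// Hypothetical in-memory registry: "name@version" -> its dependencies.
const registry = {
  'app@1.0.0': { a: '1.0.0', b: '1.0.0' },
  'a@1.0.0': { c: '1.0.0' },
  'b@1.0.0': { c: '2.0.0' },
  'c@1.0.0': {},
  'c@2.0.0': {},
};

// Recursively expand a package into { name, reference, dependencies }.
// `ancestors` stops the recursion if a package depends on itself indirectly.
function buildDependencyTree(name, reference, ancestors = new Set()) {
  const key = `${name}@${reference}`;
  if (ancestors.has(key)) {
    return { name, reference, dependencies: [] };
  }
  const next = new Set(ancestors).add(key);
  const deps = registry[key] || {};
  return {
    name,
    reference,
    dependencies: Object.entries(deps).map(([depName, depRef]) =>
      buildDependencyTree(depName, depRef, next)
    ),
  };
}

const tree = buildDependencyTree('app', '1.0.0');
console.log(JSON.stringify(tree, null, 2));
```

The resulting tree contains c twice, once per version, which is the duplication that step 5 deals with.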

Step 4 – Transfer files

Downloading the dependencies is not enough. We also need to move all the files to a specified directory, the familiar node_modules.

async function linkPackages({ name, reference, dependencies }, cwd) {
  // Get the entire dependency tree
  const dependencyTree = await getPackageDependencyTree({
    name,
    reference,
    dependencies,
  });
  await Promise.all(
    dependencyTree.map(async dependency => {
      await linkPackages(dependency, `${cwd}/node_modules/${dependency.name}`);
    })
  );
}
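
To see which directories such a nested layout produces, the sketch below only computes the target paths and touches no files. The helper planLinks and the sample tree are hypothetical.

```javascript
const path = require('path');

// Walk a dependency tree and list the directory each package would be
// placed in when every package gets its own nested node_modules folder.
function planLinks({ dependencies = [] }, cwd) {
  const targets = [];
  for (const dependency of dependencies) {
    const dest = path.posix.join(cwd, 'node_modules', dependency.name);
    targets.push(dest);
    targets.push(...planLinks(dependency, dest));
  }
  return targets;
}

// Illustrative tree: app depends on a and b, and a depends on c.
const tree = {
  name: 'app',
  dependencies: [
    { name: 'a', dependencies: [{ name: 'c', dependencies: [] }] },
    { name: 'b', dependencies: [] },
  ],
};

const targets = planLinks(tree, '/app');
console.log(targets);
```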

Step 5 – Optimization

We can now download every dependency in the tree and place it in node_modules, but many of the dependencies turn out to be duplicates. We can put identical dependencies in one place so that they don't need to be downloaded again.

function optimizePackageTree({ name, reference, dependencies = [] }) {
  dependencies = dependencies.map(dependency => {
    return optimizePackageTree(dependency);
  });
  for (let hardDependency of dependencies) {
    // Iterate over a copy because entries are removed inside the loop
    for (let subDependency of hardDependency.dependencies.slice()) {
      // Does the parent level already have a dependency with the same name?
      let availableDependency = dependencies.find(dependency => {
        return dependency.name === subDependency.name;
      });
      if (!availableDependency) {
        // Not present at the parent level yet: hoist it there
        dependencies.push(subDependency);
      }
      if (
        !availableDependency ||
        availableDependency.reference === subDependency.reference
      ) {
        // Remove the now redundant copy from the child
        hardDependency.dependencies.splice(
          hardDependency.dependencies.findIndex(dependency => {
            return dependency.name === subDependency.name;
          }),
          1
        );
      }
    }
  }
  return { name, reference, dependencies };
}

By recursing level by level, we hoist dependencies from deeper layers to shallower ones and so reduce the number of duplicate installations. At this point we have implemented a simple Yarn.
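
A worked example helps show what hoisting does when two versions of the same package clash. The function below is a simplified variant written for this illustration (hoist is a hypothetical name, and the tree is made up); it builds new arrays instead of splicing while iterating, but follows the same idea.

```javascript
// Hoist sub-dependencies one level up whenever that does not clash with a
// different version of the same package already present at that level.
function hoist({ name, reference, dependencies = [] }) {
  // Flatten the children first so their own sub-trees are already hoisted.
  const direct = dependencies.map(hoist);
  const result = [...direct];
  for (const child of direct) {
    const kept = [];
    for (const sub of child.dependencies) {
      const existing = result.find(dep => dep.name === sub.name);
      if (!existing) {
        result.push(sub); // nothing with this name at this level: move it up
      } else if (existing.reference !== sub.reference) {
        kept.push(sub); // version clash: leave it nested under the child
      }
      // Same name and version: already available one level up, drop the copy.
    }
    child.dependencies = kept;
  }
  return { name, reference, dependencies: result };
}

const leaf = (name, reference) => ({ name, reference, dependencies: [] });
const flat = hoist({
  name: 'app',
  reference: '1.0.0',
  dependencies: [
    { name: 'a', reference: '1.0.0', dependencies: [leaf('c', '1.0.0')] },
    { name: 'b', reference: '1.0.0', dependencies: [leaf('c', '2.0.0')] },
    { name: 'd', reference: '1.0.0', dependencies: [leaf('c', '1.0.0')] },
  ],
});

console.log(flat.dependencies.map(dep => `${dep.name}@${dep.reference}`));
// [ 'a@1.0.0', 'b@1.0.0', 'd@1.0.0', 'c@1.0.0' ]
```

In this example c@1.0.0 is installed once at the top level, while b keeps its own nested c@2.0.0 because the top-level slot for c is already taken by a different version.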

Yarn Architecture

Reading the source code, the most striking impression is how thoroughly Yarn applies object-oriented design.

  • Config: Yarn configuration instance
  • cli: the set of all Yarn command instances
  • registries: instances holding npm registry information
    • Covers the lockfile, resolving the entry file name of a dependency package, where dependency packages are stored and under which file names, and so on
  • lockfile: the yarn.lock object
  • Integrity checker: checks whether dependency packages have been downloaded correctly
  • Package resolver: resolves the different ways dependencies are referenced in package.json
    • Package request: instance that requests a dependency package version
    • Package reference: instance describing a dependency package reference
  • Package fetcher: instance that downloads dependency packages
  • Package linker: manages dependency package files
  • Package hoister: instance that flattens dependency packages

How Yarn works

Process overview

Here we use yarn add lodash as an example to see what Yarn does internally. A Yarn installation consists of the following five steps:

  • Checking: checks the configuration (.yarnrc, command-line arguments, package.json) and compatibility (CPU, Node.js version, operating system, and so on)
  • ResolveStep: resolves dependency package information and the exact version of every package in the dependency tree
  • FetchStep: downloads all dependency packages. A package that already exists in the cache is skipped; otherwise it is downloaded into the cache folder. When this step completes, every dependency package is in the cache
  • LinkStep: copies a flattened version of the cached dependency packages into the project's dependency directory
  • BuildStep: compiles the binary packages that need compiling

The process in detail

Let's continue with yarn add lodash as the example.

Initialization

Finding the .yarnrc file

// Read the configuration from the yarnrc files
// process.cwd(): project directory the command is run in
// process.argv: the yarn command and its arguments
const rc = getRcConfigForCwd(process.cwd(), process.argv.slice(2));

/**
 * Generate every path where an rc file may exist
 * @param {*} name rc source name
 * @param {*} cwd current project path
 */
function getRcPaths(name: string, cwd: string): Array<string> {
  // ... other code
  if (!isWin) {
    // Outside Windows, start the search from /etc/yarn/config
    pushConfigPath(etc, name, 'config');
    // Outside Windows, also look at /etc/yarnrc
    pushConfigPath(etc, `${name}rc`);
  }
  // A user home directory exists
  if (home) {
    // Yarn's default configuration path
    pushConfigPath(CONFIG_DIRECTORY);
    // <home>/.config/${name}/config
    pushConfigPath(home, '.config', name, 'config');
    // <home>/.config/${name}
    pushConfigPath(home, '.config', name);
    // <home>/.${name}/config
    pushConfigPath(home, `.${name}`, 'config');
    // <home>/.${name}rc
    pushConfigPath(home, `.${name}rc`);
  }
  // Walk up from the current project path, adding .${name}rc at every level
  // Tip: rc files written by users have the highest priority
  while (true) {
    // Insert <current project path>/.${name}rc at the front
    unshiftConfigPath(cwd, `.${name}rc`);
    // Get the parent path of the current project
    const upperCwd = path.dirname(cwd);
    if (upperCwd === cwd) {
      // we've reached the root
      break;
    } else {
      cwd = upperCwd;
    }
  }
  // ... read rc code
}
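
The upward walk at the end of getRcPaths can be tried in isolation. This sketch reproduces only that loop; projectRcPaths is a hypothetical name, and path.posix is used so the output looks the same on every platform.

```javascript
const path = require('path');

// Collect candidate rc file paths for `cwd` and all of its parents.
// Each level is unshifted, as in the loop above, so the entry for the
// starting directory ends up last in the list.
function projectRcPaths(cwd, name) {
  const paths = [];
  let current = cwd;
  while (true) {
    paths.unshift(path.posix.join(current, `.${name}rc`));
    const upper = path.posix.dirname(current);
    if (upper === current) break; // reached the root
    current = upper;
  }
  return paths;
}

console.log(projectRcPaths('/home/me/app', 'yarn'));
// [ '/.yarnrc', '/home/.yarnrc', '/home/me/.yarnrc', '/home/me/app/.yarnrc' ]
```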

Parsing the command entered by the user

/** Index of the "--" separator */
const doubleDashIndex = process.argv.findIndex(element => element === '--');
/** The first two arguments are the node path and the yarn file path */
const startArgs = process.argv.slice(0, 2);
/**
 * yarn subcommand and its arguments
 * If "--" exists, take the part before it; otherwise take everything
 */
const args = process.argv.slice(2, doubleDashIndex === -1 ? process.argv.length : doubleDashIndex);
/** Arguments passed through to the yarn subcommand */
const endArgs = doubleDashIndex === -1 ? [] : process.argv.slice(doubleDashIndex);
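
The same splitting logic can be wrapped in a function and run against a sample argv to see what each part holds. The wrapper name splitArgs and the sample command line are made up for this illustration.

```javascript
// Split an argv array the same way as the snippet above.
function splitArgs(argv) {
  const doubleDashIndex = argv.indexOf('--');
  const end = doubleDashIndex === -1 ? argv.length : doubleDashIndex;
  return {
    startArgs: argv.slice(0, 2),
    args: argv.slice(2, end),
    endArgs: doubleDashIndex === -1 ? [] : argv.slice(doubleDashIndex),
  };
}

const parsed = splitArgs(['node', 'yarn', 'run', 'test', '--', '--watch']);
console.log(parsed);
// { startArgs: [ 'node', 'yarn' ], args: [ 'run', 'test' ], endArgs: [ '--', '--watch' ] }
```

Note that endArgs still begins with the "--" separator itself, because the slice starts at its index.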

Initializing shared instances

During initialization, the config instance and the reporter (logging) instance are initialized.

  • While initializing, config walks up the parent directories one by one, checking whether a package.json configures the workspaces field
    • Tip: if the current project is a workspace project, the yarn.lock in the workspace root directory takes precedence
this.workspaceRootFolder = await this.findWorkspaceRoot(this.cwd);
// Directory containing yarn.lock: the workspace root takes priority
this.lockfileFolder = this.workspaceRootFolder || this.cwd;

/**
 * Find the workspace root directory
 */
async findWorkspaceRoot(initial: string): Promise<?string> {
    let previous = null;
    let current = path.normalize(initial);
    if (!await fs.exists(current)) {
      // Throw if the starting directory does not exist
      throw new MessageError(this.reporter.lang('folderMissing', current));
    }
    // Walk up the parent directories, checking package.json / yarn.json at each level
    // Return the first directory whose manifest configures workspaces
    do {
      // Read package.json / yarn.json
      const manifest = await this.findManifest(current, true);
      // Extract the workspaces configuration
      const ws = extractWorkspaces(manifest);
      if (ws && ws.packages) {
        const relativePath = path.relative(current, initial);
        if (relativePath === '' || micromatch([relativePath], ws.packages).length > 0) {
          return current;
        } else {
          return null;
        }
      }
      previous = current;
      current = path.dirname(current);
    } while (current !== previous);
    return null;
}

Executing the add command

  • Read the yarn.lock file from the lockfile directory determined in the previous step.
  • Run the matching scripts from package.json according to the lifecycle order.
/**
 * Run the lifecycle scripts configured in package.json, in order
 */
export async function wrapLifecycle(config: Config, flags: Object, factory: () => Promise<void>): Promise<void> {
  // Run preinstall
  await config.executeLifecycleScript('preinstall');
  // Actually perform the installation
  await factory();
  // Run install
  await config.executeLifecycleScript('install');
  // Run postinstall
  await config.executeLifecycleScript('postinstall');
  if (!config.production) {
    // Non-production environment
    if (!config.disablePrepublish) {
      // Run prepublish
      await config.executeLifecycleScript('prepublish');
    }
    // Run prepare
    await config.executeLifecycleScript('prepare');
  }
}
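
The resulting order of events can be written down as a small pure function. This is a sketch derived only from the control flow of wrapLifecycle above; the function name lifecycleOrder and the '(install packages)' marker standing in for factory() are hypothetical.

```javascript
// List the lifecycle scripts in the order wrapLifecycle would run them.
// '(install packages)' stands for the actual installation, i.e. factory().
function lifecycleOrder({ production = false, disablePrepublish = false } = {}) {
  const order = ['preinstall', '(install packages)', 'install', 'postinstall'];
  if (!production) {
    if (!disablePrepublish) order.push('prepublish');
    order.push('prepare');
  }
  return order;
}

console.log(lifecycleOrder());
console.log(lifecycleOrder({ production: true }));
```

In production mode the list stops after postinstall, since prepublish and prepare are only run outside production.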

Obtaining project dependencies

  • First, collect the name and version of every dependency listed under dependencies, devDependencies and optionalDependencies in the package.json of the current directory
    • If the current project is a workspace project, read the package.json in the project root directory instead
      • Because it is a workspace project, the dependencies in the package.json of every subproject in the workspace must also be read
// Collect all dependencies declared in the current project directory
pushDeps('dependencies', projectManifestJson, {hint: null, optional: false}, true);
pushDeps('devDependencies', projectManifestJson, {hint: 'dev', optional: false}, !this.config.production);
pushDeps('optionalDependencies', projectManifestJson, {hint: 'optional', optional: true}, true);
// The current project is a workspace project
if (this.config.workspaceRootFolder) {
  // Collect the package.json of every subproject in the workspace
  const workspaces = await this.config.resolveWorkspaces(workspacesRoot, workspaceManifestJson);
  for (const workspaceName of Object.keys(workspaces)) {
    // package.json of the subproject
    const workspaceManifest = workspaces[workspaceName].manifest;
    // Add the subproject itself to the root project's dependencies
    workspaceDependencies[workspaceName] = workspaceManifest.version;
    // Collect the subproject's dependencies
    if (this.flags.includeWorkspaceDeps) {
      pushDeps('dependencies', workspaceManifest, {hint: null, optional: false}, true);
      pushDeps('devDependencies', workspaceManifest, {hint: 'dev', optional: false}, !this.config.production);
      pushDeps('optionalDependencies', workspaceManifest, {hint: 'optional', optional: true}, true);
    }
  }
}

ResolveStep Obtains dependency packages

  1. Iterate over the first-level dependencies and call the package resolver's find method to get the version information of each dependency package, then call find recursively to look up the version information of each dependency's own dependencies. During resolution, a Set (fetchingPatterns) records the packages that have been resolved or are being resolved.
  2. When resolving a package, first use its name and range (version range) to determine whether it has already been resolved, by checking whether it exists in the Set maintained above.
  3. For a package that has not been resolved, first try to get its exact version from the lockfile. If the lockfile has an entry for the package, take it and mark the package as resolved. If not, ask the registry for the highest known version that satisfies the range, then mark the package as resolved.
  4. A package that has already been resolved is placed in a delayed queue, delayedResolveQueue, and left alone for now.
  5. Once every package in the dependency tree has been visited recursively, iterate over delayedResolveQueue and pick the most suitable available version from the package information already resolved.

After that, we have determined the exact version of all packages in the dependency tree, along with details such as the package address.
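
The "lockfile first, registry second" rule and the Set that prevents repeated work can be sketched with plain objects. This is a heavily simplified illustration, not Yarn's resolver: the lockfile and registry contents are invented, resolvePattern is a hypothetical helper, and the delayed queue from steps 4 and 5 is left out.

```javascript
// Hypothetical data: a parsed lockfile and what a registry would answer.
const lockfile = { 'lodash@^4.17.0': '4.17.20' };
const registryLatest = { 'left-pad@^1.0.0': '1.3.0' };

const fetchingPatterns = new Set(); // patterns already handled
const resolved = new Map(); // pattern -> pinned version
let registryLookups = 0;

function resolvePattern(pattern) {
  if (fetchingPatterns.has(pattern)) {
    return resolved.get(pattern); // seen before: reuse the earlier result
  }
  fetchingPatterns.add(pattern);
  // Prefer the lockfile and only fall back to the registry when it has no entry.
  let version = lockfile[pattern];
  if (!version) {
    registryLookups += 1;
    version = registryLatest[pattern];
  }
  if (!version) {
    throw new Error(`Couldn't find a version matching "${pattern}"`);
  }
  resolved.set(pattern, version);
  return version;
}

console.log(resolvePattern('lodash@^4.17.0')); // taken from the lockfile
console.log(resolvePattern('left-pad@^1.0.0')); // taken from the registry table
console.log(resolvePattern('left-pad@^1.0.0')); // reused, no second lookup
console.log(registryLookups); // 1
```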

  • Get the latest version number of every first-level project dependency (by calling the package resolver's init method)
/**
 * Find the version of a dependency package
 */
async find(initialReq: DependencyRequestPattern): Promise<void> {
    // Read from the cache first
    const req = this.resolveToResolution(initialReq);
    if (!req) {
      return;
    }
    // Request instance for the dependency package
    const request = new PackageRequest(req, this);
    const fetchKey = `${req.registry}:${req.pattern}:${String(req.optional)}`;
    // Has the same dependency package been requested before?
    const initialFetch = !this.fetchingPatterns.has(fetchKey);
    // Flag: does yarn.lock need updating?
    let fresh = false;
    if (initialFetch) {
      // Record the key on the first request
      this.fetchingPatterns.add(fetchKey);
      // Look up the dependency name + version in the lockfile
      const lockfileEntry = this.lockfile.getLocked(req.pattern);
      if (lockfileEntry) {
        // The lockfile has an entry: extract the requested version range
        // e.g. concat-stream@^1.5.0 => {name: 'concat-stream', range: '^1.5.0', hasVersion: true}
        const {range, hasVersion} = normalizePattern(req.pattern);
        if (this.isLockfileEntryOutdated(lockfileEntry.version, range, hasVersion)) {
          // The version in yarn.lock is out of date
          this.reporter.warn(this.reporter.lang('incorrectLockfileEntry', req.pattern));
          // Remove the dependency version that has been collected
          this.removePattern(req.pattern);
          // Remove the package's entry from yarn.lock (it is obsolete and invalid)
          this.lockfile.removePattern(req.pattern);
          fresh = true;
        }
      } else {
        fresh = true;
      }
      request.init();
    }
    await request.find({fresh, frozen: this.frozen});
}
  • Recursively look up the dependencies of the requested package
for (const depName in info.dependencies) {
  const depPattern = depName + '@' + info.dependencies[depName];
  deps.push(depPattern);
  promises.push(
    this.resolver.find(/* ... */),
  );
}
for (const depName in info.optionalDependencies) {
  const depPattern = depName + '@' + info.optionalDependencies[depName];
  deps.push(depPattern);
  promises.push(
    this.resolver.find(/* ... */),
  );
}
if (remote.type === 'workspace' && !this.config.production) {
  // workspaces support dev dependencies
  for (const depName in info.devDependencies) {
    const depPattern = depName + '@' + info.devDependencies[depName];
    deps.push(depPattern);
    promises.push(
      this.resolver.find(/* ... */),
    );
  }
}

FetchStep Downloads dependency packages

This is mainly about downloading dependencies that are not in the cache.

  1. Dependencies already in the cache do not need to be downloaded again, so the first step filters them out. A path is built from cacheFolder + slug + node_modules + pkg.name; if that path exists, the package is cached and is removed from the download list.
  2. Maintain a queue of fetch tasks and download each dependency from the address resolved in ResolveStep.
  3. When a package is downloaded, its cache directory is first created under the cache folder, and then the package's reference address is resolved.
  4. A reference can take several forms, for example an npm registry, a GitHub source, a GitLab source or a file path, so Yarn calls the fetcher that matches the reference to obtain the package.
  5. The fetched package is streamed into the cache directory through fs.createWriteStream. The cache holds the compressed .tgz file, which is then extracted into the current directory.
  6. After the download has been extracted, the lockfile is updated.
/**
 * Build the cache path of a dependency:
 * cache folder + npm source-package name-version-integrity + node_modules + package name
 */
const dest = config.generateModuleCachePath(ref);

export async function fetchOneRemote(
  remote: PackageRemote,
  name: string,
  version: string,
  dest: string,
  config: Config,
): Promise<FetchedMetadata> {
  if (remote.type === 'link') {
    const mockPkg: Manifest = {_uid: '', name: '', version: '0.0.0'};
    return Promise.resolve({resolved: null, hash: '', dest, package: mockPkg, cached: false});
  }
  const Fetcher = fetchers[remote.type];
  if (!Fetcher) {
    throw new MessageError(config.reporter.lang('unknownFetcherFor', remote.type));
  }
  const fetcher = new Fetcher(dest, remote, config);
  // Reuse the cached copy if the destination already holds a valid module
  if (await config.isValidModuleDest(dest)) {
    return fetchCache(dest, fetcher, config, remote);
  }
  // Otherwise remove whatever is at the destination path
  await fs.unlink(dest);
  try {
    return await fetcher.fetch({
      name,
      version,
    });
  } catch (err) {
    try {
      await fs.unlink(dest);
    } catch (err2) {
      // what do?
    }
    throw err;
  }
}
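
The cache filter from step 1 can be sketched as follows. The path layout is an assumption pieced together from the description above (cache folder, a slug made of source, name, version and integrity, then node_modules and the package name); the exact slug Yarn generates may differ, and moduleCachePath, filterUncached and the hash values are hypothetical.

```javascript
const path = require('path');

// Assumed layout:
// <cacheFolder>/<registry>-<name>-<version>-<hash>/node_modules/<name>
function moduleCachePath(cacheFolder, { registry, name, version, hash }) {
  const slug = [registry, name, version, hash].join('-');
  return path.posix.join(cacheFolder, slug, 'node_modules', name);
}

// Keep only the packages whose cache directory does not exist yet.
// `exists` is injected so the sketch can run without touching the disk.
function filterUncached(packages, cacheFolder, exists) {
  return packages.filter(pkg => !exists(moduleCachePath(cacheFolder, pkg)));
}

// Placeholder package data and a pretend file system.
const packages = [
  { registry: 'npm', name: 'lodash', version: '4.17.20', hash: 'abc' },
  { registry: 'npm', name: 'react', version: '15.7.0', hash: 'def' },
];
const onDisk = new Set(['/cache/npm-lodash-4.17.20-abc/node_modules/lodash']);

const toDownload = filterUncached(packages, '/cache', p => onDisk.has(p));
console.log(toDownload.map(pkg => pkg.name)); // [ 'react' ]
```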

LinkStep Moves a file

After the fetchStep, we have all the dependencies in the local cache. The next step is how to copy them to node_modules in our project.

  1. Before copying, the peerDependencies of each package are resolved. If a matching peer dependency cannot be found, a warning is shown.
  2. The dependency tree is then flattened to produce the destination directory (dest) each package is copied to.
  3. The flattened destinations are sorted (using localeCompare, that is, locale-aware ordering).
  4. For each entry of the flat tree, a copy task is created from dest (the destination directory to copy to) and src (the package's cache directory), copying the package from src to dest.

Yarn's flattening is actually quite simple and blunt: it first sorts the packages by name in Unicode order, then flattens the dependency tree level by level.
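
Steps 3 and 4 amount to sorting the flattened entries and turning each one into a copy task. A minimal sketch, assuming each flat-tree entry is an object with src and dest fields; planCopyTasks and the sample paths are hypothetical, and no files are copied here.

```javascript
// Sort flat-tree entries by destination and describe one copy task each.
function planCopyTasks(flatTree) {
  return [...flatTree]
    .sort((a, b) => a.dest.localeCompare(b.dest))
    .map(({ src, dest }) => ({ from: src, to: dest }));
}

// Illustrative entries: cache directory (src) and project directory (dest).
const flatTree = [
  { src: '/cache/mime', dest: 'node_modules/mime' },
  { src: '/cache/lodash', dest: 'node_modules/lodash' },
  { src: '/cache/accepts', dest: 'node_modules/accepts' },
];

const tasks = planCopyTasks(flatTree);
console.log(tasks.map(task => task.to));
// [ 'node_modules/accepts', 'node_modules/lodash', 'node_modules/mime' ]
```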

Q&A

1. How do I increase the number of concurrent network requests?

Use the --network-concurrency option to raise the number of concurrent network requests.

2. What if network requests keep timing out?

Use the --network-timeout option to set the timeout of network requests.

3. Why does changing the version number of a dependency package in yarn.lock still not take effect?

"@babel/code-frame@^7.0.0-beta.35":
  version "7.0.0-beta.55"
  resolved "https://registry.yarnpkg.com/@babel/code-frame/-/code-frame-7.0.0-beta.55.tgz#71f530e7b010af5eb7a7df7752f78921dd57e9ee"
  integrity sha1-cfUw57AQr163p993UveJId1X6e4=
  dependencies:
    "@babel/highlight" "7.0.0-beta.55"

The above is an excerpt picked at random from a yarn.lock file. Changing only the version and resolved fields is not enough, because Yarn also compares the integrity computed from the content actually downloaded with the integrity field in yarn.lock. If the two differ, the downloaded package is treated as the wrong one.
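
In the excerpt, the hex string after "#" in the resolved URL and the integrity field are two encodings of the same SHA-1 digest, which Node's crypto module can show. This is a minimal sketch assuming SHA-1 as in the excerpt; the helper names are hypothetical and this is not Yarn's own verification code.

```javascript
const crypto = require('crypto');

// Compute an integrity string of the form "sha1-<base64 digest>".
function computeIntegrity(buffer) {
  return 'sha1-' + crypto.createHash('sha1').update(buffer).digest('base64');
}

// Convert the hex hash after "#" in a resolved URL to the same form.
function integrityFromHex(hex) {
  return 'sha1-' + Buffer.from(hex, 'hex').toString('base64');
}

// A download is accepted only if its computed integrity matches the lockfile.
function verifyIntegrity(buffer, expected) {
  return computeIntegrity(buffer) === expected;
}

console.log(integrityFromHex('71f530e7b010af5eb7a7df7752f78921dd57e9ee'));
// sha1-cfUw57AQr163p993UveJId1X6e4=

const original = Buffer.from('package contents');
const recorded = computeIntegrity(original);
console.log(verifyIntegrity(original, recorded)); // true
console.log(verifyIntegrity(Buffer.from('tampered contents'), recorded)); // false
```

Editing version and resolved by hand leaves the recorded digest unchanged, so the check fails for the tarball of any other version.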

4. When different versions of the same dependency appear in a project's dependencies, how do I know which one I am actually using?

First we need to look at how the dependencies are referenced. The scenario:

Given these dependencies and the way Yarn installs them, the actual installation structure is as follows:

|- [email protected]
|- [email protected]
|- [email protected]
|----- [email protected]
|- [email protected]
|- [email protected]
  • When developers reference D directly in their code, what they actually get is [email protected]
  • B does not declare C as a dependency, but its code references C's objects and methods directly (because B references D directly and D certainly depends on C, C is sure to be present). What actually gets referenced is not [email protected] but [email protected].
    • Because webpack resolves dependencies by looking through node_modules for a dependency that matches, it ends up referencing [email protected] directly

We can use yarn list to check whether this problem exists.

References

  • The official Yarn website
  • The author's fork of the Yarn source code with added Chinese comments
  • An analysis of Yarn's dependency installation process from the perspective of the source code
