• Yarn: A New Package Manager for JavaScript
  • By Sebastian McKenzie, Christoph Pojer, and James Kyle
  • The Nuggets of Gold translation Project
  • Translator: Da Zai
  • Proofreader: Square root of three

In the JavaScript community, engineers share thousands of pieces of code with one another, which saves us from rewriting basic components, libraries, and frameworks. Each package may in turn depend on other packages, and these dependencies are managed by a package manager. The most popular JavaScript package manager today is the npm client, which provides access to up to 300,000 packages in the npm registry. More than 5 million engineers use the npm registry, which serves 5 billion downloads per month.

At Facebook, we’ve used the npm client successfully for years, but as our codebase and team grew, we ran into problems with consistency, security, and performance. After trying to address each of these problems individually, we decided to build a new client to help us manage dependencies more reliably. We call it Yarn: a faster, more reliable, and more secure alternative to the npm client.

We are pleased to announce that we have partnered with Exponent, Google, and Tilde to open source Yarn. Yarn still uses the npm registry, but it installs packages and manages dependencies more quickly, and it keeps installs consistent across machines and in secure environments without network access. Yarn improves development efficiency and solves some of the problems associated with sharing code, allowing engineers to focus on building new products and features.

The evolution of JavaScript package management at Facebook

Before package managers existed, JavaScript engineers typically relied on only a few dependencies, which they stored directly in the project directory or served from a CDN. Shortly after Node.js came along, npm, the first mainstream JavaScript package manager, was introduced and quickly became one of the most popular package managers. Since then, new open source projects have sprung up and engineers are more willing to share code than ever before.

At Facebook, many of our projects, such as React, depend on code in the npm registry. But as we grew, we ran into the following problems: inconsistent results when installing dependencies across machines and users, installs that took too long, and security concerns because the npm client automatically executes code from some dependencies. We tried to build solutions around these problems, but the solutions often created new problems of their own.

Attempts to adapt the npm client

At first, we followed best practice: we tracked only package.json in the code repository and asked engineers to run npm install manually to install dependencies. This works well on a developer’s machine, but it breaks down in our continuous integration environments, which are sandboxed and cut off from the network for security and reliability reasons and therefore cannot download dependencies.

Next, we tried checking the entire node_modules directory into the code repository. While this worked, it made some simple operations complicated. For example, a minor version update of Babel produced a commit of up to 800,000 lines and tripped lint rules for invalid UTF-8 byte sequences, Windows line endings, and PNG images that had not been compressed. Merging changes to node_modules could often cost an engineer an entire day. Our source-control team also pointed out that checking in node_modules introduced too much metadata. For example, the React Native package.json file currently lists only 68 dependencies, but after running npm install, the node_modules directory contains 121,358 files.

Finally, to cope with the growing number of engineers at Facebook and the amount of code they need to install, we tried to adapt our use of the npm client. We decided to compress the entire node_modules directory and upload it to an internal CDN, from which both our engineers and the continuous integration systems could download and extract it to get consistent files. This let us remove tens of thousands of files from source control, but the downside was that engineers now needed network access not only to pull code but also to build it.

We also had to work around npm’s shrinkwrap feature, which we used to lock dependency versions. Shrinkwrap files are not generated by default, and if developers forget to regenerate them, they fall out of sync, so we wrote a tool to verify that the contents of the shrinkwrap file match the contents of the node_modules directory. These files are large blocks of JSON with unordered keys, so each change typically rewrote much of the file and made code review difficult. To mitigate this, we needed an extra script to sort all the entries.

Finally, when we upgraded a single dependency with npm, it usually updated other, unrelated dependencies as well, according to semantic versioning rules. This made every change larger than expected, and steps such as checking in node_modules or uploading it to a CDN made the process less than ideal for engineers.

Building a new client

Rather than continue building infrastructure around the npm client, we decided to step back and look at the problem as a whole. Sebastian McKenzie of the London office asked, “What if we built a new client to replace the npm client and solve our core problems?” The idea caught on quickly, and the team was excited about it.

During development, we talked to engineers across the industry and found that they faced similar problems and had tried many similar solutions, usually tackling one problem at a time. It became clear that by bringing together the problems the entire JavaScript community was facing, we could develop a solution for everyone. With the help of engineers at Exponent, Google, and Tilde, we built the Yarn client and tested it against every major JS framework and against usage scenarios outside of Facebook. Today, we are proud to open source this tool to the community.

Introducing Yarn

Yarn is a new package manager that replaces the existing npm client and other package managers compatible with the npm registry. Yarn keeps the features of existing workflows while being faster, more secure, and more reliable.

The primary function of any package manager is to install packages, pieces of code that each serve a particular purpose, from a global registry into the engineer’s local environment. Each package may or may not depend on other packages. The dependency tree of a typical project often contains dozens, hundreds, or even thousands of packages.

These dependencies are versioned and installed according to semantic versioning (semver). A semver version number reflects the type of change made in each new release: an incompatible API change (MAJOR), a backward-compatible new feature (MINOR), or a backward-compatible bug fix (PATCH). However, semver relies on package developers not making mistakes: if dependencies are not locked, an update can introduce breaking changes or new bugs.
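The MAJOR/MINOR/PATCH rules above can be sketched in a few lines of JavaScript. This is a minimal illustration, not Yarn’s or npm’s implementation: parseVersion, changeType, and satisfiesCaret are hypothetical helpers, and the sketch only handles plain X.Y.Z versions and caret ranges with a major version of 1 or higher.

```javascript
// Parse a plain "MAJOR.MINOR.PATCH" string (no pre-release or build tags).
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a plain semver version: ${version}`);
  const [major, minor, patch] = match.slice(1).map(Number);
  return { major, minor, patch };
}

// Classify the kind of change between two releases.
function changeType(from, to) {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (a.major !== b.major) return "MAJOR";
  if (a.minor !== b.minor) return "MINOR";
  if (a.patch !== b.patch) return "PATCH";
  return "NONE";
}

// A caret range "^X.Y.Z" (X >= 1) accepts any version >= X.Y.Z
// that keeps the same MAJOR number.
function satisfiesCaret(range, version) {
  const base = parseVersion(range.replace(/^\^/, ""));
  const v = parseVersion(version);
  if (base.major === 0) {
    throw new Error("0.x caret ranges are not handled in this sketch");
  }
  if (v.major !== base.major) return false;
  if (v.minor !== base.minor) return v.minor > base.minor;
  return v.patch >= base.patch;
}
```

With an unlocked range such as ^1.2.3, two installs run at different times may legitimately pick different versions, which is why a mistake in a MINOR or PATCH release can reach users who never changed their own package.json.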

Architecture

In the Node ecosystem, dependencies are typically installed in the node_modules folder of your project. However, the structure of this directory may differ from the actual dependency tree because duplicate dependencies can be merged together. The npm client installs dependencies into the node_modules directory non-deterministically. This means that the structure of the node_modules directory can change depending on the order in which dependencies are installed. This discrepancy can lead to “it works on my machine but not on others” situations, which often take a lot of time to track down and resolve.
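To see how install order can change the directory layout, consider the deliberately simplified model below. It is an assumption made for illustration, not npm’s actual algorithm: installTree is a hypothetical function in which the first version of a package to arrive takes the top-level slot, and a later conflicting version is nested under the package that requested it. The package names are placeholders.

```javascript
// Simplified hoisting model: first version seen wins the top-level slot,
// later conflicting versions are nested under their dependent.
function installTree(requests) {
  const tree = {};
  for (const { dependent, name, version } of requests) {
    if (!(name in tree)) {
      tree[name] = version;
    } else if (tree[name] !== version) {
      tree[`${dependent}/node_modules/${name}`] = version;
    }
  }
  return tree;
}

// Two packages depend on different versions of the same package.
const wantsV1 = { dependent: "alpha", name: "gamma", version: "1.0.0" };
const wantsV2 = { dependent: "beta", name: "gamma", version: "2.0.0" };

// Same requests, different order, different layout.
const orderA = installTree([wantsV1, wantsV2]);
const orderB = installTree([wantsV2, wantsV1]);
```

In this model, orderA places gamma 1.0.0 at the top level while orderB places gamma 2.0.0 there, so code that reaches the top-level copy behaves differently on the two machines.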

Yarn solves these versioning and non-determinism problems with lockfiles and a deterministic, reliable install algorithm. The lockfile pins each installed package to a specific version and ensures that the node_modules directory is laid out the same way on every machine. The lockfile also uses a concise format with ordered keys, which keeps changes minimal and makes code review easier.
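The benefit of ordered keys can be shown with a small sketch. The line format here is invented for illustration and is not the real yarn.lock format; serializeLock and the package names and versions are hypothetical.

```javascript
// Serialize resolved dependencies with sorted keys so that the output
// does not depend on the order in which entries were inserted.
function serializeLock(resolved) {
  const lines = Object.keys(resolved)
    .sort()
    .map((pattern) => `${pattern} -> ${resolved[pattern]}`);
  return lines.join("\n") + "\n";
}

// The same resolution recorded in two different insertion orders.
const installA = { "beta@^2.1.0": "2.1.3", "alpha@^1.0.0": "1.2.0" };
const installB = { "alpha@^1.0.0": "1.2.0", "beta@^2.1.0": "2.1.3" };

const lockA = serializeLock(installA);
const lockB = serializeLock(installB);
```

Because both serializations are byte-for-byte identical, a later change to one dependency shows up in review as a change to one line rather than as a reshuffled file.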

The installation process consists of the following three steps:

  1. Resolution: Yarn resolves dependencies by sending requests to the registry and recursively looking up each dependency.
  2. Fetching: Next, Yarn checks a global cache directory to see whether the required package has already been downloaded. If it has not, Yarn fetches the package tarball and stores it in the global cache, so that it can install offline and never needs to download the same package more than once. Dependencies can also be placed in source control as tarballs to support fully offline installs.
  3. Linking: Finally, Yarn copies all the files it needs from the global cache into the local node_modules directory.
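The three steps above can be sketched as separate functions over in-memory data. This is a toy model under stated assumptions, not Yarn’s code: the registry object, the package names, and the functions resolve, fetchAll, and link are all hypothetical, and a string stands in for each downloaded tarball.

```javascript
// Hypothetical in-memory registry: package name -> version and dependencies.
const registry = {
  app: { version: "1.0.0", dependencies: ["alpha", "beta"] },
  alpha: { version: "1.2.0", dependencies: ["gamma"] },
  beta: { version: "2.1.3", dependencies: ["gamma"] },
  gamma: { version: "0.4.1", dependencies: [] },
};

// Step 1: resolve dependencies recursively, visiting each package once.
function resolve(names, resolved = new Map()) {
  for (const name of names) {
    if (resolved.has(name)) continue;
    const entry = registry[name];
    if (!entry) throw new Error(`Unknown package: ${name}`);
    resolved.set(name, entry.version);
    resolve(entry.dependencies, resolved);
  }
  return resolved;
}

// Step 2: fetch only what is missing from the cache; return the download count.
function fetchAll(resolved, cache) {
  let downloads = 0;
  for (const [name, version] of resolved) {
    const key = `${name}@${version}`;
    if (!cache.has(key)) {
      cache.set(key, `tarball of ${key}`); // stand-in for a network download
      downloads += 1;
    }
  }
  return downloads;
}

// Step 3: link by copying from the cache into a node_modules stand-in.
function link(resolved, cache) {
  const nodeModules = {};
  for (const [name, version] of resolved) {
    nodeModules[name] = cache.get(`${name}@${version}`);
  }
  return nodeModules;
}

const cache = new Map();
const resolved = resolve(registry.app.dependencies);
const firstRun = fetchAll(resolved, cache);  // downloads every package
const secondRun = fetchAll(resolved, cache); // everything is already cached
const nodeModules = link(resolved, cache);
```

Keeping the steps separate means the second install touches no network at all: fetchAll finds every package in the cache, and link works entirely from local data.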

Cleanly separating these steps and making their results deterministic allows Yarn to run operations in parallel, which maximizes resource utilization and speeds up installs. On some Facebook projects, Yarn reduced install time by an order of magnitude, from several minutes to just seconds. Yarn also uses a mutex to ensure that multiple CLI instances running at the same time do not conflict with each other.

Throughout the process, Yarn imposes strict guarantees on package installation. You can control which lifecycle scripts run for which packages. Package checksums are also stored in the lockfile to ensure that every install yields exactly the same package.
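A checksum check of this kind might look like the sketch below, which uses Node’s built-in crypto module. The choice of SHA-256, the lock entry shape, and the functions checksum and verifyPackage are assumptions made for illustration; they do not describe how Yarn stores or verifies checksums.

```javascript
const crypto = require("crypto");

// Hash the package contents; SHA-256 is an assumption for this sketch.
function checksum(buffer) {
  return crypto.createHash("sha256").update(buffer).digest("hex");
}

// Compare the downloaded contents against the checksum recorded at lock time.
function verifyPackage(lockEntry, tarball) {
  const actual = checksum(tarball);
  if (actual !== lockEntry.checksum) {
    throw new Error(
      `Checksum mismatch for ${lockEntry.name}: ` +
        `expected ${lockEntry.checksum}, got ${actual}`
    );
  }
  return true;
}

// Hypothetical package contents and the lock entry recorded for them.
const tarball = Buffer.from("contents of a package tarball");
const lockEntry = { name: "alpha", checksum: checksum(tarball) };
```

Calling verifyPackage(lockEntry, tarball) succeeds, while passing any other buffer throws, so a package that was altered or corrupted after the lockfile was written is rejected instead of installed.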

Features

In addition to making installs faster and more reliable, Yarn adds features that further simplify the dependency management workflow.

  • Compatible with both the npm and bower workflows, and supports mixing the two registries
  • Can restrict the licenses of installed modules, and provides a way to output license information
  • Exposes a stable public JS API, with logging abstracted so that build tools can consume it
  • Readable, minimal, and beautiful CLI output

Yarn in production

We already use Yarn in production at Facebook, and it has worked very well. Yarn manages the package dependencies of many of our JavaScript projects. With each migration, the build has been able to run offline, which speeds up the workflow. We measured install times for React Native under different conditions to compare the performance of Yarn and npm. For details, see here.

Getting started

The easiest way to get started is:

npm install -g yarnpkg
yarn

The yarn CLI replaces the npm CLI in the development workflow, either as a drop-in replacement or with a new, similar command:

  • npm install → yarn

    No arguments are required. The yarn command reads the package.json file, fetches the packages from the npm registry, and places them in the node_modules directory. This is equivalent to running npm install.

  • npm install --save <name> → yarn add <name>

    We removed the “invisible dependency” behavior of npm install <name> and split it out into a separate command. Running yarn add <name> is equivalent to running npm install --save <name>.

Looking ahead

Many people have already come together to build Yarn and solve our common problems, and we hope that Yarn will become a true community project. Yarn is now open source on GitHub, and we are ready for the Node community to get involved: use Yarn, share ideas, write documentation, support each other, and help build a great community to maintain it over the long term. We believe Yarn is off to a good start, and with your help, its future will be even better.