Preface

Vercel recently acquired Turborepo in order to speed up Next.js builds, and its author has joined Vercel.

Monorepo

What is a monorepo?

In a typical development setup, teams want their projects to stay independent: each project is developed and released on its own, without much coupling to the others. Many projects follow this idea and put each component into its own repository, managed separately. This is the familiar pattern of one repository per project.

However, this model can be inefficient and cumbersome in some scenarios. For example, if the code in one repository is referenced by many related repositories, then every time that repository is released, all the repositories that depend on it must be upgraded and released as well. If all the interdependent code lives in a single repository and is maintained together, then when one library changes, the other code picks up the new version automatically, which simplifies the development process and improves efficiency. Such a multi-package repository is called a monorepo.

Monorepos are very common in front-end development. Babel, React, Vue, and other open-source projects manage their code this way, and Lerna, the open-source multi-package management tool that came out of the Babel project, is also widely used.

Turborepo

Turborepo is a high-performance build tool for JavaScript and TypeScript monorepos. It is non-intrusive, so you can introduce it into a project gradually. It ships with sensible defaults and needs only a small amount of configuration to achieve high-performance builds. Like esbuild, Turborepo is written in Go, which gives it some performance advantages at the language level.

Advantages

  • Incremental builds: caches build output and skips work that has already been computed
  • Content hashing: decides whether a file needs to be rebuilt from a hash of its contents
  • Cloud caching: the build cache produced in CI/CD can be shared with team members for faster builds
  • Parallel execution: runs builds with maximum parallelism so that no CPU core sits idle
  • Task pipelines: you define the relationships between tasks, and Turborepo optimizes what to build and when
  • Convention-based configuration: conventions keep the configuration simple, requiring only a few lines of JSON
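The incremental-build and content-hash points above can be illustrated with a small sketch. This is not Turborepo's implementation: hashInputs, runCached, and the in-memory Map are hypothetical names chosen for illustration, and SHA-256 is an assumed hash function. It only shows that when the cache key is derived from file contents, an unchanged input skips the build.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: hash the contents of a package's input files.
// Paths are sorted so the hash does not depend on object key order.
function hashInputs(files) {
  const hash = createHash("sha256");
  for (const path of Object.keys(files).sort()) {
    hash.update(path).update("\0").update(files[path]).update("\0");
  }
  return hash.digest("hex");
}

// Hypothetical in-memory cache keyed by content hash.
const cache = new Map();

// Run `build` only if these exact file contents have not been built before.
function runCached(files, build) {
  const key = hashInputs(files);
  if (cache.has(key)) {
    return { cached: true, output: cache.get(key) };
  }
  const output = build(files);
  cache.set(key, output);
  return { cached: false, output };
}

// Illustrative "build": concatenate and upper-case the sources.
const build = (files) => Object.values(files).join("\n").toUpperCase();
const sources = { "src/index.js": "export const x = 1;" };

console.log(runCached(sources, build).cached); // false: first build
console.log(runCached(sources, build).cached); // true: same content, build skipped
console.log(runCached({ "src/index.js": "export const x = 2;" }, build).cached); // false: content changed
```

Because the key depends only on content, touching a file without changing it does not invalidate the cache in this sketch.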

Getting started

For a new project, run the following command to generate a new repository:

npx create-turbo@latest

For an existing monorepo project, you can adopt Turborepo with the following steps.

Install Turborepo

Add Turborepo to devDependencies at the root of the project:

npm install turbo -D
or
yarn add turbo --dev

Add the configuration

Add the Turborepo configuration key to package.json:

// package.json
{
    "turbo": {}
}

All Turborepo configuration goes under the turbo key.

Create a task pipeline

Under the turbo key in package.json, add the commands you want to “turbocharge” to pipeline. The pipeline defines the dependencies between the npm scripts of your packages and enables caching for those commands. The dependency and cache settings of each command apply to every package in the monorepo:

{
    "turbo": {
        "pipeline": {
            "build": {
                "dependsOn": ["^build"],        
                "outputs": [".next/**"]            
            },
            "test": {
                "dependsOn": ["^build"],
                "outputs": []                            
            },
            "lint": {
                "outputs": []
            },
            "dev": {
                "cache": false            
            } 
        }    
    }
}

In the example above, the ^ prefix in the dependsOn of build and test means that these tasks must wait until the build tasks of the packages they depend on have completed. For the scripts in each package's package.json, Turborepo by default caches the output in the dist/** and build/** folders unless this is overridden. The outputs array sets which directories are cached; in the example, that is the .next/** folder. Turbo /turbo-

Pipeline

As you can see from the Turbo configuration above, pipelines are a core concept that Turborepo uses to handle tasks and their dependencies.

In a traditional monorepo managed with, for example, Lerna or Yarn workspaces, each npm script (such as build or test) is run either in dependency order or in parallel, one script at a time. Depending on the dependencies between packages, CPU cores may sit idle during execution, wasting computing power and time.

Turborepo provides a declarative way to specify the relationships between tasks. This makes the relationships easier to understand, and the explicit declarations let Turborepo optimize task execution and make full use of multiple CPU cores.

The comparison of Turborepo's execution timeline with Lerna's shows that Turborepo schedules tasks efficiently, whereas Lerna can run only one task at a time.

Configuring the pipeline

Each key in pipeline is a task name that can be run with turbo run, and dependsOn declares the tasks that must complete before that task runs.

The execution process shown in the preceding figure can be configured as follows:

{
    "turbo": {
        "pipeline": {
            "build": {
                "dependsOn": ["^build"]
            },
            "test": {
                "dependsOn": ["build"],
                "outputs": []                            
            },
            "lint": {
                "outputs": []
            },
            "deploy": {
                "dependsOn": ["build", "test", "lint"]           
            } 
        }    
    }
}

The execution order of each task follows from its dependsOn configuration:

  • Since packages A and C depend on package B, build declares a dependsOn of ^build: the build of a package's dependencies runs first, and the package's own build runs only afterwards. As the waterfall above shows, B's build runs first, after which the builds of A and C run in parallel
  • test depends only on the package's own build task, so it runs as soon as that build completes
  • lint has no dependencies and can run at any time
  • deploy runs once build, test, and lint have completed
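The ordering rules in this list can be sketched as a small scheduler. This is a simplified illustration, not how Turborepo is implemented: the packages object, dependenciesOf, and planWaves are hypothetical names, and grouping tasks into "waves" is a simplification (a real scheduler can start each task as soon as its own dependencies finish, without waiting for the rest of a wave).

```javascript
// Hypothetical workspace: packages A and C depend on package B.
const packages = { A: ["B"], B: [], C: ["B"] };

// The dependsOn arrays from the pipeline configuration above.
const pipeline = {
  build: ["^build"],
  test: ["build"],
  lint: [],
  deploy: ["build", "test", "lint"],
};

// Expand one package's task into the "package#task" ids it must wait for.
// "^build" means "build in every package this package depends on".
function dependenciesOf(pkg, task) {
  return pipeline[task].flatMap((dep) =>
    dep.startsWith("^")
      ? packages[pkg].map((upstream) => `${upstream}#${dep.slice(1)}`)
      : [`${pkg}#${dep}`]
  );
}

// Group every package task into waves: all dependencies of a task
// are satisfied by tasks in earlier waves.
function planWaves() {
  const pending = new Map();
  for (const pkg of Object.keys(packages)) {
    for (const task of Object.keys(pipeline)) {
      pending.set(`${pkg}#${task}`, dependenciesOf(pkg, task));
    }
  }
  const done = new Set();
  const waves = [];
  while (pending.size > 0) {
    const ready = [...pending.keys()].filter((id) =>
      pending.get(id).every((dep) => done.has(dep))
    );
    if (ready.length === 0) throw new Error("dependency cycle detected");
    for (const id of ready) {
      pending.delete(id);
      done.add(id);
    }
    waves.push(ready.sort());
  }
  return waves;
}

console.log(planWaves());
```

In the printed plan, B#build appears in the first wave together with every lint task, and A#build and C#build appear together in the second wave, which matches the order described in the list.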

All the tasks in the pipeline can be run with the following command:

npx turbo run build test lint deploy

Regular dependencies

If a task depends only on other tasks in its own package, list those tasks in the dependsOn array:

{
    "turbo": {
        "pipeline": {
            "deploy": {
                "dependsOn": ["build", "test", "lint"]           
            } 
        }    
    }
}

Topological dependencies

The ^ prefix explicitly declares a topological dependency: a package can run the task only after the corresponding task has completed in the packages it depends on:

{
    "turbo": {
        "pipeline": {
            "build": {
                "dependsOn": ["^build"]
            }
        }    
    }
}

Empty dependencies

If a task's dependsOn is undefined or [], the task can run at any time:

{
    "turbo": {
        "pipeline": {
            "lint": {
                "outputs": []
            }
        }    
    }
}

Specific dependencies

In some scenarios, a task may depend on a particular task of a package, in which case we need to manually specify the dependencies.

{
    "turbo": {
        "pipeline": {
            "build": {
                "dependsOn": ["^build"]
            },
            "test": {
                "dependsOn": ["build"],
                "outputs": []                            
            },
            "lint": {
                "outputs": []
            },
            "deploy": {
                "dependsOn": ["build", "test", "lint"]           
            },
            "frontend#deploy": {
                "dependsOn": ["ui#test", "backend#deploy"]            
            }
        }    
    }
}

In the above example, a deployment task for the frontend package is added. It depends on the UI component library and on the corresponding back-end project: the front end is deployed only after the UI component library has passed its unit tests and the back-end project has been deployed successfully. To depend on a task in a specific package, use the <package>#<task> syntax.
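How such package-specific keys might be read can be sketched in a few lines. The helpers parseTaskId and dependsOnFor are hypothetical names, and the lookup rule is an assumption made for illustration: a package-specific entry such as frontend#deploy is taken to replace the generic deploy entry for that package.

```javascript
// Split "frontend#deploy" into its package and task.
// A bare name such as "build" has no package part.
function parseTaskId(id) {
  const at = id.indexOf("#");
  return at === -1
    ? { pkg: null, task: id }
    : { pkg: id.slice(0, at), task: id.slice(at + 1) };
}

// Assumption: a package-specific entry, when present, replaces the
// generic entry for that package; a missing entry means no dependencies.
function dependsOnFor(pipeline, pkg, task) {
  const entry = pipeline[`${pkg}#${task}`] ?? pipeline[task] ?? {};
  return entry.dependsOn ?? [];
}

const pipeline = {
  deploy: { dependsOn: ["build", "test", "lint"] },
  "frontend#deploy": { dependsOn: ["ui#test", "backend#deploy"] },
};

// frontend uses its own entry; backend falls back to the generic one.
console.log(dependsOnFor(pipeline, "frontend", "deploy").map(parseTaskId));
console.log(dependsOnFor(pipeline, "backend", "deploy").map(parseTaskId));
```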

Using with Lerna

Lerna is a commonly used monorepo tool. Besides running package tasks, it also handles package dependency and version management well.

Compared with Lerna, Turborepo has a much better task scheduling mechanism. Lerna also runs tasks without any caching, so Turborepo has a large advantage there as well.

Turborepo does not yet implement package publishing and version updates, so at this stage Lerna and Turborepo can be used together, each doing what it does best.

Install Turborepo and make the following changes to package.json:

{
  "scripts": {
-   "dev": "lerna run dev --stream --parallel",
+   "dev": "turbo run dev --parallel --no-cache",
-   "test": "lerna run test",
+   "test": "turbo run test",
-   "build": "lerna run build",
+   "build": "turbo run build",
    "prepublish": "lerna run prepublish",
    "publish-canary": "lerna version prerelease --preid canary --force-publish",
    "publish-stable": "lerna version --force-publish && release && node ./scripts/release-notes.js"
  },
  "devDependencies": {
    "lerna": "^3.19.0",
+   "turbo": "*"
  },
+ "turbo": {
+   "pipeline": {
+     "build": {
+       "dependsOn": ["^build"],
+       "outputs": ["dist/**"]
+     },
+     "test": {
+       "outputs": []
+     },
+     "dev": {
+       "cache": false
+     }
+   }
+ }
}

Comparison

I am working on a monorepo project myself, but it has so few components that the improvement from Turborepo is hard to see, so I used the Reakit library for the comparison.

After installing the dependencies, make the following changes to package.json:

The packages in this project build independently of one another, so no dependencies are configured for the build and lint tasks.

Running build

The following uses the build command as an example.

To speed up the Lerna build, enable the --parallel option in the lerna-build script so that the packages are built in parallel.

When lerna-build is executed for the first time, the final time is shown in the following figure

Because Lerna has no cache, subsequent runs consistently take around 25 seconds.

When yarn turbo-build is executed for the first time, the console output is

The first execution takes about as long as the parallel Lerna build. On the second run the cache takes effect, and the final output is shown below

The second execution took 0.5 seconds because of the cache.

File changes

After the code of one of the packages is modified, the lerna-build time stays the same, at around 25 seconds.

When turbo-build is run, the final time output is shown below

You can see that the cache is still in effect for the unmodified package code, so the execution time is significantly reduced.

A complex scenario

We manually create a deployment scenario with the following commands and configurations

{
    "scripts": {
        "turbo-deploy": "turbo run deploy",
        "lerna-deploy": "npm run lerna-lint && npm run lerna-build"
    },
    "turbo": {
        "pipeline": {
            "deploy": {
                "dependsOn": ["lint", "build"]
            }
        }
    }
}

lerna-deploy takes about 45 seconds to complete, whereas turbo-deploy takes about 20 seconds and additionally benefits from caching on later runs.

Remote cache

When several people work on a project, team members can share the build cache, which speeds up everyone's builds.

When one member pushes the cache files from a build on some branch to the remote cache and another member is developing on the same branch, Turborepo lets the second member select the first member's build cache. When the same build task is run, the cache files are pulled from the remote to the local machine, which speeds up the build.

Run npx turbo link, log in, and select the cache to use.

Once the selection is complete, you can use the corresponding cache

In the author’s video demo, the project takes 27 seconds to complete all tasks without using caching

In the case of remote caching, the task can be built in about 3.5 seconds

So in projects with several developers, the remote cache is a killer feature that can dramatically speed up build tasks.
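The lookup order behind a shared cache can be sketched as follows. This is only an illustration under stated assumptions, not Turborepo's remote cache protocol: createCache is a hypothetical helper, plain Map objects stand in for the local cache directory and the remote cache service, and the key and artifact strings are made up.

```javascript
// Hypothetical two-level cache: check the local store first, then the
// shared remote store, and copy remote hits into the local store.
function createCache(remote) {
  const local = new Map();
  return {
    lookup(key) {
      if (local.has(key)) {
        return { source: "local", artifact: local.get(key) };
      }
      if (remote.has(key)) {
        const artifact = remote.get(key);
        local.set(key, artifact); // later lookups are served locally
        return { source: "remote", artifact };
      }
      return null; // miss: the task has to run
    },
    store(key, artifact) {
      local.set(key, artifact);
      remote.set(key, artifact); // share the result with the team
    },
  };
}

// Two team members (hypothetical names) share one remote store.
const remote = new Map();
const alice = createCache(remote);
const bob = createCache(remote);

alice.store("hash-of-build-inputs", "dist.tar");
console.log(bob.lookup("hash-of-build-inputs").source); // "remote"
console.log(bob.lookup("hash-of-build-inputs").source); // "local"
console.log(bob.lookup("some-other-hash")); // null
```

The second member never runs the build in this sketch: the first lookup downloads the artifact, and every later lookup is served from the local copy.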

Conclusion

As the examples above show, the more tasks there are and the more complex their dependencies, the more Turborepo can exploit the CPU's multiple cores. And thanks to the cache, in scenarios such as a front-end deployment where only the front-end code has changed, the build cache of the back-end code can be reused directly, which greatly reduces build time and improves build efficiency.

Therefore, using Turborepo instead of Lerna to build complex monorepo projects can greatly reduce build time, improve the development experience, and bring considerable benefits.