About the author

Any source

Classical Internet practitioner

Joined Fluent English at the end of 2014 and now mainly leads the Platform Team. Before that, worked at a company in Hangzhou.

Outline

1. Package Management

2. Code Management (Multi-Language)

3. bazel build //:go

4. Demo

1. Package Management

Vendor 

Go introduced the vendor directory in version 1.5, and by 1.7 it loads code from vendor by default. If the versions of a library you depend on are inconsistent, pinning a copy under vendor is a good choice. Take Upspin, an open source file-sharing project from Rob Pike, as an example. Use tree to look at its vendor directory:

 

Since Upspin uses FUSE, it stands to reason that fuse is vendored in full, along with protobuf, crypto, and so on. This is the most common way people use vendor.

But think about it: the code under vendor is already open source, so why commit it to your own repository? How do you keep the library versions under vendor up to date? How do you resolve dependency conflicts?

Back in February, Russ Cox pointed out that we have been using vendor for years and that Go really needs a dependency and version management tool. He published an article that makes three points:

1. Rust’s Cargo did this well, so we built dep.

2. Eight years of go install and go get have been hard on everyone; take a look at Go modules.

3. Go modules eliminate most vendor directories.

Vgo

And then came vgo.

 

Russ wrote seven blog posts explaining why he built vgo and why Go should use modules. But when a technology or a versioning scheme needs that many posts to spell out, I don’t think users will be able to figure it out in a short time.

Coming back to the earlier point, package management has three issues that need to be addressed:

1. Versioned packages. I don’t want v1, v2, v3 inside the packages.

2. Reproducible builds: the result should be the same every time you build the code.

3. Work outside of $GOPATH.

 

Russ filed an issue to discuss this. Golang currently has about 40,000 stars on GitHub, and this issue received 141 thumbs-up and 4 thumbs-down. Of course, this does not prove anything.

When looking at something new, it is usually better to approach it with a question in mind than to accept it passively.

Golang has been used at Google for many years. How does Google build its Go code?

With this question in mind, we move on to the next topic.

2. Code Management

   

Code management is quite difficult. Besides writing code, much of what I do is managing it: how should it be organized, and is this the right place to put it?

Fluent English has been growing rapidly and has run into many technical problems along the way. How do you support the business when it grows this fast? For example, our codebase contains C++, some machine learning libraries are written in Python, and it is far from being only Golang code. How do we build it?

Before coming here, I checked: there are about 213 services running in production. That figure is inflated, because it includes some Kubernetes services such as kube-proxy. So there are roughly 160 to 180 individual projects online, and it is normal for one developer to work on several projects.

 

In addition, some of our systems are very complex and still evolving. For example, our adaptive learning system involves algorithms, data, and back-end work. With so many roles working together, collaboration problems arise. We increasingly rely on data feedback to tell us what to do next, so the feedback loop is very important and has to be timely. How should different data sources exchange data, how should different languages interoperate, and how should a heterogeneous project be compiled? You can’t just hand someone a Makefile to build with: the Makefile might be long and smelly, or even impossible to maintain.

2.1 Code management challenges

What are the challenges of managing code when it is growing fast? For example, we use protobuf on a large scale internally to define data structures. When I deprecate a field, how do I let everyone know? Ideally, the CI build of every repo that relies on this proto should fail.

 

Here’s an example:

 

We define the type of these activities as an enum. For historical reasons, different systems each wrote their own enum, so the same tag carries different meanings in different systems. The problem shows up later: two systems receive the same tag but interpret it differently. For example, 0 represents PRESENTATION_TYPE in some systems but UNKNOWN in others.
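To make the collision concrete, here is a minimal Go sketch. The two tables stand in for two of the historical systems. Only PRESENTATION_TYPE and UNKNOWN come from the example above; the variable names, the QUIZ_TYPE value, and the nameIn helper are hypothetical.

```go
package main

import "fmt"

// systemA and systemB are hypothetical stand-ins for two services that
// each defined the activity type enum on their own.
var systemA = map[int32]string{
	0: "PRESENTATION_TYPE",
	1: "QUIZ_TYPE", // hypothetical value, for illustration only
}

var systemB = map[int32]string{
	0: "UNKNOWN",
	1: "PRESENTATION_TYPE",
}

// nameIn reports how a given system interprets a tag read off the wire.
func nameIn(system map[int32]string, tag int32) string {
	if name, ok := system[tag]; ok {
		return name
	}
	return "UNRECOGNIZED"
}

func main() {
	var tag int32 = 0 // the same value, passed from one system to the other
	fmt.Println("system A reads tag 0 as", nameIn(systemA, tag))
	fmt.Println("system B reads tag 0 as", nameIn(systemB, tag))
}
```

If both systems compiled against one shared definition kept in a single place, there would be only one table, and this kind of mismatch could not arise silently.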

2.2 How to reuse the code and solution

For example, if you write a library and a Makefile, how do other people learn how to run it locally? Another example is a fingerprint function: we have the content of a question and need to generate an ID from that content. The fingerprint function has several implementations, one in Java and one in C++. How can we ensure that when the fingerprint function is modified, the implementations in the other languages are changed with it? If they live in different repos, you may well forget.
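As a reference point, a fingerprint function of this kind might look like the Go sketch below. The normalization step (collapsing whitespace) and the choice of SHA-256 truncated to 8 bytes are assumptions made for illustration, not our production algorithm. The point is that every other language's implementation has to reproduce exactly the same steps.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Fingerprint derives a stable ID from question content.
// Assumption: whitespace is collapsed first, then the first 8 bytes of
// the SHA-256 digest are hex-encoded (16 characters).
func Fingerprint(content string) string {
	normalized := strings.Join(strings.Fields(content), " ")
	sum := sha256.Sum256([]byte(normalized))
	return hex.EncodeToString(sum[:8])
}

func main() {
	a := Fingerprint("What is  the capital of France?")
	b := Fingerprint("What is the capital of France?\n")
	fmt.Println(a, b, a == b) // the two IDs match after normalization
}
```

If the Java or C++ version skipped the normalization or truncated at a different length, the same question would get different IDs in different systems, which is why all implementations should change together.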

2.3 How to Share Knowledge

We want a systematic solution that solves this once and for all. Bazel, the open source version of Google’s Blaze and currently in beta, basically solved this problem for us after two PhDs from Google joined in 2017. rules_go, the Go rule set, is still in alpha and not yet in beta, but it works fine now.

Take a look at Google’s Blaze usage data

 

Google keeps all of its code in a single repo, with 45K commits and 800K builds per day, and of course it has a dedicated team working on Blaze.

See how Bazel compares to makefiles (see levelDB BUILD snippet).

 

The Bazel BUILD file is on the right and the Makefile is on the left, so you can easily see which one is readable and which one is not.

 

This is a very simple protobuf repo of ours, with four protos (A, B, C, D) that reference each other, and this is the protoc shell command that Bazel generated:

 

Verbose mode prints the protoc command. The service is generated through go_proto_library, and the BUILD file is about 24 lines. Its first part is a declaration that references our own common/bazel. Compare it with the previous image: written as a Makefile, that command would cover only the bottom two parts of this file, and it would become unmaintainable as the code grows.

2.4 How do I use Bazel

Create a WORKSPACE file in project root

All build rules use paths rooted at the WORKSPACE directory. You don’t need to know what lies above it: once you know the root directory, you know what every path looks like.

Let’s look at an example of the WORKSPACE file

 

The file is divided into three parts. The first part is the name, which is written in reverse-domain order. Next come the rules, since Bazel is made up of many rule sets. Here we reference rules_go at a specific version, and then bring in Gazelle, Go’s tool for automatically generating BUILD files. When dependencies are initialized, they are all pulled in and initialized together. The scripting language resembles Python and is called Skylark.

We require every dependency, even an indirect one, to be declared explicitly in WORKSPACE. The advantage is that WORKSPACE tells us exactly what code is used. We do not recommend pointing at a master branch, because it is cached: if master is already in the local cache, Bazel will not pull a newer master. We recommend pinning a commit instead.

Write a BUILD in each directory

Create a BUILD file in each directory. BUILD explicitly specifies how the code should be built. This has two advantages: the build is spelled out instead of implied, and the BUILD file makes it possible to keep different packages in the same directory.

This is a BUILD example

 

The first declaration sets the package visibility: if it is private, the targets cannot be referenced from elsewhere and are available only within the current directory. Then we load bazel_Essentials, which is open source. The next part specifies the import prefix used when building. go_binary produces an executable, and go_library publishes Go source files as a library.

Here is a more complex example

   

This explicitly lists which files are tests. You could pull in all the test files at once instead, but I don’t recommend it: unless your repo is very large, the test files are few enough to list by hand. timeout sets the maximum execution time of the test; if the test does not finish in time, an error is reported.

Let’s explain what these previous rules mean.

 

An external repository is prefixed with @, and protoc is the target name.
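The anatomy of a label can be sketched in a few lines of Go. This is an illustration only, not Bazel's own parser; the Label type and ParseLabel function are hypothetical names, and the sketch covers only the @repo//package:target form plus the shorthand in which //foo/bar means //foo/bar:bar.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Label holds the three parts of a Bazel label.
type Label struct {
	Repo    string // external repository name, empty for the current workspace
	Package string // directory path relative to the workspace root
	Target  string // target name inside the package
}

// ParseLabel splits labels of the form [@repo]//package[:target].
func ParseLabel(s string) (Label, error) {
	var l Label
	orig := s
	if strings.HasPrefix(s, "@") {
		i := strings.Index(s, "//")
		if i < 0 {
			return l, fmt.Errorf("label %q: missing //", orig)
		}
		l.Repo = s[1:i]
		s = s[i:]
	}
	if !strings.HasPrefix(s, "//") {
		return l, fmt.Errorf("label %q: must start with // or @repo//", orig)
	}
	s = s[2:]
	if i := strings.Index(s, ":"); i >= 0 {
		l.Package, l.Target = s[:i], s[i+1:]
	} else if s != "" {
		// Shorthand: the target defaults to the last path component.
		l.Package, l.Target = s, path.Base(s)
	}
	if l.Target == "" {
		return l, fmt.Errorf("label %q: empty target name", orig)
	}
	return l, nil
}

func main() {
	for _, s := range []string{"@com_google_protobuf//:protoc", "//:demo", "//foo/bar"} {
		l, err := ParseLabel(s)
		fmt.Printf("%-32s repo=%q package=%q target=%q err=%v\n", s, l.Repo, l.Package, l.Target, err)
	}
}
```

For @com_google_protobuf//:protoc this yields the repository com_google_protobuf, an empty package (the repository root), and the target protoc.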

A few brief words about build commands

1. build //...

bazel build //... builds every target under the WORKSPACE and sweeps all directories.

2. build //:demo

To build a single target, name it directly: //:demo, where demo is the target name.

3. run //:demo

Runs the demo target.

4. test //:demo_test

Runs the demo_test test.

Here are a few frequently used Bazel commands

 

The experimental_platforms option must be enabled when compiling for Linux AMD64 on another platform (such as macOS). This feature should be new or still in beta.

bazel query displays your current dependencies. You can also use bazel query to list all the targets under a package.

The last one: if you want to exclude a target or directory, prefix it with a minus sign.

 

There is also the case where the packages have already been downloaded and you want to run a single target, as shown in the first command. Arguments for the program are passed after the -- (double dash), which is common practice.

You can build an individual target (#2) and check all of its dependencies (#3). The advantage is that Bazel tells you explicitly how your code is built, without relying on the language’s own tooling or other tools. bazel query can also be used to look up dependencies and draw them as a graph.

3. bazel build //:go

Earlier we said to keep a question in mind: how does Google build Go? The answer is Blaze, whose open source version is Bazel.

The first step before using Bazel is to delete vendor: cd <your_go_project> && rm -rf vendor

(But doing so is a bit radical and risky. As developers we know that things must stay stable, so think twice before running rm -rf.)

Also, rules_go, which is currently under development, supports these

 

It does not support these:

 

3.1 bazel build //:gazelle

Gazelle is a tool for automatically generating BUILD files. With Gazelle, community packages can have all of their BUILD files generated. So how do you get your own project to build with Bazel? In three steps:

 

First, create a WORKSPACE file in the project root that declares which rules and dependency packages are needed. Second, run bazel run //:gazelle to generate the BUILD files. Third, add the dependencies that have not been declared yet.

DEMO

This was originally going to be a live demo, but because of the network and other factors I recorded a video instead. We build a large project, Beego. First we clone it from GitHub. It has no Bazel files yet, so we create a WORKSPACE file and edit it directly. We can copy the rules from the rules_go page on GitHub, which clearly explains which rules are needed and at what version. In the new WORKSPACE you can see the name that is declared. We add Gazelle so that it can generate all of our BUILD files automatically, and then initialize it. You want all the rules in one place. The nice thing about this is that you don’t have to redo everything; you just update the current file.

Let’s go back to those three questions

1. [X] Packages Versioned

Bazel solves this problem by specifying a commit for the package

2. [X] Verifiable and verified build

This is also covered: as long as nothing is changed or upgraded, dependencies included, the result is the same every time.

3. [X] Work outside $GOPATH

Although it is customary to put code in $GOPATH, Bazel builds without setting or relying on any GOPATH.

3.2 Minimal Version Selection

We only need to consider two cases. In the first, our code should always track the latest version, so we use master/HEAD directly. In the second, I know which commit works, so I just pin that commit. Semantic versioning says a major version bump may carry breaking changes, but not many people actually follow it.

Remote Caching

Bazel downloads the entire set of Go libraries. Remote caching avoids unnecessary downloads, and all build outputs are cached for you. The cache can also be moved out of Bazel’s current WORKSPACE directory and, with some extra work, served from a shared cluster.

Remote Execution

Local builds run inside a sandbox, with multiple threads in parallel. If you want faster builds, you can use remote execution.

Bazel can build other things as well. You can build Docker images, which we have always wanted to try, and you can build for Android, iOS, and the Web. Once everything goes through Bazel, cross-language and cross-platform builds stop being an issue; we only need to know how Bazel builds work.

3.3 Problems Fluent English encountered with Bazel

First, we need to pull the latest code from master/HEAD

Google keeps everything in one repo, so it does not have this problem. To solve it, one of our PhDs wrote a rule in Skylark.

 

This code is open source. It is used together with the environment variable BAZEL_RUNID: whenever the value of BAZEL_RUNID changes, the cache is refreshed.

 

Normally BAZEL_RUNID is set to a random value or to the current time, so that each CI run pulls the latest code.
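A CI wrapper could set the variable along these lines. This is a sketch in Go, assuming the run ID is the current time in nanoseconds; the helper name withRunID is hypothetical, and the bazel command is only constructed here, not executed.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"time"
)

// withRunID returns a copy of base with BAZEL_RUNID appended.
// os/exec uses the last value when a key appears more than once,
// so an inherited BAZEL_RUNID is overridden.
func withRunID(base []string, runID string) []string {
	env := append([]string{}, base...)
	return append(env, "BAZEL_RUNID="+runID)
}

func main() {
	// Assumption: a fresh timestamp per CI run is unique enough to force a refresh.
	runID := strconv.FormatInt(time.Now().UnixNano(), 10)

	cmd := exec.Command("bazel", "build", "//...")
	cmd.Env = withRunID(os.Environ(), runID)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr

	fmt.Println("would run:", cmd.Args, "with", cmd.Env[len(cmd.Env)-1])
	// A real wrapper would call cmd.Run() here and check the error.
}
```

For local development you would leave BAZEL_RUNID unchanged between runs, so the cached copy keeps being reused.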

 

Second, Bazel is slow to pick up

We think Bazel’s ideas are good, but it is hard going for first-time users and for people moving over from another tool. We addressed this with a codelab.

 

Third, third-party dependencies

I recently ran into this problem:

 

After qiniu_x is added through go_repository, the build fails.

 

The error says that ‘com_qiniupkg_x’ is needed, yet the code is actually under @com_github_qiniu_x//. Where does this package come from?

 

It turned out that config.v7 in qiniu_x imports the code under another package name. go build has no problem with this, because that import path resolves, but unfortunately the code it points to lives in the very same repo. The solution is to declare the repo a second time under the other name, pointing at the same qiniu_x on GitHub, which amounts to copying the code. When a third-party package has this kind of problem, that is what you have to do.
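The two repository names in the error follow from how a Go import path is turned into a Bazel repository name: the host is reversed and the parts are joined with underscores. The Go sketch below illustrates that convention under the assumption that the two import roots involved are github.com/qiniu/x and qiniupkg.com/x. RepoName is a hypothetical helper written for this explanation, not Gazelle's own function.

```go
package main

import (
	"fmt"
	"strings"
)

// RepoName maps a Go import path root to a Bazel-style repository name:
// reverse the host labels, append the remaining path elements, and join
// everything with underscores.
func RepoName(importPath string) string {
	parts := strings.Split(strings.ToLower(importPath), "/")
	host := strings.Split(parts[0], ".")
	for i, j := 0, len(host)-1; i < j; i, j = i+1, j-1 {
		host[i], host[j] = host[j], host[i]
	}
	name := strings.Join(append(host, parts[1:]...), "_")
	// Dots and dashes are not used in repository names here.
	return strings.NewReplacer("-", "_", ".", "_").Replace(name)
}

func main() {
	fmt.Println(RepoName("github.com/qiniu/x")) // com_github_qiniu_x
	fmt.Println(RepoName("qiniupkg.com/x"))     // com_qiniupkg_x
}
```

Because the same code is reachable under two import paths, Bazel sees two different repository names and wants both to be declared.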