This article draws on the Youpai Cloud team's practical experience with private Go modules. It explains how to use private modules and the details behind the go get tool, including how Go fetches the correct source code from a private GitLab instance and how authentication is handled. It is based on an Open Talk live broadcast given by Liu Yunpeng, senior development engineer at Youpai Cloud.

About Open Talk: a technology salon initiated by Youpai Cloud. In keeping with Youpai Cloud's original goal of “making entrepreneurship easier”, it offers developers substance-only talks on technology, operations, products and entrepreneurship, helping members improve their professional skills and helping companies grow better and faster.

Research and development background

Go introduced modules in version 1.11. Version 1.13 added module checksum verification to strengthen module security, and as of version 1.16 module mode is enabled by default. The Go team recently announced on its blog that it plans to remove GOPATH support in 1.17. If you are not using Go modules yet, now is a good time to try them.

The main difference between module mode and GOPATH mode lies in how private modules are used. Public modules work the same way in both: they are fetched directly with go get. For a private module, GOPATH mode lets you simply drop the code into the GOPATH directory. Module mode does not allow this; it has its own way of managing code, which is introduced briefly below.

How Go gets modules

Modules are usually fetched with the go get tool. There are currently two ways to fetch a module:

The first is to pull the code from a code hosting platform with a traditional version control system (VCS). This is mainly Git, but SVN, hg and other tools are also supported.

The second is the GOPROXY protocol, supported since version 1.12, in which Go downloads module archive files from a GOPROXY server.

As of 1.13, Go also verifies checksums (go.sum). After downloading, every module has its checksum checked: Go compares the hash of the downloaded module with the hash recorded in Google's online checksum database to make sure the module has not been tampered with. Only after a module has been verified can it be installed and used normally.

How modules are fetched through a VCS

Go supports several version control tools, so the first thing go get has to decide is which tool to use for a given module. There are three ways of deciding: two static matching methods and one dynamic matching method.

Static matching mode

Prefix matching: code hosting platforms such as GitHub, Bitbucket, Apache and OpenStack are built into the go get toolchain. go get checks the prefix of the module path and, if it matches, uses the corresponding version control tool. For example, the module github.com/example/pkg matches the GitHub prefix, and go get knows that GitHub uses Git.

Regular-expression matching: the module path carries a suffix that names one of the five supported version control tools (Git, SVN, hg, bzr, Fossil), and the suffix is matched with a regular expression. For a path with a .git suffix, the match yields subgroups, and the vcs subgroup captures git, so the module is fetched with Git.
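The suffix rule can be sketched in Go as follows. The regular expression is a simplified stand-in written for this illustration, not the exact pattern used inside the go command, and the import paths are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// vcsSuffix is a simplified pattern (an assumption for illustration). It
// captures the repository root, up to and including the VCS suffix, in the
// subgroup "root" and the tool name in the subgroup "vcs".
var vcsSuffix = regexp.MustCompile(
	`^(?P<root>[a-z0-9.\-]+(/[A-Za-z0-9_.\-]+)+?\.(?P<vcs>bzr|fossil|git|hg|svn))(/[A-Za-z0-9_.\-]+)*$`)

// matchVCSSuffix reports the repository root and the VCS name for an import
// path that carries a VCS suffix such as ".git".
func matchVCSSuffix(importPath string) (root, vcs string, ok bool) {
	m := vcsSuffix.FindStringSubmatch(importPath)
	if m == nil {
		return "", "", false
	}
	return m[vcsSuffix.SubexpIndex("root")], m[vcsSuffix.SubexpIndex("vcs")], true
}

func main() {
	for _, p := range []string{
		"example.com/user/repo.git/pkg", // hypothetical path with a .git suffix
		"example.com/user/repo",         // no suffix: this rule does not apply
	} {
		root, vcs, ok := matchVCSSuffix(p)
		fmt.Println(p, "->", root, vcs, ok)
	}
}
```

A path without a suffix produces no match, which is the case where go get moves on to dynamic matching.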

Dynamic matching mode

When neither the prefix nor the regular expression matches, dynamic matching is used. go get sends an HTTP request to the URL formed from the module path, with the scheme prepended and the query parameter go-get=1 appended. It expects the server to return information about the module that tells go get how to proceed. Go sends an HTTPS request by default. To allow plain HTTP for a server, list it in the GOINSECURE environment variable, which takes a comma-separated list of module path prefixes for which insecure fetching is permitted.

The response body go get expects is an HTML document. The only part that is meaningful to Go is a meta tag with the attribute name="go-import", whose content attribute tells Go how to retrieve the module.

The content consists of three parts. The first, root-path, is the name of the module. The second, vcs, is the version control tool to use, such as git or svn. The third, repo-url, is the repository in which the module's code is stored, written as a scheme followed by the repository address.

When curl is used to simulate a go get request, the golang.org/x/net server returns an HTML document that contains such a meta tag. The first part of the content is the module name, golang.org/x/net. The second part is git, meaning Git must be used to fetch the source code. The third part is the address where the module is hosted, go.googlesource.com/net. Note that the meta tag may only be placed inside head: go get parses the document from the beginning and stops when it meets the closing tag of head or the opening tag of body.
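A minimal sketch of the dynamic lookup, assuming the response is parsed leniently with encoding/xml. The function names and the sample page are made up for illustration, and the real go command may differ in detail.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// goImport holds the three fields of a go-import meta tag's content attribute.
type goImport struct {
	RootPath, VCS, RepoURL string
}

// discoveryURL builds the URL requested during dynamic matching.
func discoveryURL(modulePath string) string {
	return "https://" + modulePath + "?go-get=1"
}

// attr returns the value of the named attribute, or "" if it is absent.
func attr(attrs []xml.Attr, name string) string {
	for _, a := range attrs {
		if strings.EqualFold(a.Name.Local, name) {
			return a.Value
		}
	}
	return ""
}

// parseGoImports scans an HTML document for go-import meta tags. It stops at
// </head> or <body>, so tags placed outside the head are ignored.
func parseGoImports(page string) []goImport {
	d := xml.NewDecoder(strings.NewReader(page))
	d.Strict = false // HTML is not well-formed XML
	var found []goImport
	for {
		tok, err := d.RawToken()
		if err != nil {
			break // io.EOF or a syntax error: return what was found so far
		}
		switch e := tok.(type) {
		case xml.EndElement:
			if strings.EqualFold(e.Name.Local, "head") {
				return found
			}
		case xml.StartElement:
			if strings.EqualFold(e.Name.Local, "body") {
				return found
			}
			if !strings.EqualFold(e.Name.Local, "meta") || attr(e.Attr, "name") != "go-import" {
				continue
			}
			if f := strings.Fields(attr(e.Attr, "content")); len(f) == 3 {
				found = append(found, goImport{f[0], f[1], f[2]})
			}
		}
	}
	return found
}

func main() {
	// Sample response body shaped like the one described in the text.
	page := `<html><head>
<meta name="go-import" content="golang.org/x/net git https://go.googlesource.com/net">
</head><body></body></html>`
	fmt.Println(discoveryURL("golang.org/x/net"))
	fmt.Println(parseGoImports(page))
}
```

Stopping at the end of head keeps the parser from reading arbitrary page content, which is why a tag placed in body has no effect.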

How go get uses Git

Git supports both the HTTP and SSH protocols. By default Go invokes Git over HTTP only, and Git's interactive prompts are disabled during the call. For example, cloning a private repository over HTTP requires a user name and password. Because they cannot be entered interactively when go get invokes Git, fetching the module fails. The prompt is controlled by the environment variable GIT_TERMINAL_PROMPT; if you manually set it to 1, the prompt is enabled again and you can type the user name and password by hand.

So how can the user name and password be passed to Git without a prompt? When Git uses HTTP, the credentials can be supplied through the .netrc file, which is located in the HOME directory. The file supports two kinds of entries:

  • The first names a server and defines the user name and password for that server (machine, login, password);
  • The second names no server and defines a default user name and password for all other servers (default, login, password).

Suppose gitlab.com is configured with the user name root and the password admin. When you clone a private repository from gitlab.com, Git reads root and admin from the file, so you are not asked for the password. Suppose a default entry is also defined, with the user name guest and the password 123456. Then for every server other than gitlab.com that requires authentication, guest and 123456 are passed to the requesting program as the user name and password.
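The lookup rule can be illustrated with a small Go sketch. It handles only the machine, default, login and password tokens, and it is an assumption-laden illustration, not how Git or curl implement .netrc parsing. The credentials are the ones from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is one .netrc record; machine is empty for the "default" entry.
type entry struct {
	machine, login, password string
}

// lookupNetrc returns the credentials that apply to host: the first entry
// whose machine equals host, or the default entry when it is reached first.
func lookupNetrc(text, host string) (login, password string, ok bool) {
	var entries []entry
	tokens := strings.Fields(text)
	for i := 0; i < len(tokens); i++ {
		switch key := tokens[i]; key {
		case "default":
			entries = append(entries, entry{})
		case "machine", "login", "password":
			if i+1 >= len(tokens) {
				break // dangling keyword at the end of the file: ignore it
			}
			i++
			switch {
			case key == "machine":
				entries = append(entries, entry{machine: tokens[i]})
			case len(entries) == 0:
				// login or password before any entry: ignore it
			case key == "login":
				entries[len(entries)-1].login = tokens[i]
			default:
				entries[len(entries)-1].password = tokens[i]
			}
		}
	}
	for _, e := range entries {
		if e.machine == host || e.machine == "" {
			return e.login, e.password, true
		}
	}
	return "", "", false
}

func main() {
	netrc := `machine gitlab.com login root password admin
default login guest password 123456`
	for _, host := range []string{"gitlab.com", "git.example.org"} {
		login, password, ok := lookupNetrc(netrc, host)
		fmt.Println(host, login, password, ok)
	}
}
```

Keeping the default entry last matters: anything listed after it would never be reached by this first-match lookup.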

Go can also invoke Git over SSH, but not by default. SSH is only used when it is specified explicitly during dynamic matching. With static matching (prefix or regular-expression matching), the module information is derived from the path alone and only the HTTPS protocol can be used.

Take the module example.com/pkg, whose repository address is gitlab.example.com/example/pkg. The content of the meta tag carries the complete module information. The first part is the module name, as defined earlier. It is followed by git, which means Git is used to fetch the code. The last part is the repository address, which specifies the SSH protocol together with the Git user name and the server's SSH port number.

Git SSH authentication is based on a key pair. If you do not have one, you can generate it with ssh-keygen from the SSH tool suite. Its most commonly used parameter is -t, which specifies the key type. RSA keys are probably the most common, but I prefer Ed25519. Its obvious advantage is that the keys are very short: the public key and the private key are only 32 bytes each, yet the security is comparable to that of an RSA key of about 3000 bits. Because it is both secure and short, Ed25519 is often chosen as the key type.
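The key sizes can be checked with Go's standard crypto/ed25519 package. This is only a size check, not a replacement for ssh-keygen. Note that Go's PrivateKey type stores the 32-byte seed followed by the public key, so the sketch measures the seed.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// keySizes generates a fresh Ed25519 key pair and returns the length in bytes
// of the public key and of the private seed.
func keySizes() (pubLen, seedLen int, err error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return 0, 0, err
	}
	return len(pub), len(priv.Seed()), nil
}

func main() {
	pubLen, seedLen, err := keySizes()
	if err != nil {
		panic(err)
	}
	fmt.Printf("public key: %d bytes, private seed: %d bytes\n", pubLen, seedLen)
}
```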

Once the key pair has been generated, the key files, both the private key and the public key, are placed in the .ssh folder under HOME. The file ending in .pub is the public key file, which needs to be configured on the code hosting platform, such as GitLab or GitHub. In GitLab's key settings an Ed25519 public key is visibly short.

Fetching modules through GOPROXY

Go can fetch modules through the GOPROXY protocol. The protocol is based on HTTP, uses only GET requests, and reports results with standard HTTP status codes. Public GOPROXY servers require no user name or password. If you build a private one, you can protect it with HTTP Basic authentication and supply the user name and password through the .netrc file as before. GOPROXY also has two advantages:

  • First, fetching a module through GOPROXY is faster than cloning it directly with a VCS, for reasons explained later.
  • Second, it solves the problem of modules that cannot be reached. For example, when the golang.org domain is not accessible, the modules can still be downloaded through a proxy server set up by a third party.

Using GOPROXY

The proxy is configured through the GOPROXY environment variable, which holds the proxy server URL. Several proxy URLs can be listed, separated by commas or by pipe characters; the difference between the two separators is illustrated in the examples below.

The fixed strings off and direct can be used in place of URLs. off disallows downloading modules from any source: with GOPROXY set to off, only local modules are used and nothing is fetched from GitLab, GitHub or anywhere else. direct means pulling directly from the VCS and is usually used as a fallback.

Consider two examples:

  • The first uses the Linux environment variable syntax, set through export, for example export GOPROXY=https://proxy.golang.org,direct. It configures proxy.golang.org, Google's official GOPROXY server, followed by a comma and the fallback direct. A 404 or 410 status code from the GOPROXY server means that the module cannot be found. With a comma separator, go get tries the fallback only if the server returns 404 or 410; in this case it then downloads the code from the version control platform.
  • The second uses a different syntax, for example go env -w GOPROXY=https://goproxy.cn|direct. The go env -w syntax is built into Go and supported since Go 1.13. It is cross-platform: on Windows, Linux and macOS you can configure Go-related environment variables in the same way. The example sets the address of a popular proxy in China, goproxy.cn, and uses the pipe symbol to specify the fallback. The pipe symbol means that go get tries the fallback no matter what error the proxy returns, even a non-HTTP one, such as a GOPROXY server failing with a 500 error or a network error.
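The difference between the two separators can be sketched in Go. The type and function names are hypothetical and the sketch only models the fallback decision described above; it does not perform any downloads.

```go
package main

import (
	"fmt"
	"strings"
)

// proxyEntry is one element of a GOPROXY list.
type proxyEntry struct {
	URL           string // a proxy URL, "direct" or "off"
	FallBackOnAny bool   // true when the entry is followed by "|"
}

// parseGOPROXY splits a GOPROXY value into entries and records which
// separator follows each one.
func parseGOPROXY(value string) []proxyEntry {
	var entries []proxyEntry
	for value != "" {
		i := strings.IndexAny(value, ",|")
		if i < 0 {
			entries = append(entries, proxyEntry{URL: value})
			break
		}
		if url := value[:i]; url != "" {
			entries = append(entries, proxyEntry{URL: url, FallBackOnAny: value[i] == '|'})
		}
		value = value[i+1:]
	}
	return entries
}

// shouldFallBack reports whether the next entry should be tried after a
// failed request. status is the HTTP status code of the failure, or 0 for a
// non-HTTP error such as a network failure.
func shouldFallBack(e proxyEntry, status int) bool {
	if e.FallBackOnAny {
		return true
	}
	return status == 404 || status == 410
}

func main() {
	for _, v := range []string{"https://proxy.golang.org,direct", "https://goproxy.cn|direct"} {
		first := parseGOPROXY(v)[0]
		fmt.Println(first.URL, "falls back on 404:", shouldFallBack(first, 404),
			"on 500:", shouldFallBack(first, 500))
	}
}
```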

GOPROXY implementation

The GOPROXY protocol is simple to implement: only five endpoints are officially defined.

The three variables that appear in the endpoint URLs mean the following:

  • base is the URL of the GOPROXY server;
  • module is the name of the module to be fetched;
  • version is the version of the module.

Case encoding problems

Some systems, such as case-insensitive file systems, do not distinguish upper and lower case, so confusion can occur when module or version contains uppercase letters. To avoid this problem, each uppercase letter is encoded as an exclamation mark followed by the corresponding lowercase letter.
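The encoding rule is short enough to sketch directly. The function name is hypothetical; the golang.org/x/mod/module package provides the real implementation, which also validates the path.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeCase replaces every ASCII uppercase letter with '!' followed by the
// corresponding lowercase letter, leaving all other characters unchanged.
func escapeCase(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(escapeCase("github.com/Azure/azure-sdk-for-go")) // github.com/!azure/azure-sdk-for-go
	fmt.Println(escapeCase("v1.0.0-RC1"))                        // v1.0.0-!r!c1
}
```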

The five endpoints are:

  • The first returns the list of all versions of a module;
  • The second returns information about a specified version;
  • The third returns the go.mod file of a specified version of a specified module;
  • The fourth returns the latest version of the module. This endpoint is optional; if it is not provided or implemented, GOPROXY still works;
  • The last downloads the ZIP file of a specified version of the module.
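As a sketch, the five request URLs can be built from the three variables like this. The paths follow the GOPROXY protocol as documented by Go; the helper name is hypothetical and v0.3.6 is only a sample version string.

```go
package main

import "fmt"

// endpoints returns the five GOPROXY request URLs for a module version.
// module and version are expected to be case-encoded already.
func endpoints(base, module, version string) map[string]string {
	prefix := base + "/" + module
	return map[string]string{
		"list":   prefix + "/@v/list",
		"info":   prefix + "/@v/" + version + ".info",
		"mod":    prefix + "/@v/" + version + ".mod",
		"latest": prefix + "/@latest",
		"zip":    prefix + "/@v/" + version + ".zip",
	}
}

func main() {
	urls := endpoints("https://proxy.golang.org", "golang.org/x/text", "v0.3.6")
	for _, name := range []string{"list", "info", "mod", "latest", "zip"} {
		fmt.Printf("%-6s %s\n", name, urls[name])
	}
}
```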

In a request such as proxy.golang.org/golang.org/x/text/@v/list, proxy.golang.org is the address of the proxy server, golang.org/x/text is the name of the module to fetch, @v is a fixed string, and list is the endpoint being called. This endpoint returns all versions of the text module. Once Go has the full list, it can work out the latest version of the module from semantic versioning.

The info endpoint and the latest endpoint return the same structure. Version is a required field holding the version string. Time is an optional string in RFC 3339 format that gives the commit time of the version.
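A minimal sketch of decoding such a response in Go. The struct and function names are hypothetical and the sample values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// versionInfo mirrors the JSON object returned by the info and latest
// endpoints.
type versionInfo struct {
	Version string    // version string, required
	Time    time.Time // commit time in RFC 3339 format, optional
}

// parseInfo decodes a response body into a versionInfo.
func parseInfo(body string) (versionInfo, error) {
	var info versionInfo
	err := json.Unmarshal([]byte(body), &info)
	return info, err
}

func main() {
	// Made-up response body, used only to show the shape of the data.
	info, err := parseInfo(`{"Version":"v1.2.3","Time":"2021-03-30T09:54:01Z"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(info.Version, info.Time.Format(time.RFC3339))
}
```

When Time is absent, the field keeps its zero value, which can be detected with info.Time.IsZero().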

Finally, the mod and zip endpoints. The mod endpoint returns the go.mod file of the specified version. In the example from the talk, the go.mod file of the latest version shows that the text package depends only on the tools module. The zip endpoint returns the ZIP file of the specified version, which packs all the source files of that version. This is the endpoint go get ultimately uses to download the module.

As mentioned above, fetching source code through GOPROXY is faster than through a VCS. Downloading the ZIP only fetches the files of the requested version, without any history. Cloning the repository with a VCS such as Git fetches the complete history as well. As a result, files obtained through the zip endpoint are smaller and faster to download. Note that the protocol limits the size of the module ZIP file, and the total uncompressed size of its files, to 500 MiB. go.mod files and LICENSE files are limited to 16 MiB.

Module verification

Go 1.13 added checksum verification for Go modules. By default, every downloaded module is checked to verify that its hash matches the one recorded in the online checksum database (default: sum.golang.org).

The verification process is controlled by the environment variables GOSUMDB and GONOSUMDB. GOSUMDB specifies the address of the online database to use. Since the default sum.golang.org is not accessible in China, the example configuration uses the mirror for China provided by Google. GOSUMDB can also be set to off, which disables verification: the hash of downloaded modules is not checked and the whole step is skipped. I do not recommend this in practice. Instead, use the GONOSUMDB environment variable to list the modules that do not need verification, such as private modules, which can never pass it. GONOSUMDB works by prefix matching. If it is set to gitlab.com, as in the example, no package whose path starts with gitlab.com undergoes the checksum check.

Here is a rundown of the common variables:

  • GONOPROXY works by prefix matching. With gitlab.com specified, as in the example, no code under gitlab.com is fetched from the GOPROXY server; it is pulled directly from the source server through the traditional VCS method;
  • GONOSUMDB lets modules whose paths match a prefix skip the checksum verification;
  • GOPRIVATE combines the two variables above: configuring GOPRIVATE is the same as configuring both of them together;
  • GOVCS, added in Go 1.16, specifies which version control tools may be used for which modules.
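Prefix matching for these variables can be sketched as follows. This is a simplified illustration with a hypothetical function name; it assumes each pattern is compared against the leading path elements of the module path using path.Match.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefixPatterns reports whether any comma-separated glob pattern
// matches a leading run of path elements of target.
func matchPrefixPatterns(globs, target string) bool {
	for _, glob := range strings.Split(globs, ",") {
		glob = strings.TrimSpace(glob)
		if glob == "" {
			continue
		}
		// Cut target down to as many path elements as the pattern has.
		n := strings.Count(glob, "/")
		prefix := target
		for i := 0; i < len(target); i++ {
			if target[i] == '/' {
				if n == 0 {
					prefix = target[:i]
					break
				}
				n--
			}
		}
		if n > 0 {
			continue // target has fewer elements than the pattern
		}
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	goprivate := "gitlab.com" // value used in the text's example
	for _, m := range []string{"gitlab.com/lyp256/pkg", "github.com/user/pkg"} {
		private := matchPrefixPatterns(goprivate, m)
		fmt.Printf("%s: skip proxy and checksum database = %v\n", m, private)
	}
}
```

Matching whole path elements, not raw string prefixes, keeps a pattern like gitlab.com from accidentally covering an unrelated host whose name merely starts with the same characters.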

Business practice at Youpai Cloud

Use of private packages

This section describes how private modules are used in practice. GitLab is widely used inside companies as a private code hosting service, and GitLab itself answers the HTTP requests sent by go get. When a package is fetched with go get, the client sends an HTTP request to the GitLab server, which returns a response containing the meta tag. The tag tells the client to fetch the module's source code with Git over HTTP; GitLab uses HTTPS by default. With that response the client can pull the module's source code with Git. After the download there is still the checksum verification step. You can add gitlab.com to the GOPRIVATE variable to tell Go that all modules under it are private and should skip the checksum check.

Things are a little different inside Youpai Cloud, where every HTTP service sits behind a second round of authentication through Google. Every request to the internal GitLab server is first checked for a Google-authorized header; if there is none, the request is intercepted and a 403 error is returned. As a result, plain HTTP requests never reach the GitLab server. The HTTP requests sent by Go are blocked as well, so Go cannot retrieve the module information. Although the source code could be cloned from the server directly over SSH, go get lacks the information it needs to do so and the fetch fails. The SSH request (the gray line in the original diagram) is therefore never actually made.

So what is the solution? We use an additional HTTP service to handle go get's HTTP requests. This service has no authentication step, so the request passes and go get receives the meta information it needs. The meta tag must specify the SSH protocol, because GitLab's HTTP service is behind the second authentication and unauthenticated requests cannot get through. Authentication is then handled by SSH key pairs, with no prompt. The HTTP service that guides go get does not deal with authorization at all; all authorization is left to GitLab, so access to a private module is still decided by GitLab.

Guiding go get

How does the additional service guide go get? This requires a change in how module paths are named, with the new rule based on GitLab's naming.

gitlab.com/lyp256/pkg

(domain name: gitlab.com; repository: lyp256/pkg)

A complete module path is made up of several parts: gitlab.com is the domain name, lyp256 is the owner, and pkg is the module's project name. For a single GitLab platform, what matters is the last two segments, which specify the module owner and the project name. The domain name is fixed and can be ignored.

Based on this rule, I implemented a small service that handles go get's HTTP requests.
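The author's own code is not reproduced here. What follows is only a minimal sketch of what such a service could look like, under stated assumptions: the module domain go.example.com, the SSH address of the GitLab server and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"html"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Hypothetical values: the domain used in module paths and the SSH address
// of the internal GitLab server.
const (
	moduleHost = "go.example.com"
	sshBase    = "ssh://git@gitlab.example.com:22"
)

// goImportMeta maps a request path such as /owner/project/sub/pkg to the
// go-import meta tag for the module go.example.com/owner/project.
func goImportMeta(urlPath string) (string, bool) {
	parts := strings.Split(strings.Trim(urlPath, "/"), "/")
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", false
	}
	owner, project := parts[0], parts[1]
	root := moduleHost + "/" + owner + "/" + project
	repo := sshBase + "/" + owner + "/" + project + ".git"
	content := html.EscapeString(root + " git " + repo)
	return `<meta name="go-import" content="` + content + `">`, true
}

// handler answers go get discovery requests (those carrying go-get=1).
func handler(w http.ResponseWriter, r *http.Request) {
	meta, ok := goImportMeta(r.URL.Path)
	if !ok || r.URL.Query().Get("go-get") != "1" {
		http.NotFound(w, r)
		return
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprintf(w, "<html><head>%s</head><body></body></html>\n", meta)
}

func main() {
	// Exercise the handler in-process. A real deployment would instead pass
	// the handler to http.ListenAndServe behind the module domain.
	req := httptest.NewRequest("GET", "/lyp256/pkg?go-get=1", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Print(rec.Body.String())
}
```

The service only rewrites names and performs no authentication. Because the repository URL uses SSH, access control stays with GitLab's key-based authentication, as described above.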

GitLab CI practice

GitLab CI starts an empty container. The example uses the Golang Alpine image, which contains nothing besides Go, so we need to install the dependencies and inject the SSH credentials. The script is defined as follows:

Step 1: Use mkdir -p to create a directory under the cache. The cache is a directory on the CI machine's physical disk that persists data, and it can be used to cache Go modules to reduce module downloads.

Step 2: Install the base environment and tool packages. The example installs Git, g++, which Go compilation depends on, and OpenSSH, the SSH toolchain that Git needs.

Step 3: Set up the SSH keys. There are two parts: trusting the GitLab server's host key and importing the private key used for authentication. The GitLab server's host key is fetched with ssh-keyscan and saved to the known_hosts file. The private key that is allowed to access the Git projects is stored in the GitLab CI configuration as the environment variable $DEPLOY_SSH_KEY; write its contents to the corresponding SSH private key file and set the correct permissions.

Finally, configure the GOPRIVATE variable to mark all modules under GO.holdcloud.com as private modules, so that they bypass the proxy and the checksum verification.

At this point all the preparation is complete. The go test command that follows is the normal CI test logic, which can be written according to the actual situation.

Conclusion

  • Go plans to remove GOPATH support in 1.17. Migrating to Go modules as soon as possible is recommended.
  • Checksum verification detects code changes and improves security and availability. Disabling it is not recommended.
  • Keeping a vendor directory is recommended, to guard against dependent modules being deleted.
