Origin

Nebula Graph’s earliest automated testing ran on Jenkins on Azure, wired up to GitHub webhooks: when a contributor submitted a pull request, adding a ready-for-testing label and leaving a comment would automatically trigger the corresponding unit-test job on Jenkins. The result looked like this:

Because the Azure cloud hosts were rented, and Nebula’s builds are demanding, task triggering was concentrated during the day. Since last year, the team has been looking for an alternative that would let us take the Azure test machines offline while providing a multi-environment test solution.

The existing products mainly include:

  1. TravisCI
  2. CircleCI
  3. Azure Pipeline
  4. Jenkins on K8S (self-built)

Although each of the products above has its limitations, they are all fairly friendly to open source projects.

Given previous experience with GitLab CI, I realized that deep integration with GitHub was clearly the first choice. “Deep” means sharing GitHub’s entire open source ecosystem and its well-designed API (more on that later). With the release of GitHub Actions 2.0 in 2019, Nebula Graph gamely took the plunge.

Here’s a quick overview of some of the benefits we’ve seen with GitHub Actions:

  1. Free of charge. Open source projects can use all Actions features for free, and the hosted machines are reasonably well configured.
  2. A good open source ecosystem. The whole CI process can directly use any open source action on GitHub, and if none fits your needs, writing one yourself is not much trouble; Docker-based customization is supported, so even a Bash script can become a dedicated action.
  3. Multi-platform support. Windows, macOS and Linux are all available out of the box, making cross-platform testing easy.
  4. Interaction with the GitHub API. Through GITHUB_TOKEN you can directly access the GitHub API v3, for example to upload a file or check a PR’s status with a curl command.
  5. Automatic hosting. Just place a workflow description file in the .github/workflows/ directory, and every commit automatically triggers a new Action run.
  6. YAML-format workflow descriptions, far more concise and readable than the workflow files of Actions 1.0.
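As a concrete illustration of point 4, here is a hedged sketch of querying a pull request’s state through the GitHub API v3 from a shell step. This is not a snippet from Nebula’s actual workflows; the repository name and PR number are placeholders.

```shell
# Placeholder values: in a real workflow, GITHUB_TOKEN is injected by
# Actions and the PR number comes from the event payload.
repo="vesoft-inc/nebula"
pr=1234
url="https://api.github.com/repos/${repo}/pulls/${pr}"

# Fetch the PR JSON; its "state" field is "open" or "closed".
state=$(curl --silent \
             --header "authorization: Bearer ${GITHUB_TOKEN:-}" \
             "$url" | grep -o '"state": *"[a-z]*"' | head -n 1 || true)
echo "$url"
```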

Before getting into Nebula Graph’s practice, let’s be clear that the first goal is automated testing.

For a database product, testing cannot be overemphasized. Nebula’s testing is divided into unit testing and integration testing. GitHub Actions primarily serves the unit tests, while also preparing for integration testing, such as Docker image builds and installer packaging. Together with our product manager’s release requirements, this made up the first version of the CI/CD pipeline.

PR testing

Nebula Graph is an open source project hosted on GitHub. The first testing question to address is how to quickly validate changes when contributors submit pull requests. The main checks are:

  1. Does the code conform to the coding conventions?
  2. Does it compile on the different supported systems?
  3. Do all unit tests pass?
  4. Has code coverage decreased?

Only when all of the above requirements are met and at least two reviewers have approved will the change be merged into the main branch.

Requirement 1 is easily implemented with open source tools such as cpplint or clang-format. If this check fails, the subsequent steps are automatically skipped.

For requirement 2, we want to compile the Nebula source on all currently supported systems simultaneously. Building directly on a physical machine, as before, is no longer desirable: physical machines are expensive, and one would not be enough anyway. To keep the build environment consistent while minimizing performance loss, we settled on building inside Docker containers. With Actions’ matrix strategy and its native Docker support, the whole process runs smoothly.

The Docker image of Nebula’s build environment is maintained in the vesoft-inc/nebula-dev-docker project. When the compiler or third-party dependencies are upgraded, a Docker Hub build task is automatically triggered (see the figure below). When a new pull request is submitted, the Action pulls the latest build-environment image and runs the build.

The complete workflow description for PRs is in the pull_request.yaml file. Considering that not every submitted PR needs to run the CI tests immediately, and that self-hosted machine resources are limited, the following restrictions are applied to CI triggering:

  1. Only PRs that pass lint verification dispatch subsequent jobs to the self-hosted runners. Lint tasks are lightweight and can run on GitHub-hosted machines without consuming offline resources.
  2. Only PRs carrying the ready-for-testing label trigger the action, and adding the label is permission-controlled. This further limits runners being triggered at random. The label restriction looks like this:
jobs:
  lint:
    name: cpplint
    if: contains(join(toJson(github.event.pull_request.labels.*.name)), 'ready-for-testing')

The effect on a PR looks like this:

Code coverage is described in a separate blog post: Code Change Test Coverage Practices for Nebula Graph.

Nightly builds

With Nebula Graph’s integration testing framework, we want to run all the test cases against the codebase every night. New features should also be packaged quickly and delivered for users to try out. This requires the CI system to produce a Docker image and RPM/DEB installation packages of each day’s code.

Besides pull_request events, GitHub Actions can also be triggered by schedule events. Schedule events, like crontab, let the user specify the trigger time of any repeated task, for example every day at 2 a.m.:

on:
  schedule:
    - cron: '0 18 * * *'

Since GitHub uses UTC, 2 a.m. UTC+8 corresponds to 18:00 UTC on the previous day.

Docker

The daily Docker image needs to be pushed to Docker Hub with the nightly tag. The K8s cluster used for testing sets the image pull policy to Always, so each daily trigger rolls the deployment to the latest image of the day. Since each day’s issues are resolved the same day, the nightly image carries no extra date tag. The corresponding action step looks like this:

      - name: Build image
        env:
          IMAGE_NAME: ${{ secrets.DOCKER_USERNAME }}/nebula-${{ matrix.service }}:nightly
        run: |
          docker build -t ${IMAGE_NAME} -f docker/Dockerfile.${{ matrix.service }} .
          docker push ${IMAGE_NAME}
        shell: bash
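On the Kubernetes side, rolling to the day’s image relies on the Always pull policy mentioned above. A pod spec fragment might look like the sketch below; the container and image names are illustrative, not Nebula’s actual deployment manifests.

```yaml
    spec:
      containers:
        - name: nebula-graphd
          image: vesoft/nebula-graphd:nightly
          # always re-pull, so a rolling restart picks up the day's nightly image
          imagePullPolicy: Always
```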

Package

GitHub Actions provide artifacts, which let users persist data from a workflow run for up to 90 days. That is more than enough for storing nightly installation packages. Using the official actions/upload-artifact@v1 action, files from a given directory can easily be uploaded as artifacts. The final nightly Nebula packages look like this:
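A hedged sketch of such an upload step follows; the artifact name and path are assumptions, not necessarily Nebula’s actual configuration.

```yaml
      - name: Upload nightly packages
        uses: actions/upload-artifact@v1
        with:
          name: ${{ matrix.os }}-nightly
          # hypothetical directory holding the built .rpm/.deb files
          path: build/cpack_output
```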

The complete workflow file is in package.yaml.

Branch releases

To better maintain each release and its bugfixes, Nebula Graph uses release branches. Before each release, the code is frozen and a new release branch is created. The release branch accepts only bugfixes, no feature development. Bugfixes are still committed on the development branch first and then cherry-picked to the release branch.

For each release, besides the source code, we want to attach the installation packages to the release assets for users to download directly. Uploading them manually every time would be error-prone and time-consuming, so this is a natural job for Actions to automate; and since packaging and uploading happen inside GitHub’s network, it is faster too.

After compiling the installation package, use curl to call the GitHub API directly and upload it to the assets, as shown below:

curl --silent \
     --request POST \
     --url "$upload_url?name=$filename" \
     --header "authorization: Bearer $github_token" \
     --header "content-type: $content_type" \
     --data-binary @"$filepath"

For security, we also want to compute a checksum of every released package and upload it to the assets, so users can verify the integrity of what they downloaded. The steps are as follows:

jobs:
  package:
    name: package and upload release assets
    runs-on: ubuntu-latest
    strategy:
      matrix:
        os:
          - ubuntu1604
          - ubuntu1804
          - centos6
          - centos7
    container:
      image: vesoft/nebula-dev:${{ matrix.os }}
    steps:
      - uses: actions/checkout@v1
      - name: package
        run: ./package/package.sh
      - name: vars
        id: vars
        env:
          CPACK_OUTPUT_DIR: build/cpack_output
          SHA_EXT: sha256sum.txt
        run: |
          tag=$(echo ${{ github.ref }} | rev | cut -d/ -f1 | rev)
          cd $CPACK_OUTPUT_DIR
          filename=$(find . -type f \( -iname \*.deb -o -iname \*.rpm \) -exec basename {} \;)
          sha256sum $filename > $filename.$SHA_EXT
          echo "::set-output name=tag::$tag"
          echo "::set-output name=filepath::$CPACK_OUTPUT_DIR/$filename"
          echo "::set-output name=shafilepath::$CPACK_OUTPUT_DIR/$filename.$SHA_EXT"
        shell: bash
      - name: upload release asset
        run: |
          ./ci/scripts/upload-github-release-asset.sh github_token=${{ secrets.GITHUB_TOKEN }} repo=${{ github.repository }} tag=${{ steps.vars.outputs.tag }} filepath=${{ steps.vars.outputs.filepath }}
          ./ci/scripts/upload-github-release-asset.sh github_token=${{ secrets.GITHUB_TOKEN }} repo=${{ github.repository }} tag=${{ steps.vars.outputs.tag }} filepath=${{ steps.vars.outputs.shafilepath }}

The complete workflow file above is in release.yaml.
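The internals of upload-github-release-asset.sh are not shown in this post, so here is only a minimal sketch of what such a helper might do; the key=value argument handling and variable names below are assumptions, not the actual script.

```shell
# Simulate the invocation shown in the workflow above (placeholder values):
args="github_token=t0ken repo=vesoft-inc/nebula tag=v1.0.0 filepath=build/cpack_output/nebula.el7.rpm"

# Turn each key=value argument into a shell variable.
for arg in $args; do
  eval "$arg"
done

filename=$(basename "$filepath")
content_type="application/octet-stream"

# The release's upload_url would then be resolved from the GitHub API
# (https://api.github.com/repos/$repo/releases/tags/$tag) and the asset
# posted with the same curl call shown in the previous section:
#   curl --silent --request POST \
#        --url "$upload_url?name=$filename" \
#        --header "authorization: Bearer $github_token" \
#        --header "content-type: $content_type" \
#        --data-binary @"$filepath"
echo "$tag $filename"
```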

Commands

GitHub Actions provide workflow commands that can be invoked from the shell to fine-tune the execution of each step. The common ones are:

set-output

Sometimes results need to be passed between the steps of a job. In that case, run echo "::set-output name=output_name::output_value" to assign output_value to the variable output_name.

In a subsequent step, the output can be referenced as ${{ steps.<step_id>.outputs.output_name }}.

The asset-upload job in the previous section passes file names this way. A step can set multiple outputs by executing the command several times.
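Putting the two halves together, a minimal sketch of setting and consuming an output looks like this (the step names and tag value are illustrative):

```yaml
      - name: vars
        id: vars
        run: echo "::set-output name=tag::v1.0.0"
      - name: use-output
        # reference the previous step's output via its id
        run: echo "releasing ${{ steps.vars.outputs.tag }}"
```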

set-env

Similar to set-output, this sets an environment variable for subsequent steps. Syntax: echo "::set-env name={name}::{value}".

add-path

Adds a directory to the PATH variable for subsequent steps. Syntax: echo "::add-path::{path}".
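A minimal sketch combining set-env and add-path (the values are illustrative); both take effect only in steps after the one that emits the command:

```yaml
      - name: configure environment
        run: |
          echo "::set-env name=CCACHE_DIR::/tmp/ccache"
          echo "::add-path::$HOME/.local/bin"
      - name: verify
        # CCACHE_DIR and the extended PATH are visible in this later step
        run: echo "$CCACHE_DIR"; echo "$PATH"
```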

Self-Hosted Runner

Besides the official GitHub-hosted runners, Actions can also use your own machines to run jobs. Install the Actions runner on the machine, register it with your project following the tutorial, and enable it by configuring runs-on: self-hosted in the workflow file.

Self-hosted machines can be given different labels, so that tasks can be distributed to specific machines via those labels. For example, if the offline machines run different operating systems, jobs can be routed to specific machines through the runs-on labels; self-hosted itself is one such label.
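For example, a job can be pinned to a runner carrying all of the listed labels; the extra labels below are assumptions about how machines might be tagged, not Nebula’s actual setup:

```yaml
jobs:
  build:
    # runs only on a self-hosted Linux runner that also carries the
    # hypothetical "centos7" label
    runs-on: [self-hosted, linux, centos7]
```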

Security

GitHub officially discourages using self-hosted runners for open source projects, because anyone could attack the runner’s environment by submitting a PR that runs dangerous code on it.

But since compiling Nebula Graph requires a lot of storage space, and GitHub provides only 2-core environments for compilation, we chose to build our own runners anyway. For security, we hardened them in the following ways:

VM deployment

All runners registered with GitHub Actions are deployed on virtual machines, which provides a first layer of isolation from the host and makes it easy to allocate resources to each VM. A well-provisioned host can run multiple VMs, each serving as a runner, to process received jobs in parallel.

If a VM fails, its environment can easily be restored.

Network isolation

All runner VMs are isolated from the office network, so they cannot directly access internal company resources. Even if someone submits malicious code through a PR, they cannot reach the internal network to mount further attacks.

Action selection

When using an action maintained by an individual developer, it is best to review its implementation, to avoid secrets or other private data being leaked online.

Prefer, for example, the actions GitHub itself maintains: github.com/actions.

Secret checks

Secrets referenced in a workflow as ${{ secrets.MY_TOKEN }} are masked in the logs, which prevents users from stealing a key by having a PR print it.

Environment setup and cleanup

With self-hosted runners, it is convenient to share files between different jobs. But don’t forget to clean up the intermediate files after the whole action finishes, otherwise they may affect subsequent runs and keep consuming disk space.

      - name: Cleanup
        if: always()
        run: rm -rf build

Setting the step’s run condition to always() ensures that the step executes on every run, even if something went wrong earlier in the job.

Docker-based parallel matrix builds

Because Nebula Graph requires compilation and validation on different systems, and containers isolate the build environments from each other, the builds use the container approach, which GitHub Actions supports natively.

Actions support running tasks with a matrix strategy, similar to Travis CI’s build matrix. By combining different systems and compilers, it is easy to compile the Nebula source with both GCC and Clang on every system simultaneously, as shown below:

jobs:
  build:
    name: build
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        os:
          - centos6
          - centos7
          - ubuntu1604
          - ubuntu1804
        compiler:
          - gcc-9.2
          - clang-9
        exclude:
          - os: centos7
            compiler: clang-9

The strategy above generates seven parallel tasks (4 OS × 2 compilers, minus one excluded combination), each a (os, compiler) pair. This matrix notation greatly reduces the effort of defining task combinations along different dimensions.

To exclude a combination from the matrix, simply list its values under the exclude option. To access a matrix value inside a task, read it like any other context variable, e.g. ${{ matrix.os }}. These features make it easy to customize your own tasks.
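For instance, a step can report which combination it is running in by reading the matrix context (the step name is illustrative):

```yaml
      - name: report
        run: echo "building on ${{ matrix.os }} with ${{ matrix.compiler }}"
```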

Runtime containers

A container environment can be specified for each job, so that all steps of that job execute inside the container. This is much cleaner than invoking Docker commands in every step.

    container:
      image: vesoft/nebula-dev:${{ matrix.os }}
      env:
        CCACHE_DIR: /tmp/ccache/${{ matrix.os }}-${{ matrix.compiler }}

The container configuration, like a service in docker-compose, accepts parameters such as image/env/ports/volumes/options. On a self-hosted runner, a host directory can easily be mounted into the container for file sharing.
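A hedged sketch of a fuller container configuration follows; the host path and options are assumptions for illustration, not Nebula’s actual setup:

```yaml
    container:
      image: vesoft/nebula-dev:centos7
      env:
        CCACHE_DIR: /tmp/ccache
      volumes:
        # share a hypothetical host directory with the container
        # (useful on a self-hosted runner)
        - /data/ccache:/tmp/ccache
      options: --cpus 4
```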

It is this container feature of Actions that makes the ccache-based build acceleration in Docker, described next, so convenient.

Build acceleration

Nebula Graph’s source is written in C++. Builds take a long time, and rebuilding everything from scratch on each run would waste a lot of compute. Since any runner may execute any job, the cache must precisely check each source file and its compilation to determine whether the file actually changed. We currently use the latest version of ccache for this.

Although GitHub Actions itself provides a cache feature, Nebula Graph’s current unit tests are statically linked, and the compiled artifacts exceed the available quota, so a local caching strategy is used instead.

ccache

ccache is a compiler cache that speeds up recompilation and supports compilers such as GCC and Clang. Nebula Graph uses the C++14 standard; early ccache versions have compatibility issues with it, so ccache is installed manually in all vesoft/nebula-dev images.

Nebula Graph’s CMake configuration automatically detects whether ccache is installed and decides whether to enable it, so you only need to configure ccache in the container environment. For example, in ccache.conf, you can set the maximum cache size to 1 GB:

max_size = 1.0G

The ccache.conf configuration file is best placed in the cache directory itself, where ccache can easily read it.
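The CMake auto-detection mentioned above can be sketched roughly as follows; Nebula’s actual CMake logic may differ:

```cmake
# Enable ccache transparently when it is available on the runner.
find_program(CCACHE_PROGRAM ccache)
if(CCACHE_PROGRAM)
  set(CMAKE_CXX_COMPILER_LAUNCHER "${CCACHE_PROGRAM}")
  set(CMAKE_C_COMPILER_LAUNCHER "${CCACHE_PROGRAM}")
endif()
```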

tmpfs

tmpfs is a temporary file system that lives in memory or a swap partition, which effectively mitigates disk I/O latency. Since the self-hosted hosts have plenty of memory, we changed the mount type of the ccache directory to tmpfs to reduce ccache read/write time. For tmpfs mount types in Docker, refer to the corresponding documentation. The parameters are as follows:

    env:
      CCACHE_DIR: /tmp/ccache/${{ matrix.os }}-${{ matrix.compiler }}
    options: --mount type=tmpfs,destination=/tmp/ccache,tmpfs-size=1073741824 -v /tmp/ccache/${{ matrix.os }}-${{ matrix.compiler }}:/tmp/ccache/${{ matrix.os }}-${{ matrix.compiler }}

All cache files generated by ccache are thus placed in a tmpfs-mounted directory.

Parallel compilation

make itself supports compiling multiple source files in parallel. Passing -j $(nproc) at build time starts as many jobs as there are cores. The Action step is configured as follows:

      - name: Make
        run: cmake --build build/ -j $(nproc)

Pitfalls

With all these positives, are there any downsides? Our main findings:

  1. Only newer systems are supported. Many actions are built on recent Node.js versions and cannot easily run in older Docker containers such as CentOS 6: the Node.js libraries cannot be found and the action fails to start. Because Nebula Graph wants to support CentOS 6, tasks on that system have to be handled specially.

  2. Local verification is hard. Although the community has an open source project, act, it has many limitations; sometimes you simply have to push to your own repository repeatedly to verify that your action changes are correct.

  3. Good guidelines are still lacking. With many custom tasks, it always feels like writing a program in a YAML configuration. There are currently three main approaches:

    1. Split the configuration into one file per task.
    2. Write custom actions that implement the desired functionality via the GitHub SDK.
    3. Write one large shell script that does the work and invoke it from the task.

The community has no consensus on whether to prefer many small composed tasks or a few large ones. Small tasks, however, make it easy to pinpoint where a task failed and to see how long each step took.

  4. Some Action history cannot be cleaned up. If you rename a workflow, the old check run records remain on the Actions page, which hurts the user experience.

  5. There is still no way to manually trigger a job or task the way GitLab CI allows, so no manual intervention is possible in the middle of a run.

  6. Actions themselves keep iterating, so occasional maintenance updates are needed, for example upgrading to checkout@v2.

Overall, GitHub Actions is a rather good CI/CD system; after all, it could draw on plenty of lessons from GitLab CI, Travis CI and others.

Follow-up

Custom actions

Docker released its first action a while ago to simplify Docker-related tasks for users. Going forward, we will also build custom actions for Nebula’s repos to address its complex CI/CD needs. Generic actions will get their own repositories and be published on the actions marketplace, for example one that appends assets to a release; repo-specific actions can live in that repo’s .github/actions directory.

This keeps the YAML in workflows simple, since you just use the custom action, and improves flexibility and extensibility.

IM integration with DingTalk/Slack etc.

Complex action applications can be developed with GitHub’s SDK, and combined with custom DingTalk/Slack bots, many interesting little automations become possible. For example, when a PR has been approved by two or more reviewers and all check runs have passed, a message could be sent to the DingTalk group mentioning specific people to merge the PR, saving everyone from repeatedly checking the status of each PR in the PR list.

Of course, there is plenty more fun to be had building bots around GitHub.

One More Thing…

Nebula Graph 1.0 GA is about to be released. Welcome aboard.

If you spot any errors or omissions in this article, please submit an issue to us on GitHub: github.com/vesoft-inc/… or leave a suggestion in the feedback section of the official forum: discuss.nebula-graph.com.cn/ 👏 To join the Nebula Graph networking group, contact NebulaGraphbot, Nebula Graph’s official assistant account.

Hi, I’m Yee, a Nebula Graph developer with a passion for database query engines. I hope this helps with your Nebula Graph experience.