Author: Huang Kai

This article is original; please credit the author and source when reposting.

With the popularity of DevOps and SRE concepts, more and more developers and operators are abandoning the traditional development and deployment process in favor of the infinite DevOps loop shown below:

As I understand it, DevOps consists of three parts: agile development, continuous integration and delivery (CI/CD), and automated operations (ITSM).

How do we implement DevOps or SRE in the age of containers? Here is a look at the container-orchestration-based DevOps process of the HJenglish.com product line team.

Agile development

Simplicity is the way: hard-won lessons have taught us not to complicate simple things, or in other words, not to do simple things in complicated ways. My definition of agile comes down to "fast" and "micro". Fast means fast iteration, fast development, fast release, and fast performance. Micro means microservices and minimal images. Around these two points, there are a few things we need to do during development:

Split applications into microservices

This is a big topic in its own right and will not be discussed here; if you are interested, you can refer to my other articles. The point is that only a small application can be fast.

Slim down the Docker image

FROM java:8-jre-alpine

# add timezone and default it to Shanghai
RUN apk --update add --no-cache tzdata
ENV TZ=Asia/Shanghai

RUN mkdir -p /app/log
COPY ./target/xxx.jar /app/xxx.jar

EXPOSE 9999
VOLUME ["/app/log"]
WORKDIR /app/

ENTRYPOINT ["java", "-Xms2048m", "-Xmx2048m", "-Xss512k", "-jar", "xxx.jar"]
CMD []

 

The average image generated from the Dockerfile above is just over 80 MB and starts in about 5 seconds. Using an Alpine base image keeps the size down, but it lacks common utilities such as curl, which can be installed as needed. Another issue is the timezone: the Docker image defaults to UTC, which does not match the host's UTC+8 timezone, so the tzdata package must be installed and TZ set to keep the time inside the container consistent with the host. This matters a great deal for database writes and log output.

Include all environment configurations in the image

Back in the virtual machine era, we were already using VM images that bundled dependencies to speed up deployment, so why stop there? We can go one step further and bundle the service itself into the image, which is exactly what Docker does, only in a much lighter way.

Here we also introduce the idea of building images that run on any server with Docker installed, regardless of the host's operating system and environment. To borrow a phrase from Java: build once, run anywhere. Therefore, we put the configuration files for every environment into the image under different file names, and select which one to use via a parameter when the container starts.

It is worth noting that this is easy to achieve if your application is based on the Spring framework; with other languages, a certain amount of extra development work is required.
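For a Spring Boot service, selecting the configuration comes down to passing the active profile as a start-up argument. Below is a minimal sketch of the idea in plain Groovy; the image name and profile value are placeholders, and the Marathon configuration shown later in this article passes the same argument through its args field.

// Placeholders: the image name and the profile to activate (dev / qa / yz / prod).
def imageName = 'dockerhub.xxx.com/xxxx'
def profile   = 'qa'

// The trailing argument is appended to the image's ENTRYPOINT, so Spring Boot loads
// application-qa.yaml from inside the image instead of the default configuration.
String cmd = "docker run --rm ${imageName} --spring.profiles.active=${profile}"
println "starting: " + cmd
def proc = cmd.execute()
proc.consumeProcessOutput(System.out, System.err)   // stream the container's logs
proc.waitFor()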

When all the development work is done, the recommended directory structure looks like this:

├── src
│   ├── main
│   │   ├── java
│   │   ├── resources
│   │   │   ├── application.yaml
│   │   │   ├── application-dev.yaml
│   │   │   ├── application-qa.yaml
│   │   │   ├── application-yz.yaml
│   │   │   ├── application-prod.yaml
│   │   │   ├── logback.xml
│   ├── test
├── scripts
│   ├── Dockerfile
│   ├── initdb.sql
├── pom.xml

Continuous integration and delivery

Automated continuous integration and delivery plays an important role in the overall DevOps flow as the bridge between development and operations. If this link is weak, it cannot support the rapid iteration and efficient operation of a large number of microservices. In this process we need to use tools flexibly and minimize human involvement. And of course, we still keep our focus on "fast" and "micro".

How do we reduce human involvement in continuous integration and delivery? The development process we most wish for is: tell the computer what feature we want, and have it automatically write the code, automatically release to the test environment, automatically run the test scripts, and automatically go live. Of course, the era of automatic coding will have to wait for the invention of a robot cat like Doraemon.

But with enough confidence in your tests, you can commit your code on a lazy afternoon, wander to the break room for a cup of coffee, and come back to find your code already running in production. In the container age we can get very close to this dream, and the detailed steps are as follows:

Gitflow and anti-Gitflow

The first step of continuous integration is the code commit. Version control has evolved from CVS and SVN to today's Git, and with Git, Gitflow has to be mentioned. When people talk about Gitflow, you hear a lot about its advantages: multiple teams, even developers spread across countries, can develop in parallel with a reduced chance of code conflicts or dirty code going live. Its general process is as follows:

Gitflow gives complex teams an elegant solution for managing code versions, using feature, develop, release, hotfix, and master branches to handle parallel development at different stages. But is it really appropriate for a co-located team of fewer than 20 people? Our development team has fewer than 6 people, and each person is responsible for more than 3 microservices; it is almost never possible to assign two or more developers to the same project in parallel.

In the early days, when we followed the standard Gitflow process to the letter, developers immediately found themselves merging code back and forth across at least three branches without ever hitting a code conflict (since each project had a single developer), which hurt development efficiency. This made me realize that Gitflow might not suit the world of small-team microservices, and an anti-Gitflow idea came to mind: slim Gitflow down and simplify it.

We simplified the five branches into three: the master branch only tracks the latest version in production, the dev branch is the main development branch, and all images are built from the code on this branch. The development process then becomes:

  • The developer checks out a new feature branch from the dev branch and develops on that feature branch

  • After development is completed, the feature branch is merged back into the dev branch, an image is built from the dev branch code, and it is deployed to the QA environment for testing

  • If bugs are found during testing, they are fixed on a new branch and step 2 is repeated

  • Once testing passes, the dev branch is merged back into the master branch

In this way, a feature needs only one merge, from the feature branch into dev, which greatly improves development efficiency.

Using Jenkins Pipeline

Jenkins, a veteran CI/CD tool, can help us automate code compilation, static code analysis, image building, test-environment deployment, smoke testing, production deployment, and other steps, especially since Jenkins 2.0 introduced the Pipeline concept. It allows us to go completely unattended through the integration and release process from the third step onward.

To do a good job, one must first sharpen one's tools. First we need to install the following plug-ins on Jenkins:

  1. Pipeline Plugin (installed by default with Jenkins 2.0)

  2. Git

  3. SonarQube Scanner

  4. Docker Pipeline Plugin 

  5. Marathon

If this is your first encounter with Jenkins Pipeline, you can find help at https://github.com/jenkinsci/pipeline-plugin/blob/master/TUTORIAL.md.

Now let's start writing the Groovy code. A Pipeline based on container orchestration is divided into the following steps:

1. Check out the code

This step uses the Git plug-in to check out the code under development.

stage('Check out')
  gitUrl = "[email protected]:xxx.git"
  git branch: "dev", changelog: false, credentialsId: "deploy-key", url: gitUrlCopy the code

2. Build the Java code with Maven

Since we are using the Spring Boot framework, the result should be an executable JAR package.

stage('Build')
  sh "${mvnHome}/bin/mvn -U clean install"Copy the code

3. Static code analysis

With the SonarQube Scanner plug-in, we can tell SonarQube to perform a static scan of the code base.

stage('SonarQube analysis')
  // requires SonarQube Scanner 2.8+
  def scannerHome = tool 'sonarqube.scanner-2.8';
  withSonarQubeEnv('SonarQube-Prod') {
    sh "${scannerHome}/bin/sonar-scanner -e -Dsonar.links.scm=${gitUrl} -Dsonar.sources=. -Dsonar.test.exclusions=file:**/src/test/java/** -Dsonar.exclusions=file:**/src/test/java/** -Dsonar.language=java -Dsonar.projectVersion=1.${BUILD_NUMBER} -Dsonar.projectKey=lms-barrages -Dsonar.projectDescription=0000000-00000 -Dsonar.java.source=8 -Dsonar.projectName=xxx"
  }

4. Make the Docker image

This step calls the Docker Pipeline plug-in to package the JAR, the configuration files, and the third-party dependencies into a Docker image according to the pre-written Dockerfile, and uploads it to our private Docker registry.

stage('Build image')
  docker.withRegistry('https://dockerhub.xxx.com', 'dockerhub-credentials') {
    docker.build('dockerhub.xxx.com/xxxx').push('test')   // test is the tag name
  }

5. Deploy to the test environment

Using the Marathon plug-in and a pre-written deployment file, notify the Marathon cluster to deploy the newly built image to the test environment.

stage('Deploy on Test')
    sh "mkdir -pv deploy"
    dir("./deploy") {
        git branch: 'dev', changelog: false, credentialsId: 'deploy-key', url: '[email protected]:lms/xxx-deploy.git'
        //Get the right marathon url
        marathon_url="http://marathon-qa"
        marathon docker: imageName, dockerForcePull: true, forceUpdate: true, url: marathon_url, filename: "qa-deploy.json"
    }

6. Automated testing

Run automated test scripts written in advance to verify that the program is running properly.

git branch: 'dev', changelog: false, credentialsId: 'deploy-key', url: '[email protected]:lms/xxx-test.git'
parallel(autoTests: {
    sh "docker run -it --rm -v $PWD:/code nosetests nosetests -s -v -c conf\run\api_test.cfg --attr safeControl=1"
}, manualTests: {
    sleep 30000
})

7. Manual testing

If you are not fully confident in the automated tests, you can choose to stop the Pipeline here and perform manual testing. To keep the walkthrough simple, we skip manual testing here.

8. Deploy to the production environment

When all tests pass and the confirmation is given, the Pipeline releases to the production environment.

stage('Deploy on Prod')
    input "Do tests OK?"
    dir("./deploy") {
        //Get the right marathon url
        marathon_url="http://marathon-prod"
        marathon docker: imageName, dockerForcePull: true, forceUpdate: true, url: marathon_url, filename: "prod-deploy.json"
    }

Finally, let’s look at the whole Pipeline process:
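Putting the stages together, a condensed scripted-Pipeline sketch might look like the following. The agent label, Maven tool name, and registry credentials ID are placeholders, and the sonar-scanner flags are abbreviated; see the individual steps above for the full commands.

node('docker-qa') {                                           // placeholder agent label
    def gitUrl    = "[email protected]:xxx.git"
    def imageName = "dockerhub.xxx.com/xxxx"
    def mvnHome   = tool 'maven-3.3'                          // placeholder Maven tool name

    stage('Check out')                                        // step 1
    git branch: 'dev', changelog: false, credentialsId: 'deploy-key', url: gitUrl

    stage('Build')                                            // step 2
    sh "${mvnHome}/bin/mvn -U clean install"

    stage('SonarQube analysis')                               // step 3 (flags abbreviated)
    def scannerHome = tool 'sonarqube.scanner-2.8'
    withSonarQubeEnv('SonarQube-Prod') {
        sh "${scannerHome}/bin/sonar-scanner -Dsonar.projectKey=xxx -Dsonar.projectName=xxx -Dsonar.sources=."
    }

    stage('Build image')                                      // step 4
    docker.withRegistry('https://dockerhub.xxx.com', 'dockerhub-credentials') {
        docker.build(imageName).push('test')
    }

    stage('Deploy on Test')                                   // step 5
    sh "mkdir -pv deploy"
    dir("./deploy") {
        git branch: 'dev', changelog: false, credentialsId: 'deploy-key', url: '[email protected]:lms/xxx-deploy.git'
        marathon docker: imageName, dockerForcePull: true, forceUpdate: true, url: "http://marathon-qa", filename: "qa-deploy.json"
    }

    stage('Test')                                             // steps 6 and 7
    parallel(autoTests: {
        echo "run the automated test suite here (step 6)"
    }, manualTests: {
        sleep 300                                             // window for optional manual checks
    })

    stage('Deploy on Prod')                                   // step 8
    input "Do tests OK?"
    dir("./deploy") {
        marathon docker: imageName, dockerForcePull: true, forceUpdate: true, url: "http://marathon-prod", filename: "prod-deploy.json"
    }
}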

Documenting the container orchestration configuration

When introducing agile development, I mentioned deploying to different environments based on environment-specific configuration parameters. How do we tell the deployment tooling which configuration file to start the service with, and how much CPU, how much memory, and how many instances each environment should get?

Let's look at the container orchestration configuration file. Since we use Mesos + Marathon for container orchestration, the deployment work shifts from writing a deployment script to writing a Marathon configuration, which reads as follows:

{" id ":"/appName ", "cpus" : 2, the "mem" : 2048.0, "instances" : 2, "args" : [" -- spring. Profiles. The active = qa "], "labels" : { "HAPROXY_GROUP": "external", "HAPROXY_0_VHOST": "xxx.hujiang.com" }, "container": { "type": "DOCKER", "docker": { "image": "imageName", "network": "USER", "forcePullImage": true, "portMappings": [ { "containerPort": 12345, "hostPort": 0, "protocol": "tcp", "servicePort": 12345 } ] }, "volumes": [ { "containerPath": "/app/log", "hostPath": "/home/logs/appName", "mode": "RW" } ] }, "ipAddress": { "networkName": "calico-net" }, "healthChecks": [ { "gracePeriodSeconds": 300, "ignoreHttp1xx": true, "intervalSeconds": 20, "maxConsecutiveFailures": 3, "path": "/health_check", "portIndex": 0, "protocol": "HTTP", "timeoutSeconds": 20 } ], "uris": [ "file:///etc/docker.tar.gz" ] }Copy the code

We save such configurations in separate JSON files, one set per environment, for example marathon-qa.json and marathon-prod.json. When the Pipeline deploys, the Jenkins Marathon plug-in picks up the deployment configuration for the selected environment, which gives us fully automatic deployment.

Separating the automated process from the production rollout

Now that development and deployment are this simple and fast, should everyone be allowed to use the whole flow? The answer is no, not because it is technically difficult, but because of access control. In an ideal world, this process would let you commit your code and, within the time it takes to drink a cup of coffee, see it being used by tens of millions of people.

However, the risk is too high; not everyone is a Rambo who never takes a hit, and in most cases we need the constraints of standards and processes. Just as automated testing cannot replace all manual black-box testing, a build deployed to the test environment cannot go straight into production; once testing passes, a manual confirmation and deployment step is still required.

Therefore, we split the automated process and the final deployment into two jobs, grant separate permissions on the latter, and let only authorized people perform the final deployment. Depending on the company's organizational structure, this person can be the team leader, the development manager, or an operations colleague.

So what does this deployment job actually do? In the container orchestration era, following the idea that the image is the build, the deployment job does not start from compiling code; instead it takes an image version that has been fully tested and approved and deploys it to the production environment via the Marathon plug-in. Here is an example of such a Deploy_only job:

node('docker-qa') {
    if (ReleaseVersion == "") {
        echo "ReleaseVersion is empty"
        return
    }

    stage "Prepare image"
    def moduleName = "${ApplicationModule}".toLowerCase()
    def resDockerImage = imageName + ":latest"
    def desDockerImage = imageName + ":${ReleaseVersion}"
    if (GenDockerVersion == "true") {
        sh "docker pull ${resDockerImage}"
        sh "docker tag ${resDockerImage} ${desDockerImage}"
        sh "docker push ${desDockerImage}"
        sh "docker rmi -f ${resDockerImage} ${desDockerImage}"
    }

    stage "Deploy on Mesos"
    git branch: 'dev', changelog: false, credentialsId: 'deploy-key', url: '[email protected]:lms/xxx-test.git'
    //Get the right marathon url
    echo "DeployDC: " + DeployDC
    marathon_url = ""
    if (DeployDC == "AA") {
        if (DeployEnv == "prod") {
            input "Are you sure to deploy to production?"
            marathon_url = "${marathon_AA_prod}"
        } else if (DeployEnv == "yz") {
            marathon_url = "${marathon_AA_yz}"
        }
    } else if ("${DeployDC}" == "BB") {
        if ("${DeployEnv}" == "prod") {
            input "Are you sure to deploy to production?"
            marathon_url = "${marathon_BB_prod}"
        } else if ("${DeployEnv}" == "yz") {
            marathon_url = "${marathon_BB_yz}"
        }
    }
    marathon docker: imageName, dockerForcePull: true, forceUpdate: true, url: marathon_url, filename: "${DeployEnv}-deploy.json"
}

Why not keep this file under the scripts directory of the application project? Because separating deployment from the application allows two different groups of people to maintain them, which again reflects the company's organizational structure.

Automated operation and maintenance

Container monitoring

Containers can be monitored in either of two ways: install an additional service on the physical server to monitor all the containers running on it, or read container state through the built-in APIs of Mesos or Kubernetes. In either case, monitoring software or an agent has to be installed on the physical machine.

Our team currently uses the combined suite of cAdvisor+influxDB+Grafana to monitor containers.

cAdvisor first needs to be installed on every agent in the Mesos cluster. It is responsible for shipping data about all the containers running on the host to the time-series database (InfluxDB) as data points, covering metrics such as per-container CPU, memory, network, and filesystem usage.
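To make that concrete, here is a minimal sketch of launching cAdvisor on one agent and pointing it at InfluxDB, wrapped in a Jenkins sh step only to stay consistent with the Pipeline snippets above; the agent label, image tag, InfluxDB host, and database name are placeholders rather than our actual values.

node('mesos-agent') {                      // placeholder label for one Mesos agent
    stage('Install cAdvisor')
    // Mount the host paths cAdvisor needs and ship its metrics to InfluxDB.
    sh """
    docker run -d --name=cadvisor \\
      --volume=/:/rootfs:ro \\
      --volume=/var/run:/var/run:rw \\
      --volume=/sys:/sys:ro \\
      --volume=/var/lib/docker/:/var/lib/docker:ro \\
      --publish=8080:8080 \\
      google/cadvisor:latest \\
      -storage_driver=influxdb \\
      -storage_driver_db=cadvisor \\
      -storage_driver_host=influxdb.xxx.com:8086
    """
}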

Grafana then collates these data points and displays them on dashboards, so we can see how a specific container is performing. Here is a Grafana screenshot:

Besides monitoring the containers themselves, monitoring the hosts is also essential. There are too many monitoring points to cover here, so I will not go through examples.

Autoscaling

The autoscaling flow is roughly as follows:

  • The application services that need to be watched are registered with the AutoScaler program through a RESTful interface.

  • The AutoScaler program starts reading the metrics of the applications deployed on each agent, including CPU and memory usage.

  • When an application is found to be too busy, usually in the form of excessive CPU or memory usage, the AutoScaler calls the Marathon API (a simplified sketch of this call follows after this list).

  • Marathon receives the request and immediately tells the Mesos cluster to launch additional instances of the application, relieving the current load.
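Our AutoScaler is an in-house program, so the following is only a simplified Groovy sketch of the scale-out call described above; the Marathon endpoint, application id, CPU threshold, and instance count are placeholder values, and in reality the CPU figure would come from the metrics collected by cAdvisor.

import groovy.json.JsonOutput

// Scale an application by PUTting a new instance count to Marathon's REST API.
def scaleTo(String marathonUrl, String appId, int instances) {
    def conn = new URL("${marathonUrl}/v2/apps${appId}").openConnection()
    conn.requestMethod = 'PUT'
    conn.doOutput = true
    conn.setRequestProperty('Content-Type', 'application/json')
    conn.outputStream.write(JsonOutput.toJson([instances: instances]).bytes)
    println "Marathon answered HTTP ${conn.responseCode}"
}

// Pretend the metrics store reported 85% average CPU for this application's containers.
def cpuUsage = 0.85
if (cpuUsage > 0.8) {                      // placeholder "too busy" threshold
    scaleTo('http://marathon-prod', '/appName', 3)
}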

Conclusion

DevOps and SRE are not unreachable concepts; they need to be grounded in each team's own environment. Our entire DevOps process is built on container orchestration in order to simplify the workflow and automate CI/CD and operations. There are surely things we have not considered, and the approach may not suit more complex scenarios. In addition, the examples in this article have been sanitized for privacy and cannot be used verbatim. I hope that our practice, both the successes and the problems, helps you refine a DevOps process of your own.

References

Microservice Design

True Architecture

https://datasift.github.io/gitflow/IntroducingGitFlow.html

http://mesos.apache.org/

https://github.com/jenkinsci/pipeline-plugin/blob/master/TUTORIAL.md

https://github.com/google/cadvisor

https://grafana.com/
