Over the last two weeks we optimized our continuous deployment pipeline and achieved remarkable results, so I am writing them down here to share.

Background

In that year, the company grew rapidly and launched new projects frequently. Every launch meant applying for a new batch of machines, initializing them, and deploying the dependent service environment.

In that year, projects were in full swing: project A's traffic surged, so we rushed to add machines for A; project B shipped new features, so we rushed to get B online.

Working overtime day and night, I was constantly run off my feet.

That year, I learned that Docker could save me, so I decided to fight for glory (and my hairline).

In order to land quickly and minimize the impact of introducing Docker on the whole CI/CD process, Docker was added to our release pipeline with minimal changes. The process changes are shown in the figure below.

In that year, container orchestration was still in chaos and K8s was not yet popular. Given our limited time, energy, and technical strength, we did not dare rush orchestration into production; Docker simply ran on the existing hosts, mainly to solve environment deployment and scaling out and in. Once live, Docker did solve those two problems, with extra perks such as guaranteed consistency between the development and online environments.

But Docker is not all upside. Packaging the code into an image and updating containers lengthened release time, and because configuration files differ between environments, we could not fully achieve building one package and sharing it across environments. This article mainly describes how we optimized these two problems.

Python multithreading

After analyzing the deployment logs, we found that most of the deployment time was spent on two steps: downloading images and restarting containers.
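Purely as an illustrative sketch of how such per-phase timings can be collected (the timed helper and the label below are made up for this example, not our actual deployer code):

import time


def timed(label, func, *args, **kwargs):
    # Run one deployment phase and report how long it took
    start = time.time()
    result = func(*args, **kwargs)
    print('[timing] %s: %.1fs' % (label, time.time() - start))
    return result

# e.g. timed('docker pull', conn.cmd, 'docker pull ops-coffee:latest')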

The whole deployer is developed in Python, and the core idea is to use the Paramiko module to execute SSH commands remotely. Before Docker was introduced, a release meant syncing the code with rsync and doing a single-threaded rolling restart of the service. After Docker landed, the deployer's logic barely changed: syncing code and restarting the service was simply replaced by downloading the image and restarting the container. The code looks like this:

import os
import paramiko

# paramiko.util.log_to_file("/tmp/paramiko.log")
filepath = os.path.split(os.path.realpath(__file__))[0]


class Conn:
    def __init__(self, ip, port=22, username='ops'):
        self.ip = ip
        self.port = int(port)
        self.username = username
        self.pkey = paramiko.RSAKey.from_private_key_file(
            filepath + '/ssh_private.key'
        )

    def cmd(self, cmd):
        ssh = paramiko.SSHClient()

        try:
            ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            ssh.connect(self.ip, self.port, self.username, pkey=self.pkey, timeout=5)
        except Exception as err:
            data = {"state": 0."message": str(err)}
        else:
            try:
                stdin, stdout, stderr = ssh.exec_command(cmd, timeout=180)
                _err_list = stderr.readlines()

                if len(_err_list) > 0:
                    data = {"state": 0."message": _err_list}
                else:
                    data = {"state": 1, "message": stdout.readlines()}
            except Exception as err:
                data = {"state": 0."message": '%s: %s' % (self.ip, str(err))}
        finally:
            ssh.close()

        return data


if __name__ == '__main__':
    # Demo code is much simplified, the overall logic remains the same

    hostlist = ['10.82.9.47', '10.82.9.48']
    image_url = 'ops-coffee:latest'

    for i in hostlist:
        print(Conn(i).cmd('docker pull %s' % image_url))
        # Update the container after the image has been downloaded

Everything ran in a single thread, which is obviously not very efficient, so why not use multiple threads? Mainly for service availability: one server finishes updating before the next one starts, until every server is done. A single-threaded rolling update keeps the service available to the greatest extent, whereas updating all servers at the same time would leave nothing able to serve requests during the restart and risk an outage. Projects were small back then, so we tolerated the extra time; as they grew larger and larger, this trade-off had to be rethought.

Introducing multithreading was imperative, but how should it be used? With the overall availability of the service in mind, we split the release into two operations: downloading the image and restarting the container. Downloading the image does not affect serving traffic at all, so it can be fully multithreaded, greatly shortening the download phase. The optimized code is as follows:

import threading
# Conn is the class defined in the previous example

class DownloadThread(threading.Thread):

    def __init__(self, host, image_url):
        threading.Thread.__init__(self)
        self.host = host
        self.image_url = image_url

    def run(self):
        Conn(self.host).cmd('docker login -u ops -p coffee hub.ops-coffee.cn')
        r2 = Conn(self.host).cmd('docker pull %s' % self.image_url)
        if r2.get('state'):
            self.alive_host = self.host
            print('---->%s image download completed ' % self.host)
        else:
            self.alive_host = None
            print('---->%s image download failed, details: %s' % (self.host, r2.get('message')))

    def get_result(self):
        return self.alive_host


if __name__ == '__main__':
    # Demo code is much simplified, the overall logic remains the same

    hostlist = ['10.82.9.47', '10.82.9.48']
    image_url = 'ops-coffee:latest'
    
    threads = []
    for host in hostlist:
        t = DownloadThread(host, image_url)
        threads.append(t)

    for t in threads:
        t.start()

    for t in threads:
        t.join()

    alive_host = []
    for t in threads:
        # Keep only hosts whose image download succeeded
        if t.get_result():
            alive_host.append(t.get_result())
    # Multithreaded image download finished

    print('----> This project has %d hosts in total, %d downloaded the image successfully' % (len(hostlist), len(alive_host)))

Restarting containers cannot simply be multithreaded the same way: as mentioned above, restarting everything at once risks taking the service down. All online servers are redundant, so although we cannot restart them all simultaneously, we can restart them in batches. After analyzing the traffic we settled on a simple rule: if a project has fewer than 8 hosts, do a single-threaded rolling restart, which does not take too long anyway; if it has more than 8, use the number of hosts divided by 8, rounded up, as the number of concurrent threads. A 28-host project, for example, restarts ceil(28 / 8) = 4 containers at a time, so at any moment roughly 80% or more of the project's hosts keep serving traffic, reducing the risk of unavailability. The optimized code is as follows:

import threading
from math import ceil
# Conn is the class defined in the previous example

class DeployThread(threading.Thread):
    def __init__(self, thread_max_num, host, project_name, environment_name, image_url):
        threading.Thread.__init__(self)
        self.thread_max_num = thread_max_num
        self.host = host
        self.project_name = project_name
        self.environment_name = environment_name
        self.image_url = image_url

    def run(self):
        self.smile_host = []
        with self.thread_max_num:
            Conn(self.host).cmd('docker stop %s && docker rm %s' % (self.project_name, self.project_name))

            r5 = Conn(self.host).cmd(
                'docker run -d --env ENVT=%s --env PROJ=%s --restart=always --name=%s -p 80:80 %s' % (
                    self.environment_name, self.project_name, self.project_name, self.image_url)
            )
            
            if r5.get('state'):
                self.smile_host.append(self.host)
                print('---->%s image update completed' % self.host)
            else:
                print('---->%s server failed to execute docker run,details:%s' % (self.host, r5.get('message')))
                
            # Health check after the restart and rollback-on-failure code omitted

    def get_result(self):
        return self.smile_host


if __name__ == '__main__':
    # Demo code is much simplified, the overall logic remains the same

    alive_host = ['10.82.9.47', '10.82.9.48']
    image_url = 'ops-coffee:latest'
    
    project_name = 'coffee'
    environment_name = 'prod'
    
    # len(alive_host) / 8, rounded up, is the maximum number of concurrent threads
    thread_max_num = threading.Semaphore(ceil(len(alive_host) / 8))

    threads = []
    for host in alive_host:
        t = DeployThread(thread_max_num, host, project_name, environment_name, image_url)
        threads.append(t)

    for t in threads:
        t.start()

    for t in threads:
        t.join()

    smile_host = []
    for t in threads:
        # get_result() returns a list that is empty if the update failed
        smile_host.extend(t.get_result())

    print('---->%d hosts updated successfully' % len(smile_host))

After the above optimizations, a project with 28 hosts that used to take about 10 minutes to release now takes only about 2 minutes, an 80% improvement in efficiency.

Configuration file processing in multiple environments

We package the project code into the image. Because the develop, test, staging, and production environments each have different configuration files, even the same project had to go through a separate build-and-release packaging pass per environment, baking each environment's configuration into a different image. This was cumbersome and unnecessary, and it also significantly increased our release time.

Each container can define configuration that is mounted automatically when the container starts, which solves reusing one image across environments. But where does that configuration come from, and how is it kept current? A configuration center is essential; the earlier article "Details of landing configuration center for small and medium-sized teams" describes our configuration center scheme in detail.

The overall idea for handling the differing configurations: two environment variables, ENVT and PROJ, are passed in when the container starts, identifying which project and which environment this container belongs to. The container's startup script reads these two variables and uses the confd service to fetch the corresponding configuration from the configuration center and write it to the correct local location, so configuration files no longer need to be packaged into the image.

Take a purely static project that only needs Nginx as an example.

The Dockerfile is as follows:

FROM nginx:base

COPY conf/run.sh     /run.sh
COPY webapp /home/project/webapp

CMD ["/run.sh"]

The run.sh script is as follows:

#!/bin/bash
/etc/init.d/nginx start && \
sed -i "s|/project/env/|/${PROJ}/${ENVT}/|g" /etc/confd/conf.d/conf.toml && \
sed -i "s|/project/env/|/${PROJ}/${ENVT}/|g"The/etc/confd/templates/conf. TMPL && \ confd - watch - backend etcd - node = http://192.168.107.101:2379 - node = http://192.168.107.102:2379 | | \exit 1
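run.sh rewrites the key prefix in confd's template resource and template, then starts confd in watch mode against etcd, so changes in the configuration center propagate into the container automatically. The two confd files themselves are not shown above; as a minimal sketch, with a destination path, key name, and reload command that are illustrative assumptions rather than our actual configuration, they might look like this:

# /etc/confd/conf.d/conf.toml -- illustrative sketch
[template]
src = "conf.tmpl"
dest = "/etc/nginx/conf.d/app.conf"
keys = ["/project/env/nginx"]
reload_cmd = "/etc/init.d/nginx reload"

# /etc/confd/templates/conf.tmpl -- illustrative sketch
{{getv "/project/env/nginx"}}

After the sed rewrites in run.sh, the key prefix /project/env/ becomes /${PROJ}/${ENVT}/, so one image fetches the right configuration for whichever project and environment it is started under.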

Docker startup command:

'docker run -d --env ENVT=%s --env PROJ=%s --restart=always --name=%s -p 80:80 %s' % (
    self.environment_name, self.project_name, self.project_name, self.image_url)

An image now needs to be packaged only once and is shared by all environments; a release no longer goes through compiling and packaging again, it only needs to update the image and restart the container, which significantly improves efficiency.

Final words

  1. A container without orchestration has no soul; continuing to advance our use of orchestration tools will be a major focus in 2019
  2. In fact, once the Docker migration had stabilized, we deployed a K8s cluster for our internal development and test environments, and it has now been running stably for more than a year
  3. Production runs across multiple clouds; some online projects already use K8s-based container orchestration, while others still run the plain Docker setup described above

If you found this article helpful, please share it with more people. If you would like to read more, check out:

  • Details of landing configuration center for small and medium-sized teams
  • Varian: Elegant release deployer