Two years ago, I wrote an article called “The Devops Tools We’ve Developed” that introduced some of our in-house DevOps tools. Two years on, I want to review how these tools have progressed and what is new.

CMDB

The CMDB (configuration management database) is the foundation on which the operations platform is built. Almost every other operations tool depends on it for basic data, so its stability is critical. Stability here means not only that the system runs reliably but also that its data structures and functions stay stable: once a data structure changes, the upstream systems may have to change with it. So when planning CMDB iterations, compatibility comes first. Existing functions should be modified as little as possible, and changes should mostly take the form of new functions.

With that in mind, the CMDB's overall functionality has not changed much in the last two years. The main functional change is that host monitoring has been integrated: you can search for a host in the CMDB and view its monitoring directly, without going to the monitoring system to look it up. Ease of use has also improved, for example by removing the directory tree, enriching the displayed information, and optimizing search.

nova

The Nova system is mainly used for continuous deployment to production and was introduced in the earlier article. Our environment is fairly complex: we use public cloud, private cloud, cloud hosts, physical machines, Kubernetes, and more, spread across different data centers. For ease of maintenance, we split the production deployment capability out into the dedicated Nova system.

Continuous integration was originally implemented by the Varian system, but we have now completely abandoned Varian and replaced it with a more powerful custom task engine, Probius, which is described below.

The rollback function is now also integrated into Nova. Rollback is based on Docker images: when you select the environment to roll back, Nova queries Dockerhub for the corresponding project and environment and pulls the matching image list.
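The selection step of an image-based rollback can be sketched as follows. This is only an illustration, not Nova's actual code: the function name and the tag format `<environment>-<YYYYMMDDHHMMSS>` are assumptions made for the example, and the tag list is passed in rather than fetched from a registry.

```python
from typing import List


def rollback_candidates(tags: List[str], environment: str, current: str) -> List[str]:
    """Return the tags of `environment` that are older than `current`, newest first.

    Assumption: tags look like "<environment>-<YYYYMMDDHHMMSS>", so comparing
    the timestamp part as a string orders the images chronologically.
    """
    prefix = environment + "-"
    current_stamp = current[len(prefix):]
    # Keep only images built for the selected environment.
    stamps = [tag[len(prefix):] for tag in tags if tag.startswith(prefix)]
    # Only images older than the running one are rollback targets.
    older = sorted((s for s in stamps if s < current_stamp), reverse=True)
    return [prefix + stamp for stamp in older]
```

Given the tags of one project, calling `rollback_candidates(tags, "prod", "prod-20200110080000")` would list the earlier production images, most recent first, which is the kind of list a user could pick a rollback target from.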

Nova has also gained a batch task execution function, which is based on Ansible. For details, see the previous article: Django+Ansible Construction task center. With batch tasks, it is easy to investigate exceptions across multiple nodes and to run non-deployment tasks.

kerrigan

As an O&M-friendly configuration management tool, Kerrigan plays a crucial role in the operation of every project. It currently manages hundreds of configuration files of different types and has safely carried tens of thousands of configuration changes. Over the past two years it has been optimized and upgraded in two main areas.

confd now runs in watch mode, so configuration changes take effect immediately. This mainly concerns the management of nginx configuration files. Previously, a configuration change took effect only after redeploying the project or restarting the confd service, which was tedious. We therefore switched confd to watch mode, so the configuration is updated as soon as it is modified. One risk with this mode: what happens if a configuration change is wrong? There are two kinds of mistakes, syntax errors and rule errors. For the first, confd checks the configuration file for syntax errors before updating it and does not apply the update if the check fails, which prevents a syntax error from breaking the update. For the second, a rule that is itself written incorrectly can only be caught by careful checking before the update. Configuration comparison, fast rollback, and related features are covered in this article: Kerrigan: Configuration center management UI implementation ideas and technical details
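The "check before update" behaviour can be illustrated with a small Python sketch. It is not confd's implementation. In confd itself this is normally expressed with a template resource's `check_cmd`, which runs against the staged file via `{{.src}}`. Here the function name and the `{src}` placeholder are assumptions made for the example.

```python
import os
import subprocess
import tempfile
from typing import Sequence


def apply_if_valid(rendered: str, dest: str, check_cmd: Sequence[str]) -> bool:
    """Replace `dest` with `rendered` only if the check command accepts it.

    `check_cmd` is an argument list; every "{src}" in it is replaced with the
    path of the staged candidate file (a hypothetical placeholder convention).
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Stage the candidate next to the destination so the final swap is a rename.
    fd, candidate = tempfile.mkstemp(dir=dest_dir, suffix=".candidate")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(rendered)
        cmd = [arg.replace("{src}", candidate) for arg in check_cmd]
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            return False  # check failed: the live file is left untouched
        os.replace(candidate, dest)  # atomic swap on the same filesystem
        return True
    finally:
        # Remove the staged file if it was not moved into place.
        if os.path.exists(candidate):
            os.remove(candidate)
```

For nginx, the check command would typically be something like `["nginx", "-t", "-c", "{src}"]`. Note that a check of this kind only catches syntax errors, and a syntactically valid but wrong rule still passes, which is why review and fast rollback remain necessary.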

A complete API is provided. Kerrigan manages not only service configuration such as nginx but also files such as Dockerfiles. Dockerfiles are needed in continuous integration, so Kerrigan provides APIs for fetching configuration files over HTTP. To make these files easy to use in the Probius system, there is a dedicated Python command that fetches a configuration file from Kerrigan, so a single command is all it takes.
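A client for fetching a file over HTTP might look roughly like the sketch below. The URL layout (`/api/config/<project>/<environment>/<filename>`), the function names, and the host are all hypothetical, since the real Kerrigan API is not described here.

```python
import urllib.parse
import urllib.request


def config_url(base_url: str, project: str, environment: str, filename: str) -> str:
    """Build the URL of a configuration file (the path layout is an assumption)."""
    parts = [urllib.parse.quote(part, safe="") for part in (project, environment, filename)]
    return base_url.rstrip("/") + "/api/config/" + "/".join(parts)


def fetch_config(url: str, opener=urllib.request.urlopen, timeout: float = 10) -> str:
    """Download a configuration file and return its content as text.

    `opener` defaults to urllib's urlopen; it is a parameter so that the
    function can be exercised without a real server.
    """
    with opener(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A CI step could then call `fetch_config(config_url("http://kerrigan.example", "shop", "test", "Dockerfile"))` and write the result to the build directory before running the image build.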

overmind

At the beginning, Overmind was just a SQL audit platform, but it has since become a one-stop DB management system. Work orders, adding DB information, password management, permission management, DB query and audit, DB execution audit, and a series of other database-related operations can all be completed in Overmind.

proxy

The proxy system has become much easier to use. First, you now select a protocol when creating an instance. In the previous version, supporting both HTTP and HTTPS required creating two instances. Now a single instance can support both. You can also configure whether both protocols are accessible at the same time or whether HTTP is forcibly redirected to HTTPS.

Another change is that you can toggle logging for an instance, and when it is enabled you can also watch the logs on the page in real time. This is very useful in some troubleshooting scenarios.

For a detailed introduction to proxy, you can see this article: Proxy: a simple, compact and powerful proxy system

wiki

The wiki's overall functionality has not changed much; only search has been strengthened. The homepage is still a directory tree, and inner pages show only content: focused, with no fancy features. As the number of directories keeps growing, though, a future refactor would consider adding the concepts of spaces and teams, because separating one section from another is becoming quite necessary.

Of the DevOps tools introduced in the earlier article, only Varian has been completely abandoned. The other systems are all in normal use and still being iterated on, and they remain very much alive. Beyond these systems, several new tools have been developed in the past two years.

alodi

The Alodi system is mainly used to quickly generate temporary environments and to manage the whole life cycle of those environments in one place. It is mainly used when multiple versions of the same project are being developed and tested at the same time, or for tests that cannot go through the regular test environment. With Alodi you can quickly create a running environment for a project and access it through a randomly generated temporary domain name.
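Generating a random temporary domain can be as simple as the sketch below. The naming scheme (`<project>-<token>.<base domain>`) and the function name are assumptions for illustration and are not taken from Alodi.

```python
import secrets
import string

# Lowercase letters and digits are always valid inside a DNS label.
ALPHABET = string.ascii_lowercase + string.digits


def temp_domain(project: str, base_domain: str, length: int = 8) -> str:
    """Return a hard-to-guess host name for a temporary environment."""
    token = "".join(secrets.choice(ALPHABET) for _ in range(length))
    label = f"{project}-{token}"
    if len(label) > 63:  # a single DNS label may not exceed 63 characters
        raise ValueError("project name too long for a DNS label")
    return f"{label}.{base_domain}"
```

Using `secrets` rather than `random` keeps the names unpredictable, which matters if the temporary environments are reachable by anyone who knows the host name.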

Alodi is based on Kubernetes. Deployment process logs, container logs, and the container terminal can all be viewed within Alodi, without jumping to other systems. It also supports binding custom domain names for special test environments, and destroying all created resources with one click.

More on Alodi can be found in this article: Alodi: Creating environments has never been easier

webssh

WebSSH provides the bastion host function. When you connect to a remote host through WebSSH, the session information is recorded and the operations are captured as a recording for later audit. You can also watch other users' operations in real time and extract the commands they ran. For ease of use, a grouping function has been added through a directory tree.

probius

Probius, a custom task engine, is the most important system built in the past two years. It has powerful and flexible task scheduling capabilities. It was originally intended as a replacement for Varian, and it has now not only replaced Varian completely but also greatly improved on it in usability and functionality. Dozens to hundreds of continuous integration tasks run through Probius every day, and Kubernetes and Prometheus have also been integrated into it.

The three core concepts of Probius are command, template, and task. A command is the smallest unit in the system; it can be a specific Linux command or an executable script. A template is a combination of commands, and a task contains templates and parameters. Built on this idea, Probius can handle almost any job with ease, from daily inspections to production releases.
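The relationship between the three concepts can be modelled in a few lines. This is a minimal sketch of the idea only: the class and field names, and the use of `$name` placeholders for parameters, are assumptions rather than Probius's real data model.

```python
from dataclasses import dataclass, field
from string import Template as StringTemplate
from typing import Dict, List


@dataclass
class Command:
    """Smallest unit: one shell line, optionally with $name placeholders."""
    name: str
    script: str


@dataclass
class Template:
    """An ordered combination of commands."""
    name: str
    commands: List[Command]


@dataclass
class Task:
    """Templates plus the parameters used to fill in their commands."""
    name: str
    templates: List[Template]
    params: Dict[str, str] = field(default_factory=dict)

    def render(self) -> List[str]:
        """Return the concrete shell lines, in execution order."""
        lines = []
        for template in self.templates:
            for command in template.commands:
                # substitute() raises KeyError if a parameter is missing.
                lines.append(StringTemplate(command.script).substitute(self.params))
        return lines
```

For example, a template holding the commands `docker build -t $image .` and `docker push $image`, wrapped in a task with the parameter `image`, renders to two concrete shell lines. Reusing the same template with different parameters is what lets one definition serve both an inspection job and a release job.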

Probius handles all deployments to the development and test environments, which rely on Kubernetes as the underlying resource. For ease of use, Probius therefore also integrates Kubernetes and Prometheus. Both are integrated as plugins, so they have little impact on Probius itself.

sadmin

Sadmin is a basic shared library for Django. It bundles many basic functions, such as configuring the site title and theme from the admin backend, dynamically configured menus, automatic audit logging, and multiple authentication methods, as well as wrappers for the most commonly used CRUD operations.

Almost all of the systems above have been refactored on top of the Sadmin library to ensure consistency and easy maintenance. For more, see this article: Sadmin: Creating a private Django public library for code reuse

Final thoughts

The article “The Devops Tools We’ve Developed” described our plan for the following year: connecting those relatively scattered systems together. That future has now arrived, and with the help of Probius we have linked multiple systems together and achieved a higher degree of automation. So what should the next stage of development be? I need to stop and think about that again.

Besides the operations tools above, I also helped develop some business-related systems, such as the requirements management system introduced in an article two days ago. I have written many articles about the tools above. If you are interested, you can find them on the "operations of coffee" public account or on the blog. And if you have any ideas, feel free to get in touch so we can discuss them.