With the rise of cloud and container technology, the era of manual operations is over.
In 2018, to solve the pain points of daily operations and maintenance (O&M) and to make that work more efficient, we developed and improved several tool systems ourselves. Without exception, these systems have saved us time and improved our efficiency. This article introduces them.
System introduction
CMDB
CMDB stands for configuration management database. It records the hardware and software information we manage and maintain, including physical servers, switches, and virtual machines, as well as projects, services, environments, and everything else that needs to be managed. Put simply, where we previously might have kept an Excel spreadsheet listing all the projects we maintain, the server resources each project uses, server configurations, and so on, all of that information can now be entered into the CMDB system for unified maintenance and management.
The CMDB system is the cornerstone of many other systems. It should provide APIs to every third-party system that needs to query or modify basic information, for example supplying a project's server information to the continuous deployment tool so that it can push code to the project's servers. The accuracy of the CMDB data is therefore very important. Maintaining basic information in only one place also makes the entire O&M system more controllable, more efficient, and less error-prone.
Our CMDB system has been online for a long time. Previously it only replaced the Excel spreadsheets we used for maintaining information. This year we added an API so that third-party systems can obtain basic data from it.
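The kind of lookup a deployment tool might make against the CMDB can be sketched as follows. This is only an illustration under stated assumptions: the article does not describe the real API, so the `Server` fields, the `InMemoryCMDB` class, and the `servers_for` method are all hypothetical, and an in-memory list stands in for the actual database and HTTP layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    # Hypothetical record layout; the real CMDB schema is not described.
    hostname: str
    ip: str
    project: str
    environment: str


class InMemoryCMDB:
    """Toy stand-in for the CMDB: one place that holds all server records."""

    def __init__(self):
        self._servers = []

    def add_server(self, server):
        self._servers.append(server)

    def servers_for(self, project, environment):
        """Return the servers that host `project` in `environment`."""
        return [
            s for s in self._servers
            if s.project == project and s.environment == environment
        ]


cmdb = InMemoryCMDB()
cmdb.add_server(Server("web-01", "10.0.0.11", "shop", "production"))
cmdb.add_server(Server("web-02", "10.0.0.12", "shop", "production"))
cmdb.add_server(Server("web-t1", "10.0.1.11", "shop", "test"))

# A deployment tool would ask: where does "shop" run in production?
production_ips = [s.ip for s in cmdb.servers_for("shop", "production")]
print(production_ips)
```

Because every consumer asks the same store, a server added or removed in one place is seen by all tools at once, which is the point of keeping basic information in a single system.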
Varian
Varian is a modular continuous integration tool that we developed. It is mainly responsible for taking a project from source code to a final deployable program. Now that most projects are deployed with Docker, Varian handles the process from source code to the final packaged project image and uploads that image to the image registry, performing compilation, merging, compression, and similar steps along the way. This article describes Varian's working process in detail: Discovering Varian: An elegant Distribution and deployment program
Varian's core logic is to split every small step of continuous integration into an independent class or method, and then assemble different classes or methods according to the project type. Projects of different types and technology stacks can therefore share the same continuous integration program, which reduces code redundancy and improves reusability.
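The idea of small independent steps assembled per project type can be sketched as follows. This is not Varian's actual code: the step names, the `PIPELINES` table, and the `build` function are hypothetical, and each step only records that it ran instead of doing real work.

```python
class Step:
    """One small, independent unit of the integration process."""

    name = "step"

    def run(self, context):
        # A real step would compile, package, or upload here.
        # This sketch only records that the step ran.
        context["log"].append(f"{self.name}:{context['project']}")
        return context


class Checkout(Step):
    name = "checkout"


class MavenBuild(Step):
    name = "maven-build"


class NpmBuild(Step):
    name = "npm-build"


class DockerPackage(Step):
    name = "docker-package"


class PushImage(Step):
    name = "push-image"


# Each project type is just a different assembly of the same shared steps.
PIPELINES = {
    "java": [Checkout, MavenBuild, DockerPackage, PushImage],
    "node": [Checkout, NpmBuild, DockerPackage, PushImage],
}


def build(project, project_type):
    """Run the steps registered for `project_type` in order."""
    context = {"project": project, "log": []}
    for step_cls in PIPELINES[project_type]:
        context = step_cls().run(context)
    return context["log"]


java_log = build("orders", "java")
node_log = build("frontend", "node")
print(java_log)
print(node_log)
```

Supporting a new technology stack then means writing only the steps that differ and adding one entry to the table, while checkout, packaging, and upload stay shared.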
Nova
Nova is our continuous deployment tool and works with Varian throughout the release process. Nova is mainly responsible for pushing the final deployable program or Docker image to each online node for updating. Because the online environment is complex, including cloud hosts, Docker containers, and private-cloud and public-cloud K8S, compatibility with all of them is handled in the Nova layer.
Nova accepts only three parameters: project name, deployment environment, and deployment version number. Based on the project name and deployment environment, Nova calls the API provided by the CMDB to determine which nodes to push the project to, and it pulls the code from the code repository or the image from the image registry based on the version number.
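The three-parameter flow described above can be sketched as follows. Everything here is an assumption made for illustration: the function name `plan_deployment`, the `lookup_nodes` callback that stands in for the CMDB API, the registry host `registry.example.com`, and the image naming scheme are not taken from Nova itself.

```python
def plan_deployment(project, environment, version, lookup_nodes,
                    registry="registry.example.com"):
    """Build a deployment plan from the three inputs Nova is said to accept.

    `lookup_nodes` stands in for the CMDB API call; the registry host and
    the `<registry>/<project>:<version>` naming are hypothetical.
    """
    nodes = lookup_nodes(project, environment)
    if not nodes:
        raise LookupError(f"no nodes registered for {project}/{environment}")
    image = f"{registry}/{project}:{version}"
    return [{"node": node, "image": image} for node in nodes]


def fake_cmdb_lookup(project, environment):
    """Hypothetical CMDB response: node addresses per project and environment."""
    inventory = {
        ("shop", "production"): ["10.0.0.11", "10.0.0.12"],
        ("shop", "test"): ["10.0.1.11"],
    }
    return inventory.get((project, environment), [])


plan = plan_deployment("shop", "production", "1.4.2", fake_cmdb_lookup)
for task in plan:
    print(task["node"], "<-", task["image"])
```

Keeping the node list out of the caller's hands means the caller cannot push to a stale or mistyped host; the target set always comes from the CMDB.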
Operations such as capacity expansion, rollback, and restart can also be performed automatically through the Nova system. This article covers continuous deployment in more detail: Continuous deployment optimization practices for Docker environments
Kerrigan
Besides code changes, a release usually also involves changes to configuration files and databases. To automate configuration file updates, we developed the Kerrigan system. This article describes the implementation details of the configuration center: Detailed explanation of landing configuration center for small and medium-sized teams
Kerrigan is built on etcd + confd. Its main function is to let configuration be modified through a web page and have the change take effect on the servers automatically. Kerrigan can also manage different types of configuration across multiple environments, and in particular file-based configuration, which distinguishes it from KV-based configuration centers and makes it friendlier to operations staff. For example, you can manage Nginx and Tomcat configurations, record the modification history of configuration files, quickly roll back a configuration, compare configuration files, and save a change without publishing it so that it can be published later.
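The history, rollback, compare, and save-then-publish features listed above can be sketched as follows. This is an in-memory illustration only: it does not use etcd or confd, and the `ConfigFile` class and its method names are hypothetical rather than Kerrigan's real design.

```python
import difflib


class ConfigFile:
    """A configuration file with published history and an unpublished draft."""

    def __init__(self, name, content):
        self.name = name
        self.history = [content]  # published versions, oldest first
        self.draft = None         # saved but not yet published

    @property
    def published(self):
        """The version currently in effect."""
        return self.history[-1]

    def save_draft(self, content):
        """Save a change without making it live."""
        self.draft = content

    def publish(self):
        """Make the saved draft the live version."""
        if self.draft is None:
            raise ValueError("nothing to publish")
        self.history.append(self.draft)
        self.draft = None

    def rollback(self):
        """Re-publish the previous version as a new history entry."""
        if len(self.history) < 2:
            raise ValueError("no earlier version to roll back to")
        self.history.append(self.history[-2])

    def diff(self, old_index, new_index):
        """Compare two published versions line by line."""
        old = self.history[old_index].splitlines()
        new = self.history[new_index].splitlines()
        return list(difflib.unified_diff(
            old, new, fromfile="old", tofile="new", lineterm=""))


nginx = ConfigFile("nginx.conf", "worker_processes 2;\n")
nginx.save_draft("worker_processes 4;\n")
still_old = nginx.published   # the draft is not live yet
nginx.publish()
changes = nginx.diff(0, 1)
nginx.rollback()
print("\n".join(changes))
```

Recording a rollback as a new history entry, instead of deleting the bad version, keeps the full audit trail of what was live and when.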
Because we have a large number of projects, each project's Nginx configuration contains many rewrite rules. With Docker-based deployment, every rewrite update required the image to be repackaged and released, which was cumbersome. Kerrigan solves this problem effectively.
Overmind
Overmind is a database operations and maintenance system. It handles the last piece of the release process, database changes, and also integrates other practical features such as SQL auditing, SQL queries, an automated data migration ticket system, and a password table.
The first version of Overmind mainly integrated Inception for SQL auditing and execution, which helped us automate online database changes, as described in this article: Small and medium-sized teams quickly build an automated SQL auditing system
After the first version was finished, we pushed developers and testers internally to use it and collected their feedback. On top of the first version we then added features such as SQL queries and display of the EXPLAIN execution plan. Later we found that DBAs often received requests to migrate data between different environments, so we developed a ticket feature that automates data migration. This article describes the migration: Automation of data migration for operation and maintenance efficiency
After we stopped maintaining passwords in Excel, we developed the password table feature, as described in this article: Django development password management table example [source code attached]
Overmind is being improved gradually, and more features will be added according to demand and practicality to improve efficiency.
Proxy
Proxy is a proxy system, similar to Alibaba Cloud SLB or Kubernetes ingress, used mainly in the development and test environments.
We maintain many projects. Each project has several environments, and each environment has its own domain name that maps to different back-end services. To simulate the real SLB proxy setup for requests and to manage all these project entry points centrally, we used to point all the domain names at a single Nginx server, which proxied requests to the back-end services through domain-based vhosts. Every addition or modification meant editing the Nginx configuration file by hand. With the Proxy system, the same change can now be made quickly and easily through a web page.
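One way a system like this could turn a stored domain-to-backend mapping into domain-based vhosts is sketched below. The article does not say how Proxy generates its configuration, so this is an assumption: the functions `render_vhost` and `render_all`, the example domains, and the backend addresses are all hypothetical. The Nginx directives themselves (`listen`, `server_name`, `location`, `proxy_pass`, `proxy_set_header`) are standard.

```python
def render_vhost(domain, backend):
    """Render one domain-based Nginx server block that proxies to `backend`."""
    return (
        "server {\n"
        "    listen 80;\n"
        f"    server_name {domain};\n"
        "    location / {\n"
        f"        proxy_pass http://{backend};\n"
        "        proxy_set_header Host $host;\n"
        "    }\n"
        "}\n"
    )


def render_all(routes):
    """Render every route, sorted by domain so the output is stable."""
    return "\n".join(
        render_vhost(domain, routes[domain]) for domain in sorted(routes)
    )


# Hypothetical routes, as they might be entered through a web page.
routes = {
    "shop.test.example.com": "10.0.1.11:8080",
    "shop.dev.example.com": "10.0.2.11:8080",
}

config = render_all(routes)
print(config)
```

Generating the file from stored data replaces hand edits with a repeatable step, so adding an environment becomes a new entry in the mapping followed by a reload.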
Wiki
The wiki system was launched in 2018. At that time we set the goal of standardized, documented, automated, and intelligent operations and maintenance. Documentation is a very important part of the whole operations and maintenance process. Its benefits go without saying, and continuously promoting document output is also very important for us.
In addition to the systems above, we also developed some small tools to standardize management and improve efficiency, which are not covered here. We also use a large amount of open source software, such as Jenkins, the ELK suite, and Kubernetes.
2019 plan
We know that DevOps is an idea of automating the whole process from development to launch, not a single tool or a collection of tools, and we have long been thinking about how an enterprise should put it into practice. In 2018, working from our current environment, we developed a variety of tools that help us work efficiently, but these tool systems are relatively scattered and do not yet form a coherent process. In 2019 we will try out ways to link these tool systems together and achieve a higher degree of automation. We will also keep promoting wider adoption of Kubernetes, laying the foundation for truly realizing the DevOps idea of automation from development to launch.
If you would like to read more, see the following:
- DevOps operation automation tool system platform
- Small and medium-sized teams based on Docker’s Devops practices
- Encryption scheme for sensitive information in code