Theme introduction

1. The state of DevOps
2. Traditional enterprise DevOps cases
3. The path to the DevOps tool chain



Above are the tools you need to master to do continuous delivery. In the past, development relied on a handful of technologies; now, if you want to deploy microservices, operations has to master more and more skills. What is interesting here is the tug-of-war between development and operations. If operations cannot keep up, developers will complain that they are not being supported well enough and will build out their own DevOps tool chain instead. So there is a constant game between R&D and operations, which is quite interesting.

The state of DevOps in enterprises



From a global perspective, the figure above shows data released by Puppet Labs in 2017. The largest share of DevOps engineers is in North America, since Silicon Valley employs a huge number of them, while Asia accounts for only about 10% of the world's DevOps engineers. I believe this year's figure will be much higher.

In terms of salary, Stack Overflow surveyed 60,000 engineers around the world. Surprisingly, DevOps specialists ranked number one, followed by machine learning specialists. If you work in DevOps, you might want to ask your boss whether it's time for a raise.



To evaluate an enterprise's IT effectiveness, look first at deployment frequency: multiple deployments per day indicates high performance, while once a month indicates low performance. Lead time from code commit to production is also important, and you can collect this data internally; if it is under an hour, you are well ahead of the curve. Finally, a failure recovery time under an hour also indicates a high-performing organization.



In the figure above, the yellow line is the deployment frequency of high-performing teams, which increased from 2016 to 2017, while the black line is that of low-performing teams. For the low performers, recovery time actually grew longer and longer from 2016 to 2017: business pressure keeps rising, more and more goes live in each release, more and more is being delivered, and problems are bound to happen. This data is public and can be found on the Internet.

Traditional enterprise DevOps cases



Some traditional companies, like ING, are well ahead of their time. They started DevOps in 2013, which shows that DevOps is not a one-off rollout but a long journey. Their R&D organization of 600 people was very hard to manage: in 2013 everything was done by hand, and now they have a fully automated delivery pipeline. Germany's Industry 4.0 concept had a big impact on me, because it changed the way delivery works. On an Industry 4.0 assembly line, people barely talk to each other, so how do they collaborate? They rely entirely on the system: each person has a large screen at their workstation showing all the delivery information, what happened at the previous step, what is required, and where to send the work next. Software engineering works the same way as real industrial engineering.



The picture above shows the tools ING currently uses. There are so many tools for orchestration and checking that it is hard for one team to manage them all, plus different deployment tools to support on the delivery side. What problems does that cause? The development team wants to use certain tools, but the operations team doesn't support them. A developer then figures he may not need operations at all: he just needs one Jenkins machine to run tests, but the operations side takes over a week to provision an account. Development can't wait that long, so he sets up his own instance, perhaps in half a day. What does that lead to?



Each team ends up maintaining its own delivery pipeline and doing its own operations, which leaves the operations team in a very passive position. One R&D project is deployed one way, another is deployed a different way. When operations then wants to push delivery standards from the back of the process toward the front, it simply cannot.



ING's solution was not to change the existing development process, but to have teams use the tools operated by the operations team, with accounts, permissions, and so on set up just as before. This is where you can see the power and value of operations: it not only runs your applications, it also manages the tools upstream of deployment. It is a success story of operations pushing from the back of the process to the front.

In fact, developers don't really care about these tools; they are under a lot of pressure every day, and when something breaks they are the ones restarting services. So taking this over is a very valuable thing for operations to do. Managing the tools alone is not enough, though. There is a community inside ING that invites teams of the same type to share their practices, publishes a checklist of how to do things, and lets other teams reuse the process. It is a quick way to get results with minimal change to the business, no matter what language a team uses. They roll this plan out, and all services end up managed in a uniform way.



The whole process is relatively straightforward, because these open source tools all support LDAP authentication. ING has strict security requirements for secrets, so all keys are encrypted and stored in Vault. Looking at the process from the company's perspective, every business team sees the same flow: scan the code, attach the scan results to the binary, and then carry on with the rest. The banking industry has stricter requirements here and its own compliance demands. In the end, after about four years of DevOps adoption, ING releases 12,000 times a month while having cut production release incidents by 50%.



Hygieia, a reporting tool from the US digital bank CapitalOne, is also excellent. Their pain point was the transition away from traditional development. It took them five years, and they now achieve four releases per day. They invest heavily in test development, moving from manual testing to writing test cases and automated scripts, and they are gradually shifting from manual to automated deployment, with some services running in the public cloud. They have many quality gates, such as using Sonar, also an open source tool, for static analysis; Tencent likewise uses Sonar to scan everyone's submitted code.



On top of their continuous delivery tools they built reporting that provides a visual view of the software delivery lifecycle: deployment frequency, code metrics, and data from build to launch, all shown on one big dashboard. It is open source and written in Java, so you can install it and try it yourself. It collects data from the various tools and builds queries and reports on top of that data. CapitalOne did all of this over five years; it is rare for a bank to be this community-minded and to build up this kind of in-house R&D capability.



A fairly large bank in China originally managed its binary packages over FTP, which had many pain points. R&D was spread across different locations, and it was hard for the centers to coordinate. Large packages were a problem too: transfers would essentially die once they approached 50GB, so the setup needed constant maintenance. The first step was to evaluate the situation, manage packages with a professional tool, and distribute artifacts between sites in real time. With Artifactory they can do continuous delivery, with repository permissions scoped per environment. There are plenty of vulnerable open source packages on the Internet, so the company needs to provide a unified, governed repository rather than letting every package be downloaded directly.

The path to landing the DevOps tool chain

So how do you take the first step with DevOps and start using these tools inside your own company? Let me share some of my experience with landing the tool chain.



Code and requirements need to be correlated. Once they are, we can see the average time a requirement takes from creation, through code submission and testing, to launch, and operations gets real information about each package. Before, when operations received a package it knew nothing about it, including which problems and requirements the package addressed. If the commit message carries the requirement ID, the Jira information can be gathered through the Jira-Jenkins plug-in, and Artifactory can keep a record on the artifact that points back to that content; other artifact management tools can do the same with the same data.
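
As a rough illustration of that linkage, the sketch below extracts a Jira issue key from the latest commit message and attaches it to the built package as a property. This is only a minimal sketch: the Artifactory URL, repository path, property name, and credentials are placeholders rather than values from this talk, and the same idea works with whatever artifact repository you already run.

```python
# Minimal sketch: link the built package back to its requirement ID.
# Assumes the commit message starts with a Jira issue key (e.g. "PROJ-123: ...")
# and that the artifact lives in Artifactory; URL, path, and credentials are
# hypothetical placeholders.
import re
import subprocess
import requests

ARTIFACTORY = "https://artifactory.example.com/artifactory"     # hypothetical
REPO_PATH = "libs-release-local/myapp/1.0.42/myapp-1.0.42.jar"  # hypothetical


def jira_key_from_head() -> str | None:
    """Read the latest commit message and extract a Jira issue key if present."""
    msg = subprocess.run(
        ["git", "log", "-1", "--pretty=%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    match = re.search(r"[A-Z][A-Z0-9]+-\d+", msg)
    return match.group(0) if match else None


def tag_artifact(issue_key: str) -> None:
    """Attach the requirement ID to the package as an Artifactory property."""
    url = f"{ARTIFACTORY}/api/storage/{REPO_PATH}?properties=jira.issue={issue_key}"
    resp = requests.put(url, auth=("ci-user", "ci-token"), timeout=30)  # placeholder credentials
    resp.raise_for_status()


if __name__ == "__main__":
    key = jira_key_from_head()
    if key:
        tag_artifact(key)
```

In a Jenkins job this would simply run as a post-build step, so the traceability comes for free once the commit-message convention is enforced.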



The diagram above shows a pain point for many large companies. There are all kinds of private repository servers inside the company, each needing its own maintenance, permission configuration, high availability, disaster recovery backups, and configuration management. At the same time, your deployment scripts have to talk to each of these repositories, which makes it hard to standardize your CI/CD pipeline.



The first step should be to manage all of this in one place at build time. What does that buy you? Instead of someone specifying which version to ship, the machine can automatically find the latest version that is eligible to ship this time. This is the same idea as an industrial production line: JD.com and Amazon use robots in their logistics to pick and deliver parcels, and JD.com has even sent driverless delivery vehicles onto university campuses.
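
A minimal sketch of that "machine picks the version" idea, assuming the packages sit in Artifactory and the pipeline has already stamped a quality property on releasable builds; the repository name, property key, and credentials below are placeholders, not the speaker's actual setup.

```python
# Sketch: let the pipeline pick the newest package that already passed the
# quality gate, instead of a person specifying the version. Uses Artifactory
# AQL; repo name and property key are illustrative assumptions.
import requests

ARTIFACTORY = "https://artifactory.example.com/artifactory"  # hypothetical

AQL = (
    'items.find({"repo": "libs-release-local", "@quality.gate": "passed"})'
    '.include("repo", "path", "name", "created")'
    '.sort({"$desc": ["created"]})'
    '.limit(1)'
)


def latest_releasable() -> dict | None:
    """Return the newest artifact that the quality gate marked releasable."""
    resp = requests.post(
        f"{ARTIFACTORY}/api/search/aql",
        data=AQL,
        headers={"Content-Type": "text/plain"},
        auth=("ci-user", "ci-token"),  # placeholder credentials
        timeout=30,
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])
    return results[0] if results else None
```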



The process above is a model of delivery for a single team, implemented as an automated pipeline. I suggest you draw a diagram like this for your own company to see what your situation looks like. After development writes the code, the changes are merged into trunk and a build is made, and the key information is recorded along the way: the requirement ID, the source location of the build, whether unit tests passed, and what the coverage was.

The package is uploaded to the binary repository. If it passes unit tests, the test team is notified and the package is verified in the test environment; if it passes those tests, that result is recorded as metadata on the package. Interface and compatibility tests can be run manually or automatically, and many testing tools expose interfaces whose results can be analyzed and written back to the repository. This removes the need for back-and-forth communication between development, test, and operations. The package is then released to another test environment, a simulated production environment, and the results of those tests, say a manual test coverage of 70%, are also written onto the package. When operations finally receives the package, it already carries the test team's assessment of whether it can go live, and operations uses automated scripts to roll it into production.
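
Each stage can append its own result using the same property mechanism shown earlier; the short sketch below records a test stage's outcome onto the package. The property names, values, and Artifactory details are illustrative assumptions only.

```python
# Sketch: stamp a test stage's outcome (and optional details such as coverage)
# onto the package as repository properties. Names and paths are hypothetical.
import requests

ARTIFACTORY = "https://artifactory.example.com/artifactory"      # hypothetical
PACKAGE = "libs-staging-local/myapp/1.0.42/myapp-1.0.42.jar"     # hypothetical


def record_stage_result(stage: str, passed: bool, **extra: str) -> None:
    """Write e.g. uat.result=passed;coverage=70 onto the package."""
    props = {f"{stage}.result": "passed" if passed else "failed", **extra}
    encoded = ";".join(f"{key}={value}" for key, value in props.items())
    url = f"{ARTIFACTORY}/api/storage/{PACKAGE}?properties={encoded}"
    resp = requests.put(url, auth=("ci-user", "ci-token"), timeout=30)  # placeholder credentials
    resp.raise_for_status()


# Example usage after a simulated-production test run:
# record_stage_result("uat", True, coverage="70")
```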

This is the delivery process of a single team, usually orchestrated by tooling. It contains many automated steps but also supports manual ones, which many environments require, such as approvals and audit checks.



From the binary package's point of view, Jenkins builds should download trusted third-party packages from a single, uniform source rather than pulling dependencies from assorted folders and private repositories, so that everyone builds against the same dependencies. This avoids situations where, for the same library, some people use 1.0 and others use 2.0. Test results are written onto the deliverables, and releases to the test environment are triggered automatically based on that data. Once a package passes testing, it is promoted to the production repository, and the production release can then be triggered automatically, reducing manual deployment.
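
One way to implement that promotion step is sketched below, again under the assumption that the binaries sit in Artifactory; the repository names, artifact path, and credentials are placeholders, and the same promotion can be done through whatever repository manager is in place.

```python
# Sketch: promote a tested package from a staging repository to the production
# repository by copying it, so the production release can be triggered from the
# production repo. Repo names, path, and credentials are hypothetical.
import requests

ARTIFACTORY = "https://artifactory.example.com/artifactory"  # hypothetical
ARTIFACT = "myapp/1.0.42/myapp-1.0.42.jar"                   # hypothetical


def promote_to_production() -> None:
    """Copy the artifact from the staging repo into the production repo."""
    url = (
        f"{ARTIFACTORY}/api/copy/libs-staging-local/{ARTIFACT}"
        f"?to=/libs-release-local/{ARTIFACT}"
    )
    resp = requests.post(url, auth=("ci-user", "ci-token"), timeout=60)  # placeholder credentials
    resp.raise_for_status()
```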

Inside the company, it is the operations team that defines these quality gates. When the pipeline is first rolled out, the R&D teams may not be willing to cooperate, so the initial bar cannot be set too high. Maybe coverage is only 20% at the start; that's fine, start from 20%. It took Yahoo a year to go from no unit testing at all to 90% coverage.

From the perspective of binary delivery, all test results are recorded along the delivery process. That is the information sharing on the assembly line: it breaks the communication barrier between development and test and breaks down the department walls. Doing this organizationally is hard, but we can at least share the information through the systems. The earlier steps write plenty of metadata onto packages; the question is how to consume it so the machine can automatically pick packages for deployment to each environment. The answer is to write deployment scripts around the quality gates, letting the machine filter packages and decide which environment a package belongs in, instead of checking with the test team each time.
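
A hedged sketch of that "let the machine decide the environment" step: it reads back the metadata recorded earlier and applies simple thresholds. The property names, thresholds, and Artifactory details are illustrative assumptions, not the speaker's actual gate.

```python
# Sketch: read the quality metadata from the package and decide how far it may
# be deployed. Property names and thresholds are invented for illustration.
import requests

ARTIFACTORY = "https://artifactory.example.com/artifactory"      # hypothetical
PACKAGE = "libs-staging-local/myapp/1.0.42/myapp-1.0.42.jar"     # hypothetical


def package_properties(path: str) -> dict:
    """Fetch the quality metadata previously stamped onto the package."""
    resp = requests.get(
        f"{ARTIFACTORY}/api/storage/{path}?properties",
        auth=("ci-user", "ci-token"),  # placeholder credentials
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("properties", {})


def target_environment(props: dict) -> str:
    """Decide which environment this package is eligible for."""
    coverage = int(props.get("coverage", ["0"])[0])
    unit_ok = props.get("unit.result", ["failed"])[0] == "passed"
    uat_ok = props.get("uat.result", ["failed"])[0] == "passed"
    if unit_ok and uat_ok and coverage >= 70:
        return "production"
    if unit_ok and coverage >= 20:
        return "test"
    return "development-only"


if __name__ == "__main__":
    print(target_environment(package_properties(PACKAGE)))
```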



The application alone is not enough; the environment configuration matters too. How do you give the application different configurations in different environments? Application configurations for the different test environments can be stored in different code branches, and the deployment pulls the configuration files from the branch matching the target environment; unified management is then achieved through a configuration center. The same applies to database scripts: every database change should become a versioned change, so when operations gets the package it can pull the corresponding change list from the matching branch, be sure each release carries the full set, containing the complete change history, and still automate the actual release.
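
As a small sketch of "pull configuration from the branch that matches the target environment" at deploy time, the snippet below clones only the matching branch of a configuration repository. The repository URL, the branch naming convention, and the file name are assumptions for illustration only.

```python
# Sketch: fetch the environment-specific configuration at deploy time by
# cloning the config repo's branch named after the target environment.
# Repo URL, branch convention, and file name are hypothetical.
import subprocess
from pathlib import Path

CONFIG_REPO = "git@git.example.com:myapp/app-config.git"  # hypothetical


def fetch_config(environment: str, dest: Path) -> Path:
    """Clone only the branch named after the environment (e.g. 'test', 'prod')."""
    subprocess.run(
        ["git", "clone", "--depth", "1", "--branch", environment,
         CONFIG_REPO, str(dest)],
        check=True,
    )
    return dest / "application.yml"  # hypothetical config file name


if __name__ == "__main__":
    config = fetch_config("test", Path("/tmp/myapp-config"))
    print(f"Deploying with configuration: {config}")
```

Database change scripts can follow the same pattern, with each change committed as a new versioned file on the corresponding branch so that a release always carries the complete change history.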



Containers are popular now, but container images carry plenty of vulnerabilities, so people care about scanning them. This is another bank case: they run third-party vulnerability scans on their packages, and only images with no findings are pushed on toward production. They also apply stricter security levels in production, so vulnerability scanning of third-party software is required to keep the online environment safe.
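
The gate itself can be very simple: scan the candidate image, and only re-tag and push it to the production registry if the scan is clean. The talk does not name a scanner, so the sketch below uses the open source Trivy CLI as a stand-in; the image names and registry are placeholders.

```python
# Sketch: gate image promotion on a vulnerability scan. Trivy is used here only
# as an example scanner (not named in the talk); image names are hypothetical.
import subprocess
import sys

CANDIDATE = "registry.example.com/staging/myapp:1.0.42"   # hypothetical
PRODUCTION = "registry.example.com/prod/myapp:1.0.42"      # hypothetical


def scan_is_clean(image: str) -> bool:
    """Return True only if the scanner finds no HIGH/CRITICAL vulnerabilities."""
    result = subprocess.run(
        ["trivy", "image", "--severity", "HIGH,CRITICAL", "--exit-code", "1", image]
    )
    return result.returncode == 0


def promote(src: str, dst: str) -> None:
    """Re-tag the image and push it to the production registry."""
    subprocess.run(["docker", "tag", src, dst], check=True)
    subprocess.run(["docker", "push", dst], check=True)


if __name__ == "__main__":
    if scan_is_clean(CANDIDATE):
        promote(CANDIDATE, PRODUCTION)
    else:
        sys.exit("Vulnerabilities found; image not promoted to production.")
```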



Once configuration, operations, and databases are managed in a unified way, consider moving a step further and providing the company with basic CI/CD tooling. DevOps is usually adopted team by team rather than by mandate. Once it lands, some teams can see scan results without having to change anything themselves, and the effect becomes visible. You can send these reports to the leaders of the business teams; they will see how mature each team is and will push their R&D teams to improve code quality. As a result, the operations team takes on more and more responsibility, because it runs the company's assembly line, while R&D becomes more and more relaxed and only has to care about the business.



Going one step further, once all the tools are under management and your requirements are clear, you can build a DevOps platform of your own. This suits enterprises with in-house development capability, though you can also have a third party build it; it can show delivery lead times across the whole company by calling the tools' interfaces. When R&D efficiency hits a bottleneck, this layer lets you encapsulate the tools behind it. It does involve cost, so we suggest you first get the tool chain connected and make sure development is running stably before attempting this design. Don't try to migrate a large number of business delivery processes onto the platform from day one; allow for a transition period.

With such a platform you can do many of the things you want to do. The underlying open source tools expose interfaces that return data: the average test pass rate of your code, a profile of each team's delivery capability, what that capability looks like over time, and where to improve when problems are identified. On top of this you can define reference curves, such as the industry average, and see whether your business is above or below it.
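
To make the aggregation concrete, here is a purely illustrative sketch of the kind of roll-up such a platform performs over the build records collected from the underlying tools; the record fields and sample data are invented.

```python
# Sketch: roll build records collected from the tools' APIs into a simple
# per-team delivery profile (pass rate and average lead time). Data is invented.
from collections import defaultdict
from statistics import mean

builds = [  # in practice these records would come from the tools' APIs
    {"team": "payments", "tests_passed": True,  "lead_time_hours": 3.0},
    {"team": "payments", "tests_passed": False, "lead_time_hours": 9.5},
    {"team": "accounts", "tests_passed": True,  "lead_time_hours": 1.2},
]


def team_profiles(records):
    """Compute test pass rate and average lead time for each team."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["team"]].append(record)
    return {
        team: {
            "pass_rate": mean(1.0 if r["tests_passed"] else 0.0 for r in rows),
            "avg_lead_time_hours": mean(r["lead_time_hours"] for r in rows),
        }
        for team, rows in grouped.items()
    }


if __name__ == "__main__":
    for team, profile in team_profiles(builds).items():
        print(team, profile)
```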


