Introduction: As cloud native concepts and technologies continue to mature, more and more industries are putting them into practice, and this affects technical practitioners in different positions to different degrees. From business logic to technology selection, the entire technology stack has changed dramatically for IT executives as well as for front-line developers and operations staff. To be better prepared for the cloud native era, everyone needs a deep understanding of how cloud native adoption affects each position.

CXO and IT director

Many enterprises place comprehensive and strict requirements on technical leaders such as technical CXOs (including the CTO, CIO, CISO, CDO, and so on, collectively referred to as CXO in this article) and technical directors. Technical leaders must not only take all aspects of technical management into account, but also treat sustaining the company’s business as their core responsibility. CXOs and IT directors therefore need a broad technical vision, excellent technical judgment, and even high-level architecture design ability, as well as good product awareness, to cope with a changing internal and external environment.

The external environment

CXOs and IT/R&D directors must first recognize that cloud native is the inevitable direction of cloud computing, that it has reshaped the basic technology platform for enterprise digital transformation, and that cloud native architecture is the basic technical architecture for building modern enterprise applications. Cloud native architecture matters for Internet applications, enterprise transaction applications, big data applications, and artificial intelligence workloads alike.

Secondly, on issues of special concern to technology managers, such as open source and localization, CXOs and IT managers need to see that most cloud native technologies and standards come from projects of the major open source foundations, and that together these technologies and standards form an open source technology system. The cloud native services launched by major cloud providers are also compatible with the corresponding technologies and standards. Open source cloud native technologies and products therefore meet enterprise customers’ demand for “no vendor lock-in”: when changing Cloud Service Provider (CSP) or Independent Software Vendor (ISV), the enterprise does not have to worry that the technology cannot be switched or that the migration cost is too high. Localization is increasingly becoming a hard requirement for the country and for enterprises. Enterprises need to select cloud native products that meet localization standards, including the autonomy and controllability of the products, contributed source code (usually reflected in O&M, APIs, component extensions, and so on), and support for domestically produced servers. At the same time, organizations such as the China Academy of Information and Communications Technology and the China Electronics Standardization Institute provide relevant assessments to help enterprises select commercial products that meet local standards.

The internal environment

Within the enterprise, CXO and IT leaders must leverage cloud native technologies to drive technology upgrades and realize technical and business value based on the actual situation of the enterprise.

First of all, at the strategic and organizational level, the ACNA (Alibaba Cloud Native Architecting) architecture design method can be used to evaluate and formulate the enterprise’s cloud native strategy and implementation path, and to make it part of the enterprise’s overall strategy so that it supports and accelerates digital transformation. In addition, a cloud native strategy is not only a comprehensive technology upgrade, but also an upgrade of the enterprise’s IT organizational structure and culture. More and more enterprises have realized this. Alibaba, for example, began researching and developing cloud native technologies and products as early as 10 years ago, and at its cloud conference in 2020 it announced the establishment of the “Alibaba Cloud Native Technical Committee” to show the outside world its determination to make Alibaba and Ant Group fully cloud native.

Secondly, because cloud native technology comprehensively restructures the way enterprise applications are developed, CXOs and IT directors need to think about how to rewrite applications using containers, microservices, Serverless, Service Mesh, and other technologies, and how to reshape the enterprise’s R&D and operations processes with DevOps. This includes using GitOps, IaC, and declarative architecture to redefine the enterprise’s pipelines and operations model, upgrading the existing monitoring system with observability and Service-Level Agreements (SLAs), and protecting the enterprise with a cloud native, identity-centered security system.

The purpose of any technology upgrade is to bring real value to the enterprise, so CXOs and IT leaders need to pay attention to the following points when upgrading to cloud native technology.

  1. Operating cost and return on investment (ROI).
  2. Direct cost savings from the widespread use of Backend as a Service (BaaS).
  3. Efficiency improvements brought by new cloud native technologies, tools, and processes.
  4. Stability, and the indirect cost optimization that comes from higher SLAs (risk reduction, better user experience, and so on).

Architect/consultant/System planner

For the technical backbone of the enterprise, such as architects/consultants/system planners, cloud native technology and architecture have a profound impact on architecture evolution and risk control, technology selection, construction of modern applications, IT service process remodeling, new tool application, security planning and other work.

1. Architecture evolution and risk control

The fundamental change in cloud native architecture is that the infrastructure environment in which software runs becomes the cloud platform, so the upper-layer software architecture moves from a “steady state” to “breaking the original steady state and building a new one”. This requires architects, consultants, and system planners to carefully assess the enterprise’s organizational capability, the skill levels of development and operations staff, the development cycle and cost budget, legacy system integration, business demands, and so on, and to use the ACNA architecture method for risk control, so that the cloud native architecture is implemented smoothly and continues to deliver value in the enterprise.

2. Technology selection

Technology selection involves two questions: which areas of cloud native technology and architecture to adopt, and how to choose among multiple similar technologies or products within the same area. For the former, enterprises are advised to gradually select the technology areas that match their demands and capabilities, guided by the evaluation dimensions of the cloud native architecture maturity model and by the architecture iteration cycle (for example, some enterprises choose “containers + microservices + Internet middleware” to build an enterprise middle platform). For the latter, enterprises are advised to start from open source and openness and choose products and services that have commercial support (at least successful commercial implementations in the same field), such as the microservice and container services provided by cloud platforms.

3. Build modern applications

By building modern applications on cloud native, enterprises can achieve the business agility needed to cope with rapidly changing market challenges and give applications dynamic scaling and strong resilience. When rewriting and restructuring the enterprise’s core software, core architects apply cloud native technology and the architecture iteration process to the development of its next generation, so that the new applications have the characteristics of modern applications. Because going cloud native leads to a complete upgrade of the application architecture, it is recommended that the system be rewritten rather than reengineered, in order to minimize the cost of repaying historical technical debt, reduce the legacy burden of the old system, and accelerate the modernization of the new applications.

4. IT service process remodeling

After an enterprise upgrades to cloud native technology, the entire IT service process needs to be upgraded as well, including event management, problem management, change management, release management, and configuration management. These processes are already well defined; because cloud native technologies bring new tools, methods, and standards, the upgraded processes become more automated and streamlined. In event management, for example, observability tools greatly reduce the monitoring burden: Kubernetes-based event management in the cloud is better at covering events at every level, from virtual hosts, containers, PaaS services, and integrated services up to the application, through centralized collection, storage, analysis, alerting, correlation analysis, and visualization. This improves the efficiency of the service desk and of subsequent event handling.

5. New tool application

A number of new tools are associated with the cloud native technology architecture, and they can greatly improve the efficiency of cloud delivery, clustering, and cloud operation and maintenance. Enterprises that lack these tools face insufficient automation, fragmented IT information, and high operation and maintenance risks. Architects, consultants, and system planners therefore need to select, or even develop, appropriate tools for scenarios such as the enterprise CI/CD (continuous integration/continuous delivery) process, microservice implementation, cloud provisioning and integration of PaaS/SaaS services, enterprise CMDB (Configuration Management Database) integration, enterprise monitoring integration, and account/permission/authentication integration, in order to raise the level of O&M automation and reduce O&M risks.

6. Security planning

In the context of digital transformation, although the value of digital assets is constantly explored, the risks are also increasing. Cloud native advocates DevSecOps, zero-trust model and a large number of cloud security services, which carry out fine-grained upgrades to security policies such as permission control, service-level dynamic isolation and request-level access control, thus realizing end-to-end security control from code development to application operation and maintenance. This process requires an enterprise to upgrade its security planning to synchronize planning from cloud infrastructure to application security.

Developers

Cloud native technologies and architectures have a significant impact on technology developers (designers, developers, testers, etc.) in the following six areas.

1. Technology stack

Developers across the technology stack, from the front end to the back end, benefit from adopting cloud native technologies. The development environment is gradually moving from a local IDE to a cloud IDE, with cloud services pre-integrated into the IDE (for example, using Cloud Toolkit to deploy applications from the IDE), which makes writing and debugging code more efficient. The Backend for Frontend layer uses Serverless architecture and a large number of PaaS cloud services to simplify the technology stack, freeing developers from back-end O&M. Back-end developers need to focus on the technologies that will be used heavily, such as containers, microservices, Serverless, Service Mesh, and PaaS cloud services.

2. Distributed design pattern

The cloud native technology architecture incorporates a number of existing distributed design patterns and builds them into open source products and cloud services, greatly reducing the workload of architects and developers. For example, microservices and Service Mesh can come preconfigured with canary (grayscale) release, circuit breaking, bulkhead isolation, rate limiting, degradation, observability, and service gateways. By contrast, patterns such as Event-Driven Architecture (EDA), read/write separation, Serverless, CQRS (Command Query Responsibility Segregation), and BASE (Basically Available, Soft state, Eventual consistency) have to be introduced at the application architecture level and cannot be transparent to applications.
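
To make the circuit-breaking pattern mentioned above concrete, here is a minimal sketch in Python. It is not how any particular Service Mesh implements the pattern; the class name `CircuitBreaker` and the parameters `max_failures`, `reset_after`, and `fallback` are hypothetical and chosen only for illustration. The sketch assumes a single-threaded caller.

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical names, not a real library API).

    The circuit opens after `max_failures` consecutive failures and allows
    one trial call once `reset_after` seconds have passed.
    """

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, fallback=None, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast and return the degraded result.
                return fallback
            # Half-open: allow one trial call; a single failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0  # success closes the circuit again
        return result
```

Returning `fallback` instead of raising is one way to combine circuit breaking with degradation; a mesh sidecar would typically apply the same idea outside the application process, which is why the text describes it as transparent to applications.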

3. Business development

The more cloud native technologies and cloud services are adopted, the less effort developers have to spend on developing non-functional features, and the more time and energy they have for the functional design of the business itself. With applications built on Service Mesh and Serverless, developers do not even have to worry about server operation and maintenance, do not have to keep upgrading dependent software, and do not have to deal with the complexity of canary hot upgrades and automatic rollback; they can also use stress testing with online traffic to reduce the workload of integration testing and smoke testing.

4. Test method

Traditional test case design based on predicting failures is too inefficient. The alternative is to use active fault injection and chaos engineering for fatigue testing, simulating the faults that can occur in the real world. Recording and replaying online traffic quickly produces test cases and makes regression testing more effective. More importantly, these testing methods are performed directly in the production system, without prior testing in a test environment. Internet companies such as Netflix, Amazon, and Alibaba use these methods extensively to reduce the risk of failure in large-scale distributed environments.
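
As a rough illustration of active fault injection, the sketch below wraps a dependency so that it fails with a configurable probability, then checks that the calling code tolerates the failure. All names (`inject_faults`, `call_with_retry`, `failure_rate`) are hypothetical; real chaos engineering tools inject faults at the network, container, or host level rather than through a Python decorator.

```python
import random


def inject_faults(failure_rate, rng=None, error=ConnectionError):
    """Hypothetical decorator: make a call fail with probability `failure_rate`."""
    rng = rng or random.Random()

    def decorator(func):
        def wrapper(*args, **kwargs):
            # Simulate a real-world fault before reaching the dependency.
            if rng.random() < failure_rate:
                raise error("injected fault")
            return func(*args, **kwargs)
        return wrapper
    return decorator


def call_with_retry(func, attempts=3):
    """The behaviour under test: retry a flaky dependency a few times."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Passing a seeded `random.Random` as `rng` makes an experiment repeatable, which helps when a failure found under injection has to be turned into a regression case.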

5. Software development and operation and maintenance process

DevOps and DevSecOps not only require safe and continuous release, but also require the enterprise to redefine and standardize its development process and tools so that development and operations are integrated. Creating positions that focus on improving engineering stability, efficiency, and quality redefines the organization, processes, and culture of R&D and operations.

6. Study scenarios

Cloud platforms are the infrastructure of the digital society and an important part of the new infrastructure. Many of the most advanced and latest IT technologies and ideas will be reflected in the cloud platform. The open source projects behind these new technologies, as well as the conferences, gatherings, discussion boards, and technical blogs surrounding open source projects, are the perfect places for technical staff to learn and improve their skills. In addition, cloud computing related technical media often provide a large number of new technologies and new solutions in the cloud native field, and developers can broaden their horizons and improve their technical capabilities through learning. (These tech media often offer online documents, live broadcasts, video recordings, technical articles, blogs, and more.)

Operations staff

Operation and maintenance personnel, including SRE (Site Reliability Engineer), as the guarantee of the successful operation of software, will also be deeply affected by cloud native technology and architecture, especially in the technology stack, operation and maintenance tools, monitoring and error handling, SLA management, AIOps and other aspects. The details are as follows.

1. Technology stack

On the one hand, the technology stack of operation and maintenance personnel changes passively, because the software they operate is now built on a cloud native technology stack. On the other hand, it changes because they proactively use cloud native technologies and tools to rebuild work and processes such as integration, monitoring, automation, self-healing, performance management, high availability management, security management, SLA management, IT asset management, event management, configuration management, change management, release management, and patch management. A typical application scenario is using a Kubernetes Operator to automate resource creation, delivery, and instance migration.

2. O&M tools

The cloud native architecture emphasizes a highly automated operation and maintenance process through IaC and declarative operation and maintenance. Even in a complex distributed system with hundreds or thousands of machines, deployment, upgrade, rollback, configuration change, scale-out/scale-in, and other operations can be handled automatically. As a core concept of IaC, GitOps not only holds the description of the system’s target state, but also runs through the whole change process, so it both conforms to the transparency principle of DevOps and has the advantages of declarative operation and maintenance.
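
The idea behind declarative operation and maintenance can be sketched as a comparison between a desired state (as it might be stored in Git) and the actual state of the system, producing the actions needed to converge. This is a simplified illustration under stated assumptions, not the algorithm of any specific GitOps tool; the function names and the dictionary layout are hypothetical.

```python
def plan_changes(desired, actual):
    """Compare desired and actual state; return the actions needed to converge.

    Both arguments are hypothetical {name: spec} dictionaries.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    # Anything running that is no longer declared gets removed.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name, None))
    return actions


def apply_changes(actual, actions):
    """Apply planned actions to an in-memory copy of the actual state."""
    state = dict(actual)
    for verb, name, spec in actions:
        if verb == "delete":
            del state[name]
        else:
            state[name] = spec
    return state
```

Because the operator only edits the desired state, a rollback is simply a revert of that description followed by another plan-and-apply pass, and an empty plan means the system has already converged.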

3. Monitoring and error handling

Daily error handling covers everything from receiving user feedback and discovering abnormal system metrics to using a variety of operation and maintenance methods to confirm, analyze, and resolve problems and faults. Observability emphasizes that, for a single execution of a business request, logs, metrics, and traces can be obtained from multiple distributed services, containers, virtual hosts, networks, and BaaS services, which improves monitoring capability and error-handling efficiency. With cloud native technology, operations personnel no longer have to collect and correlate this information from multiple distributed nodes by hand. Instead, tools such as Prometheus and Grafana help with correlation analysis, alerting, and visualization of multi-dimensional information.
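
The correlation step described above can be pictured as grouping telemetry from different sources by a shared trace identifier. The sketch below assumes, purely for illustration, that every log entry and span is a dictionary carrying a `trace_id` field and that spans carry a `duration_ms` field; real observability stacks define their own data models.

```python
from collections import defaultdict


def correlate_by_trace(logs, spans):
    """Group log entries and trace spans that share a trace ID."""
    grouped = defaultdict(lambda: {"logs": [], "spans": []})
    for entry in logs:
        grouped[entry["trace_id"]]["logs"].append(entry)
    for span in spans:
        grouped[span["trace_id"]]["spans"].append(span)
    return dict(grouped)


def slowest_span(trace):
    """Return the span with the longest duration in one correlated trace."""
    return max(trace["spans"], key=lambda s: s["duration_ms"])
```

With the data grouped this way, an operator looking at one failed request can see which service was slowest and read that service's log lines for the same request in one place.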

4. SLA management

With metrics available, we can combine them with the dependencies obtained from call relationships to perform SLA management on business services and PaaS components and, in turn, on global services and IT assets. Without infrastructure and capabilities such as Service Mesh and observability, traditional monitoring systems can only extract these metrics from logs in different formats; if the software does not print the metric, the monitoring system cannot obtain it. In addition, SLA management cannot perform upstream and downstream correlation analysis because full-link dependency information is missing. As a result, the system cannot immediately tell whether a service or component is meeting its Service Level Objective (SLO). These problems are well solved in the cloud native system, which helps O&M personnel improve the SLA management level of the system.
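
One way to see why dependency information matters for SLA management is to estimate the end-to-end availability of a service from its own measured availability and that of everything it calls. The sketch below makes strong simplifying assumptions that are not from the original text: every dependency is required, failures are independent, and the dependency graph is a tree (a shared dependency would be counted more than once). All names are hypothetical.

```python
def availability(successes, total):
    """Measured availability of one component from request counts."""
    return successes / total if total else 1.0


def chain_availability(service, dependencies, measured):
    """End-to-end availability of `service`, multiplying in every
    downstream dependency (assumes independent failures and a tree)."""
    result = measured[service]
    for dep in dependencies.get(service, []):
        result *= chain_availability(dep, dependencies, measured)
    return result


def meets_slo(value, slo):
    """True when the estimated availability reaches the objective."""
    return value >= slo
```

Even when each component individually looks healthy, the product along the call chain can fall below the objective, which is the upstream and downstream analysis that logs alone cannot provide.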

5. AIOps

AIOps refers to the use of machine learning and artificial intelligence in operation and maintenance to proactively analyze and prevent faults and to speed up troubleshooting. When observability is implemented across a large number of business services and technology components, the system generates large amounts of log, metric, and trace data that can be analyzed in real time with machine learning and artificial intelligence techniques. This can assist with anomaly detection before and after a change, correlation analysis of multiple events and elimination of “false positives”, root cause analysis, and automatic removal of abnormal nodes and emergency recovery.
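
As a very small stand-in for the anomaly detection mentioned above, the following sketch flags metric samples that deviate strongly from a trailing window. It uses a plain z-score rather than machine learning, so it only illustrates the shape of the problem; the function name and the `window` and `threshold` parameters are hypothetical.

```python
from statistics import mean, pstdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indexes whose value deviates from the trailing window mean
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all is treated as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

A production system would also need to keep detected outliers from skewing later windows and to correlate anomalies across metrics, which is where the learning-based techniques described in the text come in.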

Software delivery Engineer/System integration Engineer

As important players in the software delivery chain, software delivery engineers and system integration engineers will also change the way they work due to the application of software related to cloud native technology.

1. Standardized delivery

One of the biggest challenges in the delivery process is that different customers have different IaaS environments, including different server or virtual host technologies, network environments, storage products, operating systems, and basic software libraries. Different IaaS environments not only lead to different versions of the deliverable software, but also change at different delivery stages, which further increases the complexity of delivery management. Containers and immutable infrastructure not only shield applications from the differences between IaaS components, but also confine changes to the running environment within the container: different configurations are captured as different image versions, instead of being applied as in-place upgrades (which lose configuration version information or make configuration versions unmanageable). In this way, the software delivery process can be standardized, and upper-layer application configuration is isolated from the knock-on effects of frequent changes in the IaaS layer, which improves software delivery efficiency.

2. Automated delivery

Another difficulty with software integration and delivery is the need to provide manuals for configuring, installing, or deploying the software; the people involved have to study them and then adapt to the differences between the standard environment and each actual environment. In this process the installation script plays only a supporting role, because it does not carry the knowledge contained in the manual. The cloud native Open Application Model (OAM) uses YAML files to describe, at the metadata level and from the application’s perspective, the operating environment, composition, and operation and maintenance characteristics of the software. It also describes the end state of the software deployment and the configuration changes that can be accommodated, and scripts can read and interpret these YAML files. At the same time, the deployment of the same software in a typical scenario can be standardized, open sourced, and shared (for example, the deployment process of Redis on Alibaba Cloud ECS). This not only automates the delivery of commonly used software, but also improves delivery by sharing experience with typical environments.

3. Cloud delivery and aggregation

Cloud computing provides a new place for software to run and a new form of delivery. The cloud can also serve as a proof-of-concept (PoC) environment for software delivery. Integrating software with the cloud has become a new model of software integration, giving rise to Cloud System Integration (CSI). The system is first integrated on a small scale with software deployed in the public cloud, and cloud native delivery tools are then used to copy the public cloud environment to the private cloud with one click. This reduces integration and delivery costs while simplifying integration.

4. Continuous delivery

Continuous delivery of software is a necessary part of the DevOps process. With small and frequent deliveries, DevOps makes the software delivery process more automated and versioned, and allows upgrades and rollbacks to be repeated automatically. Continuous delivery ensures that a latest, usable version of the software always exists: once the code or configuration changes, a new version can be generated immediately and its availability verified, which improves the efficiency of software delivery.

5. Extensive tool chain and knowledge system

The cloud native technology system is open source, with widely used open source components and an open body of knowledge. With these products and this knowledge, software integration engineers and software delivery engineers can quickly learn the latest cloud native technologies, acquire the most appropriate cloud native toolchain, and quickly validate it in their own environment. In addition, enterprises can reduce training costs during software delivery by acquiring basic technical knowledge of the products they use through Internet channels.

From database administrator to database architect

Database administrators (DBAs) play an important role in both traditional commercial database and open source database product systems, and they are key to ensuring the stability of the whole software system. The development of cloud native technologies and products has also profoundly affected database administrators, whose way of working is changing dramatically. Their focus is gradually shifting from low-level system construction to business system architecture design, from basic stability to business structure optimization, and from how to make good use of database software to how to make good use of the cloud native product system. At the same time, enterprises’ requirements for operation and maintenance objects, operation and maintenance platforms, and technical capabilities have also changed greatly.

1. O&M objects

With the continuous evolution of cloud native architecture, Database as a Service (DaaS), which was once out of reach, has become a reality. Cloud databases provide out-of-the-box PaaS services and, by using cloud native resource pooling technology (for example, compute and storage resource pools), offer a rich range of cloud native database products. This changes the database administrator’s operation and maintenance object from hosts, networks, and databases to database services. The database administrator no longer needs to pay attention to how resources are delivered from the Internet Data Center (IDC) to the host, because these basic services are handled by the cloud platform. The cloud platform makes full use of supply chain economies of scale and virtualization technology to provide high-quality services at a cost far lower than that of a self-built IDC. In the cloud computing era, with the IaaS capabilities of the cloud, database administrators are relieved of the burden of operating and maintaining basic resources in their daily work, so they can pay more attention to how well database services support the business and shift the focus of operation and maintenance to database services.

2. Operation and maintenance platform

In the era of commercial databases, the basic skill of a database administrator was to make good use of a single database product, build a basic operation and maintenance platform, and deliver basic functions such as data security, high service availability, backup and recovery, performance monitoring, and problem diagnosis. Even in the era of open source databases, database administrators in most companies still focused on these aspects, building them from scratch or customizing open source operation and maintenance components, which consumed a great deal of manpower and resources and made it hard to sustain operation and maintenance capabilities. Once the core operation and maintenance personnel left, the platform was likely to become unsustainable. In a cloud native architecture, the database PaaS platform provides rich operational support capabilities, so database administrators no longer need to build the operation and maintenance platform from scratch. The platform shifts from operating basic components to operating database services, and customized business support capabilities are developed on top of the rich OpenAPIs that the cloud platform provides. Providing stable database service support for the business becomes the primary goal of the operation and maintenance platform. At the same time, as the basic capabilities of the cloud platform gradually improve, new technologies can use the OpenAPI system to keep improving the capabilities of the service-oriented database operation and maintenance platform. We therefore need to realize that only by changing the goal of building the operation and maintenance platform can the advantages of the cloud native architecture be fully exploited.

3. Technical ability

The technical and architectural advantages of rich cloud services in the cloud native era have freed traditional database administrators from basic problems. Enterprises now need architects who can design business data architectures on top of cloud services, rather than traditional operation and maintenance database administrators, so database administrators need to make the transition as quickly as possible. In the cloud native architecture, many of the problems that used to require a database administrator have already been solved. A typical example is data security, which has always been the top priority of database administrators: they put huge effort into disk disaster recovery, data center disaster recovery, data backup, and other data security work. Availability Zones (AZs) and the distributed storage architecture of the cloud native era have natural advantages in data security. Another example is capacity planning, which has always been difficult to get right. When the business model changes, for example in a flash-sale scenario, it is easy to run out of system capacity. With the help of resource pooling technology, the cloud native system takes advantage of the elasticity of the architecture that separates storage from compute, which can significantly shorten the expansion period from “days” to “seconds”. Shared storage technology can bring up read nodes in seconds to expand the read capacity of the system. It is expected that in the near future, with breakthroughs in CPU pooling, memory pooling, and multi-point writable technology, database capacity elasticity will become even more powerful.

In addition, SQL optimization has always been an important part of a database administrator’s daily routine, and guiding business developers to write SQL that suits the database’s characteristics has always taken up a large share of the work. In the cloud native era, automatic optimization systems built on machine learning and expert knowledge give cloud database services self-awareness, self-healing, self-optimization, self-operation, and self-security. They help database administrators reduce the complexity of database management and eliminate service failures caused by manual operations, thereby effectively safeguarding the stability, security, and efficiency of database services.

In the cloud native era, cloud services have largely liberated database administrators, but they also require database administrators to complete their personal transformation as soon as possible and accelerate the move from database administrator to database architect, so that they can participate more deeply in business system architecture design and help developers make good use of the features of cloud databases.

Conclusion

Cloud native technology affects the daily work of the relevant technical roles in many respects, such as business processes, technology selection, and the technology stack, and its impact goes well beyond what is described above. Now that cloud native has become an inevitable trend, technical practitioners in different positions will evolve along with cloud native’s emphasis on focusing on the business, learn to embrace cloud native concepts and technologies, and use cloud native technologies and products to better release the value of cloud computing and better support the development of their businesses.

This article is the original content of Aliyun and shall not be reproduced without permission.