Zhang Jianfeng, president of Alibaba Cloud Intelligence, speaking at the 2021 Cloud Conference

Cloud native saw a number of important advances in 2021. What trends are worth watching in 2022? Li Guoqiang (nickname: Zhan Yan), a senior technical expert at Alibaba Cloud, appeared on InfoQ's live video program to give his latest reading of cloud native trends. The following has been edited from the live broadcast; the full replay is available online. This content is reprinted from InfoQ and supplemented with relevant references to help readers learn more comprehensively.

What are the most impressive things that happened in the cloud native space in 2021?

Opinion: Cloud native has been talked about more and more since 2020, but I think 2021 is the year it really took off and became widely adopted by cloud vendors and users alike. Many impressive things happened in 2021; I would like to share two of them.

The first is that distributed cloud saw explosive growth in 2021. Both user adoption and the technical investment of cloud vendors showed a strong upward trend. Why? I think it is closely tied to how businesses are evolving. The rise of live streaming, 5G, IoT, and other fields has created higher demands on the form the cloud takes. People want the cloud to sit closer to where data is generated, so edge cloud, local cloud, and hybrid cloud keep multiplying. A very important trend in cloud computing today is the model of one cloud in many forms, so that users can tap the power of cloud computing everywhere. But this also poses a significant challenge to cloud infrastructure: users used to run on a single cloud, and the management complexity was acceptable; with multiple cloud forms, the challenge becomes much greater.

Cloud native technology is a natural fit for unified management across these many cloud forms, including the complexity challenges posed by hybrid cloud, so cloud vendors are investing heavily here. Amazon's EKS Anywhere launched in September, and Alibaba Cloud launched ACK Anywhere in September as well. Both are essentially more complete solutions that let users operate multiple clouds as if they were one. Business scenarios are driving the widespread adoption of the technology.

Second, in 2021 the cloud native adoption of the leading Internet companies reached a key milestone. The representative event is that the major Internet companies have basically completed their cloud native transformation, with all of their businesses 100% on the cloud. The availability and maturity of core cloud native technologies such as containers, microservices, and service mesh can now support the scale of the leading Internet companies. Progress varies by industry; the leading Internet companies are running in front and have basically completed a comprehensive cloud native transformation. In the coming years, other industries will gradually follow in their footsteps and go fully cloud native.

Some say that cloud native, and cloud computing in general, is a standards war. What do you think of this statement? Do you think the war is over?

Viewpoint: This is an interesting topic. Open source in the cloud native space is indeed hot right now; CNCF alone hosts more than 1,000 projects. What do the standards emerging from so many open source projects have to do with cloud computing companies? I would not define it as a standards war, because the evolution of standards is critical for both cloud vendors and users, and only standardization can truly improve scale and efficiency. In the future, for cloud vendors and other enterprises alike, the path to scale and efficiency must run through standardization.

There are several key standards in the cloud native space. The earliest was the container, which standardized application packaging and distribution. Before that, virtual machines and other approaches were not standardized enough, and Docker put an end to that problem. As Docker evolved and spread, new problems appeared in application scheduling and resource scheduling. Docker Swarm, Mesos, and Kubernetes competed with one another; Kubernetes won and became the de facto standard for resource scheduling. At the application layer, there was likewise a period of a hundred schools of thought contending: every enterprise was building its own cloud native applications. Now there are more and more open source voices and standards, such as the Open Application Model (OAM), and everyone is trying to define standards for the application layer.

The release of KubeVela 1.1 marks a new milestone in application delivery across hybrid environments

In the 2000s and earlier, standardization was driven by organizations that set up standards committees. Today the process more often starts with open source projects: once an open source project becomes a de facto standard, everyone follows it, and relevant bodies then participate in formalizing and promoting the standard. To me, standardization is above all collaboration between enterprises and the ecosystem, promoting the scale and universality of the whole cloud computing and cloud native technology system.

What trends in cloud native will continue into 2022?

Opinion: The cloud native space is genuinely rich, and a lot will carry over into next year, including the distributed cloud mentioned earlier. Within distributed cloud, I am particularly bullish on edge computing. Why? Because the scenarios keep getting richer. For example, we now consume less text and more audio and video, so video processing businesses are growing very fast, and demand for edge computing will become ever more intense.

As an extension of cloud computing, edge computing will be applied in more fields, and it will also pose many challenges to infrastructure. For example, some edge networks are weak networks, and computing resources there are not abundant. How can infrastructure play its role under these conditions? How do you solve O&M problems in cloud-edge collaboration? Under an edge architecture, containers can play a great role in network connectivity and elastic workloads, and cloud vendors have invested heavily here. Many open source projects, such as OpenYurt and KubeEdge, have entered CNCF. In 2021, edge technology saw a strong explosion on both the business side and the open source side, and I expect that in 2022 both the open source community and cloud vendors will make great strides.

OpenYurt in depth | Enabling cloud native management for edge devices

Are there any edge computing application cases to share?

Opinion: There are many cases. In Internet business, familiar things like CDN and audio and video processing are typical edge scenarios. For example, many campuses and factories capture video and then run in-depth analysis on it, such as detecting safety violations in factories, campuses, and residential buildings; the video must be analyzed after collection to surface problems. The old process was to collect the video and upload it to the central cloud or a local server for processing, but this no longer meets enterprise needs. Now companies want to process video data locally rather than upload it to a central cloud, to meet network latency requirements and reduce transmission costs. In this scenario, edge containers can manage the computing power for different scenarios, including pushing algorithms down to the edge.

Another example is the power industry, where substations are scattered across the country as infrastructure. Managing the computing power of this infrastructure and rapidly deploying business to edge nodes is a typical edge scenario. In the past, upgrading substation infrastructure might require sending people on site, which is very inefficient. With cloud native, infrastructure management, application management, and algorithm delivery can all be handled in a cloud native way, and efficiency improves greatly.

Sangfor's intelligent edge computing platform and OpenYurt: exploring an adoption scheme in practice

As more services move onto it, K8s configurations become more complex. It once liberated productivity, but now it seems to have become a shackle. What do you think of this phenomenon? Are there any new challenges for container applications this year?

Viewpoint: This is interesting. K8s was created to solve container scheduling and resource scheduling, a problem that is very complex for enterprises; K8s redefined application scheduling and resource scheduling in a cloud native way and did it better than the earlier schemes. Now the business types on K8s are increasingly diverse: from stateless at the beginning, to stateful later, and now even complex computing engines like AI run on K8s. This is a process of mutual reinforcement. As more and more kinds of workloads land on K8s, the whole K8s system does become more complex, but it can also manage more and more. If users eventually run everything on containers, the complexity of the container layer is bound to increase.

But for enterprises and cloud vendors, the challenge is to reduce container complexity even as containers do more; otherwise the barrier to entry stays very high. Vendors are now investing more from the angle of intelligent operations, for example using intelligent O&M in cluster management to detect problems in current operations and propose solutions. There are also intelligent application profiles and resource profiles to improve resource utilization. Intelligent O&M is a hot direction.

Also, changes in the entire technology stack lead to changes in the entire enterprise organization, and these are often disruptive changes.

The ecosystem around containers is arguably the most important IT revolution of the last decade, and it is bound to drive a number of changes, including organizational changes within the enterprise. We will see not only changes in the entire O&M management system but also new organizational forms, such as the SRE team model proposed by Google, which is responsible for availability. Many companies that are heavily cloud native, including Alibaba, have dedicated SRE teams responsible for building overall availability capabilities. Second, enterprises will also form horizontal platform departments that build on the cloud native stack to support the business above. In the past, some enterprises were siloed by business unit, with each unit carrying its own support team; the adoption of K8s will allow more platform-style horizontal departments to emerge. This is also a solution to complexity, since not every vertical line of business has the resources and expertise to solve these problems. When an enterprise is large enough, it must consider forming horizontal departments at the SRE layer and the platform-building layer to separate these functions.

What do you expect to be the focus of container technology development in 2022? What are the prospects for its future applications?

Opinion: Another hot phrase this year: green and low-carbon. In addition, the Internet industry this year feels a bit like it has entered a winter period, and many Internet companies have proposed to cut costs and improve efficiency. Cost reduction has become a very important KPI for the CTOs of many enterprises and an inevitable trend of technological development.

In terms of cost reduction, enterprises can do a lot. At the lower layers, many cloud vendors and Internet companies are developing their own chips. The investment is very large, but hardware-software co-design does bring cost reduction and efficiency gains. Container-optimized operating systems, a technology roughly six or seven years old, are another important optimization at the infrastructure level.

Elasticity is widely used by many enterprises to reduce costs and improve efficiency, especially in combination with cloud vendors. At present, many Internet companies are also trying online-offline colocation technology, which essentially improves machine utilization. In the past, each company built its own data centers or bought servers on the cloud, and utilization was usually below 10%. That is not high, and many enterprises try to push it up, but doing so inevitably brings technical challenges, such as whether multiple workloads will interfere with one another when running at high utilization. Several major vendors are trying to release online-offline colocation (mixed deployment of multiple workloads) technology as open source or commercial products, and I believe colocation technology will see further commercialization in the next year.
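The economics behind colocation can be seen with simple arithmetic. The sketch below is a hypothetical illustration (the fleet sizes and utilization figures are invented, not any vendor's data) of how many machines are needed to serve a fixed demand at a given average utilization:

```python
import math

def machines_needed(demand_cores: float, cores_per_machine: int,
                    target_utilization: float) -> int:
    """Machines required to serve a given average core demand
    when each machine runs at the target average utilization."""
    usable_per_machine = cores_per_machine * target_utilization
    return math.ceil(demand_cores / usable_per_machine)

# A fleet serving 1,000 cores of average demand on 32-core machines:
before = machines_needed(1000, 32, 0.10)  # ~10% utilization, no colocation
after = machines_needed(1000, 32, 0.40)   # ~40% with online-offline colocation
print(before, after)  # 313 79
```

Quadrupling average utilization shrinks the fleet roughly fourfold, which is why colocation is attractive despite the workload-interference challenges it raises.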

There is a concept called FinOps, which is about cost-oriented operations and management. In this regard, what we are doing is cost visualization, such as letting the enterprise know the cloud-vendor or on-premises cost of each internal department and the cost of each business.

On the open source side, a project called Kubecost provides this capability. Cloud vendors provide "cost centers" in their container services that help users link cloud billing to clusters, see the cost per department and per business clearly, and even get recommendations accordingly. This is well suited to mixed-deployment scenarios. Cost management will also see further development next year.
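At its simplest, the cost-visualization idea is splitting one bill in proportion to measured consumption. A minimal sketch follows; the department names and numbers are invented for illustration, and real products work from billing APIs and cluster metering rather than a hand-built dict:

```python
def allocate_cost(total_bill: float, cpu_hours_by_dept: dict) -> dict:
    """Split one cluster bill across departments in proportion to CPU-hours used."""
    total_hours = sum(cpu_hours_by_dept.values())
    return {dept: round(total_bill * hours / total_hours, 2)
            for dept, hours in cpu_hours_by_dept.items()}

usage = {"search": 6000, "ads": 3000, "data-platform": 1000}
print(allocate_cost(10000.0, usage))
# {'search': 6000.0, 'ads': 3000.0, 'data-platform': 1000.0}
```

The same proportional split extends naturally to memory, GPU hours, or blended unit prices once the metering data is in place.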

Another interesting development is that some overseas cloud vendors have proposed a concept called carbon bills, which translate an enterprise's resource consumption into the form of a carbon bill. This is also an interesting direction.

Cost reduction is the eternal pursuit of every enterprise. When an enterprise is growing rapidly, however, this pursuit is less pressing and business comes first. The need to reduce costs becomes more apparent once the business plateaus or runs into difficulty.

The application scenarios of Serverless are quite diverse. Will this affect the universality and reusability of this technology? Why is that?

Opinion: Serverless is also a topic everyone has been discussing lately. First, I would like to talk about what Serverless is, because everyone understands it differently. Some people simply equate Serverless with function compute. Indeed, Amazon first launched AWS Lambda as a function compute product and defined it as Serverless. But today the scope of Serverless is much broader: Serverless is essentially a design philosophy and has gone well beyond the category of function compute.

Now the market has Serverless products oriented toward function compute, as well as Serverless products oriented toward applications, such as App Runner from overseas cloud vendors and domestic Serverless application engines. These let users run more traditional applications on a Serverless architecture without modification and without worrying about the underlying IaaS infrastructure. There are also K8s-oriented Serverless products, which users operate through the K8s interface, and container-oriented Serverless products that deliver container instances. The essence of all of these is to let users consume cloud resources through one interface without regard to the underlying infrastructure. The diversification of Serverless gives users more choices.

Another big trend is that more and more cloud products are themselves becoming Serverless. As you can see from Amazon's re:Invent, a number of cloud products are now offered in Serverless form, such as the Serverless version of Kafka. This means users no longer need to think about capacity when they use the product.

Breaking through the boundaries of Serverless adoption: Alibaba Cloud SAE releases 5 new features

The Serverless-ization of cloud products also brings diversity. In my view, this diversity comes from the continuous enrichment of the Serverless concept in user interfaces and product forms, and it also pushes the industry's standardization forward. As users adopt a variety of Serverless products, they can use products from different cloud vendors in standard forms. Function compute, for example, is increasingly triggered through standard HTTP, and its observability integrates with open source technologies such as Prometheus and OpenTelemetry, so Serverless products are becoming more and more standardized.
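The core of an HTTP-triggered function is a stateless handler invoked once per request. The sketch below shows only the idea; the event shape and handler signature here are invented for illustration, as each vendor defines its own:

```python
import json

def handler(event: dict) -> dict:
    """Stateless handler: read the HTTP-style event, do the work, return a response.
    No local state survives between invocations; the platform scales instances
    up and down (to zero) with traffic."""
    name = event.get("queryParameters", {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Simulate one HTTP-triggered invocation locally:
resp = handler({"queryParameters": {"name": "serverless"}})
print(resp["statusCode"], resp["body"])
```

Because the handler owns no infrastructure and no state, the same code can run behind any vendor's HTTP trigger once adapted to that vendor's event schema.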

The diversification of user needs goes hand in hand with the standardization of Serverless products, and this is a necessary process for more and more users to adopt it.

We said in 2019 that the future of Serverless has arrived. In your opinion, has this “future” really arrived?

Opinion: Gartner publishes a technology hype cycle, in which a new technology passes through periods of emergence, inflated expectations, disillusionment, and finally a plateau of productivity. In my view, Serverless has now passed the disillusionment phase and is beginning to plateau. I think Serverless peaked a couple of years ago, when enthusiasm for it was at its height. The diversity of scenarios mentioned above is related to this: more and more scenarios were proposed during the expansion period, and after the disillusionment period more and more of them are actually landing.

Let me give you a few examples, and you can judge for yourself whether Serverless has really landed.

The first is Alibaba itself during Double 11 in 2021, when a large number of front-end applications were implemented on a Serverless framework. This is a typical Serverless scenario and relatively easy to land. Front-end businesses built on the Node.js framework are now developed and deployed with Serverless across many of Alibaba's internal business units, which also supported the massive application load of Singles' Day. Serverless delivers very high development efficiency and extreme elasticity.

In addition, many users run audio and video processing on a Serverless architecture. A music service provider this year used Function Compute on Alibaba Cloud's public cloud for audio and video processing, including audio transcoding and automatic recognition. The provider chose Serverless for its elasticity: for example, right after acquiring the rights to a batch of songs, it needs to transcode all of them quickly, which is a bursty, highly parallel batch workload that Serverless handles well. There are also enterprises with microservice architectures, such as video apps, whose investment in infrastructure O&M would otherwise be relatively high. Some of them choose application-oriented Serverless products, such as Alibaba Cloud's Serverless App Engine, and deploy their microservices on the platform without any infrastructure management. This is the true value of Serverless today.
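The transcoding workload described above is a natural fan-out: one independent task per track, scaled out for the burst and back to zero afterward. A hypothetical local sketch of the pattern follows (a real Serverless platform would launch one function instance per event rather than local threads, and `transcode` here is a stand-in, not a real codec call):

```python
from concurrent.futures import ThreadPoolExecutor

def transcode(song_id: str, bitrate_kbps: int) -> str:
    # Stand-in for one function invocation that transcodes a single track.
    return f"{song_id}@{bitrate_kbps}kbps"

def batch_transcode(song_ids, bitrate_kbps=320, max_workers=8):
    """Fan independent tracks out to parallel workers, the way a Serverless
    platform fans one event per track out to many function instances."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda s: transcode(s, bitrate_kbps), song_ids))

results = batch_transcode([f"song-{i}" for i in range(100)])
print(len(results), results[0])  # 100 song-0@320kbps
```

Because each task is independent, throughput scales with the number of workers, which is exactly the elasticity the speaker describes.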

NetEase Cloud Music's road to Serverless for audio and video algorithms

Does cloud native have special requirements for programming languages?

Opinion: I don't know if you are aware, but what is the most popular back-end development language in China? It is still Java. But today, multilingualism is inevitable. Many companies use Go as their primary development language, and PHP is also widely used. Each language has different characteristics, and many enterprises choose a language to fit the business need. At that point there may be multiple languages: one business unit thinks Go is better, front-end people want PHP or Node.js, and multiple languages become more and more common within the enterprise.

Developers can use whatever language they like, but operations people face big challenges, such as how to unify service governance in a multi-language environment. Technologies such as Service Mesh have been introduced in the cloud native space precisely for multi-language service governance. In terms of ecosystem, the most mature back-end language is Java, and Java talent is easy to recruit, while Go shows a very good growth trend. In the future, companies will become more and more tolerant of multiple languages.

Java was basically the dominant language at Alibaba, but now Alibaba uses multiple languages. Alibaba has acquired many companies, such as Ele.me, Fliggy, and AutoNavi, and it is impossible to make every acquired company change its programming language; that is very difficult. As a result of these acquisitions, Alibaba's programming languages have diversified. If you are big enough, you will be multilingual. If you are a startup or not big enough, language unity can be a convenience.

A viewer asked: cloud native is very popular, and it can feel as though you are not advanced without cloud native, Mesh for example. What do you think of this question?

Viewpoint: The popularity of the cloud native space is market- and business-driven. The richness of technological development also brings the difficulty of selection, and the risk of choosing the wrong route is very real. Cloud native technology is very hot today; as mentioned, CNCF hosts more than 1,000 projects, so which should users adopt? This is something every business genuinely thinks about. In my opinion, choose what suits you, on the premise that the technology actually supports your scenario. As for whether to choose Mesh, it ultimately depends on the business demands of the enterprise.

For example, for a more conservative team that wants a stable landing and supports a single language, the mature Spring Cloud and Dubbo are good choices. But if a team is choosing an architecture for multiple languages or for the future, some organizations will choose Mesh. Mesh has evolved over the years and the open source community has matured: Istio has become a de facto standard, and many companies run these technologies in production. They don't need to worry about whether Mesh is a bubble.

Service Mesh ASM year-end summary: how do end users use a service mesh?

The service mesh aims to be cloud native network infrastructure. How far along is that goal, in your view? What are the next R&D and application priorities?

Opinion: My answer reflects my view that service mesh technology is maturing, with Envoy and Istio becoming more and more popular. A previous CNCF survey put service mesh usage at 27%, which is good. Leading Internet companies are already using Istio or building their own on top of the community version. Ant Group announced a few years ago that all of its core business runs on Mesh, and Alibaba Group uses Mesh to address multi-language governance. If your requirements match and you have the technical reserves, you can try Mesh.

However, putting the community edition into production still faces some challenges, such as the gradual migration of legacy systems. The Istio system is closely tied to the K8s ecosystem, but many enterprises have not fully moved from VMs to containers, and some VMs remain. How can the service mesh support VMs? Some companies have a mix of microservice frameworks, and some already use Spring Cloud: can Mesh integrate with Spring Cloud? Community solutions here are not yet particularly comprehensive. Observability is also one of the most important aspects of running a Service Mesh in production. Businesses that build entirely on their own face these technical challenges.

To truly build a Mesh system themselves, enterprises need technical reserves and the right talent. Alternatively, they can leverage the power of cloud vendors: several cloud vendors now have Mesh-related cloud products, for example Alibaba Cloud's managed Service Mesh offering. An enterprise can assess the capabilities of its technical team against its business needs before deciding whether to build everything itself or leverage a cloud vendor's capabilities.

A viewer asked: what are Alibaba's observability components?

Viewpoint: Observability is also a very important area of cloud native and a necessary companion to enterprise production.

There are two big trends in observability. The first is full-stack observability. A common challenge in services is that when a user reports a problem, the enterprise needs to quickly pinpoint it through observability across the entire link. As architectures grow more complex, an enterprise may need to trace from the user side, through the front end, to the application layer, and then down to the infrastructure layer; what is needed is whole-link diagnostic capability. So an important trend in observability is connecting the whole link.

Another very important trend is unifying the data systems. Observability has three kinds of data: metrics, tracing, and logging. These three were developed separately in the past, but today's users very much want to integrate them for unified monitoring. For example, when a problem occurs, a developer might want to look at the logs of the transaction where the metric anomaly was detected, and after seeing the logs, look at the entire trace for that transaction. In observability scenarios, the demand for unified monitoring data is growing more and more intense. What Alibaba is doing here, around the two points above, is offering a full range of managed services, such as managed Prometheus and Grafana products.
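The unification described above hinges on the three data types sharing one correlation key, typically a trace ID. A minimal sketch of the metric-to-logs-to-trace drill-down follows; the field names and records are illustrative, not any particular vendor's schema:

```python
# One trace_id stitches a metric anomaly to its logs and spans.
logs = [
    {"trace_id": "t-1", "level": "ERROR", "msg": "payment timeout"},
    {"trace_id": "t-2", "level": "INFO", "msg": "ok"},
]
spans = [
    {"trace_id": "t-1", "service": "gateway", "duration_ms": 2050},
    {"trace_id": "t-1", "service": "payments", "duration_ms": 2010},
    {"trace_id": "t-2", "service": "gateway", "duration_ms": 35},
]

def drill_down(trace_id: str):
    """Given the trace_id behind a metric anomaly, pull its logs and spans."""
    return ([l for l in logs if l["trace_id"] == trace_id],
            [s for s in spans if s["trace_id"] == trace_id])

bad_logs, bad_spans = drill_down("t-1")
print(len(bad_logs), len(bad_spans))  # 1 2
```

This is essentially what standards such as W3C Trace Context and OpenTelemetry enable at scale: every signal carries the same trace ID, so one query moves between the three data types.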

Annual review | Alibaba Cloud observability in 2021

Some in the community have also asked how companies can explore cloud native if their development and test engineers are not skilled in container technology.

Viewpoint: I think the whole journey to the cloud should follow the enterprise's situation and mode. A common industry framing divides moving to the cloud into several phases. The first and simplest is Rehosting: moving the original offline data center to the cloud. What used to be a virtual machine offline is still a virtual machine on the cloud. This often changes the enterprise financially: it used to hold assets, and now it buys services on the cloud. For the enterprise this is a lift-and-shift; the overall value is somewhat lower, but the cost is also the lowest, the business basically does not need to be transformed, and the O&M mode does not need to change. Any business can do it.

The second step is Replatforming, which relates to some cloud native concepts, such as turning old virtual machines into containerized workloads. A typical feature of Replatforming is that enterprises do not need to modify applications, only change the system's operations mode; application transformation is usually costly, and containerization generally does not require it. Another option is to move from self-built open source tools to cloud vendors' products, for example replacing self-built MySQL with a vendor's RDS. At this stage enterprises can really see cloud native's results in cost reduction and efficiency improvement.

From a team-building perspective, you still need people who know K8s. There is a great deal of K8s learning material: InfoQ, the official CNCF website, open source community sites, and Alibaba Cloud all offer abundant resources.

The last stage, which many enterprises are now doing, is Refactoring. The enterprise's whole application architecture often changes, including Serverless, microservices, and so on. This phase involves application transformation, but it is where the application side really takes advantage of the cloud. Enterprises can combine their own characteristics and choose a gradual path to cloud native.

In addition, enterprises should look at their own business types. There is now a notion of bimodal IT: steady-state and agile. Steady-state describes businesses that do not change much. For such businesses, we suggest Replatforming only, because their iteration speed is not that fast and business changes are small, but they still need containerization and similar approaches to strengthen stability and elasticity. Agile businesses, by contrast, iterate rapidly, so Refactoring such as microservices adoption may be suggested, which can improve overall R&D efficiency.

Enterprises should choose their own path to cloud native based on their business types and technical reserves.

As cloud native systems get bigger and developers have more to learn, do you have any learning tips for everyone?

Viewpoint: There is a lot to learn, and iteration is very fast. I suggest approaching it from a different angle. Self-study is certainly fine; there are all kinds of materials online. But one thing you must do when building cloud native capabilities is to start from a business-driven perspective.

Cloud native, the golden age of developers

I have seen some enterprises do cloud native purely for technology's sake, which may not end well. Most of the time, they should first consider what to do from the perspective of business value, and then choose the corresponding technology. On the one hand, a business-driven enterprise will have enough resources to invest; on the other hand, it will get enough practice in technology selection and implementation.

In terms of the field itself, my advice is to lay a solid foundation first, and then round out the necessary production skills. Container technology is the cornerstone of everything; after that come the more critical technologies that really need to be deployed in the enterprise, such as observability, CI/CD, and microservices.