Article | Jin Min (Technical Expert, Ant Group), Qiu Jian (Red Hat)

Proofreading | Feng Yong (Senior Technical Expert, Ant Group)

3,311 words, about a 6-minute read

When Kubernetes first emerged, the industry repeatedly questioned whether it could stand up to production-grade use. Today, many large enterprises have used Kubernetes to move their large-scale infrastructure to the cloud, creating dozens or even hundreds of clusters within a single organization.

Kubernetes’ native management capabilities remain at the single-cluster level. Each cluster can operate stably and autonomously, but it lacks the overall management capability across multiple clusters. Infrastructure builders need to coordinate disparate management components to form a unified management platform.

Through such a platform, operations managers can track changes in resource utilization ("water levels") across clusters and fluctuations in node health; application owners can decide how to deploy and distribute services among the clusters; and application operations staff can check service status and issue migration policies.

Innovations around multi-cluster management keep emerging: ClusterAPI and Submariner, for example, are successful projects that each tackle a specific multi-cluster problem.

This article describes a technical exploration that attempts to address the full set of problems enterprises face in multi-cluster management.

Over the past five years, Ant Group, a Chinese technology company, has learned valuable lessons in how to think about, use and implement multi-cluster management.

Ant Group manages dozens of Kubernetes clusters around the world, each with thousands of nodes (servers) on average. At the architectural level, applications and the components they require (including middleware, databases, load balancers and so on) are organized into elastic logical units called logical data centers (LDCs), whose deployment is then planned onto the physical infrastructure. This architectural design helps achieve two key goals for infrastructure operations: high availability and transactionality.

  • First, the availability of service applications deployed in an LDC is guaranteed within the LDC they belong to.
  • Second, application components deployed in an LDC can be validated, and rolled back in the event of a failure.

Feng Yong, senior technical expert of Ant Group’s PaaS team, said:

“Ant Group has an infrastructure of dozens of Kubernetes clusters, hundreds of thousands of nodes and thousands of critical applications. In such cloud-native infrastructure, tens of thousands of pods are created and deleted every day. Building a highly available, scalable, and secure platform to manage these clusters and applications was a challenge.”

PART. 1 Starting with KubeFed

In the Kubernetes project ecosystem, multi-cluster functionality is handled primarily by the SIG-Multicluster team. The team developed a cluster federation technology called KubeFed in 2017.

Federation was initially considered a built-in feature of Kubernetes, but it soon ran into implementation and user fragmentation issues. Federation V1 could distribute services to multiple Kubernetes clusters, but it could not handle other types of objects, nor could it really "manage" clusters in any meaningful way. Some users with fairly specialized needs, notably several academic labs, still use it, but the project has been archived by Kubernetes and never became a core feature.

Federation V1 was then quickly replaced by a refactored design called KubeFed V2, which is used by operators around the world. It allows a single Kubernetes cluster to deploy multiple objects to multiple other Kubernetes clusters. KubeFed V2 also allows the "control plane" master cluster to manage other clusters, including their extended resources and policies. This was the first generation of Ant Group's multi-cluster management platform.
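To make the "push" model concrete, here is a minimal sketch of a federated object, assuming the types.kubefed.io/v1beta1 FederatedDeployment type that the kubefedctl tooling generates. The names (demo-app, the member clusters) are placeholders, and the field layout follows the public KubeFed user guide rather than Ant Group's internal setup. The hub holds a wrapper object whose template is an ordinary Deployment and whose placement lists the clusters the central control plane should push it to:

```go
// Sketch: building a KubeFed V2 federated object on the hub cluster.
package main

import (
	"encoding/json"
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

func main() {
	fedDeploy := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "types.kubefed.io/v1beta1",
		"kind":       "FederatedDeployment",
		"metadata": map[string]interface{}{
			"name":      "demo-app",
			"namespace": "demo",
		},
		"spec": map[string]interface{}{
			// The plain Deployment that should exist in every member cluster.
			"template": map[string]interface{}{
				"spec": map[string]interface{}{
					"replicas": 3,
					"selector": map[string]interface{}{
						"matchLabels": map[string]interface{}{"app": "demo-app"},
					},
					"template": map[string]interface{}{
						"metadata": map[string]interface{}{
							"labels": map[string]interface{}{"app": "demo-app"},
						},
						"spec": map[string]interface{}{
							"containers": []interface{}{
								map[string]interface{}{"name": "demo", "image": "nginx:1.21"},
							},
						},
					},
				},
			},
			// The central control plane pushes the template to these clusters.
			"placement": map[string]interface{}{
				"clusters": []interface{}{
					map[string]interface{}{"name": "cluster-beijing"},
					map[string]interface{}{"name": "cluster-shanghai"},
				},
			},
		},
	}}

	out, err := json.MarshalIndent(fedDeploy.Object, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Every type that is to be federated needs a wrapper of this kind, which is also the root of the CRD-doubling issue described later in this section.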

One of Ant Group's primary reasons for using multi-cluster federation is resource elasticity, not only at the node level but also at the site level: efficiency is improved and system capacity extended by adding nodes and entire clusters as needed. For example, November 11 is China's annual Singles' Day, and Ant Group usually needs to deploy a large amount of additional capacity quickly to support the peak online shopping workload. However, as they discovered, KubeFed was slow to add new clusters and inefficient at managing large ones.

In a KubeFed V2 deployment, a centralized Kubernetes cluster acts as a single "control plane" for all the other clusters. Ant Group found that the resource utilization of this central cluster was very high when it managed the member clusters and the applications running in them.

In tests that managed only 3% of Ant Group's application workload, they found that the central cluster, built on medium-sized cloud instances, was saturated and had poor response times. As a result, they never ran their full workload on KubeFed.

The second limitation has to do with Kubernetes' extension mechanism, custom resource definitions (CRDs). "Power users" like Ant Group tend to develop a lot of custom resources to extend management capabilities. To distribute CRD-based resources across multiple clusters, KubeFed requires that a "federated CRD" be created for each CRD. This not only doubles the number of objects in the cluster, it also creates serious problems in keeping CRD versions and API versions consistent across clusters, and can cause application upgrades to fail because of incompatibilities between CRD or API versions.

This proliferation of CRDs also leads to serious troubleshooting problems, as well as bad habits such as ill-defined CRD usage and ad hoc field changes, which make the KubeFed control plane even less robust. Where a local cluster has a custom resource, the federated control plane holds an aggregated view that represents that local resource, but if something goes wrong in the local cluster, it is difficult to tell from the federated control plane what the problem is. Operation logs and resource events in the local cluster are also not visible at the federation level.

PART. 2 Switching to Open Cluster Management

The Open Cluster Management (OCM) project was originally developed by IBM and open sourced by Red Hat last year. OCM has improved the approach to multi-cluster management based on the experience of Ant Group and other partners. It moves administrative overhead from the central cluster to an agent on each managed cluster, so that management is distributed autonomously and stays stable across the entire infrastructure. This allows OCM, in theory, to manage at least an order of magnitude more clusters than KubeFed. So far, users have tested managing up to 1,000 clusters simultaneously.

OCM is also able to take advantage of developments in Kubernetes itself to improve its capabilities. For example, capability extensions encapsulated in CRDs allow Kubernetes objects to be distributed between clusters using OCM's Work API (a subproject being proposed to SIG-Multicluster). The Work API embeds a subset of local Kubernetes resource definitions as the description of the objects to be deployed and leaves the actual deployment to the agent. This model is more flexible and minimizes the deployment requirements on any central control plane. The Work API can define multiple versions of a resource together, supporting upgrade paths for applications. It also accounts for maintaining state when the network connection between the central cluster and a managed cluster fails, and guarantees eventual consistency of resource state once the connection is restored.
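As a rough illustration of that model, the sketch below creates a ManifestWork on the hub through the Kubernetes dynamic client, assuming the work.open-cluster-management.io/v1 API that the open-source OCM project ships (the kubeconfig path, the cluster-beijing namespace and the embedded ConfigMap are placeholders, not Ant Group's actual configuration). The work object is written into the namespace named after the managed cluster, and the Klusterlet work agent on that cluster pulls it and applies the embedded manifests locally:

```go
// Sketch: distributing an object to a managed cluster via OCM's Work API.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Connect to the hub cluster with an ordinary kubeconfig (placeholder path).
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/hub-kubeconfig")
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	manifestWorkGVR := schema.GroupVersionResource{
		Group:    "work.open-cluster-management.io",
		Version:  "v1",
		Resource: "manifestworks",
	}

	// The ManifestWork lives on the hub in the namespace named after the
	// managed cluster; the work agent on that cluster pulls and applies it.
	work := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "work.open-cluster-management.io/v1",
		"kind":       "ManifestWork",
		"metadata": map[string]interface{}{
			"name":      "demo-configmap",
			"namespace": "cluster-beijing", // placeholder managed-cluster name
		},
		"spec": map[string]interface{}{
			"workload": map[string]interface{}{
				"manifests": []interface{}{
					// An ordinary Kubernetes object, embedded as-is.
					map[string]interface{}{
						"apiVersion": "v1",
						"kind":       "ConfigMap",
						"metadata": map[string]interface{}{
							"name":      "demo",
							"namespace": "default",
						},
						"data": map[string]interface{}{"region": "beijing"},
					},
				},
			},
		},
	}}

	created, err := client.Resource(manifestWorkGVR).
		Namespace("cluster-beijing").
		Create(context.TODO(), work, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created ManifestWork:", created.GetName())
}
```

Because the agent pulls the work rather than having the hub push it, the hub never needs network access to the managed cluster's API server, which is part of what keeps the central control plane light.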

Most importantly, OCM brings more automation to cluster deployment. In KubeFed, cluster onboarding is a "two-way handshake" based on "zero trust" between the central and managed clusters, with many manual steps to ensure security. The new platform simplifies this process. For example, because it works on a "pull" basis, there is no need for multi-stage manual certificate registration, or for circulating cleartext kubeconfig credentials, in order for a managed cluster to receive administrative commands from the central cluster.

Although the registration process still establishes trust in both directions, adding a new cluster to OCM requires very little work: operators simply deploy the "Klusterlet" agent on the target Kubernetes cluster and it is managed automatically. This not only makes things easier for administrators, it also means Ant Group can bring up the additional clusters it needs for November 11 faster.
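For a hedged picture of what the hub side of this handshake looks like, the sketch below accepts a newly registered cluster, assuming the cluster.open-cluster-management.io/v1 ManagedCluster type from the open-source OCM project (the kubeconfig path and the cluster-beijing name are placeholders; in practice the agent's certificate signing request must also be approved, which tooling such as clusteradm automates):

```go
// Sketch: accepting a newly registered managed cluster on the OCM hub.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Hub-side credentials only (placeholder path); no spoke kubeconfig is needed.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/hub-kubeconfig")
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	managedClusterGVR := schema.GroupVersionResource{
		Group:    "cluster.open-cluster-management.io",
		Version:  "v1",
		Resource: "managedclusters",
	}

	// ManagedCluster is a cluster-scoped object on the hub, one per spoke.
	// Setting spec.hubAcceptsClient completes the hub side of the handshake;
	// the Klusterlet on the spoke has already opted in by registering.
	patch := []byte(`{"spec":{"hubAcceptsClient":true}}`)
	accepted, err := client.Resource(managedClusterGVR).Patch(
		context.TODO(), "cluster-beijing", // placeholder cluster name
		types.MergePatchType, patch, metav1.PatchOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("accepted managed cluster:", accepted.GetName())
}
```

Contrast this with KubeFed, where joining a cluster means handing the hub a kubeconfig carrying credentials for that cluster's API server.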

PART. 3 What's next for Kubernetes multi-cluster?

In just four years, the Kubernetes community's multi-cluster management capabilities have grown rapidly, from Federation V1 to KubeFed V2 to Open Cluster Management.

Thanks to the talented engineers working in the SIG-Multicluster interest group and on external projects (OCM, Submariner and others), the scale and capability of multi-cluster management are far beyond what they were before.

Will a new platform emerge to push multi-cluster capabilities further, or is OCM the final answer?

Feng Yong believes:

"Looking to the future, with the joint efforts of Red Hat, Ant Group, Ali Cloud and other participants, the Open Cluster Management project will become the standard and foundation for building multi-cluster solutions based on Kubernetes."

In any case, one thing is clear:

You can now run the entire planet on Kubernetes

To learn more about cloud native topics, join the Cloud Native Computing Foundation and the cloud native community at KubeCon + CloudNativeCon North America 2021, October 11-15, 2021.

🔗 Link to the original: containerjournal.com/features/th…

Recommended Reading of the Week

  • Scaling the peak – Ant Group large-scale Sigma cluster ApiServer optimization practice

  • SOFAJRaft's practice at Tongcheng Travel

  • MOSN sub-project Layotto: opening a new chapter for service mesh + application runtime

  • Intelligent ant monitoring