1. Definition of SuperEdge

To quote the definition on the SuperEdge open source website:

SuperEdge is an open source container management system for edge computing to manage compute resources and container applications in multiple edge regions. These resources and applications, in the current approach, are managed as one single Kubernetes cluster.

In plain terms:

SuperEdge is an open source edge container solution that manages resources and container applications across regions within a single Kubernetes cluster.

Let’s briefly unpack the key phrases in this sentence:

  • Open source

Although this solution comes from the Tencent Cloud container team, it is a completely vendor-neutral open source project. On the day of the official announcement, Intel, VMware, Meituan, Cambrian, Capital Online, Hueyang, and Tencent jointly announced and open-sourced the project, so Tencent alone does not make the final decisions. Students and enterprises interested in edge computing are welcome to join the project, build it together, and help edge computing land and grow in real-world scenarios.

  • Edge container solution

Containers still need to be orchestrated and managed, and the most popular container orchestration solution is Kubernetes. So why not simply use Kubernetes for container orchestration and management at the edge? We’ll come back to that.

  • Single Kubernetes cluster

The SuperEdge team has not modified a single line of Kubernetes code, which means the solution is fully Kubernetes-native and retains every feature of the corresponding Kubernetes version.

  • Manage resources and container applications across regions in a single Kubernetes cluster

The key words are single and across regions. Why manage resources and container applications across regions in a single Kubernetes cluster? The scenario, the scenario, the scenario.

Take a simple example: a supermarket chain has 10 outlets, and each outlet runs the same same-day special-offer promotion program. The network between outlets, and between each outlet and the central control plane, is completely disconnected and independent, so each region is isolated. The promotion program at every outlet is exactly the same and pushes exactly the same content. The goal is to manage all 10 outlets in a single Kubernetes cluster as easily as managing a single outlet.

2. Possible problems

Looking at the feature list on the SuperEdge website, the features did not make much sense to me at first either. What is network tunneling? Why is a built-in edge orchestration capability needed? It is also not obvious what problems each feature is meant to solve.

So let’s take the supermarket chain with 10 outlets and no network connectivity between the center and the outlets, design a solution of our own for deploying and maintaining the same application, and look at the problems that have to be solved. That way we can better grasp SuperEdge’s features and design intent.

In a real-world scenario, the problems we would face are:

1. Ten outlets running the same program

While there are only 10 outlets in this example, the real number could be hundreds or thousands, and we also have to consider future scalability. Getting the same program on hundreds or thousands of nodes to do the same thing in sync is no small difficulty; it would help enormously if managing hundreds of outlets were as easy as managing one.

2. No network between the center and the outlets, or between outlets

This scenario really does exist; after all, dedicated lines are time-consuming and costly. Reality is even messier: each outlet may be a small machine room, or just a small box with no public IP. Some sites have hundreds of boxes of which only one can reach the external network, and at some sites none of the boxes can, which makes deployment quite difficult.

3. Weak network and geographical distance

The edge is not like the center. Many boxes at the edge, such as cameras, reach the public network or connect over WiFi wherever they happen to be. Losing connectivity from time to time is perfectly normal: for a few minutes, a few hours, or even several days, not to mention power cuts and restarts. The challenge in this scenario is to keep edge workloads serving normally and to make the services as available as possible.

4. How do we know whether edge nodes are healthy?

If an edge node is unhealthy, the services on the abnormal node should be rescheduled onto available nodes to keep them healthy as far as possible. But how do we determine a node’s health when the network between the center and the edge is unreachable, and the nodes cannot reach each other either?

5. Limited resources

Embedded edge nodes often have very limited resources; 1 CPU core and 1 GB of memory is very common. How do we make sure edge services keep running when edge resources are this limited?

6. Deploying resources to both cloud and edge

The Kubernetes cluster in the central cloud needs to manage not only the applications in the central cloud but also the applications at the edge. How can that be done?

Without expanding further, there are many more details to consider. We cannot wait until we run into these problems in the field to solve them; the solution has to address as many of the foreseeable problems as possible at design time. That lets us devote our limited time to the actual business rather than to wrestling with the underlying architecture.

3. Thinking through our own solution

If we ran into the problems above, how would we solve them?

1. Ten outlets running the same program

What is the best way to get the same program running in a consistent environment everywhere? Containers. Once an image has been built and tested, it will run without surprises anywhere, as long as the underlying machine model and operating system are consistent.

2. What do we use for container orchestration?

Kubernetes, of course. But can the open source community’s Kubernetes be used directly, as is? The answer is: no! Why not? Because the Kubernetes network model documentation clearly states:

Kubernetes imposes the following fundamental requirements on any networking implementation (barring any intentional network segmentation policies):
- pods on a node can communicate with all pods on all nodes without NAT
- agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node

In other words, Kubernetes requires that nodes can communicate with the master network where kube-apiserver lives: node components can reach master components, and master components can reach node components. But our second problem is precisely that the network between the center and the outlets is unavailable, so deploying native Kubernetes directly is clearly not feasible. How, then, do we manage edge services?

3. Weak network

Even with a connection established between the center and the edge, a weak network means the two will still be disconnected from time to time. How do we keep edge containers serving normally while the network is down? There is also the question of edge node restarts: how do the edge containers come back and serve properly afterwards?

4. How do we deploy to edge nodes that are far apart geographically?

Geographic distance also makes deployment and operations an issue. We cannot travel to the customer’s site every time a new outlet needs to be deployed, and when something goes wrong we cannot keep asking the customer to open a back door so that we can remotely connect to their nodes to fix it.

Thinking through a real scenario like this surfaces plenty of problems, and I hope they prompt some deeper thought. There is no unified standard for edge solutions; the one that smoothly solves the customer’s problems is the right one.

Let’s take a look at SuperEdge’s solution.

4. SuperEdge features

Before introducing the features, here is SuperEdge’s architecture diagram. With the diagram in front of us, we can explore how SuperEdge is implemented:

1. Network tunneling

Reading this feature’s introduction alongside the architecture diagram, network tunneling is simply the tunnel shown in the diagram: a tunnel-edge component is deployed on each edge node and a tunnel-cloud component is deployed in the center. Because the edge side dials out to the cloud side, even an edge node without a public IP can establish a connection to the cloud, and the center and the edge node can then communicate with each other. In essence, this is classic tunneling technology.
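To make the direction of connection establishment concrete, here is a minimal sketch, not taken from SuperEdge’s own manifests, of how the cloud side could be exposed so that edge agents only ever dial out. The namespace, labels, and port numbers are assumptions for illustration.

```yaml
# Illustrative only: expose a tunnel-cloud Deployment through a NodePort
# Service so that tunnel-edge agents on edge nodes (often behind NAT, with
# no public IP) can dial out to it. All names and ports are example values,
# not SuperEdge defaults.
apiVersion: v1
kind: Service
metadata:
  name: tunnel-cloud-public
  namespace: edge-system
spec:
  type: NodePort
  selector:
    app: tunnel-cloud            # matches the tunnel-cloud pods running in the center
  ports:
  - name: tunnel
    port: 9000                   # port that tunnel-edge agents connect to (example value)
    targetPort: 9000
    nodePort: 30900              # publicly reachable entry point on the cloud side (example value)
```

Because the edge side initiates the connection, the center can then reach edge nodes over the established tunnel even though it could never connect to them directly.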

2. Edge autonomy

This feature mainly addresses weak networks and restarts. Even with network tunneling, the instability of the network between edge nodes and the cloud does not change; frequent disconnections still happen. Edge autonomy covers two edge scenarios: first, when the network between the center and the edge is cut off, the services on the edge node keep running unaffected; second, when an edge node restarts, the services on it can still be restored afterwards.

How does this function work?

Take a look at the introduction to lite-apiserver. This component caches all the management data that the edge node requests from the center and falls back to that cached data when the connection drops, keeping edge services running and allowing them to come back normally even after a restart. There is still a question, though: applications running autonomously at the edge also generate data. What happens to that data? Will it turn into dirty data that affects edge services? I’ll leave that for you to think about.
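To make the caching idea concrete, here is a minimal sketch of the kind of kubeconfig an edge kubelet could use so that its requests go through a local lite-apiserver instead of straight to the remote kube-apiserver. The listen address, port, and credential paths are illustrative assumptions, not values taken from the SuperEdge documentation.

```yaml
# Illustrative kubeconfig for an edge kubelet: point it at a local
# lite-apiserver, which forwards requests to the central kube-apiserver and
# serves cached responses when the cloud is unreachable. The address, port,
# and certificate paths below are assumptions for this sketch.
apiVersion: v1
kind: Config
clusters:
- name: lite-apiserver
  cluster:
    server: https://127.0.0.1:51003   # local lite-apiserver endpoint (example value)
    insecure-skip-tls-verify: true    # for the sketch only; configure a proper CA in practice
users:
- name: kubelet
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
contexts:
- name: default
  context:
    cluster: lite-apiserver
    user: kubelet
current-context: default
```

With the kubelet talking only to the local endpoint, a restarted node can rebuild its pods from the cached data even while the link to the center is down.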

3. Distributed node health monitoring

What puzzled me more was this: when the network between the center and an edge node is cut off, why are the edge node’s services not evicted? By Kubernetes’ logic, the services on that node should be evicted onto other Ready nodes. Why does no eviction happen? Does SuperEdge simply turn eviction off for edge nodes? I confirmed that SuperEdge does not turn off eviction; the open source documentation on the official website emphasizes that Pod eviction only happens once an edge node anomaly has been confirmed.

So how does SuperEdge confirm that an edge node is abnormal? I found the answer in the introduction to the edge-health component.

edge-health runs on every edge node and detects the health of the edge nodes within a region. The principle is roughly this: within a given region the edge nodes can reach each other, so the edge-health instance on each node periodically probes the others, and itself, to confirm their health. Each edge node’s health is then calculated from these probe results. If XX% of the nodes consider a node abnormal (XX% is configured by the HealthCheckScoreLine parameter, 100% by default), that result is reported to the central edge-health admission component. For example, with 10 nodes in a region and the default threshold of 100%, a node is judged abnormal only when every probing node in the region marks it unhealthy.

edge-health admission is deployed in the center and decides whether to evict an edge node’s services onto other edge nodes by combining the node status seen by the control plane with the edge-health voting results. With edge-health and a properly chosen HealthCheckScoreLine, services are not evicted as long as the edge node has not actually gone down. This both improves the availability of edge services and extends the usefulness of Kubernetes eviction at the edge.

I still have a question, though: if an edge node really does go down and its services are evicted to another edge node, and the original node later recovers, what happens to the services on that node?

4. Built-in edge orchestration capability

In my opinion this is the most practical feature. It is the ServiceGroup capability, provided by the combination of the Application-Grid Controller and Application-Grid Wrapper components in the architecture diagram.

What does ServiceGroup do? Using the supermarket chain with 10 outlets as the example, it provides roughly the following capabilities:

  • Multiple outlets can deploy the same special-offer promotion program simultaneously;

What is special about this? “Simultaneously” means that a Deployment submitted to the central cluster once is rolled out to all 10 outlets at the same time, with each outlet getting its own copy of that Deployment. Just like that, you can manage the promotion program for 10 outlets as easily as for one, and a newly added outlet automatically gets the promotion program deployed as well, with no version or naming issues to worry about (see the manifest sketch at the end of this section).

  • The services deployed at different sites can differ, i.e. grayscale capability;

This is an extension of ServiceGroup. The same set of services may need different fields or different image versions at different sites. The TemplatePool of a DeploymentGrid can be used to define templates for different regions or sites, and Kubernetes’ patch mechanism then produces the different running versions from a base version. With a DeploymentGrid’s TemplatePool it becomes easy to grayscale one region before a service goes fully live, or to define a set of services that run differently at different sites.

  • The same program does not access services across sites;

What does that mean? Even if the outlets are interconnected and every outlet has the promotion program deployed, the program at outlet A will not call the promotion service at outlet B; it only calls the service at its own outlet, which closes the traffic loop within each site.

In my opinion, there are two advantages to this:

  • Traffic access is closed within each outlet: an outlet’s services are only accessed from within that outlet;
  • It reinforces autonomy at the edge: if an outlet cannot connect to the center, it keeps working against its own services and does not call other outlets.

This capability is genuinely useful in edge scenarios. The two components, Application-Grid Controller and Application-Grid Wrapper, can also be deployed independently in a Kubernetes cluster to solve the problem of managing many sites each running its own copy of the same solution. You can follow the ServiceGroup documentation and try it out for yourself.
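As a rough sketch of what this looks like in practice (based on my reading of the SuperEdge examples; the gridUniqKey, labels, image, and ports are illustrative, and the exact schema should be checked against the project’s documentation), nodes are labeled with a site key and a DeploymentGrid/ServiceGrid pair fans the workload out once per site:

```yaml
# Label each edge node with its site, using "zone" as the gridUniqKey
# (illustrative choice), e.g.:
#   kubectl label node outlet-1-node-1 zone=outlet-1
#
# One DeploymentGrid submitted to the central cluster then produces one
# Deployment per distinct "zone" value.
apiVersion: superedge.io/v1
kind: DeploymentGrid
metadata:
  name: promo-grid
  namespace: default
spec:
  gridUniqKey: zone
  template:                      # an ordinary Deployment spec
    replicas: 1
    selector:
      matchLabels:
        app: promo
    template:
      metadata:
        labels:
          app: promo
      spec:
        containers:
        - name: promo
          image: nginx:1.21      # stand-in for the promotion program image
          ports:
          - containerPort: 80
---
# The matching ServiceGrid closes the traffic loop: pods at outlet A resolve
# this service to endpoints at outlet A only.
apiVersion: superedge.io/v1
kind: ServiceGrid
metadata:
  name: promo-svc-grid
  namespace: default
spec:
  gridUniqKey: zone
  template:                      # an ordinary Service spec
    selector:
      app: promo
    ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```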

That’s all for today; this more or less covers the analysis of SuperEdge’s basic features. If there’s a chance, I’ll dig into SuperEdge’s internals in a later article!

5. Collaboration and open source

The core edge computing components of TKE Edge, Tencent’s edge container management service, have been open-sourced as the SuperEdge project. You are welcome to take part in building edge computing and in the SuperEdge open source project, so that the edge capabilities you develop can benefit more people. Below is the WeChat group of the SuperEdge open source project, where edge topics are discussed.

<img src="https://main.qcloudimg.com/raw/4a030d87d4bf0ed268edd96f8f38f1af.png" style="zoom:50%;" />

SuperEdge version:

  • SuperEdge – V0.4.0
  • SuperEdge – V0.3.0
  • SuperEdge – V0.2.0
  • Multiple partners team up to open source SuperEdge

TKE EDGE

  • 【TKE Edge Container Series 】 Install Edge K8S Clusters and Native K8S Clusters with EdgeADM
  • Learn SuperEdge from 0 to N: must-read material! [A collection of 18 articles]
  • 【TKE Edge Container Series 】 Break down the Intranet barrier and add hundreds of edge nodes from the cloud at a time
  • SuperEdge Edge Tunneling Features: Operate and maintain edge nodes from cloud SSH
  • 【TKE Edge Container Series 】 Learn about the SuperEdge Edge Container architecture and principles
  • 【TKE Edge Container Series 】 Read about SuperEdge distributed health checks at the edge
  • 【TKE Edge Container Series 】 Read about the SuperEdge topology algorithm
  • 【TKE Edge Container Series 】 Read SuperEdge cloud edge tunnel

Related case studies:

  • Edge Containerization Practice of Tencent Wemake Industrial Internet Platform: Creating a More Efficient Industrial Internet
  • With edge containers, a week’s work for a team of seven or eight can be done in seconds
  • Industrial Internet Platform Construction Based on Edge Container Technology
  • Deploy Edgex Foundry using TKE Edge
