Abstract: Following the resumption of Dubbo maintenance, Alibaba has made another open source move: at the China Open Source Annual Conference, it officially open-sourced Pouch, its self-developed container technology.
At the China Open Source Annual Conference, Alibaba officially open-sourced its Pouch container technology under the Apache 2.0 license. Pouch is a lightweight container technology that is efficient, highly portable, and light on resources. Inside Alibaba, it mainly helps deliver business applications faster and raises the utilization of physical resources in hyperscale data centers. Now that it is open source, Pouch is available to everyone on GitHub.


https://github.com/alibaba/pouch









The open-sourcing of Pouch signals how seriously Alibaba takes container technology. Containers are by now an accepted choice in most enterprises around the world. How to select a container technology, and how to keep it under control, are questions every enterprise has to answer. Pouch adds a new option to the container ecosystem and gives Chinese technology a place in an open source container landscape dominated by global giants.



Pouch Technology Status



Now that Pouch is open, many people in the industry will be curious about where Alibaba’s container technology stands. Is Alibaba a veteran or a rising star? In technology especially, the past is a good guide to the future: the experience a company has accumulated says a lot about its technical strength.


Evolution of Pouch


Pouch’s history goes back to 2011. At that time, Linux kernel features such as namespaces and cgroups were beginning to mature, and tools such as LXC had also appeared. Alibaba built a container technology called T4 on top of LXC and offered it to the group as a product. This was Alibaba’s first exploration of container technology, and it gave the company its initial experience with containers. Docker appeared two years later, and its image technology largely solved the “software packaging” problem that had plagued the industry for years. As image technology became popular, Alibaba had every reason to adopt something that brought so much value to the industry. So in 2015, T4 began to absorb the community’s Docker image technology on top of its own container technology and gradually evolved into Pouch.


Image-based container technology swept through the industry like a hurricane and won praise at home and abroad, and Alibaba was no exception. Since the end of 2015, Alibaba Group’s infrastructure has been quietly changing. There are many reasons, and the simplest is easy to understand: an Internet company of Alibaba’s size has to be backed by huge data centers. Explosive business growth inevitably increases infrastructure demand, which in turn drives infrastructure cost up sharply. The light weight and low resource consumption of containers, together with fast image distribution, quickly led Alibaba to invest more in container technology to help upgrade its data centers.


Alibaba container scale


After more than two years of investment, Pouch plays an extremely important role in the group’s underlying technology. In the “super project” behind the 168.2 billion yuan of transactions on November 11, 2017, Pouch achieved the following:


  • 100% of online business containerized
  • Container count in the millions

Within Alibaba Group, Pouch’s day-to-day service covers most business units, including e-commerce, advertising, and search. The technology stacks it covers include e-commerce applications, databases, big data, and stream computing, and the programming languages include Java, C++, and NodeJS.


Pouch Technology Advantages



Such wide use of container technology inside Alibaba is good news for the industry, because it shows that the technology has been proven in large-scale production. However, because Pouch originated inside Alibaba rather than in the community, the two differ in how containers behave, how they are implemented, and so on. In other words, Pouch has a number of distinctive technical advantages.


Strong isolation


Isolation is an unavoidable technical problem when an enterprise moves to the cloud. Strong isolation means a technology meets the basic condition for commercial use; without it, the technology can hardly be rolled out across business lines. Even a technology company like Alibaba ran into security problems when it started practicing container technology. As is well known, most container solutions in the industry rely on the cgroups and namespaces provided by the Linux kernel for isolation. Such lightweight solutions have drawbacks:


  • Containers share the same kernel with one another and with the host;
  • The kernel isolates resources along too few dimensions.


Given these kernel limitations, Alibaba has taken three measures to address container security:


  • Strengthen isolation in user space along additional dimensions, such as network bandwidth and disk usage;
  • Submit patches to the kernel to fix container resource visibility problems and cgroup bugs;
  • Implement hypervisor-based containers, which achieve isolation by giving each container its own kernel.


Container security will remain a research topic in the industry for a long time. In the open source Pouch, Alibaba will continue to integrate features such as LXCFS on top of the existing security work and share them with the community. Alibaba also plans to open source the “Ali kernel”, giving back to the industry the Linux kernel enhancements it has made over the years.
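
The resource visibility problem mentioned above can be illustrated with a small sketch. Inside a cgroup-limited container, /proc/meminfo still reports the host’s memory unless something like LXCFS overlays it, so a process that trusts that file overestimates what it may use. The sketch assumes the cgroup v1 memory controller, whose limit file is memory.limit_in_bytes. The function names and sample values are hypothetical and are not taken from Pouch or LXCFS.

```python
def parse_meminfo_total(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:       16777216 kB"
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def effective_memory_limit(meminfo_text, cgroup_limit_text):
    """Return the memory a container can really use, in bytes.

    The cgroup limit wins when it is smaller than the host total. An
    unlimited cgroup reports a very large number, so the host total
    wins in that case.
    """
    host_total = parse_meminfo_total(meminfo_text)
    cgroup_limit = int(cgroup_limit_text.strip())
    return min(host_total, cgroup_limit)


# Sample values (assumed): a 16 GiB host and a 2 GiB container limit.
HOST_MEMINFO = "MemTotal:       16777216 kB\nMemFree:         1024000 kB\n"
CGROUP_LIMIT = "2147483648\n"

if __name__ == "__main__":
    print("seen in /proc/meminfo:", parse_meminfo_total(HOST_MEMINFO))
    print("actually usable:      ", effective_memory_limit(HOST_MEMINFO, CGROUP_LIMIT))
```

On a real host the two inputs would be read from /proc/meminfo and from the container’s memory.limit_in_bytes file; they are passed in as strings here so the logic can be run anywhere.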


P2P Image Distribution


With the explosive growth of Alibaba’s business and the rapid adoption of container technology after 2015, distributing container images became an urgent problem. Although container images give enterprises many improvements over traditional methods, for example in reusing application files, distribution is still painfully slow in clusters of tens of thousands of machines. A simple example: if a data center has 10,000 physical nodes and every node starts downloading an image from the image registry at the same time, the network and CPU pressure on the registry host is easy to imagine.


This scenario gave rise to Dragonfly, Alibaba’s image distribution tool. Dragonfly is a general-purpose file distribution system based on intelligent P2P technology. It addresses the long distribution times, low success rates, and wasted bandwidth seen in large-scale file distribution, and it greatly improves release deployment, data preheating, and large-scale container image distribution. Like Pouch, Dragonfly is now open source, and the project address is:
https://github.com/alibaba/dragonfly
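
A back-of-the-envelope model makes the 10,000-node example concrete. The numbers are assumptions chosen for illustration (a 500 MB image, a 10 Gbit/s registry uplink, 1 Gbit/s per node), and the P2P side is an idealized swarm in which the number of image holders doubles every round. It is not Dragonfly’s actual algorithm, which works on file pieces and schedules peers more intelligently.

```python
import math


def centralized_seconds(nodes, image_bytes, registry_bps):
    """Time for every node to pull from one registry sharing its uplink."""
    return nodes * image_bytes * 8 / registry_bps


def ideal_p2p_seconds(nodes, image_bytes, node_bps):
    """Time when every holder of the image serves one new peer per round.

    Holders double each round (1 seed, then 2, 4, ...), so
    ceil(log2(nodes + 1)) rounds are needed to reach all nodes.
    """
    rounds = math.ceil(math.log2(nodes + 1))
    return rounds * image_bytes * 8 / node_bps


if __name__ == "__main__":
    nodes = 10_000
    image = 500_000_000           # 500 MB image (assumed)
    registry = 10_000_000_000     # 10 Gbit/s registry uplink (assumed)
    node = 1_000_000_000          # 1 Gbit/s per node (assumed)
    print("centralized:", centralized_seconds(nodes, image, registry), "s")
    print("ideal P2P:  ", ideal_p2p_seconds(nodes, image, node), "s")
```

Under these assumptions the single registry needs over an hour while the idealized swarm finishes in about a minute. The exact figures matter less than the shape: centralized time grows linearly with the node count, and P2P time grows logarithmically.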




The architecture of Pouch and Dragonfly is shown below:





Rich container technology


Alibaba Group has many kinds of business scenarios, and almost every one places its own requirements on Pouch. Pushing the “one container, one process” model advocated outside the company onto these businesses meets enormous resistance to containerization. Inside Alibaba, infrastructure technology plays a huge supporting role and has to support the running business at every moment. While a business is running, it is almost impossible for the infrastructure to make the business change and adapt to it. A container technology that is non-invasive to application development and to application operations and maintenance can therefore be rolled out quickly at scale. Otherwise, containerization gets no support from the business side, and a great deal of manpower has to be spent helping the business side carry out operations and maintenance in non-standard ways.


Alibaba is well aware of this. The internal Pouch technology is essentially non-invasive to the business, which is why 100% containerization was possible within the group. Inside Alibaba, this kind of container is widely called a “rich container”.


The “rich container” technology creates, on the Linux kernel, a container whose experience matches that of a virtual machine. A rich container is therefore more capable than an ordinary container: it has a complete init process inside, along with whatever system services a business application may need. This is why Pouch can be non-invasive to applications. In terms of implementation, Pouch defines the container’s entry point as systemd. In kernel space, Pouch introduces the cgroup namespace, a recent kernel patch, to meet systemd’s isolation needs in rich container mode. Rich containers also have clear advantages for enterprise operations and maintenance. They can do work before the application’s entrypoint starts, such as performing security-related tasks in a uniform way or starting operations agents. Tasks like these would be intrusive to the user’s application if they had to go into the user’s startup script or image; the rich container handles them uniformly and transparently.
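
The idea of doing uniform work before the application’s entrypoint can be sketched as a toy launcher. This is only an illustration of the ordering: it is not how systemd or Pouch implements it, and the function name and the hook commands are hypothetical stand-ins for a security step and an operations agent.

```python
import subprocess
import sys


def run_with_prestart(prestart_hooks, entrypoint):
    """Run each hook to completion, in order, then run the entrypoint.

    A failing hook aborts the start (check=True raises
    CalledProcessError), so the application never runs in a
    half-prepared container. Returns the entrypoint's exit code.
    """
    for hook in prestart_hooks:
        subprocess.run(hook, check=True)
    return subprocess.run(entrypoint).returncode


if __name__ == "__main__":
    # Hypothetical platform hooks, defined once and kept out of the image.
    hooks = [
        [sys.executable, "-c", "print('security baseline applied')"],
        [sys.executable, "-c", "print('monitoring agent started')"],
    ]
    app = [sys.executable, "-c", "print('application running')"]
    print("exit code:", run_with_prestart(hooks, app))
```

The application command stays exactly as the business team wrote it; the hooks live on the platform side, which is the sense in which the approach is non-invasive.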


Kernel compatibility


The explosive development of container technology has let many enterprises at the forefront of technology enjoy its dividends. However, the “long tail” means that technology evolves over long cycles. Pouch ran into this problem as it scaled.


Once a data center reaches a certain scale, “Moore’s Law” ensures that it will contain legacy resources. How to use and handle these physical resources is a big problem. Alibaba Group is no different: both its machine models and its Linux kernels, which range from 2.6.32 to 3.10+, are heterogeneous. For every application to run in Pouch, Pouch must support all of these kernel versions, yet existing container technologies generally require Linux kernel 3.10 or later. Fortunately, on an older kernel such as 2.6.32, the only namespace missing is the user namespace; the other namespaces and the common cgroup subsystems exist. However, auxiliary files that record namespaces, such as /proc/self/ns, did not exist yet, and system calls such as setns are available only in newer kernels. Alibaba’s technical strategy is to support containers on older kernels by working around these system calls with other methods.
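
A quick way to see what a given kernel exposes is to list the namespace entries under /proc/self/ns. The sketch below is a hypothetical helper, not part of Pouch. It checks for six long-standing namespace entries and takes the directory as a parameter so it can also be pointed at a test directory.

```python
import os

# Six long-standing namespace entries found under /proc/<pid>/ns.
EXPECTED_NAMESPACES = ("ipc", "mnt", "net", "pid", "user", "uts")


def missing_namespaces(ns_dir="/proc/self/ns"):
    """Return the expected namespace entries that ns_dir does not expose.

    If ns_dir itself is absent, as on kernels that predate it, every
    entry is reported missing. That only describes the auxiliary files;
    the kernel may still support a namespace through clone() flags.
    """
    try:
        present = set(os.listdir(ns_dir))
    except FileNotFoundError:
        present = set()
    return [name for name in EXPECTED_NAMESPACES if name not in present]


if __name__ == "__main__":
    print("missing namespace entries:", missing_namespaces())
```

An empty list means all six entries are present. A runtime targeting old kernels cannot rely on this directory at all, which is why a workaround for setns-style operations is needed there.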


From another perspective, rich container technology also goes a long way toward accommodating the existing operations systems, monitoring systems, and user habits on older kernels, which keeps Pouch highly usable across kernel versions.


Overall, given these technical advantages, it is not hard to find suitable application scenarios for Pouch: rapid containerization of traditional IT architectures, large-scale enterprise business deployment, and financial scenarios that require strong security isolation and stability.


Pouch architecture



Thanks to its differentiated technical advantages, Pouch has been well validated in Alibaba’s large-scale application scenarios. It must be said, however, that there are still some differences between the internal Pouch and the current open source version.


Despite its obvious advantages, the internal Pouch would be nearly impossible to open source directly. After years of development, the internal Pouch not only serves the business but is also coupled to Alibaba’s internal infrastructure and business scenarios. The coupled parts are of little general use to the industry and also raise some other issues. The first priority in open-sourcing Pouch is therefore to decouple the internal dependencies and open source the core parts that are also of great value to the community. At the same time, Alibaba hopes to build Pouch’s open source community by standing with the community from the very start. Later, the open source version will gradually replace the internal Pouch within Alibaba Group, ultimately making the internal and external Pouch consistent. In this process, decoupling the internal Pouch and evolving toward a plug-in design are equally important. A key milestone in the open source plan is the end of March, when Pouch 1.0 is due to be released.


From the moment the open source plan began, Pouch’s place in the ecosystem was designed as shown in the following architecture diagram:







Pouch’s ecosystem architecture can be viewed from two directions: first, how it connects to container orchestration systems; second, how it strengthens the container runtime.


Support for container orchestration systems is an important part of Pouch’s open source plan. Pouch was therefore designed to support orchestration systems such as Kubernetes natively. To achieve this, Pouch was the first in the industry to support containerd 1.0.0. Although containerd 0.2.3 is still in common use, it lacks the security and other features of the newer release, and Pouch is among the first container solutions to adopt the new version. At present, Docker is still a popular container engine in Kubernetes systems. At the runtime level, Kubernetes’ strategic plan is to use cri-containerd to reduce its coupling to commercial products and to favor community-compatible solutions, namely cri-containerd together with the community edition of containerd. In addition, the internal Pouch is an important part of Alibaba’s scheduling system Sigma and supports the “co-location” project. Pouch’s open source roadmap also aims to support co-location. Sigma’s scheduling and co-location capabilities are expected to serve the industry in the future.


In terms of the ecosystem, Pouch is based on openness; in terms of the container runtime, Pouch stands for “richness” and “security”. Support for runC comes naturally. Support for runV, however, sets Pouch apart from the rest of the ecosystem. Although Docker supports runV by default, the Docker API does not cover both “containers” and “virtual machines”, so Docker is not a unified management entry point. As far as we know, existing enterprises still run many virtual machines. In the container era, how to manage containers and virtual machines together through a unified operations and maintenance entry point is therefore bound to be one of the solutions that enterprises care about most during the transition from virtual machines to containers. Pouch’s open source form covers this scenario well. runlxc is Alibaba’s own LXC container runtime. Pouch’s support for runlxc also means that runlxc will be open sourced in the near future, covering scenarios where an enterprise still runs a large number of older Linux kernels.


How Pouch connects to the ecosystem is described above; its internal architecture is shown in the figure below:







Like traditional container engines, Pouch has a client/server (C/S) architecture. At the CLI level, both the Pouch CLI and the Docker CLI are supported. To reach the container runtime, Pouch internally calls containerd over gRPC through a container client. Inside, the Pouch daemon follows a componentized design: it is split into a System Manager, Container Manager, Image Manager, Network Manager, and Volume Manager, which together provide a unified way to manage objects.
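
The componentized daemon can be pictured with a minimal sketch in which the daemon only routes requests and each manager owns one kind of object. The class and method names are hypothetical and chosen for illustration. Pouch itself is written in Go, and its real interfaces are not reproduced here.

```python
class Manager:
    """Base class: each manager owns exactly one kind of object."""

    kind = ""

    def __init__(self):
        self.objects = {}

    def create(self, name, **spec):
        """Store an object under its name and return the name."""
        self.objects[name] = spec
        return name

    def names(self):
        """Return the names of all managed objects, sorted."""
        return sorted(self.objects)


class ContainerManager(Manager):
    kind = "container"


class ImageManager(Manager):
    kind = "image"


class VolumeManager(Manager):
    kind = "volume"


class Daemon:
    """Routes each client request to the manager for that object kind."""

    def __init__(self, managers):
        self.managers = {m.kind: m for m in managers}

    def handle(self, kind, action, *args, **kwargs):
        manager = self.managers[kind]
        return getattr(manager, action)(*args, **kwargs)


if __name__ == "__main__":
    daemon = Daemon([ContainerManager(), ImageManager(), VolumeManager()])
    daemon.handle("image", "create", "busybox", tag="latest")
    daemon.handle("container", "create", "web", image="busybox")
    print(daemon.handle("container", "names"))
```

Because every manager exposes the same small interface, a new object kind can be added by registering another manager without touching the routing code.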


Closing remarks



Now that Pouch is open source, the container technology Alibaba has accumulated will reach beyond Alibaba and be available to the industry. Pouch’s technical advantages mean it offers users a differentiated container solution to choose from. As enterprises move to the cloud and embrace cloud native, Pouch aims to be a robust piece of software that gives their digital transformation the most stable support.
Pouch is now open source on GitHub and welcomes all forms of open source participation. The GitHub address is:
https://github.com/alibaba/pouch





About the author


Sun Hongliang is a technical expert at Alibaba and a graduate of Zhejiang University. He currently leads the open source work on the Pouch project at Alibaba. He has worked in cloud computing for several years, was one of the first engineers in China to study and practice container technology, and has played an extremely important role in promoting container technology in China. He is the author of “Docker Source Code Analysis”, a personal advocate of the open source spirit, and a global maintainer of the Docker Swarm project.