Applying container technology at scale is never a standalone project. It is a complex whole that brings together virtualization, container orchestration, task scheduling, operating systems, container image registries, cross-node networking, distributed storage, dynamic scaling, load balancing, log monitoring, fault recovery, and other systemic concerns. Since the birth of Docker, and with the backing of Internet companies such as Google, many excellent open source projects have emerged in this field. They lower the cost of using container technology, but they also often leave developers and enterprise users who are new to containers at a loss.

Classifying knowledge is an effective way to organize scattered information. The container ecosystem touches many areas: some projects span several segments, while others are tailored to specific scenarios, which makes it hard to pin down their functional type precisely. Still, if you consider only the related products and tools in general-purpose domains, they can be roughly divided into 14 main categories.

The sections below list typical open source projects in each category, along with some peripheral products that are not open source but are commonly used: roughly 100 projects and products from the container ecosystem in all.

1. Container engine

The container engine is the core of the container cluster ecosystem. It is the tool or service that interacts directly with the kernel's Namespace and CGroup features and exposes an API for external integration. Docker is undoubtedly one of the most successful and widely used container engines to date. In fact, since version 1.12, Docker's containerization function has been implemented by a separate project, runC, but Docker remains the open source product that gives users a complete containerization solution. The community has many container engine projects, such as:

  - Docker: https://www.docker.com
  - Rkt: https://coreos.com/rkt
  - Systemd-nspawn: https://www.freedesktop.org/wiki/Software/systemd
  - Hyper: https://hyper.sh
  - Garden: https://github.com/cloudfoundry/garden
  - LXC: https://linuxcontainers.org
  - Photon: https://github.com/vmware/photon
  - Vagga: https://github.com/tailhook/vagga
  - gVisor: https://github.com/google/gvisor
  - Pouch: https://github.com/alibaba/pouch

These projects are just the tip of the iceberg: there are many container engines, supporting different platforms and offering different characteristics. For example, the Google-led LMCTfy (http://lmctfy.io/) project was also an excellent container engine, but it has not been maintained since 2015. gVisor, which Google recently open-sourced, is a newcomer in the field. It is also worth noting that Hyper isolates its environments with virtual machines. It is not a container-based isolation solution, but it integrates well with container cluster technologies such as Docker or Kubernetes, where it can replace their isolation layer.
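To make the kernel interaction a little more concrete: on Linux, the cgroups a process belongs to are listed in `/proc/<pid>/cgroup`, one `hierarchy-ID:controller-list:cgroup-path` record per line. The Python sketch below parses such records. The names `CgroupEntry`, `parse_cgroup_line`, and `read_cgroups` are made up for illustration and are not taken from the code of any engine listed above.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse one 'hierarchy-ID:controller-list:cgroup-path' record."""
    # Split on the first two colons only, so a ':' inside the path survives.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # cgroup v2 records have an empty controller list, e.g. "0::/some/path".
    names = controllers.split(",") if controllers else []
    return CgroupEntry(int(hierarchy_id), names, path)


def read_cgroups(pid: str = "self") -> List[CgroupEntry]:
    """Read the cgroup membership of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return [parse_cgroup_line(line) for line in f if line.strip()]
```

Inside a container, the paths returned by `read_cgroups()` usually reveal which engine created the cgroup, although the exact layout varies by engine and by cgroup version.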

2. Monitoring and data collection

Because containers rely on special kernel-based isolation, monitoring their performance and state differs from monitoring virtual machines. Traditional virtual machine monitoring tools, such as Nagios and Zabbix, lack easy-to-use native support for containers. Newer open source projects handle this scenario more gracefully, such as:

  - cAdvisor: https://github.com/google/cadvisor
  - Sysdig: http://sysdig.org
  - Prometheus: https://prometheus.io
  - TICK-Stack: https://influxdata.com
  - Docker-Alertd: https://github.com/deltaskelta/docker-alertd
  - Grafana: https://grafana.com

TICK-Stack refers to four open source tools from InfluxData: Telegraf, InfluxDB, Chronograf, and Kapacitor. Since version 1.0, these tools have also been offered in enterprise editions alongside the open source ones; the enterprise editions add features such as high availability and cloud storage.
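Container monitors often work from cumulative counters that the kernel keeps per cgroup, sampling them periodically and turning the differences into rates. As a rough illustration (not the actual code of any tool listed above), the Python sketch below derives a CPU utilization percentage from two samples of a cumulative CPU-time counter in nanoseconds. The `CpuSample` type and the `cpu_percent` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSample:
    timestamp_ns: int  # wall-clock time of the sample, in nanoseconds
    usage_ns: int      # cumulative CPU time consumed so far, in nanoseconds


def cpu_percent(prev: CpuSample, curr: CpuSample) -> float:
    """CPU usage between two samples, as a percentage of one core."""
    elapsed = curr.timestamp_ns - prev.timestamp_ns
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    used = curr.usage_ns - prev.usage_ns
    return 100.0 * used / elapsed
```

A result above 100 simply means the container used more than one core during the interval.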

3. Container management and interface tools

Visualization is an important part of user-friendliness. Shipyard and Decking were popular Docker visualization tools in the early days, and Docker also acquired Kitematic as its official container management UI. As container applications moved to clusters, however, the early UI tools fell out of favor, and new management UIs tailored to specific cluster platforms are emerging. For example, Kubernetes has officially launched Dashboard for visual cluster management, and Red Hat offers its own tooling for Kubernetes cluster management. Here are some open source container management UI projects:

  - Kitematic: https://kitematic.com
  - DockerUI: https://github.com/crosbymichael/dockerui
  - Panamax: http://panamax.io
  - Rapid Dashboard: https://github.com/ozlerhakan/rapid
  - Cockpit: http://cockpit-project.org
  - Portainer: https://www.portainer.io
  - Shipyard: http://shipyard-project.com
  - Seagull: https://github.com/tobegit3hub/seagull
  - Dockeron: https://github.com/dockeron/dockeron
  - DockStation: https://dockstation.io

4. Infrastructure integration

Container clusters are built on top of hardware infrastructure, and there are auxiliary tools that simplify this process. These projects are often tied to a specific underlying platform, such as:

  - Nova-docker: https://github.com/stackforge/nova-docker
  - Magnum: https://github.com/openstack/magnum
  - Machine: https://docs.docker.com/machine
  - Boot2Docker: https://github.com/boot2docker/boot2docker
  - Clocker: https://github.com/brooklyncentral/clocker
  - MaestroNG: https://github.com/signalfuse/maestro-ng

Nova-docker and Magnum both integrate container clusters into OpenStack. However, OpenStack is currently trying to unify the IaaS and CaaS layers by allowing Kubernetes to create virtual machines directly, and Nova-docker has been abandoned. Machine is an infrastructure management tool from Docker, Inc. Boot2Docker used to be the official way to run Docker on Windows and Mac, but it is no longer recommended now that Docker, as of version 1.12, ships releases for a variety of operating systems.

5. Orchestration and scheduling

Orchestration and scheduling are fundamental to container clustering, so choosing an orchestration and scheduling tool really means choosing a container clustering solution. Here are some open source container scheduling tools:

  - SwarmKit: https://github.com/docker/swarmkit
  - Kubernetes: http://kubernetes.io
  - Marathon: https://github.com/mesosphere/marathon
  - Rancher: http://www.rancher.io
  - Nomad: https://github.com/hashicorp/nomad
  - OpenShift: https://www.openshift.com
  - Crane: https://github.com/michaelsauter/crane
  - Nebula: https://github.com/nebula-orchestrator
  - GearD: http://openshift.github.io/geard

Here, OpenShift refers mainly to releases from 3.0 onward: Red Hat's container cluster solution for continuous integration and delivery, developed on top of Kubernetes and available in both open source and commercial editions.

6. Container image registry

An image registry is a necessary part of any container-based software release process. Docker has open-sourced a minimal implementation of its image registry, but for enterprise use it lacks necessary features such as high availability, permission control, and a management interface. Docker Hub and many container cloud platforms in China offer enterprise-grade registry services in the public cloud. The community also has several open source or free registry implementations, such as:

  - Registry: https://github.com/docker/distribution
  - Nexus: http://www.sonatype.org/nexus
  - Harbor: http://vmware.github.io/harbor
  - Portus: https://github.com/SUSE/Portus
  - Docker Registry UI: https://github.com/atcol/docker-registry-ui
  - Dragonfly: https://github.com/alibaba/Dragonfly

Nexus is a general-purpose package repository that supports distributing and managing many mainstream packaging formats, including Maven, NPM, PIP, and RPM; it has supported Docker image repositories since version 3.0. Harbor, launched by VMware, is a fairly common open source Docker registry for enterprises. Portus and Docker Registry UI are management interfaces built on the official Registry. Dragonfly is a P2P image distribution tool; it does not store images itself, but serves as an auxiliary tool for registries.

7. Service discovery and container DNS

Service discovery and container domain name service (DNS) are really components of microservice architectures and container cluster scheduling tools. They are very common in container clusters and an important part of the ecosystem. Here are some tools frequently mentioned in engineering practice:

  - Etcd: https://github.com/coreos/etcd
  - Consul: http://www.consul.io
  - ZooKeeper: https://zookeeper.apache.org
  - Eureka: https://github.com/Netflix/eureka
  - Traefik: https://traefik.io
  - Muguet: https://github.com/mattallty/muguet
  - Registrator: https://github.com/gliderlabs/registrator
  - SkyDNS: https://github.com/skynetservices/skydns

8. Container log collection

As with monitoring, collecting the logs of services running in containers differs from doing so in virtual machines. The log collection tools that Docker currently supports directly through plug-ins include Rsyslog, Splunk, and Fluentd. Although FileBeat is not among them, many users favor it for its small footprint and convenient deployment. Some log collectors designed for virtual machines, such as LogStash or Flume, can also serve containers, but they are no longer the preferred choice.

  - Splunk: https://www.splunk.com
  - Fluentd: https://www.fluentd.org
  - ElasticStack: https://www.elastic.co
  - Flume: https://flume.apache.org
  - Rsyslog: https://www.rsyslog.com/

ElasticStack is the collective name for the open source projects Beats, Logstash, ElasticSearch, and Kibana, a very popular toolset for aggregating, processing, storing, and displaying logs. ElasticSearch and Kibana can also be paired with Fluentd to form an end-to-end log processing solution. Note that Splunk is neither open source nor free, but it is widely used in enterprise log processing.
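For context on what such collectors consume: Docker's default `json-file` logging driver stores each line of container output as one JSON object with `log`, `stream`, and `time` fields. The Python sketch below parses records of that shape. The sample record, the `LogRecord` type, and the `parse_json_file_line` function are illustrative assumptions, and a real collector would also have to deal with log rotation and partial lines.

```python
import json
from typing import NamedTuple


class LogRecord(NamedTuple):
    time: str
    stream: str
    message: str


def parse_json_file_line(line: str) -> LogRecord:
    """Parse one record in the shape written by the json-file log driver."""
    entry = json.loads(line)
    # The driver keeps the trailing newline of the original output line.
    return LogRecord(entry["time"], entry["stream"], entry["log"].rstrip("\n"))


# Hypothetical sample record, for illustration only.
sample = (
    '{"log":"GET /healthz 200\\n","stream":"stdout",'
    '"time":"2018-06-01T08:00:00.000000000Z"}'
)
record = parse_json_file_line(sample)
print(record.time, record.stream, record.message)
```

Keeping the `stream` field makes it possible to route standard output and standard error to different destinations later in the pipeline.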

9. Container-related system distributions

Some Linux distributions are optimized for running containers; Atomic and ClearLinux fall into this category. Others, such as CoreOS, were designed from the start with container mechanisms fully integrated into the system architecture. Some systems even use Docker as a core service to manage other user processes, such as RancherOS and the operating system used by the Hyper container engine. There are many similar projects, all of them handy infrastructure for building container clusters, such as:

  - Container Linux: http://coreos.com
  - Project Atomic: http://www.projectatomic.io
  - RancherOS: http://rancher.com/rancher-os
  - ClearLinux: https://clearlinux.org
  - Photon OS: https://vmware.github.io/photon
  - CargoOS: https://cargos.io
  - SmartOS: https://www.joyent.com/smartos

10. Container platform

Container platforms are a product of large-scale container adoption. They are usually combined with continuous integration and continuous delivery tools to connect application services above with the infrastructure below, helping users quickly build an end-to-end delivery pipeline from code commit to product launch. Here are some relevant open source projects:

  - Deis: https://deis.com
  - Flynn: http://flynn.io
  - Dokku: https://github.com/progrium/dokku
  - Fabric8: http://fabric8.io
  - Kel: http://www.kelproject.com
  - Nanobox: https://nanobox.io
  - Tsuru: https://tsuru.io

Beyond these open source container platform implementations, many online, pay-as-you-go Container as a Service platforms are also part of the overall container cluster ecosystem.

11. Container network

While solving environment isolation and resource quotas, container technology introduces complexity at the network level. Thanks to Network Namespaces, each container can have its own IP address. That is not much of a problem on a single host, but in a container cluster, IP address allocation and interconnection become a new challenge. When designing a container cluster, you therefore usually need to consider how the network will be connected. Common open source solutions are:

  - Libnetwork: https://github.com/docker/libnetwork
  - Flannel: https://github.com/coreos/flannel
  - Calico: http://www.projectcalico.org
  - Weave: https://github.com/zettio/weave
  - Romana: http://romana.io
  - Canal: https://github.com/projectcalico/canal
  - Open vSwitch: http://openvswitch.org
  - Pipework: https://github.com/jpetazzo/pipework

Most of these network schemes use an overlay network mode, which wraps the packets exchanged between containers in extra headers used for routing and addressing. This reduces network efficiency, and the size of the impact depends on how much extra data is encapsulated. Calico instead implements routing and access control between containers by modifying iptables and routing table rules on each host node, which makes it a layer 3 network. This scheme has a clear efficiency advantage when the cluster is not too large (up to a few hundred nodes), and it is a recommended container network tool. Beyond these common solutions, some enterprises also use layer 2 approaches such as MacVLAN to interconnect containers for better network performance.
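The cost of encapsulation can be estimated with simple arithmetic. The Python sketch below assumes VXLAN over IPv4, one common overlay encapsulation (the text above does not name a specific one). In VXLAN, each inner Ethernet frame travels inside an outer IPv4, UDP, and VXLAN header, so 50 bytes of every underlay IP packet are not available to the inner IP packet. The function names and the 1500-byte underlay MTU are illustrative assumptions.

```python
# Header sizes in bytes for VXLAN over IPv4 (no VLAN tags, no IP options).
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14

# Bytes of each underlay IP packet that do not carry the inner IP packet.
VXLAN_OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET


def overlay_mtu(underlay_mtu: int) -> int:
    """Largest inner IP packet that fits in one underlay IP packet."""
    return underlay_mtu - VXLAN_OVERHEAD


def payload_efficiency(inner_packet_size: int) -> float:
    """Fraction of the underlay IP packet occupied by the inner IP packet."""
    return inner_packet_size / (inner_packet_size + VXLAN_OVERHEAD)


print(overlay_mtu(1500))                   # 1450
print(round(payload_efficiency(1450), 3))  # 0.967
print(round(payload_efficiency(100), 3))   # 0.667
```

Under these assumptions, full-size packets lose only about 3% to encapsulation, while small packets lose a third, which is why the impact depends on the traffic pattern as well as on the header size.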

12. Container security

The root cause of container security problems is that containers share the kernel with the host, so the attack surface is particularly large. Also, unlike with virtual machines, if an application in a container crashes the Linux kernel, the entire host goes down with it. Image security is part of container security as well: how to ensure that the images users download are trusted and have not been tampered with, and how to ensure that images do not inadvertently include outdated, bug-ridden software. These topics currently attract particular attention in enterprise applications. Here are some related open source tools and projects:

  - Notary: https://github.com/docker/notary
  - Clair: https://github.com/coreos/clair
  - AppArmor: http://wiki.apparmor.net/index.php/Main_Page
  - SELinux: https://selinuxproject.org
  - Twistlock: https://www.twistlock.com
  - OpenSCAP: https://github.com/OpenSCAP/container-compliance

13. Container data persistence

Containers are immutable infrastructure, so their data should be stored on external media through volumes. In essence, container persistent storage is about making it easy to mount external storage into a container. Docker has provided storage plug-ins from version 1.9 onward, which has made many storage schemes possible. Here are a few examples:

  - Flocker: https://github.com/clusterhq/flocker
  - Convoy: https://github.com/rancher/convoy
  - REX-Ray: https://github.com/codedellemc/rexray
  - Netshare: https://github.com/ContainX/docker-volume-netshare
  - OpenStorage: https://github.com/libopenstorage/openstorage

Ceph is also worth mentioning: it is a general-purpose network storage tool that provides block and object storage and supports application data persistence well in containerized scenarios.

14. Container-based software development and delivery

A container image can be regarded as a new form of application packaging, so containers are often combined with software development and with continuous integration and delivery processes to provide consistent deployments across environments. Here are some tools and platforms that use containers to improve software development and delivery:

  - Drone.io: https://drone.io
  - Shippable: http://shippable.com
  - Cyclone: https://github.com/caicloud/cyclone
  - Screwdriver: http://screwdriver.cd
  - WatchTower: https://github.com/v2tec/watchtower
  - Wercker: http://wercker.com
  - Totem: http://totem.github.io

This article is an excerpt from the newly published book Container as a Service: Building Enterprise-class Container Clusters from Scratch, the most complete book to date on container clustering technology and its surrounding ecosystem. The author, a front-line technical expert at Alibaba and a hands-on technical consultant, has been using containers in projects since lxc-docker (the earliest 0.x versions of Docker) and has witnessed the whole journey of container technology from its rise to maturity. Through repeated rounds of technology selection and practice he has gained hands-on experience with all kinds of related open source projects, and he has spoken at technical conferences many times (such as the CNUTCon Global Container Technology Conference in 2015, the CSDN Architecture Technology Practice Summit in 2016, and the CNUTCon Global Operation and Maintenance Technology Conference in 2017). The book is divided into eight chapters, which cover the four mainstream open source container cluster solutions and also explain the technology selection and underlying principles of many container-related projects. The chapters are as follows:

Chapter 1 Overview of Container Clusters

Chapter 2 SwarmKit Cluster Solution

Chapter 3 Kubernetes Cluster Solution

Chapter 4 Mesos Cluster Solution

Chapter 5 Rancher Cluster Solution

Chapter 6 Network and Storage for Container Clusters

Chapter 7 Infrastructure for Container Services

Chapter 8 New Trends in Container Technology

Picking up a technology is quick, but truly understanding it takes time. Knowing not only how something works but also why is a hard yet rewarding process. Hopefully Container as a Service: Building Enterprise-class Container Clusters from Scratch will lay a solid foundation for your container learning journey.

[Reader giveaway]

If you have experience with, or questions about, using containers in the enterprise, share them in the comments section. Lin Fan, the author of the book, will join the discussion there, and the two replies with the most likes will each receive a collector's edition of Container as a Service: Building Enterprise-class Container Clusters from Scratch signed by the author. To take part, follow the cloud efficiency WeChat official account (Ali_yunxiao).