ZStack logically abstracts storage systems into primary storage and backup storage. Primary storage is a storage pool for VM disks. Backup storage is a repository where users keep image templates, backed-up disks, and snapshots. Primary storage and backup storage can be physically separate systems, or a single storage system can play both roles at once. Storage vendors can add their products to ZStack by implementing storage plug-ins.

Overview

Storage systems in the cloud can be divided into two categories based on their logical function. The first works as a storage pool that holds VM disks and is accessed by running VMs. This type of storage can be file system-based, with disks stored as files, or block-based, with disks exposed as block devices. In ZStack terminology, this type of storage is called primary storage. It can be network shared storage, such as NFS or iSCSI, or local storage, such as a physical host's hard disk.

The second type of storage system serves as a repository that holds operating system image templates, backed-up disks, and snapshots. This type of storage can be file system-based, with entities stored as files, or object store-based, with entities stored as objects. In ZStack terminology, this type of storage is called backup storage. It is not directly accessible to VMs and can only be network shared storage.

Both types of storage are logical concepts. In practice they can be separate storage systems using different protocols, for example iSCSI primary storage and NFS backup storage, or a single storage system can play both roles at once. Ceph, for example, has a block storage part that can serve as primary storage and an object storage part that can serve as backup storage. Storage vendors can expose their storage systems to ZStack as primary storage, backup storage, or both by implementing storage plug-ins.

Internal implementation

Primary and backup storage do not work in isolation; they have to cooperate to carry out storage-related activities. The most important of these is creating a new VM. The first time a VM is created on a primary storage, its image template is downloaded from the backup storage into the image cache of that primary storage. Because most hypervisors use a technique called linked clone, the downloaded image template then serves as the base disk for every VM that uses the same image template and has its root disk on the same primary storage.
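The download-once behaviour of the image cache can be sketched in a few lines of Java. This is a simplified model under stated assumptions, not ZStack code: the class ImageCacheSketch, its methods, and the /imagecache and /rootdisks paths are all hypothetical, and the "download" is only a counter standing in for a real transfer from backup storage.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a primary storage image cache; not ZStack code.
class ImageCacheSketch {
    // Image UUID -> path of the cached base image on this primary storage.
    private final Map<String, String> cache = new HashMap<>();
    private int downloads = 0;

    // Returns the cached base image path, "downloading" only on a cache miss.
    String ensureCached(String imageUuid) {
        return cache.computeIfAbsent(imageUuid, uuid -> {
            downloads++; // stands in for the transfer from backup storage
            return "/imagecache/" + uuid + ".qcow2";
        });
    }

    // Creates a root disk as a thin clone whose base disk is the cached image.
    String createRootDisk(String vmName, String imageUuid) {
        String base = ensureCached(imageUuid);
        return "/rootdisks/" + vmName + ".qcow2 -> " + base;
    }

    int downloadCount() {
        return downloads;
    }
}
```

In this sketch, calling createRootDisk for two VMs with the same image UUID leaves downloadCount() at 1, because the second VM reuses the cached base image.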

In addition to downloading images, primary storage also uploads entities such as disks and snapshots to backup storage. These uploads are backup related. For example, when a user backs up a data disk, a copy of the data disk is uploaded to backup storage as an image template, which can later be downloaded to primary storage to create a new data disk.

In the source code, primary storage and backup storage are implemented as separate plug-ins. Backup storage is the simpler of the two because it only deals with itself. Its main activities are downloading, uploading, and deleting. A backup storage needs to define protocols that specify how a primary storage downloads and uploads entities, but it does not need to know any details of the primary storage, because using those protocols is the primary storage's responsibility. In addition, a backup storage must implement the protocols that let the image service register and delete image templates. Like all other resources, backup storage has an abstract base class, BackupStorageBase, which implements most of the common business logic. Storage vendors only need to implement the operations that touch their backend storage systems directly, usually by invoking an SDK or an agent.
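The division of labour between a common base class and vendor-specific operations can be illustrated with a small template-method sketch. The names BackupStorageBaseSketch, InMemoryBackupStorage, registerImage, deleteImage, doDownload, and doDelete are assumptions made for this example; they do not reproduce the real BackupStorageBase API, whose methods and messages are not shown in this article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a simplified stand-in, not the real BackupStorageBase API.
abstract class BackupStorageBaseSketch {
    private final List<String> registered = new ArrayList<>();

    // Common logic: validate, delegate the transfer, record the image.
    final String registerImage(String imageUuid, String sourceUrl) {
        if (imageUuid == null || imageUuid.isEmpty()) {
            throw new IllegalArgumentException("imageUuid is required");
        }
        String installPath = doDownload(imageUuid, sourceUrl);
        registered.add(imageUuid);
        return installPath;
    }

    // Common logic: only delete images this backup storage knows about.
    final boolean deleteImage(String imageUuid) {
        if (!registered.remove(imageUuid)) {
            return false;
        }
        doDelete(imageUuid);
        return true;
    }

    List<String> listImages() {
        return new ArrayList<>(registered);
    }

    // Vendor-specific hooks, typically implemented by calling an SDK or agent.
    protected abstract String doDownload(String imageUuid, String sourceUrl);

    protected abstract void doDelete(String imageUuid);
}

// Hypothetical vendor plug-in that keeps "files" in a map instead of real storage.
class InMemoryBackupStorage extends BackupStorageBaseSketch {
    final Map<String, String> files = new HashMap<>();

    @Override
    protected String doDownload(String imageUuid, String sourceUrl) {
        String path = "/backup/" + imageUuid;
        files.put(path, sourceUrl);
        return path;
    }

    @Override
    protected void doDelete(String imageUuid) {
        files.remove("/backup/" + imageUuid);
    }
}
```

The point of the design is that a vendor class like the hypothetical InMemoryBackupStorage only fills in the two hooks, while validation and bookkeeping stay in the base class.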

Primary storage is more complex. The complexity comes from the fact that its business logic depends not only on backup storage but also on hypervisor details. First, a primary storage must understand the protocol of each backup storage in order to download and upload entities. For example, an NFS primary storage must know about SFTP backup storage, Amazon S3 backup storage, and Swift backup storage if it plans to support all of them. Second, the way the same backup storage is used varies with the hypervisor. For example, NFS primary storage can call the KVM agent to download an image template from Amazon S3 backup storage using s3Tool. However, because VMware is a closed ecosystem, the only way for NFS primary storage to do the same there is through VMware's SDK. As a result, the complexity of a primary storage is M*N, where M is the number of backup storage types and N is the number of hypervisor types it supports.

As described in ZStack – General Plug-in System, ZStack is a plug-in system in which each feature is built as a small plug-in. To manage this complexity, a primary storage defines two interfaces. The first is a hypervisor backend that handles hypervisor-only activities. NFS primary storage, for example, defines the interface NfsPrimaryStorageBackend, and each supported hypervisor gets a concrete class, such as NfsPrimaryStorageKVMBackend for KVM. The second, called PrimaryToBackupStorageMediator, is a hypervisor-to-backup-storage backend that handles activities involving both the hypervisor and the backup storage. NFS primary storage, for example, has the implementation NfsPrimaryToSftpBackupKVMBackup, which supports SFTP backup storage on KVM.
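One way to picture the two interfaces is a registry keyed by hypervisor type for the backends, and by the pair (backup storage type, hypervisor type) for the mediators. The sketch below is an assumption-laden illustration: HypervisorBackendSketch, MediatorSketch, PrimaryStorageSketch, and the two implementing classes are hypothetical names, and their methods are not the actual signatures of NfsPrimaryStorageBackend or PrimaryToBackupStorageMediator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypervisor-only operations (hypothetical; similar in role to a hypervisor backend).
interface HypervisorBackendSketch {
    String hypervisorType();
    String createEmptyVolume(String volumeUuid);
}

// Operations that involve a hypervisor and a backup storage type together (hypothetical).
interface MediatorSketch {
    String backupStorageType();
    String hypervisorType();
    String downloadImage(String imageUuid);
}

class PrimaryStorageSketch {
    private final Map<String, HypervisorBackendSketch> backends = new HashMap<>();
    private final Map<String, MediatorSketch> mediators = new HashMap<>();

    void register(HypervisorBackendSketch backend) {
        backends.put(backend.hypervisorType(), backend);
    }

    void register(MediatorSketch mediator) {
        mediators.put(key(mediator.backupStorageType(), mediator.hypervisorType()), mediator);
    }

    Optional<MediatorSketch> findMediator(String bsType, String hvType) {
        return Optional.ofNullable(mediators.get(key(bsType, hvType)));
    }

    // Only registered (backup storage, hypervisor) pairs are supported.
    String downloadImage(String bsType, String hvType, String imageUuid) {
        return findMediator(bsType, hvType)
                .orElseThrow(() -> new UnsupportedOperationException(
                        "no mediator for " + key(bsType, hvType)))
                .downloadImage(imageUuid);
    }

    private static String key(String bsType, String hvType) {
        return bsType + "/" + hvType;
    }
}

// Hypothetical KVM backend for hypervisor-only work.
class KvmBackendSketch implements HypervisorBackendSketch {
    public String hypervisorType() { return "KVM"; }
    public String createEmptyVolume(String volumeUuid) {
        return "/volumes/" + volumeUuid + ".qcow2";
    }
}

// Hypothetical mediator for the SFTP backup storage on KVM combination.
class SftpKvmMediatorSketch implements MediatorSketch {
    public String backupStorageType() { return "Sftp"; }
    public String hypervisorType() { return "KVM"; }
    public String downloadImage(String imageUuid) {
        return "/imagecache/" + imageUuid + ".qcow2"; // a real one would ask the KVM agent
    }
}
```

With this layout, a primary storage registers only the combinations it actually supports, so an unsupported pair such as SFTP on VMware fails fast instead of requiring all M*N implementations.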

This sounds daunting, because a primary storage has so much to implement. In practice, however, a primary storage does not need to support every backup storage on every hypervisor. For example, it makes no sense to support SFTP backup storage for VMware, because the VMware SDK offers no way to SCP a file into its repository (even if this could be done by bypassing the SDK, we do not consider that a reliable approach). Moreover, there are not many popular protocols for network shared storage, and most usage scenarios can be handled once NFS and iSCSI primary storage are in place.

Note: In the current version of ZStack (0.6), only NFS primary storage and SFTP backup storage are implemented.

Conclusion

In this article, we described the storage model of ZStack. By logically dividing storage into primary and backup storage, ZStack provides great flexibility, allowing storage vendors to plug in their storage systems in whichever role suits them. And as common storage protocols such as NFS, iSCSI, S3, and Swift are added as default plug-ins, users will not need to worry about whether they can find the right combination for their existing storage systems.