In ZStack's network model, OSI Layer 4 through 7 network services are implemented as small plug-ins by different network service provider modules. The default module is called the virtual router provider. It uses customized Linux VMs as virtual appliances to provide network services for each L3 network, including DHCP, DNS, NAT, EIP, and port forwarding. The advantages of using virtual machines as virtual routers include no single point of failure and no special requirements for physical devices; users can therefore implement various network services on commodity hardware without buying expensive devices.

Overview

As mentioned in "ZStack — Network Model 1: L2 and L3 Networks," ZStack designs network services as small plug-ins that vendors can selectively implement in their hardware or software by creating network service provider modules. By default, ZStack ships with a virtual router provider that implements all network services using a single appliance VM, the virtual router VM.
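To make the plug-in idea concrete, here is a minimal sketch of what such a provider interface could look like. This is an illustration only: every class, method, and helper name below is invented for the example and is not ZStack's real API.

```python
from abc import ABC, abstractmethod


class NetworkServiceProvider(ABC):
    """Hypothetical plug-in contract: a provider declares which network
    services it delivers and how to apply them to an L3 network."""

    @abstractmethod
    def supported_services(self):
        """Return the service type names this provider implements."""

    @abstractmethod
    def apply_service(self, service, l3_uuid, vm_nic_ip):
        """Configure one service for a user VM NIC on the given L3 network."""


class VirtualRouterProvider(NetworkServiceProvider):
    """Sketch of the default provider: every service is delegated to the
    Python agent inside the L3 network's virtual router VM (simulated here)."""

    def supported_services(self):
        return ["DHCP", "DNS", "SNAT", "EIP", "PortForwarding"]

    def apply_service(self, service, l3_uuid, vm_nic_ip):
        # In the real flow this would be an HTTP call to the in-VM agent.
        print(f"[virtual router of L3 {l3_uuid}] configure {service} for {vm_nic_ip}")


# A vendor could register a different provider for some of these service types.
provider = VirtualRouterProvider()
provider.apply_service("DHCP", "l3-uuid-1234", "192.168.1.10")
```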

Note: ZStack actually has another provider module, the security group provider, which provides distributed firewall functionality. We call the virtual router provider the default provider because it delivers the most common network services a cloud needs.

There are several ways to implement network services in IaaS software:

  • One way is to use central, powerful network nodes, usually physical servers. Network nodes aggregate traffic from different tenants, isolate it using techniques such as Linux network namespaces, and provide network services on top of that isolation.
  • Another approach is to use dedicated network hardware, such as programmable physical switches, physical firewalls, and physical load balancers; this requires users to purchase specific devices.
  • The final approach is to use Network Functions Virtualization (NFV) techniques, such as ZStack's virtual router VMs, to virtualize network services on commodity x86 servers.

Each method has advantages and disadvantages; we chose NFV as our solution for the following reasons:

  1. Minimal infrastructure requirements: The solution should require little or no extra physical infrastructure from users; that is, users should not have to change existing infrastructure or buy special gear to fit the network model of the IaaS software. We don't want to force users to buy specific hardware or to place special-purpose servers in front of a group of hosts.
  2. No single points of failure: The solution should be a distributed approach with no single points of failure. A network node crash should only affect the tenant that owns it, not any other tenant.
  3. Stateless: Network nodes should be stateless so that IaaS software can easily destroy and recreate them if unexpected errors occur.
  4. Easy high availability (HA): The solution should make it easy for tenants to request redundant network nodes.
  5. Hypervisor independent: The solution should not rely on a particular hypervisor and should work well with mainstream hypervisors, including KVM, Xen, VMware, and Hyper-V.
  6. Good performance: The solution should provide reasonable network performance for most usage scenarios.

An NFV solution based on virtual router VMs satisfies all of these considerations. We chose it as the default implementation, while leaving the door open for developers to adopt other solutions.

Virtual router VMs

Appliance VMs are special virtual machines that run a customized Linux operating system and dedicated agents to help manage the cloud. The virtual router VM is the first implementation of the appliance VM concept. The idea, in a nutshell, is to create a virtual router VM that provides all network services for an L3 network when the first user VM on that network is created, as long as services provided by the virtual router provider are enabled on that L3 network. Each virtual router VM contains a Python agent, which receives commands from the ZStack management node over HTTP and provides network services, including DHCP, DNS, NAT, EIP, and port forwarding, for user VMs on the same L3 network.
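To illustrate this control path, below is a stripped-down sketch of such an agent: it accepts a JSON command over HTTP and turns it into a dnsmasq DHCP host entry. The endpoint name, JSON fields, file path, and port are all assumptions made for the example, not the real agent's protocol.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative path only; the real agent manages dnsmasq's actual config files.
DNSMASQ_HOSTS = "/tmp/dnsmasq-dhcp-hosts.conf"


class AgentHandler(BaseHTTPRequestHandler):
    """Toy command handler: the management node POSTs one JSON command per request."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        cmd = json.loads(self.rfile.read(length))
        if self.path == "/adddhcp":  # hypothetical endpoint name
            # dnsmasq dhcp-hostsfile entry format: <mac>,<ip>,<hostname>
            with open(DNSMASQ_HOSTS, "a") as f:
                f.write("{mac},{ip},{hostname}\n".format(**cmd))
            # A real agent would now signal dnsmasq (SIGHUP) to reload the file.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'{"success": true}')


if __name__ == "__main__":
    # Port chosen arbitrarily for the sketch.
    HTTPServer(("0.0.0.0", 7272), AgentHandler).serve_forever()
```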

The figure above shows the network topology when all network services are enabled on a guest L3 network. A virtual router VM usually connects to three L3 networks:

  1. An L3 management network carries the HTTP communication between the ZStack management node and the Python agent inside the virtual router VM. It is mandatory; every virtual router VM has one.
  2. A public L3 network is an optional network that can connect to the Internet and provides the default route inside the virtual router VM. If omitted, the L3 management network serves as both the management network and the public network.

Note: Public networks do not need to be publicly accessible. A public network that connects user VMs to the outside world (other networks in the data center or the Internet) does not need to allow public access. For example, when bridging a guest L3 network (192.168.1.0/24) isolated by VLAN to other networks in the data center (10.x.x.x/x), the network 10.0.0.0/24 can serve as the public network even though it is not reachable from the Internet.

  3. A guest L3 network is the network that user VMs connect to. Traffic related to network services flows between user VMs and the virtual router VM on this network.

The number of L3 networks varies with the combination of network services. For example, if only DHCP and DNS are enabled, the network topology becomes:

Because there are no NAT-related services (such as SNAT and EIP), user VMs do not need a separate, isolated guest L3 network and can connect directly to the public network.

Note: Of course, you can still create an isolated guest L3 network with only DHCP and DNS services; its VMs will get IP addresses but cannot reach external networks because the SNAT service is missing.

If we omit the L3 public network in the figure above, the network topology will become:

Users can configure the L3 management network, the L3 public network, and the CPU and memory size of a virtual router VM through a virtual router offering. When creating a virtual router VM, ZStack tries to find an appropriate virtual router offering. The system tag guestL3Network::{l3NetworkUuid} can be used to bind a virtual router offering to a specific guest L3 network; if no such offering is found, a default offering is used, as sketched below.
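The lookup can be pictured with the following sketch. The tag format mirrors guestL3Network::{l3NetworkUuid} from the text above, while the data layout and helper function are hypothetical; a real implementation would query ZStack's database.

```python
def pick_offering(offerings, guest_l3_uuid, default_offering):
    """Return the offering tagged for this guest L3 network, else the default."""
    wanted_tag = "guestL3Network::" + guest_l3_uuid
    for offering in offerings:
        if wanted_tag in offering.get("systemTags", []):
            return offering
    return default_offering


# Offerings are plain dicts here purely for illustration.
offerings = [
    {"name": "big-vr", "cpu": 4, "memoryMB": 4096,
     "systemTags": ["guestL3Network::l3-uuid-1234"]},
    {"name": "default-vr", "cpu": 1, "memoryMB": 512, "systemTags": []},
]

print(pick_offering(offerings, "l3-uuid-1234", offerings[1])["name"])  # -> big-vr
print(pick_offering(offerings, "l3-uuid-9999", offerings[1])["name"])  # -> default-vr
```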

Note: For system tags, see The Tag System.

In this version of ZStack (0.6), a guest L3 network can have only one virtual router VM; in a multi-tier network environment, different virtual router VMs serve different tiers:

When a user VM starts, the ZStack management node sends commands to the Python agent inside the virtual router VM, which implements the network services through dnsmasq and iptables.
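A short excerpt of such rules could look like the sketch below: representative nat-table rules for SNAT, EIP, and port forwarding, with all addresses and interfaces invented for illustration (not the agent's literal output).

```
# SNAT: the guest subnet 192.168.1.0/24 leaves through the router's public IP
-A POSTROUTING -s 192.168.1.0/24 -o eth0 -j SNAT --to-source 10.0.0.100

# EIP: 1:1 NAT between public IP 10.0.0.101 and user VM 192.168.1.10
-A PREROUTING  -d 10.0.0.101 -j DNAT --to-destination 192.168.1.10
-A POSTROUTING -s 192.168.1.10 -j SNAT --to-source 10.0.0.101

# Port forwarding: public TCP port 2222 -> user VM's SSH port 22
-A PREROUTING -d 10.0.0.100 -p tcp --dport 2222 -j DNAT --to-destination 192.168.1.10:22
```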

Note: In future versions of ZStack, more network services, such as load balancing, VPN, and GRE tunneling, will also be implemented by virtual router VMs. In addition, virtual router VMs will be the core building block of virtual private clouds (VPCs).

Let's review the considerations we mentioned earlier and see how virtual router VMs meet them.

  1. Minimal infrastructure requirements: Virtual router VMs have no special requirements for physical equipment in the data center. They are just virtual machines, like user VMs, and can be created on any physical machine. With them, administrators do not have to plan complex hardware interconnections.
  2. No single point of failure: There is one virtual router VM per L3 network. If it crashes for some reason, only the user VMs on the corresponding L3 network are affected; other L3 networks keep working. In most usage scenarios an L3 network belongs to a single tenant, so only that tenant suffers from the failure of its virtual router VM. This is especially useful when an L3 network is compromised by malicious tools, for example a DDoS attack: an attacker cannot take down the whole cloud's network by attacking a single tenant.
  3. Stateless: Virtual router VMs are stateless; all of their configuration comes from the ZStack management node and can be rebuilt at any time. Users have a variety of options to rebuild a virtual router VM's configuration: stop and start it, delete and recreate it, or call the ReconnectVirtualRouter API.
  4. Easy high availability (HA): Two virtual router VMs can be deployed in active/standby mode using the Virtual Router Redundancy Protocol (VRRP) to implement HA. Once the primary virtual router fails, the backup automatically takes over, making network downtime negligible.

Note: Redundant virtual router VMs are not supported in the current version (0.6).

  5. Hypervisor independent: Virtual router VMs do not depend on any particular hypervisor. ZStack has scripts for building virtual router VM templates for mainstream hypervisors.
  6. Reasonable performance: Because they run Linux, virtual router VMs can achieve the reasonable performance Linux provides. Users can configure a virtual router offering with more CPUs and more memory to give virtual router VMs enough computing power to cope with heavy network traffic. The main performance concern lies in routing traffic between user VMs and the public NICs of virtual router VMs.

Virtual router VMs provide the NAT-related services, including SNAT, EIP, and port forwarding. In most scenarios, since a public IP address typically has only tens of megabits of bandwidth, a virtual router VM performs well enough. However, when traffic passing through the virtual router demands very high bandwidth, significant performance degradation due to virtualization is inevitable. Two techniques can mitigate this problem:

  • LXC/Docker: Because ZStack supports a variety of hypervisors, once LXC or Docker, as lightweight virtualization technologies, are supported, virtual router VMs running as containers can achieve near-native performance.
  • SR-IOV: Virtual router VMs can be assigned physical NICs through SR-IOV to achieve native network performance.

Note: LXC/Docker and SR-IOV are not supported in the current version (0.6).

In addition, users can control which physical hosts virtual router VMs are allocated to by using system tags and virtual router offerings. Users can even dedicate a physical server to a single virtual router VM; with the help of LXC/Docker or SR-IOV, virtual router VMs can then approach the native network performance a Linux server can provide. Even so, software solutions have inherent performance limits; users can choose a hybrid solution for high network performance, for example, using virtual router VMs only for DHCP and DNS while leaving performance-sensitive services to providers that use physical devices.

Summary

In this article, we introduced ZStack's default network service provider: the virtual router provider. We explained how it works and elaborated on how it satisfies our considerations about network services. With virtual router VMs, ZStack strikes a good balance between flexibility and performance. We believe 90% of users can easily build their network services on commodity hardware.