This article analyzes the unified monitoring of multi-cloud virtual machines in three parts: a general introduction, an introduction to the core monitoring Agent, and a summary with a demonstration.

1. Unified monitoring of multi-cloud VMs

For a multi-cloud management platform, unified monitoring is a necessary and important function.

1) Architecture

One Cloud has always provided unified monitoring, which mainly collects monitoring information from each infrastructure and aggregates it into statistics.

2) Past implementation

Previously, monitoring queries, monitoring alarms, and monitoring operations were all implemented by calling each cloud vendor's API to obtain its monitoring data, which is relatively simple and natural.

But the disadvantages are also obvious:

First, the monitoring data is not unified. Although the monitoring data of Alibaba Cloud and Tencent Cloud are broadly similar, there are differences.

Second, cloud APIs limit the number of calls. For example, the Alibaba Cloud API limits the number of calls per day; once the limit is exceeded, no data can be obtained. The polling interval therefore has to be lengthened, which may lose some data.

Third, the latency of the monitoring data is relatively high. Because of the call limit, the polling interval has to be larger, and the data still has to be pulled from the vendor, so it is second-hand data with high delay. This hinders alarms in particular, because alarms need near-real-time data.
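To make the effect of the call limit concrete, here is a minimal sketch of how a daily quota translates into a minimum polling interval. The quota and VM count are made-up numbers for illustration, not actual Alibaba Cloud limits, and the sketch assumes one API call per VM per polling round.

```shell
#!/bin/sh
# Hypothetical numbers for illustration only; not real vendor quotas.
DAILY_QUOTA=100000   # API calls allowed per day (assumed)
VM_COUNT=500         # VMs to poll, one call per VM per round (assumed)

SECONDS_PER_DAY=86400

# Shortest interval between polling rounds that stays within the quota:
# seconds per day divided by (quota / VMs), using ceiling division so the
# quota is never exceeded.
MIN_INTERVAL=$(( (SECONDS_PER_DAY * VM_COUNT + DAILY_QUOTA - 1) / DAILY_QUOTA ))

echo "minimum polling interval: ${MIN_INTERVAL}s"
```

With these assumed numbers each VM can be sampled only about once every 432 seconds, so any change shorter than that can be missed, and an alarm may see it minutes late.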

3) Present implementation

The current implementation installs a monitoring Agent on each multi-cloud virtual machine. The Agent collects monitoring data on the VM and actively pushes it to the One Cloud database. This overcomes the three shortcomings mentioned above.

Because every VM runs the same Agent, the data obtained is unified.

Since no cloud vendor API is called, there is no limit on the number of calls.

The Agent pushes the data actively, so we can control the delay.

In short, unified monitoring of multi-cloud VMs is now done through the monitoring Agent, which the following sections focus on.

2. Monitoring Agent

The monitoring Agent is a daemon running on a VM. It collects monitoring data and sends the data back to InfluxDB.

How is the Agent installed on a VM? How does it collect data? How does it send the data back?

The implementation is described below in these three parts.

1) Install the Agent on the VM

We use Ansible to install the Agent on VMs.

Using Ansible to install the Agent on VMs inside a VPC in the cloud requires solving two key problems:

First, how can the Ansible component in One Cloud connect to VMs in a VPC on the cloud? The VPC's internal network is isolated, so the network may not be reachable and a direct connection may not be possible.

Second, to install the Agent with Ansible, One Cloud must be able to log in to the VM.

(1) How to log in

To discuss login, first assume that One Cloud can already connect to the VM, for example through a NAT gateway or because the VM is bound to an EIP.

The login problem falls into two cases. In the first case, the VM was created through the One Cloud platform. Here the login problem is already solved: One Cloud can log in to the VM as the cloudroot user with a public key, and the private key is stored in the local database.

However, many cases differ from this. For example, an Alibaba Cloud account has just been brought under management, and the data of its virtual machines was pulled from Alibaba Cloud. In this case, direct login is not possible.

(2) Users assist in configuring passwordless login

In this case, users need to help One Cloud configure passwordless login. They can do so in either of the following ways:

First, the user temporarily provides the user name and password of the virtual machine, so that One Cloud can temporarily log in to it.

One Cloud then uses Ansible to create a cloudroot user on the target VM and configure public key login, which meets the requirement for direct login.

Second, a script is shown directly to the user. The user only needs to copy the script to the virtual machine and run it to achieve the same effect, after which One Cloud can log in to the virtual machine without a password.
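The article does not reproduce the script shown to users. The following is only a minimal sketch of what such a script could look like, assuming it is run as root, that the user is named cloudroot as in the text, and that the platform supplies its public key. The key value and the --apply switch are placeholders introduced for illustration.

```shell
#!/bin/sh
# Sketch of a script a user could run as root on their own VM.
# PUBKEY is a placeholder; the real key would be supplied by the platform.
LOGIN_USER="cloudroot"
PUBKEY="ssh-ed25519 AAAA_PLACEHOLDER_KEY cloudroot@onecloud"

# Add a public key to <home>/.ssh/authorized_keys unless it is already there.
install_pubkey() {
    home_dir=$1
    key=$2
    mkdir -p "$home_dir/.ssh"
    chmod 700 "$home_dir/.ssh"
    touch "$home_dir/.ssh/authorized_keys"
    chmod 600 "$home_dir/.ssh/authorized_keys"
    grep -qxF -- "$key" "$home_dir/.ssh/authorized_keys" ||
        printf '%s\n' "$key" >> "$home_dir/.ssh/authorized_keys"
}

# Only modify the system when called with --apply, so the function above
# can be tried against a scratch directory first.
if [ "${1:-}" = "--apply" ]; then
    id "$LOGIN_USER" >/dev/null 2>&1 || useradd -m "$LOGIN_USER"
    home_dir=$(getent passwd "$LOGIN_USER" | cut -d: -f6)
    install_pubkey "$home_dir" "$PUBKEY"
    chown -R "$LOGIN_USER" "$home_dir/.ssh"
fi
```

The 700 and 600 permissions matter because sshd, with its default StrictModes setting, ignores key files whose directory or file permissions are too open. Running the script twice does not duplicate the key.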

After the login problem is resolved, the Agent can be installed.

Of course, this rests on our earlier assumption that the connectivity problem is already solved and the network is reachable.

If the network is not reachable, we can use an SSH proxy, specifically Local Port Forwarding.

(3) Introduction to Local Port Forwarding

Assume that network A and network B are two separate networks. What if VMA on network A needs to access the web service listening on port 80 on VMB in network B?

VMB has no public IP address, only an internal one, so we need to set up two proxies.

Create proxyA on network A and proxyB on network B. proxyB must have a public IP address so that it can be reached from network A. On proxyA, use SSH Local Port Forwarding to create an SSH proxy; the command sets up a local port forward and runs in the background.

The forwarding argument of the command has two parts. The part before the second colon is the local side: proxyA's IP address plus the port it listens on.

The part after the second colon is the IP address and port actually accessed on the remote end.

The command ends with cloudroot and the IP address of proxyB, that is, the login user and the host to log in to.

Executing this command requires proxyA to be able to log in to proxyB as the cloudroot user, which can be arranged with the login method described above.

VMA can then access the web service on VMB by accessing 10.127.30.251:12345.

Here is a quick overview of the process.

After the command is executed on proxyA, an SSH tunnel is first established between proxyA and proxyB, and a port forward is created on proxyA that listens on 10.127.30.251:12345. Once a request arrives at that address, it is forwarded to proxyB through the SSH tunnel, and proxyB forwards it to 172.31.25.194:80.
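The command itself is not reproduced in the text. A sketch consistent with the addresses above might look as follows, assuming the same -N and -f flags as the remote forwarding command later in the article. PROXYB_ADDR is a placeholder, since proxyB's address is not given. The script only prints the command rather than running it.

```shell
#!/bin/sh
# Addresses are the ones used in the text; PROXYB_ADDR is a placeholder.
LISTEN_ADDR="10.127.30.251"   # proxyA's address on network A
LISTEN_PORT="12345"
TARGET_ADDR="172.31.25.194"   # where proxyB sends the traffic (web service)
TARGET_PORT="80"
PROXYB_ADDR="proxyb.example.com"

# -N: do not run a remote command, -f: go to background after login,
# -L: local port forwarding (listen locally, connect from the remote end).
FORWARD_CMD="ssh -NfL ${LISTEN_ADDR}:${LISTEN_PORT}:${TARGET_ADDR}:${TARGET_PORT} cloudroot@${PROXYB_ADDR}"

# Dry run: print the command instead of executing it.
echo "$FORWARD_CMD"
```

Once such a forward is running on proxyA, a request from VMA such as `curl http://10.127.30.251:12345/` would be answered by the web service behind proxyB.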


(4) One Cloud connects to internal VMs in the VPC

Consider the following scenario. One Cloud contains an Ansible component, and Alibaba Cloud contains a target VM on which the Agent needs to be installed.

Run a proxy service in One Cloud, and then pick a virtual machine inside the Alibaba Cloud VPC to act as proxyVM. It must meet two conditions: first, One Cloud can access it; second, it can access the target VM, so that Local Port Forwarding can be performed.

By running the preceding command on the proxy, the Ansible component can reach the target VM through the proxy and connect to it for installation. With the connection and login problems solved, the Ansible component can install the monitoring Agent on the VM.
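As a rough sketch of how these pieces could fit together, the script below forwards a local port to the target VM's SSH port through proxyVM and points an Ansible inventory at that port. All addresses, the port number, and the playbook name install_agent.yml are hypothetical, and the sketch assumes Ansible runs on the same host as the proxy so that it can use 127.0.0.1. It only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical values; none of them come from the article.
TARGET_VM="192.168.0.10"             # target VM's internal address in the VPC
PROXYVM_ADDR="proxyvm.example.com"   # proxyVM address reachable from One Cloud
LOCAL_PORT="20022"                   # local port that stands in for the target

# 1. Forward a local port to the target VM's SSH port (22) via proxyVM.
FORWARD_CMD="ssh -NfL 127.0.0.1:${LOCAL_PORT}:${TARGET_VM}:22 cloudroot@${PROXYVM_ADDR}"

# 2. Point Ansible at the forwarded port instead of the unreachable address.
INVENTORY=$(mktemp)
cat > "$INVENTORY" <<EOF
[agent_targets]
target-vm ansible_host=127.0.0.1 ansible_port=${LOCAL_PORT} ansible_user=cloudroot
EOF

# Dry run: print what would be executed.
echo "$FORWARD_CMD"
echo "ansible-playbook -i $INVENTORY install_agent.yml"
```

The inventory variables ansible_host, ansible_port, and ansible_user are standard Ansible connection settings; from Ansible's point of view the target VM is simply an SSH server on the forwarded port.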

2) How does the monitoring Agent collect data?

The first version of the monitoring Agent is actually Telegraf, with a customized configuration file that meets our data collection needs.

Telegraf is an open source agent for collecting, processing, aggregating, and writing metrics. It can be configured flexibly: what data to collect, where to send it, which tags to add, and so on.

For example, the first section is [global_tags]. You can add a tag such as vm_name = "correhost" to indicate the VM name. Tags can likewise carry the region, project, and platform.

With these tags, data can be queried by label, for example by platform or by project. The other two sections are more important: INPUTS and OUTPUTS.

INPUTS specify what data is collected. For example, the cpu input collects CPU-related data.

OUTPUTS specify where the data is sent; here, the data is sent to InfluxDB.

In summary, we use Telegraf with a customized configuration file, which meets our requirements for a monitoring Agent.
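The actual configuration file is not shown in the article. The script below writes a minimal Telegraf configuration with the three sections described above, as an illustration only. The tag values, collection interval, database name, and InfluxDB URL are assumptions.

```shell
#!/bin/sh
# Write a minimal Telegraf configuration to a scratch file.
# Tag values, interval, database name and the InfluxDB URL are assumptions.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
[global_tags]
  vm_name = "example-vm"
  platform = "example-platform"

[agent]
  interval = "60s"

# INPUTS: what to collect
[[inputs.cpu]]
  percpu = false
  totalcpu = true
[[inputs.mem]]
[[inputs.disk]]

# OUTPUTS: where to send it
[[outputs.influxdb]]
  urls = ["http://influxdb.example.com:8086"]
  database = "telegraf"
EOF

echo "wrote $CONF"
# To try it with Telegraf installed: telegraf --config "$CONF" --test
```

The --test flag makes Telegraf gather the inputs once and print the metrics instead of sending them, which is a convenient way to check that the inputs and tags look as expected before pointing the output at a real database.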

3) How does the monitoring Agent send back data?

The essence of this problem is how to let the Agent service on a VM access InfluxDB in One Cloud. There are again two cases. In the simpler case, the Agent can connect directly; otherwise, an SSH proxy is used again, specifically Remote Port Forwarding.

(1) SSH Remote Port Forwarding

Here, VMs on network B need to access a service on network A. The following command is run on proxyA:

ssh -NfR 172.31.25.194:12345:10.127.40.251:30086 cloudroot@<proxyB IP>

This requires proxyA to be able to log in to proxyB as the cloudroot user, which can be arranged with the login method described earlier.

After this command is executed, an SSH tunnel is established between proxyA and proxyB, and a port forward is created on proxyB that listens on 172.31.25.194:12345. When a request is received there, it is forwarded to proxyA over the SSH tunnel, and proxyA forwards it to 10.127.40.251:30086.

In this way, VMB on network B can access the database simply by accessing 172.31.25.194:12345.
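The remote forwarding setup can be sketched as a small dry-run script that builds the forwarding command and the address an Agent on network B would send data to. The addresses are the ones used in the text; PROXYB_ADDR is a placeholder, and the use of plain HTTP for the database URL is an assumption.

```shell
#!/bin/sh
# Addresses are taken from the text; PROXYB_ADDR is a placeholder.
BIND_ADDR="172.31.25.194"   # address proxyB listens on (network B)
BIND_PORT="12345"
DB_ADDR="10.127.40.251"     # InfluxDB as reached from proxyA (network A)
DB_PORT="30086"
PROXYB_ADDR="proxyb.example.com"

# -N: do not run a remote command, -f: go to background after login,
# -R: remote port forwarding (listen on the remote end, connect locally).
FORWARD_CMD="ssh -NfR ${BIND_ADDR}:${BIND_PORT}:${DB_ADDR}:${DB_PORT} cloudroot@${PROXYB_ADDR}"

# URL an Agent on network B would use as its database endpoint
# (plain HTTP is an assumption).
AGENT_DB_URL="http://${BIND_ADDR}:${BIND_PORT}"

# Dry run: print instead of executing.
echo "$FORWARD_CMD"
echo "$AGENT_DB_URL"
```

One detail worth checking in a real deployment: OpenSSH servers bind remote forwards to the loopback address by default, so listening on a non-loopback address such as 172.31.25.194 requires GatewayPorts to be enabled (for example, set to clientspecified) in the sshd_config on proxyB.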

(2) Data transmission through the proxy

The overall approach is similar to the previous one: an SSH tunnel still needs to be established between proxyVM and the proxy, and the virtual machine can then send data back by accessing proxyVM directly.

3. Summary and demonstration

1) Summary

The overall solution is to create an SSH tunnel between the proxyVM in the VPC and the proxy in One Cloud, access the VM through it, install the Agent, and then send data back to the database through the SSH tunnel.

The Agent section covered three things: how to install the Agent, how it collects data, and how it sends the data back.

2) Process demonstration

(1) Create a VPC on the cloud

(2) Create an IP subnet

(3) Create a VM with EIP as proxyVM

(4) VMs created on the One Cloud platform already support passwordless login

(5) Turn the VM into a proxy

(6) Create ordinary VMs in the VPC

(7) Monitoring

The monitoring interface is divided into basic monitoring and Agent monitoring. Basic monitoring refers to the monitoring data obtained by calling the cloud API.

Agent monitoring is the monitoring data returned by monitoring agents running on VMS. The installation of Agent monitoring must be manually triggered.

(8) Agent monitoring requires the user to click to install the Agent

(9) Data display after successful installation is shown in the figure below.

GitHub: github.com/yunionio/cl…

Official website: www.yunion.cn/