
With the growing popularity of containerization, more and more projects adopt containerized schemes for architecture design and deployment.

As a core time-series data engine, TDengine generally recommends using FQDNs (Fully Qualified Domain Names) for communication between nodes during deployment. In container architectures, however, many customers use IP addresses for communication, and because both the IP address and the container name (hostname) change with the container's life cycle, it is difficult for TDengine to use the hostname as the FQDN for addressing and communication.

In this article, we will try to give a best-practice suggestion that meets the following two requirements:

  1. TDengine cluster nodes are addressed by IP address
  2. The application side does not need to configure the IP/FQDN of any cluster node; it only uses the LoadBalancer domain name as firstEp to address the cluster

Assume that the TDengine cluster consists of two nodes with the IP addresses 172.16.31.1 and 172.16.31.2.
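As a minimal sketch (assuming TDengine 2.x and the default port 6030), the taos.cfg on each node could use its own IP address as its FQDN; the values below are just the example addresses above:

```
# /etc/taos/taos.cfg on 172.16.31.1 (use 172.16.31.2 as fqdn on the second node)
fqdn       172.16.31.1       # the node's own IP address used as its FQDN
serverPort 6030              # default TDengine server port
firstEp    172.16.31.1:6030  # end point of the first dnode to contact when starting
```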

Before going further, the reader is expected to have an understanding of the overall architecture of TDengine from the TDengine technical documentation.

Let’s highlight some of the concepts in TDengine:

  1. Physical node (pnode): an independently running computer with its own computing, storage and networking capabilities. It can be a physical machine, a virtual machine or a Docker container with an OS installed. A physical node is identified by its configured FQDN.
  2. Data node (dnode): a running instance of the TDengine server-side program taosd on a physical node. A working system must have at least one data node. A dnode contains zero or more logical virtual nodes (vnodes) and at most one logical management node (mnode). The unique identifier of a dnode in the system is its End Point (EP), which combines the FQDN of the physical node where the dnode runs with the network port number configured by the system.
  3. Management node (mnode): a virtual logical unit responsible for monitoring and maintaining the health status of all data nodes and for load balancing among them. The management node also stores and manages metadata, including users, databases, tables, static tags, and so on.

The basic process of client initialization is as follows: the application accesses TDengine through the taosc native interface, finds the cluster through firstEp, and obtains the meta-data of the cluster, that is, the list of all nodes in the cluster (FQDNs or IP addresses). Once the client driver has this list, it can use it to communicate with the nodes in the cluster. By default, data below 15 KB is sent over UDP and data above 15 KB over TCP.

With the above process understood, we can take advantage of the LoadBalancer to work around the hostname problem in containerized deployment. Of course, this scheme can also be used for non-containerized deployments where you want to deploy TDengine using IP addresses.

In this scenario, firstEp points to the domain name of the LoadBalancer and the corresponding port (6030 by default). Let's assume that the domain name of the LoadBalancer is lb.taosdata.com. The TDengine cluster nodes use their IP addresses as FQDNs: 172.16.31.1 and 172.16.31.2.
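On the application side this means the client only needs the LoadBalancer domain name; a minimal client-side taos.cfg sketch under these assumptions:

```
# client-side /etc/taos/taos.cfg
firstEp lb.taosdata.com:6030   # LoadBalancer end point; no per-node IP/FQDN is configured here
```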

The application connects through the client driver using firstEp: lb.taosdata.com:6030. After receiving the request, the LoadBalancer forwards it to one of the preset TDengine nodes, 172.16.31.1:6030 or 172.16.31.2:6030, according to its load balancing strategy. The node that receives the message returns the meta-data to the requestor along the original route (if the current node is not the mnode master, a redirection is triggered and the subsequent process is similar). Finally, the application's client driver refreshes its meta-data and obtains the node list.
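The article does not prescribe a specific LoadBalancer implementation. As one possible illustration only, an NGINX stream proxy could forward port 6030 to the two nodes (an assumed example, not part of the original setup):

```
# Assumed example: NGINX with the stream module acting as the LoadBalancer
stream {
    upstream tdengine_firstep {
        server 172.16.31.1:6030;   # dnode 1
        server 172.16.31.2:6030;   # dnode 2
    }
    server {
        listen 6030;               # TCP listener for firstEp traffic
        proxy_pass tdengine_firstep;
    }
    # If the LoadBalancer must also carry UDP (see the note on rpcForceTcp below),
    # an additional server block with "listen 6030 udp;" would be needed.
}
```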

After that, when the application needs to communicate with any node in the cluster, it initiates the connection directly using the acquired IP list, without going through the LoadBalancer.

Last but not least, if the above steps do not result in successful communication, it is likely that your LoadBalancer does not support UDP, or that the UDP packet loss rate is too high. The solution is simple: turn on the rpcForceTcp switch on all nodes (all clients and servers) so that all outgoing traffic uses TCP instead of UDP, and make sure the traffic through the LoadBalancer goes over TCP.
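A minimal sketch of that switch, assuming TDengine 2.x where rpcForceTcp is a taos.cfg parameter, added on every server and client node:

```
# /etc/taos/taos.cfg on all servers and clients
rpcForceTcp 1   # force all RPC traffic over TCP instead of UDP
```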

