Author | Chen Yu

Recently, one of TDengine’s community groups was flooded with messages. Several group members chatted on and on, practically forgetting to eat and sleep. So what were they still talking about at 4:00 in the morning?

The topic was how to get a TDengine cluster working properly in a Docker environment. “What? Apart from your own staff, there are users working overtime to discuss how to improve cluster setup under Docker? That sounds too fake.”

OK, let’s admit it: a user named Oliver was having trouble connecting a client to a TDengine cluster running in Docker. That set off a discussion led by two enthusiastic group members, which continued until a final solution was found.

This is how it happened:

The user’s database cluster is installed on a Linux server (IP: 10.0.31.2). The containers sit on 172.19.0.0/16, a virtual network created by Docker on the host. The hostnames and IP addresses of the three containers are taosnode1 (172.19.0.41), taosnode2 (172.19.0.42), and taosnode3 (172.19.0.43).

The configuration of each node is as follows:

taosnode1: firstEp=taosnode1:6030, secondEp=taosnode2:6030, fqdn=taosnode1; port mapping: 16030-16042:6030-6042 (TCP/UDP)

taosnode2: firstEp=taosnode1:6030, secondEp=taosnode2:6030, fqdn=taosnode2; port mapping: 26030-26042:6030-6042 (TCP/UDP)

taosnode3: firstEp=taosnode1:6030, secondEp=taosnode2:6030, fqdn=taosnode3; port mapping: 36030-36042:6030-6042 (TCP/UDP)

After a bit of fiddling with the official documentation, Oliver finally got the cluster built. After adding the nodes, he nervously typed “show dnodes” and was relieved to see three READY nodes.

With the server side working, it was time for the client. He turned to his Windows machine at 10.0.31.5 (on the same network segment as the cluster host) and installed the TDengine client on it. The installer is only 2.8 MB and foolproof, easy and convenient. After adding the hosts entries and setting up a route, he connected to the cluster in one go. “show dnodes” once again showed three READY nodes, and it felt good again.

Oliver was pleased. However, he soon discovered that things were not as simple as they seemed.

Due to business needs, he also had to connect a client (10.0.2.61) on a different network segment to the cluster (running in Docker on the host at 10.0.31.2). He could ping the host, telnet to the cluster’s mapped ports, and connect to the cluster with taos, all as smoothly as before. He typed “show dnodes” again and, to his surprise, got “DB error: Unable to establish connection”, the error every TDengine user hates. He then put his question to the group.

It was at this point that the two warm-hearted members mentioned above appeared. One is Freemine, an external contributor to TDengine. The other is Pigwing, who is always eager to help when someone runs into a problem.

Since the cluster itself worked fine, the only difference was that the client now connected to the server across network segments. So the first idea was: since the host ports did not work, try connecting directly to the container IPs inside Docker. Unfortunately, connecting to Docker-internal IPs across network segments did not work either.

The next line of thought started from the fact that TDengine identifies data nodes by their End Point (EP), where EP = FQDN + port. The client connection itself had succeeded, yet data operations failed. Since the FQDNs were correct, the guess was that the ports in the cluster were the problem, which would explain why the client never obtained the cluster’s topology information. From first understanding the environment to troubleshooting step by step, the three persevering engineers kept the discussion going in the group from April 22 to April 25. On the latest night, more than one of them was still online at 4 a.m.
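The EP = FQDN + port relation can be sketched in a few lines of Python. This is an illustration only, not TDengine’s implementation; the `EndPoint` type and the `parse_ep` helper are hypothetical names introduced here.

```python
from typing import NamedTuple


class EndPoint(NamedTuple):
    """Hypothetical model of an End Point: an FQDN plus a port."""
    fqdn: str
    port: int

    def __str__(self) -> str:
        return f"{self.fqdn}:{self.port}"


def parse_ep(text: str) -> EndPoint:
    """Split an 'fqdn:port' string, such as a firstEp value, into its parts."""
    fqdn, sep, port = text.strip().rpartition(":")
    if not sep or not fqdn or not port.isdigit():
        raise ValueError(f"not an fqdn:port pair: {text!r}")
    return EndPoint(fqdn, int(port))


first_ep = parse_ep("taosnode1:6030")
second_ep = parse_ep("taosnode2:6030")

# Same port, different FQDN: these are still two different end points.
print(first_ep, second_ep, first_ep != second_ep)
```

Both halves of the pair matter: if either the FQDN or the port leads the client to the wrong place, the end point as a whole is unusable.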

Finally, after a lot of trial and error, at 1am on April 24, Freemine came up with a final solution that worked.

Done, tested, all is well!

So, how does Freemine’s cluster building approach differ from the original cluster building?

The process was tortuous, but a careful comparison of the two schemes shows that they differ only in port configuration. Freemine’s solution gives every node a different serverPort value and maps it to the same port number on the host. The serverPort of taosnode1 is 6030, mapped to port 6030 on the host. The serverPort of taosnode2 is 7030, mapped to port 7030 on the host. The serverPort of taosnode3 is 8030, mapped to port 8030 on the host.

Oliver’s original setup left serverPort at the default 6030 on every node, mapped to 16030, 26030, and 36030 on the host. This configuration causes no problems when clients connect from the same network segment as the cluster host, but it fails when they connect across network segments.
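The difference between the two schemes can be checked with a small Python sketch. It rests on two assumptions that are not spelled out in the discussion: that each node advertises itself as fqdn:serverPort, and that a client outside the Docker network reaches every node through the host. The dictionary layout and the `mismatched_nodes` helper are hypothetical.

```python
# node -> (serverPort inside the container, host port it is published on)
ORIGINAL = {
    "taosnode1": (6030, 16030),
    "taosnode2": (6030, 26030),
    "taosnode3": (6030, 36030),
}
FREEMINE = {
    "taosnode1": (6030, 6030),
    "taosnode2": (7030, 7030),
    "taosnode3": (8030, 8030),
}


def mismatched_nodes(scheme):
    """Return nodes whose advertised port differs from the published host port.

    Assumption: the client dials the advertised serverPort on the host, so a
    node is only reachable when the host publishes that same port number.
    """
    return sorted(
        node
        for node, (server_port, host_port) in scheme.items()
        if server_port != host_port
    )


print("original:", mismatched_nodes(ORIGINAL))
print("freemine:", mismatched_nodes(FREEMINE))
```

Under these assumptions, every node in the original scheme advertises a port that the host does not publish under the same number, while Freemine’s scheme has no such mismatch.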

Why does such a seemingly small change make such a big difference?

In fact, when the client and the server are on the same network segment, the client can reach the Docker network directly once a route is added. The FQDNs then resolve to the right IP addresses: taosnode1 (172.19.0.41), taosnode2 (172.19.0.42), and taosnode3 (172.19.0.43). Because the IP addresses differ, TDengine can tell the nodes apart even though they all use port 6030.

However, things change across network segments. A client on a different segment has to reach the server through real routes, but the Docker internal network we set up is not registered in those routes, so the client cannot reach it. Therefore, when taosc needs the information about the individual nodes that the cluster provides, the FQDNs can no longer be resolved to the correct IP addresses. In this case, the nodes have to be distinguished by port.

This is why the nodes in the Docker environment can no longer all use port 6030.

Therefore, when the port mapping is the same inside the container and on the Docker host, and each node has a different serverPort, the cluster can distinguish the nodes by port. The client can then get the topology information and operate the cluster smoothly.
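The reasoning above can be illustrated by counting how many distinct (IP, port) addresses a client ends up with. This is a sketch under a stated assumption: the cross-segment client’s hosts entries point all three node names at the Docker host (10.0.31.2), since the container IPs are unreachable from there. The hosts tables and the `socket_addresses` helper are hypothetical.

```python
# Same-segment client: a route to the Docker network exists, so each
# FQDN resolves to its own container IP (addresses from the article).
SAME_SEGMENT_HOSTS = {
    "taosnode1": "172.19.0.41",
    "taosnode2": "172.19.0.42",
    "taosnode3": "172.19.0.43",
}

# Cross-segment client (assumption): every FQDN resolves to the Docker host.
CROSS_SEGMENT_HOSTS = {
    "taosnode1": "10.0.31.2",
    "taosnode2": "10.0.31.2",
    "taosnode3": "10.0.31.2",
}

SAME_PORT = [("taosnode1", 6030), ("taosnode2", 6030), ("taosnode3", 6030)]
DISTINCT_PORTS = [("taosnode1", 6030), ("taosnode2", 7030), ("taosnode3", 8030)]


def socket_addresses(hosts, endpoints):
    """Resolve each (fqdn, port) end point to the (ip, port) the client dials."""
    return {(hosts[fqdn], port) for fqdn, port in endpoints}


# Different IPs keep the nodes apart even when the port is shared.
print(len(socket_addresses(SAME_SEGMENT_HOSTS, SAME_PORT)))
# One shared IP and one shared port: the three nodes collapse into one address.
print(len(socket_addresses(CROSS_SEGMENT_HOSTS, SAME_PORT)))
# One shared IP but distinct ports: the nodes are distinguishable again.
print(len(socket_addresses(CROSS_SEGMENT_HOSTS, DISTINCT_PORTS)))
```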

This is the final answer to the whole “case”.

To sum up, building a TDengine cluster in a Docker environment still has plenty of pitfalls for users. Because of the complexity of the environment, we do not recommend building clusters this way, and you should be careful about using TDengine in Docker environments.

Finally, we would like to say that for an open source product, an active and professional community is what matters most to us. There is currently no document on the official website about building a TDengine cluster in a Docker environment, but the active thinking of these community users has largely filled that gap.

Thank you to Oliver, Freemine and Pigwing. We really hope to keep seeing you at the forefront of big data technology for the Internet of Things, and we hope more friends will join in.

Add T (TDengine) as a friend on WeChat to join the group and interact with others who are interested in open source.

Click “Here” to see Oliver’s notes on TDengine cluster building in Docker environment.