Every day, SF Technology's big data cluster must collect massive amounts of monitoring data to keep the cluster running stably. We originally used OpenTSDB + HBase as the storage scheme for the monitoring platform's full data, but this approach had many pain points, and the storage scheme had to be replaced. After investigating time-series storage options including Apache IoTDB, Druid, ClickHouse, and TDengine, we finally chose TDengine. Since adopting TDengine, the big data monitoring platform has improved greatly in stability, write performance, query performance, and more, and storage cost has dropped to one tenth of the original scheme.

Scenario and pain points

SF Technology is committed to building an intelligent logistics brain and intelligent logistics services, with continued deep investment in big data products, artificial intelligence, and integrated logistics solutions, placing it in a leading position in China's logistics technology industry. To ensure the smooth operation of its various big data services, we built a big data monitoring platform around Open-Falcon. Since Open-Falcon itself uses RRDtool for data storage, it is not suitable for retaining the full volume of monitoring data, so we adopted OpenTSDB + HBase as the full-data storage scheme for the big data monitoring platform.

Currently, the platform averages several billion writes per day. As the amount of data accessed by the big data monitoring platform keeps increasing, several pain points have become pressing: heavy dependencies, high cost of use, and unsatisfactory performance.

  • Heavy dependencies and poor stability: as the base layer for big data monitoring, the platform itself relied on big data components such as Kafka, Spark, and HBase for data storage. The overly long data-processing pipeline reduced the platform's reliability. Worse, because the platform depended on big data components while those same components were monitored by the platform, whenever a component became unavailable the problem could not be located through the monitoring platform in time.
  • High cost of use: the volume of monitoring data written is large, and the full data must be retained for more than half a year to allow problem tracing. According to capacity planning, we therefore deployed a 4-node OpenTSDB cluster plus a 21-node HBase cluster as the full-data storage cluster. Even after compression, about 1.5 TB of storage space (3 replicas) was still consumed every day, making the overall cost high.
  • Performance failed to meet requirements: OpenTSDB, as the full-data storage scheme, basically met our write-performance requirements, but fell short for routine large-span, high-frequency queries. On one hand, OpenTSDB returned results slowly, taking more than ten seconds for queries over a large time span; on the other hand, the QPS it could sustain was low, and as users grew, OpenTSDB was prone to crashing and rendering the whole service unusable.

Technology selection

To solve the above problems, the full-data storage scheme had to be upgraded. For database selection, we pre-researched and analyzed the following candidates:

  • Apache IoTDB: a newly graduated top-level Apache project, contributed by Tsinghua University, with good single-node performance. However, at the time of our investigation it did not yet support cluster mode, and standalone deployment could not meet our disaster-recovery and scaling requirements.
  • Druid: a powerful, scalable, distributed system that is self-healing, self-balancing, and easy to operate, but it relies on ZooKeeper and on Hadoop for deep storage, and its overall complexity is high.
  • ClickHouse: the best raw performance, but operation and maintenance costs are too high, scaling is extremely complex, and resource consumption is heavy.
  • TDengine: satisfactory on performance, cost, and operational difficulty, with support for horizontal scaling and high availability.

After a comprehensive comparison, we chose TDengine as the monitoring data storage scheme. TDengine supports a variety of ingestion interfaces, including JDBC and HTTP, which are easy to use. Because monitoring data ingestion demands high write performance, we ultimately adopted the Go connector. The ingestion process involves the following steps:

  • Data cleaning: discard records with an invalid format;
  • Data formatting: convert raw records into entity objects;
  • SQL construction: determine, per record, which SQL statement to write;
  • Batch writing: to improve throughput, accumulate the per-record SQL fragments and write them in batches.
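The four steps above can be sketched as follows. In production we used the Go connector; this Python sketch is for brevity only, and the field names (`metric`, `timestamp`, `value`, `tags`) are hypothetical:

```python
def clean(raw_records):
    """Step 1: drop records that do not match the expected format."""
    required = {"metric", "timestamp", "value", "tags"}
    return [r for r in raw_records if required <= r.keys()]

def to_entity(record):
    """Step 2: normalize a raw record into a flat entity dict."""
    return {
        "metric": record["metric"].replace(".", "_"),
        "ts": int(record["timestamp"]),
        "value": float(record["value"]),
        "tags": dict(sorted(record["tags"].items())),
    }

def to_sql_fragment(entity):
    """Step 3: build the per-record fragment (sub-table name + values)."""
    subtable = entity["metric"] + "_" + "_".join(str(v) for v in entity["tags"].values())
    return f"{subtable} VALUES ({entity['ts']}, {entity['value']})"

def batch_insert_sql(entities):
    """Step 4: splice fragments into one batched INSERT statement."""
    return "INSERT INTO " + " ".join(to_sql_fragment(e) for e in entities)
```

Batching matters because TDengine accepts multiple sub-tables in a single INSERT, so one round trip can carry many records.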

Data modeling

Before ingesting data into TDengine, the schema must be designed around the data's characteristics to achieve the best performance. The big data monitoring platform's data has the following characteristics:

  • Fixed format, always carrying a timestamp;
  • unpredictable content: new nodes or services upload new label combinations, so the data models cannot all be created up front and must be created on the fly as data arrives;
  • few label columns, but frequently changing label values; fixed value columns: timestamp, monitoring value, and sampling frequency;
  • small individual records, about 100 bytes each;
  • large daily volume, more than 5 billion records;
  • retention of more than 6 months.

Based on these characteristics, we built the following data model.

According to the data modeling recommended by TDengine, each type of data collection point needs a super table. For example, if disk utilization is collected for every disk on every host, disk utilization can be abstracted into one super table. Combining our data characteristics with our usage scenarios, we created the data model as follows:

  • Each metric is a super table, which makes it convenient to aggregate and analyze data of the same type;
  • the monitoring data itself carries label information, which is used directly as the tag columns of the super table; each distinct combination of tag values forms a sub-table.

The database structure is as follows:

The super table structure is as follows:
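The original structure listings are not reproduced here. As an illustration only, a database and super table matching the characteristics above might be declared like this in TDengine 2.x SQL (the database name, column definitions, tag names, and retention value are all assumptions):

```sql
-- Hypothetical database: 2 replicas, ~6-month (180-day) retention
CREATE DATABASE IF NOT EXISTS monitor KEEP 180 REPLICA 2;

-- One super table per metric, e.g. disk utilization:
-- value columns are fixed (timestamp, monitoring value, sampling frequency),
-- tag columns come from the uploaded labels
CREATE TABLE IF NOT EXISTS monitor.disk_used (
    ts        TIMESTAMP,
    value     DOUBLE,
    frequency INT
) TAGS (
    host      BINARY(64),
    disk      BINARY(32)
);

-- One sub-table per distinct tag combination:
CREATE TABLE IF NOT EXISTS monitor.disk_used_node1_sda
    USING monitor.disk_used TAGS ('node1', 'sda');
```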

Implementation

The big data monitoring platform is the foundation on which the upper-layer big data platform runs stably, so the whole system must be highly available. As business volume and monitoring data keep growing, the storage layer must also be easy to scale horizontally. Based on these two requirements, the overall architecture of the TDengine deployment is as follows:

To guarantee high availability and scalability for the whole system, we placed an Nginx cluster at the front for load balancing, and separated out the client layer so that it can be scaled in and out according to traffic demand.

The main implementation difficulties were as follows.

  • Data writing: because the metric-upload interface is open and only the format is validated, the set of incoming metrics is not known in advance, so super tables and sub-tables cannot be created beforehand; instead, each record must be checked to see whether a new super table is needed. If every such check hit TDengine, write throughput would drop dramatically and fail to meet requirements. To solve this, we built a local cache: TDengine is queried only once per metric, and subsequent records for known metrics are written directly in batches, which greatly improves write speed. In addition, batch table creation used to be very slow, so to sustain write speed we had to create tables and insert data in separate batches, and cache the sub-table metadata. A later TDengine version optimized sub-table creation, greatly improving its speed and simplifying the insertion flow.
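A minimal sketch of the local-cache idea, again in Python for brevity. The `create_super`/`create_sub` callbacks stand in for the actual TDengine DDL calls and are hypothetical:

```python
# Metrics and tag combinations seen so far; TDengine is consulted once
# per entry, and every later record skips the round trip entirely.
known_super_tables = set()   # metric names whose super table exists
known_sub_tables = set()     # (metric, tag-combination) pairs created

def ensure_schema(entity, create_super, create_sub):
    """Create super/sub tables only on first sight of a metric or tag combo."""
    metric = entity["metric"]
    sub_key = (metric, tuple(sorted(entity["tags"].items())))
    if metric not in known_super_tables:
        create_super(metric, entity["tags"].keys())  # one TDengine call, then cached
        known_super_tables.add(metric)
    if sub_key not in known_sub_tables:
        create_sub(metric, entity["tags"])           # one TDengine call, then cached
        known_sub_tables.add(sub_key)
```

With this cache in place, the steady-state write path never touches DDL at all, which is what allows pure batched inserts.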
  • Queries: 1. Query bugs. Monitoring data is presented mainly through Grafana. During use we found that the official plugin did not support parameter setting, so we modified it for our own needs and contributed the change back to the community. We also triggered a serious query bug: with many dashboards configured, refreshing the page could crash the server. Investigation showed that refreshing a Grafana dashboard issues multiple query requests simultaneously, and the concurrent-query handling was at fault; this was later fixed officially. 2. Query single point of failure. TDengine's native HTTP queries go directly to one specific server, which is risky in production: all queries concentrate on one machine, easily overloading it, and high availability of the query service cannot be guaranteed. We therefore used the Nginx cluster as a reverse proxy in front of the TDengine cluster, distributing query requests evenly across all nodes, which in principle allows query capacity to scale out without limit.
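The reverse-proxy setup might look like the following Nginx fragment (the upstream host names and the 6041 REST port are assumptions, not our actual configuration):

```nginx
# Spread TDengine REST queries across all nodes instead of one server
upstream tdengine_rest {
    server td-node1:6041;
    server td-node2:6041;
    server td-node3:6041;
}

server {
    listen 80;
    location /rest/ {
        proxy_pass http://tdengine_rest;
    }
}
```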
  • Capacity planning: data type and data scale have a great impact on TDengine's performance, so each scenario is best served by capacity planning based on its own characteristics; the influencing factors include the number of tables, record length, replica count, table activity, and so on. Configuration parameters such as blocks, cache, and ratioOfQueryCores are then tuned accordingly to ensure optimal performance. Through discussion with TAOS Data engineers, we established a capacity-planning model for TDengine. The hard part is memory planning: in general, a 3-node cluster with 256 GB of memory can support at most around 20 million sub-tables. Beyond that, write speed decreases; part of the memory must also be reserved as query cache, typically about 10 GB. If the sub-table count exceeds 20 million, new nodes can be added to share the load.

Results

After the migration, the TDengine cluster easily handles writing the full volume of monitoring data and now runs stably. The post-migration architecture is as follows:

  • Stability: after the migration, the big data monitoring platform no longer depends on big data components, which effectively shortens the data-processing pipeline. It has run stably since going live, and we will continue to observe it.
  • Write performance: TDengine's write performance depends strongly on the shape of the written data. With parameters tuned according to the capacity plan, cluster write speed peaks at about 900,000 records/s under ideal conditions; in the normal case (new tables being created, mixed inserts), the write rate is about 200,000 records/s.
  • Query performance: with pre-aggregated queries, query P99 is within 0.7 seconds, which already meets most of our daily query needs. For large-span (6-month) non-aggregated queries, the first query takes about 10 seconds, and subsequent similar queries drop significantly (2-3 s), mainly because TDengine caches recent query results: a similar query first reads the cached data, then aggregates only the new data.
  • Cost: the number of physical servers dropped from 21 to 3, and the daily storage space required is 93 GB (2 replicas), only about one tenth of OpenTSDB + HBase at the same replica count. Compared with the general-purpose big data platform, this is a great advantage in cost reduction.


At present, from the perspective of big data monitoring, TDengine has clear advantages in cost, performance, and ease of use, especially cost. We would like to thank the TAOS Data engineers for their professional and timely help during pre-research and implementation. We hope TDengine keeps improving in performance and stability while developing new features, and we will also do secondary development to fit our own needs and contribute code back to the community. We wish TDengine ever greater success. There are also some features of TDengine we would like to see improved:

  • Friendlier table-name support;
  • support for joint queries with other big data platforms;
  • richer SQL syntax;
  • gray-scale (rolling) smooth upgrades;
  • automatic sub-table cleanup;
  • faster recovery from abnormal cluster shutdown.

In the future, we will also try to apply TDengine in more SF Technology scenarios, including:

  • An IoT platform, with TDengine as the underlying IoT data storage engine for SF Technology's big data IoT platform;
  • Hive on TDengine, enabling joint queries with other existing data sources so that TDengine can be used smoothly alongside existing systems, lowering the barrier to adoption.