Preface

Performance is a major concern for users when selecting and using a time-series database.

To accurately reflect TDengine's performance, we have planned a series of test reports called "Performance Comparison between TDengine and InfluxDB".

As the saying goes, words alone are not proof, so today we will start by sharing a comparison of the write performance of the two databases.

To make the comparison more convincing, the test is based on the scenarios and data sets previously used by InfluxDB in its performance comparison with Graphite (https://www.influxdata.com/blog/influxdb-outperforms-graphite-in-time-series-data-metrics-benchmark/).

After extensive preparation and repeated testing, we came to the following conclusions: 1. Under the optimal conditions published by InfluxDB, the write speed of TDengine is twice that of InfluxDB. 2. When the number of devices is scaled up to 1,000, the write speed of TDengine is 5.2 times that of InfluxDB.

In addition to providing test results, we have a small goal: any developer or architect reading this article should be able to reproduce the same process and results by following the steps and configurations described here. In our opinion, only a test report obtained this way is truly valuable.

Main text

InfluxDB is an open-source time-series database written in Go. At its core is a custom-built storage engine optimized for time-series data. It is currently the most popular time-series database, ranking No. 1 in DB-Engines' list of time-series databases.

TDengine is an IoT big data platform that integrates message queuing, database, and stream computing functions. The product does not rely on any open-source or third-party software and has fully independent intellectual property rights, featuring high performance, high reliability, scalability, zero management, and ease of learning. Compared with InfluxDB, TDengine is a dark horse in the current time-series database field.

Next, we officially enter the test.

I. Basic information

The data set used for this test models a DevOps monitoring scenario: a set of servers periodically reports system and application metrics, sampling 100 values every 10 seconds across nine subsystems (CPU, memory, disk, disk I/O, kernel, network, Redis, PostgreSQL, and Nginx) on each server. To better compare key metrics, InfluxDB chose a 24-hour period and 100 devices in its comparison with Graphite, so this relatively modest deployment has been reused for the current TDengine/InfluxDB comparison test.
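As a rough sanity check on the data volume (assuming one record per subsystem per host per 10-second interval; the exact field layout follows the InfluxDB benchmark), the 24-hour, 100-device scenario works out to:

```shell
# Back-of-envelope record count for the devops scenario:
# 9 subsystems x 100 hosts, one sample every 10 seconds, for 24 hours.
intervals=$(( 24 * 3600 / 10 ))          # 8640 sampling intervals
records=$(( intervals * 9 * 100 ))       # total records written per side
echo "$records"                          # 7776000
```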

The important parameters are shown in the figure below; they can also be found via the link above:

II. Environment preparation

For reproducibility, all of our tests were performed on two Azure virtual machines running Ubuntu 20.10, with the following configurations:

Database server: Standard E16as_v4 (AMD EPYC 7452 32-core processor @ 2345.608 MHz, 16 vCPUs, 128 GB RAM, 1,024 GB SSD at 5,000 IOPS). Database client: Standard F8s_v2 (Intel(R) Xeon(R) Platinum 8272CL @ 2.60 GHz, 8 vCPUs, 16 GB RAM).

It is worth noting that while the server CPU above reports 32 cores, the cloud instance exposes only 16 vCPUs.

III. Specific test methods and steps:

We can reproduce the test results as follows:

1. Overall planning:

The database server requires the InfluxDB and TDengine server packages to be installed. The client machine requires the TDengine client (also version 2.0.18) and the Go language environment, as well as the performance test scripts downloaded from GitHub.

2. Installation preparations:

1) How to install TDengine (including the client):

A. Download the TDengine installation package

B. Procedure for installing TDengine

2) How to install InfluxDB:

InfluxDB download and installation steps

3) How to install Go 1.16:

wget https://studygolang.com/dl/golang/go1.16.linux-amd64.tar.gz

tar -C /usr/local -xzf go1.16.linux-amd64.tar.gz

Add the environment variable to /etc/profile: export PATH=$PATH:/usr/local/go/bin

source /etc/profile

After the TDengine, InfluxDB, and Go environments are deployed, ensure that the database connection between the two servers is normal. (Test the write and query functions of each database by creating and deleting a database; delete the test database immediately afterwards and fix any problems right away.)

In addition, the following points should be noted during testing:

1) InfluxDB's default is fsync without delay. You need to set the TDengine parameters walLevel=2 and fsync=0 to achieve an equivalent configuration. All subsequent tests were performed under this setting.

2) On the TDengine client, set maxSQLLength to 1048576.
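These settings can be applied by editing taos.cfg; the sketch below assumes the default package install path and service name, so adjust them if your installation differs:

```shell
# Append the test settings to taos.cfg (default location assumed).
# On the database server:
printf 'walLevel 2\nfsync 0\n' >> /etc/taos/taos.cfg
# On the client machine:
printf 'maxSQLLength 1048576\n' >> /etc/taos/taos.cfg
# Restart the service on the server for the changes to take effect:
systemctl restart taosd
```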

3. Fetch the code from GitHub:

su - root
mkdir /comparisons
cd /comparisons
git clone https://github.com/taosdata/timeseriesdatabase-comparisons

4. Preparation for compilation:

1) cd /comparisons/timeseriesdatabase-comparisons, then delete the go.mod and go.sum files.

2) go mod init github.com/taosdata/timeseriesdatabase-comparisons

3) Run the following commands to install the dependency packages:

go get github.com/golang/protobuf/proto
go get github.com/google/flatbuffers/go
go get github.com/pelletier/go-toml
go get github.com/pkg/profile
go get github.com/valyala/fasthttp

When you finally see the new go.sum and go.mod files, you can continue.

5. Compile phase:

mkdir /comparisons/timeseriesdatabase-comparisons/build/tsdbcompare/bin

We need three compiled programs: bulk_data_gen, bulk_load_influx, and bulk_load_tdengine. Enter the corresponding directory for each, compile it, and copy the binary into place, as follows:

cd /comparisons/timeseriesdatabase-comparisons/cmd/bulk_data_gen; go build; cp bulk_data_gen ../../build/tsdbcompare/bin
cd ../bulk_load_influx; go build; cp bulk_load_influx ../../build/tsdbcompare/bin
cd ../bulk_load_tdengine; go build; cp bulk_load_tdengine ../../build/tsdbcompare/bin

(Note: remember to install the TDengine client before compiling bulk_load_tdengine.)

6. Modify the script:

Modify /comparisons/timeseriesdatabase-comparisons/build/tsdbcompare/write_to_server.sh: change the line add='TDVS' to the hostname of the database server you chose.

Then make the following four replacements so that the script points at the actual database file directories (by default, TDengine uses /var/lib/taos and InfluxDB uses /var/lib/influxdb):

rm -rf /mnt/lib/taos/* -> rm -rf /var/lib/taos/*
rm -rf /mnt/lib/influxdb/* -> rm -rf /var/lib/influxdb/*
TDDISK=`ssh root@$add "du -sh /mnt/lib/taos/vnode | cut -d ' ' -f 1"` -> TDDISK=`ssh root@$add "du -sh /var/lib/taos/vnode | cut -d ' ' -f 1"`
IFDISK=`ssh root@$add "du -sh /mnt/lib/influxdb/data | cut -d ' ' -f 1"` -> IFDISK=`ssh root@$add "du -sh /var/lib/influxdb/data | cut -d ' ' -f 1"`
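Since all four edits replace /mnt/lib with /var/lib, a single sed substitution covers them. The snippet below demonstrates the pattern on sample lines; running the same expression with sed -i on write_to_server.sh applies it in place:

```shell
# Demonstrate the /mnt/lib -> /var/lib substitution on sample input;
# `sed -i 's|/mnt/lib/|/var/lib/|g' write_to_server.sh` edits the real script.
fixed=$(sed 's|/mnt/lib/|/var/lib/|g' <<'EOF'
rm -rf /mnt/lib/taos/*
rm -rf /mnt/lib/influxdb/*
EOF
)
echo "$fixed"
```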

Finally, make sure the command that drops the benchmark database reads: curl "http://$add:8086/query?q=drop%20database%20benchmark_db" -X POST

7. Run the script to reproduce the test results:

cd /comparisons/timeseriesdatabase-comparisons/build/tsdbcompare/

./loop_scale_to_server.sh

Note: this script encapsulates the data generation and writing process; interested readers can read it themselves. If a write fails due to interference, you can manually pass in parameters and run the script again to get the test results, for example: write_to_server.sh -b 5000 -w 100 -g 0 -s 100. (The meaning of each parameter can be seen by running /comparisons/timeseriesdatabase-comparisons/build/tsdbcompare/write_to_server.sh -h.)
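The scaling runs described below can be scripted as a simple loop. The flag values here are illustrative: -b and -g mirror the example above, -w is taken to be the worker count (set to 16 to match the report), and -s is taken to be the device scale; confirm the meanings with write_to_server.sh -h before relying on them.

```shell
# Build the invocation for each device scale and print the list for
# review; pipe the output to `sh` to actually run the tests.
cmds=$(for scale in 100 200 400 600 800 1000; do
  echo "./write_to_server.sh -b 5000 -w 16 -g 0 -s $scale"
done)
echo "$cmds"
```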

IV. Actual measurement data

After several rounds of testing, we compiled the results into the following table. It clearly shows that whether single-threaded or multi-threaded, with small batches or large batches, TDengine maintains a steady speed advantage of about 2x.

In the case of a 5,000-record batch size and 16 workers (the configuration InfluxDB used in its comparison with Graphite), InfluxDB takes 35.04 seconds, while TDengine takes only 17.92 seconds.
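A quick division confirms the roughly 2x figure for this configuration:

```shell
# Ratio of InfluxDB's elapsed time to TDengine's (35.04 s / 17.92 s).
awk 'BEGIN { printf "%.2f\n", 35.04 / 17.92 }'   # prints 1.96
```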

Moreover, InfluxDB tested only 100 devices and 900 monitoring points. In our view, the number of devices and monitoring points in real application scenarios is far larger than that. Therefore, we adjusted the script parameters, increasing the device count from 100 to 200, 400, 600, 800, and 1,000, scaling the data volume on both sides by the same proportion, and obtained write comparison results with more connected devices.

(The data table is attached below.) As the results show, after the number of devices was scaled up, TDengine maintained a steady lead and continued to widen its advantage.

Conclusion

The current test results strongly support the two conclusions in the preface: 1. Under the optimal conditions published by InfluxDB, the write speed of TDengine is twice that of InfluxDB. 2. When the number of devices is scaled up to 1,000, the write speed of TDengine is 5.2 times that of InfluxDB.

Since 5.2x is the largest performance gap between the two sides in this test, we used that test condition (batch size 5,000, 16 workers) to draw two line charts with the number of devices on the horizontal axis, as it is highly representative. (Figure 1 shows the number of seconds each side takes to write the same amount of data; Figure 2 shows the number of rows each side writes per second.)

These two figures fully illustrate one point: the more devices and the more data, the greater TDengine's advantage. As the saying goes, "the more, the better."

It should be noted that the interface types compared in this test are not the same: TDengine uses the CGO interface while InfluxDB uses REST. This may cause small performance fluctuations but will not fundamentally change the results. We will release tests of other interfaces and scenarios in the future.

If you are interested in more details, you can use the test code above to reproduce the results yourself; your valuable suggestions are most welcome.

Finally, we attach the full record of the test data: