background

Our project involves the Internet related business, since the beginning of young ignorance, sensor data using MySQL for storage, after two years of accumulated data, at present a few core form table data is over million, although by index optimization, SQL optimization and a separation of read and write, barely meet the basic query, to be able to give the data in the second level; But the amount of data continues to increase, when facing the user’s multidimensional statistical needs, in the implementation, efficiency is always so unsatisfactory.

Heaving a sigh, can we only go to separate database and table, or transfer historical data, distinguish between cold and hot temperature data these two irreversible paths?

By chance, I learned about time series databases such as InfluxDB and TDengine. The differences between them are not compared here, but I just experience TDengine briefly.

Before using a new technology, it’s important to know why we need it and what pain points it can solve. I strongly recommend visiting the official website for the white paper and these four articles:

  • The white paper

www.taosdata.com/downloads/T…

  • Massive time-series big data of Internet of Things
  1. www.taosdata.com/blog/2019/0…
  2. www.taosdata.com/blog/2019/0…
  3. www.taosdata.com/blog/2019/0…
  4. www.taosdata.com/blog/2019/0…

TDengine is designed for big data in temporal space such as Internet of Things and Internet of vehicles, and its core function is temporal database. However, in order to reduce the complexity of r&d, operation and maintenance of big data platform and further reduce computing resources, TDengine also provides message queue, message subscription, cache, streaming computing and other functions required by big data processing.

This is not a tutorial, but a record of the experience of this excellent Chinese open source timing database.

System environment

  • The server

In fact, there is no server, there is no cloud host, it is a virtual machine installed locally.

Memory: 4G Processor: 2 x 2 Hard disk: 100 GBCopy the code
[root@hadoop1 local]# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)

[root@hadoop1 local]# uname -aLinux hadoop1 3.10.0-1127. El7 x86_64#1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

[root@hadoop1 local]# taos -VVersion: 2.1.2.0Copy the code
  • The development environment

A company ThinkPad T580 with only 8GB of memory. What do you think you can do with that?

Device Name HEARTSUIT Processor Intel(R) Core(TM) I5-8250 U CPU @ 1.60GHz 1.80 GHz With RAM 8.00 GB (7.83 GB available) System type 64-bit OPERATING system Windows 10 Home Chinese Version 20H2 Installation Date Astrologer 2021/ astrologer 7/ astrologer 9 OS Internal Version 19042.1110 Experience the Windows Feature Experience Pack 120.2212.3530.0Copy the code

download

www.taosdata.com/cn/getting-…

Once the download is complete and uploaded to the server, I like to put these packages in the /usr/local directory.

The installation

# installation
[root@hadoop1 local]# RPM - the ivh TDengine - server - 2.1.2.0 - Linux - x64. RPMIn preparation...# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # [100%]Upgrading/installing... 1: tdengine 2.1.2.0-3# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # [100%]
Start to install TDengine...
Created symlink from /etc/systemd/system/multi-user.target.wants/taosd.service to /etc/systemd/system/taosd.service.

System hostname is: hadoop1

Enter FQDN:port (like h1.taosdata.com:6030) of an existing TDengine cluster node to join OR leave it blank to build one:

Enter your email address for priority support or enter empty to skip: 

To configure TDengine : edit /etc/taos/taos.cfg
To start TDengine     : sudo systemctl start taosd
To access TDengine    : taos -h hadoop1 to login into TDengine server

TDengine is installed successfully!

# check status
[root@hadoop1 local]# systemctl status taosdLow taosd. Service - TDengine server service the Loaded: the Loaded (/ etc/systemd/system/taosd. Service; enabled; vendor preset: disabled) Active: inactive (dead)Copy the code

run

# start
[root@hadoop1 local]# systemctl start taosd

# Check status again
[root@hadoop1 local]# systemctl status taosdLow taosd. Service - TDengine server service the Loaded: the Loaded (/ etc/systemd/system/taosd. Service; enabled; Vendor PRESET: Disabled) Active: Active (running) since 2 2021-06-15 12:59:24 CST; 1s ago Process: 5354 ExecStartPre=/usr/local/taos/bin/startPre.sh (code=exited, status=0/SUCCESS) Main PID: 5360 (taosd) CGroup: / / system. Slice taosd. Service └ ─ 5360 / usr/bin/taosd June 15 12:59:24 hadoop1 systemd [1] : Starting TDengine server service... 6月 15 12:59:24 Hadoop1 systemD [1]: Started TDengine Server service. 6月 15 12:59:24 Hadoop1 TDengine:[5360]: Starting TDengine service... Hadoop1 TDengine:[5360]: Started TDengine service successfully.Copy the code

Note: TDengine uses FQDN to access, so to configure hostname, command: vi /etc/hosts

  192.168.169.129 hadoop1
Copy the code

An introduction to experience

TDengine for Linux comes with a client after installation. Enter taos on the local command line to enter the interactive interface.

[root@hadoop1 local]# taos Welcome to the TDengine shell from Linux, Client Version:2.1.2.0 Copyright (C) 2020 by TAOS Data, Inc. All rights reserved. taos> show databases; name | created_time | ntables | vgroups | replica | quorum | days | keep0,keep1,keep(D) | cache(MB) | blocks | minrows |  maxrows | wallevel | fsync | comp | cachelast | precision | update | status | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =log                            | 2021-06-15 13:09:38.773 |           4 |           1 |       1 |      1 |     10 | 30,30,30                 |           1 |           3 |         100 |        4096 |        1 |        3000 |    2 |         0 | us        |      0 | ready      |
Query OK, 1 row(s) in set(0.001817s) taOS > create database db; Query OK, 0 of 0 row(s)inDatabase (0.002028s) taOS > use db; Database changed. taos> create table t (ts timestamp, speed int); Query OK, 0 of 0 row(s)inDatabase (0.180100s) taOS > show tables; table_name | created_time | columns | stable_name | uid | tid | vgId | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = t | 16:28:09. 2021-07-20 184 | 2 | | 37436171923321672 | | 133 | 1 Query OK, 1 row (s)in set(0.004538 s) taos > go t; Field | Type | Length | Note | ================================================================================= ts | TIMESTAMP | 8 | | speed | INT | 4 | | Query OK, 2 row(s)in set(0.000165s) taOS > insert into t values ('2019-07-15 00:00:00', 10);
Query OK, 1 of 1 row(s) inDatabase (0.002749s) taOS > insert into t values (now, 100); Query OK, 1 of 1 row(s)inDatabase (0.000982s) taOS > select count(*) from t; count(*) | ======================== 2 | Query OK, 1 row(s)in set(0.000129s) taOS > select * from t; Ts | speed | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2019-07-15 00:00:00. 000 | | 2021-07-20 16:28:21 10. 813 | 100 | Query OK, 2 row(s)in set(0.002184s) taos> select * from db. Ts | speed | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2019-07-15 00:00:00. 000 | | 2021-07-20 16:28:21 10. 813 | 100 | Query OK, 2 row(s)in set (0.003076s)

taos> select * from t limit1; Ts | speed | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2019-07-15 00:00:00. 000 | | 10 Query OK, 1 row (s)in set(0.001250s) taos> select * from t order by ts desc; Ts | speed | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2021-07-20 16:28:21. 813 | 100 | 2019-07-15 00:00:00. | | 10 000 Query OK, 2 row(s)in set (0.002486s)

taos> drop table t;
Query OK, 0 of 0 row(s) inDatabase (0.115054s) taOS > show tables; Query OK, 0 row(s)in set(0.002069s) taOS > drop database db; Query OK, 0 of 0 row(s)inThe database (0.010900 s) taos >exit
[root@hadoop1 local]# 
Copy the code

Note:

  1. If the client of another host is used for accessTDengine, the host name needs to be specified:taos -h ip/hostname
  2. We noticed the passagetaosWhen connecting, do not specify username and password, this is too. In fact,TDengineIf no authentication information is specified, the default authentication information is root and taosData

Refer to the official documentation for the common command line arguments:

-c, --config-dir: specifies the configuration file directory, default is /etc/taos -h, --host: specifies the FQDN of the service, default is local service -s, --commands: Run the TDengine command -u without entering the terminal. --user: indicates the user name for connecting to the TDengine server. By default, the user name is root -p. , --help: Prints all command line argumentsCopy the code

After this experience, TDengine at first glance looks like MySQL, the relational database we used before. Wait, this is just an SQL-like operation of TDengine. Let’s look at some of the differences from traditional SQL. At the same time, experience TDengine write and query efficiency.

Advanced experience

TDengine comes with a program called Taosdemo, which will automatically create a super table under the database test. The super table has 10,000 tables named “T0” to “T9999”, and each table has 10,000 records. Each record has five fields (TS, COL0, COL1, COL2, COL3), timestamp from “2017-07-14 10:40:00 000” to “2017-07-14 10:40:09 999”, each watchband has labels T0 and T1, T0 is set to 0 to 9999, t1 is set to “Beijing” or “Shanghai”, and a total of 100 million records are inserted.

  • Build tables and write data
Taosdmeo is executed directly from the Linux command line
[root@hadoop1 local]# taosdemo
Copy the code

The output is as follows (and written to a file: /root/output.txt) :

Host: 127.0.0.1:6030 user: root configDir: resultFile:./output. TXT Thread num of insert data: 10 thread num of create table: 10 number of records per req: 30000 max sql length: 1048576 database count: 1 database[0]: database[0] name: test drop: yes replica: 1 precision: ms super table count: 1 super table[0]: stbName: meters autoCreateTable: no childTblExists: no childTblCount: 10000 childTblPrefix: t dataSource: rand iface: taosc insertRows: 10000 interlace rows: 0 interlaceRows: 0 disorderRange: 1000 disorderRatio: 0 maxSqlLen: 1048576 timeStampStep: 1 startTimestamp: 2017-07-14 10:04:00.000 sampleFormat: sampleFile: tagsFile: columnCount: 4 column[0]:INT column[1]:INT column[2]:INT column[3]:INT tagCount: 2 tag[0]:INT tag[1]:BINARY(16) support 3.2100 seconds to create 10000 tables with 10 thread(s) support 59.48 seconds to insert rows: 100000000, affected rows: 100000000 with 10 thread(s) into test.meters. 1681350.46 Records/Second Insert delay, AVG: 47.24ms, Max: 244ms, min: 7msCopy the code

You can see that we created a database test, a super table of meters, 10000 measuring points (tables), each table writes 10000 data.

Ten threads were started, and it took about 60s to build the database, build the table, and insert the data, with a total record of 100000000 and 1681350.46 data records written per second.

From the writing effect, let a person both shock and excitement ~

  • validation
taos> use test;
Database changed.

taos> describe meters;
             Field              |         Type         |   Length    |   Note   |
=================================================================================
 ts                             | TIMESTAMP            |           8 |          |
 col0                           | INT                  |           4 |          |
 col1                           | INT                  |           4 |          |
 col2                           | INT                  |           4 |          |
 col3                           | INT                  |           4 |          |
 t0                             | INT                  |           4 | TAG      |
 t1                             | BINARY               |          16 | TAG      |
Query OK, 7 row(s) in set(0.000135 s) taos > show tables; table_name | created_time | columns | stable_name | uid | tid | vgId | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = t5114 | 14:32:05. 2021-06-15 078 | | 5 meters 1407376679002899 | | 107 | | t6915 | The 2021-06-15 14:32:07. 348 | | 5 meters | 1125944788123835 | 2675 | | 4 t5999 | 2021-06-15 14:32:06. 711 | | 5 meters | 1407403557282797 | 1709 | | t1474 | 14:32:06. 2021-06-15 830 | | 5 meters | 1688873836557746 | | 1429 | 6... Query OK, 10000 row(s)in set (1.042923s)

taos> select count(tbname) from meters;
     count(tbname)     |
========================
                 10000 |
Query OK, 1 row(s) in set (0.006129s)

taos> select count(*) from meters;
       count(*)        |
========================
             100000000 |
Query OK, 1 row(s) in setTaos > select count(*) from t0; count(*) | ======================== 10000 | Query OK, 1 row(s)in set(0.002336s) taOS > select count(*) from test.t9999; count(*) | ======================== 10000 | Query OK, 1 row(s)in set(0.002720s) taOS > select distinct T1 from meters; t1 | =================== beijing | shanghai | Query OK, 2 row(s)in set (0.109290s)

taos> select count(t0) from meters;
       count(t0)       |
========================
                 10000 |
Query OK, 1 row(s) in setTaos > select *, t0, t1 from t9999limit10; ts | col0 | col1 | col2 | col3 | t0 | t1 | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = The 2017-07-14 10:40:00. 000 | 3413 | 2118 | 19236 | 34372 | 9999 | Beijing | 2017-07-14 10:40:00. | | 58510 | 1896 39459 001 137 | | 9999 | Beijing | 10:40:00. 2017-07-14 002 46560 | | 16696 | 57671 | 52867 | 9999 | Beijing | 2017-07-14 10:40:00. 003 | 24451 | 41162 | 19182 | 17327 | 9999 | Beijing | 10:40:00. 2017-07-14 004 2421 | | 10378 | | 18471 | 24160 9999 | Beijing | 10:40:00. 2017-07-14 005 11240 | | 39221 | 16868 | 53291 | 9999 | Beijing | 10:40:00. 2017-07-14 006 | 49797 | 61202 | 47328 | 17810 | 9999 | Beijing | 10:40:00. 2017-07-14 61545 007 | 41747 | | 10914 | | 21212 | 9999 Beijing | 2017-07-14 10:40:00. 008 | 53202 | 18229 | 57033 | 7533 | 9999 | Beijing | 10:40:00. 2017-07-14 009 55927 | | 64343 | 46090 | 30498 | 9999 | beijing | Query OK, 10 row(s)in set(0.003072 s)Copy the code
  • The query
Query the average value, maximum value, minimum value of 100 million records. The first query is time-consumingtaos> select max(col0), avg(col1), max(col2), min(col3) from test.meters; max(col0) | avg(col1) | max(col2) | min(col3) | ====================================================================== 65534 | | 65534 | | 0 32746.001712370 Query OK, 1 row (s)in set(16.479990 s)Query 100 million records for average, maximum, minimum, etc., second query, very fasttaos> select max(col0), avg(col1), max(col2), min(col3) from test.meters; max(col0) | avg(col1) | max(col2) | min(col3) | ====================================================================== 65534 | | 65534 | | 0 32746.001712370 Query OK, 1 row (s)in set(0.063418 s)Select * from t1 where t1=" Beijing"
taos> select count(*) from test.meters where t1="beijing";
       count(*)        |
========================
              50000000 |
Query OK, 1 row(s) in set(0.016695 s)Select * from t1 where t1=" Shanghai"
taos> select count(*) from test.meters where t1="shanghai";
       count(*)        |
========================
              50000000 |
Query OK, 1 row(s) in set(0.017321 s)Select * from t1 where t1=" Shanghai "and" Beijing
taos> select count(*) from test.meters where t1="beijing" or t1="shanghai";
       count(*)        |
========================
             100000000 |

Select * from t0 where t0=100
taos> select count(*) from test.meters where t0=100;
       count(*)        |
========================
                 10000 |
Query OK, 1 row(s) in set(0.002740 s)Select top 10 of 100 million records
taos> select * from test.meters limit10; ts | col0 | col1 | col2 | col3 | t0 | t1 | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = The 2017-07-14 10:40:00. 000 | 20738 | 17079 | 28835 | | 20955 | 0 Shanghai | 2017-07-14 10:40:00. 001 | 9521 | | 16912 | 9092 18897 | 0 | Shanghai | 2017-07-14 10:40:00. 002 | 63838 | 42129 | 52379 | | 50840 | 0 Shanghai | 10:40:00. 2017-07-14 003 | 36661 | 52292 | 5025 | | 15506 | 0 Shanghai | 10:40:00. 2017-07-14 004 988 | 47682 | | 56909 | | 12187 | 0 Shanghai | The 2017-07-14 10:40:00. 005 | 44303 | 50954 | 56846 | | 11266 | 0 Shanghai | 10:40:00. 2017-07-14 006 39137 | | 32445 | 16212 | 40621 | 0 | Shanghai | 10:40:00. 2017-07-14 007 54210 | 50140 | | 54717 | | 56829 | 0 Shanghai | 2017-07-14 10:40:00. 008 | 53177 | 40921 | 15058 | | 10413 | 0 Shanghai | 10:40:00. 2017-07-14 009 39718 | | 14091 | | 47696 | 51869 0 | shanghai | Query OK, 10 row(s)in set(0.033425 s)Select * from t1 where t1=' Beijing '
taos> select last_row(col3) from meters where t1='beijing';
 last_row(col3) |
=================
          30630 |
Query OK, 1 row(s) in set(0.037200 s)Aggregate query: query col3 average value, grouped by T1taos> select avg(col3) from meters group by t1; Avg (col3) | t1 | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 32744.678155980 | Beijing | | 32752.362831000 Shanghai | Query OK, 2 row(s)in set(0.049459 s)Table T1110 is aggregated by 10s for average, maximum and minimum valuestaos> select avg(col1), max(col2), min(col3) from test.t1110 interval(10s); ts | avg(col1) | max(col2) | min(col3) | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2017-07-14 10:40:00. 000 | 32770.574900000 | 65527 | 3 | Query OK, 1 row (s)in set(0.002239 s)T1 =" Shanghai"
taos> select count(*) from test.meters where t1="shanghai" and ts > 'the 000 10:40:00 2017-07-14' and ts < 'the 000 10:40:01 2017-07-14';
       count(*)        |
========================
               4995000 |
Query OK, 1 row(s) in set(0.562990 s)T1 =" Shanghai ", col1 = 1, col1 = 1, col1 = 1
taos> select avg(col1) from test.meters where t1="shanghai" and ts > 'the 000 10:40:00 2017-07-14' and ts < 'the 000 10:40:05 2017-07-14'interval(1s); Ts | avg (col1) | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2017-07-14 10:40:00. | | 32755.165719520 000 The 2017-07-14 10:40:01. 000 | | 2017-07-14 10:40:02 32737.603441200. 000 | | 2017-07-14 10:40:03 32739.861599200. 000 | 32747.347343800 | 2017-07-14 10:40:04. 000 | | 32747.446260600 Query OK, 5 row (s)in set(0.382299 s)T1 =" Shanghai "; col1 = 1s; t1=" Shanghai "; Actual requirement: Assume that the last pressure value reported by the device within a minute is the pressure value of this minute. If the device does not report the pressure value of this minute, take the pressure value of the last minute
taos> select last(col1) from test.meters where t1="shanghai" and ts > 'the 000 10:40:00 2017-07-14' and ts < 'the 000 10:40:05 2017-07-14'interval(1s) fill(value, 0); Ts | last (col1) | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 2017-07-14 10:40:00. 000 | 24615 | 2017-07-14 10:40:01. 000 | 44670 | 2017-07-14 10:40:02. 000 | 38200 | 2017-07-14 10:40:03. 000 | 51491 | 2017-07-14 10:40:04. 000 | 58636 | Query OK, 5 row(s)in set(0.507627 s)Copy the code

From this quick experience, we can see that except for the first query of the average, maximum and minimum values of 100 million records which took more than 10s, the other queries (by label filtering, syntable query, by time range, aggregate query, etc.) all gave results in milliseconds

Reference

  • www.taosdata.com/cn/getting-…

If you have any questions or any bugs are found, please feel free to contact me.

Your comments and suggestions are welcome!

This article has participated in the activity of “New person creation Ceremony”, and started the road of digging gold creation together.