Time series databases

A time series database (TSDB) stores time-stamped event data and must support the basic needs of such data: fast writes, persistence, and multi-dimensional aggregate queries.

Time series data

Time series data is a sequence of data points indexed by time. Connecting these points along a time axis lets us build multi-dimensional reports over the past to reveal trends, regularities, and anomalies; looking forward, time series data underpins big data analytics, machine learning, prediction, and alerting.

Write characteristics

  • Smooth, continuous, high-concurrency, high-throughput writes: time series writes are relatively steady, unlike application data, whose volume is usually proportional to traffic and has peaks and troughs. Time series data is generated at a fixed frequency, unconstrained by other factors, so the generation rate is relatively stable.
  • Far more writes than reads: 95%–99% of operations on time series data are writes. This follows from the nature of the data: with monitoring data, for example, you may collect a great many metrics but rarely read most of them, usually caring only about a few key indicators or specific scenarios.
  • Real-time, append-only writes: each write carries the most recently generated data, because data generation advances with time. There are no updates: as time moves on, every point is new data, and old data is not modified (though manual corrections cannot be ruled out).
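The append-only write pattern above can be sketched in a few lines. This is an illustrative in-memory store (not any real TSDB API): each write carries its own timestamp and there is deliberately no update path.

```python
import time

# Hypothetical in-memory store illustrating the append-only write pattern:
# every write is a brand-new timestamped point; old points are never updated.
class AppendOnlyStore:
    def __init__(self):
        self.points = []  # arrival order matches time order

    def write(self, metric, value, ts=None):
        # there is no update method by design; writes only append
        self.points.append((ts if ts is not None else time.time(), metric, value))

store = AppendOnlyStore()
for i in range(3):
    store.write("cpu.load", 0.5 + i * 0.01, ts=1646909654 + i)
```

After the loop, `store.points` holds three points with strictly increasing timestamps, matching the "every data is new data" property.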

Query characteristics

  • Reads are mostly by time range
  • Recent data is read with high probability
  • Historical data is usually queried at coarse granularity
  • Multi-precision queries
  • Multidimensional analysis
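The coarse-grained historical query mentioned above is usually implemented as downsampling: raw high-frequency points are aggregated into larger time buckets. A minimal sketch, with illustrative names (no real TSDB is involved):

```python
# Downsample raw 1 s points into coarse buckets, the kind of rollup used
# when querying long historical ranges at lower precision.
def downsample(points, bucket_seconds):
    """points: list of (unix_ts, value); returns {bucket_start: mean value}."""
    buckets = {}
    for ts, value in points:
        start = ts - ts % bucket_seconds  # align timestamp to bucket boundary
        buckets.setdefault(start, []).append(value)
    return {start: sum(vals) / len(vals) for start, vals in buckets.items()}

raw = [(t, float(t % 60)) for t in range(0, 120)]  # two minutes of 1 s data
rollup = downsample(raw, 60)
# → two 60 s buckets, each holding the mean of the values 0..59
```

A real engine would precompute these rollups at write time (see multi-precision storage below), but the aggregation logic is the same.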

Storage characteristics

  • Large data volume: take monitoring data as an example. If the collection interval is 1 s, one monitoring item generates 86,400 points per day; 10,000 items generate 864 million points per day. In IoT scenarios the numbers are even larger, and the total scale reaches terabytes or even petabytes.
  • Clear hot/cold separation: time series data has very typical hot/cold characteristics; the older the data, the lower the probability it will be queried or analyzed.
  • Time-sensitive: time series data usually has a retention period; data beyond it can be considered expired and reclaimed. Older data carries less value, and cleaning up low-value data saves storage cost.
  • Multi-precision storage: as noted under the query characteristics, multi-precision queries call for storing data at multiple precisions (rollups), trading storage cost against query efficiency.
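The data-volume arithmetic in the list above checks out; a quick back-of-the-envelope calculation (the 16 bytes/point figure is an assumption, before compression):

```python
# Verify the point counts claimed above and estimate raw daily volume.
SECONDS_PER_DAY = 24 * 60 * 60
points_per_item = SECONDS_PER_DAY // 1           # 1 s sampling interval
points_total = points_per_item * 10_000          # 10,000 monitoring items

bytes_per_point = 16                             # assumed size, uncompressed
gb_per_day = points_total * bytes_per_point / 1024**3  # ≈ 13 GB/day raw
```

At that rate a 90-day retention window already approaches the terabyte scale, which is why retention policies and rollups matter.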

InfluxDB

What is InfluxDB

  • InfluxDB is an open source time series, events, and metrics database written in the Go language with no external dependencies
  • InfluxDB ranks first in DB-Engines' time series DBMS category

InfluxDB Related concepts

  • Database: InfluxDB lets you create databases. A database can contain multiple users and retention policies;
  • Measurement: the equivalent of a table in a relational database; because InfluxDB is schemaless, measurements can be created flexibly at any time;
  • Tags: key-value pairs; tags are used to build the index;
  • Fields: key-value pairs that store the actual data; fields are not indexed;
  • Point: represents a single record, analogous to a row in a relational database;
  • Timestamp: since InfluxDB is a time series database, the time cannot be missing; every record must carry a timestamp;
  • Series: the combination of Measurement + Tags identifies a series
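To tie these concepts together, the following sketch assembles a point in InfluxDB's line protocol format: measurement, comma-separated tags (indexed), fields (the actual data, not indexed), and a timestamp. The helper function is illustrative, not part of any InfluxDB client library.

```python
# Build an InfluxDB line-protocol string from the concepts above.
def to_line_protocol(measurement, tags, fields, timestamp):
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"  # strings are quoted
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_str} {field_str} {timestamp}"

line = to_line_protocol(
    "cpu", {"host": "serverA", "region": "us_west"}, {"value": 0.64},
    1646909654112201899,
)
# → cpu,host=serverA,region=us_west value=0.64 1646909654112201899
```

Note that measurement + tags (`cpu,host=serverA,region=us_west`) is exactly what identifies the series; the fields and timestamp vary per point.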

The advantages of InfluxDB

InfluxDB provides many special-purpose functions, such as standard deviation, random sampling, and rate of change, which make statistics and real-time analysis convenient. In addition, it has the following features:

  • Built-in HTTP interface, easy to use
  • The data can be tagged so that the query can be flexible
  • An SQL-like query statement
  • Installation management is simple, and reading and writing data is efficient
  • Queries are real-time: data is indexed as it is written and can be queried immediately

InfluxDB version

InfluxDB 2.0 is currently available, but because of its major changes it coexists with the 1.x line, which remains the officially recommended stable version. The major changes in 2.0 include the following:

  • Integrates the TICK component for one-click installation
  • Secure integration, all requests need to go through token
  • Integrated management page to support more powerful statistics and analysis functions
  • New query language Flux is supported to provide more powerful query and processing functions
  • Added IoT and edge computing capabilities to summarize and analyze time series data at Ingestion Point
  • A new storage engine, InfluxDB IOx, written in Rust

Upgrade from InfluxDB 1.x to 2.x with Docker | InfluxDB OSS 2.1 documentation (influxdata.com)

Install InfluxDB 2.0

Install InfluxDB | InfluxDB OSS 2.1 documentation (influxdata.com)

Create a new directory to store your data, and then navigate to it

mkdir -p /mydata/influxdb/influxdb-docker-data-volume && cd $_

Generate the default configuration file on the host file system

docker run --rm influxdb:2.1.1 influxd print-config > config.yml

Start the InfluxDB container

docker run -p 8086:8086 --name influxdb2 \
  -v $PWD/config.yml:/etc/influxdb2/config.yml \
  -v $PWD:/var/lib/influxdb2 \
  -d influxdb:2.1.1

Open port 8086

firewall-cmd --zone=public --add-port=8086/tcp --permanent  

Reload the firewall for the configuration to take effect

firewall-cmd --reload  

Set the InfluxDB

The initial setup process for InfluxDB walks you through creating a default organization, user, bucket, and operator API token.

After setup succeeds, log in at http://127.0.0.1:8086/

On first login, enter the account and password created during setup.

API tokens can be viewed in the web interface.

Enter the container

docker exec -it influxdb2 /bin/bash

To avoid passing the InfluxDB API token with every command, create a CLI configuration that stores your credentials.

On the terminal, run the following command:

influx config create -n default \
  -u http://localhost:8086 \
  -o my-org \
  -t wjk4yyPaabbq7cG9hU3Ak-61i8hqOuuFtUWdtJYex9h55BgPjOLgPsANQjYlmHj6GVHx_RafAZlU4O4UnPvvCQ== \
  -a
influx config ls

Using the token

export INFLUX_TOKEN=wjk4yyPaabbq7cG9hU3Ak-61i8hqOuuFtUWdtJYex9h55BgPjOLgPsANQjYlmHj6GVHx_RafAZlU4O4UnPvvCQ==
influx write -t  $INFLUX_TOKEN  -b my-bucket -o my-org "measurement field=1"

Other commands

Create user johndoe

influx user create -n johndoe -o my-org

Change johndoe's password

influx user password -n johndoe

Insert data

influx write   -b  my-bucket   -o  my-org   -p s   'myMeasurement,host=myHost testField="testData" 888'

Check the data

influx query -o my-org 'from(bucket:"my-bucket") |> range(start:-1d)'

Install InfluxDB 1.8

Version 1.8 can be connected with DBeaver for easy visualization

docker run -d -p 8086:8086 --name influx18 \
  -v influxdb:/var/lib/influxdb \
  influxdb:1.8

Enter the container and open the influx shell

[root@iZbp1bunl8t8qf63wqsy0iZ ~]# docker exec -it influx18 /bin/bash
root@40ce58faab6f:/# influx
Connected to http://localhost:8086 version 1.8.10
InfluxDB shell version: 1.8.10
> CREATE DATABASE mydb
> SHOW DATABASES
name: databases
name
----
_internal
mydb
> USE mydb
Using database mydb
> INSERT cpu,host=serverA,region=us_west value=0.64
> SELECT "host","region","value" FROM "cpu"
name: cpu
time                host    region  value
----                ----    ------  -----
1646909654112201899 serverA us_west 0.64
> INSERT temperature,machine=unit42,type=assembly external=25,internal=37
> SELECT * FROM "temperature"
name: temperature
time                external internal machine type
----                -------- -------- ------- ----
1646909703830575693 25       37       unit42  assembly
> 