Getting Started with InfluxDB, Part 1

What is a time series database

Time series data

Time series data is a sequence of data points produced over time. Each point carries a time coordinate, and laying the points out along the time axis reveals trends and patterns. Time series data can also be analyzed and used to train machine learning models, which makes forecasting and early warning possible.

Time series database

Once you understand time series data, a time series database is easy to understand: it is a database for storing time series data. Because time series data arrives in real time and in large volumes, a time series database must support fast writes. To make querying convenient, it must also offer persistence and multi-dimensional aggregate queries.
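To make "multi-dimensional aggregate query" a little more concrete, here is a minimal Python sketch that groups a few points by one dimension and averages them. The sample points and the mean_by_tag helper are made up for illustration; a real time series database does this work on the server side.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample points: (timestamp in seconds, dimensions, value)
points = [
    (0, {"host": "a", "region": "us_west"}, 0.5),
    (10, {"host": "b", "region": "us_west"}, 1.0),
    (20, {"host": "a", "region": "us_east"}, 0.25),
]

def mean_by_tag(points, tag_key):
    """Group points by one dimension and average the values in each group."""
    groups = defaultdict(list)
    for _ts, tags, value in points:
        groups[tags[tag_key]].append(value)
    return {key: mean(values) for key, values in groups.items()}

print(mean_by_tag(points, "region"))
print(mean_by_tag(points, "host"))
```

The same data can be sliced by region or by host without changing how it is stored, which is what the extra dimensions buy you.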

Core concepts

`Data Point`

`measurement`

`Timestamp`

`Field`

`Tag`

Installing InfluxDB on macOS is relatively easy.

Installation

On macOS, it can be installed with brew:

  brew install influxdb

After a successful installation, we can register InfluxDB to start at login:

  ln -sfv /usr/local/opt/influxdb/*.plist ~/Library/LaunchAgents

For more information about launchd, see the article on macOS startup configuration.

Starting InfluxDB

Use launchctl to load the launch agent registered above:

  launchctl load ~/Library/LaunchAgents/homebrew.mxcl.influxdb.plist

For more information about launchctl, see the article on launchctl, the scheduled task tool on macOS.

You can also start InfluxDB directly and point it at the installed configuration file:

  influxd -config /usr/local/etc/influxdb.conf

Once started, InfluxDB listens on two ports:

8086 is used to provide client/server interaction (via the HTTP API)
8088 is used to provide backup and restore RPC services

With InfluxDB running, we can interact with the database from the terminal using the influx command (it connects to port 8086 by default).
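Because port 8086 speaks HTTP, other clients can reach the same service without the CLI. The Python sketch below only builds the URLs for the /ping and /query endpoints of the InfluxDB 1.x HTTP API; the helper names, and the assumption that the server runs on localhost, are for illustration only.

```python
from urllib.parse import urlencode

def base_url(host="localhost", port=8086):
    """Base address of the HTTP API (assumes a local, unencrypted server)."""
    return f"http://{host}:{port}"

def ping_url(host="localhost", port=8086):
    """URL of the health-check endpoint."""
    return f"{base_url(host, port)}/ping"

def query_url(q, db=None, host="localhost", port=8086):
    """URL that runs an InfluxQL statement, optionally against one database."""
    params = {"q": q}
    if db is not None:
        params["db"] = db
    return f"{base_url(host, port)}/query?{urlencode(params)}"

print(ping_url())
print(query_url("SHOW DATABASES"))
```

Opening either URL with curl or a browser while influxd is running is a quick way to confirm that the server is listening.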

Trying it out

So far we have not built a deeper understanding of InfluxDB (what a measurement is, what a point is), but even so we can play around with it and see what it does.

Connecting to InfluxDB OSS

  influx -precision rfc3339

The -precision rfc3339 option tells the CLI to display timestamps in RFC3339 format (YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ).

We can now interact with the InfluxDB data store by entering queries that follow InfluxQL syntax.
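To see how a raw nanosecond timestamp relates to the RFC3339 form, here is a small Python sketch that converts one into the other. The ns_to_rfc3339 helper is hypothetical, and trimming the trailing zeros of the fractional part is an assumption about how the value is displayed.

```python
from datetime import datetime, timezone

def ns_to_rfc3339(ns):
    """Format nanoseconds since the Unix epoch as an RFC3339 UTC string."""
    seconds, nanos = divmod(ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    if nanos:
        # Assumption: keep only the significant fractional digits.
        fraction = f"{nanos:09d}".rstrip("0")
        return f"{base}.{fraction}Z"
    return f"{base}Z"

print(ns_to_rfc3339(1434055562000000000))  # 2015-06-11T20:46:02Z
```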

Creating a database

  CREATE DATABASE logdb

Here we create a database named logdb. Note that in the CLI a successful operation produces no feedback, while a failure prints an error message.

Viewing databases

  SHOW DATABASES

The output lists logdb, the database we just created, and _internal, the database that InfluxDB uses internally.

Selecting a database

Every statement that InfluxDB executes needs a database as its context. So either specify the database in each statement, or select a database first and let subsequent statements default to it.

  USE logdb

Earlier we said that a successful operation produces no feedback, yet USE does print a confirmation. Perhaps the USE command is a special case.

Concepts in practice

As the concepts above show, InfluxDB organizes and stores data as time series. So what does a time series look like?

A time series is really a concept: it is the collection of data stored in InfluxDB, and it can hold anywhere from zero to an unlimited number of data items. A single data item in InfluxDB is called a point.

Point

A point represents one set of measured values for a measurement at a given moment, so point = measurement + time.

Measurement

A measurement consists of at least one field key-value pair and zero or more tags.

A field holds a concrete measured value.
A tag holds metadata about the measurement and serves as a filter condition at query time.

So the syntax of a point is as follows:

  <measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]

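For example, cpu,host=serviceA,region=us_west value=0.64 is a point in the cpu measurement with two tags (host and region) and one field (value).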
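The point syntax can also be sketched as a small Python function that assembles such a line from its parts. This is a simplified illustration, not a full implementation: the to_line_protocol name is made up, and the sketch assumes numeric field values and keys that need no escaping.

```python
def to_line_protocol(measurement, fields, tags=None, timestamp_ns=None):
    """Build one line: measurement[,tags] fields [timestamp].

    Simplifying assumptions: field values are plain numbers, and no key
    or value contains spaces, commas, or equals signs.
    """
    if not fields:
        raise ValueError("a point needs at least one field")
    head = measurement
    for key, value in sorted((tags or {}).items()):
        head += f",{key}={value}"  # no space after the comma
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    line = f"{head} {field_part}"  # a single space separates the sections
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line

print(to_line_protocol("cpu", {"value": 0.64}, {"host": "serviceA", "region": "us_west"}))
```

Spaces appear only between the three sections, which is why a stray space elsewhere changes how the line is parsed.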

Practicing INSERT and SELECT

The next step is to put this into practice.

Insert data

  INSERT cpu,host=serviceA,region=us_west value=0.64

Note that spaces have special meaning in the line protocol used by INSERT, so a comma must not be followed by a space. In my first attempt there was a space after cpu, so the insert failed.

As the statement above shows, we did not supply a timestamp. The database automatically assigns the local timestamp of the system it is running on.
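If you would rather control the timestamp yourself, the client can append one in nanoseconds. A minimal Python sketch follows; the with_timestamp helper is hypothetical, and it assumes the line it receives does not already end in a timestamp.

```python
import time

def with_timestamp(line, timestamp_ns=None):
    """Append a nanosecond Unix timestamp to a line that lacks one.

    When no timestamp is given, the client's current time is used, which
    mirrors what the server does on its side when the field is omitted.
    """
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    return f"{line} {timestamp_ns}"

print(with_timestamp("cpu,host=serviceA,region=us_west value=0.64"))
```

Supplying the timestamp on the client is useful when points are buffered before being written, since the server would otherwise stamp them with the arrival time.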

Query data

  SELECT "host","region","value" FROM cpu