The premise

The previous article explained how to set up a ClickHouse development environment in Windows10, and then you need to learn the data definition of this database in detail, including data types, DDL, and DML. ClickHouse is a full-fledged DBMS that provides DDL and DML functionality similar to MySQL (with some syntax differences) and implements most of the standard SQL specification. Systematic learning of ClickHouse’s data definitions will help developers understand and use ClickHouse better. Outline of this article (right branch) πŸ‘‡πŸ‘‡

This article takes a closer look at all the data types ClickHouse supports in its current release (20.10.3.30).

The data type

ClickHouse data types in general include:

  • Numeric types
  • String type
  • Date time type
  • The compound type
  • Special type

Here is a summary table πŸ‘‡

Categories: type Type the name General concept JavaType note
Numeric types Int8 8bitThe integer TINYINT `Byte Integer`
Numeric types Int16 16bitThe integer SMALLINT `Short Integer`
Numeric types Int32 32bitThe integer INT Integer
Numeric types Int64 64bitThe integer BIGINT Long
Numeric types Int128 128bitThe integer ` –
Numeric types Int256 256bitThe integer
Numeric types UInt8 unsigned8bitThe integer TINYINT UNSIGNED JavaThere is no unsigned integer type in the
Numeric types UInt16 unsigned16bitThe integer SMALLINT UNSIGNED JavaThere is no unsigned integer type in the
Numeric types UInt32 unsigned32bitThe integer INT UNSIGNED JavaThere is no unsigned integer type in the
Numeric types UInt64 unsigned64bitThe integer BIGINT UNSIGNED JavaThere is no unsigned integer type in the
Numeric types Float32 32bitSingle-precision floating point number FLOAT Float
Numeric types Float64 64bitA double – precision floating – point number DOUBLE Double
Numeric types Decimal(P,S) High precision value,PIs the total bit length,SIs the length of decimal places DECIMAL BigDecimal
Numeric types Decimal32(S) High precision value,PThe total bit length belongs to[1, 9].SIs the length of decimal places DECIMAL BigDecimal Decimal(P,S)Specialized type
Numeric types Decimal64(S) High precision value,PThe total bit length belongs to[10] 16.SIs the length of decimal places DECIMAL BigDecimal Decimal(P,S)Specialized type
Numeric types Decimal128(S) High precision value,PThe total bit length belongs to[19, 20].SIs the length of decimal places DECIMAL BigDecimal Decimal(P,S)Specialized type
String type String A string of arbitrary length Broadly similarLONGTEXT String In place of traditionDBMSIn theVARCHAR,BLOB,CLOB,TEXTSuch as the type
String type FixedString(N) A fixed-length character string is usednullByte padding the trailing character A bit likeVARCHAR String
String type UUID Special string,32Bit length in the following format:The 8-4-4-8-4-4 String Typically generated using built-in functions
Date time type Date The date of DATE LocalDate
Date time type DateTime Date/time similarDATE_TIME `LocalDateTime OffsetDateTime`
Date time type DateTime64 Date/time similarDATE_TIME `LocalDateTime OffsetDateTime`
The compound type Array(T) An array of similarT[]
The compound type Tuple(S,T... R) tuples
The compound type Enum The enumeration
The compound type Nested nested
Special type Nullable NULLModifier types, not stand-alone data types
Special type Domain The domain name storageIPV4andIPV6Format domain name

ClickHouse types are case sensitive and generally humped. For example, DateTime cannot be written as DateTime or DATE_TIME, and UUID cannot be written as UUID

Here is a more detailed analysis of each type.

Numeric types

Numerical types include integer values, floating point values, high precision values and special Booleans.

The integer

An integer value is an integer of fixed length (bit number) and can be expressed either with or without sign. Look at the range of integer values πŸ‘‡πŸ‘‡

Signed integer values:

type Number of bytes (byte) The scope of
Int8 1 [- 128, 127]
Int16 2 [- 32768, 32767]
Int32 4 [- 2147483648, 2147483647]
Int64 8 [- 9223372036854775808, 9223372036854775807]
Int128 16 [- 170141183460469231731687303715884105728, 170141183460469231731687303715884105727]
Int256 32 [- 57896044618658097711785492504343953926634992332820282019728792003956564819968578604618580771785925434953266-3 4992332820282019728792003956564819967]

The range of integers that Int128 and Int256 can represent is very large, and the size of the bytes they occupy increases accordingly, so they are rarely used.

Unsigned integer values:

type Number of bytes (byte) The scope of
UInt8 1 [0, 255]
UInt16 2 [0, 65535]
UInt32 4 [0, 4294967295]
UInt64 8 [0, 18446744073709551615]
UInt256 32 [0, 115792089237316195423570985008687907853269984665640564039457584007913129639935]

It is worth noting that the UInt128 type is not supported, so the UInt128 does not exist. The range of integers that can be represented by the UInt256 is very large and the size of bytes consumed increases accordingly, so it is rarely used.

MySQL defines an incrementable primary key of type BIGINT UNSIGNED, indexed in ClickHouse to UInt64. Do a little test on the ClickHouse command line client:

SELECT \
toInt8(127) AS a,toTypeName(a) AS aType, \
toInt16(32767) AS b,toTypeName(b) AS bType, \
toInt32(2147483647) AS c,toTypeName(c) AS cType, \
toInt64(9223372036854775807) AS d,toTypeName(d) AS dType, \
toInt128(170141183460469231731687303715884105727) AS e,toTypeName(e) AS eType, \
toInt256(57896044618658097711785492504343953926634992332820282019728792003956564819967) AS f,toTypeName(f) AS fType, \
toUInt8(255) AS g,toTypeName(g) AS gType, \
toUInt16(65535) AS h,toTypeName(h) AS hType, \
toUInt32(4294967295) AS i,toTypeName(i) AS iType, \
toUInt64(18446744073709551615) AS j,toTypeName(j) AS jType, \
toUInt256(115792089237316195423570985008687907853269984665640564039457584007913129639935) AS k,toTypeName(k) AS kType; 
Copy the code

Output result:

SELECT toInt8(127) AS a, toTypeName(a) AS aType, toInt16(32767) AS b, toTypeName(b) AS bType, toInt32(2147483647) AS c, toTypeName(c) AS cType, toInt64(9223372036854775807) AS d, toTypeName(d) AS dType, ToInt128 (1.7014118346046923e38) AS e, toTypeName(e) AS eType, toInt256(5.78960446186581E76) AS f, toTypeName(f) AS fType, toUInt8(255) AS g, toTypeName(g) AS gType, toUInt16(65535) AS h, toTypeName(h) AS hType, toUInt32(4294967295) AS i, toTypeName(i) AS iType, toUInt64(18446744073709551615) AS j, toTypeName(j) AS jType, ToUInt256 e77 (1.157920892373162) AS k, toTypeName(k) AS kType β”Œ ─ ─ ─ ─ a ┬ ─ aType ─ ┬ ─ ─ ─ ─ ─ ─ b ┬ ─ bType ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ c ┬ ─ cType ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ d ┬ ─ dType ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ e ┬ ─ eType ─ ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ f ┬ ─ fType ─ ─ ┬ ─ ─ ─ ─ g ┬ ─ gType ─ ┬ ─ ─ ─ ─ ─ ─ h ┬ ─ hType ─ ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ I ┬ ─ iType ─ ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ j ─ ┬ ─ jType ─ ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ k ┬ ─ kType ─ ─ ─ ┐ β”‚ β”‚ 127 Int8 β”‚ 2147483647 32767 β”‚ Int16 β”‚ β”‚ Int32 β”‚ β”‚ 9223372036854775807 Int64 β”‚ β”‚ - 170141183460469231731687303715884105728 Int128 β”‚ -9223372036854775808 β”‚ Int256 β”‚ 255 β”‚ UInt8 β”‚ 6294967295 β”‚ UInt32 18446744073709551615 β”‚, UInt64 β”‚ 115792089237316195423570985008687907853269984665640564039448360635876274864128 β”‚ UInt256 β”‚ β”” ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. Elapsed: 0.009 SEC.Copy the code

Embarrassingly, the shell execution result above is a bit long and distorted.

Floating point Numbers

Floating-point numbers include single-precision floating-point number Float32 and double-precision floating-point number Float64πŸ‘‡πŸ‘‡

type The size is in bytes Effective accuracy (excluding leftmost zero decimal places) note
Float32 4 7 After the decimal point minus the zero to the left8Bit up causes data overflow
Float64 8 16 After the decimal point minus the zero to the left17Bit up causes data overflow

Here’s a quiz:

SELECT toTypeName(a) from toTypeName(a); f5abc88ff7e4 :) SELECT toTypeName(a) from toTypeName(a) The SELECT toFloat32 (' 0.1234567890 ') AS a, β”Œ toTypeName (a) ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ a ┬ ─ toTypeName (toFloat32 (' 0.1234567890 ')) ─ ┐ β”‚ β”‚ 0.12345679 Float32 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.005 sec.f5abc88ff7e4 :) SELECT toTypeName(a) from toTypeName; The SELECT toFloat32 (' 0.0123456789 ') AS a, β”Œ toTypeName (a) ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ a ┬ ─ toTypeName (toFloat32 (' 0.0123456789 ')) ─ ┐ β”‚ β”‚ 0.012345679 Float32 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.036 sec.f5abc88ff7e4 :) SELECT toTypeName(a,toTypeName(a); The SELECT toFloat64 (' 0.12345678901234567890 ') AS a, β”Œ toTypeName (a) ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ a ┬ ─ toTypeName (toFloat64 (' 0.12345678901234567890 ')) ─ ┐ β”‚ β”‚ 0.12345678901234568 Float64 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.005 sec.f5abc88ff7e4 :) SELECT toTypeName(a,toTypeName(a); The SELECT toFloat64 (' 0.01234567890123456789 ') AS a, β”Œ toTypeName (a) ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ a ┬ ─ toTypeName (toFloat64 (' 0.01234567890123456789 ')) ─ ┐ β”‚ β”‚ 0.012345678901234568 Float64 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.005 SEC.Copy the code

In particular, ClickHouse supports the following special classes of floating point numbers compared to standard SQL:

  • InfMinus means plus infinity
  • -InfMinus infinity
  • NaN– Indicates not a number

Verify:

F5abc88ff7e4:) SELECT divide (0.5 0); SELECT 0.5/0 β”Œ ─ divide (0.5, 0) ─ ┐ β”‚ β”‚ inf files β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.007 sec.f5abc88ff7e4 :) SELECT divide(-0.5,0); SELECT - 0.5/0 β”Œ ─ divide (0.5, 0) ─ ┐ β”‚ β”‚ - inf files β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 sec.f5abc88ff7e4 :) SELECT divide(0.0,0.0); SELECT 0. / 0. β”Œ ─ divide (0. 0.) ─ ┐ β”‚ nan β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.005 SEC.Copy the code

High precision value

High precision numeric type Decimal, also known as a fixed-point number, can specify the total number of digits and the fixed number of Decimal points to indicate the exact number within a range. The native representation of Decimal is Decimal(P,S), and the meaning of the two parameters is:

  • P: indicates the precision, which determines the total number of digits (that is, how many digits of the whole number plus the decimal part). The value range is,76 [1]
  • S: Representative scale (scale), determine the number of decimal places. The value range is[0,P]

Simple representations derived from Decimal(P,S) are Decimal32(S), Decimal64(S), Decimal128(S), and Decimal256(S). See the table below:

type The range of values of P The range of values of S Numerical range
Decimal(P,S) ,76 [1] [0,P] (-1*10^(P - S), 1*10^(P - S))
Decimal32(S) [1, 9] [0,P] (-1*10^(9 - S), 1*10^(9 - S))
Decimal64(S) [10] 16 [0,P] (-1*10^(18 - S), 1*10^(18 - S))
Decimal128(S) [19, 20] [0,P] (-1*10^(38 - S), 1*10^(38 - S))
Decimal256(S) [39,76] [0,P] (-1*10^(76 - S), 1*10^(76 - S))

If derived types are difficult to understand, use Decimal(P,S) directly. It is defined in the following format:

column_name Decimal(P,S)

#Such asThe amount a Decimal (1, 2)Copy the code

For the four operations, the (built-in function) operation is performed using two Decimal values of different precision, and the rule for the resulting Decimal is as follows (assuming S1 is the lvalue Decimal, S2 is the rvalue Decimal, and S is the result Decimal) :

  • For addition and subtraction,S = max(S1,S2)
  • For multiplication,S = S1 + S2
  • For division,S = S1(Resulting in the same decimal place as the dividend)
SELECT toDecimal32(2,4) AS x, toDecimal32(2,2) AS y,x+y; SELECT toDecimal32(2, 4) AS x, toDecimal32(2, 2) AS y, x + y chrysene ──x─┬── y─┬─plus(toDecimal32(2, 4), toDecimal32(2, 2) 2)) ─ ┐ β”‚ β”‚ β”‚ β”‚ 4.0000 2.00 2.0000 β”” ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: SELECT toDecimal32(2,4) AS x, toDecimal32(2,5) AS y,y/x SELECT toDecimal32(2,4) AS x, ToDecimal32 (2, 5) AS y, y/x β”Œ ─ ─ ─ ─ ─ ─ ─ x ┬ ─ ─ ─ ─ ─ ─ ─ ─ y ┬ ─ divide (toDecimal32 (2, 5), toDecimal32 (2, 4) ─ ┐ β”‚ β”‚ β”‚ β”‚ 1.00000 2.00000 2.0000 β”” ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. Elapsed: 0.004 sec.f5abc88ff7e4 :) SELECT toDecimal32(2,4) AS x, toDecimal32(2,4) AS y,y*x ToDecimal32 (2, 4) AS y, y * x β”Œ ─ ─ ─ ─ ─ ─ ─ x ┬ ─ ─ ─ ─ ─ ─ ─ y ┬ ─ multiply (toDecimal32 (2, 4), toDecimal32 (2, 4) ─ ┐ β”‚ β”‚ β”‚ β”‚ 4.00000000 2.0000 2.0000 β”” ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in Set. Elapsed: 0.004 SEC.Copy the code

Important note: if engaged in the financial field and other numerical storage to pursue accurate accuracy, floating point numbers should not be used, but should consider using integer or fixed-point numbers, rounding as far as possible by the program rules, after all, the database is a tool to store data, should not undertake too much processing data calculation functions.

Boolean value

There is no Boolean value type in ClickHouse and the UInt8 type is officially recommended, with a value of 0 or 1 indicating false or true.

String type

String types include:

  • Indefinite length (dynamic length) stringString
  • Fixed length stringFixedString(N)Here,NIs the largestThe number of bytesInstead of length, for exampleUTF-8Character takes3Bytes,GBKCharacter takes2bytes
  • Special stringUUID(Stores numbers, but as strings)

There is no concept of encoding in ClickHouse, and a string can contain an arbitrary set of bytes that are stored and printed as-is. This encoding and decoding operation is presumably completely handed over to the client. In general, it is recommended to use UTF-8 encoding to store text-type content so that data can be read and written without conversion.

String

The String type does not limit the length of the String, and can directly replace other DBMS String types such as VARCHAR, BLOB, CLOB, etc. Compared with VARCHAR, it is obviously very convenient to consider the maximum length of the predicted data. Using the Java language development, directly using the String type to undertake. A String column is defined as follows:

column_name String
Copy the code

FixedString

Data columns of type FixedString are defined as follows:

column_name FixedString(N)
Copy the code

A FixedString represents a string of fixed length N, where N stands for N bytes, not N characters or code points. Some typical scenarios using FixedString:

  • Binary representation storageIPAddress, as usedFixedString(16)storageIPV6address
  • The binary representation of a hash value, such asFixedString(16)storageMD5The binary value of,FixedString(32)storageSHA256Binary value of

When writing FixedString data:

  • If the number of data bytes is greater thanN, one is returnedToo large value for FixedString(N)The abnormal
  • If the number of data bytes is less thanN, is usednullBytes fill in the rest

If a column of type FixedString needs to be matched in the QUERY condition WHERE, the input query parameter must add \0 to the end. Otherwise, the query condition may become invalid. In other words, it is more recommended to write data and query conditions with fixed number of bytes.

The built-in length() function returns N directly, while the built-in empty() function returns 1 if all null bytes are present and 0 otherwise.

UUID

Uuid. UUID#randomUUID() is a static method that generates UUID directly in Java. Because of its uniqueness, it can sometimes be chosen to generate UUID as the primary key type of the database. ClickHouse directly defines a type UUID, strictly speaking, this type is not a string, but because of its location on the document order under the string type, date/time types, form looks like a string, and it only supports string type of built-in functions, so the author also put it classified as type string. The UUID in ClickHouse is actually a 16-byte number in the following common format:

The 8-4-4-8-4-4
#Example #
61f0c404-5cb3-11e7-907b-a6006ad3dba0

## zero00000000 - the 0000-0000-0000-000000000000Copy the code

The UUID type column definition format is as follows:

column_name UUID
Copy the code

The built-in function generateUUIDv4() can be used to generateUUID data directly.

f5abc88ff7e4 :) CREATE TABLE test_u(id UInt64,u UUID) ENGINE = Memory; CREATE TABLE test_u ( `id` UInt64, `u` UUID ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.018 sec.f5abc88ff7e4 :) INSERT INTO test_u VALUES (1,generateUUIDv4()); INSERT INTO test_u VALUES Ok. 1 rows in set.Elapsed: 0.005 sec.f5abc88ff7e4 :) SELECT * FROM test_u; SELECT * FROM test_u β”Œ ─ ─ id ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ u ┐ β”‚ β”‚ 1 fc379d2c 45 a3-8589-1-0753 - ef95ee0d8c9 β”‚ β”” ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 SEC.Copy the code

Date time type

Date and time types include Date (year, month, day), DateTime (year, month, day, minute, second), and DateTime64 (year, month, day, minute, second).

Date

Date represents year, month and day, but this type is used in ClickHouse to store the number of days from Unix era (1970-01-01) as an unsigned integer with 2 bytes (2 byte -> 16 bits). Time zones are not supported. The maximum year that can be represented is 2105. Based on this feature, data of type Date can be inserted in yyyY-MM-DD format or as an unsigned integer. See the test below:

f5abc88ff7e4 :) CREATE TABLE test_dt(date Date) ENGINE = Memory; CREATE TABLE test_dt ( `date` Date ) ENGINE = Memory Ok. 0 rows in set. Elapsed: INSERT INTO DT VALUES(1),(2),('0000-00-00'),('2020-11-11'); INSERT INTO DT VALUES Received exception from server (version 20.10.3): Code: 60. DB:: exception: Received from clickhouse-server:9000. DB::Exception: Table default.dt doesn't exist.. 0 rows in set.Elapsed: 0 SEC. F5abc88ff7e4 :) INSERT INTO test_dt VALUES(1),('0000-00-00'),('2020-11-11'); INSERT INTO test_dt VALUES Ok. 4 rows in set.elapsed: 0.025 sec.f5abc88ff7e4 :) SELECT * FROM test_dt; SELECT * FROM test_dt β”Œ ─ ─ ─ ─ ─ ─ ─ the date ─ ┐ β”‚ the 1970-01-02 β”‚ β”‚ the 1970-01-03 β”‚ β”‚ the 1970-01-01 β”‚ β”‚ β”‚ 2020-11-11 β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 4 rows Elapsed: 0.005 SECCopy the code

The 0 or ‘0000-00-00’ in the Date type represents 1970-01-01

DateTime

DateTime is the usual concept of year, month, day, hour, minute, second. Time zones are supported, but milliseconds are not supported, which is accurate to the second. It is defined in the format:

column_name DateTime[(time_zone)]
Copy the code

The range that can be expressed is [1970-01-01 00:00:00, 2105-12-31 23:59:59]. A few things to note when using DateTime:

  • DateTimeThe time point is actually saved toUnixThe timestamp (which I’ve explored here should be in seconds) is independent of time zone or daylight saving time
  • DateTimeThe time zone is not stored in the column data or result set, but in the column metadata
  • Creating table definitionsDateTimeIf no time zone is specified, the default time zone set on the server or operating system is used
  • Creating table definitionsDateTimeIf time zone is not specified for columns of type,ClickHouseThe client will use itClickHouseThe time zone of the server can also be specified--use_client_time_zoneThe specified
  • You can configure valuesdate_time_input_formatordate_time_output_formatSpecified separatelyDateTimeInput and output formats for type data
  • DateTimeWhen type data is inserted, integers are treated asUnixTimestamp, and will be usedUTCAs the time zone (zero time zone), the string is treated as using the date and time of the time zone (depending on the service or system) and converted to the corresponding time zoneUnixTimestamp for storage

You can test it:

f5abc88ff7e4 :) CREATE TABLE test_dt(t DateTime,tz DateTime('Asia/Shanghai')) ENGINE = Memory; CREATE TABLE test_dt ( `t` DateTime, `tz` DateTime('Asia/Shanghai') ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.029 sec.f5abc88ff7e4 :) INSERT INTO test_dt VALUES(1605194721,'2020-11-01 00:00:00'); INSERT INTO test_dt VALUES Ok. 1 rows in set.Elapsed: 0.006 sec. f5abc88ff7e4 :) SELECT * FROM test_dt; SELECT * FROM test_dt β”Œ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ t ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ tz ─ ┐ β”‚ the 2020-11-12 15:25:21 β”‚ 2020-11-01 00:00:00 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.005 sec.f5abc88ff7e4 :) SELECT toDateTime(t, 'Asia/Shanghai') AS sh_time,toDateTime(tz, 'Europe/London') AS lon_time FROM test_dt; SELECT toDateTime(t, 'Asia/Shanghai') AS sh_time, toDateTime(tz, 'Europe/London) AS lon_time FROM test_dt β”Œ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ sh_time ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ lon_time ─ ┐ β”‚ the 2020-11-12 23:25:21 β”‚ The 2020-10-31 16:00:00 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 SEC.Copy the code

DateTime64

DateTime64 is the same type as DateTime, but can be subsecond, the accuracy is 10 ^ (-n) (10 ^ n) seconds, such as 0.1 seconds, 0.01 seconds, etc. It is defined in the format:

column_name DateTime64(precision [, time_zone])
Copy the code

Test it out:

f5abc88ff7e4 :) SELECT toDateTime64(now(), 5, 'Asia/Shanghai') AS column, toTypeName(column) AS x; SELECT toDateTime64(now(), 5, 'Asia/Shanghai') AS column, ToTypeName (column) AS x β”Œ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ the column ─ ┬ ─ x ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ β”‚ 23:45:56. 2020-11-12 00000 β”‚ DateTime64 (5, 'Asia/Shanghai) β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: F5abc88ff7e4 :) CREATE TABLE test_dt64(t DateTime64(2),tz DateTime64(3,'Asia/Shanghai')) ENGINE = Memory; CREATE TABLE test_dt64 ( `t` DateTime64(2), `tz` DateTime64(3, 'Asia/Shanghai') ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.017 sec.f5abc88ff7e4 :) INSERT INTO test_dt64 VALUES(1605194721,'2020-11-01 00:00:00'); INSERT INTO test_dt64 VALUES Ok. 1 rows in set.elapsed: 0.005 sec.f5abc88ff7e4 :) SELECT * FROM test_dt64; SELECT * FROM test_dt64 β”Œ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ t ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ tz ─ ┐ β”‚ 18:52:27 1970-07-05. 21 β”‚ 2020-11-01 00:00:00. 000 β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 SEC.Copy the code

The compound type

The compound types mainly include Array(T) and Tuple(T,S…. R), enumeration Enum and Nested Nested, where composition refers to the composition of multiple elements of the same type or multiple types of multiple elements.

Array

Array type T in Array(T) can be any data type (but the element type of the same Array must be unique), similar to the generic Array T[]. It is defined as follows:

column_name Array(T)

## define
major Array(String)

## write
VALUES (['a','b','c']), (['A','B','C'])
Copy the code

Writing test examples:

f5abc88ff7e4 :) CREATE TABLE test_arr(a Array(UInt8),b Array(String)) ENGINE = Memory; CREATE TABLE test_arr ( `a` Array(UInt8), `b` Array(String) ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.017 sec.f5abc88ff7e4 :) INSERT INTO test_arr VALUES([1,2,3],['throwable','doge']); INSERT INTO test_arr VALUES Ok. 1 rows in set.elapsed: 0.005 sec.f5abc88ff7e4 :) SELECT * FROM test_arr; SELECT * FROM test_arr β”Œ ─ a ─ ─ ─ ─ ─ ─ ─ ┬ b ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ β”‚ β”‚ [1, 2, 3] [' throwable ', 'doge'] β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 SEC. F5abc88ff7e4:)Copy the code

Note that:

  • You can usearray()Function or[]Create an array quickly
  • When you quickly create an array,ClickHouseParameter types are automatically defined as the “narrowest” data type that can store all listed parameters, which can be understood asPrinciple of least cost
  • ClickHouseAn exception will be returned if the data type of the array is not determined (common for quickly creating an array with multiple types of elements)SELECT array(1, 'a')Is illegal)
  • If the element in the array existsNULL, the element type will becomeNullable(T)
f5abc88ff7e4 :) SELECT array(1, 2) AS x, toTypeName(x); SELECT [1, 2] AS x, toTypeName(x) chrysene ──┬─toTypeName(array(1, 2)) ─ ┐ β”‚ β”‚ [1, 2] Array (UInt8) β”‚ β”” ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.006 sec.f5abc88ff7e4 :) SELECT [1, 2, NULL] AS x, toTypeName(x); SELECT [1, 2, NULL] AS x, toTypeName (x) β”Œ ─ x ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬ ─ toTypeName ([1, 2, NULL) ─ ┐ β”‚ β”‚ [1, 2, NULL] Array (Nullable (UInt8) β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 sec.f5ABC88FF7E4 :) SELECT array(1, 'a') SELECT [1, 'a'] Received exception from server (version 20.10.3): Code: 386. DB::Exception: Received from clickhouse-server:9000. DB::Exception: There is no supertype for types UInt8, String Because some of them are String/FixedString and some of them are not. 0 rows in set.Elapsed: 0.015 SEC.Copy the code

Tuple

Tuples (Tuple (S, T… Data of type R) consists of 1-N elements, each of which can use a separate (and possibly different) data type. It is defined as follows:

column_name Tuple(S,T... R)
## define
x_col Tuple(UInt64, String, DateTime)

## write
VALUES((1,'throwables','2020-11-14 00:00:00')),((2,'throwables','2020-11-13 00:00:00'))
Copy the code

Note that:

  • Similar to an array typeArrayThe tupleTupleType inference for each element is also based onPrinciple of least cost
  • Specify tuples when creating tablesTupleAfter the type of the element is defined, the type of the element will be checked when the data is written. It must correspond one to one; otherwise, an exception will be thrown (such asx_col Tuple(UInt64, String)Can only write(1,'a')You can’t write('a','b'))
f5abc88ff7e4 :) SELECT tuple(1,'1',NULL) AS x, toTypeName(x); SELECT (1, '1', NULL) AS x, toTypeName (x) β”Œ ─ x ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬ ─ toTypeName (tuple (1, '1', NULL)) ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ β”‚ β”‚ (1, '1', NULL) Tuple (UInt8, String, Nullable (Nothing)) β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 sec.f5abc88ff7e4 :) CREATE TABLE test_tp(id UInt64, a Tuple(UInt64,String)) ENGINE = Memory; CREATE TABLE test_tp ( `id` UInt64, `a` Tuple(UInt64, String) ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.018 sec.f5abc88ff7e4 :) INSERT INTO test_tp VALUES(1,(999,'throwable'),(2,(996,'doge')); INSERT INTO test_tp VALUES Ok. 2 rows in set. Elapsed: 0.003 sec. f5abc88ff7e4 :) INSERT INTO test_tp VALUES(1,('doge','throwable')); INSERT INTO test_tp VALUES Exception on client: Code: 6. DB::Exception: Cannot parse string 'doge' as UInt64: syntax error at begin of string. Note: there are toUInt64OrZero and toUInt64OrNull functions, which returns zero/NULL instead of throwing exception.: while executing 'FUNCTION CAST(_dummy_0, 'Tuple(UInt64, String)') Tuple(UInt64, String) = CAST(_dummy_0, 'Tuple(UInt64, String)')': data for INSERT was parsed from queryCopy the code

Here you can see ClickHouse in dealing with the Tuple type writing data type mismatch is found, the will try to convert the type, which is in accordance with the written data corresponding to the position of the element type and the column definitions in the Tuple corresponding to the position the type of transition (if the same type don’t need to transform), abnormal type conversion will throw an exception. Type of a Tuple (UInt64, String) can actually write (‘ 111 ‘, ‘222’) or (111, ‘222’), but can’t write (‘ a ‘, ‘b’). The conversion process calls built-in functions, which can consume additional performance and time if not unexpected, so it is recommended to ensure that the element type is the same for each location element and column definition when writing data.

Enum

The Enum type is an ingenious compound type in ClickHouse. It defines data in the form of a finite key-value pair, K-V(String:Int), similar to the Java HashMap structure, where neither KEY nor VALUE is allowed to be NULL, but KEY is allowed to be set to an empty String. Enum data query generally returns a set of keys, and writes can be keys or values. It is defined as follows:

column_name Enum('str1' = num1, 'str2' = num2 ...)

#For example,
sex Enum('male' = 1,'female' = 2,'other' = 3)
Copy the code

The VALUE range of Enum is 16 bits, that is, VALUE can only be in the range of [-32768,32767]. It gives rise to two simple types: Enum8 (essentially (String:Int18), representing an 8-bit range ([-128,127]), and Enum16 (essentially (String:Int16), representing a 16-bit range ([-32768,32767]). If the native type Enum is used, Enum8 or Enum16 will be used to store data according to the number of k-V pairs defined. Test it out:

f5abc88ff7e4 :) CREATE TABLE test_e(sex Enum('male' = 1,'female' = 2,'other' = 3)) ENGINE = Memory; CREATE TABLE test_e ( `sex` Enum('male' = 1, 'female' = 2, 'other' = 3) ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.021 sec.f5abc88ff7e4 :) INSERT INTO test_e VALUES(1),(2),('other'); INSERT INTO test_e VALUES Ok. 3 rows in set. Elapsed: SELECT sex,CAST(sex,'Int8') FROM test_e SELECT sex,CAST(sex,'Int8') 'Int8) FROM test_e β”Œ ─ sex ─ ─ ─ ─ ┬ ─ CAST (sex, 'Int8) ─ ┐ β”‚ male 1 β”‚ β”‚ β”‚ getting 2 β”‚ β”‚ β”‚ other 3 β”‚ β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 3 rows in the set. The Elapsed: 0.005 SEC.Copy the code

The Enum IN ClickHouse is essentially String:Int. Specialising on one of these types makes it easy to define a finite set of key-value pairs. Enumeration values are integer values that directly participate IN operations such as ORDER BY, GROUP BY, IN, and DISTINCT. Conventional wisdom suggests that sorting, aggregations, de-repetitions, and so on should provide a good performance boost using integers over strings, so using Enum types in scenarios where finite state sets are used has a natural advantage over using strings to define enumerated sets.

Nested

Nested types are one of the more peculiar types. If you have used GO, the Nested data columns are defined somewhat like GO constructs:

column_name Nested(
    field_name_1 Type1,
    field_name_2 Type2
)

## define
major Nested(
    id UInt64,
    name String
)

## writeVALUES ([1, 2], [' Math ', 'English'])
## query
SELECT major.id,major.name FROM
Copy the code

ClickHouse’s nested type is very different from the conventional nested type in mind. Its essence is a multi-dimensional array structure, which can be interpreted as:

major Nested( id UInt64, The name String) left left left left left left left left left left down down down down down down down down down left left left left left left left left major down down down down down down down down. Id Array (UInt64) major. The name Array (String) left left left left left left left left left left down down down down down down down down down left left left left left left left left down down down down down down down in Java entity class Entity { Long id; List<Major> majors; } class Major { Long id; String name; }Copy the code

Rows of nested types need not have a fixed array length between rows, but the lengths of each array in the nested table must be aligned on the same row, for example:

The line Numbers major.id major.name
1 [1, 2] [‘M’,’N’]
2 [1, 2, 3] [‘M’,’N’,’O’]
3 (Exception) [1, 2, 3, 4] [‘M’,’N’]

Test it out:

f5abc88ff7e4 :) CREATE TABLE test_nt(id UInt64,n Nested(id UInt64,name String)) ENGINE Memory; CREATE TABLE test_nt ( `id` UInt64, `n` Nested( id UInt64, name String) ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.020 SEC. F5abc88ff7e4:) INSERT INTO test_nt VALUES (1, [1, 2, 3], [' a ', 'b', 'c']), (2, [999], [' throwable ']); INSERT INTO test_nt VALUES Ok. 2 rows in set.elapsed: 0.003 sec.f5abc88ff7e4 :) SELECT * FROM test_nt; SELECT * FROM test_nt β”Œ ─ ─ id ┬ ─ n.i d ─ ─ ─ ─ ┬ ─ n.n ame ─ ─ ─ ─ ─ ─ ─ ─ ┐ 1 β”‚ β”‚ β”‚ [1, 2, 3] [' a ', 'b', 'c'] β”‚ β”‚ β”‚ 2 [999] β”‚ β”‚ [' throwable '] β”” ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 2 rows in the set. The Elapsed: 0.005 SEC.Copy the code

The ARRAY JOIN clause can be used to implement tiling of nested subtable data, similar to row to column in MySQL:

f5abc88ff7e4 :) SELECT n.id,n.name FROM test_nt ARRAY JOIN n; SELECT n.id, N. name FROM test_nt ARRAY JOIN chrysene ─ n.I d─┬─n.name──── 1 β”‚ a β”‚ 2 β”‚ b β”‚ 3 β”‚ c β”‚ 999 β”‚ β”” ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜Copy the code

Special type

Special types include Nullable, Domain, and Nothing.

Nullable

Nullable is not a separate type; it is an auxiliary modifier type of another type, used in conjunction with other base types. If you are familiar with java.lang.Optional in Java, Nullable is similar to Optional, indicating that a base data type can be Null. It is defined as follows:

column_name Nullable(TypeName)

#Such asAmount Nullable(Decimal(10,2)) age Nullable(UInt16)Copy the code

A few points to note:

  • NULLisNullableThe default value ofINSERTCan be used whenNULLSpecifies a null value or no value is passed
  • You can’t useNullableModifies compound data types, but elements in compound data types can be usedNullablemodified
  • NullableModified columns cannot be indexed
  • There is a reminder in the official documentation:NullableThere is almost always a negative performance impact and this must be kept in mind when designing a database becauseNullableThe columns ofNULLValues and columns are notNULLValues are stored in two different files, so indexes cannot be added, and queries and writes are involvedNon-single file operations

Test it out:

f5abc88ff7e4 :) CREATE TABLE test_null(id UInt64,name Nullable(String)) ENGINE = Memory; CREATE TABLE test_null ( `id` UInt64, `name` Nullable(String) ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.022 sec. f5abc88ff7e4 :) INSERT INTO test_null VALUES(1,'throwable'),(2,NULL); INSERT INTO test_null VALUES Ok. 2 rows in set.elapsed: 0.004 sec.f5abc88ff7e4 :) SELECT * FROM test_null; SELECT * FROM test_null β”Œ ─ ─ id ┬ ─ name ─ ─ ─ ─ ─ ─ ┐ β”‚ β”‚ 1 throwable β”‚ β”‚ β”‚ 2 NULL β”‚ β”” ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 2 rows in the set. The Elapsed: 0.004 SEC. F5abc88ff7e4:)Copy the code

Domain

The Domain type is also unique to ClickHouse and is a special type that is encapsulated based on other types, including IPv4 (which is essentially based on UInt32 and stored in compact binary form) and IPv6 (which is essentially based on FixedString(16)). They are defined as follows:

column_name IPv4
column_name IPv6
Copy the code

Limitations of Domain types:

  • Can’t passALTER TABLETo change the currentDomainType The type of the column
  • You cannot insert from other columns or tables by string implicit conversionDomainType of column data, for exampleATable hasStringType storedIPColumns in address format cannot be importedBIn the tableDomainType of column
  • DomainType does not limit the value that can be stored, but is checked when writing dataIPv4orIPv6The format of the

In addition, INSERT or SELECT data of Domain type are formatted humanely, so when using INSERT statements, they can be written directly as strings. The query results are displayed as readable “strings” on the client command line. The built-in functions IPv4NumToString() and IPv6NumToString() are used to query results in string format. Implicit conversion is not supported. The CAST() function converts IPv4 to UInt32. Convert IPv6 to FixedString(16)). Test it out:

f5abc88ff7e4 :) CREATE TABLE test_d(id UInt64,ip IPv4) ENGINE = Memory; CREATE TABLE test_d ( `id` UInt64, `ip` IPv4 ) ENGINE = Memory Ok. 0 rows in set. Elapsed: 0.029 sec. f5abc88ff7e4 :) INSERT INTO test_d VALUES(1,'192.168.1.0'); INSERT INTO test_d VALUES Ok. 1 rows in set. Elapsed: 0.003 sec.f5abc88ff7e4 :) SELECT IP,IPv4NumToString(IP) FROM test_d; SELECT ip, IPv4NumToString (IP) FROM test_d β”Œ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ IP ┬ ─ IPv4NumToString (IP) ─ ┐ β”‚ β”‚ β”‚ 192.168.1.0 192.168.1.0 β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.004 SEC.Copy the code

Nothing

Nothing is not an explicit data type. Its sole purpose is to indicate cases where no value is desired, and users cannot create Nothing types. For example, the literal NULL is actually a Nullable(Nothing), and the empty array() (built-in function) is Nothing.

f5abc88ff7e4 :) SELECT toTypeName(array()); SELECT toTypeName ([]) β”Œ ─ toTypeName (array ()) ─ ┐ β”‚ array (Nothing) β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. The Elapsed: 0.006 SEC.Copy the code

Zero values for all types

If you do not define default values for ClickHouse columns after they are defined (this is a bit more complicated and will be covered later in the DDL article), if Nullable is not used, then empty columns will be filled with zero values of the corresponding type when writing data. Each type of zero value is classified as follows:

  • The zero value of a numeric type is a number0
  • The zero value of the string type is an empty string' '.UUIDThe value is zero00000000 - the 0000-0000-0000-000000000000
  • The zero value of a date-time type is the zero value of the time offset it stores
  • EnumTypes are definedVALUEThe smallest value is zero
  • ArrayThe zero value of the type is[]
  • TupleThe zero value of the type is[Zero value of type 1, zero value of type 2......]
  • NestedThe zero value of the type is a multidimensional array and each array is[]
  • In particular, it can be considered thatNullableThe zero value of the modified type isNULL

Using the JDBC Driver

Here is a simulated scenario that uses basically all the types commonly used in ClickHouse. Define an order table:

CREATE TABLE ShoppingOrder (id UInt64 COMMENT 'primary key ', orderId UUID COMMENT' orderId ', amount Decimal(10,2) COMMENT 'amount ', CreateTime DateTime COMMENT 'create DateTime ', customerPhone FixedString(11) COMMENT' create DateTime ', customerPhone FixedString(11) COMMENT 'create DateTime ', CustomerName String COMMENT 'customerName ', orderStatus Enum('init' = 0,'cancel' = -1,'paid' = 1) COMMENT' orderStatus ', GoodsIdList Array(UInt64) COMMENT 'Array ID ', address Nested(province String, city String, street String, HouseNumber UInt64) COMMENT 'address') ENGINE = Memory; CREATE TABLE ShoppingOrder (id UInt64 COMMENT 'primary key ',orderId UUID COMMENT' orderId ',amount Decimal(10,2) COMMENT CreateTime DateTime COMMENT 'createTime ',customerPhone FixedString(11) COMMENT' customerName String COMMENT 'Customer name ', OrderStatus Enum('init' = 0,'cancel' = -1,'paid' = 1) COMMENT 'orderStatus ',goodsIdList Array(UInt64) COMMENT' Array of goods ids ',address Nested(province String, city String, street String, houseNumber UInt64) COMMENT 'address ') ENGINE = Memory;Copy the code

When created, call DESC ShoppingOrder:

f5abc88ff7e4 :) DESC ShoppingOrder; DESCRIBE TABLE ShoppingOrder β”Œ ─ name ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬ ─ type ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬ ─ default_type ─ ┬ ─ default_expression ─ ┬ ─ the comment ─ ─ ─ ─ ─ ─ ┬─codec_expression─┬─ttl_expression─ id β”‚ UInt64 β”‚ β”‚ primary key β”‚ β”‚ orderId β”‚ UUID β”‚ β”‚ orderId β”‚ β”‚ amount β”‚ Decimal(10, 2) β”‚ β”‚ β”‚ amount β”‚ β”‚ createTime β”‚ DateTime β”‚ creation date β”‚ β”‚ customerPhone β”‚ FixedString(11) β”‚ β”‚ phone number β”‚ β”‚ CustomerName β”‚ String β”‚ β”‚ β”‚ β”‚ orderStatus β”‚ Enum8('cancel' = -1, 'init' = 0, 'Paid' = 1) β”‚ β”‚ β”‚ goodsIdList β”‚ Array(UInt64) β”‚ β”‚ address. Province β”‚ Array(String) β”‚ β”‚ β”‚ β”‚ β”‚ address. City β”‚ β”‚ β”‚ Array(String) β”‚ β”‚ address. Street β”‚ Array(String) β”‚ β”‚ β”‚ address Address. HouseNumber β”‚ β”‚ β”‚ Array(UInt64) β”‚ β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 12 rows in the set. The Elapsed: 0.004 SEC.Copy the code

Introducing clickHouse-JDBC dependencies:

<dependency>
    <groupId>ru.yandex.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.2.4</version>
</dependency>
Copy the code

Write test cases:

@RequiredArgsConstructor
@Getter
public enum OrderStatus {

    INIT("init".0),

    CANCEL("cancel", -1),

    PAID("paid".1),;private final String type;
    private final Integer status;

    public static OrderStatus fromType(String type) {
        for (OrderStatus status : OrderStatus.values()) {
            if (Objects.equals(type, status.getType())) {
                returnstatus; }}returnOrderStatus.INIT; }}@Data
public class Address {

    private String province;

    private String city;

    private String street;

    private Long houseNumber;
}

@Data
public class ShoppingOrder {

    private Long id;
    private String orderId;
    private BigDecimal amount;
    private OffsetDateTime createTime;
    private String customerPhone;
    private String customerName;
    private Integer orderStatus;
    private Set<Long> goodsIdList;

    /** ** there is really only one element */
    private List<Address> addressList;
}

 @Test
public void testInsertAndSelectShoppingOrder(a) throws Exception {
    ClickHouseProperties props = new ClickHouseProperties();
    props.setUser("root");
    props.setPassword("root");
    // There is a global default database when no database is created
    ClickHouseDataSource dataSource = new ClickHouseDataSource("jdbc:clickhouse://localhost:8123/default", props);
    ClickHouseConnection connection = dataSource.getConnection();
    PreparedStatement ps = connection.prepareStatement("INSERT INTO ShoppingOrder VALUES(? ,? ,? ,? ,? ,? ,? ,? ,? ,? ,? ,?) ");
    // Consider using the Snowflake algorithm to generate auto-trending primary keys
    long id = System.currentTimeMillis();
    int idx = 1;
    ps.setLong(idx ++, id);
    ps.setString(idx ++, "00000000-0000-0000-0000-000000000000");
    ps.setBigDecimal(idx ++, BigDecimal.valueOf(100L));
    ps.setTimestamp(idx ++, new Timestamp(System.currentTimeMillis()));
    ps.setString(idx ++, "12345678901");
    ps.setString(idx ++, "throwable");
    ps.setString(idx ++, "init");
    ps.setString(idx ++, "[4] 1999123");
    ps.setString(idx ++, [' Guangdong Province ']);
    ps.setString(idx ++, [' Guangzhou city ']");
    ps.setString(idx ++, "[' X street]");
    ps.setString(idx , "[10087].");
    ps.execute();
    ClickHouseStatement statement = connection.createStatement();
    ResultSet rs = statement.executeQuery("SELECT * FROM ShoppingOrder");
    List<ShoppingOrder> orders = Lists.newArrayList();
    while (rs.next()) {
        ShoppingOrder order = new ShoppingOrder();
        order.setId(rs.getLong("id"));
        order.setOrderId(rs.getString("orderId"));
        order.setAmount(rs.getBigDecimal("amount"));
        order.setCreateTime(OffsetDateTime.ofInstant(rs.getTimestamp("createTime").toInstant(), ZoneId.systemDefault()));
        order.setCustomerPhone(rs.getString("customerPhone"));
        order.setCustomerName(rs.getString("customerName"));
        String orderStatus = rs.getString("orderStatus");
        order.setOrderStatus(OrderStatus.fromType(orderStatus).getStatus());
        // Array(UInt64) -> Array<BigInteger>
        Array goodsIdList = rs.getArray("goodsIdList");
        BigInteger[] goodsIdListValue = (BigInteger[]) goodsIdList.getArray();
        Set<Long> goodsIds = Sets.newHashSet();
        for (BigInteger item : goodsIdListValue) {
            goodsIds.add(item.longValue());
        }
        order.setGoodsIdList(goodsIds);
        List<Address> addressList = Lists.newArrayList();
        // Array(String) -> Array<String>
        Array province = rs.getArray("address.province");
        List<String> provinceList = arrayToList(province);
        // Array(String) -> Array<String>
        Array city = rs.getArray("address.city");
        List<String> cityList = arrayToList(city);
        // Array(String) -> Array<String>
        Array street = rs.getArray("address.street");
        List<String> streetList = arrayToList(street);
        // UInt64 -> Array<BigInteger>
        Array houseNumber = rs.getArray("address.houseNumber");
        BigInteger[] houseNumberValue = (BigInteger[]) houseNumber.getArray();
        List<Long> houseNumberList = Lists.newArrayList();
        for (BigInteger item : houseNumberValue) {
            houseNumberList.add(item.longValue());
        }
        int size = provinceList.size();
        for (int i = 0; i < size; i++) {
            Address address = new Address();
            address.setProvince(provinceList.get(i));
            address.setCity(cityList.get(i));
            address.setStreet(streetList.get(i));
            address.setHouseNumber(houseNumberList.get(i));
            addressList.add(address);
        }
        order.setAddressList(addressList);
        orders.add(order);
    }
    System.out.println("Query result :" + JSON.toJSONString(orders));
}

private List<String> arrayToList(Array array) throws Exception {
    String[] v = (String[]) array.getArray();
    return Lists.newArrayList(Arrays.asList(v));
}
Copy the code

Output result:

Query result: [{" expressions such as addressList ": [{" city" : "guangzhou", "houseNumber:" 10087, "province", "guangdong province", "street" : "X street"}], "the amount" : 100.00, "createTime": "2020-11-17T23:53:34+08:00", "customerName": "throwable", "customerPhone": "12345678901", "goodsIdList": [1, 1234, 999], "id": 1605628412414, "orderId": "00000000-0000-0000-0000-000000000000", "orderStatus": 0 }]Copy the code

Client query:

f5abc88ff7e4 :) SELECT * FROM ShoppingOrder; SELECT * FROM ShoppingOrder β”Œ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ id ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ orderId ─ ┬ ─ amount ─ ┬ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ createTime ─ ┬ ─ customerPhone ─ ┬ ─ customerName ─ ┬ ─ o RderStatus ─ ┬ ─ goodsIdList ─ ─ ┬ ─ address. The province ─ ┬ ─ address. The city ─ ┬ ─ address. Street ─ ┬ ─ address. HouseNumber ─ ┐ β”‚ β”‚ 1605628412414 00000000-0000-0000-0000-000000000000-100.00 β”‚ β”‚ 0000-0000-0000 15:53:34 β”‚ β”‚ 12345678901 throwable β”‚ init β”‚ β”‚ [1999123] [' Guangdong '] β”‚ [' Guangzhou '] β”‚ ['X Street '] β”‚ [10087] β”‚ β”” ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”΄ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ β”˜ 1 rows in the set. Elapsed: 0.004 SEC.Copy the code

Practice shows that:

  • ClickHouseDataTypeCan be viewed inClickHouseVarious data types andJavaData types andSQLTypeIs the corresponding relationship between, e.gUInt64 => BigInteger
  • ClickHousetheArrayType can be used when writing data[Element X, element y]The format can also be usedjava.sql.ArrayI’m going to passClickHouseArrayReading data can be done similarly
  • The enumerationEnumIt will directly convert toJavaIn theStringtype

summary

This article has covered the functionality and basic usage examples of ClickHouse’s various data types in great detail, and the DDL section will be examined in the next article. Many of the DDLS in ClickHouse are used in unique ways that differ from the DDLS of traditional relational databases.

(C-7-D E-A-20201118 has been playing Kingdom Guard – Revenge for a long time.)