==HiveQL==

HiveQL is a Hive query language that is similar to other SQL dialects. It is derived from the ANSI SQL standard revision, but has its own unique properties. HiveQL syntax is similar to MySQL, but there are also some differences. Table level DDL operations can be found in Hive. Hive table level DDL operations can be found in Hive. Hive table level DDL operations can be found in Hive.

Library = = = = HiveQL

A Hive library is a directory or namespace that adds a logical group to a table to reduce table name conflicts. Hive has a default library installed with Hive. If you do not specify or switch to a library, the default library will be used. Note :Hive database, table files are stored in big data cluster file systems, such as HDFS, AWS S3, etc., remember this, although Hive supports load local mode, it is better to upload files to HDFS before operation.

==HiveQL database ==

Hive will create a directory for each database, the default directory exists when you install the Hive configuration in the Hive – site. XML file under the Hive. Metastore. Warehouse. The dir attribute of the directory, The default value of this property is HDFS \user\hive\warehouse. Tables in the database are stored as subdirectories in this directory (except for the default library, which does not have its own directory). For example, when we create data dW, if the location of the database is not specified during the construction of the library, Hive will correspond to create a directory/user/Hive/warehouse/dw db, the default database file directory will add suffix. Db, but if you to define the location of the database on the HDFS, HDFS file path himself didn’t write. Db suffix, The system will not automatically add the file name to you. If you want to redefine the storage location of the database in HDFS, you can add the file name to.db.

  1. If the table does not specify a database, the newly created table is in the default library. The default library does not have its own directory, so the table storage configuration is inhive-site.xmlUnder the filehive.metastore.warehouse.dirProperty, the default is\user\hive\warehouse;
  2. The HDFS storage location is not specified when you create a database. The new database is stored in the configuration filehive-site.xmlUnder thehive.metastore.warehouse.dirProperty, the default is\user\hive\warehouseAnd will take.dbSuffix, switch to the library under the table, table does not specify the location, the default directory under this directory in the form of subdirectory storage;
  3. If you create a user-defined HDFS storage location for a database, the database will be built in the HDFS directory that you define instead of the default HDFS storage locationhive-site.xmlUnder the filehive.metastore.warehouse.dirIf the location of the table is not specified, it will be stored as a subdirectory under this directory by default.
  4. If you create a user-defined HDFS storage location for a database, the database will be built in the HDFS directory that you define instead of the default HDFS storage locationhive-site.xmlUnder the filehive.metastore.warehouse.dirTable data will be stored in the directory that has no relationship with the database directory. However, table enumeration commands are used in Hive. Table data is stored in the directory that has no relationship with the database directoryshow tables, the result is displayed in table in the house of the rolls, can be understood as, tables and library hive metadata on the logical structure is the subordinate relations, has nothing to do but the actual physical storage location, this method is very chaotic, inconvenient file level membership management, general won’t write so, unless you want to write for colleagues won’t be able to maintain the code.

HiveQL library statement and MySQL still have a bit like, specific as follows;

-- The lazy lite version
create database dw;

-- 2
create database if not exists dw;

--3. Redefine database locations with attributes and support rerun fault-tolerant versions
create database if not exists dw comment 'Data Warehouse' location '/hive/warehouse' with dbproperties('creater'='rowyet'.'date'='20200520');
Copy the code

Execution effect:

hive> create database if not exists dw comment 'Data Warehouse' location '/hive/warehouse' with dbproperties('creater'='rowyet'.'date'='20200520');
OK
Time taken: 0.227 seconds

Copy the code

[] optional official total syntax, symbols, | a choice, visible, HiveQL inside can use schema keyword instead of the database, but this is just a fancy features, there are no major role, and suggest that we develop a general database keyword.


CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment] -- Database Description
  [LOCATION hdfs_path]  - Specifies the default storage location of the HDFS external table
  [MANAGEDLOCATION hdfs_path] -- Specifies the default path for the internal tables of the library
  [WITH DBPROPERTIES (property_name=property_value, ...) ]-- Set properties
  ;
Copy the code

==HiveQL Lists existing databases ==

HiveQL lists existing databases in the same syntax as MySQL.

1. Use databases. Note that databases have s, which is the plural form of the English word database.
show databases;


Mysql > select * from db where id = 'd'
show databases like 'd.*'

Copy the code

Execution effect:

hive> show databases;
OK
default
dw
liuxw_test
Time taken: 0.136 seconds, Fetched: 3 row(s)

Copy the code

==HiveQL Displays a database attribute ==

Describe describe keyword (also supported by abbreviation DESC);

-- 1. A child with good English
describe database dw;

-- lazy kid
desc database dw;

-- 3. View the extended information, which indicates that the extended keyword is optional
desc database extended dw;
Copy the code

Execution effect:

hive> descdatabase extended dw; OK DW data warehouse HDFS://dw-test-cluster- 007./hive/warehouse       hadoop  USER    {date=20191008, creater=jianggongqing}
Time taken: 0.025 seconds, Fetched: 1 row(s)
Copy the code

== Switch the HiveQL database ==

HiveQL switch data maintains the common SQL keyword, use, as follows;

use dw;
Copy the code

Execution effect:

hive> use dw;
OK
Time taken: 0.024 seconds

Copy the code

==HiveQL Query the current database ==

Open hive window, do half, called to have lunch, come back to have a nap for a while, alas, just statement window is in which DB again? That’s always the question, right? Of course, you can also use the use command repeatedly to switch to one of the libraries you want, check your current database statement as follows;

select current_database();
Copy the code

Execution effect:

hive> select current_database();
OK
dw
Time taken: 0.805 seconds, Fetched: 1 row(s)

Copy the code

Hive V0.8.0 can address this problem with the set attribute. You can use the following statement to indicate which library hive is currently in.

set hive.cli.print.current.db=true;
Copy the code

Hive (dW)> = hive (dW)>

hive> set hive.cli.print.current.db=true;
hive (dw)>

Copy the code

==HiveQL Deletes the database ==

MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL

-- 1. Delete the database
drop database rowyet;

-- 2. Rerun delete fault tolerant versions
drop database if exists rowyet;

Copy the code

Effect display;

hive (rowyet)> drop database rowyet;
OK
Time taken: 0.545 seconds
Copy the code

Mysql > delete table from database; delete table from database;

hive (rowyet)> drop database rowyet;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidOperationException(message:Database rowyet is not empty. One or more tables exist.)

Copy the code

==HiveQL Modify database ==

Modify the database instruction is rarely used, but when necessary, it can also be used to assist in modifying the attributes of the database, which is better than deleting and rebuilding it, the instruction is as follows;

alter database database_name set dbproperties (property_name=property_value, ...) ;- (note:Schema Added in Hive 0.14.0)
alter database database_name set owner [user|role] user_or_role;   - (note:Hive 0.13.0 and later; Schema Added in Hive 0.14.0)
alter database database_name set location hdfs_path; - (note:Hive 2.2.1, 2.4.0 and later)
alter database database_name set managedlocation hdfs_path; - (note:Hive 4.0.0 and later)

Copy the code

Effect display;

hive (rowtext)> alter database rowyet set dbproperties('edited-by'='lauliu');
OK
Time taken: 0.044 seconds

Copy the code