
Working with Hive in practice also involves a lot of shell-related commands. Here are some common examples.

== Start the Hive CLI ==

To log in to the Hive CLI, configure the environment variables for the Hive installation directory on Linux, then run the hive command directly:

[hadoop@dw-test-cluster-007 ~]$ hive
which: no hbase in (/usr/local/tools/anaconda3/bin:/usr/local/tools/anaconda3/condabin:/usr/local/tools/azkaban/azkaban-exec-server/bin:/usr/local/tools/azkaban/azkaban-web-server/bin:/usr/local/tools/anaconda3/bin:/usr/local/tools/java/current/bin:/usr/local/tools/scala/current/bin:/usr/local/tools/hadoop/current/bin:/usr/local/tools/spark/current/bin:/usr/local/tools/hive/current/bin:/usr/local/tools/zookeeper/current/bin:/usr/local/tools/flume/current/bin:/usr/local/tools/flink/current/bin:/usr/local/tools/node/current/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/liuxiaowei/.local/bin:/home/liuxiaowei/bin)
Logging initialized using configuration in file:/usr/local/tools/hive/apache-hive-2.3.5-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>

== Hive CLI extensions ==

== Hive command option parameters ==

hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]

-i <filename>: initialize the session by running the HQL in the given file

-e <query-string>: run the specified HQL statement

-f <filename>: run the HQL script in the given file

-v: echo the executed HQL statements to the console (verbose)

-p <port>: connect to the Hive Server on the given port number

-hiveconf x=y: set a Hive/Hadoop configuration variable

-S: silent mode; run the command without printing log output
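The -i option does not appear in the examples below, so here is a sketch of how it combines with the others. Assuming a hypothetical init file named init.hql and a script named daily_report.hql (both file names are assumptions, not from the original):

```shell
# init.hql (hypothetical) might contain session setup such as:
#   set hive.cli.print.header=true;
#   use dw;

# Apply the init file, then run a script file; -S suppresses log output
hive -i init.hql -S -f daily_report.hql
```

This requires a working Hive installation; the point is only that -i runs before -f, so session settings in the init file apply to the script.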

== Example of the Hive command with options ==

Directly execute a HiveQL statement with -e as follows:

[hadoop@dw-test-cluster-001 ~]$ hive -e "select * from dw.ods_sale_order_producttion_amount;"
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
202005  20200501  199000.00  2020 30  20200731  00
202005  20200502  185000.00  2020 30  20200731  00
202005  20200503  199000.00  2020 30  20200731  00
202005  20200504  138500.00  2020 30  20200731  00
202005  20200505  196540.00  2020 30  20200731  00
202005  20200506  138500.00  2020 30  20200731  00
202005  20200507  159840.00  2020 30  20200731  00
202005  20200508  189462.00  2020 30  20200731  00
202005  20200509  200000.00  2020 30  20200731  00
202005  20200510  198540.00  2020 30  20200731  00
202006  20200601  189000.00  2020 30  20200731  00
202006  20200602  20200731  00
202006  20200603  189000.00  2020 30  20200731  00
202006  20200604  158500.00  2020 30  20200731  00
202006  20200605  200140.00  2020 30  20200731  00
202006  20200606  158500.00  2020 30  20200731  00
202006  20200607  198420.00  2020 30  20200731  00
202006  20200608  158500.00  2020 30  20200731  00
202006  20200609  200100.00  2020 30  20200731  00
202006  20200610  135480.00  2020 30  20200731  00
Time taken: 4.23 seconds, Fetched: 20 row(s)
[hadoop@dw-test-cluster-001 ~]$

The hive -e "select * from dw.ods_sale_order_producttion_amount;" statement can also be saved to a script file. Write it into a file named ods_sale_order_producttion_amount.sql and run it with -f:

[hadoop@dw-test-cluster-001 ~]$ hive -f ods_sale_order_producttion_amount.sql
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
202005  20200501  199000.00  2020 30  20200731  00
202005  20200502  185000.00  2020 30  20200731  00
202005  20200503  199000.00  2020 30  20200731  00
202005  20200504  138500.00  2020 30  20200731  00
202005  20200505  196540.00  2020 30  20200731  00
202005  20200506  138500.00  2020 30  20200731  00
202005  20200507  159840.00  2020 30  20200731  00
202005  20200508  189462.00  2020 30  20200731  00
202005  20200509  200000.00  2020 30  20200731  00
202005  20200510  198540.00  2020 30  20200731  00
202006  20200601  189000.00  2020 30  20200731  00
202006  20200602  20200731  00
202006  20200603  189000.00  2020 30  20200731  00
202006  20200604  158500.00  2020 30  20200731  00
202006  20200605  200140.00  2020 30  20200731  00
202006  20200606  158500.00  2020 30  20200731  00
202006  20200607  198420.00  2020 30  20200731  00
202006  20200608  158500.00  2020 30  20200731  00
202006  20200609  200100.00  2020 30  20200731  00
202006  20200610  135480.00  2020 30  20200731  00
Time taken: 4.23 seconds, Fetched: 20 row(s)
[hadoop@dw-test-cluster-001 ~]$
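Because -S suppresses the log lines, combining it with -e leaves only the result rows on stdout, which makes it easy to feed Hive output into ordinary shell pipelines. A minimal sketch (the output path /tmp/... is an assumption for illustration):

```shell
# Capture the query result into a shell variable; -S keeps logs out of stdout
rows=$(hive -S -e "select * from dw.ods_sale_order_producttion_amount;")

# Persist the rows for downstream processing
echo "${rows}" > /tmp/ods_sale_order_producttion_amount.tsv
```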

== Start the JDBC connection in Linux ==

Ensure that the JDBC connection has been configured for Hive and that the JDBC service (HiveServer2) is running; then connect over JDBC from the shell as follows.

Method 1: start beeline, then connect interactively:

!connect jdbc:hive2://dw-test-cluster-007:10000

and enter the user name and password when prompted.

Method 2: pass the connection details on the command line:

beeline -u "jdbc:hive2://dw-test-cluster-007:10000" -n hadoop -p hadoop

-u: specifies the JDBC connection URL. -n: specifies the user name. -p: specifies the password.

The JDBC connection in a shell environment looks like this:

[hadoop@node1 apache-hive-2.1.5-bin]$ beeline -u "jdbc:hive2://node1:10000" -n hadoop -p hadoop
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/tools/apache-hive-2.3.5-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/tools/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://node1:10000
Connected to: Apache Hive (version 2.3.5)
Driver: Hive JDBC (version 2.3.5)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.3.5 by Apache Hive
0: jdbc:hive2://node1:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
| dw             |
+----------------+
2 rows selected (5.817 seconds)
0: jdbc:hive2://node1:10000>
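Beeline also supports -e and -f for non-interactive execution, mirroring the hive command shown earlier. A sketch, reusing the same connection details:

```shell
# Run a single statement over JDBC without entering the interactive shell
beeline -u "jdbc:hive2://node1:10000" -n hadoop -p hadoop -e "show databases;"
```

This is handy in scheduled jobs, since the exit code reflects whether the statement succeeded.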

== Restart Hive ==

If you change the HDFS or Hive configuration while Hive is being used over JDBC, the new configuration may not be picked up; changing settings such as the HDFS block size may even raise an exception. In that case, restart the Hive JDBC service: run jps, kill the two RunJar processes, and start the services again.

[hadoop@node1 apache-hive-2.1.5-bin]$ jps
8195 DFSZKFailoverController
15686 Jps
7607 NameNode
15303 RunJar
6408 QuorumPeerMain
15304 RunJar

Create a startup script with vim /data/tools/apache-hive-2.1.5-bin/start_hive.sh. Put the following contents in the file, then save and exit.

#!/usr/bin/env bash

# Start the Hive metastore service in the background
nohup hive --service metastore >> /data/logs/hive/meta.log 2>&1 &

# Start Hive's JDBC service (HiveServer2) in the background
nohup hive --service hiveserver2 >> /data/logs/hive/hiveserver2.log 2>&1 &

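The "kill the two RunJars" step can itself be scripted. A hypothetical stop_hive.sh (the script name and the assumption that the metastore and HiveServer2 are the only RunJar processes on the host are mine, not from the original):

```shell
#!/usr/bin/env bash
# stop_hive.sh (hypothetical): find the RunJar processes reported by jps
# (the metastore and HiveServer2) and kill them before restarting.
for pid in $(jps | awk '$2 == "RunJar" {print $1}'); do
    echo "Stopping RunJar process ${pid}"
    kill "${pid}"
done

# Start both services again
sh /data/tools/apache-hive-2.1.5-bin/start_hive.sh
```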

These are some common Shell commands used in Hive.

== Hive parameter settings ==

In practice, the Hive CLI is the most reliable and stable environment. One very common feature of the Hive CLI is parameter configuration, which both makes query results easier to read and serves as a tuning tool. Hive parameters can be configured in roughly three ways:

  • Through the configuration file
  • Using command line arguments (valid for hive startup instances)
  • Parameter declaration (valid for Hive connection sessions)

== Through the configuration file ==

Hive starts as a Hadoop client, so it also reads the Hadoop configuration; Hive's own configuration overrides Hadoop's. Settings made in configuration files apply to all Hive processes started on the host. The configuration files usually refer to:

  1. hive-site.xml
  2. hive-default.xml
  3. Hadoop-related user-defined configuration files (core-site.xml, hdfs-site.xml, mapred-site.xml, etc.)
  4. Hadoop-related default configuration files (core-default.xml, hdfs-default.xml, mapred-default.xml, etc.)

== Command-line parameters (valid for the Hive startup instance) ==

When starting Hive (client or server), add the -hiveconf param=value option on the command line to set parameters. For example, enable table-header output when starting the Hive CLI:

[hadoop@shucang-10 ~]$  hive -hiveconf hive.cli.print.header=true


== Parameter declaration (valid for Hive connection sessions) ==

After logging in to the Hive CLI, run set hive.cli.print.header=true; to set the property, as follows:

[hadoop@dw-test-cluster-007 ~]$ hive
Logging initialized using configuration in file:/usr/local/tools/hive/apache-hive-2.3.5-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> use dw;
OK
Time taken: 5.351 seconds
hive> set hive.cli.print.header=true;
hive>


These configuration methods have a priority order, from highest to lowest:

  1. Parameter declared with the set command (valid for the Hive connection session);
  2. Command-line -hiveconf parameter (valid for the Hive startup instance);
  3. Hive user-defined parameter configuration file hive-site.xml;
  4. Default Hive parameter configuration file hive-default.xml;
  5. Hadoop-related user-defined configuration files (core-site.xml, hdfs-site.xml, mapred-site.xml, etc.);
  6. Hadoop-related default configuration files (core-default.xml, hdfs-default.xml, mapred-default.xml, etc.)
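The ordering can be observed directly with the hive.cli.print.header property used above. A sketch, assuming hive-site.xml on the host sets the property to false (that default is an assumption for illustration):

```shell
# hive-site.xml (lowest of the three) might set hive.cli.print.header=false;
# -hiveconf overrides the file, but only for this one startup instance
hive -hiveconf hive.cli.print.header=true \
     -e "select * from dw.ods_sale_order_producttion_amount limit 1;"

# Inside an interactive session, a set statement overrides both the
# configuration file and any -hiveconf value, for that session only:
# hive> set hive.cli.print.header=false;
```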

For detailed parameter values and Configuration meanings, see Hive Configuration Properties.