1. Hive CLI

1.1 Help

You can run hive -h or hive --help to view the help information for all commands. The following information is displayed:

usage: hive
 -d,--define <key=value>          Variable substitution to apply to Hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to Hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file (run before entering interactive mode)
 -s,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the console)

1.2 Interactive Command Line

To enter the interactive CLI, run the hive command without any parameters.
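
A minimal interactive session might look like this (a sketch; the emp table used throughout this article is assumed to already exist):

$ hive
hive> select * from emp;
hive> quit;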

1.3 Running SQL Commands

You can run SQL commands using hive -e without entering the interactive command line.

hive -e 'select * from emp';

1.4 Executing SQL Scripts

SQL scripts can be executed on a local file system or HDFS.

#Local file system
hive -f /usr/file/simple.sql;

#HDFS File system
hive -f hdfs://hadoop001:8020/tmp/simple.sql;

The content of simple.sql is as follows:

select * from emp;
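
If the script only exists on the local file system, it can be copied to HDFS first so that the second form above works (a sketch reusing the paths from the example; it assumes the Hadoop client is configured on this host):

hdfs dfs -put /usr/file/simple.sql hdfs://hadoop001:8020/tmp/simple.sql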

1.5 Configuring Hive Variables

You can use --hiveconf to set variables for the Hive runtime.

hive -e 'select * from emp' \
--hiveconf hive.exec.scratchdir=/tmp/hive_scratch \
--hiveconf mapred.reduce.tasks=4;

hive.exec.scratchdir: specifies a directory on HDFS used to store execution plans and the intermediate output of the different Map/Reduce stages.

1.6 Starting the Configuration File

Use -i to run an initialization script before entering interactive mode, which is equivalent to specifying a configuration file at startup.

hive -i /usr/file/hive-init.conf;

The hive-init.conf content is as follows:

set hive.exec.mode.local.auto = true;

The default value of hive.exec.mode.local.auto is false; setting it to true here enables local mode.

1.7 User-defined Variables

--define <key=value> and --hivevar <key=value> are functionally equivalent and are both used to define custom variables. Here's an example:

Define variables:

hive --define n=ename --hivevar j=job;

Reference custom variables in a query:

hive> select ${n} from emp;
hive> select ${hivevar:n} from emp;
hive> select ${j} from emp;
hive> select ${hivevar:j} from emp;

Both reference styles resolve to the same variable: ${n} and ${hivevar:n} return the ename column, while ${j} and ${hivevar:j} return the job column.

2. Beeline

2.1 HiveServer2

Hive provides HiveServer and HiveServer2 services. Both of them allow clients to connect using multiple programming languages. However, HiveServer cannot process concurrent requests from multiple clients.

HiveServer2 (HS2) allows remote clients to submit requests to Hive and retrieve results using a variety of programming languages, and it supports concurrent access and authentication from multiple clients. HS2 is a single process composed of multiple services, including a Thrift-based Hive service (TCP or HTTP) and a Jetty web server for the Web UI.

HiveServer2 has its own CLI, Beeline, which is a JDBC client based on SQLLine. Since HiveServer2 is the focus of Hive development and maintenance (HiveServer is no longer supported as of Hive 0.15), the Hive CLI is not recommended; use Beeline instead.
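
Beeline needs a running HiveServer2 instance to connect to. A minimal way to start one (a sketch; it assumes $HIVE_HOME/bin is on the PATH, and the Thrift port defaults to 10000):

# Start HiveServer2 in the foreground
$ hiveserver2

# Or keep it running in the background
$ nohup hiveserver2 > hiveserver2.log 2>&1 &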

2.2 Beeline

Beeline has more parameters available. You can run beeline --help to view them, as shown below:

Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>                the JDBC URL to connect to
   -r                               reconnect to last saved connect url (in conjunction with !save)
   -n <username>                    the username to connect as
   -p <password>                    the password to connect as
   -d <driver class>                the driver class to use
   -i <init file>                   script file for initialization
   -e <query>                       query that should be executed
   -f <exec file>                   script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value        Use value for given property
   --hivevar name=value             hive variable name and value
                                    This is Hive specific settings in which variables
                                    can be set at session level and referenced in Hive
                                    commands or queries.
   --property-file=<property-file>  the file to read connection properties (url, driver, user, password) from
   --color=[true/false]             control whether color is used for display
   --showHeader=[true/false]        show column names in query results
   --headerInterval=ROWS            the interval between which headers are displayed
   --fastConnect=[true/false]       skip building table/column list for tab-completion
   --autoCommit=[true/false]        enable/disable automatic transaction commit
   --verbose=[true/false]           show verbose error messages and debug info
   --showWarnings=[true/false]      display connection warnings
   --showNestedErrs=[true/false]    display nested errors
   --numberFormat=[pattern]         format numbers using DecimalFormat pattern
   --force=[true/false]             continue running script even after errors
   --maxWidth=MAXWIDTH              the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH     the maximum width to use when displaying columns
   --silent=[true/false]            be more silent
   --autosave=[true/false]          automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
   --incrementalBufferRows=NUMROWS  the number of rows to buffer when printing rows on stdout,
                                    defaults to 1000; only applicable if --incremental=true
                                    and --outputformat=table
   --truncateTable=[true/false]     truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER      specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL                set the transaction isolation level
   --nullemptystring=[true/false]   set to true to get historic behavior of printing null as empty string
   --maxHistoryRows=MAXHISTORYROWS  the maximum number of rows to store beeline history
   --convertBinaryArrayToString=[true/false]  display binary column data as string or as byte array
   --help                           display this message

2.3 Common Parameters

Beeline supports all the parameters supported by the Hive CLI. The most common parameters are listed below; for the full list, see the official documentation: Beeline Command Options.

Parameter                          Description
-u <database URL>                  Database address
-n <username>                      Username
-p <password>                      Password
-d <driver class>                  Driver class (optional)
-e <query>                         Execute a SQL command
-f <file>                          Execute a SQL script
-i (or) --init <file or files>     Run an initialization script before entering interactive mode
--property-file <file>             Specify a configuration file
--hiveconf property=value          Specify a configuration property
--hivevar name=value               User-defined variable, valid at the session level

Example: connect to Hive using a username and password:

$ beeline -u jdbc:hive2://localhost:10000 -n username -p password
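
The same connection can also be used non-interactively with -e or -f (a sketch reusing the emp table and simple.sql from the earlier examples):

# Run a single query and exit
beeline -u jdbc:hive2://localhost:10000 -n username -p password -e 'select * from emp'

# Run a script file and exit
beeline -u jdbc:hive2://localhost:10000 -n username -p password -f /usr/file/simple.sql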

3. Hive Configuration

You can configure Hive attributes in the following ways:

3.1 Configuration File

The first method is to use a configuration file. Configuration specified in a configuration file is permanent. Hive provides the following configuration files:

  • hive-site.xml: the main Hive configuration file.

  • hivemetastore-site.xml: metastore-related configuration.

  • hiveserver2-site.xml: HiveServer2-related configuration.

As an example, configure hive.exec.scratchdir in hive-site.xml:

 <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/mydir</value>
    <description>Scratch space for Hive jobs</description>
  </property>

3.2 hiveconf

The second method is to use –hiveconf to specify the configuration when starting the Hive CLI/Beeline command. In this way, the configuration applies to the entire Session.

hive --hiveconf hive.exec.scratchdir=/tmp/mydir

3.3 set

The third method is to run the set command in an interactive environment (Hive CLI/Beeline). This setting is also scoped to the session and takes effect for all commands executed after it. set can also be used to view the current value of a parameter, as follows:

0: jdbc:hive2://hadoop001:10000> set hive.exec.scratchdir=/tmp/mydir;
No rows affected (0.025 seconds)

0: jdbc:hive2://hadoop001:10000> set hive.exec.scratchdir;
+----------------------------------+--+
|               set                |
+----------------------------------+--+
| hive.exec.scratchdir=/tmp/mydir  |
+----------------------------------+--+

3.4 Configuration Priority

Each method in the chain below overrides the ones before it, i.e. set has the highest priority: hive-site.xml -> hivemetastore-site.xml -> hiveserver2-site.xml -> --hiveconf -> set
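
For example, a property set at a higher-priority level masks the same property set at a lower one (a sketch reusing hive.exec.scratchdir; the /tmp/cli_scratch and /tmp/session_scratch paths are purely illustrative):

# hive-site.xml sets hive.exec.scratchdir=/tmp/mydir
# --hiveconf overrides the file value for this session:
hive --hiveconf hive.exec.scratchdir=/tmp/cli_scratch

# and set overrides everything before it for subsequent commands:
hive> set hive.exec.scratchdir=/tmp/session_scratch;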

3.5 Setting Parameters

Hive has a large number of optional configuration parameters. Refer to the official documentation, AdminManual Configuration, when you need them.

References

  1. HiveServer2 Clients
  2. LanguageManual Cli
  3. AdminManual Configuration

See the GitHub open-source project Getting Started with Big Data for more articles in the big data series.