Contents

1. Installation environment

2. Install Hadoop

   1. Download Hadoop

   2. Modify environment variables

   3. Modify the configuration file

3. Install Hive

   1. Download Hive

   2. Modify environment variables

   3. Modify the hive-site.xml configuration

   4. Check whether the installation is successful

4. Hive data integration

   1. Hive synchronization configuration integration

   2. Configure full synchronization

   3. Hook tests

5. Error records

   1. Abnormal characters exist in the configuration file

   2. Guava versions are inconsistent


1. Installation environment

JDK 1.8

2. Install Hadoop

1. Download Hadoop

Browse mirror.bit.edu.cn/apache/hado… and choose the appropriate version.

Download Hadoop:

```
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
```

Decompress the archive and rename the directory for easy use:

```
tar -xzvf hadoop-3.3.0.tar.gz
mv hadoop-3.3.0 hadoop
```

2. Modify environment variables

Write the Hadoop environment information into the environment variables. Edit the file with `vim /etc/profile` and add:

```
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
```

Run `source /etc/profile` for it to take effect.
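A quick check that the variables took effect (assuming the layout above):

```bash
# Prints the Hadoop version banner if $HADOOP_HOME/bin is on the PATH
hadoop version
```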

3. Modify the configuration file

Edit the hadoop-env.sh file (`vim etc/hadoop/hadoop-env.sh`) and modify the JAVA_HOME information:

```
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64
```

Run Hadoop's bundled example to verify that Hadoop is installed successfully:

```
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar grep input output 'dfs[a-z.]+'
```
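The grep example needs an input directory to read from, and the output directory must not already exist. A minimal setup for a standalone test (paths relative to /opt/hadoop, per the stock Hadoop quickstart):

```bash
cd /opt/hadoop
mkdir input
cp etc/hadoop/*.xml input    # sample files for the job to grep through
# ...run the hadoop jar command above...
cat output/*                 # shows the matches the job found
```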

3. Install Hive

1. Download Hive

Download Hive from the mirror, decompress it, and rename the directory:

```
wget mirror.bit.edu.cn/apache/hive…
tar -zxvf apache-hive-3.1.2-bin.tar.gz
mv apache-hive-3.1.2-bin hive
```

2. Modify environment variables

Edit the file with `vim /etc/profile` and add:

```
export HIVE_HOME=/opt/hive
export PATH=$MAVEN_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$PATH
```

Run `source /etc/profile` for it to take effect.

3. Modify the hive-site.xml configuration
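Hive reads its configuration from conf/hive-site.xml. If the file does not exist yet, it is commonly created from the bundled template (a sketch, assuming the standard Hive 3.1.2 layout under /opt/hive):

```bash
cd /opt/hive
# hive-default.xml.template ships with the distribution; hive-site.xml overrides it
cp conf/hive-default.xml.template conf/hive-site.xml
```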

The file begins with a warning that changes belong in hive-site.xml rather than the generated default file:

```xml
<!-- WARNING!!! This file is auto generated for documentation purposes ONLY! -->
<!-- WARNING!!! Any changes you make to this file will be ignored by Hive. -->
<!-- WARNING!!! You must make your changes in hive-site.xml instead. -->
<!-- Hive Execution Parameters -->
```

The following properties already exist in the file. Search for each one and modify it in place, or delete it and add the new value in the same position:

```xml
<!-- metastore database user name -->
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>

<!-- metastore database password -->
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>

<!-- MySQL connection URL -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://127.0.0.1:3306/hive</value>
</property>

<!-- MySQL driver -->
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>hive.exec.script.wrapper</name>
  <value/>
  <description/>
</property>
```

Copy the MySQL driver jar into hive/lib, then go to /hive/bin and initialize the metastore schema:

```
schematool -dbType mysql -initSchema
```
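If initialization succeeded, the hive database now contains the metastore tables. A quick check, assuming the MySQL credentials configured above:

```bash
# Lists metastore tables such as DBS, TBLS and COLUMNS_V2 on success
mysql -uroot -p123456 -e 'USE hive; SHOW TABLES;'
```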

4. Check whether the installation is successful

Run `hive --version` to view the current version.

Run `hive`; if the Hive command line appears, the installation succeeded.

4. Hive data integration

The hook sends Hive metadata operations to Kafka as events; the Atlas Ingest module consumes these messages from Kafka and writes the corresponding Atlas metadata into the underlying JanusGraph database for storage and management.
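To watch the hook messages flow through, you can tail the ATLAS_HOOK Kafka topic that the Hive hook publishes to (a sketch, assuming a local broker on the default port):

```bash
# Each Hive operation captured by the hook arrives here as a JSON notification
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic ATLAS_HOOK --from-beginning
```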

1. Hive synchronization configuration integration

Modify hive-env.sh so that HIVE_AUX_JARS_PATH points at the Atlas hook package:

```
export HIVE_AUX_JARS_PATH=/opt/apache-atlas-2.1.0/hook/hive
```

Modify hive-site.xml to register the hook that runs after query execution:

```xml
<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
```

Note that this is post-execution monitoring; Hive also provides pre-execution hooks (hive.exec.pre.hooks) and failure hooks (hive.exec.failure.hooks). These are essentially callbacks invoked over the query execution lifecycle.
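For a quick experiment, the hook can also be enabled for a single session rather than globally, using the standard --hiveconf override (a sketch; the class name is the Atlas hook registered above):

```bash
# Enable the Atlas post-execution hook only for this Hive session
hive --hiveconf hive.exec.post.hooks=org.apache.atlas.hive.hook.HiveHook
```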

2. Configure full synchronization

Copy the atlas configuration file atlas-application.properties to the Hive configuration directory

Add two lines of configuration:

```
atlas.hook.hive.synchronous=false
atlas.rest.address=http://doit33:21000
```
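Put together, the step looks roughly like this (a sketch; the paths assume the /opt locations used in this article, and doit33 is this article's Atlas host):

```bash
# Make the Atlas client settings visible to the Hive hook
cp /opt/apache-atlas-2.1.0/conf/atlas-application.properties /opt/hive/conf/

# Append the two hook settings
cat >> /opt/hive/conf/atlas-application.properties <<'EOF'
atlas.hook.hive.synchronous=false
atlas.rest.address=http://doit33:21000
EOF
```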

Hooks cannot retroactively sense tables that already existed in Hive before Atlas was installed, so no metadata is generated for them. Atlas provides a tool to import metadata from existing Hive databases and tables; it ships in the hive-hook package produced by the Atlas build.

```
bin/import-hive.sh
```
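If only part of the warehouse needs importing, the script also accepts filters; the -d/-t options below are from the Atlas 2.1.0 hook documentation, so check the script's usage output on your build:

```bash
# Import a single database, or a single table within it
bin/import-hive.sh -d default
bin/import-hive.sh -d default -t teache
```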

The output looks like the following. The tool prompts for the Atlas username and password; after they are entered, the import runs. `Hive Meta Data Imported successfully!!` indicates that the data was imported successfully.

```
sh import-hive.sh
Using Hive configuration directory [/opt/hive/conf]
Log file for import is /opt/apache-atlas-2.1.0/logs/import-hive.log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-01-15T11:41:01,614 INFO [main] org.apache.atlas.ApplicationProperties - Looking for atlas-application.properties in classpath
2021-01-15T11:41:01,619 INFO [main] org.apache.atlas.ApplicationProperties - Loading atlas-application.properties from file:/opt/hive/conf/atlas-application.properties
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Using graphdb backend 'janus'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Using storage backend 'hbase2'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Using index backend 'solr'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Atlas is running in MODE: PROD.
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Setting solr-wait-searcher property 'true'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Setting index.search.map-name property 'false'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Setting atlas.graph.index.search.max-result-set-size = 150
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache = true
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache-clean-wait = 20
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache-size = 0.5
2021-01-15T11:41:01,661 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.tx-cache-size = 15000
2021-01-15T11:41:01,661 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.tx-dirty-size = 120
Enter username for atlas :- admin
Enter password for atlas :-
2021-01-15T11:41:05,721 INFO [main] org.apache.atlas.AtlasBaseClient - Trying with address http://127.0.0.1:21000
2021-01-15T11:41:05,831 INFO [main] org.apache.atlas.AtlasBaseClient - method=GET path=api/atlas/admin/status contentType=application/json; charset=UTF-8 accept=application/json status=200
```

3. Hook tests

With all hooks configured, create a test table in Hive and check whether it can be searched in Atlas. If it can, the configuration is working.

Before the table is created, Atlas shows the existing table information.

Then create a table in Hive

```
hive> CREATE TABLE teache(
    >   id int,
    >   name string,
    >   age int,
    >   sex string,
    >   peoject string
    > );
OK
Time taken: 0.645 seconds
hive> show tables;
OK
class
student
teache
Time taken: 0.108 seconds, 3 row(s)
```

The new table then appears in Atlas automatically.
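Besides the UI, the result can be verified against the Atlas REST API (a sketch, assuming the default admin account and the local server from the earlier steps):

```bash
# Basic search for the new hive_table entity via the v2 search API
curl -u admin:admin \
  'http://127.0.0.1:21000/api/atlas/v2/search/basic?typeName=hive_table&query=teache'
```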

5. Error records

1. Abnormal characters exist in the configuration file

The error is as follows:

```
Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
    at org.apache.hadoop.fs.Path.initialize(Path.java:263)
    at org.apache.hadoop.fs.Path.<init>(Path.java:221)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:710)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:627)
    at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:591)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:747)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
    at java.net.URI.checkPath(URI.java:1823)
    at java.net.URI.<init>(URI.java:745)
    at org.apache.hadoop.fs.Path.initialize(Path.java:260)
    ... 12 more
```

Solution:

Locate the properties indicated in the error message and replace the ${system:...} placeholders in their values with concrete paths:

```xml
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive/local</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/hive/resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
```
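It can also help to make sure those directories actually exist and are writable (a sketch, assuming the paths above; permissions are deliberately loose for a test setup):

```bash
# Pre-create the scratch and resource directories Hive will use
mkdir -p /tmp/hive/local /tmp/hive/resources
chmod -R 777 /tmp/hive
```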

2. Guava versions are inconsistent

Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8 at [row,col,system-id]: [3215, 96, "file:/opt/hive/conf/hive-site.xml" ] at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java: 3051 ) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java: 3000 ) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java: 2875 ) at org.apache.hadoop.conf.Configuration.get(Configuration.java: 1484 ) at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java: 4996 ) at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java: 5069 ) at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java: 5156 ) at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java: 5104 ) at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java: 96 ) at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java: 1473 ) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java: 62 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java: 43 ) at java.lang.reflect.Method.invoke(Method.java: 498 ) at org.apache.hadoop.util.RunJar.run(RunJar.java: 323 ) at org.apache.hadoop.util.RunJar.main(RunJar.java: 236 ) Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8 at [row,col,system-id]: [3215, 96, "file:/opt/hive/conf/hive-site.xml" ] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java: 621 ) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java: 491 ) at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java: 2456 ) at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java: 2403 ) at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java: 2369 ) at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java: 1515 ) at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java: 2828 ) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java: 1123 ) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java: 3347 ) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java: 3141 ) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java: 3034 ) ... 15 moreCopy the code

Solutions:

1. The class com.google.common.base.Preconditions.checkArgument comes from guava.jar.

2. In hadoop-3.2.1 (path: hadoop/share/hadoop/common/lib) the jar is guava-27.0-jre.jar; in hive-3.1.2 (path: hive/lib) it is guava-19.0.1.jar.

3. Make the versions consistent: delete the lower-version guava jar from Hive's lib directory and copy the higher-version guava jar from Hadoop into it, as sketched below.
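A minimal sketch, assuming the versions and paths listed above:

```bash
# Remove Hive's older guava and replace it with Hadoop's newer one
rm /opt/hive/lib/guava-19.0.1.jar
cp /opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar /opt/hive/lib/
```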

Restart Hive and the problem is solved.