CentOS 7 Hadoop installation record

Environment introduction

  • Number of virtual machines: 3
  • Operating system version: CentOS-7-x86_64-Minimal-2009.iso

Cluster introduction

Software Version Introduction

  • JDK version: jdk-8u281-linux-x64.tar.gz
  • Hadoop version: hadoop-3.2.2.tar.gz
  • ZooKeeper version:
  • HBase version:
  • Storm version:
  • Kafka version:
  • MySQL version: mysql-8.0.22-linux-glibc2.12-x86_64
  • Hive version: apache-hive-3.1.2
  • Flume version:
  • Spark version:

Preparation

Perform the same settings on each host node.

1. Set the hostname

Set the hostname to master, then add the cluster hosts to /etc/hosts:

[root@192 ~]sudo hostnamectl set-hostname master
[root@master ~]vi /etc/hosts
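For reference, a sketch of the /etc/hosts entries for the three nodes; the slave addresses here are assumptions (only the master's 192.168.11.212 appears later in this guide), so substitute your actual IPs:

# Map each node's IP to its hostname (slave addresses are assumptions)
192.168.11.212 master
192.168.11.213 slave1
192.168.11.214 slave2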

2. Set a user

Create the hadoop user

Do not configure the cluster as root.

  • Create a hadoop user group
[root@master ~]groupadd hadoop
  • Create user hadoop and add it to group hadoop
[root@master ~]useradd hadoop -g hadoop
  • Set a password for the hadoop user
[root@master ~]passwd hadoop

Add sudo permission

  • Switch to user root and modify the /etc/sudoers file
[root@master ~]vi /etc/sudoers

Add below the existing root entry:

## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
hadoop  ALL=(ALL)       ALL
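To verify the grant, a quick check: switch to the hadoop user and run a command through sudo (it will prompt for hadoop's password):

[root@master ~]su - hadoop
[hadoop@master ~]$ sudo whoami
root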

3. Install Java and set environment variables

Note: Hadoop 3.0 and later support only Java 8. After downloading the JDK, unzip it into /soft (the directory can be changed), then configure the relevant environment variables in /etc/profile.

Install the JDK

  • Prepare the JDK: jdk-8u281-linux-x64.tar.gz and upload it to the /home/hadoop directory on the host
  • Create a /soft folder and change its user and group ownership to hadoop. All software to be installed is stored in this folder
// Create the soft folder
[hadoop@master /]$ sudo mkdir /soft
// Change ownership
[hadoop@master /]$ sudo chown hadoop:hadoop /soft
  • Decompress jdk-8u281-linux-x64.tar.gz to the /soft directory
// Decompress from /home/hadoop to /soft
[hadoop@master ~]$ tar -xzvf jdk-8u281-linux-x64.tar.gz -C /soft
  • Configure the environment variables in the /etc/profile file and run source /etc/profile for them to take effect immediately
// Edit profile
[hadoop@master ~]$ sudo vi /etc/profile

# JDK
export JAVA_HOME=/soft/jdk1.8.0_281
export JRE_HOME=/soft/jdk1.8.0_281/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin

// Make it take effect immediately
[hadoop@master ~]$ source /etc/profile
  • Check whether the installation and configuration are successful
[hadoop@master ~]$ java -version
java version "1.8.0_281"
Java(TM) SE Runtime Environment (build 1.8.0_281-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.281-b09, mixed mode)

Before configuring SSH password-free login, clone the master to create the slave machines, verify that their IPs are consistent with the hosts entries above, and connect to them with Xshell. This gives us three machines in total, all with Java already installed.

4. SSH password-free login

Configure sshd

  • Modify the sshd configuration file
[root@master ~]vi /etc/ssh/sshd_config

RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
  • Generate the key
[root@master ~]su - hadoop
[hadoop@master ~]ssh-keygen -t rsa

After the command is executed, two files are generated in the hadoop user's home directory (/home/hadoop/.ssh):

id_rsa: private key
id_rsa.pub: public key

  • Import the public key to the authentication file
[hadoop@master ~]cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
  • Set file access permission
[hadoop@master ~]chmod 700 /home/hadoop/.ssh
[hadoop@master ~]chmod 600 /home/hadoop/.ssh/authorized_keys
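If password-free login to the slaves does not work yet, the master's public key also needs to land in each slave's authorized_keys; a sketch assuming the hostnames slave1 and slave2:

[hadoop@master ~]ssh-copy-id hadoop@slave1
[hadoop@master ~]ssh-copy-id hadoop@slave2
// Verify password-free login
[hadoop@master ~]ssh slave1 hostname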

5. Install Hadoop 3.2

Install and configure environment variables

  • Download hadoop-3.2.2.tar.gz and upload it to the /home/hadoop directory on the host
  • Decompress hadoop-3.2.2.tar.gz to the /soft directory
// Decompress from /home/hadoop to /soft
[hadoop@master ~]$ tar -xzvf hadoop-3.2.2.tar.gz -C /soft
  • Add the following two lines at the end of the /etc/profile file and run source /etc/profile for immediate effect
// Edit profile
[hadoop@master ~]$ sudo vi /etc/profile

export HADOOP_HOME=/soft/hadoop-3.2.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

// Make it take effect immediately
[hadoop@master ~]$ source /etc/profile
  • Check whether the installation and configuration are successful
[hadoop@master ~]$ hadoop version
Hadoop 3.2.2
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737
This command was run using /soft/hadoop-3.2.2/share/hadoop/common/hadoop-common-3.2.2.jar
  • Set the corresponding data directory in the Hadoop installation directory

These data directories can be chosen freely; you only need to point the corresponding configuration entries at them later.

Create the folder tmp under /soft/hadoop as our temporary directory.

[hadoop@master ~]mkdir -p /soft/hadoop/tmp     # temporary files
[hadoop@master ~]mkdir -p /soft/hadoop/hdfs/nn # namenode directory
[hadoop@master ~]mkdir -p /soft/hadoop/hdfs/dn # datanode directory
[hadoop@master ~]mkdir -p /soft/hadoop/yarn/nm # nodemanager directory
  • Configure the files in the hadoop-3.2.2/etc/hadoop directory:

File             Description
core-site.xml    Core configuration file
hdfs-site.xml    HDFS storage configuration
mapred-site.xml  MapReduce configuration
yarn-site.xml    YARN configuration
workers          Specifies the slave nodes; contains localhost by default
hadoop-env.sh    Hadoop environment variables

Modifying a Configuration File

  • Modify core-site.xml

Input:

[hadoop@master ~]vi core-site.xml

Add:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>  
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/soft/hadoop/tmp</value>
    </property>  
    <property>
      <name>hadoop.proxyuser.hadoop.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hadoop.groups</name>
     <value>hadoop</value>
    </property>
</configuration>
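Note that fs.default.name is the deprecated name of this property; on Hadoop 3.x the preferred, equivalent form is:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
</property>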
  • Modify hadoop-env.sh

Input:

[hadoop@master ~]vi hadoop-env.sh

Change ${JAVA_HOME} to your JDK path

export   JAVA_HOME=${JAVA_HOME}

Change it to:

export JAVA_HOME=/soft/jdk1.8.0_281
  • Modify hdfs-site.xml

Input:

[hadoop@master ~]vi hdfs-site.xml

Add inside the <configuration> node:

<property>
    <name>dfs.name.dir</name>
    <value>/soft/hadoop/hdfs/nn</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/soft/hadoop/hdfs/dn</value>
    <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>true</value>
    <description>need permissions</description>
</property>
<property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50070</value>
</property>

dfs.permissions controls whether HDFS checks file permissions. Setting it to false lets files be created on HDFS without permission checks, which is convenient but makes accidental deletion easier; here it is set to true. You could also simply delete the property node, since the default is true.

  • Modify mapred-site.xml

If the mapred-site.xml file does not exist, copy the mapred-site.xml.template file and rename the copy mapred-site.xml. Input:

[hadoop@master ~]vi mapred-site.xml

Modify the newly created mapred-site.xml file and add the configuration inside the <configuration> node:

<property>
	<name>mapred.job.tracker</name>
	<value>master:9001</value>
</property>
<property>  
    <name>mapreduce.jobhistory.address</name>  
    <value>master:10020</value>  
</property>
<property>
    <name>mapred.local.dir</name>
    <value>/soft/hadoop/yarn</value>
</property>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
  • Modify yarn-site.xml
[hadoop@master ~]vi yarn-site.xml

Add the configuration inside the <configuration> node:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>

Starting Hadoop

When starting Hadoop for the first time, you need to format the NameNode. Switch to the /soft/hadoop-3.2.2/bin directory and run:

[hadoop@master bin]$ ./hadoop namenode -format
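The hadoop namenode command is deprecated in Hadoop 3.x and prints a warning; the equivalent current form is:

[hadoop@master bin]$ ./hdfs namenode -format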

Switch to the /soft/hadoop-3.2.2/sbin directory and start

[hadoop@master sbin]$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [master]
Starting datanodes
Starting secondary namenodes [master]
2021-03-05 11:47:40,324 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers

Run jps to check whether the daemons started successfully, then open the web UIs in your browser:

http://192.168.11.212:8088/cluster

http://192.168.11.212:50070
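For reference, jps on the master should show roughly the following daemons when the cluster is healthy (process IDs will differ):

[hadoop@master sbin]$ jps
63651 NameNode
63764 DataNode
63972 SecondaryNameNode
64197 ResourceManager
64309 NodeManager
68035 Jps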

6. MySQL installation

  • Upload the mysql package to /soft and decompress it
[hadoop@master soft]$ tar -xvf mysql-8.0.22-linux-glibc2.12-x86_64.tar.xz
  • Delete the archive after decompression
[hadoop@master soft]$ rm -rf mysql-8.0.22-linux-glibc2.12-x86_64.tar.xz
  • Rename the folder
[hadoop@master soft]$ mv mysql-8.0.22-linux-glibc2.12-x86_64/ mysql-8.0.22/
  • Create a data directory in mysql-8.0.22
[hadoop@master soft]$ mkdir /soft/mysql-8.0.22/data
  • Change the mysql directory permissions
[hadoop@master soft]$ chmod -R 755 /soft/mysql-8.0.22/
  • Initialize mysql

Be sure to note the password at the end of the initialization output log (the temporary database root password).

[hadoop@master soft]$ cd /soft/mysql-8.0.22/bin/
[hadoop@master bin]$ ./mysqld --initialize --user=mysql --datadir=/soft/mysql-8.0.22/data --basedir=/soft/mysql-8.0.22
2021-03-05T07:10:12.910610Z 0 [Warning] [MY-010139] [Server] Changed limits: max_open_files: 1024 (requested 8161)
2021-03-05T07:10:12.910610Z 0 [Warning] [MY-010142] [Server] Changed limits: table_open_cache: 431 (requested 4000)
2021-03-05T07:10:12.911947Z 0 [Warning] [MY-011070] [Server] 'Disabling symbolic links using --skip-symbolic-links (or equivalent) is the default. Consider not using this option as it' is deprecated and will be removed in a future release.
2021-03-05T07:10:12.912379Z 0 [System] [MY-013169] [Server] /soft/mysql-8.0.22/bin/mysqld (mysqld 8.0.22) initializing of server in progress as process 20165
2021-03-05T07:10:12.937653Z 0 [Warning] [MY-010122] [Server] One can only use the --user switch if running as root
2021-03-05T07:10:13.073996Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-03-05T07:10:16.457296Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-03-05T07:10:24.775289Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: jai;A5_I-xyu
  • Edit the configuration file my.cnf
[root@master /]# vi /etc/my.cnf

# Add:
datadir=/soft/mysql-8.0.22/data
basedir=/soft/mysql-8.0.22
port=3306
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
symbolic-links=0
max_connections=600
innodb_file_per_table=1
lower_case_table_names=0
character_set_server=utf8
# Enable this parameter only when you need to reset the root password without credentials
#skip-grant-tables
default_authentication_plugin=mysql_native_password
  • Test-start the mysql server
[hadoop@master support-files]$ /soft/mysql-8.0.22/support-files/mysql.server start
Starting MySQL...... SUCCESS!
  • Add symbolic links and restart the mysql service
[root@master /]# ln -s /soft/mysql-8.0.22/support-files/mysql.server /etc/init.d/mysql
[root@master /]# ln -s /soft/mysql-8.0.22/bin/mysql /usr/bin/mysql
[hadoop@master mysql-8.0.22]$ service mysql restart
Shutting down MySQL.. SUCCESS!
Starting MySQL.... SUCCESS!
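Optionally, register the service so it starts at boot; a sketch using the SysV tools available on CentOS 7:

[root@master /]# chkconfig --add mysql
[root@master /]# chkconfig mysql on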
  • Log in to the mysql database and change the password
[hadoop@master mysql-8.0.22]$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.22 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> alter user 'root'@'localhost' identified with mysql_native_password by '123456QWEasd';
Query OK, 0 rows affected (0.00 sec)
mysql> use mysql;
mysql> select authentication_string from user where user='root';
+-------------------------------------------+
| authentication_string                     |
+-------------------------------------------+
| *C4FE36EE5830F8BBC49315A96EEADF30D7292EBE |
+-------------------------------------------+
1 row in set (0.00 sec)
mysql> update user set user.Host='%' where user.User='root';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> quit;

7. Install and configure Hive

Configure environment variables

  • Upload and decompress the Hive package in /soft
[hadoop@master soft]$ tar -xvf apache-hive-3.1.2-bin.tar.gz 
  • Environment configuration
[root@master etc]# vi /etc/profile

# Add the following configuration:
export HIVE_HOME=/soft/apache-hive-3.1.2-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export PATH=.:${HIVE_HOME}/bin:$PATH
  • For the configuration to take effect, enter:
[root@master etc]# source /etc/profile

Configuration changes

  • Create new folders
[hadoop@master /]$ mkdir /soft/hive
[hadoop@master /]$ mkdir /soft/hive/warehouse
[hadoop@master /]$ cd /soft/
[hadoop@master soft]$ ls -l
total 0
drwxrwxr-x. 10 hadoop hadoop 184 Mar  5 16:33 apache-hive-3.1.2-bin
drwxr-x---.  5 hadoop hadoop  41 Mar  4 15:07 hadoop
drwxr-xr-x. 10 hadoop hadoop 161 Mar  4 15:36 hadoop-3.2.2
drwxrwxr-x.  3 hadoop hadoop  23 Mar  5 16:46 hive
drwxr-xr-x.  8 hadoop hadoop 273 Dec  9 20:50 jdk1.8.0_281
drwxr-xr-x. 10 hadoop hadoop 141 Mar  5 15:02 mysql-8.0.22

After creating these local folders, you also need to create the /soft/hive/ and /soft/hive/warehouse directories in HDFS. Execute:

$HADOOP_HOME/bin/hadoop fs -mkdir -p /soft/hive/
$HADOOP_HOME/bin/hadoop fs -mkdir -p /soft/hive/warehouse

To grant read/write permission to the newly created directory, run the following command:

$HADOOP_HOME/bin/hadoop fs -chmod 777 /soft/hive/
$HADOOP_HOME/bin/hadoop fs -chmod 777 /soft/hive/warehouse 

To check whether the two directories were created successfully, enter:

[hadoop@master soft]$ $HADOOP_HOME/bin/hadoop fs -ls /soft/
2021-03-05 16:49:06,480 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2021-03-05 16:48 /soft/hive
[hadoop@master soft]$ $HADOOP_HOME/bin/hadoop fs -ls /soft/hive/
2021-03-05 16:49:24,664 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2021-03-05 16:48 /soft/hive/warehouse

Modify hive-site.xml

[hadoop@master soft]$ cd /soft/apache-hive-3.1.2-bin/conf/
[hadoop@master conf]$ cp hive-default.xml.template hive-site.xml
[hadoop@master conf]$ vi hive-site.xml

<!-- HDFS address of the Hive warehouse -->
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/soft/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>
<property>
    <name>hive.exec.scratchdir</name>
    <value>/soft/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<!-- MySQL connection -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;serverTimezone=GMT%2B8&amp;useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<!-- User name -->
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<!-- Password -->
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456QWEasd</value>
</property>
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
</property>

Then change every occurrence of ${system:java.io.tmpdir} in the configuration file to /soft/hive/tmp (create this directory if it does not exist and grant it read/write permissions), and change every occurrence of ${system:user.name} to root.
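A sketch of doing those replacements with sed instead of by hand, assuming GNU sed; back up hive-site.xml first:

[hadoop@master conf]$ mkdir -p /soft/hive/tmp && chmod 777 /soft/hive/tmp
[hadoop@master conf]$ cp hive-site.xml hive-site.xml.bak
# Replace the placeholders in place
[hadoop@master conf]$ sed -i 's#${system:java.io.tmpdir}#/soft/hive/tmp#g' hive-site.xml
[hadoop@master conf]$ sed -i 's#${system:user.name}#root#g' hive-site.xml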

Modify hive-env.sh

[hadoop@master conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@master conf]$ vi hive-env.sh

# Add the following configuration
export HADOOP_HOME=/soft/hadoop-3.2.2
export HIVE_CONF_DIR=/soft/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/soft/apache-hive-3.1.2-bin/lib

Add the JDBC driver package

Hive uses MySQL as its metastore database here, so upload the MySQL JDBC driver package to /soft/apache-hive-3.1.2-bin/lib.
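A sketch of the copy, assuming the connector jar was downloaded to /home/hadoop and is named mysql-connector-java-8.0.22.jar (adjust to the file you actually downloaded):

# Jar name is an assumption; match it to your download
[hadoop@master ~]$ cp /home/hadoop/mysql-connector-java-8.0.22.jar /soft/apache-hive-3.1.2-bin/lib/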

8. Hive Shell test

Switch to the Hive bin directory. Before initializing, make sure Hadoop and Hive use the same version of guava.jar. The two copies live in the following directories:

/soft/apache-hive-3.1.2-bin/lib
/soft/hadoop-3.2.2/share/hadoop/common/lib

Solution: delete the lower version and copy the higher version into the directory that held it.
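A sketch of the swap; the guava file names here are assumptions (Hive 3.1.2 typically ships guava-19.0.jar while Hadoop 3.2.2 ships a newer guava-27.x), so list both directories first and use the names you actually see:

[hadoop@master ~]$ ls /soft/apache-hive-3.1.2-bin/lib/guava*.jar /soft/hadoop-3.2.2/share/hadoop/common/lib/guava*.jar
[hadoop@master ~]$ rm /soft/apache-hive-3.1.2-bin/lib/guava-19.0.jar
[hadoop@master ~]$ cp /soft/hadoop-3.2.2/share/hadoop/common/lib/guava-27.0-jre.jar /soft/apache-hive-3.1.2-bin/lib/

With the jars aligned, initialize the metastore schema: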

[hadoop@master bin]$ schematool  -initSchema -dbType mysql

Start Hive

[hadoop@master sbin]$ cd /soft/apache-hive-3.1.2-bin/bin
[hadoop@master bin]$ hiveserver2
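Once HiveServer2 is up (it may take a minute to accept connections), you can test it with beeline using the JDBC URL listed in the configuration section at the end of this guide:

[hadoop@master bin]$ beeline -u jdbc:hive2://192.168.11.212:10000/db_hiveTest -n hadoop -p hadoop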

9. Hadoop start and stop

Starting the services

1. Start Hadoop

[hadoop@master bin]$ cd /soft/hadoop-3.2.2/sbin/
[hadoop@master sbin]$ start-all.sh
[hadoop@master sbin]$ cd /soft/hadoop-3.2.2/bin/
[hadoop@master bin]$ jps
68035 Jps
63651 NameNode
67426 RunJar
63764 DataNode
63972 SecondaryNameNode
64197 ResourceManager
64309 NodeManager
64678 RunJar

2. Start mysql

# Start
[hadoop@master bin]$ service mysql start
# Or restart
[hadoop@master bin]$ service mysql restart

3. Start Hive

[hadoop@master bin]$ cd /soft/apache-hive-3.1.2-bin/bin
# Hive shell startup command
[hadoop@master bin]$ hive
# JDBC startup command (after it runs, clients can connect over JDBC)
[hadoop@master bin]$ nohup hiveserver2 &

Stopping the services

1. Stop Hive

[hadoop@master hadoop]# ps aux | grep hiveserver2
[hadoop@master hadoop]# kill -9 <PID>

2. Stop mysql

[hadoop@master hadoop]# service mysql stop
Shutting down MySQL........... SUCCESS! 

3. Stop Hadoop

[hadoop@master hadoop]# cd /soft/hadoop-3.2.2/sbin/
[hadoop@master sbin]# stop-all.sh 
[hadoop@master sbin]$ jps
70353 Jps

10. Configuration information

Hadoop related pages:

http://192.168.11.212:8088/cluster // Hadoop cluster monitoring

http://192.168.11.212:50070 // NameNode information

Hive related pages:

http://192.168.11.212:10002 // HiveServer2 web UI

Database:

Database   Database name   Port    Account   Password
mysql      hive            3306    root      123456QWEasd
hive       db_hiveTest     10000   hadoop    hadoop

Note: the Hive JDBC connection URL is jdbc:hive2://192.168.11.212:10000/db_hiveTest