1. Install JDK 1.8

Use FTP to upload the JDK archive to the /opt directory.

tar -zxvf /opt/jdk-8u121-linux-x64.tar.gz -C /usr/local/

This unpacks the JDK to /usr/local.

Configure the environment variables.

Modify /etc/profile.
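For example, lines like these can be appended (assuming the JDK unpacked to /usr/local/jdk1.8.0_121 as above):

export JAVA_HOME=/usr/local/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin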

chmod +x /etc/profile    # add execute permission
source /etc/profile      # make it take effect

2. Install Hadoop

Upload the Hadoop archive via FTP.

Decompress the package to /usr/local.
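For example, assuming the archive is named hadoop-2.7.3.tar.gz and was uploaded to /opt:

tar -zxvf /opt/hadoop-2.7.3.tar.gz -C /usr/local/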

mv hadoop-2.7.3 hadoop    # rename the folder (run in /usr/local)

Configure Hadoop environment variables
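For example, append lines like these to /etc/profile (assuming the rename above to /usr/local/hadoop), then source it again:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin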

The final configuration file should now contain both the JDK and Hadoop entries.



If the version information is displayed, the configuration is successful.
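For example, once /etc/profile has been sourced, both of these should print version details:

java -version
hadoop version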



Set the host name: edit /etc/hostname and replace its contents with ${yourname}. Then edit /etc/hosts to map the host name to the IP address:

# vim /etc/hostname
${yourname}

# vim /etc/hosts
192.168.0.101     ${yourname}

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Command 1 generates a public key (~/.ssh/id_dsa.pub) and a private key (~/.ssh/id_dsa). -t dsa specifies the key type (the value can be 'rsa' or 'dsa'); -P '' sets an empty passphrase so that no password is required at login; -f ~/.ssh/id_dsa specifies the output file. Command 2 appends the local public key to authorized_keys, which allows passwordless SSH login to this machine. If another host (say, host A) should also log in without a password, add host A's public key to authorized_keys as well; host A can then SSH to this machine without a password.
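To check that it worked, an SSH connection to this machine should no longer prompt for a password:

ssh ${yourname}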

1. Modify hadoop-env.sh

vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh

In this file, change JAVA_HOME to the following path:

export JAVA_HOME=/usr/local/jdk1.8.0_121

2. Modify core-site.xml

Add the following:

<property>
    <!-- the HDFS NameNode address -->
    <name>fs.defaultFS</name>
    <value>hdfs://jinkai:9000</value>
</property>
<!-- the storage directory for files Hadoop generates at runtime -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/mysoft/hadoop/tmp</value>
</property>

3. Modify hdfs-site.xml

<property>
    <!-- a single replica is enough on a one-node cluster -->
    <name>dfs.replication</name>
    <value>1</value>
</property>

4. Modify mapred-site.xml

Specify that MapReduce runs on YARN.

Note that there is no mapred-site.xml file, only mapred-site.xml.template.

So we just need to rename mapred-site.xml.template to mapred-site.xml:

mv mapred-site.xml.template mapred-site.xml
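The renamed file still needs the property that actually points MapReduce at YARN; the standard snippet is:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>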

5. Modify yarn-site.xml

Specify the YARN ResourceManager address and how reducers obtain data:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>jinkai</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

6. Disable the firewall

systemctl stop firewalld.service       # stop the firewall
systemctl disable firewalld.service    # keep it off after reboot

7. Run reboot to restart the server.

Format HDFS:

hdfs namenode -format



If the output reports that the storage directory has been successfully formatted, the format succeeded.

8. Start DFS and YARN

Run the start-dfs.sh and start-yarn.sh commands.
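Both scripts live in $HADOOP_HOME/sbin (covered by the PATH settings above):

start-dfs.sh
start-yarn.sh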



If the daemons start without errors, the startup was successful.

Type the jps command to see which Java processes are running.
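On a single-node setup like this, the list would typically include entries along these lines (PIDs omitted; yours will differ):

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps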



Now you can test it

Upload a TXT file containing some sample text (for example, about data mining and data warehouses) to /opt.
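For instance, a hypothetical sample file could be created directly on the server:

echo "data mining on data warehouse" > /opt/words.txt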

Create a directory in HDFS

hadoop fs -mkdir /input    # 'hadoop dfs' also works but is deprecated



Hadoop may print a warning here because its bundled native library expects a different glibc version than the system provides; the warning can be ignored.

Refer to this blog for a permanent fix: Blog.csdn.net/l1028386804…

Upload the text file to the newly created HDFS directory
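Assuming the hypothetical /opt/words.txt from above:

hadoop fs -put /opt/words.txt /input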



View the file directory in the file system
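For example, this should show the uploaded file:

hadoop fs -ls /input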



Run the wordcount example from the Hadoop examples JAR:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output



hadoop fs -cat /output/part-r-00000    # view the result: each word and its count

hadoop fs -rm -r /output    # delete the existing /output folder (the job fails if it already exists)