Environment: Alibaba Cloud server, CentOS 7 x86_64

Installation media: jdk-7u75-linux-i586.tar.gz, hadoop-2.4.1.tar.gz

Install the JDK

# tar -zxvf jdk-7u75-linux-i586.tar.gz

Configure environment variables:

# vi .bash_profile

JAVA_HOME=/root/training/jdk1.7.0_75
export JAVA_HOME

PATH=$JAVA_HOME/bin:$PATH
export PATH

# source .bash_profile
# which java
# java -version

Troubleshooting: this JDK is a 32-bit (i586) build, and a 64-bit system cannot run 32-bit binaries until the 32-bit glibc libraries are installed. If java fails with the error below, install them with yum.

-bash: /root/training/jdk1.7.0_75/bin/java: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

# yum install glibc*.i686
# locate /lib/ld-linux.so.2
# rpm -qf /lib/ld-linux.so.2
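
To confirm that a binary really is 32-bit before installing the compatibility libraries, one option is to read the ELF header directly. The sketch below is only an illustration: the function name elf_class is hypothetical, and it assumes od and tr are available. It relies on the ELF format, where the first four bytes are the magic number and the fifth byte is 1 for 32-bit or 2 for 64-bit files.

```shell
# Report whether a file is a 32-bit or 64-bit ELF binary.
# elf_class is a hypothetical helper name, not part of the JDK or CentOS.
elf_class() {
    # First 5 bytes as hex: 7f 45 4c 46 ("\177ELF") plus the class byte.
    header=$(od -An -tx1 -N5 "$1" | tr -d ' \n')
    case "$header" in
        7f454c4601) echo "32-bit" ;;
        7f454c4602) echo "64-bit" ;;
        *)          echo "not an ELF file" ;;
    esac
}

# Only inspect the JDK if JAVA_HOME is set and the binary exists.
if [ -n "${JAVA_HOME:-}" ] && [ -f "$JAVA_HOME/bin/java" ]; then
    echo "java binary: $(elf_class "$JAVA_HOME/bin/java")"
    echo "kernel architecture: $(uname -m)"
fi
```

A "32-bit" result on an x86_64 kernel is the situation that triggers the bad ELF interpreter error.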

Install Hadoop

# tar -zxvf hadoop-2.4.1.tar.gz

Configure environment variables:

# vi .bash_profile

HADOOP_HOME=/root/training/hadoop-2.4.1
export HADOOP_HOME

PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH

# source .bash_profile

Local mode configuration

Parameter file   Configuration parameter   Reference value
hadoop-env.sh    JAVA_HOME                 /root/training/jdk1.7.0_75

# vi hadoop-env.sh

export JAVA_HOME=/root/training/jdk1.7.0_75

Set the hostname, and map it in /etc/hosts to the server's private IP address, not its public one.

# vi /etc/hosts

192.168.1.107 izwz985sjvpoji48moqz01z
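
A quick way to tell whether an address is private is to test it against the RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The helper below is a rough sketch: is_private_ip is a hypothetical name, it only matches prefixes and does not validate the address, and 203.0.113.10 is just a sample public address used for contrast.

```shell
# Succeed if the IPv4 address falls in an RFC 1918 private range.
# is_private_ip is a hypothetical helper; it does not validate its input.
is_private_ip() {
    case "$1" in
        10.*|192.168.*)                        return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *)                                     return 1 ;;
    esac
}

# 192.168.1.107 is the address used in /etc/hosts above;
# 203.0.113.10 is an illustrative public address.
for ip in 192.168.1.107 203.0.113.10; do
    if is_private_ip "$ip"; then
        echo "$ip: private"
    else
        echo "$ip: public"
    fi
done
```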

Verify MapReduce

# hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount ~/training/data/input/data.txt ~/training/data/output/
# more part-r-00000

Pseudo-distributed mode configuration

Parameter file    Configuration parameter         Reference value                   Note
hadoop-env.sh     JAVA_HOME                       /root/training/jdk1.7.0_75        Java home directory
hdfs-site.xml     dfs.replication                 1                                 Number of replicas kept for each data block
core-site.xml     fs.defaultFS                    hdfs://<hostname>:9000            Address and port of the NameNode; 9000 is the RPC port
core-site.xml     hadoop.tmp.dir                  /root/training/hadoop-2.4.1/tmp   Defaults to a directory under /tmp if unchanged; the configured path must exist in advance
mapred-site.xml   mapreduce.framework.name        yarn                              Run MapReduce on YARN
yarn-site.xml     yarn.resourcemanager.hostname   <hostname>                        Address of the YARN ResourceManager
yarn-site.xml     yarn.nodemanager.aux-services   mapreduce_shuffle                 How reducers fetch map output

hdfs-site.xml

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.1.107:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/root/training/hadoop-2.4.1/tmp</value>
</property>

mapred-site.xml (create it from the template first)

# cp mapred-site.xml.template mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

yarn-site.xml

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>192.168.1.107</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
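
The property snippets above go inside the <configuration> root element of each file. As a rough sketch, the four files could also be generated from name/value pairs instead of being edited by hand. The helper names hadoop_property and hadoop_site_xml are hypothetical, the output goes to a scratch directory rather than the real Hadoop configuration directory, and values are written as-is without XML escaping.

```shell
# Print one <property> element (no XML escaping of the value).
hadoop_property() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

# Print a complete site file from name/value argument pairs.
hadoop_site_xml() {
    printf '<?xml version="1.0"?>\n<configuration>\n'
    while [ "$#" -ge 2 ]; do
        hadoop_property "$1" "$2"
        shift 2
    done
    printf '</configuration>\n'
}

# Scratch directory (an assumption): review the files here before
# copying them over the real configuration files.
out_dir=$(mktemp -d)
host=192.168.1.107

hadoop_site_xml dfs.replication 1 > "$out_dir/hdfs-site.xml"
hadoop_site_xml fs.defaultFS "hdfs://$host:9000" \
                hadoop.tmp.dir /root/training/hadoop-2.4.1/tmp > "$out_dir/core-site.xml"
hadoop_site_xml mapreduce.framework.name yarn > "$out_dir/mapred-site.xml"
hadoop_site_xml yarn.resourcemanager.hostname "$host" \
                yarn.nodemanager.aux-services mapreduce_shuffle > "$out_dir/yarn-site.xml"

echo "wrote configuration files to $out_dir"
```

Keeping the host address in one variable avoids the two files drifting apart when the private IP changes.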

Verify HDFS and MapReduce

# cd ~/training
# ls hadoop-2.4.1/tmp/
# hdfs namenode -format
# start-all.sh
# jps
5828 NodeManager
6284 Jps
5438 SecondaryNameNode
5288 DataNode
5579 ResourceManager
5172 NameNode
# hdfs dfsadmin -report
# hdfs dfs -mkdir /input
# hdfs dfs -put data/input/data.txt /input/data.txt
# hdfs dfs -ls -R /
# hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount /input/data.txt /output
# hdfs dfs -cat /output/part-r-00000
# stop-all.sh
# jps
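
After start-all.sh, the jps listing should show the five daemons seen above. A small check like the following can flag any that failed to start. It is a sketch only: missing_daemons is a hypothetical helper that reads jps-style output on standard input, and the sample text is copied from the listing above so the example runs without a cluster.

```shell
# Read jps output on stdin and print each expected daemon that is absent.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the second column exactly so that NameNode does not
        # match SecondaryNameNode.
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$daemon"
    done
}

# Sample taken from the jps listing in this section.
sample='5828 NodeManager
6284 Jps
5438 SecondaryNameNode
5288 DataNode
5579 ResourceManager
5172 NameNode'

missing=$(printf '%s\n' "$sample" | missing_daemons)
if [ -z "$missing" ]; then
    echo "all daemons running"
else
    echo "missing: $missing"
fi
```

On a live node the same check would be run as: jps | missing_daemons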

Configure passwordless SSH login for Hadoop

Server A                                         Server B
1. Generate A's key pair (private and public
   key): ssh-keygen -t rsa
2. Copy A's public key to B: ssh-copy-id         3. Receive Server A's public key
                                                 4. Generate a random string, e.g. helloworld
                                                 5. Encrypt it with A's public key: *****
                                                 6. Send the encrypted string ***** to A
7. Receive the encrypted string sent by B
8. Decrypt it with the private key: helloworld
9. Send the decrypted helloworld to B            10. Receive the decrypted string sent by A
                                                 11. Compare the strings from steps 4 and 10.
                                                     If they match, Server B lets Server A
                                                     log in without a password.

# cd ~
# ls .ssh/
known_hosts
# ssh-keygen -t rsa
# ssh-copy-id -i .ssh/id_rsa.pub <user>@<hostname>
# more .ssh/authorized_keys
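
The challenge-response exchange in the table can be illustrated with textbook RSA on tiny numbers. This is a toy model of the steps as described here, not how ssh is implemented: the key (n = 3233, e = 17, d = 2753, from the primes 61 and 53) and the numeric challenge 65 are illustrative assumptions, modpow is a hypothetical helper, and real keys are far larger.

```shell
# Modular exponentiation by repeated squaring: base^exp mod m.
# modpow is a hypothetical helper; it is only safe for small numbers.
modpow() {
    base=$(( $1 % $3 )); exp=$2; mod=$3; result=1
    while [ "$exp" -gt 0 ]; do
        if [ $(( exp % 2 )) -eq 1 ]; then
            result=$(( result * base % mod ))
        fi
        base=$(( base * base % mod ))
        exp=$(( exp / 2 ))
    done
    echo "$result"
}

# Toy key pair for Server A (assumed values): public (n, e), private d.
n=3233; e=17; d=2753

challenge=65                                # step 4: B picks a challenge
encrypted=$(modpow "$challenge" "$e" "$n")  # step 5: B encrypts with A's public key
answer=$(modpow "$encrypted" "$d" "$n")     # step 8: A decrypts with its private key

# Step 11: B compares A's answer with the original challenge.
if [ "$answer" -eq "$challenge" ]; then
    verdict="login allowed"
else
    verdict="login denied"
fi
echo "$verdict"
```

Only the holder of the private exponent d can recover the challenge, which is why a matching answer proves Server A's identity to Server B.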


When reprinting, please credit the WeChat official account "Data Analysis".

