Operating system: macOS Mojave 10.14.5

JDK: 1.8

Hadoop: 2.7.7

1. Java and Hadoop installation

Downloading and installing both is straightforward.

Note that:

  • I’m on a Mac, so JAVA_HOME is $(/usr/libexec/java_home); I use zsh, so edit ~/.zshrc, and don’t forget to source it afterwards.
#JAVA_HOME
export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=$JAVA_HOME/bin:$PATH
#HADOOP_HOME
export HADOOP_HOME=/usr/local/hadoop-2.7.7
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  • Because I put Hadoop in /usr/local/, you need to change the ownership of the hadoop folder:
sudo chown -R zxy:admin /usr/local/hadoop-2.7.7
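To confirm the chown took effect, a quick writability check is enough (the path is the install location assumed above):

```shell
# The directory should now be writable by your (non-root) user
if [ -w /usr/local/hadoop-2.7.7 ]; then
  echo "hadoop dir is writable"
else
  echo "hadoop dir is missing or not writable"
fi
```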

2. Configure SSH

ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# test
ssh localhost
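If `ssh localhost` still prompts for a password after this, the usual culprit is file permissions: sshd ignores an authorized_keys file that is group- or world-readable. A fix, using the standard permissions OpenSSH expects:

```shell
# sshd silently ignores keys if ~/.ssh or authorized_keys is too permissive;
# tighten both to the values OpenSSH requires
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

On macOS you also need Remote Login enabled (System Preferences > Sharing), or `ssh localhost` will be refused outright.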

3. Pseudo-distributed configuration

core-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop-2.7.7/data/tmp</value>
    </property>
</configuration>
  • fs.defaultFS: the address of the HDFS NameNode

  • hadoop.tmp.dir: the directory for Hadoop’s temporary files

hdfs-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/hdfs-site.xml

<configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
</configuration>
  • dfs.replication: the number of replicas kept for each HDFS block. The default is 3; since we only have one node here, set it to 1.
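Optionally, the NameNode and DataNode storage locations can be pinned explicitly instead of being derived from hadoop.tmp.dir. The property names are standard Hadoop 2.x (their defaults already resolve to these same subdirectories of hadoop.tmp.dir); the paths below just match the layout used in this post:

```xml
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop-2.7.7/data/tmp/dfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop-2.7.7/data/tmp/dfs/data</value>
</property>
```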

yarn-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://localhost:19888/jobhistory/logs</value>
    </property>
</configuration>
  • yarn.log-aggregation-enable: enables log aggregation
  • yarn.resourcemanager.hostname: the address of the YARN ResourceManager
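With aggregation enabled, logs are kept forever by default (yarn.log-aggregation.retain-seconds defaults to -1 in yarn-default.xml). For a local test cluster it can be worth capping this; the 7-day value below is just a suggestion:

```xml
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
```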

mapred-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>localhost:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>localhost:19888</value>
    </property>
</configuration>
  • mapreduce.framework.name: run MapReduce on YARN
  • mapreduce.jobhistory.address: the address of the JobHistory server
  • mapreduce.jobhistory.webapp.address: the web UI address of the JobHistory server

Check JAVA_HOME

In hadoop-env.sh, mapred-env.sh, and yarn-env.sh, check that the JAVA_HOME path is set in all three files, as follows:

export JAVA_HOME=$JAVA_HOME
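One way to verify all three files at once is a small grep helper (the function name is mine, not a Hadoop tool; HADOOP_HOME as configured earlier):

```shell
# Returns success if the given env script contains a JAVA_HOME export
has_java_home() {
  grep -q '^export JAVA_HOME' "$1"
}

# Usage against the three Hadoop env scripts:
# for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
#   has_java_home "$HADOOP_HOME/etc/hadoop/$f" || echo "$f: JAVA_HOME not set"
# done
```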

4. Usage

  • Start the cluster.

Format the NameNode before the first start (only the first time; if you ever need to reformat, delete the files in the logs and data directories first):

hadoop namenode -format

Start the NameNode and DataNode:

hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
  • Start YARN
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
  • Start the JobHistory server
mr-jobhistory-daemon.sh start historyserver
  • You can use jps to check that everything is running
jps
35953 JobHistoryServer
32930
35682 NodeManager
35990 Jps
35559 DataNode
35624 ResourceManager
35502 NameNode
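The five daemon names can also be checked in one go. This small helper (my own function, not a Hadoop tool) takes `jps` output and reports anything missing:

```shell
# Check that the five expected daemons appear in a jps listing
check_daemons() {
  # $1: the output of `jps`
  missing=0
  for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
    echo "$1" | grep -q "$d" || { echo "$d missing"; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "all daemons running"
}

# Usage:
# check_daemons "$(jps)"
```

The function returns non-zero when something is missing, so it can also be used in a startup script.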
  • test

Create a folder named zxytest, put a text file in it, upload it to HDFS, and run wordcount:

hdfs dfs -put zxytest /
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /zxytest /zxyout
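The job writes its result into /zxyout as part-r-* files, one `word<TAB>count` line per word. What to expect can be sketched with a plain-shell approximation of wordcount (note that `uniq -c` puts the count first, the reverse of MapReduce’s word-then-count order):

```shell
# A local, plain-shell approximation of wordcount:
# split on spaces, sort, and count duplicates
echo "hello world hello hadoop" | tr ' ' '\n' | sort | uniq -c

# Then, on the cluster (requires the running HDFS from above):
# hdfs dfs -cat /zxyout/part-r-*
```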
  • Shut down
mr-jobhistory-daemon.sh stop historyserver
yarn-daemon.sh stop resourcemanager
yarn-daemon.sh stop nodemanager
hadoop-daemon.sh stop namenode
hadoop-daemon.sh stop datanode
  • Web UIs

All applications (ResourceManager): http://localhost:8088/

HDFS overview (NameNode): http://localhost:50070/

History server: http://localhost:19888/