Operating system: macOS Mojave 10.14.5

JDK: 1.8

Hadoop: 2.7.7

1. Java and Hadoop installation

Downloading and installing both is straightforward.

Note that:

  • I’m on a Mac, so JAVA_HOME is $(/usr/libexec/java_home); I use zsh, so edit ~/.zshrc, and don’t forget to source it afterwards.
#JAVA_HOME
export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=$JAVA_HOME/bin:$PATH
#HADOOP_HOME
export HADOOP_HOME=/usr/local/hadoop-2.7.7
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  • Because I put Hadoop in /usr/local/, you need to change the ownership of the hadoop folder:
sudo chown -R zxy:admin /usr/local/hadoop-2.7.7
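To confirm the chown took effect, a quick writability check is enough (the path is the install location assumed above):

```shell
# The directory should now be writable by your (non-root) user
if [ -w /usr/local/hadoop-2.7.7 ]; then
  echo "hadoop dir is writable"
else
  echo "hadoop dir is missing or not writable"
fi
```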

2. Configure SSH

ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# test
ssh localhost
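If `ssh localhost` still prompts for a password after this, the usual culprit is file permissions: sshd ignores an authorized_keys file that is group- or world-readable. A fix, using the standard permissions OpenSSH expects:

```shell
# sshd silently ignores keys if ~/.ssh or authorized_keys is too permissive;
# tighten both to the values OpenSSH requires
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

On macOS you also need Remote Login enabled (System Preferences > Sharing), or `ssh localhost` will be refused outright.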

3. Pseudo-distributed configuration

core-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop-2.7.7/data/tmp</value>
    </property>
</configuration>
  • fs.defaultFS: the address of the HDFS NameNode

  • hadoop.tmp.dir: the directory for Hadoop’s temporary files

hdfs-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/hdfs-site.xml

<configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
</configuration>
  • dfs.replication: the number of replicas kept for each HDFS block. The default is 3; since we only have one node here, set it to 1.
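Optionally, the NameNode and DataNode storage locations can be pinned explicitly instead of being derived from hadoop.tmp.dir. The property names are standard Hadoop 2.x (their defaults already resolve to these same subdirectories of hadoop.tmp.dir); the paths below just match the layout used in this post:

```xml
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop-2.7.7/data/tmp/dfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop-2.7.7/data/tmp/dfs/data</value>
</property>
```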

yarn-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://localhost:19888/jobhistory/logs</value>
    </property>
</configuration>
  • yarn.log-aggregation-enable: enables log aggregation
  • yarn.resourcemanager.hostname: the address of the YARN ResourceManager
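With aggregation enabled, logs are kept forever by default (yarn.log-aggregation.retain-seconds defaults to -1 in yarn-default.xml). For a local test cluster it can be worth capping this; the 7-day value below is just a suggestion:

```xml
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
```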

mapred-site.xml

Modify /usr/local/hadoop-2.7.7/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>localhost:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>localhost:19888</value>
    </property>
</configuration>
  • mapreduce.framework.name: run MapReduce on YARN
  • mapreduce.jobhistory.address: the address of the JobHistory server
  • mapreduce.jobhistory.webapp.address: the web UI address of the JobHistory server

Check JAVA_HOME

In hadoop-env.sh, mapred-env.sh, and yarn-env.sh, check that the JAVA_HOME path is set in all three files, as follows:

export JAVA_HOME=$JAVA_HOME
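One way to verify all three files at once is a small grep helper (the function name is mine, not a Hadoop tool; HADOOP_HOME as configured earlier):

```shell
# Returns success if the given env script contains a JAVA_HOME export
has_java_home() {
  grep -q '^export JAVA_HOME' "$1"
}

# Usage against the three Hadoop env scripts:
# for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
#   has_java_home "$HADOOP_HOME/etc/hadoop/$f" || echo "$f: JAVA_HOME not set"
# done
```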

4. Usage

  • Start the cluster.

Format the NameNode before the first start (only the first time; if you ever need to reformat, delete the files in the logs and data directories first):

hadoop namenode -format

Start the NameNode and DataNode:

hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
  • Start YARN
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
  • Start the JobHistory server
mr-jobhistory-daemon.sh start historyserver
  • You can use jps to check that everything is running
jps
35953 JobHistoryServer
32930
35682 NodeManager
35990 Jps
35559 DataNode
35624 ResourceManager
35502 NameNode
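The five daemon names can also be checked in one go. This small helper (my own function, not a Hadoop tool) takes `jps` output and reports anything missing:

```shell
# Check that the five expected daemons appear in a jps listing
check_daemons() {
  # $1: the output of `jps`
  missing=0
  for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
    echo "$1" | grep -q "$d" || { echo "$d missing"; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "all daemons running"
}

# Usage:
# check_daemons "$(jps)"
```

The function returns non-zero when something is missing, so it can also be used in a startup script.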
  • test

Create a folder named zxytest, put a text file in it, upload it to HDFS, and run wordcount:

hdfs dfs -put zxytest /
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /zxytest /zxyout
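The job writes its result into /zxyout as part-r-* files, one `word<TAB>count` line per word. What to expect can be sketched with a plain-shell approximation of wordcount (note that `uniq -c` puts the count first, the reverse of MapReduce’s word-then-count order):

```shell
# A local, plain-shell approximation of wordcount:
# split on spaces, sort, and count duplicates
echo "hello world hello hadoop" | tr ' ' '\n' | sort | uniq -c

# Then, on the cluster (requires the running HDFS from above):
# hdfs dfs -cat /zxyout/part-r-*
```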
  • Shut down
mr-jobhistory-daemon.sh stop historyserver
yarn-daemon.sh stop resourcemanager
yarn-daemon.sh stop nodemanager
hadoop-daemon.sh stop namenode
hadoop-daemon.sh stop datanode
  • Web UIs

All applications (ResourceManager): http://localhost:8088/

HDFS overview (NameNode): http://localhost:50070/

History server: http://localhost:19888/