1. Change the Linux host name

hostnamectl set-hostname dhf1

Or modify the configuration file

vim /etc/sysconfig/network 

NETWORKING=yes
HOSTNAME=dhf1

2. Modify the IP

vim /etc/sysconfig/network-scripts/ifcfg-eth0

systemctl restart network
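
For reference, a static configuration in ifcfg-eth0 might look like the following sketch; the values are illustrative only (IPADDR, NETMASK, GATEWAY, and DNS1 must be adapted to your own network):

# illustrative values; adjust to your network
TYPE=Ethernet
BOOTPROTO=static
NAME=eth0
DEVICE=eth0
ONBOOT=yes
IPADDR=192.xxx.xxx.227
NETMASK=255.255.255.0
GATEWAY=192.xxx.xxx.1
DNS1=192.xxx.xxx.1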

3. Modify the mapping between the host name and IP address

vim /etc/hosts

192.xxx.xxx.227 dhf1
192.xxx.xxx.228 dhf2
192.xxx.xxx.229 dhf3
192.xxx.xxx.230 dhf4
192.xxx.xxx.231 dhf5
192.xxx.xxx.232 dhf6
192.xxx.xxx.233 dhf7

4. Disable the firewall

systemctl status firewalld

systemctl stop firewalld

systemctl disable firewalld

5. Configure passwordless SSH login

ssh-keygen -t rsa    (press Enter four times to accept the defaults)

After the command finishes, two files are generated: id_rsa (the private key) and id_rsa.pub (the public key). Copy the public key to every machine (including the local machine) that you want to log in to without a password:

ssh-copy-id dhf1
Machine that generates the key pair    Machines the public key is copied to
dhf1                                   dhf1, dhf2, dhf3, dhf4, dhf5, dhf6, dhf7
dhf2                                   dhf1, dhf2
dhf3                                   dhf3, dhf4, dhf5, dhf6, dhf7
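
For example, on dhf1 the public key can be pushed to all seven machines with a simple loop (a convenience sketch; it assumes the host names above already resolve via /etc/hosts):

for host in dhf1 dhf2 dhf3 dhf4 dhf5 dhf6 dhf7; do
    ssh-copy-id "$host"    # prompts once for each host's password
done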

6. Install JDK and configure environment variables

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

source /etc/profile
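
To confirm the variables took effect, a quick check (the version output will vary with the exact OpenJDK build installed):

java -version
echo $JAVA_HOME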

7. Restart the server

reboot

8. Cluster planning

Host name   IP               Installed software        Running processes
dhf1        192.xxx.xxx.227  JDK, Hadoop               NameNode, DFSZKFailoverController (ZKFC)
dhf2        192.xxx.xxx.228  JDK, Hadoop               NameNode, DFSZKFailoverController (ZKFC)
dhf3        192.xxx.xxx.229  JDK, Hadoop               ResourceManager
dhf4        192.xxx.xxx.230  JDK, Hadoop               ResourceManager
dhf5        192.xxx.xxx.231  JDK, Hadoop, ZooKeeper    DataNode, NodeManager, JournalNode, QuorumPeerMain
dhf6        192.xxx.xxx.232  JDK, Hadoop, ZooKeeper    DataNode, NodeManager, JournalNode, QuorumPeerMain
dhf7        192.xxx.xxx.233  JDK, Hadoop, ZooKeeper    DataNode, NodeManager, JournalNode, QuorumPeerMain

Note: Since Hadoop 2.0, HDFS usually runs with two NameNodes, one active and one standby. The active NameNode serves client requests; the standby NameNode does not serve requests, it only synchronizes the active NameNode's state so that it can take over quickly if the active one fails. Hadoop officially provides two HDFS HA solutions, NFS and QJM; here we use the simpler QJM. In this scheme, metadata is synchronized between the active and standby NameNodes through a group of JournalNodes: a write is considered successful once it reaches a majority of the JournalNodes, so an odd number of JournalNodes is usually configured. A ZooKeeper cluster supports ZKFC (DFSZKFailoverController) failover: when the active NameNode fails, the standby NameNode automatically becomes active. There are likewise two ResourceManagers, one active and one standby, with their state coordinated through ZooKeeper. The NameNodes and ResourceManagers are placed on separate machines for performance reasons: both consume a lot of resources, so they are split apart and must be started on different hosts.
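
Once the cluster is running (section 11), the active/standby roles described here can be checked from the command line. A quick sketch, using the NameNode ids nn1/nn2 and ResourceManager ids rm1/rm2 that are defined later in hdfs-site.xml and yarn-site.xml:

hdfs haadmin -getServiceState nn1    # prints "active" or "standby"
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2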

9. Install ZooKeeper

9.1 Install and configure the ZooKeeper cluster

(Executed on dhf5)

cd /cdc/apache-zookeeper-3.5.8-bin/conf/
cp zoo_sample.cfg zoo.cfg

Modify zoo.cfg:

dataDir=/cdc/apache-zookeeper-3.5.8-bin/tmp
server.1=dhf5:2888:3888
server.2=dhf6:2888:3888
server.3=dhf7:2888:3888

Save and exit.

Then create the tmp directory:

mkdir /cdc/apache-zookeeper-3.5.8-bin/tmp

Create an empty file

touch /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

Finally, write the ID to the file

echo 1 > /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

9.2 Copying ZooKeeper to the other nodes

scp -r /cdc/apache-zookeeper-3.5.8-bin/ dhf6:/cdc/
scp -r /cdc/apache-zookeeper-3.5.8-bin/ dhf7:/cdc/

Note: modify /cdc/apache-zookeeper-3.5.8-bin/tmp/myid on dhf6 and dhf7:

dhf6: echo 2 > /cdc/apache-zookeeper-3.5.8-bin/tmp/myid
dhf7: echo 3 > /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

10. Hadoop installation

10.1 Installing and Configuring a Hadoop Cluster

(Executed on dhf1)

10.1.1 Adding Hadoop to Environment Variables

vim /etc/profile

export HADOOP_HOME=/cdc/hadoop-3.3.0
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
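
After saving, reload the profile and confirm the hadoop command is available (a quick check):

source /etc/profile
hadoop version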

10.1.2 Configuring HDFS

(All hadoop configuration files are in the $HADOOP_HOME/etc/hadoop directory)

The HADOOP_CLASSPATH value used below is the full Hadoop classpath (it can be obtained with the hadoop classpath command):

/cdc/hadoop-3.3.0/etc/hadoop:/cdc/hadoop-3.3.0/share/hadoop/common/lib/*:/cdc/hadoop-3.3.0/share/hadoop/common/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs:/cdc/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs/*:/cdc/hadoop-3.3.0/share/hadoop/mapreduce/*:/cdc/hadoop-3.3.0/share/hadoop/yarn:/cdc/hadoop-3.3.0/share/hadoop/yarn/lib/*:/cdc/hadoop-3.3.0/share/hadoop/yarn/*
10.1.2.1 Modify hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64
export HADOOP_CLASSPATH=/cdc/hadoop-3.3.0/etc/hadoop:/cdc/hadoop-3.3.0/share/hadoop/common/lib/*:/cdc/hadoop-3.3.0/share/hadoop/common/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs:/cdc/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs/*:/cdc/hadoop-3.3.0/share/hadoop/mapreduce/*:/cdc/hadoop-3.3.0/share/hadoop/yarn:/cdc/hadoop-3.3.0/share/hadoop/yarn/lib/*:/cdc/hadoop-3.3.0/share/hadoop/yarn/*
10.1.2.2 Modify core-site.xml
<configuration>
    <!-- Set the HDFS nameservice to ns1 -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <!-- Hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/cdc/hadoop-3.3.0/tmp</value>
    </property>
    <!-- Set the ZooKeeper quorum address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>dhf5:2181,dhf6:2181,dhf7:2181</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
</configuration>
10.1.2.3 Modify hdfs-site.xml
<configuration>
    <!-- Set the HDFS nameservice to ns1 -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>dhf1:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>dhf1:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>dhf2:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>dhf2:50070</value>
    </property>
    <!-- JournalNodes on which the NameNode metadata (edit log) is stored -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://dhf5:8485;dhf6:8485;dhf7:8485/ns1</value>
    </property>
    <!-- Location where the JournalNode stores data on the local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/cdc/hadoop-3.3.0/journal</value>
    </property>
    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Class that implements automatic failover on the client side -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods; separate multiple methods with newlines, one method per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
        sshfence
        shell(/bin/true)
        </value>
    </property>
    <!-- sshfence requires passwordless SSH, so point it at the private key -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- sshfence connection timeout (ms) -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
10.1.2.4 Modify mapred-site.xml
<configuration>
    <!-- Run the MapReduce framework on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
</configuration>
10.1.2.5 Modify yarn-site.xml
<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Specify the RM cluster ID -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>
    <!-- Specify the RM names -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- Specify the RM addresses -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>dhf3</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>dhf4</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>dhf3:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>dhf4:8088</value>
    </property>
    <!-- Specify the ZooKeeper cluster address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>dhf5:2181,dhf6:2181,dhf7:2181</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.application.classpath</name>
        <value>/cdc/hadoop-3.3.0/etc/hadoop:/cdc/hadoop-3.3.0/share/hadoop/common/lib/*:/cdc/hadoop-3.3.0/share/hadoop/common/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs:/cdc/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs/*:/cdc/hadoop-3.3.0/share/hadoop/mapreduce/*:/cdc/hadoop-3.3.0/share/hadoop/yarn:/cdc/hadoop-3.3.0/share/hadoop/yarn/lib/*:/cdc/hadoop-3.3.0/share/hadoop/yarn/*</value>
    </property>
</configuration>
10.1.2.6 Modifying the workers file

(The workers file lists the worker nodes. Because HDFS is started from dhf1 and YARN is started from dhf3, the workers file on dhf1 specifies where the DataNodes run, while the workers file on dhf3 specifies where the NodeManagers run.)

vim workers 

dhf5
dhf6
dhf7

10.2 Copying Configured Hadoop to other Nodes

scp -r /cdc/hadoop-3.3.0/ root@dhf2:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf3:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf4:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf5:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf6:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf7:/cdc/

11. Start the service

11.1 Starting the ZooKeeper Cluster

(Start ZooKeeper on dhf5, dhf6, and dhf7; the process is QuorumPeerMain)

cd /cdc/apache-zookeeper-3.5.8-bin/bin/
./zkServer.sh start

Check the status: one leader and two followers

./zkServer.sh status
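
To check all three nodes at once, a convenience sketch (it assumes passwordless SSH from section 5 and that JAVA_HOME is set for non-interactive shells):

for host in dhf5 dhf6 dhf7; do
    echo "== $host =="
    ssh "$host" /cdc/apache-zookeeper-3.5.8-bin/bin/zkServer.sh status
done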

11.2 Starting the JournalNodes

First clean up any data left over from a previous run:

cd /cdc/hadoop-3.3.0/; rm -rf journal/ns1/; rm -rf logs/; rm -rf tmp/

(Executed on dhf5, dhf6, and dhf7 respectively)

cd /cdc/hadoop-3.3.0/sbin/
./hadoop-daemon.sh start journalnode

Verify that the JournalNode process is now running on dhf5, dhf6, and dhf7.

11.3 Formatting HDFS

(Executed on dhf1)

hdfs namenode -format

Formatting generates the metadata directory /cdc/hadoop-3.3.0/tmp (the hadoop.tmp.dir configured in core-site.xml). Copy it to /cdc/hadoop-3.3.0/ on dhf2:

scp -r /cdc/hadoop-3.3.0/tmp/ root@dhf2:/cdc/hadoop-3.3.0/

11.4 Formatting ZKFC

(Executed on dhf1)

hdfs zkfc -formatZK

11.5 Starting HDFS

(Executed on dhf1)

cd /cdc/hadoop-3.3.0/sbin/
./start-dfs.sh

11.6 Starting YARN

(Executed on dhf3)

cd /cdc/hadoop-3.3.0/sbin/
./start-yarn.sh

12. Validation

192.xxx.xxx.228:50070

NameNode ‘dhf2:9000’ (active)

192.xxx.xxx.227:50070

NameNode ‘dhf1:9000’ (standby)

Check that all DataNodes are online.
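
Besides the web UI, the DataNodes can also be checked from the command line (a quick sketch; run it on either NameNode host):

hdfs dfsadmin -report    # lists live DataNodes and their capacity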

First, upload a file to HDFS

hadoop fs -mkdir /dhf

hadoop fs -put /test.txt /dhf

hadoop fs -ls /dhf

Kill the active NameNode (dhf2):

kill -9 16950
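
The PID 16950 above was obtained with jps on dhf2; it will differ on another cluster. A quick way to look it up before killing (a sketch):

jps | grep NameNode    # the first column is the PID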

Access through a browser: 192.xxx.xxx.227:50070

NameNode ‘dhf1:9000’ (active)

At this point, the NameNode on dhf1 has become active.

hadoop fs -ls /dhf

The uploaded file still exists

Manually restart the NameNode that was killed (on dhf2):

./hadoop-daemon.sh start namenode

Access through a browser: 192.xxx.xxx.228:50070

NameNode ‘dhf2:9000’ (standby)
