1. Prerequisites

Hadoop requires a JDK to run, so install one in advance. See the installation guide:

  • JDK installation under Linux

2. Configure Password-Free Login

Hadoop components communicate with each other over SSH, so key-based (password-free) login needs to be configured.

2.1 Mapping Configuration

Configure mapping between IP addresses and host names:

vim /etc/hosts
# Add at the end of the file
192.168.43.202  hadoop001
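
To verify the mapping, ping the host name (a quick check; 192.168.43.202 should be replaced with your machine's actual IP):

# The host name should now resolve to the configured IP
ping -c 1 hadoop001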

2.2 Generating Public and Private Keys

Run the following command to generate the key pair:

ssh-keygen -t rsa

2.3 Authorization

Go to the ~/.ssh directory, check the generated key pair, and append the public key to the authorized keys file:

[root@hadoop001 .ssh]# ll
-rw-------. 1 root root 1675 Mar 15 09:48 id_rsa
-rw-r--r--. 1 root root  388 Mar 15 09:48 id_rsa.pub

# Append the public key to the authorized keys file
[root@hadoop001 .ssh]# cat id_rsa.pub >> authorized_keys
[root@hadoop001 .ssh]# chmod 600 authorized_keys
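
At this point you should be able to SSH into the local machine without a password (a quick sanity check; the first connection may ask you to confirm the host key):

# Should log in without prompting for a password
ssh hadoop001
exit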

3. Hadoop (HDFS) Environment Setup

3.1 Downloading and Decompressing

Download the Hadoop installation package. The CDH version is used here, available at: archive.cloudera.com/cdh5/cdh/5/

# Unpack the archive
tar -zvxf hadoop-2.6.0-cdh5.15.2.tar.gz
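
The environment variables in the next step assume the unpacked directory lives under /usr/app; if you extracted it elsewhere, move it there or adjust HADOOP_HOME accordingly:

# Move the unpacked directory to the location referenced by HADOOP_HOME
mkdir -p /usr/app
mv hadoop-2.6.0-cdh5.15.2 /usr/app/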

3.2 Configuring Environment Variables

# vi /etc/profile

Configure environment variables:

export HADOOP_HOME=/usr/app/hadoop-2.6.0-cdh5.15.2
export PATH=${HADOOP_HOME}/bin:$PATH

Run the source command to make the configured environment variables take effect immediately:

# source /etc/profile
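A quick way to confirm that the variables took effect is to print the Hadoop version (assuming the paths above match your layout):

# Should print the version, e.g. Hadoop 2.6.0-cdh5.15.2
hadoop version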

3.3 Modifying Hadoop Configurations

Go to ${HADOOP_HOME}/etc/hadoop/ and modify the following configuration:

1. hadoop-env.sh

# JDK installation path
export JAVA_HOME=/usr/java/jdk1.8.0_201/

2. core-site.xml

<configuration>
    <property>
        <!-- Set the HDFS address of the NameNode -->
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:8020</value>
    </property>
    <property>
        <! -- Specify the directory where Hadoop stores temporary files -->
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
</configuration>
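It can help to create the temporary directory up front so that later steps can write to it (assuming the hadoop.tmp.dir value above):

# Create the directory configured as hadoop.tmp.dir
mkdir -p /home/hadoop/tmp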

3. hdfs-site.xml

Specify the replication factor:

<configuration>
    <property>
        <!-- Since this is a single-node setup, set the dfs replication factor to 1 -->
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

4. slaves

Configure the host name or IP address of all slave nodes. Since it is a single-node version, you can specify the local machine:

hadoop001

3.4 Disabling the Firewall

If the firewall is not disabled, you may be unable to access the Hadoop Web UIs:

# Check the firewall status
sudo firewall-cmd --state
# Disable the firewall
sudo systemctl stop firewalld.service
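
Note that stop only lasts until the next reboot; to keep the firewall off permanently, disable the service as well (standard systemd usage):

# Prevent the firewall from starting on boot
sudo systemctl disable firewalld.service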

3.5 Initialization

Hadoop must be initialized before first use. Go to ${HADOOP_HOME}/bin/ and run the following command:

[root@hadoop001 bin]# ./hdfs namenode -format
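If the format succeeds, the NameNode metadata directory is created under hadoop.tmp.dir (a quick check, assuming the /home/hadoop/tmp path configured earlier and the default name-directory settings):

# fsimage and VERSION files should exist after a successful format
ls /home/hadoop/tmp/dfs/name/current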

3.6 Starting HDFS

Go to ${HADOOP_HOME}/sbin/ and start HDFS:

[root@hadoop001 sbin]# ./start-dfs.sh

3.7 Verifying the Startup

Method 1: Run the jps command to check whether the NameNode and DataNode services are running.

[root@hadoop001 hadoop-2.6.0-cdh5.15.2]# jps
9137 DataNode
9026 NameNode
9390 SecondaryNameNode

Method 2: Open the Web UI at port 50070 (http://hadoop001:50070).
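
If no browser is available on the machine, a headless check works too (a minimal sketch; the host name relies on the /etc/hosts mapping configured earlier):

# An HTTP 200 response indicates the NameNode Web UI is reachable
curl -s -o /dev/null -w "%{http_code}\n" http://hadoop001:50070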

4. Hadoop (YARN) Environment Setup

4.1 Modifying the Configuration

Go to ${HADOOP_HOME}/etc/hadoop/ and modify the following configuration:

1. mapred-site.xml

# If mapred-site.xml does not exist, copy the sample file and modify it
cp mapred-site.xml.template mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

2. yarn-site.xml

<configuration>
    <property>
        <!-- Configure the auxiliary service that runs on the NodeManager. mapreduce_shuffle must be configured to run MapReduce programs on YARN. -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

4.2 Starting the Service

Go to ${HADOOP_HOME}/sbin/ and start YARN:

./start-yarn.sh

4.3 Verifying the Startup

Method 1: Run the jps command to check whether the NodeManager and ResourceManager services are running.

[root@hadoop001 hadoop-2.6.0-cdh5.15.2]# jps
9137 DataNode
9026 NameNode
12294 NodeManager
12185 ResourceManager
9390 SecondaryNameNode

Method 2: Open the Web UI at port 8088 (http://hadoop001:8088).
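
To confirm that MapReduce jobs actually run on YARN, you can submit one of the bundled examples (a sanity check; the jar path assumes the standard Hadoop 2.x package layout):

# Estimate pi with 2 mappers, 10 samples each; the job should appear in the 8088 UI
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10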