Hadoop Installation and Configuration

1. Download and install Hadoop

Download the Hadoop package and decompress it to a directory of your choice. The download URL is as follows:

hadoop.apache.org/releases.ht…
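From the command line, the decompression step might look like this (the tarball name and target directory are assumptions matching the paths used later in this post):

tar -xzf hadoop-3.1.4.tar.gz -C /Users/Bigdata/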

2. Set environment variables

Configure the JDK and Hadoop environment variables, for example in ~/.bash_profile:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_201.jdk/Contents/Home
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/Users/Bigdata/hadoop-3.1.4
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native/
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_ROOT_LOGGER=INFO,console
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
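After editing the profile, reload it and confirm the variables took effect (a quick sanity check):

source ~/.bash_profile
hadoop version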

3. Configure Hadoop

Go to the $HADOOP_HOME/etc/hadoop directory and modify the following files (minimal examples for the files whose contents are not shown here follow the list):

  • core-site.xml

  • hdfs-site.xml

  • mapred-site.xml
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.admin.user.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>/Users/Bigdata/hadoop-3.1.4/etc/hadoop,/Users/Bigdata/hadoop-3.1.4/share/hadoop/common/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/common/lib/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/hdfs/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/hdfs/lib/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/mapreduce/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/mapreduce/lib/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/yarn/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/yarn/lib/*</value>
    </property>
  • yarn-site.xml
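The post shows settings only for mapred-site.xml; below are minimal single-node examples of the other three files, following the standard Apache pseudo-distributed setup. The hadoop.tmp.dir path is an assumption matching the install directory used above:

core-site.xml:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <!-- assumed location; adjust to your install -->
        <name>hadoop.tmp.dir</name>
        <value>/Users/Bigdata/hadoop-3.1.4/tmp</value>
    </property>

hdfs-site.xml:

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

yarn-site.xml:

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>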

4. Run Hadoop

Startup prerequisite:

Enable Remote Login

On macOS, turn on Remote Login (System Preferences > Sharing), otherwise SSH connections to localhost will be refused. To set up SSH passwordless login, run the following command and press Enter through the prompts (if asked whether to overwrite an existing key, answer y):

ssh-keygen -t rsa
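The original stops at key generation; passwordless login to localhost typically also requires authorizing the key, after which you can verify the setup:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost    # should log in without prompting for a password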

1) Initialize HDFS (format the NameNode):

hdfs namenode -format

2) Start HDFS
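The command itself is not shown in the original; HDFS is started with the stock script from $HADOOP_HOME/sbin:

start-dfs.sh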

3) Check whether the daemons started (after start-dfs.sh you should typically see NameNode, DataNode, and SecondaryNameNode in the output):

jps

4) Check the NameNode web UI by opening the following URL: http://localhost:9870/dfshealth.html#tab-overview

5) Start YARN

start-yarn.sh

With that, everything has started successfully.

Open http://localhost:8088/cluster in your browser; after a successful start, the YARN cluster overview page appears.

6) Use HDFS

  • Create a directory

hdfs dfs -mkdir /input

  • List the root directory

hdfs dfs -ls /

Create three file directories in the same way.

7) Test Hadoop's WordCount starter example

Preparation: upload a test .txt file you have written to the /input directory by executing the following command:

hdfs dfs -put <local txt file path> /input
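For example (the local file name words.txt is illustrative; the target name inputWord matches the WordCount command below):

echo "hello hadoop hello world" > words.txt
hdfs dfs -put words.txt /input/inputWord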

Then go to the share/hadoop/mapreduce directory under $HADOOP_HOME and run the following command:

hadoop jar hadoop-mapreduce-examples-3.1.4.jar wordcount /input/inputWord /output/wordcount

The Hadoop UI now shows the running job.

View the generated output files:
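The original shows this via a screenshot; from the command line the output can typically be inspected as follows (the part-file name may vary):

hdfs dfs -ls /output/wordcount
hdfs dfs -cat /output/wordcount/part-r-00000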

If you want to delete folders, use the following command:

hadoop fs -rm -r /output

To enable Hadoop debug log output, add the following configuration to the ~/.bash_profile file:

export HADOOP_ROOT_LOGGER=DEBUG,console   # enable debug logging
export HADOOP_ROOT_LOGGER=INFO,console    # disable debug logging

5. Problems encountered:

1. The following warning is always displayed when running HDFS commands:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

The problem remains unresolved for the time being

2. The Tracking UI cannot be accessed

Go to the $HADOOP_HOME/sbin directory and run the command below to start the JobHistoryServer, then use jps to check whether it is running:
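The command itself is missing from the original; on Hadoop 3.x the JobHistoryServer is typically started with:

mapred --daemon start historyserver

(older releases use mr-jobhistory-daemon.sh start historyserver instead)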

Then visit the Tracking UI again.