Preface

As someone who will definitely be doing big data down the road, wouldn't I get torn apart by my teachers if I'd never touched Java or Hadoop? So I figured: I'll set up Hadoop on my overseas cloud host, then another one under Ubuntu on my Dell, maybe one more on an old Dell, and another on a Mac, and that can pass for a distributed cluster. Anyway: installing Hadoop on Ubuntu 17.04.

Main text

The Chinese-language materials on this are all too old, so I did a round of Google searching, and sure enough, that did the trick!

The tutorial I picked: www.admintome.com/blog/instal… Now on to the installation:

1. Install required software

# apt update && apt upgrade -y
# reboot

# apt install -y openjdk-8-jdk

# apt install ssh pdsh -y
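
A quick sanity check that everything installed (optional; your exact version strings will differ):

# java -version        # should report openjdk version "1.8.0_..."
# ssh -V               # prints the OpenSSH version banner
# which pdsh           # confirms pdsh is on the PATH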

2. Download Hadoop

# wget http://apache.cs.utah.edu/hadoop/common/stable/hadoop-2.8.2.tar.gz
# tar -xzvf hadoop-2.8.2.tar.gz
# cd hadoop-2.8.2/

The URL above seems to be defunct now. I found some newer mirrors; pick whichever you like:

apache.claz.org/hadoop/comm…

apache.claz.org/hadoop/comm…
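
Whichever mirror you use, it doesn't hurt to check the tarball against the checksum file Apache publishes next to each release (assuming your mirror also carries the .mds/.sha files; compare the hashes by eye):

# sha256sum hadoop-2.8.2.tar.gz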

That's the download and installation done.

Next, we enter the configuration phase:

We need to make some additions to our configuration, so edit the next few files with the appropriate contents:

etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr
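
Pointing JAVA_HOME at /usr works because Hadoop only needs $JAVA_HOME/bin/java, and apt installs the java launcher at /usr/bin/java. If you'd rather use the real JDK directory, resolve the symlink chain first (the path below is the usual Ubuntu location for openjdk-8; verify on your own box):

# readlink -f /usr/bin/java
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

In that case JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 would work as well.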

etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>


etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

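
To confirm Hadoop actually picked these values up, hdfs getconf can echo individual keys back (run from inside the hadoop-2.8.2 directory):

# bin/hdfs getconf -confKey fs.defaultFS
hdfs://localhost:9000
# bin/hdfs getconf -confKey dfs.replication
1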

Now, in order for the scripts to work, we need to set up passwordless SSH to localhost:

  $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  $ chmod 0600 ~/.ssh/authorized_keys
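
To verify it took, the following should now log you straight in without a password prompt (assuming sshd is running on the default port):

  $ ssh localhost
  $ exit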

Format the HDFS filesystem. The output should end with a line saying the storage directory has been successfully formatted.

# bin/hdfs namenode -format

And finally, start up HDFS.

# sbin/start-dfs.sh

After it starts up you can access the web interface for the NameNode at this URL: http://{server-ip}:50070 .
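
Before opening the browser, you can check that the daemons are actually alive with jps (it ships with the JDK); a healthy single-node HDFS shows something like this, though your PIDs will differ:

# jps
12208 NameNode
12356 DataNode
12581 SecondaryNameNode
12742 Jps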

Since mine is a cloud host, I could just open that URL in the browser directly.

Configure YARN

Create the directories we will need for YARN.

# bin/hdfs dfs -mkdir /user
# bin/hdfs dfs -mkdir /user/root
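
And verify they exist:

# bin/hdfs dfs -ls /
# bin/hdfs dfs -ls /user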

Edit etc/hadoop/mapred-site.xml and add the following contents:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
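
One caveat: a stock 2.8.x tarball may not contain etc/hadoop/mapred-site.xml at all, only a template. If that's the case on your machine, create it from the template first:

# cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml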

And edit etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
</configuration>

Start YARN:

# sbin/start-yarn.sh
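
If it comes up cleanly, jps should now list the two YARN daemons alongside the HDFS ones (again, PIDs will differ):

# jps
13112 ResourceManager
13420 NodeManager
12208 NameNode
12356 DataNode
12581 SecondaryNameNode
13644 Jps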

If the startup fails, you'll see error output like the following (this is what happened to me):

root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# sbin/start-yarn.sh
starting yarn daemons
resourcemanager running as process 16803. Stop it first.
localhost: starting nodemanager, logging to /home/hustwolf/Hadoop/hadoop-2.8.2/logs/yarn-root-nodemanager-HustWolfzzb.out
root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# kill -9 16803
root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hustwolf/Hadoop/hadoop-2.8.2/logs/yarn-root-resourcemanager-HustWolfzzb.out
localhost: nodemanager running as process 17374. Stop it first.
root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# ls

So let's just shut everything down and start over: run the stop scripts in sbin, either stop-all.sh or the pair stop-dfs.sh and stop-yarn.sh, then bring it all back up.
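
That is, roughly:

# sbin/stop-yarn.sh
# sbin/stop-dfs.sh
# sbin/start-dfs.sh
# sbin/start-yarn.sh

(or sbin/stop-all.sh and sbin/start-all.sh, which still work in 2.8.x even though they are marked deprecated)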

You can now view the web interface at

http://{server-ip}:8088 .

Testing our installation

In order to test that everything is working we can run a MapReduce job using YARN:

# bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.2.jar pi 16 1000

This uses the quasi-Monte Carlo method to estimate PI, with 16 map tasks and 1000 samples per map: random points are thrown into a unit square, and the fraction landing inside the inscribed circle approximates PI/4. After a minute or two you should get your result:

Job Finished in 96.095 seconds
Estimated value of Pi is 3.14250000000000000000

I have come across a vexing problem here:

Number of Maps  = 16
Samples per Map = 1000
17/11/24 07:49:52 WARN ipc.Client: Failed to connect to server: localhost/127.0.0.1:9000: try once and fail.
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:682)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:778)
	at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1544)
	at org.apache.hadoop.ipc.Client.call(Client.java:1375)
	at org.apache.hadoop.ipc.Client.call(Client.java:1339)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:792)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
	at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1704)
	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1433)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)
	at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:278)
	at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:358)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
java.net.ConnectException: Call From HustWolfzzb/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1487)
	at org.apache.hadoop.ipc.Client.call(Client.java:1429)
	at org.apache.hadoop.ipc.Client.call(Client.java:1339)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:792)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
	at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1704)
	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1433)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)
	at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:278)
	at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:358)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:682)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:778)
	at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1544)
	at org.apache.hadoop.ipc.Client.call(Client.java:1375)
	... 38 more

I haven't solved it yet, and there's really nothing else left for me to change. If I find a solution, I'll post it in the comments section or edit it into the article directly. Thanks, and bye~!
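
For anyone who hits the same wall: "Connection refused" on localhost:9000 generally means nothing is listening there, i.e. the NameNode isn't actually running. Some standard things to check (no guarantee any of these is the fix in my case):

# jps                                          # is NameNode in the list at all?
# tail -n 50 logs/hadoop-root-namenode-*.log   # if not, the log usually says why it died
# cat /etc/hosts                               # Ubuntu's 127.0.1.1 hostname entry sometimes confuses Hadoop
# bin/hdfs namenode -format                    # re-format as a last resort (this wipes HDFS data!)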

This should be enough to get you started on your Hadoop journey. Subscribe to my newsletter below to get notifications of more Hadoop articles.

I hope you enjoyed this post. If it was helpful or if it was way off then please comment and let me know.

Looks like it worked?? Or did I miss the part about creating the HDFS user directories? Or was it the reboot, or some other missing step? Just when my expectations were at their highest, reality hit me hard. Alright, GG. Still, I found plenty of useful tutorials:

Hadoop Environment Setup / Hadoop 2.7.1 Installation and Configuration

Afterword

The foreign author's English writing is really quite good, so I won't change it. Even if you don't fully understand it, you can surely grope your way to the point with Baidu Translate's help; and if you really can't, just leave a comment and ask me. The commands are all laid out for you above, and you still can't manage it? No such thing!!