Setting up a Hadoop cluster environment on a single machine

1. Preparation

First, create a folder with the following structure:

weim@weim:~/myopt$ ls
ubuntu1  ubuntu2  ubuntu3

Then extract the downloaded JDK (version 8u172) and Hadoop (version hadoop-2.9.1) into each of the three folders, as follows:

weim@weim:~/myopt$ ls ubuntu1
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu2
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu3
hadoop  jdk
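
The layout above could be produced with a short script along these lines. This is only a sketch: the archive names in JDK_TGZ and HADOOP_TGZ and the helper prepare_node are hypothetical, and it assumes each archive unpacks into a single top-level directory.

```shell
# Hypothetical archive names; replace them with the files actually downloaded.
JDK_TGZ="${JDK_TGZ:-jdk-8u172-linux-x64.tar.gz}"
HADOOP_TGZ="${HADOOP_TGZ:-hadoop-2.9.1.tar.gz}"
BASE="${BASE:-$HOME/myopt}"

# prepare_node NAME: create $BASE/NAME/jdk and $BASE/NAME/hadoop, then unpack
# the two archives into them.
prepare_node() {
  mkdir -p "$BASE/$1/jdk" "$BASE/$1/hadoop" || return 1
  # --strip-components=1 drops each archive's own top-level directory.
  tar -xzf "$JDK_TGZ" -C "$BASE/$1/jdk" --strip-components=1 || return 1
  tar -xzf "$HADOOP_TGZ" -C "$BASE/$1/hadoop" --strip-components=1
}
```

Calling prepare_node ubuntu1, and then the same for ubuntu2 and ubuntu3, should give the structure listed above.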

2. Prepare three machines

Docker is used here to create the three machines, based on the ubuntu:16.04 image.

weim@weim:~/myopt$ docker image ls
REPOSITORY                                          TAG                 IMAGE ID            CREATED             SIZE
ubuntu                                              16.04               f975c5035748        2 months ago        112MB

Start three Ubuntu containers and mount the local directories ~/myopt/ubuntu1, ~/myopt/ubuntu2 and ~/myopt/ubuntu3 at the /home/software path of the respective container.


weim@weim:~/myopt$ docker run --hostname ubuntu1 --name ubuntu1 -v /home/weim/myopt/ubuntu1:/home/software -it --rm  ubuntu:16.04 bash
root@ubuntu1:/# ls /home/software/
hadoop  jdk


weim@weim:~/myopt$ docker run --hostname ubuntu2 --name ubuntu2 -v /home/weim/myopt/ubuntu2:/home/software -it --rm  ubuntu:16.04 bash
root@ubuntu2:/# ls /home/software/
hadoop  jdk


weim@weim:~/myopt$ docker run --hostname ubuntu3 --name ubuntu3 -v /home/weim/myopt/ubuntu3:/home/software -it --rm  ubuntu:16.04 bash
root@ubuntu3:/# ls /home/software/
hadoop  jdk

This gives us three bare-bones machines.

View machine information:

weim@weim:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS              PORTS               NAMES
b4c6de2a4326        ubuntu:16.04        "bash"              About a minute ago   Up About a minute                       ubuntu2
53d1f6389710        ubuntu:16.04        "bash"              About a minute ago   Up About a minute                       ubuntu3
0f210a01d47f        ubuntu:16.04        "bash"              About a minute ago   Up About a minute                       ubuntu1
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu1
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu2
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu3
// Each command prints the IP address of the corresponding machine
// The three machines are on the same LAN
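
The three lookups can also be folded into one loop. A small sketch, where print_ips is a hypothetical helper; it prints each address next to its container name, which makes the three values easy to compare.

```shell
# print_ips NAME...: print "IP NAME" for each container, one per line.
print_ips() {
  for name in "$@"; do
    ip=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' "$name") || return 1
    printf '%s %s\n' "$ip" "$name"
  done
}
```

On the host, this would be called as print_ips ubuntu1 ubuntu2 ubuntu3.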

3. Install some necessary software

Install the necessary software on all three machines. First run the apt-get update command to refresh the Ubuntu package index.

Then install vim and openssh-server.
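
As a concrete sketch of these two steps, assuming they are run as root inside each container (install_prereqs is a hypothetical helper name):

```shell
# install_prereqs: refresh the package index, then install vim and the SSH server.
# -y answers the confirmation prompt automatically.
install_prereqs() {
  apt-get update || return 1
  apt-get install -y vim openssh-server
}
```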

4. Environment configuration

a. Configure the Java environment first by appending the Java path settings to the end of /etc/profile.

root@ubuntu1:/home/software/jdk# vim /etc/profile
//Add the following configuration to the end of the profile file
#set jdk environment  
export JAVA_HOME=/home/software/jdk 
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH

root@ubuntu1:/home/software/jdk# source /etc/profile  
root@ubuntu1:/home/software/jdk# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

b. Set up passwordless SSH access

root@ubuntu1:/home/software/jdk# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/
The key fingerprint is:
SHA256:hSMrNTp6/1d7L/QZGKdTCPivDJspbY2tcyjke2qjpBI root@ubuntu1
The key's randomart image is:
+---[RSA 2048]----+
|          .      |
|         o .     |
|      + o o . .  |
|     o + o . o o |
|    + . S   . *  |
| E . o . .  .=.. |
|  o ..o . @..o..o|
| . .o. * @.*. o..|
|  .. .++Xo+  . o.|
root@ubuntu1:/home/software/jdk# cd ~/.ssh
root@ubuntu1:~/.ssh# ls
root@ubuntu1:~/.ssh# cat >> authorized_keys
root@ubuntu1:~/.ssh# chmod 600 authorized_keys 
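
Run on its own, cat >> authorized_keys waits for input on the terminal, so the public key has to be supplied from somewhere. A minimal sketch of the usual approach, assuming the key pair was written to the default location offered by ssh-keygen, so that the public key is ~/.ssh/id_rsa.pub (authorize_key is a hypothetical helper):

```shell
# authorize_key PUBKEY AUTH_FILE: append PUBKEY to AUTH_FILE unless the same
# line is already there, then restrict AUTH_FILE to its owner (chmod 600).
authorize_key() {
  touch "$2" &&
  { grep -qxF -f "$1" "$2" || cat "$1" >> "$2"; } &&
  chmod 600 "$2"
}
```

Under that assumption the call would be authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. Running it twice leaves a single entry in the file.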

Once the configuration is complete, run ssh localhost to verify that the local machine can be accessed without a password. Make sure the SSH service is running first; if it is not, start it with /etc/init.d/ssh start.

root@ubuntu1:/home/software# /etc/init.d/ssh start
 * Starting OpenBSD Secure Shell server sshd                                                                                                               [ OK ] 
root@ubuntu1:/home/software# ssh localhost
The authenticity of host 'localhost (' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-41-generic x86_64)

 * Documentation:
 * Management:
 * Support:

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

root@ubuntu1:~# exit
Connection to localhost closed.

Copy the authorized_keys file to the ubuntu2 and ubuntu3 containers. (I don't know the root password for ubuntu2, so for the time being I can't copy it with scp; what follows is a workaround.)

First enter the ~/.ssh directory and copy the authorized_keys file to the /home/software path.

root@ubuntu1:~/.ssh# ls
authorized_keys  id_rsa  known_hosts
root@ubuntu1:~/.ssh# cp authorized_keys /home/software/
root@ubuntu1:~/.ssh# ls /home/software/
authorized_keys  hadoop  jdk

Then, back on the host system, you can see the file you just copied under the ~/myopt/ubuntu1 path; copy it to ubuntu2 and ubuntu3.

weim@weim:~/myopt/ubuntu1$ ls
authorized_keys  hadoop  jdk
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu2/
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu3/

Then go back to the ubuntu2 and ubuntu3 containers and copy the file into the ~/.ssh directory.

root@ubuntu2:/home/software# cp authorized_keys ~/.ssh
root@ubuntu2:/home/software# ls ~/.ssh
authorized_keys  id_rsa

Verify that ubuntu1 can access ubuntu2 and ubuntu3 without a password (see above for the IP addresses):

root@ubuntu1:~/.ssh# ssh root@
root@ubuntu1:~/.ssh# ssh root@

5. Hadoop environment configuration

Take ubuntu1 as an example; ubuntu2 and ubuntu3 are configured the same way.

First, create a data save directory for hadoop.

root@ubuntu1:/home/software/hadoop# mkdir data
root@ubuntu1:/home/software/hadoop# cd data/
root@ubuntu1:/home/software/hadoop/data# mkdir tmp
root@ubuntu1:/home/software/hadoop/data# mkdir data
root@ubuntu1:/home/software/hadoop/data# mkdir checkpoint
root@ubuntu1:/home/software/hadoop/data# mkdir name
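
The same directories can be created in one step with mkdir -p, which also creates missing parent directories and does not fail if a directory already exists. A small sketch (make_data_dirs is a hypothetical helper name):

```shell
# make_data_dirs HADOOP_DIR: create the four storage directories under HADOOP_DIR/data.
make_data_dirs() {
  mkdir -p "$1/data/tmp" "$1/data/data" "$1/data/checkpoint" "$1/data/name"
}
```

For the layout used here, the call would be make_data_dirs /home/software/hadoop.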

Enter the /home/software/hadoop/etc/hadoop directory.

Modify the file to set the Java path:

export JAVA_HOME=/home/software/jdk

Configure core-site.xml
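
As a rough sketch only, a core-site.xml for this layout might look like the following. The property names fs.defaultFS and hadoop.tmp.dir are standard Hadoop settings, but the values are assumptions: ubuntu1 as the NameNode host (the startup log reports "Starting namenodes on [ubuntu1]"), port 9000, and the data/tmp directory created earlier. CONF_DIR is a hypothetical variable that defaults to the current directory.

```shell
# Directory holding the Hadoop config files; defaults to the current directory.
CONF_DIR="${CONF_DIR:-.}"

# Assumed values: NameNode on ubuntu1, port 9000, temp dir under data/tmp.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ubuntu1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/software/hadoop/data/tmp</value>
  </property>
</configuration>
EOF
```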


Configure hdfs-site.xml
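
A possible hdfs-site.xml, again as a sketch under assumptions: the three storage properties are standard HDFS settings pointed at the name, data and checkpoint directories created earlier, and the replication factor of 3 is a guess based on the three DataNodes that appear in the startup log. CONF_DIR is a hypothetical variable that defaults to the current directory.

```shell
# Directory holding the Hadoop config files; defaults to the current directory.
CONF_DIR="${CONF_DIR:-.}"

# Assumed values: replication factor 3, storage under /home/software/hadoop/data.
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/software/hadoop/data/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/software/hadoop/data/data</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///home/software/hadoop/data/checkpoint</value>
  </property>
</configuration>
EOF
```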


Configure mapred-site.xml
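
A minimal mapred-site.xml sketch, assuming MapReduce jobs are meant to run on YARN, which the ResourceManager and NodeManager processes in the startup log suggest. Hadoop 2.x distributions usually ship only mapred-site.xml.template, so the file may have to be created first. CONF_DIR is a hypothetical variable that defaults to the current directory.

```shell
# Directory holding the Hadoop config files; defaults to the current directory.
CONF_DIR="${CONF_DIR:-.}"

# Assumed value: run MapReduce on YARN rather than the local job runner.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```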


Configure yarn-site.xml
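
A yarn-site.xml sketch under two assumptions: the ResourceManager runs on ubuntu1 (the startup log writes its output to a file ending in resourcemanager-ubuntu1.out), and the NodeManagers provide the standard mapreduce_shuffle auxiliary service. CONF_DIR is a hypothetical variable that defaults to the current directory.

```shell
# Directory holding the Hadoop config files; defaults to the current directory.
CONF_DIR="${CONF_DIR:-.}"

# Assumed values: ResourceManager on ubuntu1, shuffle service for MapReduce.
cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>ubuntu1</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```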


Configure slaves
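
The slaves file lists one worker hostname per line. A sketch, assuming all three machines act as workers, since the startup log shows a DataNode and a NodeManager starting on ubuntu1, ubuntu2 and ubuntu3. CONF_DIR is a hypothetical variable that defaults to the current directory.

```shell
# Directory holding the Hadoop config files; defaults to the current directory.
CONF_DIR="${CONF_DIR:-.}"

# One worker hostname per line (assumed: all three machines are workers).
printf '%s\n' ubuntu1 ubuntu2 ubuntu3 > "$CONF_DIR/slaves"
```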

6. Startup

On ubuntu1, enter the /home/software/hadoop/bin directory and run hdfs namenode -format to initialize HDFS.

root@ubuntu1:/home/software/hadoop/bin# ./hdfs namenode -format

On ubuntu1, enter the /home/software/hadoop/sbin directory and start the cluster.


root@ubuntu1:/home/software/hadoop/sbin# ./
This script is Deprecated. Instead use and
Starting namenodes on [ubuntu1]
The authenticity of host 'ubuntu1 (' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu1: Warning: Permanently added 'ubuntu1,' (ECDSA) to the list of known hosts.
ubuntu1: starting namenode, logging to /home/software/hadoop/logs/hadoop-root-namenode-ubuntu1.out
starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu1.out
starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu2.out
starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu3.out
Starting secondary namenodes []
The authenticity of host ' (' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '' (ECDSA) to the list of known hosts.
starting secondarynamenode, logging to /home/software/hadoop/logs/hadoop-root-secondarynamenode-ubuntu1.out
starting yarn daemons
starting resourcemanager, logging to /home/software/hadoop/logs/yarn--resourcemanager-ubuntu1.out
starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu1.out
starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu3.out
starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu2.out
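
The deprecation notice in the output suggests starting HDFS and YARN separately. A minimal sketch, assuming the stock Hadoop 2.x script names start-dfs.sh and start-yarn.sh in the sbin directory (start_cluster is a hypothetical wrapper):

```shell
# start_cluster SBIN_DIR: start HDFS first, then YARN; YARN is skipped if the
# HDFS script exits with an error.
start_cluster() {
  "$1/start-dfs.sh" &&
  "$1/start-yarn.sh"
}
```

For the layout used here, the call would be start_cluster /home/software/hadoop/sbin.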

Check the startup status


root@ubuntu1:/home/software/hadoop/sbin# jps
3827 SecondaryNameNode
3686 DataNode
4007 ResourceManager
4108 NodeManager
4158 Jps


root@ubuntu2:/home/software/hadoop/sbin# jps
3586 Jps
3477 DataNode
3545 NodeManager


root@ubuntu3:/home/software/hadoop/sbin# jps
3472 DataNode
3540 NodeManager
3582 Jps
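
Comparing several jps listings by eye is error-prone, since names such as NameNode and SecondaryNameNode are easy to confuse. A small sketch of an exact-name check (missing_daemons is a hypothetical helper, and which daemons to expect on which machine is for the reader to decide):

```shell
# missing_daemons NAME...: read jps output on stdin and print every expected
# daemon name that does not appear as an exact match in the second column.
missing_daemons() {
  listing=$(cat)
  for d in "$@"; do
    printf '%s\n' "$listing" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      printf '%s\n' "$d"
  done
}
```

On a worker such as ubuntu2, running jps | missing_daemons DataNode NodeManager prints nothing when both processes are present.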

Next, we can open the cluster's web pages in a browser to see some information about the cluster.
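
With an unmodified Hadoop 2.x configuration, the NameNode web UI listens on port 50070 and the ResourceManager web UI on port 8088. The sketch below prints both addresses for a given host. The ports are an assumption that holds only if they were not overridden in hdfs-site.xml or yarn-site.xml, and web_ui_urls is a hypothetical helper.

```shell
# web_ui_urls HOST: print the default Hadoop 2.x web UI addresses for HOST.
web_ui_urls() {
  printf 'http://%s:50070\n' "$1"   # NameNode (HDFS) web UI, default port
  printf 'http://%s:8088\n' "$1"    # ResourceManager (YARN) web UI, default port
}
```

Pass it the IP address of ubuntu1 as reported by docker inspect.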

