1. Preparation
First create the folders with the following structure:
weim@weim:~/myopt$ ls
ubuntu1  ubuntu2  ubuntu3
Then extract the downloaded JDK (version 8u172) and Hadoop (version hadoop-2.9.1) into each of the three folders, as follows:
weim@weim:~/myopt$ ls ubuntu1
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu2
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu3
hadoop  jdk
2. Prepare three machines
Use Docker to create the three machines, based on the ubuntu:16.04 image:
weim@weim:~/myopt$ docker image ls
REPOSITORY   TAG     IMAGE ID       CREATED        SIZE
ubuntu       16.04   f975c5035748   2 months ago   112MB
Start three Ubuntu containers, mounting the local ~/myopt/ubuntu1, ~/myopt/ubuntu2, and ~/myopt/ubuntu3 folders at each container's /home/software path, respectively.
ubuntu1
weim@weim:~/myopt$ docker run --hostname ubuntu1 --name ubuntu1 -v /home/weim/myopt/ubuntu1:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu1:/# ls /home/software/
hadoop  jdk
ubuntu2
weim@weim:~/myopt$ docker run --hostname ubuntu2 --name ubuntu2 -v /home/weim/myopt/ubuntu2:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu2:/# ls /home/software/
hadoop  jdk
ubuntu3
weim@weim:~/myopt$ docker run --hostname ubuntu3 --name ubuntu3 -v /home/weim/myopt/ubuntu3:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu3:/# ls /home/software/
hadoop  jdk
This creates the three basic machines.
View the machine information:
weim@weim:~$ docker ps -a
CONTAINER ID   IMAGE          COMMAND   CREATED              STATUS              PORTS   NAMES
b4c6de2a4326   ubuntu:16.04   "bash"    About a minute ago   Up About a minute           ubuntu2
53d1f6389710   ubuntu:16.04   "bash"    About a minute ago   Up About a minute           ubuntu3
0f210a01d47f   ubuntu:16.04   "bash"    About a minute ago   Up About a minute           ubuntu1
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu1
172.17.0.2
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu2
172.17.0.4
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu3
172.17.0.3

The addresses above are each machine's IP; all three containers sit on the same Docker bridge network, i.e. the same LAN.
3. Install some necessary software
Install the necessary software on all three machines. First execute apt-get update to refresh Ubuntu's package index, then install vim and openssh-server.
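The two steps above can be scripted; a minimal sketch to run inside each of the three containers (guarded so it is a no-op on systems without apt-get):

```shell
# Run inside each container (ubuntu1, ubuntu2, ubuntu3).
if command -v apt-get >/dev/null 2>&1; then
  apt-get update                         # refresh the package index
  apt-get install -y vim openssh-server  # editor + SSH daemon needed below
fi
```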
4. Environment configuration
a. Configure the Java environment first, appending the Java paths to the end of /etc/profile:
root@ubuntu1:/home/software/jdk# vim /etc/profile

Add the following to the end of the profile file:

#set jdk environment
export JAVA_HOME=/home/software/jdk
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH

root@ubuntu1:/home/software/jdk# source /etc/profile
root@ubuntu1:/home/software/jdk# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
b. Set up passwordless SSH access
root@ubuntu1:/home/software/jdk# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:hSMrNTp6/1d7L/QZGKdTCPivDJspbY2tcyjke2qjpBI root@ubuntu1
The key's randomart image is:
+---[RSA 2048]----+
| . |
| o . |
| + o o . . |
| o + o . o o |
| + . S . * |
| E . o . . .=.. |
| o ..o . @..o..o|
| . .o. * @.*. o..|
| .. .++Xo+ . o.|
+----[SHA256]-----+
root@ubuntu1:/home/software/jdk# cd ~/.ssh
root@ubuntu1:~/.ssh# ls
id_rsa  id_rsa.pub
root@ubuntu1:~/.ssh# cat id_rsa.pub >> authorized_keys
root@ubuntu1:~/.ssh# chmod 600 authorized_keys
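The interactive prompts above can be skipped; a non-interactive sketch of the same key setup (-N '' gives the empty passphrase used in the transcript, and the guard avoids overwriting an existing key):

```shell
# Non-interactive equivalent of the key setup above.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```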
Once the configuration is complete, verify that the local machine can be reached without a password via ssh localhost. First make sure the SSH service is running; if it is not, start it with /etc/init.d/ssh start.
root@ubuntu1:/home/software# /etc/init.d/ssh start
 * Starting OpenBSD Secure Shell server sshd    [ OK ]
root@ubuntu1:/home/software# ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-41-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

root@ubuntu1:~# exit
logout
Connection to localhost closed.
root@ubuntu1:/home/software#
Copy the authorized_keys file to the ubuntu2 and ubuntu3 containers. (I don't know the root password inside ubuntu2, so I can't copy it with scp for now; the following is a workaround that goes through the mounted volume.)
First enter the ~/.ssh directory and copy the authorized_keys file to the /home/software path:
root@ubuntu1:~/.ssh# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
root@ubuntu1:~/.ssh# cp authorized_keys /home/software/
root@ubuntu1:~/.ssh# ls /home/software/
authorized_keys  hadoop  jdk
Then, back on the host, the file just copied appears under the ~/myopt/ubuntu1 path; copy it into ubuntu2 and ubuntu3:
weim@weim:~/myopt/ubuntu1$ ls
authorized_keys  hadoop  jdk
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu2/
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu3/
Then go back into the ubuntu2 and ubuntu3 containers and copy the file into the ~/.ssh directory (ubuntu2 shown; repeat in ubuntu3):
root@ubuntu2:/home/software# cp authorized_keys ~/.ssh
root@ubuntu2:/home/software# ls ~/.ssh
authorized_keys  id_rsa  id_rsa.pub
Verify that ubuntu1 can log in to ubuntu2 and ubuntu3 without a password, using the IPs obtained earlier:
root@ubuntu1:~/.ssh# ssh root@172.17.0.3
root@ubuntu1:~/.ssh# ssh root@172.17.0.4
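If you would rather ssh by hostname than by IP, you can append the Docker-assigned addresses from step 2 to /etc/hosts in each container (a sketch; these addresses change whenever the containers are recreated, so re-check them with docker inspect first):

```
172.17.0.2 ubuntu1
172.17.0.4 ubuntu2
172.17.0.3 ubuntu3
```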
5. Hadoop environment configuration
Take ubuntu1 as the example; ubuntu2 and ubuntu3 are configured the same way.
First, create the data directories for Hadoop:
root@ubuntu1:/home/software/hadoop# mkdir data
root@ubuntu1:/home/software/hadoop# cd data/
root@ubuntu1:/home/software/hadoop/data# mkdir tmp
root@ubuntu1:/home/software/hadoop/data# mkdir data
root@ubuntu1:/home/software/hadoop/data# mkdir checkpoint
root@ubuntu1:/home/software/hadoop/data# mkdir name
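The four mkdir calls above can be collapsed into one command; a sketch, run from the Hadoop install directory:

```shell
# Create all of Hadoop's local data directories at once.
# Run from the Hadoop install directory (/home/software/hadoop here).
mkdir -p data/tmp data/data data/checkpoint data/name
```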
Enter the /home/software/hadoop/etc/hadoop directory.
Modify the hadoop-env.sh file to set the Java path:
export JAVA_HOME=/home/software/jdk
Configure core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.17.0.2:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/software/hadoop/data/tmp</value>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>65536</value>
    </property>
</configuration>
Configure hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/software/hadoop/data/name</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>67108864</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/software/hadoop/data/data</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/home/software/hadoop/data/checkpoint</value>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>10</value>
    </property>
    <property>
        <name>dfs.datanode.handler.count</name>
        <value>10</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address</name>
        <value>172.17.0.2:9000</value>
    </property>
</configuration>
Configure mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Configure yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>172.17.0.2</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Configure slaves
172.17.0.2
172.17.0.3
172.17.0.4
6. Startup
In ubuntu1, enter the /home/software/hadoop/bin directory and execute hdfs namenode -format to initialize HDFS:
root@ubuntu1:/home/software/hadoop/bin# ./hdfs namenode -format
In ubuntu1, enter the /home/software/hadoop/sbin directory and execute start-all.sh:
root@ubuntu1:/home/software/hadoop/sbin# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [ubuntu1]
The authenticity of host 'ubuntu1 (172.17.0.2)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu1: Warning: Permanently added 'ubuntu1,172.17.0.2' (ECDSA) to the list of known hosts.
ubuntu1: starting namenode, logging to /home/software/hadoop/logs/hadoop-root-namenode-ubuntu1.out
172.17.0.2: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu1.out
172.17.0.4: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu2.out
172.17.0.3: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu3.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/software/hadoop/logs/hadoop-root-secondarynamenode-ubuntu1.out
starting yarn daemons
starting resourcemanager, logging to /home/software/hadoop/logs/yarn--resourcemanager-ubuntu1.out
172.17.0.2: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu1.out
172.17.0.3: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu3.out
172.17.0.4: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu2.out
Check the started processes with jps:
ubuntu1
root@ubuntu1:/home/software/hadoop/sbin# jps
3827 SecondaryNameNode
3686 DataNode
4007 ResourceManager
4108 NodeManager
4158 Jps
ubuntu2
root@ubuntu2:/home/software/hadoop/sbin# jps
3586 Jps
3477 DataNode
3545 NodeManager
ubuntu3
root@ubuntu3:/home/software/hadoop/sbin# jps
3472 DataNode
3540 NodeManager
3582 Jps
Finally, visit http://172.17.0.2:50070 (the HDFS web UI) and http://172.17.0.2:8088 (the YARN web UI) to see the cluster information.