Configuration environment: macOS 10.14.5
Hadoop version: 3.2.1
Date: February 29, 2020
Install Homebrew
Homebrew is the package manager commonly used on macOS and needs little introduction. Install it with:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
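To verify that Homebrew itself installed correctly before continuing, a quick sanity check is:
brew --version
brew doctor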
SSH login to localhost
Configure passwordless SSH login to the local machine:
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
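A common reason for ssh still prompting for a password after this is overly loose permissions on the key files; assuming the default ~/.ssh location, they can be tightened with:
chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys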
Then execute
ssh localhost
If an error is reported, Remote Login may not be enabled on the Mac.
Open "System Preferences" -> "Sharing" -> "Remote Login" to enable it.
If login succeeds, you will see a message similar to:
Last login: Sun Mar 1 12:18:15 2020 from ::1
Install Hadoop
brew install hadoop
OpenJDK, Hadoop, and autoconf will be downloaded into Homebrew's default directory, /usr/local/Cellar. It is recommended to use a proxy in the terminal for this step, otherwise downloading OpenJDK can be quite slow.
After installation, Hadoop already works in standalone mode.
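To confirm the standalone installation, printing the version is a quick check (this assumes brew has linked the hadoop command onto your PATH, which it does by default):
hadoop version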
Configure pseudo distributed mode
The files we need to edit are all in the directory /usr/local/Cellar/hadoop/3.2.1_/libexec/etc/hadoop/
hadoop-env.sh
Configure JAVA_HOME to point to my own JDK (the brew installation should set this by default, but perhaps because I already had JDK 8 installed, I set it to my original one).
Find the Java home with the command
/usr/libexec/java_home
Then set it in hadoop-env.sh:
# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home
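If more than one JDK is installed, /usr/libexec/java_home can also report the home of a specific major version, which is handy for double-checking the path above (this assumes a JDK 8 is actually installed):
/usr/libexec/java_home -v 1.8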
core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
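A side note: fs.default.name is the deprecated name of this setting; since Hadoop 2 the preferred key is fs.defaultFS, although the old name still works as an alias. If you want to use the current key, the property would instead read:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>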
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
  <!-- Site specific YARN configuration properties -->
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
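If the example job later fails with "Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster", a common fix (taken from the official Hadoop 3 single-node guide; it assumes HADOOP_HOME is visible to the YARN processes) is to add the following properties to mapred-site.xml as well:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>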
Test run
Initialization
hadoop namenode -format
Do not run this more than once: initialize only before the first run, otherwise subsequent runs will report an error.
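If you ever do need to re-format, one option is to clear the directory configured as hadoop.tmp.dir first (with Hadoop stopped) and then format again; assuming the path from core-site.xml above:
rm -rf /usr/local/Cellar/hadoop/hdfs/tmp/*
hadoop namenode -format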
If it worked, the last part of the output looks similar to this:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at zhangruilinMBP.local/192.168.1.121
************************************************************/
Run start-all.sh in the sbin folder
./sbin/start-all.sh
A warning may appear:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
You can ignore this; it does not affect normal operation.
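start-all.sh is deprecated in Hadoop 3 and simply delegates to the two scripts below, which can also be run separately if you prefer to start HDFS and YARN independently:
./sbin/start-dfs.sh
./sbin/start-yarn.sh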
Then you can check the running processes with jps; you should see something like the following:
2161 NodeManager
1825 SecondaryNameNode
2065 ResourceManager
1591 NameNode
2234 Jps
1691 DataNode
Access system information through Hadoop's web UI:
localhost:9870 (the HDFS NameNode UI; this was port 50070 before Hadoop 3)
localhost:8088 (the YARN ResourceManager UI)
If you can see the corresponding interface, the installation is basically correct.
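If you would rather check from the terminal than a browser (curl ships with macOS), both ports should answer with an HTTP 200:
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088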
Test example
Test the installation by running the bundled pi example:
hadoop jar ./libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar pi 2 5
The correct output ends with:
Estimated value of Pi is 3.60000000000000000000
Test successful!
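As a further check that HDFS itself is working, the grep example from the official single-node tutorial can be run against the configuration files; the commands below assume the same working directory and example jar path as the pi test above:
hdfs dfs -mkdir -p /user/$(whoami)/input
hdfs dfs -put ./libexec/etc/hadoop/*.xml input
hadoop jar ./libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar grep input output 'dfs[a-z.]+'
hdfs dfs -cat output/*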
Finally, stop the pseudo-distributed Hadoop cluster:
./sbin/stop-all.sh
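Once the script finishes, running jps again should list only the Jps process itself (plus any non-Hadoop Java programs you have running), confirming that the daemons have stopped:
jps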