Installing a Hadoop pseudo-distributed experimental environment on macOS

Posted by CorfuVBProgrammer on Sun, 01 Mar 2020 07:23:11 +0100

Environment: macOS 10.14.5
Hadoop version: 3.2.1
Date: February 29, 2020

Install Homebrew

Homebrew is the standard package manager on macOS; it needs little introduction. Install it with:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

SSH login to localhost

Configure passwordless SSH to the local machine:

ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
chmod 0600 $HOME/.ssh/authorized_keys

Then execute

ssh localhost

If this reports an error, Remote Login may not be enabled on the Mac:
open "System Preferences" -> "Sharing" and check "Remote Login".
On a successful login you will see something like:
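If you prefer the command line, the same checkbox can be toggled with macOS's built-in systemsetup utility (requires sudo):

```shell
# Enable the built-in SSH server (same as the "Remote Login" checkbox)
sudo systemsetup -setremotelogin on
# Verify that it is on
sudo systemsetup -getremotelogin
```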

Last login: Sun Mar  1 12:18:15 2020 from ::1

Install Hadoop

brew install hadoop

OpenJDK, Hadoop, and autoconf are downloaded into Homebrew's default directory, /usr/local/Cellar. Consider routing the terminal through a proxy for this step; otherwise downloading OpenJDK can be quite slow.
Once the download finishes, Hadoop is already usable in stand-alone mode.
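To confirm the stand-alone install works, check the version (the later lines of output depend on your particular build):

```shell
hadoop version
# First line for this tutorial's release: Hadoop 3.2.1
```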

Configure pseudo distributed mode

The files we need to edit are all in /usr/local/Cellar/hadoop/3.2.1/libexec/etc/hadoop/

hadoop-env.sh

Set JAVA_HOME to your own JDK. (The brew installation should configure this by default, but since I already had a JDK 8 installed, I set it to my original one.)
Find your Java home with:

/usr/libexec/java_home

Then set it in hadoop-env.sh:

# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home
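Alternatively, instead of hard-coding the path, you can let macOS resolve the active JDK at startup by using command substitution in hadoop-env.sh:

```shell
# In hadoop-env.sh: resolve the active JDK dynamically instead of hard-coding it
export JAVA_HOME="$(/usr/libexec/java_home)"
```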

core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
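Note that fs.default.name is the deprecated name of this property; on Hadoop 3.x the preferred key is fs.defaultFS (both still work). The equivalent modern fragment would be:

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>
```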

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

yarn-site.xml

<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Test run

Initialization

hadoop namenode -format

Run this only once, before the first start; re-running it later will cause errors on the next startup.
If it succeeds, the last lines of output look like:

/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at zhangruilinMBP.local/192.168.1.121
************************************************************/
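If you ever do need to re-format (for example after corrupting HDFS), a common approach is to stop the daemons and remove the old data directory first. The path below assumes the hadoop.tmp.dir configured in core-site.xml above, and wiping it destroys everything stored in HDFS:

```shell
./sbin/stop-all.sh                        # stop all daemons first
rm -rf /usr/local/Cellar/hadoop/hdfs/tmp  # wipe old NameNode/DataNode data (destroys HDFS contents!)
hadoop namenode -format                   # formatting again is now safe
```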

Start all the daemons with the start-all.sh script in the sbin folder:

./sbin/start-all.sh

A warning may appear:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Don't worry about this; it won't affect normal operation.
You can then check the running daemons with jps. You should see something like:

2161 NodeManager
1825 SecondaryNameNode
2065 ResourceManager
1591 NameNode
2234 Jps
1691 DataNode
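To check for the five daemons non-interactively, you can grep the jps output; this loop (an illustrative sketch) prints a line for each daemon that is missing:

```shell
# Report any expected Hadoop daemon that jps does not list
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    jps | grep -q "$d" || echo "missing: $d"
done
```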

Access system information through Hadoop's web UI:
NameNode: localhost:9870 (50070 in older versions)
ResourceManager: localhost:8088
If you can see the corresponding pages, the installation is basically correct.
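The web UIs can also be probed from the terminal with curl (assuming the default ports above); an HTTP 200 means the daemon is serving:

```shell
# NameNode UI
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870
# ResourceManager UI
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088
```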

Test example

Run the bundled MapReduce example to test the system:

hadoop jar ./libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar pi 2 5

If everything works, the final output is:

Estimated value of Pi is 3.60000000000000000000

Test successful!
Finally, shut down the pseudo-distributed cluster:

./sbin/stop-all.sh

Topics: Hadoop ssh Java xml