Hadoop installation and configuration

Posted by gr8dane on Sun, 29 Dec 2019 19:56:32 +0100

1. Decompress the archive

tar -zxvf hadoop-2.8.2.tar.gz 

2. Modify the Hadoop configuration files

cd hadoop-2.8.2/etc/hadoop

Add the following to the hadoop-env.sh file:

export JAVA_HOME=/usr/local/src/jdk1.8.0_152

Add the following to the yarn-env.sh file:

export JAVA_HOME=/usr/local/src/jdk1.8.0_152
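A wrong JAVA_HOME is a common reason the daemons fail to start, so it can help to check the path before going further. The following is a minimal sketch; the function name check_java_home is made up for this example and is not part of Hadoop.

```shell
#!/usr/bin/env bash
# check_java_home DIR (hypothetical helper): succeed only if DIR looks
# like a JDK/JRE install, i.e. it contains an executable bin/java.
check_java_home() {
    local dir="$1"
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java found"
    else
        echo "ERROR: no executable bin/java under $dir" >&2
        return 1
    fi
}
```

For the path used in this post, the call would be: check_java_home /usr/local/src/jdk1.8.0_152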

Add the following to the slaves file (one worker hostname per line):

slave1
slave2

core-site.xml file:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.16.11.97:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/src/hadoop-2.8.2/tmp/</value>
    </property>
</configuration>

hdfs-site.xml file:

<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop-2.8.2/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop-2.8.2/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>

mapred-site.xml file:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

yarn-site.xml file:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8035</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>
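After editing the four *-site.xml files it is easy to leave a typo in a property name or value. The sketch below reads a single property back out of a file so it can be checked by eye. It assumes the layout used above, with each <name> and <value> element on one line; get_prop is a hypothetical helper, not a Hadoop command, and it is not a general XML parser.

```shell
#!/usr/bin/env bash
# get_prop FILE NAME (hypothetical helper): print the <value> that
# follows the <name>NAME</name> element in a Hadoop *-site.xml file.
# Assumes each <name> and <value> element sits on a single line.
get_prop() {
    local file="$1" name="$2"
    awk -v want="$name" '
        /<name>/ {
            n = $0
            sub(/.*<name>/, "", n)
            sub(/<\/name>.*/, "", n)
            hit = (n == want)          # remember whether this is our property
        }
        /<value>/ && hit {
            v = $0
            sub(/.*<value>/, "", v)
            sub(/<\/value>.*/, "", v)
            print v
            exit
        }
    ' "$file"
}
```

For example, get_prop core-site.xml fs.defaultFS should print the hdfs:// address configured above. Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolves.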

3. Create the temporary directory and the HDFS data directories

mkdir /usr/local/src/hadoop-2.8.2/tmp
mkdir -p /usr/local/src/hadoop-2.8.2/dfs/name
mkdir -p /usr/local/src/hadoop-2.8.2/dfs/data

4. Configure environment variables

Open ~/.bashrc:

vim ~/.bashrc

Add the following lines:

export HADOOP_HOME=/usr/local/src/hadoop-2.8.2
export PATH=$PATH:$HADOOP_HOME/bin

Reload the environment variables:

source ~/.bashrc

5. Copy the installation directory to the slave nodes

scp -r /usr/local/src/hadoop-2.8.2 root@slave1:/usr/local/src/hadoop-2.8.2
scp -r /usr/local/src/hadoop-2.8.2 root@slave2:/usr/local/src/hadoop-2.8.2
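With more than a couple of workers, the copy can be driven from the slaves file instead of being typed once per host. This is a sketch only: list_workers and copy_to_workers are hypothetical names, and it assumes root SSH access to every host listed, as in the scp commands above.

```shell
#!/usr/bin/env bash
# list_workers FILE (hypothetical helper): print the hostnames in a
# slaves file, dropping "#" comments, trailing whitespace and blank lines.
list_workers() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# copy_to_workers SRC FILE (hypothetical helper): copy directory SRC to
# the same path on every worker, stopping at the first failed copy.
copy_to_workers() {
    local src="$1" slaves_file="$2" host
    for host in $(list_workers "$slaves_file"); do
        echo "Copying $src to $host"
        scp -r "$src" "root@$host:$src" || return 1
    done
}
```

Run from the master, the equivalent of the two commands above would be: copy_to_workers /usr/local/src/hadoop-2.8.2 /usr/local/src/hadoop-2.8.2/etc/hadoop/slaves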

6. Start cluster

Initialize and start the cluster on the master node.
Initialize the NameNode (first start only; reformatting erases the existing HDFS metadata):

hdfs namenode -format
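Formatting should happen only once. A formatted NameNode directory contains a current/VERSION file, so a small guard can skip the format step on later runs. The sketch below assumes the dfs.namenode.name.dir path configured in hdfs-site.xml; namenode_is_formatted and format_once are hypothetical helper names.

```shell
#!/usr/bin/env bash
# namenode_is_formatted DIR (hypothetical helper): true if DIR already
# holds NameNode metadata, detected by the current/VERSION file.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# format_once DIR (hypothetical helper): format the NameNode only when
# DIR has not been formatted yet.
format_once() {
    local name_dir="$1"
    if namenode_is_formatted "$name_dir"; then
        echo "NameNode directory $name_dir is already formatted; skipping."
    else
        hdfs namenode -format
    fi
}
```

With the configuration in this post the call would be: format_once /usr/local/src/hadoop-2.8.2/dfs/name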

Start the cluster (from the /usr/local/src/hadoop-2.8.2 directory):

./sbin/start-all.sh

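7. Check cluster status

jps

On the master, jps should list NameNode, SecondaryNameNode and ResourceManager. On each slave it should list DataNode and NodeManager.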
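Reading jps output by eye gets tedious across several nodes. The sketch below reports which expected daemons are absent from jps output read on standard input. missing_daemons is a hypothetical helper; the daemon names in the usage note are the standard Hadoop 2.x process names.

```shell
#!/usr/bin/env bash
# missing_daemons NAME... (hypothetical helper): read jps output on
# stdin and print every NAME that does not appear as a process name.
missing_daemons() {
    local jps_output name
    jps_output="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <name>"; compare the second field exactly so
        # that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
             awk -v d="$name" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}
```

On the master: jps | missing_daemons NameNode SecondaryNameNode ResourceManager. On a slave: jps | missing_daemons DataNode NodeManager. No output means every listed daemon is running.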

8. Monitoring pages

NameNode:

http://master:50070/dfshealth.html

SecondaryNameNode:

http://master:9001/status.html

DataNode:

http://slave1:50075/
http://slave2:50075/

ResourceManager (replaces the JobTracker in Hadoop 2.x with YARN):

http://master:8088/cluster

NodeManager (replaces the TaskTracker):

http://slave1:8042/node
http://slave2:8042/node

9. Shut down the cluster

./sbin/stop-all.sh