2021SC@SDUSC HBase project analysis: installation, configuration and division of labor

Posted by Jordi_E on Tue, 28 Sep 2021 19:54:25 +0200

2021SC@SDUSC

Contents

HBase overview

HBase cluster installation

Hadoop installation and configuration

ZooKeeper installation and configuration

HBase installation and configuration

  HBase source code download

Division of labor within the group


HBase overview

HBase is a distributed, highly reliable, high-performance, column-oriented, and scalable NoSQL database. Hadoop HDFS provides highly reliable underlying storage for HBase, Hadoop MapReduce provides high-performance computing power, and ZooKeeper provides stable coordination services and a failover mechanism.
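
As a quick illustration of the column-oriented data model, here is a minimal HBase shell session (a sketch; the table name 'test' and column family 'cf' are made up for the example):

    # create a table with one column family
    create 'test', 'cf'
    # write one cell: row 'row1', column 'cf:a'
    put 'test', 'row1', 'cf:a', 'value1'
    # read the table back
    scan 'test'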

HBase cluster installation

Hadoop installation and configuration

  1. Create the virtual machine hadoop102 and complete the corresponding configuration
  2. Install EPEL release
    yum install -y epel-release
  3. Turn off the firewall and disable it from starting at boot
    systemctl stop firewalld
    systemctl disable firewalld.service
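    An optional quick check (standard systemctl commands) that the service is stopped and disabled:
    systemctl status firewalld
    systemctl is-enabled firewalld.service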
  4. Give the ycx user root privileges, so that commands can be run with sudo later
    vim /etc/sudoers
    
    ycx     ALL=(ALL)     NOPASSWD:ALL
  5. Create the software and module folders in the /opt directory, and change their owner and group
  6. Uninstall the virtual machine's pre-installed JDK
    rpm -qa | grep -i java | xargs -n1 rpm -e --nodeps 
  7. Clone hadoop102 to create virtual machines hadoop103 and hadoop104
  8. Change the three virtual machines to static IP addresses, taking hadoop102 as an example
    vim /etc/sysconfig/network-scripts/ifcfg-ens33
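
    A sketch of what ifcfg-ens33 might look like for hadoop102; the GATEWAY and DNS1 values assume the default VMware NAT gateway at 192.168.10.2 and may differ in your environment:
    
    TYPE=Ethernet
    BOOTPROTO=static          # static rather than dhcp
    NAME=ens33
    DEVICE=ens33
    ONBOOT=yes                # bring the interface up at boot
    IPADDR=192.168.10.102     # matches the hosts file in step 11
    GATEWAY=192.168.10.2      # assumption: VMware NAT gateway
    DNS1=192.168.10.2         # assumption: NAT gateway doubles as DNS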

  9. Configure VMware's Virtual Network Editor

  10. Change the IP address of the Windows adapter VMware Network Adapter VMnet8

  11. Change the host names, and configure the host name mappings in the hosts file on both the Linux clones and the Windows host:

    192.168.10.102 hadoop102
    192.168.10.103 hadoop103
    192.168.10.104 hadoop104
    
  12. Install Xshell and Xftp, connect to the three virtual machines, and copy the JDK and Hadoop archives to them

  13. Install the JDK on hadoop102, configure the environment variables, and write an xsync distribution script to distribute the JDK to hadoop103 and hadoop104

    tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/module/
    sudo vim /etc/profile.d/my_env.sh
    
    #JAVA_HOME
    export JAVA_HOME=/opt/module/jdk1.8.0_212
    export PATH=$PATH:$JAVA_HOME/bin
    
    source /etc/profile
    
    xsync distribution script:
    #!/bin/bash
    
    # require at least one argument (the files/directories to distribute)
    if [ $# -lt 1 ]
    then
        echo "Not enough arguments!"
        exit 1
    fi
    
    # sync every given path to each host in the cluster
    for host in hadoop102 hadoop103 hadoop104
    do
        echo ====================  $host  ====================
        for file in "$@"
        do
            if [ -e "$file" ]
                then
                    # resolve the real parent directory (following symlinks) and the file name
                    pdir=$(cd -P "$(dirname "$file")"; pwd)
                    fname=$(basename "$file")
                    ssh $host "mkdir -p $pdir"
                    rsync -av "$pdir/$fname" $host:"$pdir"
                else
                    echo "$file does not exist!"
            fi
        done
    done
    
    chmod 777 xsync
    
    xsync /opt/module/
    sudo ./bin/xsync /etc/profile.d/my_env.sh
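    A quick check on each node (after sourcing /etc/profile):
    java -version    # should report version 1.8.0_212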
  14. Install Hadoop on hadoop102, configure the environment variables, and distribute to hadoop103 and hadoop104

    tar -zxvf hadoop-3.1.3.tar.gz -C /opt/module/
    sudo vim /etc/profile.d/my_env.sh
    
    #HADOOP_HOME
    export HADOOP_HOME=/opt/module/hadoop-3.1.3
    export PATH=$PATH:$HADOOP_HOME/bin
    export PATH=$PATH:$HADOOP_HOME/sbin
    
    source /etc/profile
    
    xsync /opt/module/
    sudo ./bin/xsync /etc/profile.d/my_env.sh
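    A quick check on each node:
    hadoop version   # should report Hadoop 3.1.3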
  15. Configure passwordless SSH login among hadoop102, hadoop103, and hadoop104 (repeat the key generation and copying on each node)
    On hadoop102:
    ssh-keygen -t rsa
    ssh-copy-id hadoop102
    ssh-copy-id hadoop103
    ssh-copy-id hadoop104
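    A quick check, assuming ssh-keygen and ssh-copy-id were also run on hadoop103 and hadoop104; each command should succeed without a password prompt:
    ssh hadoop103 hostname
    ssh hadoop104 hostname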
  16. Configure cluster
    core-site.xml: 
    
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop102:8020</value>
        </property>
    
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/opt/module/hadoop-3.1.3/data</value>
        </property>
    
        <property>
            <name>hadoop.http.staticuser.user</name>
            <value>ycx</value>
        </property>
    
    hdfs-site.xml:
    
        <property>
            <name>dfs.namenode.http-address</name>
            <value>hadoop102:9870</value>
        </property>
    
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hadoop104:9868</value>
        </property>
    
    yarn-site.xml:
    
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hadoop103</value>
        </property>
    
        <property>
            <name>yarn.nodemanager.env-whitelist</name>
            <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
        </property>
    
    mapred-site.xml:
    
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    
    workers:
    
    hadoop102
    hadoop103
    hadoop104
    
    
    xsync /opt/module/hadoop-3.1.3/etc/hadoop/
    
    
    
  17. Start the cluster (format the NameNode only before the first start; run start-yarn.sh on hadoop103, where the ResourceManager is configured)

    hdfs namenode -format
    sbin/start-dfs.sh
    sbin/start-yarn.sh
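
    A quick sanity check (the role layout follows the configuration above; 8088 is YARN's default web port):
    jps    # every node runs a DataNode and a NodeManager; hadoop102 adds the NameNode, hadoop103 the ResourceManager, hadoop104 the SecondaryNameNode
    # Web UIs:
    #   HDFS:  http://hadoop102:9870
    #   YARN:  http://hadoop103:8088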

ZooKeeper installation and configuration

  1. Download the ZooKeeper installation package and copy it to the virtual machine using Xftp
  2. Unzip the installation package
    tar -zxvf apache-zookeeper-3.5.7-bin.tar.gz -C /opt/module/
  3. Configure ZooKeeper  

    mv zoo_sample.cfg zoo.cfg
    mkdir zkData
    vim myid
    
    In the myid file, give each host a unique id:
    hadoop102 → 2
    hadoop103 → 3
    hadoop104 → 4
    
    xsync apache-zookeeper-3.5.7-bin/
    vim zoo.cfg
    
    In zoo.cfg, change dataDir to /opt/module/zkData and add:
    server.2=hadoop102:2888:3888
    server.3=hadoop103:2888:3888
    server.4=hadoop104:2888:3888
    
    xsync zoo.cfg
  4. Start ZooKeeper

    bin/zkServer.sh start
    bin/zkCli.sh
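
    After starting ZooKeeper on all three nodes, check the replica roles; one node should report "leader" and the other two "follower":
    bin/zkServer.sh status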

HBase installation and configuration

  1. Download the HBase installation package and copy it to the virtual machine using Xftp
  2. Unzip the installation package
    tar -zxvf hbase-2.3.6-bin.tar.gz -C /opt/module
    
  3. Configure HBase

    hbase-env.sh:
    
    export JAVA_HOME=/opt/module/jdk1.8.0_212
    export HBASE_MANAGES_ZK=false
    
    hbase-site.xml:
    
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
    
    <property>
      <name>hbase.master.port</name>
      <value>16000</value>
    </property>
    
    <property>
      <name>hbase.wal.provider</name>
      <value>filesystem</value>
    </property>
    
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>hadoop102,hadoop103,hadoop104</value>
    </property>
    
    <property>
      <name>hbase.zookeeper.property.dataDir</name>
      <value>/opt/module/zkData</value>
    </property>
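
    The listing above omits hbase.rootdir; in a fully distributed deployment it normally points at the HDFS NameNode configured earlier (a suggested addition, not part of the original notes):
    
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://hadoop102:8020/hbase</value>
    </property>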
    
    regionservers:
    hadoop102
    hadoop103
    hadoop104
    
    ln -s /opt/module/hadoop-3.1.3/etc/hadoop/core-site.xml /opt/module/hbase/conf/core-site.xml
    ln -s /opt/module/hadoop-3.1.3/etc/hadoop/hdfs-site.xml /opt/module/hbase/conf/hdfs-site.xml
    
    xsync hbase-2.3.6/
  4. Start HBase

    bin/start-hbase.sh
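
    After start-hbase.sh, jps should show an HMaster on hadoop102 and an HRegionServer on every node; the master web UI listens on HBase's default info port:
    jps
    # http://hadoop102:16010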

  HBase source code download

  1. Open the HBase download page: https://hbase.apache.org/downloads.html
  2. Download version 2.3.6

  3. Unpack the source and compile it with Maven

    mvn clean package -DskipTests

  4. Import the project into IDEA and configure it

  5. Copy the files in the conf directory to the resources directories of the hbase-server and hbase-shell modules

Division of labor within the group

I am responsible for analyzing the source code of HBase's data read and write paths; the division of labor will be adjusted later according to actual progress.

Topics: Hadoop HBase Zookeeper