HBase cluster deployment

Posted by jebadoa on Wed, 29 May 2019 21:53:18 +0200

HBase cluster deployment

 Software used 
 hadoop-2.7.4
 hbase-1.2.6
 jdk-8u144
 zookeeper-3.4.10
 HBase ships with a bundled ZooKeeper, but this guide uses a separately deployed ZooKeeper cluster instead.

ZooKeeper cluster deployment

Install the JDK
Download ZooKeeper
Edit zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataLogDir=/zookeeper/logs
dataDir=/zookeeper/data
clientPort=2181
server.1=10.39.6.178:2888:3888
server.2=10.39.6.179:2888:3888
server.3=10.39.6.180:2888:3888

Add a myid file on each node. Its content must match the n of that node's server.n entry.
Node 1 is server.1, so its myid is 1:
echo "1" > /zookeeper/data/myid
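Because each node needs a different myid, the value can also be looked up from zoo.cfg instead of being typed by hand. The following is only a sketch: myid_for_ip is a hypothetical helper (not part of ZooKeeper), and it assumes the server.n lines use plain IP addresses as in the zoo.cfg above.

```shell
# myid_for_ip CFG IP: print the n of the "server.n=IP:port:port" line
# whose address equals IP. Hypothetical helper, not a ZooKeeper tool.
myid_for_ip() {
    # Turn "server.2= 10.39.6.179:2888:3888" into "2 10.39.6.179",
    # then keep the id whose address matches the requested IP.
    sed -n 's/^server\.\([0-9][0-9]*\)[[:space:]]*=[[:space:]]*\([^:]*\):.*/\1 \2/p' "$1" |
        awk -v ip="$2" '$2 == ip { print $1 }'
}

# Example for node 2 (run on a real node, after /zookeeper/data exists):
# myid_for_ip /application/zookeeper-3.4.10/conf/zoo.cfg 10.39.6.179 > /zookeeper/data/myid
```

If the IP is not listed in zoo.cfg the function prints nothing, so an empty myid file is a sign that the address or the config is wrong.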

Create the required directories (/zookeeper/data and /zookeeper/logs) before writing myid.
Add the environment variables
vi /etc/profile 
export ZOOKEEPER_HOME=/application/zookeeper-3.4.10
export PATH=$PATH:$ZOOKEEPER_HOME/bin  

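Start-up

Pack the configuration of node 1, copy it to the other nodes (changing myid on each), and start ZooKeeper on every node with zkServer.sh start.
To troubleshoot startup errors, run zkServer.sh start-foreground, which keeps the server in the foreground and prints its log to the console.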

Roles

zkServer.sh status displays the ZooKeeper status:
Mode: leader

Mode is either leader or follower.
A cluster has exactly one leader.
The leader initiates and decides votes and updates the system state.
Followers accept client requests, return the results to the client, and vote during leader election.

Detailed configuration parameters

tickTime is the heartbeat interval, in milliseconds, between ZooKeeper servers or between a client and a server: one heartbeat is sent every tickTime.
initLimit is the maximum number of tickTime intervals ZooKeeper tolerates while an initial connection is being established. The "client" here is not a user client but a follower server connecting to the leader.
If the leader has received no response from the follower after 10 tickTime intervals, the connection is considered failed. The total time is 10*2000 ms = 20 seconds.
syncLimit is the maximum number of tickTime intervals allowed between a request and its response when the leader and a follower exchange messages. The total time is 5*2000 ms = 10 seconds.
dataDir is the directory where ZooKeeper stores its data (the myid file also lives here). dataLogDir is the directory for the transaction logs.
clientPort is the port that clients use to connect to the ZooKeeper server. ZooKeeper listens on this port for client requests.
server.n=B:C:D: n is the number of the server, B is the IP address of the server, C is the port this server uses to exchange information with the leader of the cluster, and D is the port used to elect a new leader when the current leader goes down.
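The two timeouts above are simply a limit multiplied by tickTime. As a small sketch, they can be computed from a zoo.cfg file like this. zk_cfg_value and zk_timeout_ms are hypothetical helper names, and the sketch assumes every key it is asked for is present in the file.

```shell
# zk_cfg_value CFG KEY: print the value of KEY from a key=value file.
zk_cfg_value() {
    sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# zk_timeout_ms CFG KEY: KEY (initLimit or syncLimit) times tickTime,
# in milliseconds. Assumes both keys exist in CFG.
zk_timeout_ms() {
    tick=$(zk_cfg_value "$1" tickTime)
    limit=$(zk_cfg_value "$1" "$2")
    echo $((tick * limit))
}

# Example:
# zk_timeout_ms /application/zookeeper-3.4.10/conf/zoo.cfg initLimit
```

With the zoo.cfg from this article, initLimit gives 20000 and syncLimit gives 10000, matching the 20 and 10 seconds worked out above.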

Connect to the ZooKeeper cluster
zkCli.sh -server 10.39.6.178:2181

Hadoop Installation

Download address http://apache.fayea.com/hadoop/common/stable/hadoop-2.7.4.tar.gz

Passwordless SSH login from hbase01 to hbase02 and hbase03 is required, because the start scripts on hbase01 log in to the other nodes to start their daemons.
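One common way to set up the passwordless login is ssh-keygen followed by ssh-copy-id, run on hbase01. The sketch below is not from the original post: the run wrapper, the DRY_RUN switch and setup_passwordless_ssh are hypothetical names, and an RSA key at the default path is an assumption. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# setup_passwordless_ssh HOST...: create a key if needed, then install
# the public key on every host (each host asks for its password once).
setup_passwordless_ssh() {
    if [ ! -f "$HOME/.ssh/id_rsa" ]; then
        run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    fi
    for host in "$@"; do
        run ssh-copy-id "$host"
    done
}

setup_passwordless_ssh hbase02 hbase03
```

Afterwards, ssh hbase02 hostname should return without asking for a password.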

Hadoop configuration files

Configuration file | Configures | Main contents
core-site.xml | Cluster-wide global parameters | User-defined system-level parameters, such as the HDFS URL and the Hadoop temporary directory
hdfs-site.xml | HDFS parameters | For example the storage locations of the NameNode and DataNodes, the number of file replicas, and file read permissions
mapred-site.xml | MapReduce parameters | JobHistory Server and application parameters, such as the default number of reduce tasks and the default upper and lower memory limits for tasks
yarn-site.xml | Cluster resource management system parameters | Communication ports of the ResourceManager and NodeManager, web monitoring ports, etc.
Cluster configuration
vi /application/hadoop-2.7.4/etc/hadoop/hadoop-env.sh
export JAVA_HOME="/usr/java/jdk1.8.0_144"
(the location where the rpm package installed the JDK)
vi /application/hadoop-2.7.4/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hbase01:9000</value>
    <description>The name of the default file system</description>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/zookeeper/hadoopdata/tmp</value>
    <description>A base for other temporary directories</description>
  </property>

  <property>
    <name>hadoop.native.lib</name>
    <value>true</value>
    <description>Should native hadoop libraries, if present, be used.</description>
  </property>
</configuration>
vi /application/hadoop-2.7.4/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/zookeeper/hadoopdata/dfs/name</value>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/zookeeper/hadoopdata/dfs/data</value>
  </property>
</configuration>
vi /application/hadoop-2.7.4/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
vi /application/hadoop-2.7.4/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hbase01</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
vi /application/hadoop-2.7.4/etc/hadoop/slaves
     hbase02
     hbase03

Copy all of the configuration to hbase02 and hbase03.
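The copy step can be scripted so that every node receives the same files. This is a sketch under assumptions: it uses rsync (which must be installed on all nodes, and is not mentioned in the original post), it relies on the passwordless SSH set up earlier, and run, DRY_RUN and sync_conf are hypothetical names. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them
HADOOP_CONF=/application/hadoop-2.7.4/etc/hadoop

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# sync_conf DIR HOST...: copy the contents of DIR to the same path on
# each host. The trailing slashes make rsync copy the directory contents.
sync_conf() {
    dir="$1"
    shift
    for host in "$@"; do
        run rsync -a "$dir/" "$host:$dir/"
    done
}

sync_conf "$HADOOP_CONF" hbase02 hbase03
```

The same function can be pointed at another configuration directory, as long as the path is identical on all nodes.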

Format HDFS storage

1. On the NameNode, enter the hadoop directory and run:
   ./bin/hdfs namenode -format
2. The DataNodes need no format step. They initialize their data directories the first time they start.

Start Hadoop

  1. Start HDFS (stop-dfs.sh stops it again)
    ./sbin/start-dfs.sh
    ./sbin/stop-dfs.sh
  2. Start YARN (stop-yarn.sh stops it again)
    ./sbin/start-yarn.sh
    ./sbin/stop-yarn.sh
  3. Start the MapReduce JobHistory Server
    ./sbin/mr-jobhistory-daemon.sh start historyserver

  Check the running processes on hbase01 with jps
  jps
  12016 ResourceManager
  11616 NameNode
  11828 SecondaryNameNode
  12317 JobHistoryServer
  31453 Jps
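Reading the jps listing by eye gets tedious across several nodes. As a sketch, a small filter can report which expected daemons are absent. missing_daemons is a hypothetical helper, and the list of expected names is taken from the hbase01 output above.

```shell
# missing_daemons NAME...: read `jps` output on stdin and print every
# expected daemon NAME that does not appear in it.
missing_daemons() {
    listing="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <main class>", so compare the second column exactly
        # (an exact match keeps NameNode from matching SecondaryNameNode).
        echo "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
            echo "$name"
    done
}

# Example on hbase01:
# jps | missing_daemons NameNode SecondaryNameNode ResourceManager JobHistoryServer
```

The function prints nothing when every expected daemon is running.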

Web access ports

  NameNode                      50070
  ResourceManager               8088
  MapReduce JobHistory Server   19888

HBase installation

 Modify the HBase configuration files
 vi conf/hbase-env.sh  
    export JAVA_HOME=/usr/java/jdk1.8.0_144
    export HBASE_MANAGES_ZK=false      

 vi conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hbase01:9000/hbase</value>
  </property>

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hbase01,hbase02,hbase03</value>
  </property>

  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/zookeeper/data</value>
  </property>
</configuration>
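Note that hbase.rootdir points at the same HDFS address as fs.defaultFS in core-site.xml. A mismatch between the two is an easy mistake to make, so here is a sketch of a check. xml_prop and rootdir_matches_fs are hypothetical helpers, and the sed expression assumes the simple layout used in this article, with each name and value tag on its own line.

```shell
# xml_prop FILE NAME: print the <value> of the property called NAME.
xml_prop() {
    sed -n "/<name>$2<\/name>/,/<\/property>/ s:.*<value>\(.*\)</value>.*:\1:p" "$1" |
        head -n 1
}

# rootdir_matches_fs CORE_SITE HBASE_SITE: succeed when hbase.rootdir
# is a path below the file system named by fs.defaultFS.
rootdir_matches_fs() {
    fs="$(xml_prop "$1" fs.defaultFS)"
    rootdir="$(xml_prop "$2" hbase.rootdir)"
    [ -n "$fs" ] || return 1
    case "$rootdir" in
        "$fs"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# rootdir_matches_fs /application/hadoop-2.7.4/etc/hadoop/core-site.xml conf/hbase-site.xml \
#     || echo "hbase.rootdir does not match fs.defaultFS"
```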

vi conf/regionservers 
    hbase02
    hbase03

 Synchronize the above configuration to the other nodes.

Start HBase

  ./bin/start-hbase.sh 

  Check the HBase processes with jps (HMaster should now be listed)
  jps 
  12016 ResourceManager
  11616 NameNode
  12546 HMaster
  10403 QuorumPeerMain
  11828 SecondaryNameNode
  21225 Jps
  12317 JobHistoryServer
Enter the HBase shell and use the status command to view the cluster status
 ./bin/hbase shell 
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in [jar:file:/application/hbase-1.2.6/lib/slf4j-l 
 HBase Shell; enter 'help<RETURN>' for list of supported commands.
 Type "exit<RETURN>" to leave the HBase Shell
 Version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017

 hbase(main):001:0> status 
1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load

 hbase(main):002:0> 


The HBase web UI listens on port 16010.
