From Hadoop high availability to HBase environment construction (in virtual machine)
Objective: to build and install HBase in the environment
Idea: after completing the basic configuration of the master node, use cloning to complete the high availability cluster
1. Configure network
First, check the network address of the local computer ...
Posted by detrox on Tue, 21 Sep 2021 05:05:35 +0200
This article supplements the Hadoop part of Learning Guide for Big Data Specialists from Zero (Full Upgrade).
1. Template Virtual Machine Environment Preparation
0) Install the template virtual machine: IP address 192.168.10.100, hostname hadoop100, 4 GB of memory, 50 GB hard disk
1) The configuration requirements for hadoop100 virtual machine are a ...
Posted by fael097 on Tue, 21 Sep 2021 03:32:56 +0200
No one has to help you. Everything has to be done by yourself
Hadoop source code compilation
(1) CentOS networking
Configure CentOS to connect to the Internet, and confirm that ping succeeds from the Linux virtual machine
Note: compile as the root user to avoid folder permission problems
(2) jar package preparation (hadoop source code, JDK8, maven, ant, p ...
Posted by henryblake1979 on Tue, 30 Jun 2020 10:51:58 +0200
Hadoop configuration (non-HA)
Hadoop is a distributed, highly available batch processing framework. The CDH distribution of Hadoop comes with other components such as HBase, H ...
Posted by kwilameiya on Tue, 23 Jun 2020 03:53:38 +0200
I wrote these notes with reference to Lin Ziyu's teaching materials. Please refer to the Database Laboratory of Xiamen University for details
Environment of my self-built Hadoop platform: three Ubuntu 14.04 64-bit machines, JDK 1.8, Hadoop 2.6.5 (Apache)
1, Hadoop preparation before instal ...
Posted by caspert_ghost on Sun, 21 Jun 2020 11:11:30 +0200
AI: Keras PyTorch MXNet TensorFlow PaddlePaddle deep learning practice (updated from time to time)
4.4 Real-time log analysis
Master the connection between Flume and Kafka
We have collected the log data into Hadoop, but when doing real-time ana ...
Posted by pontiac007 on Thu, 18 Jun 2020 06:33:47 +0200
This article describes in detail how to use HBase RowFilter through the Java and shell APIs, and posts the relevant sample code for reference. RowFilter filters on row keys, so consider it whenever you need to filter data by HBase rowkey. For details on comparators and how they work, please refer to the previous article: Comparat ...
Posted by jonners on Tue, 05 May 2020 08:00:28 +0200
A tool for translating SQL statements into MapReduce programs.
Create table statement
CREATE TABLE page_view(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
ROW FORMAT DELIMI ...
Posted by mainewoods on Wed, 29 Apr 2020 14:54:07 +0200
Reference article: Apache Impala detailed installation (covering the most complete set of pitfalls)
Apache Impala detailed installation
Impala is an efficient SQL query tool provided by Cloudera that returns query results in real time. According to the official tests, it is 10 to 100 times faster than Hive, and its SQL queries are even faster than Spark SQL. imp ...
Posted by deth4uall on Tue, 21 Apr 2020 09:18:04 +0200