Integrating Phoenix with HBase -- the dark horse

Posted by dotbands on Fri, 18 Feb 2022 01:36:03 +0100

Phoenix

What is Phoenix

  • Phoenix is an open source SQL engine built on HBase. You can use the standard JDBC API, instead of the HBase client API, to create tables, insert data and query your HBase data
  • Phoenix is written entirely in Java and is embedded in HBase as a JDBC driver. The Phoenix query engine compiles a SQL query into one or more HBase scans and orchestrates their execution to produce a standard JDBC result set. Because it uses the HBase API, coprocessors and custom filters directly, performance is on the order of milliseconds for simple queries and seconds for queries over millions of rows
  • Phoenix performance
    • Phoenix is a SQL engine built on HBase
    • Phoenix achieves high-performance operation of HBase in the following ways (a sketch follows this list)
      • Compiling SQL queries into native HBase scans
      • Determining the optimal start and stop row keys for each scan
      • Orchestrating scans so that they execute in parallel
      • Pushing the predicates of the WHERE clause down to server-side filters
      • Executing aggregate queries through server-side hooks (called coprocessors)
      • Implementing secondary indexes to speed up queries on non-rowkey columns
      • Collecting statistics to improve the degree of parallelism and help choose the best plan
      • Using skip-scan filters to optimize IN, LIKE and OR queries
      • Salting the row key to distribute write pressure evenly
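
A minimal sketch of two of these techniques, salting and skip scan (the "metrics" table and its columns are hypothetical, not part of this tutorial):

-- SALT_BUCKETS spreads rows, and therefore write pressure, across regions
create table if not exists "metrics" (
    "host" varchar not null,
    "ts" varchar not null,
    "cf"."val" varchar
    constraint pk primary key ("host", "ts")
) SALT_BUCKETS = 4;

-- An IN predicate on the leading rowkey column is answered with a skip scan
-- rather than a full table scan
select * from "metrics" where "host" in ('node1', 'node2');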

Official website address:

http://phoenix.apache.org/

Installation and deployment of Phoenix

1. Preparation

  • Install the ZooKeeper, Hadoop and HBase clusters in advance

2. Installation package
Download address: https://mirrors.cnnic.cn/apache/phoenix/apache-phoenix-4.14.0-cdh5.14.2/bin/

Materials\installation\apache-phoenix-4.14.0-HBase-1.1-bin.tar.gz

3. Upload and unzip

  • Upload the installation package to a directory on one of the servers in the HBase cluster and unzip it
tar -xvzf apache-phoenix-4.14.0-HBase-1.1-bin.tar.gz -C ../servers/

4. Copy the jar packages Phoenix needs to integrate with HBase

Copy phoenix-4.14.0-HBase-1.1-server.jar (phoenix-4.14.0-cdh5.14.2-server.jar for CDH) and phoenix-core-4.14.0-HBase-1.1.jar (phoenix-core-4.14.0-cdh5.14.2.jar for CDH) from the Phoenix directory into the lib directory of every HBase node

scp phoenix-4.14.0-HBase-1.1-server.jar phoenix-core-4.14.0-HBase-1.1.jar node1:/export/servers/hbase-1.1.1/lib
scp phoenix-4.14.0-HBase-1.1-server.jar phoenix-core-4.14.0-HBase-1.1.jar node2:/export/servers/hbase-1.1.1/lib
scp phoenix-4.14.0-HBase-1.1-server.jar phoenix-core-4.14.0-HBase-1.1.jar node3:/export/servers/hbase-1.1.1/lib

5. Configure HADOOP and HBASE in Phoenix

Replace Phoenix's original configuration files under phoenix/bin/ with the HBase configuration file hbase-site.xml and Hadoop's core-site.xml and hdfs-site.xml from hadoop/etc/hadoop

# Enter the Phoenix bin directory
cd /export/servers/apache-phoenix-4.14.0-HBase-1.1-bin/bin

# Back up Phoenix's bundled hbase-site.xml
mv hbase-site.xml hbase-site.xml.bak

ln -s $HBASE_HOME/conf/hbase-site.xml .
ln -s $HADOOP_HOME/etc/hadoop/core-site.xml .
ln -s $HADOOP_HOME/etc/hadoop/hdfs-site.xml .

6. Restart the HBase cluster so the Phoenix jar packages take effect

7. Verify success

./sqlline.py node1:2181

If output like the following appears, Phoenix started successfully:

Setting property: [incremental, false]
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:node1:2181 none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:node1:2181
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/export/servers/apache-phoenix-4.14.0-HBase-1.1-bin/phoenix-4.14.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/export/servers/hadoop-2.6.0-cdh5.14.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
19/10/18 09:58:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 4.14)
Driver: PhoenixEmbeddedDriver (version 4.14)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
133/133 (100%) Done
Done
sqlline version 1.2.0

8. View the currently available tables

  • Enter !tables to list the available tables
  • User-created tables are listed alongside the Phoenix system tables shown below; the system tables maintain the metadata of the user tables
+--------------+-------------+---------------+-+-----------------+---------------+---------+
| TABLE_SCHEM  | TABLE_NAME  |  TABLE_TYPE   | | IMMUTABLE_ROWS  | SALT_BUCKETS  | MULTI_T |
+--------------+-------------+---------------+-+-----------------+---------------+---------+
| SYSTEM       | CATALOG     | SYSTEM TABLE  | | false           | null          | false   |
| SYSTEM       | FUNCTION    | SYSTEM TABLE  | | false           | null          | false   |
| SYSTEM       | LOG         | SYSTEM TABLE  | | true            | 32            | false   |
| SYSTEM       | SEQUENCE    | SYSTEM TABLE  | | false           | null          | false   |
| SYSTEM       | STATS       | SYSTEM TABLE  | | false           | null          | false   |
+--------------+-------------+---------------+-+-----------------+---------------+---------+

9. To exit Phoenix, enter !quit
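
A few other sqlline meta-commands that come in handy (these are sqlline's own commands, not Phoenix SQL):

!help        -- list all available sqlline commands
!describe    -- show the columns of a table
!indexes     -- list the indexes on a table
!quit        -- exit the shell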

Introduction to Phoenix

Requirement 1:

  • Use SQL statements to create a user table in Phoenix with the following columns and rows

+-----+-----------+-----+--------+------------------+
| ID  | Name      | Age | Gender | Address          |
+-----+-----------+-----+--------+------------------+
| 1   | Zhang San | 30  | male   | Xicheng District |
| 2   | Li Si     | 20  | female | Minhang District |
+-----+-----------+-----+--------+------------------+

  • Insert the two rows into the table, query the data, and view the data in HBase (the query is sketched after the reference code below)

Requirement analysis:

  • Create the table structure directly in Phoenix with the create table syntax
  • Because the data is ultimately stored in HBase, the HBase column family name must be specified when creating the table

Reference code:

-- Create table
create table if not exists "user_info"(
    "id" varchar primary key,
    "cf"."name" varchar,
    "cf"."age" integer,
    "cf"."sex" varchar,
    "cf"."address" varchar
);

-- Insert data
upsert into "user_info" values('1', 'Zhang San', 30, 'male', 'Xicheng District');
upsert into "user_info" values('2', 'Li Si', 20, 'female', 'Minhang District, Shanghai');

Requirement 2:

  • Modify the age of the user whose id is 1 to 45
-- Modify data (upsert updates the row when the primary key already exists, and inserts a new row otherwise)
upsert into "user_info"("id", "age") values('1', 45);

Requirement 3:

  • Delete the user data whose id is 2
-- Delete data
delete from "user_info" where "id" = '2';

Establishing a mapping to an HBase table

When a table already exists in HBase, you can establish a mapping to it in Phoenix, so that the HBase table can be operated on through SQL.

Case:

1. In HBase, create the employee table to be mapped (data preparation)

create 'employee','company','family'

put 'employee','row1','company:name','ted'
put 'employee','row1','company:position','worker'
put 'employee','row1','family:tel','13600912345'

put 'employee','row2','company:name','michael'
put 'employee','row2','company:position','manager'
put 'employee','row2','family:tel','1894225698'

scan 'employee'

2. Create the mapping view

  • Since the table already exists in HBase, the mapping must be created in Phoenix with create view

  • Phoenix is case sensitive

  • Unquoted identifiers are folded to uppercase

  • If the table name is not enclosed in double quotation marks, the created table name is uppercase regardless of whether you typed it in uppercase or lowercase

  • To create a table name or field name that contains lowercase or mixed case, enclose the table name or field name in double quotation marks (a quoting sketch follows the notes below)

In the Phoenix command line, run:

CREATE VIEW IF NOT EXISTS "employee" (
    "rowid" VARCHAR NOT NULL PRIMARY KEY, 
    "company"."name" VARCHAR,
    "company"."position" VARCHAR,
    "family"."tel" VARCHAR
);

There are several points to note in this statement

  • IF NOT EXISTS ensures that if the view has already been created, its configuration will not be overwritten

  • The field that maps to the rowkey is marked with PRIMARY KEY

  • Columns are referenced as "columnfamily"."columnname"

  • After the view is established, query the data
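
A quick illustration of the case-sensitivity and quoting rules (the demo table is hypothetical):

-- Unquoted identifiers are folded to uppercase: this creates a table named DEMO
create table if not exists demo (id varchar primary key);

-- Quoted identifiers keep their case: this hits the lowercase view created above
select * from "employee";

-- An unquoted name would be folded to EMPLOYEE, which does not exist
-- select * from employee;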

3. Query all data in the mapped view

0: jdbc:phoenix:node01> SELECT * FROM "employee";
+-------+----------+-----------+--------------+-------+
|  no   |   name   | position  |     tel      |  age  |
+-------+----------+-----------+--------------+-------+
| row1  | ted      | worker    | 13600912345  | null  |
| row2  | michael  | manager   | 1894225698   | null  |
+-------+----------+-----------+--------------+-------+

4. Query all employees whose position is 'worker'

select * from "employee" where "position" = 'worker';

Using Phoenix secondary indexes to speed up queries

  • HBase queries efficiently only through the rowkey; any other query must compare the values of each column row by row, that is, a full table scan
  • For tables with a large amount of data, the performance of a full table scan is very poor
  • When data needs to be queried from multiple angles, a rowkey-only lookup is impossible; a secondary index makes such queries possible
  • Phoenix provides secondary index support for HBase

Configure HBase to support Phoenix secondary indexes

1. Add the following property to hbase-site.xml on every HRegionServer

<property> 
  <name>hbase.regionserver.wal.codec</name> 
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value> 
</property>

2. Restart the HBase cluster to make the configuration effective

Creating secondary indexes using Phoenix

1. Index creation

create local index "idx_tel" on "employee"("family"."tel");

2. View the execution plan to check whether the query uses the secondary index

explain select * from "employee" where "name" = 'ted';
explain select * from "employee" where "tel" = '13600912345';

The first query filters on the unindexed name column, so its plan shows a full scan; the second filters on the indexed tel column, so its plan can use a range scan over the index.

3. Delete index

drop index "idx_tel" on "employee";

4. View all indexes on the table

!indexes "employee"

Create a Phoenix mapping table for order details

Create the mapping table

Implementation steps:

  • Create a mapping view in Phoenix
  • Implement various queries (an aggregation sketch follows the index step below)

1. SQL statement to create the mapping view:

create view "dwd_order_detail"(
    "rowid" varchar primary key,
    "detail"."ogId" varchar,
    "detail"."orderId" varchar,
    "detail"."goodsId" varchar,
    "detail"."goodsNum" varchar,
    "detail"."goodsPrice" varchar,
    "detail"."goodsName" varchar,
    "detail"."shopId" varchar,
    "detail"."goodsThirdCatId" varchar,
    "detail"."goodsThirdCatName" varchar,
    "detail"."goodsSecondCatId" varchar,
    "detail"."goodsSecondCatName" varchar,
    "detail"."goodsFirstCatId" varchar,
    "detail"."goodsFirstCatName" varchar,
    "detail"."areaId" varchar,
    "detail"."shopName" varchar,
    "detail"."shopCompany" varchar,
    "detail"."cityId" varchar,
    "detail"."cityName" varchar,
    "detail"."regionId" varchar,
    "detail"."regionName" varchar
);

2. Create an index on the following columns of the table

+---------------------+-----------------------------+
| Column              | Description                 |
+---------------------+-----------------------------+
| goodsThirdCatName   | Third-level classification  |
| goodsSecondCatName  | Second-level classification |
| goodsFirstCatName   | First-level classification  |
| cityName            | City name                   |
| regionName          | Region name                 |
+---------------------+-----------------------------+

Reference code:

-- Create index
create local index "idx_dwd_order_detail" on "dwd_order_detail"("detail"."goodsThirdCatName", "detail"."goodsSecondCatName", "detail"."goodsFirstCatName", "detail"."cityName", "detail"."regionName");

explain select * from "dwd_order_detail" where "goodsThirdCatName" = 'Other fruits and vegetables' and "cityName" = 'Jingdezhen branch';

Query Phoenix data with a JDBC program

Requirements:

  • Write Java code to query order details

1. Import dependencies

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.phoenix</groupId>
        <artifactId>phoenix-core</artifactId>
        <version>4.14.0-HBase-1.1</version>
    </dependency>
</dependencies>

2. Write the JDBC program

Connection URL: jdbc:phoenix:node1:2181

3. Create a new module for the code. Be careful not to inherit from itcast_shop_parent (otherwise it will pull in the CDH dependencies)

Reference code:

Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");

Connection connection = DriverManager.getConnection("jdbc:phoenix:node1:2181", "", "");
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery("select * from \"dwd_order_detail\" limit 10 ");

while(resultSet.next()) {
    String rowid = resultSet.getString("rowid");
    String ogId = resultSet.getString("ogId");
    String orderId = resultSet.getString("orderId");
    String goodsId = resultSet.getString("goodsId");
    String goodsNum = resultSet.getString("goodsNum");
    String goodsPrice = resultSet.getString("goodsPrice");
    String goodsName = resultSet.getString("goodsName");
    String shopId = resultSet.getString("shopId");
    String goodsThirdCatId = resultSet.getString("goodsThirdCatId");
    String goodsThirdCatName = resultSet.getString("goodsThirdCatName");
    String goodsSecondCatId = resultSet.getString("goodsSecondCatId");
    String goodsSecondCatName = resultSet.getString("goodsSecondCatName");
    String goodsFirstCatId = resultSet.getString("goodsFirstCatId");
    String goodsFirstCatName = resultSet.getString("goodsFirstCatName");
    String areaId = resultSet.getString("areaId");
    String shopName = resultSet.getString("shopName");
    String shopCompany = resultSet.getString("shopCompany");
    String cityId = resultSet.getString("cityId");
    String cityName = resultSet.getString("cityName");
    String regionId = resultSet.getString("regionId");
    String regionName = resultSet.getString("regionName");

    System.out.print(rowid);
    System.out.print(ogId);
    System.out.print(orderId);
    System.out.print(goodsId);
    System.out.print(goodsNum);
    System.out.print(goodsPrice);
    System.out.print(goodsName);
    System.out.print(shopId);
    System.out.print(goodsThirdCatId);
    System.out.print(goodsThirdCatName);
    System.out.print(goodsSecondCatId);
    System.out.print(goodsSecondCatName);
    System.out.print(goodsFirstCatId);
    System.out.print(goodsFirstCatName);
    System.out.print(areaId);
    System.out.print(shopName);
    System.out.print(shopCompany);
    System.out.print(cityId);
    System.out.print(cityName);
    System.out.print(regionId);
    System.out.print(regionName);
    System.out.println();
}

resultSet.close();
statement.close();
connection.close();
}

Topics: HBase