Creating databases and tables using AnalyticDB MySQL

Posted by todd2006 on Thu, 03 Mar 2022 21:20:06 +0100

brief introduction

The goal is to let cloud data warehouse users and developers experience the main process of building cloud native data warehouse based on AnalyticDB MySQL and DMS through simple steps. The scene will be through the links of instance opening, structure and data initialization, report development, report visualization and so on, Use three specific application scenarios to experience the interactive query and ETL calculation speed of AnalyticDB MySQL in the new retail scenario, as well as the process of data warehouse data report development through DMS.

The data set provided is the simulation data of a retail scenario, including customer information, order records, goods information, national and regional information, etc. the total amount of data is 10GB, and the maximum number of data table records is 59.99 million.

Product introduction

Cloud native data warehouse AnalyticDB MySQL is a new generation of cloud native data warehouse that supports high concurrency and low latency queries. It is highly compatible with MySQL protocol and SQL:2003 Syntax Standard. It can conduct real-time multidimensional analysis, perspective and business exploration of massive data, and quickly build enterprise cloud data warehouse.

Data management DMS is based on the cloud version of Alibaba Group's database service platform for more than ten years. It provides a web database management terminal that is free of installation, operation and maintenance, ready to use, and unified with a variety of database types and environments; It can quickly copy and build the same safe, efficient and standardized database DevOps R & D process and data warehouse development solution as Alibaba Group for enterprise users.

Create high authority account

ADB cluster details page, customize the account and password with high authority as needed

Each cluster is limited to 1 high authority account, and the password can be modified; Subsequently, you can create a new ordinary account through DMS (ADB instance can be directly used in the free operation control mode on DMS). Address: https://ads.console.aliyun.com/adb/cn-shenzhen/instances

Select the region where the created ADB instance is located and click the instance name.

Create database

The database name can be customized on demand.

Address: https://ads.console.aliyun.com/adb/cn-shenzhen/instances

Select the region where the created ADB instance is located and click the instance name.

Create OSS appearance

explain:

The following table creation statements need to replace the strings in the endpoint and URL parameters respectively according to the region of ADB. For example, ADB is in Beijing. You need to add "endpoint" OSS CN Shenzhen internal aliyuncs. Com "replace with" OSS CN Beijing internal aliyuncs. Com ", you need to add" in the URL " oss://adb-tpch-shenzhen" Replace with " oss://adb-tpch-beijing" Codes replaced by regions:

shenzhen: beijing: beijing: hangzhou: shanghai: shanghai: zhangjiakou: Note: the following table creation statements need to be executed after the accessid and accesskey of your cloud account are replaced.

Address: https://dms.aliyun.com/

Path: all functions - data scheme - Data Import The initialization table structure statement is as follows: (Note: the script needs to edit and replace relevant information locally, and then save it as a text with. sql suffix for work order submission)

####1. CUSTOMER appearance

CREATE TABLE `OSS_CUSTOMER` (
 `C_CUSTKEY` int NOT NULL,
 `C_NAME` varchar NOT NULL,
 `C_ADDRESS` varchar NOT NULL,
 `C_NATIONKEY` int NOT NULL,
 `C_PHONE` varchar NOT NULL,
 `C_ACCTBAL` decimal(12, 2) NOT NULL,
 `C_MKTSEGMENT` varchar NOT NULL,
 `C_COMMENT` varchar NOT NULL
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", 
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/customer/customer.tbl"
}';


####2. ORDERS appearance

CREATE TABLE `OSS_ORDERS` (
 `O_ORDERKEY` bigint NOT NULL,
 `O_CUSTKEY` int NOT NULL,
 `O_ORDERSTATUS` varchar NOT NULL,
 `O_TOTALPRICE` decimal(12, 2) NOT NULL,
 `O_ORDERDATE` date NOT NULL,
 `O_ORDERPRIORITY` varchar NOT NULL,
 `O_CLERK` varchar NOT NULL,
 `O_SHIPPRIORITY` int NOT NULL,
 `O_COMMENT` varchar NOT NULL
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/orders/orders.tbl"
}';


####3. LINEITEM appearance

CREATE TABLE `OSS_LINEITEM` (
 `L_ORDERKEY` bigint NOT NULL,
 `L_PARTKEY` int NOT NULL,
 `L_SUPPKEY` int NOT NULL,
 `L_LINENUMBER` bigint NOT NULL,
 `L_QUANTITY` decimal(12, 2) NOT NULL,
 `L_EXTENDEDPRICE` decimal(12, 2) NOT NULL,
 `L_DISCOUNT` decimal(12, 2) NOT NULL,
 `L_TAX` decimal(12, 2) NOT NULL,
 `L_RETURNFLAG` varchar NOT NULL,
 `L_LINESTATUS` varchar NOT NULL,
 `L_SHIPDATE` date NOT NULL,
 `L_COMMITDATE` date NOT NULL,
 `L_RECEIPTDATE` date NOT NULL,
 `L_SHIPINSTRUCT` varchar NOT NULL,
 `L_SHIPMODE` varchar NOT NULL,
 `L_COMMENT` varchar NOT NULL
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/lineitem/lineitem.tbl"
}';


####4. NATION appearance

CREATE TABLE `OSS_NATION` (
 `N_NATIONKEY` int NOT NULL,
 `N_NAME` varchar NOT NULL,
 `N_REGIONKEY` int NOT NULL,
 `N_COMMENT` varchar
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/nation/nation.tbl"
}';


####5. PART appearance

CREATE TABLE `OSS_PART` (
 `P_PARTKEY` int NOT NULL,
 `P_NAME` varchar NOT NULL,
 `P_MFGR` varchar NOT NULL,
 `P_BRAND` varchar NOT NULL,
 `P_TYPE` varchar NOT NULL,
 `P_SIZE` int NOT NULL,
 `P_CONTAINER` varchar NOT NULL,
 `P_RETAILPRICE` decimal(12, 2) NOT NULL,
 `P_COMMENT` varchar NOT NULL
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/part/part.tbl"
}';


####6. PARTSUPP appearance

CREATE TABLE `OSS_PARTSUPP` (
 `PS_PARTKEY` int NOT NULL,
 `PS_SUPPKEY` int NOT NULL,
 `PS_AVAILQTY` int NOT NULL,
 `PS_SUPPLYCOST` decimal(12, 2) NOT NULL,
 `PS_COMMENT` varchar NOT NULL
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/partsupp/partsupp.tbl"
}';


####7. REGION appearance

CREATE TABLE `OSS_REGION` (
 `R_REGIONKEY` int NOT NULL,
 `R_NAME` varchar NOT NULL,
 `R_COMMENT` varchar
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/region/region.tbl"
}';


####8. SUPPLIER appearance

CREATE TABLE `OSS_SUPPLIER` (
 `S_SUPPKEY` int NOT NULL,
 `S_NAME` varchar NOT NULL,
 `S_ADDRESS` varchar NOT NULL,
 `S_NATIONKEY` int NOT NULL,
 `S_PHONE` varchar NOT NULL,
 `S_ACCTBAL` decimal(12, 2) NOT NULL,
 `S_COMMENT` varchar NOT NULL
  ,`dummy` varchar
) engine='oss'
TABLE_PROPERTIES='{
     "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com",
     "accessid":"xxxxxxxx",
     "accesskey":"xxxxxxxx",
     "delimiter":"|",
     "URL":"oss://adb-tpch-shenzhen/tpch/10g/supplier/supplier.tbl"
}';

Create ADB table

Create ADB table

The steps are the same as OSS appearance creation

Address: https://dms.aliyun.com/ Path: all functions - data scheme - data import initialization table structure statement is as follows: (Note: the script needs to be copied to local editing and saved as a text with. sql suffix for work order submission)

#### CUSTOMER table

CREATE TABLE `CUSTOMER` (
 `C_CUSTKEY` int NOT NULL,
 `C_NAME` varchar NOT NULL,
 `C_ADDRESS` varchar NOT NULL,
 `C_NATIONKEY` int NOT NULL,
 `C_PHONE` varchar NOT NULL,
 `C_ACCTBAL` decimal(12, 2) NOT NULL,
 `C_MKTSEGMENT` varchar NOT NULL,
 `C_COMMENT` varchar NOT NULL,
 primary key (c_custkey)
) 
DISTRIBUTE BY HASH(`c_custkey`) 
INDEX_ALL='Y';


#### ORDERS table

CREATE TABLE `ORDERS` (
 `O_ORDERKEY` bigint NOT NULL,
 `O_CUSTKEY` int NOT NULL,
 `O_ORDERSTATUS` varchar NOT NULL,
 `O_TOTALPRICE` decimal(12, 2) NOT NULL,
 `O_ORDERDATE` date NOT NULL,
 `O_ORDERPRIORITY` varchar NOT NULL,
 `O_CLERK` varchar NOT NULL,
 `O_SHIPPRIORITY` int NOT NULL,
 `O_COMMENT` varchar NOT NULL,
 primary key (o_orderkey,o_orderdate)
) 
DISTRIBUTE BY HASH(`o_orderkey`) 
PARTITION BY VALUE(`date_format(O_ORDERDATE, '%Y%m')`) 
LIFECYCLE 90 
INDEX_ALL='Y';


#### LINEITEM table

CREATE TABLE `LINEITEM` (
 `L_ORDERKEY` bigint NOT NULL,
 `L_PARTKEY` int NOT NULL,
 `L_SUPPKEY` int NOT NULL,
 `L_LINENUMBER` bigint NOT NULL,
 `L_QUANTITY` decimal(12, 2) NOT NULL,
 `L_EXTENDEDPRICE` decimal(12, 2) NOT NULL,
 `L_DISCOUNT` decimal(12, 2) NOT NULL,
 `L_TAX` decimal(12, 2) NOT NULL,
 `L_RETURNFLAG` varchar NOT NULL,
 `L_LINESTATUS` varchar NOT NULL,
 `L_SHIPDATE` date NOT NULL,
 `L_COMMITDATE` date NOT NULL,
 `L_RECEIPTDATE` date NOT NULL,
 `L_SHIPINSTRUCT` varchar NOT NULL,
 `L_SHIPMODE` varchar NOT NULL,
 `L_COMMENT` varchar NOT NULL,
 primary key (l_orderkey,l_linenumber,l_shipdate)
) 
DISTRIBUTE BY HASH(`l_orderkey`) 
PARTITION BY VALUE(`date_format(l_shipdate, '%Y%m')`) 
LIFECYCLE 90 
INDEX_ALL='Y';


#### NATION table

CREATE TABLE `NATION` (
 `N_NATIONKEY` int NOT NULL,
 `N_NAME` varchar NOT NULL,
 `N_REGIONKEY` int NOT NULL,
 `N_COMMENT` varchar,
 primary key (n_nationkey)
) DISTRIBUTE BY BROADCAST INDEX_ALL='Y';



#### PART table

CREATE TABLE `PART` (
 `P_PARTKEY` int NOT NULL,
 `P_NAME` varchar NOT NULL,
 `P_MFGR` varchar NOT NULL,
 `P_BRAND` varchar NOT NULL,
 `P_TYPE` varchar NOT NULL,
 `P_SIZE` int NOT NULL,
 `P_CONTAINER` varchar NOT NULL,
 `P_RETAILPRICE` decimal(12, 2) NOT NULL,
 `P_COMMENT` varchar NOT NULL,
 primary key (p_partkey)
) 
DISTRIBUTE BY HASH(`p_partkey`) 
INDEX_ALL='Y';

####  PARTSUPP table

CREATE TABLE `PARTSUPP` (
 `PS_PARTKEY` int NOT NULL,
 `PS_SUPPKEY` int NOT NULL,
 `PS_AVAILQTY` int NOT NULL,
 `PS_SUPPLYCOST` decimal(12, 2) NOT NULL,
 `PS_COMMENT` varchar NOT NULL,
 primary key (ps_partkey,ps_suppkey)
) 
DISTRIBUTE BY HASH(`ps_partkey`) 
INDEX_ALL='Y';


####  REGION table

CREATE TABLE `REGION` (
 `R_REGIONKEY` int NOT NULL,
 `R_NAME` varchar NOT NULL,
 `R_COMMENT` varchar,
 primary key (r_regionkey)
) 
DISTRIBUTE BY BROADCAST 
INDEX_ALL='Y';


#### SUPPLIER table

CREATE TABLE `SUPPLIER` (
 `S_SUPPKEY` int NOT NULL,
 `S_NAME` varchar NOT NULL,
 `S_ADDRESS` varchar NOT NULL,
 `S_NATIONKEY` int NOT NULL,
 `S_PHONE` varchar NOT NULL,
 `S_ACCTBAL` decimal(12, 2) NOT NULL,
 `S_COMMENT` varchar NOT NULL,
 primary key (s_suppkey)
) 
DISTRIBUTE BY HASH(`s_suppkey`) 
INDEX_ALL='Y';

Dump OSS data to ADB

Same operation steps

Address: https://dms.aliyun.com/

Path: all functions - data scheme - Data Import

The initialization script is as follows (Note: the script needs to be copied to local editing and saved as a text with. sql suffix for work order submission):

###1. CUSTOMER table  
### Number of records: 150W, storage space: 234M, import time: 1 minute

INSERT INTO CUSTOMER
(C_CUSTKEY,C_NAME,C_ADDRESS,C_NATIONKEY,C_PHONE,C_ACCTBAL,C_MKTSEGMENT,C_COMMENT)
SELECT C_CUSTKEY,C_NAME,C_ADDRESS,C_NATIONKEY,C_PHONE,C_ACCTBAL,C_MKTSEGMENT,C_COMMENT  FROM OSS_CUSTOMER;


###2. ORDERS table   
### Number of records: 1500W, storage space: 1.6GB, import time: 10 minutes

INSERT INTO ORDERS
(O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT)
SELECT O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT  FROM OSS_ORDERS;

###3. LINEITEM table   
### Number of records: 5999W, storage space: 7.24GB, import time: 40 minutes

INSERT INTO LINEITEM  
(L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT
,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT)
SELECT L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT
,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT  FROM OSS_LINEITEM;


###4. NATION table  
### Number of records: 25, storage space: 2KB, import time: 1s

INSERT INTO NATION
(N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT)
SELECT N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT  FROM OSS_NATION;


###5. PART table  
### Number of records: 200W, storage space: 232M, import time: 23 seconds

INSERT INTO PART
(P_PARTKEY,P_NAME,P_MFGR,P_BRAND,P_TYPE,P_SIZE,P_CONTAINER,P_RETAILPRICE,P_COMMENT)
SELECT P_PARTKEY,P_NAME,P_MFGR,P_BRAND,P_TYPE,P_SIZE,P_CONTAINER,P_RETAILPRICE,P_COMMENT  FROM OSS_PART;


###6. PARTSUPP table  
### Number of records: 800W, storage space: 1.12GB, import time: 3 minutes

INSERT INTO PARTSUPP
(PS_PARTKEY,PS_SUPPKEY,PS_AVAILQTY,PS_SUPPLYCOST,PS_COMMENT)
SELECT PS_PARTKEY,PS_SUPPKEY,PS_AVAILQTY,PS_SUPPLYCOST,PS_COMMENT FROM OSS_PARTSUPP;


###7. REGION table  
### Number of records: 5, storage space: 0.38KB, import time: 10 seconds


INSERT INTO REGION
(R_REGIONKEY,R_NAME,R_COMMENT)
SELECT R_REGIONKEY,R_NAME,R_COMMENT from OSS_REGION;


###8. SUPPLIER table  
### Number of records: 10W, storage space: 14M, import time: 10 seconds


INSERT INTO SUPPLIER
(S_SUPPKEY,S_NAME,S_ADDRESS,S_NATIONKEY,S_PHONE,S_ACCTBAL,S_COMMENT)
SELECT S_SUPPKEY,S_NAME,S_ADDRESS,S_NATIONKEY,S_PHONE,S_ACCTBAL,S_COMMENT FROM OSS_SUPPLIER;

Import data validation

Address: https://dms.aliyun.com/ Path: all functions - SQLConsole - single database query. Confirm the data import results of the largest table here. The expected value of LINEITEM table is consistent with the amount of imported data, which is 5999W

select count(*) from LINEITEM;

Topics: MySQL Big Data DevOps watermark shadow