brief introduction
The goal is to let cloud data warehouse users and developers experience the main process of building cloud native data warehouse based on AnalyticDB MySQL and DMS through simple steps. The scene will be through the links of instance opening, structure and data initialization, report development, report visualization and so on, Use three specific application scenarios to experience the interactive query and ETL calculation speed of AnalyticDB MySQL in the new retail scenario, as well as the process of data warehouse data report development through DMS.
The data set provided is the simulation data of a retail scenario, including customer information, order records, goods information, national and regional information, etc. the total amount of data is 10GB, and the maximum number of data table records is 59.99 million.
Product introduction
Cloud native data warehouse AnalyticDB MySQL is a new generation of cloud native data warehouse that supports high concurrency and low latency queries. It is highly compatible with MySQL protocol and SQL:2003 Syntax Standard. It can conduct real-time multidimensional analysis, perspective and business exploration of massive data, and quickly build enterprise cloud data warehouse.
Data management DMS is based on the cloud version of Alibaba Group's database service platform for more than ten years. It provides a web database management terminal that is free of installation, operation and maintenance, ready to use, and unified with a variety of database types and environments; It can quickly copy and build the same safe, efficient and standardized database DevOps R & D process and data warehouse development solution as Alibaba Group for enterprise users.
Create high authority account
ADB cluster details page, customize the account and password with high authority as needed
Each cluster is limited to 1 high authority account, and the password can be modified; Subsequently, you can create a new ordinary account through DMS (ADB instance can be directly used in the free operation control mode on DMS). Address: https://ads.console.aliyun.com/adb/cn-shenzhen/instances
Select the region where the created ADB instance is located and click the instance name.
Create database
The database name can be customized on demand.
Address: https://ads.console.aliyun.com/adb/cn-shenzhen/instances
Select the region where the created ADB instance is located and click the instance name.
Create OSS appearance
explain:
The following table creation statements need to replace the strings in the endpoint and URL parameters respectively according to the region of ADB. For example, ADB is in Beijing. You need to add "endpoint" OSS CN Shenzhen internal aliyuncs. Com "replace with" OSS CN Beijing internal aliyuncs. Com ", you need to add" in the URL " oss://adb-tpch-shenzhen" Replace with " oss://adb-tpch-beijing" Codes replaced by regions:
shenzhen: beijing: beijing: hangzhou: shanghai: shanghai: zhangjiakou: Note: the following table creation statements need to be executed after the accessid and accesskey of your cloud account are replaced.
Address: https://dms.aliyun.com/
Path: all functions - data scheme - Data Import The initialization table structure statement is as follows: (Note: the script needs to edit and replace relevant information locally, and then save it as a text with. sql suffix for work order submission)
####1. CUSTOMER appearance CREATE TABLE `OSS_CUSTOMER` ( `C_CUSTKEY` int NOT NULL, `C_NAME` varchar NOT NULL, `C_ADDRESS` varchar NOT NULL, `C_NATIONKEY` int NOT NULL, `C_PHONE` varchar NOT NULL, `C_ACCTBAL` decimal(12, 2) NOT NULL, `C_MKTSEGMENT` varchar NOT NULL, `C_COMMENT` varchar NOT NULL ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/customer/customer.tbl" }'; ####2. ORDERS appearance CREATE TABLE `OSS_ORDERS` ( `O_ORDERKEY` bigint NOT NULL, `O_CUSTKEY` int NOT NULL, `O_ORDERSTATUS` varchar NOT NULL, `O_TOTALPRICE` decimal(12, 2) NOT NULL, `O_ORDERDATE` date NOT NULL, `O_ORDERPRIORITY` varchar NOT NULL, `O_CLERK` varchar NOT NULL, `O_SHIPPRIORITY` int NOT NULL, `O_COMMENT` varchar NOT NULL ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/orders/orders.tbl" }'; ####3. LINEITEM appearance CREATE TABLE `OSS_LINEITEM` ( `L_ORDERKEY` bigint NOT NULL, `L_PARTKEY` int NOT NULL, `L_SUPPKEY` int NOT NULL, `L_LINENUMBER` bigint NOT NULL, `L_QUANTITY` decimal(12, 2) NOT NULL, `L_EXTENDEDPRICE` decimal(12, 2) NOT NULL, `L_DISCOUNT` decimal(12, 2) NOT NULL, `L_TAX` decimal(12, 2) NOT NULL, `L_RETURNFLAG` varchar NOT NULL, `L_LINESTATUS` varchar NOT NULL, `L_SHIPDATE` date NOT NULL, `L_COMMITDATE` date NOT NULL, `L_RECEIPTDATE` date NOT NULL, `L_SHIPINSTRUCT` varchar NOT NULL, `L_SHIPMODE` varchar NOT NULL, `L_COMMENT` varchar NOT NULL ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/lineitem/lineitem.tbl" }'; ####4. NATION appearance CREATE TABLE `OSS_NATION` ( `N_NATIONKEY` int NOT NULL, `N_NAME` varchar NOT NULL, `N_REGIONKEY` int NOT NULL, `N_COMMENT` varchar ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/nation/nation.tbl" }'; ####5. PART appearance CREATE TABLE `OSS_PART` ( `P_PARTKEY` int NOT NULL, `P_NAME` varchar NOT NULL, `P_MFGR` varchar NOT NULL, `P_BRAND` varchar NOT NULL, `P_TYPE` varchar NOT NULL, `P_SIZE` int NOT NULL, `P_CONTAINER` varchar NOT NULL, `P_RETAILPRICE` decimal(12, 2) NOT NULL, `P_COMMENT` varchar NOT NULL ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/part/part.tbl" }'; ####6. PARTSUPP appearance CREATE TABLE `OSS_PARTSUPP` ( `PS_PARTKEY` int NOT NULL, `PS_SUPPKEY` int NOT NULL, `PS_AVAILQTY` int NOT NULL, `PS_SUPPLYCOST` decimal(12, 2) NOT NULL, `PS_COMMENT` varchar NOT NULL ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/partsupp/partsupp.tbl" }'; ####7. REGION appearance CREATE TABLE `OSS_REGION` ( `R_REGIONKEY` int NOT NULL, `R_NAME` varchar NOT NULL, `R_COMMENT` varchar ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/region/region.tbl" }'; ####8. SUPPLIER appearance CREATE TABLE `OSS_SUPPLIER` ( `S_SUPPKEY` int NOT NULL, `S_NAME` varchar NOT NULL, `S_ADDRESS` varchar NOT NULL, `S_NATIONKEY` int NOT NULL, `S_PHONE` varchar NOT NULL, `S_ACCTBAL` decimal(12, 2) NOT NULL, `S_COMMENT` varchar NOT NULL ,`dummy` varchar ) engine='oss' TABLE_PROPERTIES='{ "endpoint":"oss-cn-shenzhen-internal.aliyuncs.com", "accessid":"xxxxxxxx", "accesskey":"xxxxxxxx", "delimiter":"|", "URL":"oss://adb-tpch-shenzhen/tpch/10g/supplier/supplier.tbl" }';
Create ADB table
Create ADB table
The steps are the same as OSS appearance creation
Address: https://dms.aliyun.com/ Path: all functions - data scheme - data import initialization table structure statement is as follows: (Note: the script needs to be copied to local editing and saved as a text with. sql suffix for work order submission)
#### CUSTOMER table CREATE TABLE `CUSTOMER` ( `C_CUSTKEY` int NOT NULL, `C_NAME` varchar NOT NULL, `C_ADDRESS` varchar NOT NULL, `C_NATIONKEY` int NOT NULL, `C_PHONE` varchar NOT NULL, `C_ACCTBAL` decimal(12, 2) NOT NULL, `C_MKTSEGMENT` varchar NOT NULL, `C_COMMENT` varchar NOT NULL, primary key (c_custkey) ) DISTRIBUTE BY HASH(`c_custkey`) INDEX_ALL='Y'; #### ORDERS table CREATE TABLE `ORDERS` ( `O_ORDERKEY` bigint NOT NULL, `O_CUSTKEY` int NOT NULL, `O_ORDERSTATUS` varchar NOT NULL, `O_TOTALPRICE` decimal(12, 2) NOT NULL, `O_ORDERDATE` date NOT NULL, `O_ORDERPRIORITY` varchar NOT NULL, `O_CLERK` varchar NOT NULL, `O_SHIPPRIORITY` int NOT NULL, `O_COMMENT` varchar NOT NULL, primary key (o_orderkey,o_orderdate) ) DISTRIBUTE BY HASH(`o_orderkey`) PARTITION BY VALUE(`date_format(O_ORDERDATE, '%Y%m')`) LIFECYCLE 90 INDEX_ALL='Y'; #### LINEITEM table CREATE TABLE `LINEITEM` ( `L_ORDERKEY` bigint NOT NULL, `L_PARTKEY` int NOT NULL, `L_SUPPKEY` int NOT NULL, `L_LINENUMBER` bigint NOT NULL, `L_QUANTITY` decimal(12, 2) NOT NULL, `L_EXTENDEDPRICE` decimal(12, 2) NOT NULL, `L_DISCOUNT` decimal(12, 2) NOT NULL, `L_TAX` decimal(12, 2) NOT NULL, `L_RETURNFLAG` varchar NOT NULL, `L_LINESTATUS` varchar NOT NULL, `L_SHIPDATE` date NOT NULL, `L_COMMITDATE` date NOT NULL, `L_RECEIPTDATE` date NOT NULL, `L_SHIPINSTRUCT` varchar NOT NULL, `L_SHIPMODE` varchar NOT NULL, `L_COMMENT` varchar NOT NULL, primary key (l_orderkey,l_linenumber,l_shipdate) ) DISTRIBUTE BY HASH(`l_orderkey`) PARTITION BY VALUE(`date_format(l_shipdate, '%Y%m')`) LIFECYCLE 90 INDEX_ALL='Y'; #### NATION table CREATE TABLE `NATION` ( `N_NATIONKEY` int NOT NULL, `N_NAME` varchar NOT NULL, `N_REGIONKEY` int NOT NULL, `N_COMMENT` varchar, primary key (n_nationkey) ) DISTRIBUTE BY BROADCAST INDEX_ALL='Y'; #### PART table CREATE TABLE `PART` ( `P_PARTKEY` int NOT NULL, `P_NAME` varchar NOT NULL, `P_MFGR` varchar NOT NULL, `P_BRAND` varchar NOT NULL, `P_TYPE` varchar NOT NULL, `P_SIZE` int NOT NULL, `P_CONTAINER` varchar NOT NULL, `P_RETAILPRICE` decimal(12, 2) NOT NULL, `P_COMMENT` varchar NOT NULL, primary key (p_partkey) ) DISTRIBUTE BY HASH(`p_partkey`) INDEX_ALL='Y'; #### PARTSUPP table CREATE TABLE `PARTSUPP` ( `PS_PARTKEY` int NOT NULL, `PS_SUPPKEY` int NOT NULL, `PS_AVAILQTY` int NOT NULL, `PS_SUPPLYCOST` decimal(12, 2) NOT NULL, `PS_COMMENT` varchar NOT NULL, primary key (ps_partkey,ps_suppkey) ) DISTRIBUTE BY HASH(`ps_partkey`) INDEX_ALL='Y'; #### REGION table CREATE TABLE `REGION` ( `R_REGIONKEY` int NOT NULL, `R_NAME` varchar NOT NULL, `R_COMMENT` varchar, primary key (r_regionkey) ) DISTRIBUTE BY BROADCAST INDEX_ALL='Y'; #### SUPPLIER table CREATE TABLE `SUPPLIER` ( `S_SUPPKEY` int NOT NULL, `S_NAME` varchar NOT NULL, `S_ADDRESS` varchar NOT NULL, `S_NATIONKEY` int NOT NULL, `S_PHONE` varchar NOT NULL, `S_ACCTBAL` decimal(12, 2) NOT NULL, `S_COMMENT` varchar NOT NULL, primary key (s_suppkey) ) DISTRIBUTE BY HASH(`s_suppkey`) INDEX_ALL='Y';
Dump OSS data to ADB
Same operation steps
Address: https://dms.aliyun.com/
Path: all functions - data scheme - Data Import
The initialization script is as follows (Note: the script needs to be copied to local editing and saved as a text with. sql suffix for work order submission):
###1. CUSTOMER table ### Number of records: 150W, storage space: 234M, import time: 1 minute INSERT INTO CUSTOMER (C_CUSTKEY,C_NAME,C_ADDRESS,C_NATIONKEY,C_PHONE,C_ACCTBAL,C_MKTSEGMENT,C_COMMENT) SELECT C_CUSTKEY,C_NAME,C_ADDRESS,C_NATIONKEY,C_PHONE,C_ACCTBAL,C_MKTSEGMENT,C_COMMENT FROM OSS_CUSTOMER; ###2. ORDERS table ### Number of records: 1500W, storage space: 1.6GB, import time: 10 minutes INSERT INTO ORDERS (O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT) SELECT O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT FROM OSS_ORDERS; ###3. LINEITEM table ### Number of records: 5999W, storage space: 7.24GB, import time: 40 minutes INSERT INTO LINEITEM (L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT ,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT) SELECT L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT ,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT FROM OSS_LINEITEM; ###4. NATION table ### Number of records: 25, storage space: 2KB, import time: 1s INSERT INTO NATION (N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT) SELECT N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT FROM OSS_NATION; ###5. PART table ### Number of records: 200W, storage space: 232M, import time: 23 seconds INSERT INTO PART (P_PARTKEY,P_NAME,P_MFGR,P_BRAND,P_TYPE,P_SIZE,P_CONTAINER,P_RETAILPRICE,P_COMMENT) SELECT P_PARTKEY,P_NAME,P_MFGR,P_BRAND,P_TYPE,P_SIZE,P_CONTAINER,P_RETAILPRICE,P_COMMENT FROM OSS_PART; ###6. PARTSUPP table ### Number of records: 800W, storage space: 1.12GB, import time: 3 minutes INSERT INTO PARTSUPP (PS_PARTKEY,PS_SUPPKEY,PS_AVAILQTY,PS_SUPPLYCOST,PS_COMMENT) SELECT PS_PARTKEY,PS_SUPPKEY,PS_AVAILQTY,PS_SUPPLYCOST,PS_COMMENT FROM OSS_PARTSUPP; ###7. REGION table ### Number of records: 5, storage space: 0.38KB, import time: 10 seconds INSERT INTO REGION (R_REGIONKEY,R_NAME,R_COMMENT) SELECT R_REGIONKEY,R_NAME,R_COMMENT from OSS_REGION; ###8. SUPPLIER table ### Number of records: 10W, storage space: 14M, import time: 10 seconds INSERT INTO SUPPLIER (S_SUPPKEY,S_NAME,S_ADDRESS,S_NATIONKEY,S_PHONE,S_ACCTBAL,S_COMMENT) SELECT S_SUPPKEY,S_NAME,S_ADDRESS,S_NATIONKEY,S_PHONE,S_ACCTBAL,S_COMMENT FROM OSS_SUPPLIER;
Import data validation
Address: https://dms.aliyun.com/ Path: all functions - SQLConsole - single database query. Confirm the data import results of the largest table here. The expected value of LINEITEM table is consistent with the amount of imported data, which is 5999W
select count(*) from LINEITEM;