Alibaba cloud MaxComputer SQL learning DDL

Posted by gilijk on Fri, 21 Jan 2022 23:49:20 +0100

💝 Today, we will introduce some features of MaxComputer, a big data engine, and MaxComputer SQL. Students interested in hive sql can view the following contents 👇:

🐯 MaxComputer is an enterprise level big data engine. What rules should he pay attention to in the actual production process? What should MaxComputer SQL pay attention to? Next, we will give a detailed explanation.

1. Introduction to maxcomputer

MaxComputer is a fast and fully hosted EB level data warehouse solution designed by Alibaba cloud. The main functions are:

  • It is committed to the storage and calculation of batch structured telephone data, and provides solutions for massive data warehouse and analysis and modeling services.
  • It is suitable for storage and computing requirements above 100GB, up to EB level.
  • Support SQL, MapReduce, UDF (Java/Python), Graph, DAG based processing, interactive, memory computing, machine learning and other computing types and MP! Iterative algorithm. It simplifies the application architecture of enterprise big data platform.

1.1 architecture of maxcomputer

The architecture of MaxComputer mainly consists of four layers:

Among them, let's talk about the computing layer: the computing layer uses Feitian kernel to realize user data storage and computing tasks. Take a closer look to see if it is very similar to hadoop ecology!

1.2 MaxComputer usage process

In this paper, we mainly talk about the part of data processing.

2. MaxComputer SQL

MaxCompute SQL is a data warehouse tool, which adopts a syntax similar to SQL. It is a subset of the standard syntax ANSISQL92. It has its own extension, supports common SQL syntax, including window functions, and is suitable for data operation of massive data (TB/PB level). The features are as follows:

  • It is not equivalent to relational database and does not support transaction, primary foreign key constraint, index and other syntax.
  • After the Query is parsed, the calculation and storage are handed over to the underlying Feitian implementation.
  • Task preparation, submission and other stages take a long time, and the real-time performance is not high.
  • The SQL length is limited. At present, it cannot exceed 2MB

2.1 data type of maxcomputer

Please keep the following data types in mind!!!

2.2 introduction to maxcomputer SQL

What is MaxComputer SQL?

  • Is a data warehouse tool (hive similar)
  • Similar to sql syntax, but different, and has its own expansion
  • Support common SQL syntax and window functions
  • It is applicable to the operation of massive data (TP/BP level)

MaxComputer SQL feature

  • It is not equivalent to relational database and does not support transaction, primary foreign key constraint, index and other syntax.
  • After Query parsing, the calculation and storage are implemented by the underlying Feitian (equivalent to the relationship between hive, MapReduce and HDFS)
  • The task preparation and submission phases take a lot of time
  • SQL time is limited.
  • There are no delete and update statements.

2.2 MaxComputer SQL DDL

DDL refers to the data definition language, which mainly includes the following operations:

  1. Table operations: create, modify, delete, view
  2. Partition and column operations: add, modify, delete
  3. Lifecycle operations: create, modify, prohibit / restore
  4. View operations: create, delete

2.2.1 table creation statement

  • Common table building
--Create table
CREATE TABLE [IF NOT EXISTS] table_name
[(col name data type (COMMENT col comment],...) ]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment],...)]
[LIFECYCLE days]
[AS select statement];
  • Quick table creation
--as Build table
CREATE TABLE ‹table_name> AS SELECT <column_list> FROM ‹table_name> WHERE ...;
--like Build table
CREATE TABLE ‹table_name> LIKE ‹table name>;

AS and LIKE:

  • Data:
    AS can bring in data and rely on multiple tables.
    LIKE can only copy the structure of a single table and cannot substitute data.
  • Properties:
    AS cannot bring in information and comments such AS life cycle and partition key.
    LIKE cannot be brought into the life cycle, but can bring in partition keys, comments and other information.

2.2.2 delete table

--Delete table
DROP TABLE [IF EXISTS] table_name;

2.2.3 modification table

--Add column:
ALTER TABLE <table_name> ADD COLUMNS (<co1_name1> ‹type1>, ‹col_name2> ‹type2>
...]) ;
--Change column name:
ALTER TABLE <table_name> CHANGE COLUMN <old_col_name> RENAME TO <new_col_name> ;
--Comments for this column:
ALTER TABLE <table_name> CHANGE COLUMN <col_name> COMMENT "‹col_comment>"
:

2.2.4 Partition operation

--Add partition
ALTER TABLE <table_name> ADD [IF NOT EXISTS] PARTITION ‹pt spec> [PARTITION <pt_spec>
PARTITION <pt_spec>...];
--delete a partition
ALTER TABLE <table_name> DROP [IF EXISTS] PARTITION <partition_filtercondition>;

2.2.5 modify life cycle

--Specifies the lifecycle of the table when it is created
CREATE TABLE <table_name> (<col_spec>) LIFECYCLE days ;
--Modify the lifecycle of a table
ALTER TABLE <table_name> SET LIFECYCLE days
--Disable or restore the lifecycle.
ALTER TABLE <table_name> DISABLE LIFECYCLE
ALTER TABLE <table_name> ENABLE LIFECYCLE
--Modify update time
ALTER TABLE <table_name> TOUCH PARTITION (<pt_spec>);

2.2.6 view operation

--Create view:
CREATE [OR REPLACE] VIEW [IF NOT EXISTS] view_name
[ (col name [COMMENT col comment], ...)]
[COMMENT view_comment]
[AS select_statement]
--Delete view
DROP VIEW [IF NOT EXISTS] view_name;
--Rename view
ALTER VIEW view_name RENAME TO new_view_name;

3. Summary

Originally, we should show you the steps of using MaxComputer to process data tables. However, we found that there were many DDL statements and each part would be very messy. We won't show you, but we can show you the Alibaba cloud client:

  • This is your own odps client:
  • This is DataWorks

    There are two main points to note here:
  • Data type of MaxComputer SQL
  • There are no delete and update statements

4. References

Alibaba cloud global training center MaxComputer

Topics: Hadoop SQL Alibaba Cloud