💝 Today, we will introduce some features of MaxComputer, a big data engine, and MaxComputer SQL. Students interested in hive sql can view the following contents 👇:
- Part I: Import and export of Hive data of Hadoop (DML).
- Part II: Hive query statement of Hadoop.
- Part III: Seven Join statements of Hive in Hadoop.
- Part IV: Ranking of Hive in Hadoop.
- Part V: Partition table of Hive in Hadoop.
- Part VI: Bucket table of Hive in Hadoop.
🐯 MaxComputer is an enterprise level big data engine. What rules should he pay attention to in the actual production process? What should MaxComputer SQL pay attention to? Next, we will give a detailed explanation.
1. Introduction to maxcomputer
MaxComputer is a fast and fully hosted EB level data warehouse solution designed by Alibaba cloud. The main functions are:
- It is committed to the storage and calculation of batch structured telephone data, and provides solutions for massive data warehouse and analysis and modeling services.
- It is suitable for storage and computing requirements above 100GB, up to EB level.
- Support SQL, MapReduce, UDF (Java/Python), Graph, DAG based processing, interactive, memory computing, machine learning and other computing types and MP! Iterative algorithm. It simplifies the application architecture of enterprise big data platform.
1.1 architecture of maxcomputer
The architecture of MaxComputer mainly consists of four layers:
Among them, let's talk about the computing layer: the computing layer uses Feitian kernel to realize user data storage and computing tasks. Take a closer look to see if it is very similar to hadoop ecology!
1.2 MaxComputer usage process
In this paper, we mainly talk about the part of data processing.
2. MaxComputer SQL
MaxCompute SQL is a data warehouse tool, which adopts a syntax similar to SQL. It is a subset of the standard syntax ANSISQL92. It has its own extension, supports common SQL syntax, including window functions, and is suitable for data operation of massive data (TB/PB level). The features are as follows:
- It is not equivalent to relational database and does not support transaction, primary foreign key constraint, index and other syntax.
- After the Query is parsed, the calculation and storage are handed over to the underlying Feitian implementation.
- Task preparation, submission and other stages take a long time, and the real-time performance is not high.
- The SQL length is limited. At present, it cannot exceed 2MB
2.1 data type of maxcomputer
Please keep the following data types in mind!!!
2.2 introduction to maxcomputer SQL
What is MaxComputer SQL?
- Is a data warehouse tool (hive similar)
- Similar to sql syntax, but different, and has its own expansion
- Support common SQL syntax and window functions
- It is applicable to the operation of massive data (TP/BP level)
MaxComputer SQL feature
- It is not equivalent to relational database and does not support transaction, primary foreign key constraint, index and other syntax.
- After Query parsing, the calculation and storage are implemented by the underlying Feitian (equivalent to the relationship between hive, MapReduce and HDFS)
- The task preparation and submission phases take a lot of time
- SQL time is limited.
- There are no delete and update statements.
2.2 MaxComputer SQL DDL
DDL refers to the data definition language, which mainly includes the following operations:
- Table operations: create, modify, delete, view
- Partition and column operations: add, modify, delete
- Lifecycle operations: create, modify, prohibit / restore
- View operations: create, delete
2.2.1 table creation statement
- Common table building
--Create table CREATE TABLE [IF NOT EXISTS] table_name [(col name data type (COMMENT col comment],...) ] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment],...)] [LIFECYCLE days] [AS select statement];
- Quick table creation
--as Build table CREATE TABLE ‹table_name> AS SELECT <column_list> FROM ‹table_name> WHERE ...; --like Build table CREATE TABLE ‹table_name> LIKE ‹table name>;
AS and LIKE:
- Data:
AS can bring in data and rely on multiple tables.
LIKE can only copy the structure of a single table and cannot substitute data. - Properties:
AS cannot bring in information and comments such AS life cycle and partition key.
LIKE cannot be brought into the life cycle, but can bring in partition keys, comments and other information.
2.2.2 delete table
--Delete table DROP TABLE [IF EXISTS] table_name;
2.2.3 modification table
--Add column: ALTER TABLE <table_name> ADD COLUMNS (<co1_name1> ‹type1>, ‹col_name2> ‹type2> ...]) ; --Change column name: ALTER TABLE <table_name> CHANGE COLUMN <old_col_name> RENAME TO <new_col_name> ; --Comments for this column: ALTER TABLE <table_name> CHANGE COLUMN <col_name> COMMENT "‹col_comment>" :
2.2.4 Partition operation
--Add partition ALTER TABLE <table_name> ADD [IF NOT EXISTS] PARTITION ‹pt spec> [PARTITION <pt_spec> PARTITION <pt_spec>...]; --delete a partition ALTER TABLE <table_name> DROP [IF EXISTS] PARTITION <partition_filtercondition>;
2.2.5 modify life cycle
--Specifies the lifecycle of the table when it is created CREATE TABLE <table_name> (<col_spec>) LIFECYCLE days ; --Modify the lifecycle of a table ALTER TABLE <table_name> SET LIFECYCLE days --Disable or restore the lifecycle. ALTER TABLE <table_name> DISABLE LIFECYCLE ALTER TABLE <table_name> ENABLE LIFECYCLE --Modify update time ALTER TABLE <table_name> TOUCH PARTITION (<pt_spec>);
2.2.6 view operation
--Create view: CREATE [OR REPLACE] VIEW [IF NOT EXISTS] view_name [ (col name [COMMENT col comment], ...)] [COMMENT view_comment] [AS select_statement] --Delete view DROP VIEW [IF NOT EXISTS] view_name; --Rename view ALTER VIEW view_name RENAME TO new_view_name;
3. Summary
Originally, we should show you the steps of using MaxComputer to process data tables. However, we found that there were many DDL statements and each part would be very messy. We won't show you, but we can show you the Alibaba cloud client:
- This is your own odps client:
- This is DataWorks
There are two main points to note here: - Data type of MaxComputer SQL
- There are no delete and update statements
4. References
Alibaba cloud global training center MaxComputer