MySQL parsing order

Posted by rlhms09 on Sun, 21 Jun 2020 04:28:16 +0200

SQL parsing order

Next, let's take a look at the past and present life of an SQL statement.
First, let's look at the sample statement:

SELECT DISTINCT
    < select_list >
FROM
    < left_table > < join_type >
JOIN < right_table > ON < join_condition >
WHERE
    < where_condition >
GROUP BY
    < group_by_list >
HAVING
    < having_condition >
ORDER BY
    < order_by_condition > 
LIMIT < limit_number >

However, its execution order is as follows:

-- Row filtering
FROM <left_table>
ON <join_condition>
<join_type> JOIN <right_table> -- The second and third steps will loop
WHERE <where_condition> -- The fourth step will be executed circularly. The execution order of multiple conditions is from left to right.
GROUP BY <group_by_list>
HAVING <having_condition>
-- Column filtering
SELECT -- Group before execution SELECT
DISTINCT <select_list>
--sort
ORDER BY <order_by_condition>
-- MySQL additional
LIMIT <limit_number> -- The first nine steps are SQL92 Standard grammar. limit yes MySQL The unique syntax of.

Although I didn't expect it to be like this, it's still very natural and harmonious at first sight. Where can I get it? I need to choose the same or different filter conditions and arrange them in order. Then I know I need to take the first few.
In this case, let's take a step-by-step look at the details of the case.
Now start the journey of SQL parsing!

1.FROM

Calculate the cross join for the left and right tables of FROM. Generate virtual table VT1

mysql> select * from product,category;
+-----+-----------------+-------+-------+------+----+--------------+
| pid | pname           | price | pdate | cid  | id | cname        |
+-----+-----------------+-------+-------+------+----+--------------+
|   1 | Durian, Thailand      |    98 | NULL  |    1 |  1 | Foreign food     |
|   1 | Durian, Thailand      |    98 | NULL  |    1 |  2 | Domestic food     |
|   1 | Durian, Thailand      |    98 | NULL  |    1 |  3 | Domestic clothing     |
|   2 | Date of Thailand        |    38 | NULL  |    1 |  1 | Foreign food     |
|   2 | Date of Thailand        |    38 | NULL  |    1 |  2 | Domestic food     |
|   2 | Date of Thailand        |    38 | NULL  |    1 |  3 | Domestic clothing     |
|   3 | Xinjiang cut cake        |    68 | NULL  |    2 |  1 | Foreign food     |
|   3 | Xinjiang cut cake        |    68 | NULL  |    2 |  2 | Domestic food     |
|   3 | Xinjiang cut cake        |    68 | NULL  |    2 |  3 | Domestic clothing     |
|   4 | Thirteen fragrance          |    10 | NULL  |    2 |  1 | Foreign food     |
|   4 | Thirteen fragrance          |    10 | NULL  |    2 |  2 | Domestic food     |
|   4 | Thirteen fragrance          |    10 | NULL  |    2 |  3 | Domestic clothing     |
|   5 | Date of Thailand        |    20 | NULL  |    2 |  1 | Foreign food     |
|   5 | Date of Thailand        |    20 | NULL  |    2 |  2 | Domestic food     |
|   5 | Date of Thailand        |    20 | NULL  |    2 |  3 | Domestic clothing     |
|   6 | Date of Thailand        |    98 | NULL  |   20 |  1 | Foreign food     |
|   6 | Date of Thailand        |    98 | NULL  |   20 |  2 | Domestic food     |
|   6 | Date of Thailand        |    98 | NULL  |   20 |  3 | Domestic clothing     |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 |  1 | Foreign food     |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 |  2 | Domestic food     |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 |  3 | Domestic clothing     |
+-----+-----------------+-------+-------+------+----+--------------+
21 rows in set (0.00 sec)

2.ON filtering

ON filter virtual table VT1, only those matching rows will be recorded in virtual table VT2.
Note: because of the grammar limitation, we use "WHERE" instead, from which readers can also feel the subtle relationship between them;
 

mysql> select * from product a,category b where a.cid-b.id;
+-----+-----------------+-------+-------+------+----+--------------+
| pid | pname           | price | pdate | cid  | id | cname        |
+-----+-----------------+-------+-------+------+----+--------------+
|   1 | Durian, Thailand      |    98 | NULL  |    1 |  2 | Domestic food     |
|   1 | Durian, Thailand      |    98 | NULL  |    1 |  3 | Domestic clothing     |
|   2 | Date of Thailand        |    38 | NULL  |    1 |  2 | Domestic food     |
|   2 | Date of Thailand        |    38 | NULL  |    1 |  3 | Domestic clothing     |
|   3 | Xinjiang cut cake        |    68 | NULL  |    2 |  1 | Foreign food     |
|   3 | Xinjiang cut cake        |    68 | NULL  |    2 |  3 | Domestic clothing     |
|   4 | Thirteen fragrance          |    10 | NULL  |    2 |  1 | Foreign food     |
|   4 | Thirteen fragrance          |    10 | NULL  |    2 |  3 | Domestic clothing     |
|   5 | Date of Thailand        |    20 | NULL  |    2 |  1 | Foreign food     |
|   5 | Date of Thailand        |    20 | NULL  |    2 |  3 | Domestic clothing     |
|   6 | Date of Thailand        |    98 | NULL  |   20 |  1 | Foreign food     |
|   6 | Date of Thailand        |    98 | NULL  |   20 |  2 | Domestic food     |
|   6 | Date of Thailand        |    98 | NULL  |   20 |  3 | Domestic clothing     |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 |  1 | Foreign food     |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 |  2 | Domestic food     |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 |  3 | Domestic clothing     |
+-----+-----------------+-------+-------+------+----+--------------+
16 rows in set (0.00 sec)

3.OUTER JOIN add outer column

If the OUTER JOIN (such as left join and right join) is specified, the unmatched rows in the reserved table will be added to the virtual table VT2 as outer rows to generate the virtual table VT3.
If there are more than two tables in the FROM clause, steps 1-3 will be repeated for the result VT3 FROM the previous join connection and the next table until all the tables are processed.

mysql> select * from product a left outer join category b on a.cid=b.id; # Data in the left table shall prevail
+-----+-----------------+-------+-------+------+------+--------------+
| pid | pname           | price | pdate | cid  | id   | cname        |
+-----+-----------------+-------+-------+------+------+--------------+
|   1 | Durian, Thailand      |    98 | NULL  |    1 |    1 | Foreign food     |
|   2 | Date of Thailand        |    38 | NULL  |    1 |    1 | Foreign food     |
|   3 | Xinjiang cut cake        |    68 | NULL  |    2 |    2 | Domestic food     |
|   4 | Thirteen fragrance          |    10 | NULL  |    2 |    2 | Domestic food     |
|   5 | Date of Thailand        |    20 | NULL  |    2 |    2 | Domestic food     |
|   6 | Date of Thailand        |    98 | NULL  |   20 | NULL | NULL         |
|   7 | iPhone mobile phone      |   800 | NULL  |   30 | NULL | NULL         |
+-----+-----------------+-------+-------+------+------+--------------+
7 rows in set (0.00 sec)
mysql> select * from product a right outer join category b on a.cid=b.id; # Data in the right table shall prevail
+------+-----------------+-------+-------+------+----+--------------+
| pid  | pname           | price | pdate | cid  | id | cname        |
+------+-----------------+-------+-------+------+----+--------------+
|    1 | Durian, Thailand      |    98 | NULL  |    1 |  1 | Foreign food     |
|    2 | Date of Thailand        |    38 | NULL  |    1 |  1 | Foreign food     |
|    3 | Xinjiang cut cake        |    68 | NULL  |    2 |  2 | Domestic food     |
|    4 | Thirteen fragrance          |    10 | NULL  |    2 |  2 | Domestic food     |
|    5 | Date of Thailand        |    20 | NULL  |    2 |  2 | Domestic food     |
| NULL | NULL            |  NULL | NULL  | NULL |  3 | Domestic clothing     |
+------+-----------------+-------+-------+------+----+--------------+
6 rows in set (0.00 sec)

4.WHERE

Filter the virtual table VT3 by WHERE condition. Only matching records will be inserted into the virtual table VT4.
be careful:
At this time, because of grouping, aggregation operation cannot be used, and alias created in SELECT cannot be used;

Difference with ON:

  • If there are external columns, ON filters the associated table, and the main table (reserved table) will return all the columns;
  • If no external columns are added, the effect is the same;

Application:

  • Filter the main table in WHERE;
  • For association tables, use ON after condition query and then join, and use WHERE after condition query;
mysql> select * from product a left outer join category b on a.cid=b.id where a.pname='Date of Thailand';
+-----+--------------+-------+-------+------+------+--------------+
| pid | pname        | price | pdate | cid  | id   | cname        |
+-----+--------------+-------+-------+------+------+--------------+
|   2 | Date of Thailand     |    38 | NULL  |    1 |    1 | Foreign food     |
|   5 | Date of Thailand     |    20 | NULL  |    2 |    2 | Domestic food     |
|   6 | Date of Thailand     |    98 | NULL  |   20 | NULL | NULL         |
+-----+--------------+-------+-------+------+------+--------------+
3 rows in set (0.00 sec)

5.GROUP BY

According to the columns in the group by clause, the records in VT4 are grouped to generate the virtual table VT5.

be careful:

The columns used in the subsequent statements, such as select and having, must be included in GROUP BY. For those that do not appear, aggregate functions are used;
reason:
GROUP BY changes the reference to the table, converts it to a new reference mode, and reduces the number of columns that can perform the next level of logical operation on it;

My understanding is:
According to the grouping field, the records with the same grouping field can be merged into one record, because each grouping can only return one record, unless it is filtered out, and the fields not in the grouping field may have multiple values, and multiple values cannot be put into one record, so these columns with multiple values must be converted into single values through the aggregation function;

mysql> select * from product a left outer join category b on a.cid=b.id where a.pname='Date of Thailand' group by a.price;
+-----+--------------+-------+-------+------+------+--------------+
| pid | pname        | price | pdate | cid  | id   | cname        |
+-----+--------------+-------+-------+------+------+--------------+
|   5 | Date of Thailand     |    20 | NULL  |    2 |    2 | Domestic food     |
|   2 | Date of Thailand     |    38 | NULL  |    1 |    1 | Foreign food     |
|   6 | Date of Thailand     |    98 | NULL  |   20 | NULL | NULL         |
+-----+--------------+-------+-------+------+------+--------------+
3 rows in set (0.00 sec)

6.HAVING

Apply the having filter to the virtual table VT5, and only the matching records will be inserted into the virtual table VT6.
 

mysql> select * from product a left outer join category b on a.cid=b.id where a.pname='Date of Thailand' group by a.price having b.id <=2;
+-----+--------------+-------+-------+------+------+--------------+
| pid | pname        | price | pdate | cid  | id   | cname        |
+-----+--------------+-------+-------+------+------+--------------+
|   5 | Date of Thailand     |    20 | NULL  |    2 |    2 | Domestic food     |
|   2 | Date of Thailand     |    38 | NULL  |    1 |    1 | Foreign food     |
+-----+--------------+-------+-------+------+------+--------------+
2 rows in set (0.00 sec)

7.SELECT

This clause processes the elements in the SELECT clause to generate the VT5 table. (5-J1) evaluation expression evaluates the expression in the SELECT clause to generate VT5-J1

8.DISTINCT

Find the duplicate columns in VT5-1 and delete them to generate VT5-J2
If the DISTINCT clause is specified in the query, a temporary memory table will be created (if the memory cannot be stored, it needs to be stored on the hard disk). The table structure of this temporary table is the same as the virtual table VT5 generated in the previous step. The difference is that a unique index is added to the column for DISTINCT operation to divide the duplicate data.
 

mysql> select distinct a.pname from product a left outer join category b on a.cid=b.id where a.pname='Date of Thailand' group by a.price ;
+--------------+
| pname        |
+--------------+
| Date of Thailand     |
+--------------+
1 row in set (0.00 sec)

9.ORDER BY

From the table in VT5-J2, sort the results according to the conditions of the ORDER BY clause to generate the VT6 table.

be careful:

The only place where aliases in SELECT can be used;
 

mysql> select * from product a left outer join category b on a.cid=b.id where a.pname='Date of Thailand' group by a.price having b.id <=2 order by b.id;
+-----+--------------+-------+-------+------+------+--------------+
| pid | pname        | price | pdate | cid  | id   | cname        |
+-----+--------------+-------+-------+------+------+--------------+
|   2 | Date of Thailand     |    38 | NULL  |    1 |    1 | Foreign food     |
|   5 | Date of Thailand     |    20 | NULL  |    2 |    2 | Domestic food     |
+-----+--------------+-------+-------+------+------+--------------+
2 rows in set (0.00 sec)

10.LIMIT (unique to MySQL)

The LIMIT clause selects the specified row data starting from the specified location from the VT6 virtual table obtained in the previous step.

be careful:
The positive and negative effects of offset and rows;
When the offset is large and the aging rate is low, you can do this:
The sub query is used to optimize. In the sub query, the maximum id is obtained from the index first, then it is inverted, and then the result set of N rows is obtained
Using INNER JOIN optimization, the JOIN clause also takes priority to get ID list from index, and then directly associates with query to get the final result

mysql> select * from product a left outer join category b on a.cid=b.id where a.pname='Date of Thailand' group by a.price having b.id <=2 order by b.id limit 1;
+-----+--------------+-------+-------+------+------+--------------+
| pid | pname        | price | pdate | cid  | id   | cname        |
+-----+--------------+-------+-------+------+------+--------------+
|   2 | Thailand jujube     |    38 | NULL  |    1 |    1 | Foreign food     |
+-----+--------------+-------+-------+------+------+--------------+
1 row in set (0.00 sec)

Summary of analytical sequence

Illustration

Process analysis

FROM (Cartesian product the last two tables) --- VT1
 ON (filter VT1 according to its conditions) --- VT2
 JOIN - VT3
 WHERE (filter records in VT3) - VT4 VTn
 GROUP BY - VT5
 HAVING (filtering records in VT5) --- VT6
 SELECT (SELECT the specified column for the record in VT6) - VT7
 ORDER BY - VT8
 LIMIT (paging sorted values) - MySQL specific syntax

Process Description:

  • Single table query: filter the records in the table according to WHERE criteria to form an intermediate table (the intermediate table is invisible to users); then SELECT the corresponding column according to the SELECT column to return the final result.
  • Two table join query: two table quadrature (Cartesian product) is filtered with ON condition and join join type to form an intermediate table; then the records of the intermediate table are filtered according to WHERE condition and the query result is returned according to the column specified by SELECT.
Cartesian product: row multiplication and column addition.
  • Multi table join query: first query the first and second tables according to the two table join, then use the query results and the third table to join the query, and so on, until all tables are connected, and finally form an intermediate result table, then filter the records of the intermediate table according to the WHERE condition, and return the query results according to the column specified by SELECT.

WHERE condition resolution order

  1. MySQL: execute the WHERE condition from left to right.
  2. Oracle: execute the WHERE condition from right to left.
When writing the WHERE condition, the part with high priority should write the condition statement with the greatest filtering force.

 

Topics: MySQL Mobile SQL Oracle