1. Exits statement
When MySQL treats the EXISTS clause, it still adopts the execution method of nested subquery. As shown in the following SQL statement:
SELECT * FROM my_neighbor n LEFT JOIN my_neighbor_apply sra ON n.id = sra.neighbor_id AND sra.user_id = 'xxx' WHERE n.topic_status < 4 AND EXISTS(SELECT 1 FROM message_info m WHERE n.id = m.neighbor_id AND m.inuser = 'xxx') AND n.topic_type <> 5
explain:
+----+--------------------+-------+------+-----+------------------------------------------+---------+-------+---------+ -----+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+ | 1 | PRIMARY | n | ALL | | NULL | NULL | NULL | 1086041 | Using where | | 1 | PRIMARY | sra | ref | | idx_user_id | 123 | const | 1 | Using where | | 2 | DEPENDENT SUBQUERY | m | ref | | idx_message_info | 122 | const | 1 | Using index condition; Using where | +----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+
Removing exists and changing it to join can avoid nested subqueries and reduce the execution time from 1.93 seconds to 1 millisecond.
SELECT * FROM my_neighbor n INNER JOIN message_info m ON n.id = m.neighbor_id AND m.inuser = 'xxx' LEFT JOIN my_neighbor_apply sra ON n.id = sra.neighbor_id AND sra.user_id = 'xxx' WHERE n.topic_status < 4 AND n.topic_type <> 5
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+ | 1 | SIMPLE | m | ref | | idx_message_info | 122 | const | 1 | Using index condition | | 1 | SIMPLE | n | eq_ref | | PRIMARY | 122 | ighbor_id | 1 | Using where | | 1 | SIMPLE | sra | ref | | idx_user_id | 123 | const | 1 | Using where | +----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
2. Push under conditions
External query criteria cannot be pushed down to complex views or sub queries:
-
Aggregate sub query;
-
Subquery with LIMIT;
-
UNION or UNION ALL subquery;
-
Sub query in the output field;
As shown in the following statement, it can be seen from the execution plan that its conditions act after the aggregate subquery:
SELECT * FROM (SELECT target, Count(*) FROM operation GROUP BY target) t WHERE target = 'rm-xxxx'
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+ | 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 514 | const | 2 | Using where | | 2 | DERIVED | operation | index | idx_4 | idx_4 | 519 | NULL | 20 | Using index | +----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
SELECT target, Count(*) FROM operation WHERE target = 'rm-xxxx' GROUP BY target
3. Reduce the scope in advance
SELECT * FROM my_order o LEFT JOIN my_userinfo u ON o.uid = u.uid LEFT JOIN my_productinfo p ON o.pid = p.pid WHERE ( o.display = 0 ) AND ( o.ostaus = 1 ) ORDER BY o.selltime DESC LIMIT 0, 15
Since the last WHERE condition and sorting are for the leftmost main table, you can sort my first_ Order sorting reduces the amount of data in advance, and then makes a left connection. After SQL rewriting, the execution time is reduced to about 1 millisecond.
SELECT * FROM ( SELECT * FROM my_order o WHERE ( o.display = 0 ) AND ( o.ostaus = 1 ) ORDER BY o.selltime DESC LIMIT 0, 15 ) o LEFT JOIN my_userinfo u ON o.uid = u.uid LEFT JOIN my_productinfo p ON o.pid = p.pid ORDER BY o.selltime DESC limit 0, 15
Recheck the execution plan: after materializing the sub query (select_type=DERIVED), participate in the JOIN. Although the estimated row scan is still 900000, the actual execution time becomes very small after using the index and LIMIT clause.
4. Reduce the scope in advance
SELECT a.*, c.allocated FROM ( SELECT resourceid FROM my_distribute d WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) a LEFT JOIN ( SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated FROM my_resources GROUP BY resourcesid) c ON a.resourceid = c.resourcesid
Is there any other problem with this statement? It is not difficult to see that sub query c is a full table aggregate query. When the number of tables is particularly large, the performance of the whole statement will be degraded.
In fact, for sub query c, the final result set of the left join only cares about the data that can match the resource ID of the main table. Therefore, we can rewrite the following statement to reduce the execution time from 2 seconds to 2 milliseconds.
SELECT a.*, c.allocated FROM ( SELECT resourceid FROM my_distribute d WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) a LEFT JOIN ( SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated FROM my_resources r, ( SELECT resourceid FROM my_distribute d WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) a WHERE r.resourcesid = a.resourcesid GROUP BY resourcesid) c ON a.resourceid = c.resourcesid
But the subquery a appears many times in our SQL statement. This writing method not only has additional overhead, but also makes the whole sentence more complicated. Rewrite again using the WITH statement:
WITH a AS ( SELECT resourceid FROM my_distribute d WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) SELECT a.*, c.allocated FROM a LEFT JOIN ( SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated FROM my_resources r, a WHERE r.resourcesid = a.resourcesid GROUP BY resourcesid) c ON a.resourceid = c.resourcesid
5. Summary
The database compiler generates an execution plan, which determines the actual execution mode of SQL. However, the compiler only serves the best, and the compilers of all databases are not perfect. Most of the scenarios mentioned above also have performance problems in other databases. Only by understanding the characteristics of database compiler can we avoid its shortcomings and write high-performance SQL statements.
When programmers design data models and write SQL statements, they should bring the idea or consciousness of algorithms into them.
Develop the habit of using WITH statements when writing complex SQL statements. Simple and clear SQL statements can also reduce the burden of the database ^ ^.
If you encounter difficulties in using cloud database (not limited to SQL problems), you should always seek the help of alicloud original expert services.