Problem description
An SQL was encountered in the optimization process:
SELECT SUM(user_value) FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1;
Its implementation plan is as follows:
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: user_log partitions: NULL type: ref possible_keys: index_sId_ty_vl_ct,IDX_product_id_oth1 key: IDX_product_id_oth1 key_len: 12 ref: const,const,const rows: 14884 filtered: 100.00 Extra: Using index 1 row in set, 1 warning (0.00 sec
From the perspective of execution plan, using Use Index (Overlay Index) is the best execution plan, but each query scans more data, affecting the overall query performance.
Optimization plan
Queries need to use SUM to calculate the sum of user_value s, and borrow the example of 1+1+0+0+0+0+0=1+1=2 to test as follows:
SELECT SUM(CASE WHEN user_value>0 THEN 1 ELSE 0 END) AS count1, COUNT(1) AS count2 FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1; +--------+--------+ | count1 | count2 | +--------+--------+ | 680 | 8067 | +--------+--------+
Assuming that user_value has no negative value, the results of the following two SQL s are the same:
##Test SQL1 SELECT SUM(user_value), COUNT(1) AS count2 FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1; ##Test SQL2 SELECT SUM(user_value) FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1 AND user_value>0;
The execution time of test SQL1 is 0.00327250, and its resource consumption is as follows:
+----------------------+----------+----------+------------+--------------+---------------+-------+ | Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | Swaps | +----------------------+----------+----------+------------+--------------+---------------+-------+ | starting | 0.000066 | 0.000000 | 0.000000 | 0 | 0 | 0 | | checking permissions | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Opening tables | 0.000016 | 0.000000 | 0.000000 | 0 | 0 | 0 | | init | 0.000021 | 0.000000 | 0.000000 | 0 | 0 | 0 | | System lock | 0.000015 | 0.000000 | 0.000000 | 0 | 0 | 0 | | optimizing | 0.000016 | 0.000000 | 0.000000 | 0 | 0 | 0 | | statistics | 0.000127 | 0.000000 | 0.000000 | 0 | 0 | 0 | | preparing | 0.000019 | 0.000000 | 0.000000 | 0 | 0 | 0 | | executing | 0.000011 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Sending data | 0.002867 | 0.002999 | 0.000000 | 0 | 0 | 0 | | end | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | query end | 0.000013 | 0.000000 | 0.000000 | 0 | 0 | 0 | | closing tables | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | freeing items | 0.000054 | 0.000000 | 0.000000 | 0 | 8 | 0 | | cleaning up | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | +----------------------+----------+----------+------------+--------------+---------------+-------+
The execution time of testing SQL2 is 0.00072325, and its resource consumption is as follows:
+----------------------+----------+----------+------------+--------------+---------------+-------+ | Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | Swaps | +----------------------+----------+----------+------------+--------------+---------------+-------+ | starting | 0.000072 | 0.000000 | 0.000000 | 0 | 0 | 0 | | checking permissions | 0.000013 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Opening tables | 0.000016 | 0.000000 | 0.000000 | 0 | 0 | 0 | | init | 0.000021 | 0.000000 | 0.000000 | 0 | 0 | 0 | | System lock | 0.000014 | 0.000000 | 0.000000 | 0 | 0 | 0 | | optimizing | 0.000020 | 0.000000 | 0.000000 | 0 | 0 | 0 | | statistics | 0.000089 | 0.000000 | 0.000000 | 0 | 0 | 0 | | preparing | 0.000020 | 0.000000 | 0.000000 | 0 | 0 | 0 | | executing | 0.000011 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Sending data | 0.000365 | 0.000000 | 0.000000 | 0 | 0 | 0 | | end | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | query end | 0.000014 | 0.000000 | 0.000000 | 0 | 0 | 0 | | closing tables | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | freeing items | 0.000035 | 0.000000 | 0.000000 | 0 | 0 | 0 | | cleaning up | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | +----------------------+----------+----------+------------+--------------+---------------+-------+
In the Sending data section, the gap between the two in the Durion section is about 10 times, while the gap in the CPU and user section of the test SQL2 is more obvious.
Conclusion:
When optimizing SQL, DBA needs to start with business logic besides data distribution/index structure.
PS: The above optimization assumes that user_value does not have a negative value, while user_value may have a negative value in the actual business logic, so the above optimization is purely fictitious.