MySQL learning notes - SQL statement optimization tools

Posted by kester on Sun, 27 Feb 2022 06:56:51 +0100

I covered a lot of MySQL basics earlier; this chapter explains SQL statement optimization in MySQL.

1, Locating slow queries

Before we can optimize SQL statements, we first have to find the ones that execute slowly. So how do we locate slow SQL statements in a project? Here is one way to do it.

1.1. Database preparation

First create a database table:

CREATE TABLE emp
(empno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'employee number',
ename VARCHAR(20) NOT NULL DEFAULT "" COMMENT 'name',
job VARCHAR(9) NOT NULL DEFAULT "" COMMENT 'job',
mgr MEDIUMINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'superior (manager) number',
hiredate DATE NOT NULL COMMENT 'hire date',
sal DECIMAL(7,2) NOT NULL COMMENT 'salary',
comm DECIMAL(7,2) NOT NULL COMMENT 'bonus',
deptno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'department number'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
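
Note that this table deliberately has neither a primary key nor any index on empno; that is what will make the lookups later in this chapter scan the whole table. As an optional check, you can confirm that no indexes exist:

# Optional check: the result set should be empty, i.e. no indexes exist on emp.
show index from emp;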

Then we create a stored function that returns a random string whose length is given by the parameter n:

delimiter $$

create function rand_string(n INT)
returns varchar(255) # returns a random string of length n
begin
    declare chars_str varchar(100) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    declare return_str varchar(255) default '';
    declare i int default 0;
    while i < n do
        # pick one of the 52 letters at random and append it
        set return_str = concat(return_str, substring(chars_str, floor(1 + rand() * 52), 1));
        set i = i + 1;
    end while;
    return return_str;
end $$

delimiter ;
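
Before moving on, it helps to sanity-check the function. One caveat: if binary logging is enabled on your server, creating the function may fail with error 1418 because it is not declared DETERMINISTIC; a possible workaround (it needs the SUPER privilege) is shown in the sketch below.

# Quick test: should return a random 6-character string.
select rand_string(6);

# If creating the function failed with error 1418 and binary logging is on,
# one workaround is to trust stored function creators, then re-run the CREATE FUNCTION:
set global log_bin_trust_function_creators = 1;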

Next, we create another stored function, this one returning a random integer:

delimiter $$

create function rand_num()
returns int(5)
begin
    declare i int default 0;
    # random integer in the range 10..509
    set i = floor(10 + rand() * 500);
    return i;
end $$

delimiter ;
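
Given the formula floor(10 + rand() * 500), the function returns an integer between 10 and 509, which we will later use as a department number. A quick test:

# Quick test: should return an integer in the range 10..509.
select rand_num();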

Then we use the two stored functions we just created to build a stored procedure. It takes one parameter: the number of rows to insert into the emp table:

delimiter $$

create procedure insert_emp(in max_num int(10))
begin
    declare i int default 0;
    # turn off autocommit so that all rows go into one transaction
    set autocommit = 0;
    repeat
        set i = i + 1;
        insert into emp values (i, rand_string(6), 'SALESMAN', 0001, curdate(), 2000, 400, rand_num());
    until i = max_num
    end repeat;
    # commit once at the end instead of once per row
    commit;
end $$

delimiter ;
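
The procedure turns autocommit off and commits only once at the end; committing per row would make a 10-million-row load far slower. If you want to be careful, you can first try a small batch and empty the table again before the full load (a sketch, assuming emp is still empty):

# Optional dry run: load a small batch, check it, then empty the table again.
call insert_emp(1000);
select count(*) from emp;   # expect 1000
truncate table emp;         # clear the table before the full load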

Finally, we call the stored procedure we just created and insert 10 million rows into the emp table:

call insert_emp(10000000);

1.2. Viewing slow queries

We can view the number of slow queries with the following command:

show status like 'slow_queries';

Typing this command in the mysql client now shows a value of 1; that slow query was produced by the batch insert of 10 million rows we just ran.
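
Note that Slow_queries is a status counter, so scope matters: the session scope counts only slow queries from the current connection, while the global scope counts them for the whole server since startup:

# Server-wide count of slow queries since the server was started.
show global status like 'slow_queries';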

This command only tells us how many slow queries there have been, not which statements caused them. To find out which statements are slow, we have to modify the MySQL configuration file. Open it (my.ini on Windows, my.cnf on Linux) and add the following lines under [mysqld]:

log-slow-queries=mysql_slow.log
long_query_time=1
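
Note that log-slow-queries is the option name used by older releases such as the 5.1 server in this example; in MySQL 5.6 and later it was removed in favour of slow_query_log and slow_query_log_file. The same settings can also be applied at runtime without editing the configuration file, although they are then lost on restart. A sketch:

# Runtime equivalent (global scope, SUPER privilege required). On newer versions
# the names are slow_query_log / slow_query_log_file rather than log-slow-queries.
set global slow_query_log = 'ON';
set global slow_query_log_file = 'mysql_slow.log';
set global long_query_time = 1;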

After restarting MySQL, run the following command; you can see that slow_query_log is ON and slow_query_log_file is the file we specified:

mysql> show variables like 'slow_query%';  
+---------------------+------------------------------+
| Variable_name       | Value                        |
+---------------------+------------------------------+
| slow_query_log      | ON                           |
| slow_query_log_file | mysql_slow.log               |
+---------------------+------------------------------+
2 rows in set (0.00 sec)

Run the following command to confirm that the slow query threshold we set has taken effect. From now on, any statement that takes longer than 1 second will be written to the log file.

mysql> show variables like 'long_query_time';  
+-----------------+----------+
| Variable_name   | Value    |
+-----------------+----------+
| long_query_time | 1.000000 |
+-----------------+----------+
1 row in set (0.00 sec)

Now let's run a query that takes more than 1 second:

mysql> select * from emp where empno=413345;
+--------+--------+----------+-----+------------+---------+--------+--------+
| empno  | ename  | job      | mgr | hiredate   | sal     | comm   | deptno |
+--------+--------+----------+-----+------------+---------+--------+--------+
| 413345 | vvOHUB | SALESMAN |   1 | 2014-10-26 | 2000.00 | 400.00 |     11 |
+--------+--------+----------+-----+------------+---------+--------+--------+
1 row in set (6.55 sec)

Then look in the data directory under the MySQL installation directory; a slow query log file named mysql_slow.log has been generated, with the following contents:

/usr/local/mysql/bin/mysqld, Version: 5.1.73-log (MySQL Community Server (GPL)). started with:
Tcp port: 3306  Unix socket: /tmp/mysql.sock
Time                 Id Command    Argument
# Time: 141026 23:24:08
# User@Host: root[root] @ localhost []
# Query_time: 6.547536  Lock_time: 0.002936 Rows_sent: 1  Rows_examined: 10000000
use temp;
SET timestamp=1414337048;
select * from emp where empno=413345;

The log file tells us when the slow query occurred, how many rows it returned (Rows_sent), how many rows it examined (Rows_examined), and what the statement was. Here the statement returned a single row but examined 10 million rows, i.e. a full table scan.

2, Explain execution plan

The slow query log records every SQL statement whose execution time exceeds the threshold. Before optimizing these statements, we should use the explain command to look at MySQL's execution plan and find the optimization points.

The explain command is simple to use: just write "explain + SQL statement". Below is the result of running explain on the slow query from the previous section:

mysql> explain select * from emp where empno=413345\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: emp
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000351
        Extra: Using where
1 row in set (0.00 sec)

You can see that the explain output has the following columns: id, select_type, table, type, possible_keys, key, key_len, ref, rows and Extra. They mean the following:

1. id: the SELECT identifier, i.e. the sequence number of the SELECT within the query;

2. select_type: the query type, mainly PRIMARY (the outermost query when there is a subquery), SUBQUERY (the inner SELECT of a subquery), UNION (the second and later SELECTs in a UNION), SIMPLE (a query that uses no subquery or UNION);

3. table: the table being accessed;

4. type: how the table is accessed, one of ALL (full table scan), index (full index scan), range (index range scan), ref (non-unique index lookup, e.g. the driven table of a join matched through an index), eq_ref (lookup through a primary key or unique index, at most one matching row), const (the value is read once and treated as a constant), system (a system table containing a single row), NULL (no table access is needed at all, the fastest);

5. possible_keys: the indexes that could be used for this query;

6. key: the index actually chosen;

7. key_len: the length, in bytes, of the index actually used;

8. ref: the columns or constants compared against the index to select rows;

9. rows: the estimated number of rows that must be examined;

10. Extra: additional details, which may include Distinct, Using filesort (an extra sort pass, typically for ORDER BY), Using index (the query is satisfied from the index alone), Using temporary (a temporary table is used), Using where (a WHERE clause is applied and the rows are not obtained from the index alone).

From the output of "explain select * from emp where empno=413345\G" we can clearly see that this statement performs a full table scan (type: ALL) and uses no index (key: NULL), which is exactly why it is so slow.
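
As a sketch of the fix (assuming we are free to alter the table; idx_empno is just an illustrative name), adding an index on empno changes the plan completely:

# Add an index on empno (this takes a while on 10 million rows), then re-check the plan.
alter table emp add index idx_empno (empno);

explain select * from emp where empno=413345\G
# Expected after indexing: type becomes ref, key shows idx_empno,
# and rows drops from about 10 million to the handful of matching entries.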

3, Use of Profiling

Besides the explain command for viewing execution plans, MySQL also provides the profiling tool for viewing the resources a statement consumes while it runs. To turn profiling on for the current session, use the following command:

set profiling = 1;
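
Profiling is a per-session setting. You can confirm that it is on and, optionally, raise the number of statements kept in the profile history (the default is 15, the maximum 100):

# Confirm profiling is enabled for this session.
show variables like 'profiling';

# Optionally keep more statements in the history (default 15, maximum 100).
set profiling_history_size = 50;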

Next, we execute a query command:

mysql> select * from emp where empno=413345;
+--------+--------+----------+-----+------------+---------+--------+--------+
| empno  | ename  | job      | mgr | hiredate   | sal     | comm   | deptno |
+--------+--------+----------+-----+------------+---------+--------+--------+
| 413345 | vvOHUB | SALESMAN |   1 | 2014-10-26 | 2000.00 | 400.00 |     11 |
+--------+--------+----------+-----+------------+---------+--------+--------+
1 row in set (6.44 sec)

Once profiling is enabled, MySQL automatically records profiling information for every statement executed in the session. We can then list the profiles of all recorded queries with the following command:

mysql> show profiles;
+----------+------------+--------------------------------------+
| Query_ID | Duration   | Query                                |
+----------+------------+--------------------------------------+
|        1 | 0.00053000 | show tables                          |
|        2 | 0.07412700 | select * from dept                   |
|        3 | 0.06743300 | select * from salgrade               |
|        4 | 6.44056000 | select * from emp where empno=413345 |
+----------+------------+--------------------------------------+
4 rows in set (0.00 sec)

Then we can view the profile information of a specific query through the following command:

mysql> show profile cpu, block io for query 4;
+--------------------+----------+----------+------------+--------------+---------------+
| Status             | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+--------------------+----------+----------+------------+--------------+---------------+
| starting           | 0.000107 | 0.000072 |   0.000025 |            0 |             0 |
| Opening tables     | 0.000021 | 0.000018 |   0.000003 |            0 |             0 |
| System lock        | 0.000006 | 0.000004 |   0.000001 |            0 |             0 |
| Table lock         | 0.000009 | 0.000008 |   0.000001 |            0 |             0 |
| init               | 0.000034 | 0.000033 |   0.000002 |            0 |             0 |
| optimizing         | 0.000012 | 0.000011 |   0.000001 |            0 |             0 |
| statistics         | 0.000014 | 0.000012 |   0.000001 |            0 |             0 |
| preparing          | 0.000013 | 0.000012 |   0.000002 |            0 |             0 |
| executing          | 0.000005 | 0.000005 |   0.000016 |            0 |             0 |
| Sending data       | 6.440260 | 7.818553 |   0.178155 |            0 |             0 |
| end                | 0.000008 | 0.000006 |   0.000011 |            0 |             0 |
| query end          | 0.000002 | 0.000002 |   0.000003 |            0 |             0 |
| freeing items      | 0.000030 | 0.000013 |   0.000017 |            0 |             0 |
| logging slow query | 0.000001 | 0.000000 |   0.000001 |            0 |             0 |
| logging slow query | 0.000035 | 0.000020 |   0.000015 |            0 |             0 |
| cleaning up        | 0.000003 | 0.000003 |   0.000000 |            0 |             0 |
+--------------------+----------+----------+------------+--------------+---------------+
16 rows in set (0.00 sec)

The profile shows how long each step took and how much CPU and block I/O it consumed, which lets us target our optimization much more precisely. Since this query is a full table scan, most of the time is spent in Sending data. Besides that state, the following states can also consume a lot of time: Converting HEAP to MyISAM (the result set is too large and is spilled to disk), Creating tmp table (a temporary table is created, for example to hold intermediate results of a GROUP BY), Copying to tmp table on disk, Locked, and logging slow query.
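
The same data is also exposed through the INFORMATION_SCHEMA.PROFILING table, which is convenient when you want to filter or aggregate the steps yourself; for example, a sketch that ranks the states of query 4 by total time. (On recent MySQL versions SHOW PROFILE and this table are deprecated in favour of the Performance Schema, but they remain a quick way to break down a single query.)

# Rank the execution states of query 4 by the total time spent in them (seconds).
select state, sum(duration) as total_seconds
from information_schema.profiling
where query_id = 4
group by state
order by total_seconds desc;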