MysqlExplain explains in detail when the query may use the index union index action rule id select_type table type possible_keys key key_len ref Extra

Posted by dreado on Mon, 04 Oct 2021 02:35:58 +0200

preface

sql tuning is very common in actual development, especially when the server is under pressure, it is necessary to optimize the time-consuming and commonly used sql to improve the high availability of the server. The explain tool will be used during tuning. This chapter will use explain to see how powerful it is.

1, EXPLAIN

Use the EXPLAIN keyword to simulate the optimizer to execute SQL statements and analyze the performance bottlenecks of your query statements or structures;
Add the explain keyword before the select statement, MySQL will set a flag on the query, and executing the query will return the information of the execution plan instead of executing the SQL;

If the from contains a subquery, the subquery will still be executed and the results will be put into the temporary table;

2, Create three tables

# Cast
DROP TABLE IF EXISTS `actor`;
CREATE TABLE `actor` (
    `id` int(11) NOT NULL,
    `name` varchar(45) DEFAULT NULL,
    `update_time` datetime DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `actor` (`id`, `name`, `update_time`) VALUES (1,'a','2017-12-2 15:27:18'), (2,'b','2017-12-22 15:27:18'), (3,'c','2017-12-22 15:27:18');

# Movie table
DROP TABLE IF EXISTS `film`;
CREATE TABLE `film` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `idx_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `film` (`id`, `name`) VALUES (3,'film0'),(1,'film1'),(2,'film2');

# Association table of actors and movies
DROP TABLE IF EXISTS `film_actor`;
CREATE TABLE `film_actor` (
    `id` int(11) NOT NULL,
    `film_id` int(11) NOT NULL,
    `actor_id` int(11) NOT NULL,
    `remark` varchar(255) DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `idx_film_actor_id` (`film_id`,`actor_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `film_actor` (`id`, `film_id`, `actor_id`) VALUES (1,1,1),(2,1,2),(3,2,1);

3, Columns in explain

explain select * from actor;

The above implementation plan results:

id column

The number of the id column is the serial number of the select. If there are several selections, there are several IDS, and the occurrence order of IDS increases according to the occurrence order of the select.
The larger the id column, the higher the execution priority. If the id is the same, it will be executed from top to bottom. If the id is NULL, it will be executed last.

select_type column

select_type indicates whether the corresponding row is a simple or complex query.
Simple queries are the only simple queries. Complex queries include: primary, subquery, derived, union;

  • Simple: simple query. The query does not contain subqueries and union s;
explain select * from film where id = 2;

  • primary: select the outermost layer---- Complex query
  • Subquery: subquery contained in the select (not in the from clause) – complex query
  • Derived: the subquery contained in the from clause. MySQL will store the results in a temporary table, also known as a derived table (derived in English)-- Complex query

Use the following examples to understand the primary, subquery, and derived types:

 set session optimizer_switch='derived_merge=off'; #Close the new feature of mysql5.7 to merge and optimize the derived tables
 explain select (select 1 from actor where id = 1) from (select * from film where id = 1) der;
 set session optimizer_switch='derived_merge=on'; #Restore default configuration


Analyze the execution sequence of SQL: first execute the query of film table -- > then the query of actor table -- > and finally the generated temporary table query;

Select in line 1 (id 1)_ Type is primary, indicating the outermost query of the; If the id is 1 and is the smallest, it is executed last;

Select in line 2 (id 3)_ The type is derived, which indicates the sub query after the from, also known as the derived table, and the derived table alias is der; If the id is 3 and the id is the largest, it will be executed first;

Select in line 3 (id 2)_ Type is subquery, which indicates the subquery before from; If the id is 2 and it is intermediate, it is executed in the middle;

  • Union: the second and subsequent select ions in the union are union;
explain select 1 union all select 1;

table column

This column indicates which table a row of explain is accessing.

When there is a subquery in the from clause, the table column is in the < derivenn > format, which means that the current query depends on the query with id=N, so the query with id=N is executed first.

When there is a union, the value of the table column of UNION RESULT is < union1,2 >, and 1 and 2 represent the select row id participating in the union.

NULL: mysql can decompose query statements in the optimization phase, and there is no need to access tables or indexes in the execution phase. For example, selecting the minimum value in the index column can be completed by looking up the index separately, and there is no need to access the table during execution

explain select min(id) from film;

type column

This column represents the association type or access type, that is, MySQL determines how to find rows in the table and the approximate range of data row records.
From the best to the worst:
system > const > eq_ref > ref > range > index > ALL
Generally speaking, it is necessary to ensure that the query reaches the range level, preferably ref;

  • const, system: mysql can optimize a part of the query and convert it into a constant (you can see the result of show warnings) system = 1.

When comparing all columns of primary key or unique key with constants, the table has at most one matching row, which is read once, which is fast. System is a special case of const. It is system when there is only one data in the table; (appears when using primary key or unique index)

set session optimizer_switch='derived_merge=off'; #Close the new feature of mysql5.7 to merge and optimize the derived tables
explain extended select * from (select * from film where id = 1) tmp;
set session optimizer_switch='derived_merge=on'; #Restore default configuration

  • eq_ref: all parts of the primary key or unique key index are used by connection, and only one qualified record will be returned at most.

This is probably the best join type outside const, which will not appear in a simple select query.
(all fields of primary key or unique index are used when connecting queries)

explain select * from film_actor left join film on film_actor.film_id = film.id;

Description: film_id: a field in the union index of the table, but the type is All;
Because * query means to query all fields, but film_ The remark field of the actor table is not indexed, so a full table scan is required;

  • ref: compared to eq_ref, instead of using the unique index, use the partial prefix of the ordinary index or the unique index. If the index is compared with a certain value, multiple qualified rows may be found;

(partial prefix of common index or union unique index used)

For a simple select query, name is a common index (non unique index)

explain select * from film where name = 'film1';


Association table query, idx_film_actor_id is film_id and actor_ For the joint index of ID, film is used here_ The left prefix of actor is film_id section.

explain select film_id from film left join film_actor on film.id = film_actor.film_id;

  • Range: range scanning usually occurs in in (), between, >, <, > = and other operations. Use an index to retrieve rows in a given range.
explain select * from actor where id > 1;

  • Index: scan the full table index, which is usually faster than All;
explain select * from film;


All fields in the film table are indexed. If * is used to query, the type is index;

  • ALL: that is, full table scanning, which means that mysql needs to find the required rows from beginning to end. Usually, this needs to increase the index for optimization.
explain select * from actor;

possible_keys column

This column shows which indexes the query may use to find.
Possible when explain ing_ Keys has a value, but the key displays NULL. This is because there is not much data in the table. mysql thinks that the index is not helpful for this query, so it selects the full table query.
If the column is NULL, there is no associated index. In this case, you can improve query performance by checking the where clause to see if you can create an appropriate index, and then use explain to see the effect.

key column

This column shows which index mysql actually uses to optimize access to the table.
If no index is used, the column is NULL. If you want to force mysql to use or ignore possible_ The index in the keys column. force index and ignore index are used in the query.

explain select * from film where name = 'film1';

key_len column

This column shows the number of bytes used by mysql in the index. Through this value, you can calculate which columns in the index are used.
film_ Union index idx of actor_ film_actor_id by film_id and actor_id consists of two int columns, and each int is 4 bytes. Pass the key in the result_ Len = 4, it can be inferred that the query uses the first column: film_id column to perform index lookup.

explain select * from film_actor where film_id = 2;


key_len calculation rules are as follows:

  • character string
    • char(n): n-byte length
    • varchar(n): 2 bytes to store the string length. In case of utf-8, the length is 3n + 2
  • value type
    • tinyint: 1 byte
    • smallint: 2 bytes
    • int: 4 bytes
    • bigint: 8 bytes
  • Time type
    • date: 3 bytes
    • timestamp: 4 bytes
    • datetime: 8 bytes

Note: if the field is allowed to be NULL, whether the 1-byte record is NULL is required
The maximum length of the index is 768 bytes. When the string is too long, mysql will do a process similar to the left prefix index to extract the first half of the characters for indexing.

ref column

This column shows the columns or constants used by the table to find the value in the index of the key column record. The common ones are const (constant) and field name (e.g. film.id).

rows column

This column is the number of rows that mysql estimates to read and detect. Note that this is not the number of rows in the result set. The number of possible rows of the scanned index

Extra column

This column shows additional information. Common important values are as follows:
Using index > Using index condition > Using where

Using index:

Use overlay index; Overlay index is the data column of select, which can be obtained only from the index without reading the data row. In other words, the query column should be overwritten by the built index. That is, all fields in the query result set are in the index;

explain select film_id from film_actor where film_id = 1;

remark has no index, so

explain select film_id,remark from film_actor where film_id = 1;

Using index condition:

The query column is not completely covered by the index. The where condition is the range of the leading column of a joint index;

explain select * from film_actor where film_id > 1

Using where:

Use the where statement to process the results, and the columns of the query are not overwritten by the index; When searching using an index, you need to go back to the table to query the required data

explain select * from actor where name = 'a';

Using temporary:

mysql needs to create a temporary table to process the query. In this case, it is generally necessary to optimize. The first thing is to think of using index to optimize.

  • actor.name has no index. At this time, a temporary table is created to distinguish;
    (distinct queries may use temporary tables)
explain select distinct name from actor;

  • film.name establishes idx_name index. In this case, extra is using index when querying, and no temporary table is used;
    Load the index tree into memory and then de duplicate it;
 explain select distinct name from film;

Using filesort:

External sorting will be used instead of index sorting. When the data is small, it will be sorted from memory. Otherwise, it needs to be sorted on disk. In this case, it is generally necessary to consider using indexes for optimization.

  • If actor.name does not create an index, it will browse the entire actor table, save the sorting keyword name and the corresponding id, then sort name and retrieve row records
explain select * from actor order by name;

  • film.name establishes idx_name index. When querying, extra is using index
explain select * from film order by name;

Select tables optimized away:

Use some aggregate functions (such as max, min) to access a field with an index; It has been optimized by MySQL;

explain select min(id) from film;

Indexing practice

CREATE TABLE `employees` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `name` varchar(24) NOT NULL DEFAULT '' COMMENT 'full name',
    `age` int(11) NOT NULL DEFAULT '0' COMMENT 'Age',
    `position` varchar(20) NOT NULL DEFAULT '' COMMENT 'position',
    `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Entry time',
    PRIMARY KEY (`id`),
    KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT='Employee record form';

INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei',22,'manager',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei',23,'dev',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy',23,'dev',NOW());

Full value matching

Full value matching means that all the fields queried are index fields.

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei';

Name is the federated index idx_ name_ age_ Leading field of position;
key_ If len is 74 and name is varchar(24), then 3 * 24 + 2 = 74. Therefore, the name field in the joint index is used for the index;

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22;

Name, age is the union index idx_name_age_position field;
key_ If len is 78 and name is varchar(24), then 3 * 24 + 2 = 74; Age is int, so the value is 4; 74 + 4 = 78, so the name and age fields in the joint index are used as the index;

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';

Name, age and position are the union index idx_name_age_position field;

key_ If len is 140 and name is varchar(24), then 3 * 24 + 2 = 74; Age is of type int, so the value is 4; Position is varchaer(20), so the value is 3 * 20 + 2 = 62; 74 + 4 + 62 = 140, so use the name, age and position fields in the joint index to enter the index;

Leftmost prefix rule

If a union index is established, the leftmost prefix rule should be observed. This means that the query starts at the top left of the federated index and does not skip columns in the index.

  • The first two fields of the joint index are used for query;
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' and age = 22;


key_len is 78, (3 * 24 + 2) + 4 = 78; Left the name and age indexes;

  • Query using the 1st and 3rd fields of the federated index
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' and position ='manager'

key_len is 74 and name's length is just 74, so only name left the index;

The underlying storage of the joint index is to compare the first field first. If the first field is the same, compare the second field, and compare the third field only if the second field is the same; The first field name is compared, and some can be searched. Position is the third field of the joint index, but it is impossible for the index to skip the second field and directly compare the third field during storage and search. The position field still lists a wide range of data for query, so name goes through the index, and position doesn't go through the index.

Not used on Index (calculation, function, (automatic or manual) type conversion)

Generally speaking, as long as the function operation is added to the index column, the bottom layer of MySQL will not use the index directly. It will cause index invalidation and turn to full table scanning.

EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'LiLei';

  • Give it to hire_time adds a common index:
ALTER TABLE `employees`
ADD INDEX `idx_hire_time` (`hire_time`) USING BTREE ;
EXPLAIN select * from employees where date(hire_time) ='2018-09-30';

For the above SQL, if it is converted to date range query, the index will follow:

EXPLAIN select * from employees where hire_time >='2018-09-30 00:00:00' and hire_time <='2018-09-30 23:59:59';


Restore original index state

ALTER TABLE `employees`
DROP INDEX `idx_hire_time`;

The storage engine does not use the columns to the right of the range lookup criteria in the index

The field order of the joint index, and the columns after the range search will not go to the index;

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';

Only the indexes of the first two fields will be taken. The first field name uses the same, so you can find the specific data. The second field will be narrowed to a range. Whether the third field makes an equal query in this range or needs to traverse this range, so only name and age go through the index.

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';

Try to use the overlay index (only the query accessing the index (the index column contains the query column)) and reduce the select * statement

EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';

mysql cannot use the index when it is not equal to (! = or < >), which will lead to full table scanning

EXPLAIN SELECT * FROM employees WHERE name != 'LiLei';

is null, is not null, and indexes cannot be used

It is recommended to set the fields to not null and set a default value;

EXPLAIN SELECT * FROM employees WHERE name is null

like starts with a wildcard ('$abc...') if the mysql index fails, it will become a full table scan operation

EXPLAIN SELECT * FROM employees WHERE name like '%Lei'

EXPLAIN SELECT * FROM employees WHERE name like 'Lei%'

In fuzzy search, the former fuzzy does not go through the index, and the latter fuzzy will go through the index;
Because we know that there are several characters in front of this field in the post fuzzy time, we only compare the first few characters in the index;

How to solve the problem that like '%' string index is not used?

  • To use an overlay index, the query field must be the field where the overlay index is established
EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Lei%';

  • If you cannot use an overlay index, you may need to use a search engine

Invalid string index without single quotation marks

EXPLAIN SELECT * FROM employees WHERE name = '1000';

EXPLAIN SELECT * FROM employees WHERE name = 1000; 

Because name is a string type, MySql will perform implicit type conversion and type conversion, so it will not go to the index;

Use less or or in. When using it to query, mysql does not necessarily use an index. The mysql internal optimizer will evaluate whether to use an index according to multiple factors such as search proportion and table size. See range query optimization for details

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';

Range query optimization

To add a single valued index to an age:

ALTER TABLE `employees` ADD INDEX `idx_age` (`age`) USING BTREE ;

Execution range query

explain select * from employees where age >=1 and age <=2000; 

It can be seen from the results of the implementation plan that the above range search will still go through the index; However, after I add more data, I won't go to the index.

Reason why the index is not used: the mysql internal optimizer will evaluate whether to use the index as a whole according to multiple factors such as retrieval ratio and table size.

**Optimization method: * * you can split a large range into multiple small ranges

explain select * from employees where age >=1 and age <=1000;
explain select * from employees where age >=1001 and age <=2000;

Restore original index state:

ALTER TABLE `employees` DROP INDEX `idx_age`;

Index usage summary

A joint index is established: (a, b, c)

where statementIs the index used
where a = 3Y. Use to a
where a = 3 and b = 5Y. Use a,b
where a = 3 and b = 5 and c = 4Y. Use a,b,c
Where b = 3 or where b = 3 and c = 4 or where c = 4N
where a = 3 and c = 5Use a, but c can't, b is broken in the middle
where a = 3 and b>4 and c = 5When a and b are used, C cannot be used after the range, and b is broken
where a = 3 and b like 'kk%' and c = 4Y. Use a,b,c
where a = 3 and b like '%kk' and c = 4Y. Only a
where a = 3 and b like '%kk%' and c = 4Y. Only a
where a = 3 and b like 'k%kk%' and c = 4Y. Use a,b,c

like KK% is equivalent to a constant, so go to index;% KK and% KK% are equivalent to the range, so the index is not used;
Reason why the index is not used: the mysql internal optimizer will evaluate whether to use the index as a whole according to multiple factors such as retrieval ratio and table size.

Topics: Database MySQL SQL