1, Index principle
Index is an important method to improve the performance of database query.
Use indexes to quickly find rows in a column that contain specific values. If you do not use an index, you must start reading from the first record. You may have to read the whole table to find the relevant rows.
Using an index is like looking up a dictionary. We can quickly find the words we need according to the directory (index) sorted by pinyin, strokes, partial radicals, etc.
Previous articles on MySQL storage engine( Talk about MySQL's storage engine ), the test compares two storage engines (MyISAM or InnoDB). The efficiency of using primary key index query is dozens of times faster.
Although the index greatly improves the speed of query (select), it will also reduce the speed of updating (insert,update,delete) tables, because when updating tables, the database should not only update and save data, but also update and save index files. Of course, index files also occupy storage space.
2, Scenario of index invalidation
Although the index speeds up the query efficiency, if it is not used properly, the index will fail.
Here are some scenarios of index invalidation.
CREATETABLE customer( id INTAUTO_INCREMENT, companyVARCHAR(30), nameVARCHAR(30), sex enum('male','female'), age INT, phoneVARCHAR(30), addressVARCHAR(60), PRIMARYKEY (id)); CREATEINDEX index_company ON customer(company); CREATEINDEX index_age ON customer(age); CREATEINDEX index_phone ON customer(phone); CREATEINDEX index_phone_name ON customer(phone,name);
A customer table, three single column indexes and a combined index are created above. Let's check the indexes:
mysql>show index from customer; # (some information has been removed due to typographical problems) +------------+------------------+--------------+-------------+------------+ |Non_unique | Key_name |Seq_in_index | Column_name | Index_type | +------------+------------------+--------------+-------------+------------+ | 0 | PRIMARY | 1 | id | BTREE | | 1 | index_company | 1 | company | BTREE | | 1 | index_age | 1 | age | BTREE | | 1 | index_phone | 1 | phone |BTREE | | 1 | index_phone_name | 1 | phone | BTREE | | 1 | index_phone_name | 2 | name | BTREE | +------------+------------------+--------------+-------------+------------+ 6 rows inset (0.01 sec)
The above display includes a default primary key index, three single column indexes and two combined index items. The primary key index is a unique index, and the index_type is displayed as BTREE, which is actually a B + tree.
2.1 fuzzy match LIKE starts with%, which will lead to index invalidation.
explain select *from customer where company like '%abc'\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where
Fuzzy matching should be placed on the far right.
explain select * from customer where company like 'abc%'\G *************************** 1. row*************************** id: 1 select_type:SIMPLE table:customer partitions:NULL type:range possible_keys: index_company key:index_company key_len: 123 ref:NULL rows: 1 filtered:100.00 Extra:Using index condition
2.2 index column calculation will lead to index invalidation.
explain select *from customer where age-1=20\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where
You should put the operation on the right.
explain select *from customer where age=20+1\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ref possible_keys:index_age key: index_age key_len: 5 ref: const rows: 1 filtered: 100.00 Extra: NULL
2.3 the use of functions in index columns will lead to index invalidation.
explain select *from customer where lcase(company)='abc'\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where
2.4 index column type conversion will lead to index invalidation.
explain select *from customer where phone=13012345678\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys:index_phone,index_phone_name key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where
You should keep the original type and avoid type conversion.
explain select *from customer where phone='13012345678'\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ref possible_keys:index_phone,index_phone_name key: index_phone key_len: 123 ref: const rows: 1 filtered: 100.00 Extra: NULL
2.5 when the character set of index column comparison is inconsistent, the index will become invalid.
Create the supplier and supplier tables of two suppliers according to the format of the customer table above_ utf8, and the character sets are utf8mb4 and utf8 respectively. For the character set, see the previous article( Talk about the character set of MySQL).
CREATETABLE supplier ( id INTAUTO_INCREMENT, companyVARCHAR(30), nameVARCHAR(30), sexenum('male','female'), age INT, phoneVARCHAR(30), addressVARCHAR(60), PRIMARYKEY (id)); CREATEINDEX index_company ON supplier(company); CREATEINDEX index_age ON supplier(age); CREATEINDEX index_phone ON supplier(phone); CREATEINDEX index_phone_name ON supplier(phone,name);
CREATETABLE supplier_utf8 ( id INTAUTO_INCREMENT, companyVARCHAR(30), nameVARCHAR(30), sexenum('male','female'), age INT, phoneVARCHAR(30), addressVARCHAR(60), PRIMARYKEY (id)) CHARSET=utf8;
CREATEINDEX index_company ON supplier_utf8(company); CREATEINDEX index_age ON supplier_utf8(age); CREATEINDEX index_phone ON supplier_utf8(phone); CREATEINDEX index_phone_name ON supplier_utf8(phone,name);
explain select *from customer,supplier where customer.company = supplier.company\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys:index_company key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where ***************************2. row *************************** id: 1 select_type: SIMPLE table: supplier partitions: NULL type: ref possible_keys:index_company key: index_company key_len: 123 ref: mydb.customer.company rows: 1 filtered: 100.00 Extra: NULL
You can see that the character set is the same. One table uses an index, and the other table goes through a full table scan.
explain select *from customer,supplier_utf8 where customer.company = supplier_utf8.company\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys:index_company key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: NULL ***************************2. row *************************** id: 1 select_type: SIMPLE table: supplier_utf8 partitions: NULL type: ALL possible_keys:index_company key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where; Using join buffer(hash join)
The character sets are different. The two tables are scanned in full.
This situation is quite hidden, and it is often found only after the business is online.
2.6 using or query, whether another item is an index column or not, will lead to index invalidation.
explainselect * from customer where company='abc' or address='abc'\G explainselect * from customer where company='abc' or age=20\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys:index_company key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where
However, if you have to use or to query, you can use the MRR function to sort and optimize the back table query.
2.7 combined index. If the first column index is not used, the index will become invalid.
explain select *from customer where name='abc'\G ***************************1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 filtered: 100.00 Extra: Using where
The combined index does not need to be arranged in order. The index can be used below.
explain select * from customer where name ='abc' and phone='13012345678'\G *************************** 1. row*************************** id: 1 select_type:SIMPLE table:customer partitions:NULL type: ref possible_keys: index_phone,index_phone_name key:index_phone key_len: 123 ref:const rows: 1 filtered:100.00 Extra:Using where
If the database expects to use full table scans faster than indexes, indexes are not used. In the latest version of MySQL, the index judgment has been greatly improved. The previous version used in,! =, < >, is null and is not null will lead to index invalidation. The latest version of the test still uses index query.
MySQL5. The second level of MRR optimization is introduced to optimize the range of MRR.
Its principle is to sort multiple secondary indexes that need to be returned to the table according to the primary key, and then return to the table together. The original random I/O during the return to the table is transformed into sequential I/O, so as to reduce the I/O overhead in the query process and reduce the frequency of data pages being replaced in the buffer pool.
MySQL5.7. The Generated Column is introduced. If you really need to calculate the index column, you can use the method of virtual column. For example, add the corresponding column when creating the customer table, and the company name is lowercase lcase_company, and full_age and index.
CREATE TABLE customer2( id INT AUTO_INCREMENT, company VARCHAR(30), lcase_company VARCHAR(30)as(lcase(company)), name VARCHAR(30), sex enum('male','female'), age INT, full_age INT as(age-1), phone VARCHAR(30), address VARCHAR(60), PRIMARY KEY (id));
CREATE INDEX index_lcase_company ON customer2(lcase_company); CREATE INDEX index_full_age ON customer2(full_age);
Then directly query the Cenerated Column, which is equivalent to calculating the index of the column.
3, Usage specification of index
When using the index, we should not only pay attention to avoid index failure, but also follow certain specifications in order to use the index efficiently.
The following summarizes some normative suggestions that can be used for reference, not absolute truth.
3.1 it is recommended that the index number of a single table should not exceed 5, and the field of combined index should not exceed 3 in principle.
3.2 try not to establish an index on a field with a long string. You can set the prefix length of the index field.
3.3 select the columns with high utilization rate in query filtering, such as where, orderby and group by to build indexes.
3.4 do not create an index on a column with low discrimination, such as gender, which will not make use of the index performance.
3.5 do not build indexes on frequently updated columns. Data updates will also update indexes, affecting database performance.
3.6 when building a composite index, the one with the highest discrimination or the highest query frequency shall be placed on the far left.
3.7 make rational use of coverage index to meet query requirements, avoid back table query and reduce I/O overhead.
3.8 delete indexes that are no longer used, less used, or duplicated to reduce the cost of data update.
3.9 use explain to judge whether the query statement uses index or full table scan.
KunlunDB project is open source
[GitHub: ]
https://github.com/zettadb
[Gitee: ]
https://gitee.com/zettadb
END