How to understand and use MySQL index correctly and efficiently

Posted by macattack on Thu, 30 Dec 2021 18:43:09 +0100

summary

Index is a data structure used by the storage engine to quickly find records. Through the rational use of database index, the access performance of the system can be greatly improved. Next, it mainly introduces the index types in MySql database and how to create a more reasonable and efficient index skills.

Note: This is mainly for the B+Tree index data structure of InnoDB storage engine

Advantages of indexing

1. It greatly reduces the amount of data that the server needs to scan, so as to improve the data retrieval speed

2. Help the server avoid sorting and temporary tables

3. Random I/O can be changed to sequential I/O

Index creation

3.1. Primary key index

ALTER TABLE 'table_name'
ADD PRIMARY KEY 'index_name' ('column');

3.2. Unique index

ALTER TABLE 'table_name'
ADD UNIQUE 'index_name' ('column');

3.3 general index

ALTER TABLE 'table_name'
ADD INDEX 'index_name' ('column');

3.4 full text index

ALTER TABLE 'table_name'
ADD FULLTEXT 'index_name' ('column');

3.5 combined index

ALTER TABLE 'table_name'
ADD INDEX 'index_name' ('column1', 'column2', ...);

Index rules of B + tree

Create a user table for the test

DROP TABLE IF EXISTS user_test;
CREATE TABLE user_test(
id int AUTO_INCREMENT PRIMARY KEY,
user_name varchar(30) NOT NULL,
sex bit(1) NOT NULL DEFAULT b'1',
city varchar(50) NOT NULL,
age int NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Create a composite index:

ALTER TABLE user_test
ADD INDEX idx_user(user_name , city , age);

4.1 query with valid index

4.1. 1. Full value matching

Full value matching refers to matching all columns in the index. For example, taking the index created above as an example, you can query the data with (user_name, city, age) as the condition after the where condition.

Note: it has nothing to do with the order of query conditions after where. This is a place easily misunderstood by many students

SELECT * FROM user_test
WHERE user_name = 'feinik'
AND age = 26
AND city = 'Guangzhou';

4.1. 2. Match leftmost prefix

Matching the leftmost prefix means that the leftmost index column is preferentially matched. For example, the indexes created above can be used to query: (user_name), (user_name, city), (user_name, city, age)

Note: the order in which the leftmost prefix query conditions are met has nothing to do with the order of index columns, such as: (city, user_name), (age, city, user_name)

4.1. 3. Match column prefix

It refers to the beginning of the matching column value. For example, query all users whose user name starts with feinik

SELECT * FROM user_test
WHERE user_name LIKE 'feinik%';

4.1. 4. Match range value

For example, query all users whose user name starts with feinik. Here, the first column of the index is used

SELECT * FROM user_test
WHERE user_name LIKE 'feinik%';

4.2 restriction of index

1. If the leftmost index column in the index column is not included in the where query criteria, the index query cannot be used, such as:

SELECT * FROM user_test
WHERE city = 'Guangzhou';

or

SELECT * FROM user_test
WHERE age= 26;

or

SELECT * FROM user_test
WHERE city = 'Guangzhou'
AND age = '26';

2. Even if the query criteria of where is the leftmost index column, you cannot use the index to query users whose user names end in feinik

SELECT * FROM user_test
WHERE user_name like '%feinik';

3. If there is a range query for a column in the where query criteria, all the columns on the right cannot use the index to optimize the query, such as:

SELECT * FROM user_test
WHERE user_name = 'feinik'
AND city LIKE 'Guangzhou%'
AND age = 26;

Efficient indexing strategy

5.1. The index column cannot be part of an expression, nor can it be used as a parameter of a function, otherwise the index query cannot be used.

SELECT * FROM user_test
WHERE user_name = concat(user_name, 'fei');

5.2 prefix index

Sometimes a long character column needs to be indexed, which will increase the storage space of the index and reduce the efficiency of the index. One strategy is to use hash index, and the other is to use prefix index. Prefix index selects the first n characters of the character column as the index, which can greatly save the index space and improve the index efficiency.

5.2. 1. Selectivity of prefix index

The prefix index should be long enough to ensure high selectivity, but not too long. We can calculate the selection length of the appropriate prefix index in the following ways:

(1)

SELECT COUNT(DISTINCT index_column)/COUNT(*)
FROM table_name;
-- index_column Represents the column to which you want to add a prefix index

Note: the selectivity ratio of prefix index is calculated through the above methods. The higher the ratio, the more efficient the index will be.

(2)

SELECT
COUNT(DISTINCT LEFT(index_column,1))/COUNT(*),
COUNT(DISTINCT LEFT(index_column,2))/COUNT(*),
COUNT(DISTINCT LEFT(index_column,3))/COUNT(*)
...
FROM table_name;

Note: gradually find the selectivity ratio closest to the prefix index in (1) through the above statements, then you can use the corresponding character intercept length as the prefix index

5.2. 2. Prefix index creation

ALTER TABLE table_name
ADD INDEX index_name (index_column(length));

5.2. 3. Notes on using prefix index

Prefix index is an effective way to make the index smaller and faster, but MySql cannot use prefix index for ORDER BY and GROUP BY, and use prefix index for overlay scanning.

5.3. Select the appropriate index column order

The order of index columns is very important in the creation of composite index. The correct index order depends on the query method using the index. The index order of composite index can be helped by empirical rules: put the column with the highest selectivity at the forefront of the index, which is consistent with the selective method of prefix index, However, it does not mean that the order of all combined indexes can be determined by using this rule, but also the specific index order needs to be determined according to the specific query scenario.

5.4 clustered and nonclustered indexes

1. Clustered index

The clustered index determines the physical sorting of data on the physical disk. A table can only have one clustered index. If a primary key is defined, InnoDB will aggregate data through the primary key. If no primary key is defined, InnoDB will select a unique non empty index instead. If there is no unique non empty index, InnoDB will implicitly define a primary key as the clustered index.

The clustered index can greatly improve the access speed. Because the clustered index saves the index and row data in the same B-Tree, the corresponding row data will be found when the index is found, However, when using a clustered index, you should pay attention to avoid random clustered indexes (generally referring to discontinuous primary key values and uneven distribution range). If you use UUID as a clustered index, the performance will be very poor, because discontinuous UUID values will increase a lot of index fragments and random I/O, and eventually lead to a sharp decline in query performance.

2. Nonclustered index

Unlike the clustered index, the nonclustered index does not determine the physical sorting of data on the disk, and the B-Tree contains indexes but does not contain row data. The row data only points to the row data through the pointer corresponding to the index saved in the B-Tree. For example, the index established on (user_name, city, age) above is a nonclustered index.

5.5 coverage index

If an index (such as a combined index) contains the values of all fields to be queried, it is called an overlay index, such as:

SELECT user_name, city, age
FROM user_test
WHERE user_name = 'feinik'
AND age > 25;

Because the fields to be queried (user_name, city, age) are included in the index column of the composite index, the overlay index query is used to check whether the overlay index is used. The value in Extra in the execution plan is Using index, which proves that the overlay index is used, and the overlay index can greatly improve the access performance.

5.6. How to use index to sort

In the sorting operation, if you can use index to sort, you can greatly improve the sorting speed. To use index to sort, you need to meet the following two points.

  • 1. The column order after the ORDER BY clause shall be consistent with the column order of the composite index, and the sorting direction (positive order / reverse order) of all row sequences shall be consistent

  • 2. The queried field value needs to be included in the index column and meet the coverage index

Specific analysis through examples

In user_ Create a composite index on the test table

ALTER TABLE user_test
ADD INDEX index_user(user_name , city , age);

Cases that can be sorted using the index

1,SELECT user_name, city, age
FROM user_test
ORDER BY user_name;

2,SELECT user_name, city, age
FROM user_test
ORDER BY user_name, city;

3,SELECT user_name, city, age
FROM user_test
ORDER BY user_name DESC, city DESC;

4,SELECT user_name, city, age
FROM user_test
WHERE user_name = 'feinik'
ORDER BY city;

Note: point 4 is special. If the where query condition is the first column of the index column and is a constant condition, the index can also be used

Cases that cannot be sorted by index

1. sex is not in the index column

SELECT user_name, city, age
FROM user_test
ORDER BY user_name, sex;

2. The direction of sorting columns is inconsistent

SELECT user_name, city, age
FROM user_test
ORDER BY user_name ASC, city DESC;

3. The field column sex to query is not included in the index column

SELECT user_name, city, age, sex
FROM user_test
ORDER BY user_name;

4. User after where query criteria_ Name is a range query, so other columns of the index cannot be used

SELECT user_name, city, age
FROM user_test
WHERE user_name LIKE 'feinik%'
ORDER BY city;

5. In the multi table join query, index sorting can be used only when the sorting fields after ORDER BY are the index columns in the first table (the above two rules of index sorting need to be met). For example, create another user's extended table user_test_ext and establish the index of uid.

DROP TABLE IF EXISTS user_test_ext;
CREATE TABLE user_test_ext(
   id int AUTO_INCREMENT PRIMARY KEY,
   uid int NOT NULL,
   u_password VARCHAR(64) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE user_test_ext
ADD INDEX index_user_ext(uid);

Index sort

SELECT user_name, city, age
FROM user_test u
LEFT JOIN user_test_ext ue
ON u.id = ue.uid
ORDER BY u.user_name;

No index sort

SELECT user_name, city, age
FROM user_test u
LEFT JOIN user_test_ext ue
ON u.id = ue.uid
ORDER BY ue.uid;

summary

This paper mainly talks about the index rules of B+Tree tree structure, the creation of different indexes, and how to correctly create efficient index skills to improve the query speed as much as possible. Of course, there are not only these skills about index, but also more skills about index need to accumulate relevant experience at ordinary times.

Article from: https://my.oschina.net/feinik/blog/1305784

Topics: Database MySQL SQL