003MySQL performance tuning 007 sub table and partition

Posted by Dragoon1 on Tue, 28 Dec 2021 11:17:18 +0100

1 Introduction

1.1 background

For the InnoDB storage engine, generally, a table corresponds to an FRM file, which saves the table structure and index.

When the amount of data is large (generally more than ten million records), the performance of the database will begin to decline. At this time, you can consider partitioning or table splitting the database.

1.2 sub table

Table splitting is to decompose a table with a large amount of data into multiple tables with a small amount of data according to certain rules, and each table corresponds to an FRM file.

When using, you need to modify the SQL statement and access different tables to get the corresponding data.

The sub table is divided into vertical segmentation and horizontal segmentation.

1.3 zoning

Partitioning is similar to splitting tables, but also breaks down tables according to rules. The difference is that the partition will not decompose a table into multiple tables, but store the data in a table separately. A table corresponds to multiple FRM files.

In addition, for the client, the distinction between database tables is transparent to the client, and there is no need to modify SQL statements when using.

2 point table

2.1 vertical sub table

Vertical table splitting refers to splitting a table containing many columns into multiple sub tables. Each sub table contains several columns of the original table, and the number of entries of each sub table is the same as that of the original table.

Principle of vertical table division:

Split commonly used fields and infrequently used fields.

Separate large fields.

2.2 horizontal sub table

Horizontal split table refers to splitting the number of entries of the original table, and the table structure of each sub table is the same as that of the original table.

Horizontal table splitting method:

1) Split by time.

If the business system requires high timeliness, such as the article table of the news release system, the database can be designed into a time structure and split according to the time level.

2) Split by module

If the original table contains multiple modules or classifications, it can be split according to modules.

3) Split by hash

Hash structure is commonly used in business scenarios such as blogs. There are a large number of users, and the publishing time and times of each user are uncertain, but the total amount is large.

For the original table with this structure, the MD5 code of the entry can be used to split the first few bits.

3 zoning

3.1 description

At present, MySQL supports the following types of partitions: RANGE partition, LIST partition, HASH partition and KEY partition. Nine times out of ten, the actual combat is partitioned with RANGE.

If a table has a primary key or a unique index, the partitioned column must be a part of the unique index.

3.2 general operation

To view partition information for a specified table:

1 select
2   partition_name part,
3   partition_expression expr,
4   partition_description descr,
5   table_rows
6 from information_schema.partitions
7 where table_schema = schema() and table_name='test';

View the table creation statement to obtain the partition settings when creating a table:

1 show create table test;

Check the table status to determine whether it is a partitioned table:

1 show table status like 'test';

Rebuild the partition, which can be used to defragment the partition:

1 alter table test rebuild partition part1, part2, part3;

Optimize partitions to reclaim space and defragment:

1 alter table test optimize partition part1, part2, part3;

Analyze the partition, read and save the key distribution of the partition, which can be used after modifying the partition:

1 alter table test analyze partition part1, part2, part3;

Repair partition, repair damaged partition:

1 alter table test repair partition part1, part2, part3;

Deleting a partition will also delete the data in the partition:

1 alter table test drop partition part4, part3;

Delete partition data and keep partition:

1 alter table test truncate partition part4, part3;

3.3 RANGE zoning

RANGE partition is the most commonly used partition type in actual combat. It is composed of continuous intervals separated by partition keys. These intervals should be continuous and cannot overlap each other.

When the inserted data is not in any partition, an exception will be thrown.

The partition key must have a primary key index or a unique index, and the partition key must be of Int type, or the Int type returned through an expression, which can be Null.

The partition key requires an integer value, but it can also be an expression, but the expression must return an integer value.

Set partition when creating table:

 1 create table test (
 2   id int(11) not null comment 'number',
 3   name varchar(50) default null comment 'full name',
 4   address varchar(50) default null comment 'address',
 5   status int(2) default null comment 'state',
 6   createtime date default null comment 'Creation time',
 7   updatetime date default null comment 'Modification time',
 8   primary key (id)
 9 ) partition by range (id) (
10   partition part0 values less than (50),
11   partition part1 values less than (100),
12   partition part2 values less than (150),
13   partition part3 values less than maxvalue
14 );

Set partition when modifying table:

1 alter table test partition by range (id) (
2   partition part0 values less than (50),
3   partition part1 values less than (100),
4   partition part2 values less than (150),
5   partition part3 values less than maxvalue
6 );

Adding partitions requires that only partitions larger than the partition key can be added:

1 alter table test add partition (partition part3 values less than (200));

Decomposition partition:

1 alter table test reorganize partition part3 into (
2   partition part3 values less than (200),
3   partition part4 values less than maxvalue
4 );

Merge partitions:

1 alter table test reorganize partition part4, part3 into (
2   partition part3 values less than maxvalue
3 );

3.4 LIST partition

It is similar to partitioning by RANGE, except that the LIST partition is composed of multiple hash value sets. These sets require that the values in them cannot be repeated.

When the inserted data is not in any partition, an exception will be thrown.

The hash value must have a primary key index or a unique index, and the hash value must be of Int type, or the Int type returned through an expression, which can be Null.

The hash value is required to be an integer value, but it can also be an expression, but the expression must return an integer value.

Set partition when creating table:

 1 create table test (
 2   id int(11) not null comment 'number',
 3   name varchar(50) default null comment 'full name',
 4   address varchar(50) default null comment 'address',
 5   status int(2) default null comment 'state',
 6   createtime date default null comment 'Creation time',
 7   updatetime date default null comment 'Modification time',
 8   primary key (id, status)
 9 ) partition by list (status) (
10   partition part0 values in (0, 1, 5),
11   partition part1 values in (2),
12   partition part2 values in (3, 4)
13 );

Set partition when modifying table:

1 alter table test partition by list (status) (
2   partition part0 values in (0, 1, 5),
3   partition part1 values in (2),
4   partition part2 values in (3, 4)
5 );

To add a partition, only undefined partitions can be added:

1 alter table test add partition (partition part4 values in (6, 7));

Decomposition partition:

1 alter table test reorganize partition part2 into (
2   partition part2 values in (3),
3   partition part3 values in (4)
4 );

Merge partitions:

1 alter table test reorganize partition part3, part2 into (
2   partition part3 values in (3, 4)
3 );

3.5 HASH zoning

HASH partition is mainly used to ensure that the data is evenly distributed in a predetermined number of partitions.

Unlike RANGE partition and LIST partition, HASH partition does not need to define partition conditions, but only needs to specify the number of partitions.

When setting partitions, you only need to specify the number of partitions. If the number of partitions is not specified, the number of partitions will default to 1.

The hash value must have a primary key index or a unique index, and the hash value must be of Int type, or the Int type returned through an expression, which can be Null.

The hash value is required to be an integer value, but it can also be an expression, but the expression must return an integer value.

Set partition when creating table:

1 create table test (
2   id int(11) not null comment 'number',
3   name varchar(50) default null comment 'full name',
4   address varchar(50) default null comment 'address',
5   status int(2) default null comment 'state',
6   createtime date default null comment 'Creation time',
7   updatetime date default null comment 'Modification time',
8   primary key (id, status)
9 ) partition by hash (id) partitions 3;

Set partition when modifying table:

1 alter table test partition by hash (id) partitions 3;

Add a partition. Add a partition based on the original partition:

1 alter table test add partition partitions 2;

Reduce partitions. Reduce partitions based on the original partitions:

1 alter table test coalesce partition 2;

3.6 KEY partition

In fact, the KEY partition is similar to the HASH partition. It is also used to ensure the average distribution of data in a predetermined number of partitions. Moreover, the KEY partition does not need to define partition conditions, but only needs to specify the number of partitions.

Different from HASH partition, KEY partition allows multiple columns but does not allow expressions, while HASH partition only allows one column but allows expressions.

If there is a primary KEY or unique KEY, the partition KEY in the KEY partition can not be specified. It defaults to the primary KEY or unique KEY. If there is no primary KEY or unique KEY, the column must be explicitly specified.

The algorithm of KEY partition is different from that of HASH partition. HASH partition uses user-defined functions for partition, and KEY partition uses functions provided by the database for partition.

Set partition when creating table:

1 create table test (
2   id int(11) not null comment 'number',
3   name varchar(50) default null comment 'full name',
4   address varchar(50) default null comment 'address',
5   status int(2) default null comment 'state',
6   createtime date default null comment 'Creation time',
7   updatetime date default null comment 'Modification time',
8   primary key (id)
9 ) partition by key (id) partitions 3;

Set partition when modifying table:

1 alter table test partition by key (id) partitions 3;

Add a partition. Add a partition based on the original partition:

1 alter table test add partition partitions 2;

Reduce partitions. Reduce partitions based on the original partitions:

1 alter table test coalesce partition 2;

3.7 subarea

The sub partition is the re partition of each partition in the partition table of RANGE type and LIST type. The re partition can be HASH type and KEY type.

It should be noted that each partition must have the same number of sub partitions, and if any sub partition is clearly defined on any partition on a partition table, all sub partitions must be defined.

Subdivide each RANGE partition into two HASH partitions:

 1 create table test (
 2   id int(11) not null comment 'number',
 3   name varchar(50) default null comment 'full name',
 4   address varchar(50) default null comment 'address',
 5   status int(2) default null comment 'state',
 6   createtime date default null comment 'Creation time',
 7   updatetime date default null comment 'Modification time',
 8   primary key (id)
 9 ) partition by range (id) subpartition by hash (id) subpartitions 2 (
10   partition part0 values less than (50),
11   partition part1 values less than (100),
12   partition part2 values less than (150),
13   partition part3 values less than maxvalue
14 );

Clearly define sub partitions:

 1 create table test (
 2   id int(11) not null comment 'number',
 3   name varchar(50) default null comment 'full name',
 4   address varchar(50) default null comment 'address',
 5   status int(2) default null comment 'state',
 6   createtime date default null comment 'Creation time',
 7   updatetime date default null comment 'Modification time',
 8   primary key (id)
 9 ) partition by range (id) subpartition by hash (id) (
10   partition part0 values less than (50) (
11         subpartition part0sub0,
12         subpartition part0sub1
13     ),
14   partition part1 values less than (100) (
15         subpartition part1sub0,
16         subpartition part1sub1
17     ),
18   partition part2 values less than (150) (
19         subpartition part2sub0,
20         subpartition part2sub1
21     ),
22   partition part3 values less than maxvalue (
23         subpartition part3sub0,
24         subpartition part3sub1
25     )
26 );

 

Topics: MySQL