Author: Yang Taotao
Senior database expert with more than ten years of MySQL experience. He specializes in backup and recovery, SQL tuning, monitoring, operations, and high-availability architecture design for MySQL, PostgreSQL, MongoDB, and other open-source databases. He currently works at aikesheng, providing MySQL technical support and MySQL course training for major telecom operators and banking and financial enterprises.
Source: original contribution
*Produced by the aikesheng open source community. Original content may not be used without authorization. For reprints, please contact the editor and credit the source.
Background
This article, too, grew out of a customer question: can the number of records in a single table be capped, say at 10,000, so that once the limit is reached new records are rejected or an error is thrown outright?
There is no simple answer to this problem: no single command or parameter setting solves it cleanly. Below I offer a few alternative solutions.
Main text
For a problem like this there are generally only two places to solve it: on the application side, or on the database side.
First, on the database side (assuming the table is hard-limited to 10,000 records):
1. Trigger solution
The idea behind the trigger approach is simple: before each insert, check whether the table has reached the record limit. If not, just insert; if it has, insert the new record first and then delete the oldest one (or do it the other way around). To avoid a full table count on every check, a second table serves as a counter for the current table, so the pre-insert check only reads the counter. Implementing this takes two triggers plus a counter table.
t1 is the table whose record count we want to limit; t1_count is the counter table:
mysql:ytt_new>create table t1(id int auto_increment primary key, r1 int);
Query OK, 0 rows affected (0.06 sec)

mysql:ytt_new>create table t1_count(cnt smallint unsigned);
Query OK, 0 rows affected (0.04 sec)

mysql:ytt_new>insert t1_count set cnt=0;
Query OK, 1 row affected (0.11 sec)
Two triggers are needed. The first fires on insert:
DELIMITER $$

USE `ytt_new`$$

DROP TRIGGER /*!50032 IF EXISTS */ `tr_t1_insert`$$

CREATE
    /*!50017 DEFINER = 'ytt'@'%' */
    TRIGGER `tr_t1_insert` AFTER INSERT ON `t1`
    FOR EACH ROW
BEGIN
    UPDATE t1_count SET cnt = cnt + 1;
END;
$$

DELIMITER ;
The second fires on delete:
DELIMITER $$

USE `ytt_new`$$

DROP TRIGGER /*!50032 IF EXISTS */ `tr_t1_delete`$$

CREATE
    /*!50017 DEFINER = 'ytt'@'%' */
    TRIGGER `tr_t1_delete` AFTER DELETE ON `t1`
    FOR EACH ROW
BEGIN
    UPDATE t1_count SET cnt = cnt - 1;
END;
$$

DELIMITER ;
Insert 10,000 rows into table t1 to bring it to the limit:
mysql:ytt_new>insert t1 (r1)
    with recursive tmp(a,b) as
    (select 1,1 union all select a+1,ceil(rand()*20) from tmp where a<10000)
    select b from tmp;
Query OK, 10000 rows affected (0.68 sec)
Records: 10000  Duplicates: 0  Warnings: 0
The counter table t1_count now shows 10,000:
mysql:ytt_new>select cnt from t1_count;
+-------+
| cnt   |
+-------+
| 10000 |
+-------+
1 row in set (0.00 sec)
Before each insert, check whether the counter has reached the limit; if it has, delete the oldest record first. I wrote a stored procedure to wrap up this logic:
DELIMITER $$

USE `ytt_new`$$

DROP PROCEDURE IF EXISTS `sp_insert_t1`$$

CREATE DEFINER=`ytt`@`%` PROCEDURE `sp_insert_t1`(
    IN f_r1 INT
)
BEGIN
    DECLARE v_cnt INT DEFAULT 0;
    SELECT cnt INTO v_cnt FROM t1_count;
    IF v_cnt >= 10000 THEN
        DELETE FROM t1 ORDER BY id ASC LIMIT 1;
    END IF;
    INSERT INTO t1(r1) VALUES (f_r1);
END$$

DELIMITER ;
New records are then inserted by calling the stored procedure:
mysql:ytt_new>call sp_insert_t1(9999);
Query OK, 1 row affected (0.02 sec)

mysql:ytt_new>select count(*) from t1;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.01 sec)
The stored procedure's logic can be optimized further into batch processing. For example, let the table hold up to twice the limit: below 20,000 records, only insert and never delete; once 20,000 records are reached, delete the oldest 10,000 in one pass.
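A minimal sketch of that batched variant, assuming the same t1 and t1_count structures as above (sp_insert_t1_batch is a hypothetical name, not from the original):

DELIMITER $$

DROP PROCEDURE IF EXISTS `sp_insert_t1_batch`$$

CREATE PROCEDURE `sp_insert_t1_batch`(
    IN f_r1 INT
)
BEGIN
    DECLARE v_cnt INT DEFAULT 0;
    SELECT cnt INTO v_cnt FROM t1_count;
    -- Allow the table to grow to twice the limit, then trim the
    -- oldest 10000 rows in one pass instead of one row per insert.
    -- The delete trigger decrements t1_count once per deleted row.
    IF v_cnt >= 20000 THEN
        DELETE FROM t1 ORDER BY id ASC LIMIT 10000;
    END IF;
    INSERT INTO t1(r1) VALUES (f_r1);
END$$

DELIMITER ;

The trade-off is that between trims the table may hold up to twice the nominal limit.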
This scheme has the following defects:
- The counter is maintained only by the insert/delete triggers. TRUNCATE TABLE does not fire the delete trigger, so truncating t1 leaves the counter out of sync with the data (a resync sketch follows this list).
- Dropping the table also drops its triggers; the triggers must be recreated and the counter table reset.
- All writes must go through a single entry point such as the stored procedure; anything that inserts into the table directly bypasses the limit check.
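If the counter ever drifts, for example after a TRUNCATE, it can be corrected from the real row count; a one-off repair along these lines:

-- Recompute the counter from the actual table; at this table size
-- a full count is acceptable as an occasional correction.
UPDATE t1_count SET cnt = (SELECT COUNT(*) FROM t1);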
2. Partition table solution
Create a RANGE-partitioned table in which the first partition holds the 10,000 records and the second is a catch-all partition. Once the record count reaches the limit, drop the first partition and readjust the partition definition.
Initial definition of the partitioned table:
mysql:ytt_new>create table t1(id int auto_increment primary key, r1 int)
    partition by range(id)
    (
        partition p1 values less than (10001),
        partition p_max values less than (maxvalue)
    );
Query OK, 0 rows affected (0.45 sec)
Check whether the first partition is full:
mysql:ytt_new>select count(*) from t1 partition(p1);
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.00 sec)
Drop the first partition, then readjust the partition definition:
mysql:ytt_new>alter table t1 drop partition p1;
Query OK, 0 rows affected (0.06 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql:ytt_new>alter table t1 reorganize partition p_max into
    (
        partition p1 values less than (20001),
        partition p_max values less than (maxvalue)
    );
Query OK, 0 rows affected (0.60 sec)
Records: 0  Duplicates: 0  Warnings: 0
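To run this rotation on a schedule rather than by hand, the check-and-rotate steps could be wrapped in a stored procedure. A minimal sketch (sp_rotate_t1 is a hypothetical name; the new bounds are hard-coded for a single rotation, whereas in practice they would be derived from the current maximum id):

DELIMITER $$

DROP PROCEDURE IF EXISTS `sp_rotate_t1`$$

CREATE PROCEDURE `sp_rotate_t1`()
BEGIN
    DECLARE v_cnt INT DEFAULT 0;
    SELECT COUNT(*) INTO v_cnt FROM t1 PARTITION (p1);
    IF v_cnt >= 10000 THEN
        -- Dropping a whole partition is fast; the reorganize then
        -- carves the next id range out of the catch-all partition.
        ALTER TABLE t1 DROP PARTITION p1;
        ALTER TABLE t1 REORGANIZE PARTITION p_max INTO (
            PARTITION p1 VALUES LESS THAN (20001),
            PARTITION p_max VALUES LESS THAN (MAXVALUE)
        );
    END IF;
END$$

DELIMITER ;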
The advantages of this method are obvious:
- Data can enter the table by any path: plain INSERT statements, stored procedures, or imported files.
- Deleting the first partition is a DROP operation, which is very fast.
But there is also a disadvantage: the id values must be continuous. If gaps appear (for example, from deleted rows or rolled-back inserts), the partition definition has to compensate; raising partition p1's upper bound to 20001, say, lets the partition still accumulate the intended number of rows even if half the ids in its range are missing.
3. General tablespace solution
Work out in advance how much disk space 10,000 records of this table need, then carve out a disk area of that size to hold the table's data.
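One way to get a rough per-row figure, assuming t1 already holds a representative sample of rows (these statistics are approximate; running ANALYZE TABLE t1 first refreshes them):

-- Estimated bytes needed for 10000 rows, extrapolated from the
-- table's current data and index footprint.
SELECT ROUND((data_length + index_length) / table_rows * 10000) AS est_bytes_10k
FROM information_schema.tables
WHERE table_schema = 'ytt_new' AND table_name = 't1';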
Mount the carved-out disk partition and add it as an allowed directory for InnoDB tablespaces (/tmp/mysql/). Then create a general tablespace there and move the table into it:
mysql:ytt_new>create tablespace ts1 add datafile '/tmp/mysql/ts1.ibd' engine innodb;
Query OK, 0 rows affected (0.11 sec)

mysql:ytt_new>alter table t1 tablespace ts1;
Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0
My space calculation was rough, so the actual count at which the table fills up is off by a little, but the effect is exactly what we want: keep inserting until MySQL reports that the table is full.
mysql:ytt_new>insert t1 (r1) values (200);
ERROR 1114 (HY000): The table 't1' is full

mysql:ytt_new>select count(*) from t1;
+----------+
| count(*) |
+----------+
|    10384 |
+----------+
1 row in set (0.20 sec)
Once the table is full, move it out of the dedicated tablespace, drop that tablespace, and truncate the table before accepting new records:
mysql:ytt_new>alter table t1 tablespace innodb_file_per_table;
Query OK, 0 rows affected (0.18 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql:ytt_new>drop tablespace ts1;
Query OK, 0 rows affected (0.13 sec)

mysql:ytt_new>truncate table t1;
Query OK, 0 rows affected (0.04 sec)
Second, handling it on the application side:
The application can buffer the table's data locally and, once the buffered record count reaches the limit, write it to the database in one batch, clearing the table first.
For example, buffer table t1's data in a file t1.csv; when t1.csv reaches 10,000 rows, clear the table on the database side and import t1.csv.
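A minimal sketch of the database-side half of that flow, assuming the server's secure_file_priv setting permits reading from /tmp/mysql/ and that t1.csv carries one r1 value per line:

-- Clear out the previous batch, then bulk-load the buffered file.
TRUNCATE TABLE t1;

LOAD DATA INFILE '/tmp/mysql/t1.csv'
INTO TABLE t1
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(r1);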
Epilogue
Back in the MyISAM era, the table option max_rows could be used to declare an expected number of rows for a table, but it was never a hard limit; much like the general tablespace approach above, it only bounds the table indirectly. In the InnoDB era there is no built-in, intuitive way to cap a table's row count, so we rely on approaches like the ones listed above; which one fits depends on the actual requirements.
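For reference, a sketch of that MyISAM-era option (max_rows is a sizing hint for the storage engine, not an enforced cap; t_old is an illustrative name):

CREATE TABLE t_old (
    id INT AUTO_INCREMENT PRIMARY KEY,
    r1 INT
) ENGINE = MyISAM
  MAX_ROWS = 10000;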