The author's official account is welcome to exchange.
In the process of MySQL master-slave replication, errors are inevitable due to various reasons. How to solve the errors in the replication process? This article will demonstrate how to skip the corresponding errors in the slave database when the master-slave inconsistency is caused by the modification of the slave database data based on the normal replication mode and the GTID replication mode.
1 common errors
1) ERROR: 1062 insert data from library, uniqueness conflict occurred
2) ERROR: 1032 the data to be deleted cannot be found from the library
3) ERROR: 1452 cannot insert or update data in the foreign key table that does not have a reference primary key
2 error in normal copy mode slave Library
2.1 simulation error
Here, the [Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY] error will be simulated in the normal master-slave replication environment, that is, an error occurs when the master database synchronizes data to the slave database due to misoperation:
1) The master database initializes data, and the records with ID=1 and 2 are synchronized to the slave database
mysql> create table test.test(id int auto_increment not null primary key,name varchar(20)); Query OK, 0 rows affected (0.16 sec) mysql> insert into test.test values(1,'Alen'),(2,'Tom'); Query OK, 2 rows affected (0.02 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> select * from test.test; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | +----+------+ 2 rows in set (0.01 sec)
2) View the data synchronized from the library, the master and slave are consistent, and then insert a record with data ID=3 in the slave library
mysql> select * from test.test; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | +----+------+ 2 rows in set (0.00 sec) mysql> insert into test.test values(3,'Test'); Query OK, 1 row affected (0.00 sec) mysql> select * from test.test; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | | 3 | Test | +----+------+ 3 rows in set (0.00 sec)
3) Insert a record with ID=3 in the main library
mysql> insert into test.test values(3,'Test'); Query OK, 1 row affected (0.55 sec) mysql> select * from test.test; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | | 3 | Test | +----+------+ 3 rows in set (0.00 sec)
4) Viewing the copy status from the library, an error was found
mysql> show replica status\G *************************** 1. row *************************** Replica_IO_State: Waiting for source to send event Source_Host: 192.168.56.201 Source_User: repl Source_Port: 3306 Connect_Retry: 60 Source_Log_File: mysql-bin.000009 Read_Source_Log_Pos: 1405 Relay_Log_File: mysql-relay-bin.000013 Relay_Log_Pos: 1295 Relay_Source_Log_File: mysql-bin.000009 Replica_IO_Running: Yes Replica_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1062 Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log mysql-bin.000009, end_log_pos 1374. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any. Skip_Counter: 0 Exec_Source_Log_Pos: 1120 Relay_Log_Space: 1957 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Source_SSL_Allowed: No -----------------------------Partial output omission------------------------------------
5) In the slave library, you can view the specific error information according to the replication status or from the error log. It is found that the error is Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY, you can find test The record with test table ID=3 caused
mysql> select * from performance_schema.replication_applier_status_by_worker\G *************************** 1. row *************************** CHANNEL_NAME: WORKER_ID: 1 THREAD_ID: NULL SERVICE_STATE: OFF LAST_ERROR_NUMBER: 1062 LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log mysql-bin.000009, end_log_pos 1374; Could not execute Write_rows event on table test.test; Duplicate entry '3' for key 'test.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000009, end_log_pos 1374 LAST_ERROR_TIMESTAMP: 2021-12-28 21:39:20.595365 LAST_APPLIED_TRANSACTION: ANONYMOUS LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 21:35:45.222824 LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 21:35:45.222824 LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 21:35:45.377695 LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 2021-12-28 21:35:45.399679 APPLYING_TRANSACTION: ANONYMOUS APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 21:39:20.436380 APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 21:39:20.436380 APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 21:39:20.595012 -----------------------------Partial output omission------------------------------------
6) You can also query binlog in the master database according to the copy error information from the slave database to find out the specific error content. It is found that it is caused by the record with ID=3
[root@node1 ~]# mysqlbinlog --base64-output='decode-rows' -vv --start-position=1120 --stop-position=1374 /mysql/mysqldata/binlog/mysql-bin.000009 # The proper term is pseudo_replica_mode, but we use this compatibility alias # to make the statement usable on server versions 8.0.24 and older. /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 196 #211228 21:17:44 server id 201 end_log_pos 125 CRC32 0x71343e2d Start: binlog v 4, server v 8.0.27 created 211228 21:17:44 at startup # Warning: this binlog is either in use or was not closed properly. ROLLBACK/*!*/; # at 1120 #211228 21:39:19 server id 201 end_log_pos 1199 CRC32 0x98888e0c Anonymous_GTID last_committed=4 sequence_number=5 rbr_only=yes original_committed_timestamp=1640698760436380 immediate_commit_timestamp=1640698760436380 transaction_length=285 /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; # original_commit_timestamp=1640698760436380 (2021-12-28 21:39:20.436380 CST) # immediate_commit_timestamp=1640698760436380 (2021-12-28 21:39:20.436380 CST) /*!80001 SET @@session.original_commit_timestamp=1640698760436380*//*!*/; /*!80014 SET @@session.original_server_version=80027*//*!*/; /*!80014 SET @@session.immediate_server_version=80027*//*!*/; SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/; # at 1199 #211228 21:39:19 server id 201 end_log_pos 1274 CRC32 0x2cbcc0e6 Query thread_id=9 exec_time=0 error_code=0 SET TIMESTAMP=1640698759/*!*/; SET @@session.pseudo_thread_id=9/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1168113696/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C utf8mb4 *//*!*/; SET @@session.character_set_client=255,@@session.collation_connection=255,@@session.collation_server=46/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; /*!80011 SET @@session.default_collation_for_utf8mb4=255*//*!*/; BEGIN /*!*/; # at 1274 #211228 21:39:19 server id 201 end_log_pos 1330 CRC32 0x3fb4a8d7 Table_map: `test`.`test` mapped to number 94 # at 1330 #211228 21:39:19 server id 201 end_log_pos 1374 CRC32 0x473cec6d Write_rows: table id 94 flags: STMT_END_F ### INSERT INTO `test`.`test` ### SET ### @1=3 /* INT meta=0 nullable=0 is_null=0 */ ### @2='Tom' /* VARSTRING(80) meta=80 nullable=1 is_null=0 */ ROLLBACK /* added by mysqlbinlog */ /*!*/; SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/; DELIMITER ; # End of log file /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; [root@node1 ~]#
2.2 skip from library error
By simulating the incorrect insertion of data from the slave library, the master-slave error occurs when the master library inserts data normally. After confirming the problem, you can skip a single error or multiple errors from the slave Library in the following ways:
1) Stop from library
mysql> stop replica; Query OK, 0 rows affected (0.00 sec)
2) Set to skip one error from the library. If you need to skip multiple errors, you can set SQL_ replica_ skip_ Set the counter value higher
mysql> set global sql_replica_skip_counter=1; Query OK, 0 rows affected (0.00 sec)
3) Start the slave library, check the slave library status, and find the skipped error
mysql> start replica; Query OK, 0 rows affected (0.03 sec) mysql> show replica status\G
Note: in MySQL 8.0 Before version 26, you can also set the option slave_skip_errors to skip errors, but this option comes from MySQL 8.0 It was abandoned after 26.
3 GTID copy mode slave library error
3.1 simulation error
Here, the [Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND] error will be simulated in the GTID based master-slave replication environment, that is, an error occurs when the master database synchronizes data to the slave database due to misoperation:
1) Query issue data, 3 pieces of data in total
mysql> select * from test.test; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | | 3 | Test | +----+------+ 3 rows in set (0.00 sec)
2) Simulate misoperation and delete data with ID=3 from the library
mysql> delete from test.test where id=3; Query OK, 1 row affected (0.01 sec) mysql> select * from test.test ; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | +----+------+ 2 rows in set (0.00 sec)
3) Delete the master library normally
mysql> delete from test.test where id=3; Query OK, 1 row affected (0.12 sec) mysql> select * from test.test; +----+------+ | id | name | +----+------+ | 1 | Alen | | 2 | Tom | +----+------+ 2 rows in set (0.00 sec)
4) Viewing the copy status from the library, an error occurred
mysql> show replica status\G *************************** 1. row *************************** Replica_IO_State: Waiting for source to send event Source_Host: 192.168.56.201 Source_User: repl Source_Port: 3306 Connect_Retry: 60 Source_Log_File: mysql-bin.000012 Read_Source_Log_Pos: 1626 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 1515 Relay_Source_Log_File: mysql-bin.000012 Replica_IO_Running: Yes Replica_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1032 Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'a036ad34-6325-11ec-9e83-08002790c7d5:7' at master log mysql-bin.000012, end_log_pos 1595. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any. Skip_Counter: 0 Exec_Source_Log_Pos: 1340 Relay_Log_Space: 2010 Until_Condition: None -----------------------------Partial output omission------------------------------------
5) In the slave library, you can view the specific error information according to the replication status or from the error log. It is found that the error is Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND, it is found that the slave database made an error when executing the master database transaction a036ad34-6325-11ec-9e83-08002790c7d5:7, and the corresponding record does not exist in the slave database
mysql> select * from performance_schema.replication_applier_status_by_worker\G *************************** 1. row *************************** CHANNEL_NAME: WORKER_ID: 1 THREAD_ID: NULL SERVICE_STATE: OFF LAST_ERROR_NUMBER: 1032 LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'a036ad34-6325-11ec-9e83-08002790c7d5:7' at master log mysql-bin.000012, end_log_pos 1595; Could not execute Delete_rows event on table test.test; Can't find record in 'test', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000012, end_log_pos 1595 LAST_ERROR_TIMESTAMP: 2021-12-28 22:30:34.569490 LAST_APPLIED_TRANSACTION: a036ad34-6325-11ec-9e83-08002790c7d5:6 LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 22:26:35.893264 LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 22:26:35.893264 LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 22:26:36.050613 LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 2021-12-28 22:26:36.061199 APPLYING_TRANSACTION: a036ad34-6325-11ec-9e83-08002790c7d5:7 APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 22:30:34.411642 APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 22:30:34.411642 APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 22:30:34.569146 -----------------------------Partial output omission------------------------------------
6) Check the transaction a036ad34-6325-11ec-9e83-08002790c7d5:7 in the master database. It is found that an error occurred when deleting the record with ID=3, indicating that the record with ID=3 does not exist in the slave database
[root@node1 ~]# mysqlbinlog --base64-output='decode-rows' -vv --include-gtids='a036ad34-6325-11ec-9e83-08002790c7d5:7' /mysql/mysqldata/binlog/mysql-bin.000012 -----------------------------Partial output omission------------------------------------ BEGIN /*!*/; # at 1494 #211228 22:30:34 server id 201 end_log_pos 1550 CRC32 0x4a598165 Table_map: `test`.`test` mapped to number 94 # at 1550 #211228 22:30:34 server id 201 end_log_pos 1595 CRC32 0x74356eaa Delete_rows: table id 94 flags: STMT_END_F ### DELETE FROM `test`.`test` ### WHERE ### @1=3 /* INT meta=0 nullable=0 is_null=0 */ ### @2='Test' /* VARSTRING(80) meta=80 nullable=1 is_null=0 */ # at 1595 #211228 22:30:34 server id 201 end_log_pos 1626 CRC32 0xe0c85d4c Xid = 68 COMMIT/*!*/; SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/; DELIMITER ;
3.2 skip from library error
By simulating the erroneous deletion of data from the library, the master-slave error occurs when the master library normally deletes data. After confirming the problem, you can skip a single error in the slave Library in the following ways:
1) Stop from library
mysql> stop replica; Query OK, 0 rows affected (0.00 sec)
2) Set the next transaction from the library, the transaction to be skipped
mysql> set gtid_next='a036ad34-6325-11ec-9e83-08002790c7d5:7'; Query OK, 0 rows affected (0.00 sec)
3) Execute an empty transaction
mysql> begin;commit; Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec)
4) Set resume next transaction
mysql> set gtid_next=AUTOMATIC; Query OK, 0 rows affected (0.00 sec)
5) Start the slave library and check the replication status. It is found that the replication status is normal
mysql> start replica; Query OK, 0 rows affected (0.03 sec) mysql> show replica status\G
If the batch error is skipped in GTID mode, refer to the following process:
1) Stop from library
mysql> stop replica; Query OK, 0 rows affected (0.01 sec)
2) Perform a reset from the library
mysql> reset master; Query OK, 0 rows affected (0.17 sec)
3) Set the next batch of transaction gtids from the library_ Purged and gtid of the master library_ Same as executed
mysql> set global gtid_purged='a036ad34-6325-11ec-9e83-08002790c7d5:1'; Query OK, 0 rows affected (0.00 sec)
4) Start slave Library
mysql> start replica; Query OK, 0 rows affected (0.03 sec) mysql> show replica status\G
Through the demonstration of master-slave replication errors caused by the modification of slave libraries in the above two master-slave replication modes, it can be seen that most of the errors are caused by misoperation of slave libraries, so how to avoid them? There are two methods:
1) Set user permissions to avoid misoperation from the library
2) Set slave library read-only, that is, set global read in the slave library setting option_ only=1;
In addition, it should be noted that the two methods of skipping slave library errors are mutually exclusive, that is, SQL cannot be used_ replica_ skip_ Counter skips the scenario where the master-slave replication based on GTID causes an error in the slave library.
Note: in case of master-slave replication error, be sure to repeatedly confirm and verify the correctness of the data when skipping the error!