MySQL 8.0 high availability how to solve the master-slave synchronization error caused by the modification of slave database data

Posted by mkohan on Mon, 03 Jan 2022 07:17:07 +0100

The author's official account is welcome to exchange.

In the process of MySQL master-slave replication, errors are inevitable due to various reasons. How to solve the errors in the replication process? This article will demonstrate how to skip the corresponding errors in the slave database when the master-slave inconsistency is caused by the modification of the slave database data based on the normal replication mode and the GTID replication mode.

1 common errors

1) ERROR: 1062 insert data from library, uniqueness conflict occurred

2) ERROR: 1032 the data to be deleted cannot be found from the library

3) ERROR: 1452 cannot insert or update data in the foreign key table that does not have a reference primary key

2 error in normal copy mode slave Library

2.1 simulation error

Here, the [Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY] error will be simulated in the normal master-slave replication environment, that is, an error occurs when the master database synchronizes data to the slave database due to misoperation:

1) The master database initializes data, and the records with ID=1 and 2 are synchronized to the slave database

mysql> create table test.test(id int auto_increment not null primary key,name varchar(20));

Query OK, 0 rows affected (0.16 sec)



mysql> insert into test.test values(1,'Alen'),(2,'Tom');

Query OK, 2 rows affected (0.02 sec)

Records: 2 Duplicates: 0 Warnings: 0



mysql> select * from test.test;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

+----+------+

2 rows in set (0.01 sec)

2) View the data synchronized from the library, the master and slave are consistent, and then insert a record with data ID=3 in the slave library

mysql> select * from test.test;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

+----+------+

2 rows in set (0.00 sec)



mysql> insert into test.test values(3,'Test');

Query OK, 1 row affected (0.00 sec)



mysql> select * from test.test;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

| 3 | Test |

+----+------+

3 rows in set (0.00 sec)

3) Insert a record with ID=3 in the main library

mysql> insert into test.test values(3,'Test');

Query OK, 1 row affected (0.55 sec)



mysql> select * from test.test;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

| 3 | Test |

+----+------+

3 rows in set (0.00 sec)

4) Viewing the copy status from the library, an error was found

mysql> show replica status\G

*************************** 1. row ***************************

Replica_IO_State: Waiting for source to send event

Source_Host: 192.168.56.201

Source_User: repl

Source_Port: 3306

Connect_Retry: 60

Source_Log_File: mysql-bin.000009

Read_Source_Log_Pos: 1405

Relay_Log_File: mysql-relay-bin.000013

Relay_Log_Pos: 1295

Relay_Source_Log_File: mysql-bin.000009

Replica_IO_Running: Yes

Replica_SQL_Running: No

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 1062

Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log mysql-bin.000009, end_log_pos 1374. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.

Skip_Counter: 0

Exec_Source_Log_Pos: 1120

Relay_Log_Space: 1957

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Source_SSL_Allowed: No

-----------------------------Partial output omission------------------------------------

5) In the slave library, you can view the specific error information according to the replication status or from the error log. It is found that the error is Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY, you can find test The record with test table ID=3 caused

mysql> select * from performance_schema.replication_applier_status_by_worker\G

*************************** 1. row ***************************

CHANNEL_NAME:

WORKER_ID: 1

THREAD_ID: NULL

SERVICE_STATE: OFF

LAST_ERROR_NUMBER: 1062

LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log mysql-bin.000009, end_log_pos 1374; Could not execute Write_rows event on table test.test; Duplicate entry '3' for key 'test.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000009, end_log_pos 1374

LAST_ERROR_TIMESTAMP: 2021-12-28 21:39:20.595365

LAST_APPLIED_TRANSACTION: ANONYMOUS

LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 21:35:45.222824

LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 21:35:45.222824

LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 21:35:45.377695

LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 2021-12-28 21:35:45.399679

APPLYING_TRANSACTION: ANONYMOUS

APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 21:39:20.436380

APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 21:39:20.436380

APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 21:39:20.595012

-----------------------------Partial output omission------------------------------------

6) You can also query binlog in the master database according to the copy error information from the slave database to find out the specific error content. It is found that it is caused by the record with ID=3

[root@node1 ~]# mysqlbinlog --base64-output='decode-rows' -vv --start-position=1120 --stop-position=1374 /mysql/mysqldata/binlog/mysql-bin.000009

# The proper term is pseudo_replica_mode, but we use this compatibility alias

# to make the statement usable on server versions 8.0.24 and older.

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;

/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;

DELIMITER /*!*/;

# at 196

#211228 21:17:44 server id 201 end_log_pos 125 CRC32 0x71343e2d Start: binlog v 4, server v 8.0.27 created 211228 21:17:44 at startup

# Warning: this binlog is either in use or was not closed properly.

ROLLBACK/*!*/;

# at 1120

#211228 21:39:19 server id 201 end_log_pos 1199 CRC32 0x98888e0c Anonymous_GTID last_committed=4 sequence_number=5 rbr_only=yes original_committed_timestamp=1640698760436380 immediate_commit_timestamp=1640698760436380 transaction_length=285

/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;

# original_commit_timestamp=1640698760436380 (2021-12-28 21:39:20.436380 CST)

# immediate_commit_timestamp=1640698760436380 (2021-12-28 21:39:20.436380 CST)

/*!80001 SET @@session.original_commit_timestamp=1640698760436380*//*!*/;

/*!80014 SET @@session.original_server_version=80027*//*!*/;

/*!80014 SET @@session.immediate_server_version=80027*//*!*/;

SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;

# at 1199

#211228 21:39:19 server id 201 end_log_pos 1274 CRC32 0x2cbcc0e6 Query thread_id=9 exec_time=0 error_code=0

SET TIMESTAMP=1640698759/*!*/;

SET @@session.pseudo_thread_id=9/*!*/;

SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;

SET @@session.sql_mode=1168113696/*!*/;

SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;

/*!\C utf8mb4 *//*!*/;

SET @@session.character_set_client=255,@@session.collation_connection=255,@@session.collation_server=46/*!*/;

SET @@session.lc_time_names=0/*!*/;

SET @@session.collation_database=DEFAULT/*!*/;

/*!80011 SET @@session.default_collation_for_utf8mb4=255*//*!*/;

BEGIN

/*!*/;

# at 1274

#211228 21:39:19 server id 201 end_log_pos 1330 CRC32 0x3fb4a8d7 Table_map: `test`.`test` mapped to number 94

# at 1330

#211228 21:39:19 server id 201 end_log_pos 1374 CRC32 0x473cec6d Write_rows: table id 94 flags: STMT_END_F

### INSERT INTO `test`.`test`

### SET

### @1=3 /* INT meta=0 nullable=0 is_null=0 */

### @2='Tom' /* VARSTRING(80) meta=80 nullable=1 is_null=0 */

ROLLBACK /* added by mysqlbinlog */ /*!*/;

SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;

DELIMITER ;

# End of log file

/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

[root@node1 ~]#

2.2 skip from library error

By simulating the incorrect insertion of data from the slave library, the master-slave error occurs when the master library inserts data normally. After confirming the problem, you can skip a single error or multiple errors from the slave Library in the following ways:

1) Stop from library

mysql> stop replica;

Query OK, 0 rows affected (0.00 sec)

2) Set to skip one error from the library. If you need to skip multiple errors, you can set SQL_ replica_ skip_ Set the counter value higher

mysql> set global sql_replica_skip_counter=1;

Query OK, 0 rows affected (0.00 sec)

3) Start the slave library, check the slave library status, and find the skipped error

mysql> start replica;

Query OK, 0 rows affected (0.03 sec)



mysql> show replica status\G

Note: in MySQL 8.0 Before version 26, you can also set the option slave_skip_errors to skip errors, but this option comes from MySQL 8.0 It was abandoned after 26.

3 GTID copy mode slave library error

3.1 simulation error

Here, the [Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND] error will be simulated in the GTID based master-slave replication environment, that is, an error occurs when the master database synchronizes data to the slave database due to misoperation:

1) Query issue data, 3 pieces of data in total

mysql> select * from test.test;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

| 3 | Test |

+----+------+

3 rows in set (0.00 sec)

2) Simulate misoperation and delete data with ID=3 from the library

mysql> delete from test.test where id=3;

Query OK, 1 row affected (0.01 sec)



mysql> select * from test.test ;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

+----+------+

2 rows in set (0.00 sec)

3) Delete the master library normally

mysql> delete from test.test where id=3;

Query OK, 1 row affected (0.12 sec)



mysql> select * from test.test;

+----+------+

| id | name |

+----+------+

| 1 | Alen |

| 2 | Tom |

+----+------+

2 rows in set (0.00 sec)

4) Viewing the copy status from the library, an error occurred

mysql> show replica status\G

*************************** 1. row ***************************

Replica_IO_State: Waiting for source to send event

Source_Host: 192.168.56.201

Source_User: repl

Source_Port: 3306

Connect_Retry: 60

Source_Log_File: mysql-bin.000012

Read_Source_Log_Pos: 1626

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 1515

Relay_Source_Log_File: mysql-bin.000012

Replica_IO_Running: Yes

Replica_SQL_Running: No

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 1032

Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'a036ad34-6325-11ec-9e83-08002790c7d5:7' at master log mysql-bin.000012, end_log_pos 1595. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.

Skip_Counter: 0

Exec_Source_Log_Pos: 1340

Relay_Log_Space: 2010

Until_Condition: None

-----------------------------Partial output omission------------------------------------

5) In the slave library, you can view the specific error information according to the replication status or from the error log. It is found that the error is Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND, it is found that the slave database made an error when executing the master database transaction a036ad34-6325-11ec-9e83-08002790c7d5:7, and the corresponding record does not exist in the slave database

mysql> select * from performance_schema.replication_applier_status_by_worker\G

*************************** 1. row ***************************

CHANNEL_NAME:

WORKER_ID: 1

THREAD_ID: NULL

SERVICE_STATE: OFF

LAST_ERROR_NUMBER: 1032

LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'a036ad34-6325-11ec-9e83-08002790c7d5:7' at master log mysql-bin.000012, end_log_pos 1595; Could not execute Delete_rows event on table test.test; Can't find record in 'test', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000012, end_log_pos 1595

LAST_ERROR_TIMESTAMP: 2021-12-28 22:30:34.569490

LAST_APPLIED_TRANSACTION: a036ad34-6325-11ec-9e83-08002790c7d5:6

LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 22:26:35.893264

LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 22:26:35.893264

LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 22:26:36.050613

LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 2021-12-28 22:26:36.061199

APPLYING_TRANSACTION: a036ad34-6325-11ec-9e83-08002790c7d5:7

APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-12-28 22:30:34.411642

APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-12-28 22:30:34.411642

APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2021-12-28 22:30:34.569146

-----------------------------Partial output omission------------------------------------

6) Check the transaction a036ad34-6325-11ec-9e83-08002790c7d5:7 in the master database. It is found that an error occurred when deleting the record with ID=3, indicating that the record with ID=3 does not exist in the slave database

[root@node1 ~]# mysqlbinlog --base64-output='decode-rows' -vv --include-gtids='a036ad34-6325-11ec-9e83-08002790c7d5:7' /mysql/mysqldata/binlog/mysql-bin.000012

-----------------------------Partial output omission------------------------------------

BEGIN

/*!*/;

# at 1494

#211228 22:30:34 server id 201 end_log_pos 1550 CRC32 0x4a598165 Table_map: `test`.`test` mapped to number 94

# at 1550

#211228 22:30:34 server id 201 end_log_pos 1595 CRC32 0x74356eaa Delete_rows: table id 94 flags: STMT_END_F

### DELETE FROM `test`.`test`

### WHERE

### @1=3 /* INT meta=0 nullable=0 is_null=0 */

### @2='Test' /* VARSTRING(80) meta=80 nullable=1 is_null=0 */

# at 1595

#211228 22:30:34 server id 201 end_log_pos 1626 CRC32 0xe0c85d4c Xid = 68

COMMIT/*!*/;

SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;

DELIMITER ;

3.2 skip from library error

By simulating the erroneous deletion of data from the library, the master-slave error occurs when the master library normally deletes data. After confirming the problem, you can skip a single error in the slave Library in the following ways:

1) Stop from library

mysql> stop replica;

Query OK, 0 rows affected (0.00 sec)

2) Set the next transaction from the library, the transaction to be skipped

mysql> set gtid_next='a036ad34-6325-11ec-9e83-08002790c7d5:7';

Query OK, 0 rows affected (0.00 sec)

3) Execute an empty transaction

mysql> begin;commit;

Query OK, 0 rows affected (0.00 sec)



Query OK, 0 rows affected (0.00 sec)

4) Set resume next transaction

mysql> set gtid_next=AUTOMATIC;

Query OK, 0 rows affected (0.00 sec)

5) Start the slave library and check the replication status. It is found that the replication status is normal

mysql> start replica;

Query OK, 0 rows affected (0.03 sec)



mysql> show replica status\G

If the batch error is skipped in GTID mode, refer to the following process:

1) Stop from library

mysql> stop replica;

Query OK, 0 rows affected (0.01 sec)

2) Perform a reset from the library

mysql> reset master;

Query OK, 0 rows affected (0.17 sec)

3) Set the next batch of transaction gtids from the library_ Purged and gtid of the master library_ Same as executed

mysql> set global gtid_purged='a036ad34-6325-11ec-9e83-08002790c7d5:1';

Query OK, 0 rows affected (0.00 sec)

4) Start slave Library

mysql> start replica;

Query OK, 0 rows affected (0.03 sec)



mysql> show replica status\G

Through the demonstration of master-slave replication errors caused by the modification of slave libraries in the above two master-slave replication modes, it can be seen that most of the errors are caused by misoperation of slave libraries, so how to avoid them? There are two methods:

1) Set user permissions to avoid misoperation from the library

2) Set slave library read-only, that is, set global read in the slave library setting option_ only=1;

In addition, it should be noted that the two methods of skipping slave library errors are mutually exclusive, that is, SQL cannot be used_ replica_ skip_ Counter skips the scenario where the master-slave replication based on GTID causes an error in the slave library.

Note: in case of master-slave replication error, be sure to repeatedly confirm and verify the correctness of the data when skipping the error!

Topics: Database MySQL