MySQL parallel replication records updated from the database do not exist, but exist

Posted by johnkelly on Tue, 31 Mar 2020 08:41:14 +0200

Parallel replication records updated from the library do not exist, but exist

background

The semi synchronous slave database SQL thread with parallel replication enabled reports 1032 errors, while the asynchronous slave database does not report any errors. This happens occasionally

Analysis

Version mysql 5.7.16
mysql> show variables like '%slave_para%';
+------------------------+---------------+
| Variable_name | Value |
+------------------------+---------------+
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 16 |
+------------------------+---------------+
1. It is suspected that the updated record does not exist due to the update operation on the slave database
2. View error log and find

2018-04-03T10:11:47.720156+08:00 16 [ERROR] Slave SQL for channel '': Worker 13 failed executing transaction 'a272bbcf-874f-11e7-a288-00505695b721:687871861' at master log mysql-bin.004119, end_log_pos 376471678; Could not execute Update_rows event on table anytxn.seq_xxxx; Can't find record in 'seq_xxxx', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.004119, end_log_pos 376471678, Error_code: 1032

2018-04-03T10:11:47.720230+08:00 2 [Warning] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2018-04-03T10:11:47.720959+08:00 2 [Note] Error reading relay log event for channel '': slave SQL thread was killed

3. binlog discovery from position analysis of SQL thread stop

SET @@SESSION.GTID_NEXT= 'a272bbcf-874f-11e7-a288-00505695b721:687871861'/*!*/;
# at 376471694
#180403 10:11:47 server id 104073  end_log_pos 376471555 CRC32 0x1be91176   Query   thread_id=2086049   exec_time=0 error_code=0
SET TIMESTAMP=1522721507/*!*/;
BEGIN
/*!*/;
# at 376471768
#180403 10:11:47 server id 104073  end_log_pos 376471616 CRC32 0x10644d77   Table_map: `anytxn`.`seq_xxxx` mapped to number 301
# at 376471829
#180403 10:11:47 server id 104073  end_log_pos 376471678 CRC32 0x871a9787   Update_rows: table id 301 flags: STMT_END_F

### UPDATE `anytxn`.`seq_xxxx`
### WHERE
###   @1=7116088 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=1 /* INT meta=0 nullable=0 is_null=0 */
### SET
###   @1=7116089 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=1 /* INT meta=0 nullable=0 is_null=0 */
# at 376471891
#180403 10:11:47 server id 104073  end_log_pos 376471709 CRC32 0x9eb59238   Xid = 22247621418
COMMIT/*!*/;
# at 376471922
#180403 10:11:47 server id 104073  end_log_pos 376471774 CRC32 0xf7b6ad5d   GTID    last_committed=641254   sequence_number=641259
SET @@SESSION.GTID_NEXT= 'a272bbcf-874f-11e7-a288-00505695b721:687871862'/*!*/;
# at 376471987
#180403 10:11:47 server id 104073  end_log_pos 376471856 CRC32 0x6256de00   Query   thread_id=2085350   exec_time=0 error_code=0
SET TIMESTAMP=1522721507/*!*/;
BEGIN
/*!*/;
# at 376472069
#180403 10:11:47 server id 104073  end_log_pos 376471979 CRC32 0x6c329578   Table_map: `anytxn`.`bm_cc_customer_address_info` mapped to number 1569
# at 376472192
#180403 10:11:47 server id 104073  end_log_pos 376472162 CRC32 0x834cc8b9   Write_rows: table id 1569 flags: STMT_END_F


### INSERT INTO `anytxn`.`bm_xxxxxxxxxxxxxx`
### SET
###   @1=14480779 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='0000001002380654' /* STRING(96) meta=65120 nullable=0 is_null=0 */
###   @3='B001' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */
###   @4=NULL /* STRING(12) meta=65036 nullable=1 is_null=1 */
###   @5='10000010001202000000001' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */
###   @6='B00' /* STRING(9) meta=65033 nullable=1 is_null=0 */
###   @7='xxxxxxxxxxx' /* VARSTRING(765) meta=765 nullable=1 is_null=0 */
###   @8=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
###   @9=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
###   @10=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
 mysql@ACSXFDB6.xwbank.com:/home/mysql>  mysqlbinlog -v -v --start-datetime='2018-04-03 10:11:45' --stop-datetime='2018-04-03 10:11:48'  /data/mysql/ACSXFDB6-relay-bin.005477 | grep last_comm | grep 10:11:47 | grep 641254
#180403 10:11:47 server id 104073  end_log_pos 376469618 CRC32 0xb6dc6cef   GTID    last_committed=641227   sequence_number=641254
#180403 10:11:47 server id 104073  end_log_pos 376471774 CRC32 0xf7b6ad5d   GTID    last_committed=641254   sequence_number=641259
#180403 10:11:47 server id 104073  end_log_pos 376472258 CRC32 0x27cf3013   GTID    last_committed=641254   sequence_number=641260

From the above information, it can be seen that the update record does not exist. It is the record whose ID is 7116088 that is updating the anytxn.seq XXXX table
There are two concurrent committed transactions last_committed=641254, 687871861 transactions that are not found in the updated record, and committed transaction sequence_number=641260 (that is, the operation of insert ing another table)

4. View tables and related records where the updated records do not exist

show create table seq_xxxx;
| seq_xxxx | CREATE TABLE seq_xxxx (
currentValue bigint(20) NOT NULL,
increment int(11) NOT NULL DEFAULT '1'

mysql> select * from seq_xxxx;
+--------------+-----------+
| currentValue | increment |
+--------------+-----------+
| 7116088 | 1 |
+--------------+-----------+
It can be found that the record exists in the actual database

test

mysql> select @@version;
+------------+
| @@version |
+------------+
| 5.7.19-log |
+------------+
1 row in set (0.00 sec)
mysql> show variables like '%para%';
+------------------------+---------------+
| Variable_name | Value |
+------------------------+---------------+
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 4 |
+------------------------+---------------+

sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=10.186.30.73 --mysql-socket=/opt/mysql/data/3307/mysqld.sock --mysql-port=3307  --db-driver=mysql  --mysql-db=test --mysql-user=admin --mysql-password=admin --table_size=100000 --tables=5 --threads=100 --time=120 --report-interval=5  run

There are concurrent committed transactions, but no simulated re issue update records exist, but they actually exist

Related bug links
Repeated multi-threaded slave replication failures

Topics: MySQL SQL Database Session