Parallel replication records updated from the library do not exist, but exist
background
The semi synchronous slave database SQL thread with parallel replication enabled reports 1032 errors, while the asynchronous slave database does not report any errors. This happens occasionally
Analysis
Version mysql 5.7.16
mysql> show variables like '%slave_para%';
+------------------------+---------------+
| Variable_name | Value |
+------------------------+---------------+
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 16 |
+------------------------+---------------+
1. It is suspected that the updated record does not exist due to the update operation on the slave database
2. View error log and find
2018-04-03T10:11:47.720156+08:00 16 [ERROR] Slave SQL for channel '': Worker 13 failed executing transaction 'a272bbcf-874f-11e7-a288-00505695b721:687871861' at master log mysql-bin.004119, end_log_pos 376471678; Could not execute Update_rows event on table anytxn.seq_xxxx; Can't find record in 'seq_xxxx', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.004119, end_log_pos 376471678, Error_code: 1032
2018-04-03T10:11:47.720230+08:00 2 [Warning] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2018-04-03T10:11:47.720959+08:00 2 [Note] Error reading relay log event for channel '': slave SQL thread was killed
3. binlog discovery from position analysis of SQL thread stop
SET @@SESSION.GTID_NEXT= 'a272bbcf-874f-11e7-a288-00505695b721:687871861'/*!*/; # at 376471694 #180403 10:11:47 server id 104073 end_log_pos 376471555 CRC32 0x1be91176 Query thread_id=2086049 exec_time=0 error_code=0 SET TIMESTAMP=1522721507/*!*/; BEGIN /*!*/; # at 376471768 #180403 10:11:47 server id 104073 end_log_pos 376471616 CRC32 0x10644d77 Table_map: `anytxn`.`seq_xxxx` mapped to number 301 # at 376471829 #180403 10:11:47 server id 104073 end_log_pos 376471678 CRC32 0x871a9787 Update_rows: table id 301 flags: STMT_END_F ### UPDATE `anytxn`.`seq_xxxx` ### WHERE ### @1=7116088 /* LONGINT meta=0 nullable=0 is_null=0 */ ### @2=1 /* INT meta=0 nullable=0 is_null=0 */ ### SET ### @1=7116089 /* LONGINT meta=0 nullable=0 is_null=0 */ ### @2=1 /* INT meta=0 nullable=0 is_null=0 */ # at 376471891 #180403 10:11:47 server id 104073 end_log_pos 376471709 CRC32 0x9eb59238 Xid = 22247621418 COMMIT/*!*/; # at 376471922 #180403 10:11:47 server id 104073 end_log_pos 376471774 CRC32 0xf7b6ad5d GTID last_committed=641254 sequence_number=641259 SET @@SESSION.GTID_NEXT= 'a272bbcf-874f-11e7-a288-00505695b721:687871862'/*!*/; # at 376471987 #180403 10:11:47 server id 104073 end_log_pos 376471856 CRC32 0x6256de00 Query thread_id=2085350 exec_time=0 error_code=0 SET TIMESTAMP=1522721507/*!*/; BEGIN /*!*/; # at 376472069 #180403 10:11:47 server id 104073 end_log_pos 376471979 CRC32 0x6c329578 Table_map: `anytxn`.`bm_cc_customer_address_info` mapped to number 1569 # at 376472192 #180403 10:11:47 server id 104073 end_log_pos 376472162 CRC32 0x834cc8b9 Write_rows: table id 1569 flags: STMT_END_F ### INSERT INTO `anytxn`.`bm_xxxxxxxxxxxxxx` ### SET ### @1=14480779 /* LONGINT meta=0 nullable=0 is_null=0 */ ### @2='0000001002380654' /* STRING(96) meta=65120 nullable=0 is_null=0 */ ### @3='B001' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */ ### @4=NULL /* STRING(12) meta=65036 nullable=1 is_null=1 */ ### @5='10000010001202000000001' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */ ### @6='B00' /* STRING(9) meta=65033 nullable=1 is_null=0 */ ### @7='xxxxxxxxxxx' /* VARSTRING(765) meta=765 nullable=1 is_null=0 */ ### @8=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */ ### @9=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */ ### @10=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
mysql@ACSXFDB6.xwbank.com:/home/mysql> mysqlbinlog -v -v --start-datetime='2018-04-03 10:11:45' --stop-datetime='2018-04-03 10:11:48' /data/mysql/ACSXFDB6-relay-bin.005477 | grep last_comm | grep 10:11:47 | grep 641254 #180403 10:11:47 server id 104073 end_log_pos 376469618 CRC32 0xb6dc6cef GTID last_committed=641227 sequence_number=641254 #180403 10:11:47 server id 104073 end_log_pos 376471774 CRC32 0xf7b6ad5d GTID last_committed=641254 sequence_number=641259 #180403 10:11:47 server id 104073 end_log_pos 376472258 CRC32 0x27cf3013 GTID last_committed=641254 sequence_number=641260
From the above information, it can be seen that the update record does not exist. It is the record whose ID is 7116088 that is updating the anytxn.seq XXXX table
There are two concurrent committed transactions last_committed=641254, 687871861 transactions that are not found in the updated record, and committed transaction sequence_number=641260 (that is, the operation of insert ing another table)
4. View tables and related records where the updated records do not exist
show create table seq_xxxx;
| seq_xxxx | CREATE TABLE seq_xxxx (
currentValue bigint(20) NOT NULL,
increment int(11) NOT NULL DEFAULT '1'
mysql> select * from seq_xxxx;
+--------------+-----------+
| currentValue | increment |
+--------------+-----------+
| 7116088 | 1 |
+--------------+-----------+
It can be found that the record exists in the actual database
test
mysql> select @@version;
+------------+
| @@version |
+------------+
| 5.7.19-log |
+------------+
1 row in set (0.00 sec)
mysql> show variables like '%para%';
+------------------------+---------------+
| Variable_name | Value |
+------------------------+---------------+
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 4 |
+------------------------+---------------+
sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=10.186.30.73 --mysql-socket=/opt/mysql/data/3307/mysqld.sock --mysql-port=3307 --db-driver=mysql --mysql-db=test --mysql-user=admin --mysql-password=admin --table_size=100000 --tables=5 --threads=100 --time=120 --report-interval=5 run
There are concurrent committed transactions, but no simulated re issue update records exist, but they actually exist
Related bug links
Repeated multi-threaded slave replication failures