What is the architecture of master-slave replication?
One master and one slave: writes go to the master database and reads go to the slave database.
The slave should be set to read-only. Otherwise it receives writes from both the master (through replication) and clients, and keeping the data in sync becomes messy.
What is the role of master-slave replication?
- Hot backup. If the master goes down, the slave can take over and keep the service running.
- Read/write separation improves performance, especially when some queries are expensive.
- Easier scaling of the architecture, since a single master is often not enough later on. For example, if write demand is heavy, writes still go to the master, and the data can be spread across databases split by business domain or by hash. If query demand is heavy, you can add slaves: one master and two slaves, or even one master and many slaves. To reduce the pressure on the master, cascading replication can be used (with A as the master).
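As a rough illustration of read/write separation, the sketch below sends write statements to the master and spreads reads across slaves in round-robin order. The class name `ReadWriteRouter`, the server names, and the list of write verbs are assumptions made for this example; real routing is usually handled by a proxy or the database driver.

```python
import itertools

# Statement verbs treated as writes in this sketch (an illustrative list).
WRITE_VERBS = ("insert", "update", "delete", "replace", "create", "alter", "drop")


class ReadWriteRouter:
    """Route writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].lower()
        if verb in WRITE_VERBS:
            return self.master
        return next(self._readers)


router = ReadWriteRouter("master-A", ["slave-B", "slave-C"])
targets = [router.route(q) for q in (
    "INSERT INTO t1 VALUES (1)",
    "SELECT * FROM t1",
    "SELECT * FROM t1",
    "UPDATE t1 SET id = 2",
)]
print(targets)  # ['master-A', 'slave-B', 'slave-C', 'master-A']
```

Adding another slave only means passing a longer list, which is the scaling property described above.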
Principle of master-slave replication?
The process is roughly as follows:
- The master executes a write SQL statement and records it in the binlog
- The slave connects to the master (this is why replication is asynchronous: the master transmits the binlog to the slave only after the slave initiates a request)
- The master creates a binlog dump thread that sends binlog events to the slave
- The slave creates an I/O thread that reads the binlog content and writes it to the relay log
- The slave creates an SQL thread that reads events from the relay log and applies them to the slave database
To sum up, there are two logs and three threads.
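The "two logs and three threads" flow can be imitated with a toy model. This is only a sketch of the data flow: Python queues stand in for the network connection and the relay log, plain dicts stand in for the databases, and all function names are hypothetical.

```python
import queue
import threading

STOP = None  # sentinel marking the end of the event stream


def binlog_dump_thread(binlog, network):
    """Master side: send each binlog event to the slave."""
    for event in binlog:
        network.put(event)
    network.put(STOP)


def io_thread(network, relay_log):
    """Slave side: receive events and append them to the relay log."""
    while True:
        event = network.get()
        relay_log.put(event)
        if event is STOP:
            break


def sql_thread(relay_log, slave_data):
    """Slave side: replay relay-log events against the slave's data."""
    while True:
        event = relay_log.get()
        if event is STOP:
            break
        key, value = event
        slave_data[key] = value


# Log 1: the master's binlog, filled as writes are executed on the master.
master_data = {}
binlog = []
for key, value in [("a", 1), ("b", 2), ("a", 3)]:
    master_data[key] = value
    binlog.append((key, value))

network = queue.Queue()    # stands in for the master-slave connection
relay_log = queue.Queue()  # log 2: the slave's relay log
slave_data = {}

threads = [
    threading.Thread(target=binlog_dump_thread, args=(binlog, network)),
    threading.Thread(target=io_thread, args=(network, relay_log)),
    threading.Thread(target=sql_thread, args=(relay_log, slave_data)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(slave_data == master_data)  # True
```

The master finishes its writes before the slave has applied anything, which is the asynchronous behaviour the steps describe.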
Delay of master-slave replication
As the steps above show, one round of master-slave replication involves two log writes to disk (the binlog and the relay log). Although the two databases look synchronized at a macro level, replication is actually asynchronous, so there is always some delay, though it is usually small.
The delay depends on the complexity of the architecture: a one-master-one-slave setup and a one-master-multi-slave setup will have different delays.
It also depends on the physical machines. Replication within one physical machine generally has less delay than between two machines, and replication within one data center generally has less delay than between two data centers.
In the setups I have used, the delay was generally within 1 ms, which is close to no delay at all.
Under what circumstances will master-slave inconsistency occur?
Generally speaking, master-slave inconsistency arises in the following scenarios.
Network Delay
This is mainly a hardware problem. For example, network delay across data centers can be long enough that the resulting master-slave inconsistency affects normal use. If you really do run into inconsistency caused by excessive network delay, the only option is to read selectively from the master.
For example, suppose the delay is 1 s:
- The application writes a row to the master and records the written key in a cache
- If a read request hits the cache within one second, the slave is assumed not to have finished synchronizing yet, so the request is sent to the master instead
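The selective read described above could be sketched as follows. The class `RecentWriteTracker`, the key format, and the injectable clock are assumptions for illustration; in practice the cache would typically live in a shared store rather than in process memory.

```python
import time


class RecentWriteTracker:
    """Remember recent writes so that reads of fresh keys go to the master."""

    def __init__(self, max_delay_seconds=1.0, clock=time.monotonic):
        self.max_delay = max_delay_seconds
        self.clock = clock
        self._written_at = {}

    def record_write(self, key):
        # Called right after the row has been written to the master.
        self._written_at[key] = self.clock()

    def choose_source(self, key):
        written = self._written_at.get(key)
        if written is not None and self.clock() - written < self.max_delay:
            return "master"  # the slave may not have caught up yet
        self._written_at.pop(key, None)  # entry expired or never existed
        return "slave"


# A fake clock keeps the demonstration deterministic.
now = [100.0]
tracker = RecentWriteTracker(1.0, clock=lambda: now[0])
tracker.record_write("user:42")
first = tracker.choose_source("user:42")   # still inside the 1 s window
now[0] += 1.5
second = tracker.choose_source("user:42")  # window has passed
print(first, second)  # master slave
```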
The loads of the master and slave machines are inconsistent
During master-slave replication, the master needs to create one thread and the slave needs to create two. If either machine is short of resources, thread creation may fail or stall, which leads to inconsistency between master and slave.
Parameter configuration problem: inconsistent max_allowed_packet settings
If writing a long SQL statement causes problems, take a look at this parameter. It is likely that the value on the master is relatively large and allows the long statement to execute, while the value on the slave is relatively small, so the same statement cannot be executed there.
Inconsistent auto-increment keys
This is generally rare. Suspect it only when a large number of rows are inconsistent and the differences follow an obvious pattern.
Setting of synchronization parameters
- innodb_flush_log_at_trx_commit: controls when the redo log is flushed to disk
- sync_binlog: controls when the binlog is flushed to disk
For details, see the article "Do you know the two important configuration parameters of MySQL master-slave replication?".
These two parameters control the write delay. If they are not set properly and MySQL goes down before the log has been written to disk, data and transactions are lost, and master-slave inconsistency naturally follows.
Asynchronous replication itself
With asynchronous replication, the master writes the binlog and returns to the client immediately, regardless of whether the slave has synchronized. Semi-synchronous replication is available in MySQL 5.7. The relevant setting is as follows
- after_sync: the master does not commit the transaction right away; it commits only after the slave has acknowledged receiving it
Duplicate server ID
The server IDs of several slaves are duplicated because of a manual configuration mistake
Version inconsistency
This is relatively rare. The master and the slaves are normally set up with the same version
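Several of the causes above (max_allowed_packet, duplicate server IDs, version mismatch) come down to comparing settings between servers. The sketch below assumes the variables have already been collected into plain dicts, for example from SHOW VARIABLES; the function name, the server names, and every sample value are hypothetical.

```python
def find_config_problems(master, slaves):
    """Compare each slave's settings with the master's.

    master: dict of variables; slaves: dict mapping a slave name to its variables.
    """
    problems = []
    seen_ids = {master["server_id"]: "master"}
    for name, slave in slaves.items():
        if slave["max_allowed_packet"] < master["max_allowed_packet"]:
            problems.append(f"{name}: max_allowed_packet smaller than master")
        if slave["version"] != master["version"]:
            problems.append(f"{name}: version differs from master")
        # setdefault returns the first server that claimed this ID.
        owner = seen_ids.setdefault(slave["server_id"], name)
        if owner != name:
            problems.append(f"{name}: server_id duplicates {owner}")
    return problems


MB = 1024 * 1024
master = {"server_id": 1, "max_allowed_packet": 64 * MB, "version": "5.7.30"}
slaves = {
    "slave1": {"server_id": 2, "max_allowed_packet": 4 * MB, "version": "5.7.30"},
    "slave2": {"server_id": 2, "max_allowed_packet": 64 * MB, "version": "5.7.30"},
}
problems = find_config_problems(master, slaves)
for line in problems:
    print(line)
```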
How to solve the inconsistency between master and slave
You can refer to the article "mysql master-slave synchronization inconsistency solution".
Discovering master-slave inconsistency
Check the thread status on the master
show processlist;
Check the status of the master
show master status;
Check the status of the slave. Output like the following indicates that the SQL thread has stopped

    mysql> show slave status\G
    Slave_IO_Running: Yes
    Slave_SQL_Running: No

Then look at the error message to see what went wrong
Delete failed
Last_SQL_Error: Could not execute Delete_rows event on table hcy.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000006, end_log_pos 254
Since the row has already been deleted on the master and cannot be found on the slave, the slave can simply skip this event

    stop slave;
    -- skip one event
    set global sql_slave_skip_counter = 1;
    start slave;
Duplicate primary key
The record already exists in the slave, and the same record is inserted in the master
Last_SQL_Error: Could not execute Write_rows event on table hcy.t1; Duplicate entry '2' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000006, end_log_pos 924
Check this row on both the master and the slave, then delete the row with the duplicate primary key on the slave
Update lost
A row is updated on the master but cannot be found on the slave
Last_SQL_Error: Could not execute Update_rows event on table hcy.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 794
Insert the missing row on the slave
The fixes above are for small amounts of inconsistent data, which can be repaired by hand. If the data set is large and there are too many inconsistent rows to handle manually, the only option is to resynchronize
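The three error messages above share one shape, so a small parser can pull out the event type, table, error code, and binlog position and map them to the manual fix described for each case. This is a sketch: the regular expression is written only against the sample messages in this article, and `diagnose` and `SUGGESTIONS` are hypothetical names.

```python
import re

ERROR_PATTERN = re.compile(
    r"Could not execute (?P<event>\w+) event on table (?P<table>[\w.]+);"
    r".*?Error_code: (?P<code>\d+);"
    r".*?master log (?P<log>[\w.\-]+), end_log_pos (?P<pos>\d+)"
)

# (event type, error code) -> the manual fix described in the text.
SUGGESTIONS = {
    ("Delete_rows", 1032): "row already gone on the slave: skip the event",
    ("Write_rows", 1062): "row already exists on the slave: delete the duplicate row",
    ("Update_rows", 1032): "row missing on the slave: insert the missing row",
}


def diagnose(last_sql_error):
    """Return parsed details of a Last_SQL_Error string, or None if it does not match."""
    match = ERROR_PATTERN.search(last_sql_error)
    if match is None:
        return None
    event, code = match["event"], int(match["code"])
    return {
        "table": match["table"],
        "binlog": match["log"],
        "position": int(match["pos"]),
        "suggestion": SUGGESTIONS.get((event, code), "unknown: inspect manually"),
    }


sample = (
    "Could not execute Write_rows event on table hcy.t1; "
    "Duplicate entry '2' for key 'PRIMARY', Error_code: 1062; "
    "handler error HA_ERR_FOUND_DUPP_KEY; "
    "the event's master log mysql-bin.000006, end_log_pos 924"
)
result = diagnose(sample)
print(result["table"], result["binlog"], result["position"])
print(result["suggestion"])
```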
Resynchronize
Use a hot backup tool, such as IBackup or xtrabackup.
These tools do not lock tables during the backup. Much like MVCC, they take a snapshot that can be read while the database stays in use.
Data changes made during the backup are recorded and applied after the backup completes.
How to monitor master-slave inconsistency
Through status parameters
    mysql> show slave status\G;
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 192.168.1.205
    Master_User: repl
    Master_Port: 3306
    Connect_Retry: 30
    Master_Log_File: edu-mysql-bin.000008
    Read_Master_Log_Pos: 120
    Relay_Log_File: edu-mysql-relay-bin.000002
    Relay_Log_Pos: 287
    Relay_Master_Log_File: edu-mysql-bin.000008
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 120
    Relay_Log_Space: 464
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 0
    Last_SQL_Error:
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 205
    Master_UUID: 7402509d-fd14-11e5-bfd0-000c2963dd15
    Master_Info_File: /home/mysql/data/master.info
    SQL_Delay: 0
    SQL_Remaining_Delay: NULL
    Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
    Master_Retry_Count: 86400
    Master_Bind:
    Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp:
    Master_SSL_Crl:
    Master_SSL_Crlpath:
    Retrieved_Gtid_Set:
    Executed_Gtid_Set:
    Auto_Position: 0
    1 row in set (0.00 sec)
- Slave_IO_Running
This parameter can be used as the monitoring item for io_thread. Yes means the connection between io_thread and the master is normal and replication can proceed. No indicates that communication with the master is abnormal; in most cases this is caused by the network between master and slave.
- Slave_SQL_Running
This parameter indicates whether sql_thread is working normally, specifically whether statements are being executed. Common failures are a duplicate primary key or a table that does not exist.
- Seconds_Behind_Master
The value is the difference between the timestamp of the event being executed by sql_thread and the timestamp (TS for short) of the event copied by io_thread.
NULL - either io_thread or sql_thread has failed, that is, the thread's Running status is No rather than Yes.
0 - the value we hope to see. It means replication is healthy and lag can be considered nonexistent.
Positive value - the slave is lagging behind the master. The larger the number, the further behind it is.
Negative value - very rare. I have only heard experienced DBAs say they have seen it. It is in fact a bug value: the parameter does not support negative values, so they should never appear.
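A monitoring script could apply the rules above to the text output of show slave status. The sketch below is one possible reading of those rules; `parse_slave_status` and `replication_health` are hypothetical helpers, and the returned messages are made up for this example.

```python
def parse_slave_status(text):
    """Turn the 'Name: value' lines of vertical SHOW SLAVE STATUS output into a dict."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        # Skip the row separator and any line without a colon.
        if sep and not name.startswith("*"):
            status[name.strip()] = value.strip()
    return status


def replication_health(status):
    """Classify replication state from the three monitored fields."""
    if status.get("Slave_IO_Running") != "Yes":
        return "io_thread not running"
    if status.get("Slave_SQL_Running") != "Yes":
        return "sql_thread not running"
    lag = status.get("Seconds_Behind_Master")
    if lag in (None, "", "NULL"):
        return "lag unknown"
    if int(lag) < 0:
        return "invalid negative lag"
    if int(lag) > 0:
        return f"lagging by {lag}s"
    return "ok"


healthy = parse_slave_status("""
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
""")
broken = parse_slave_status("""
Slave_IO_Running: Yes
Slave_SQL_Running: No
Seconds_Behind_Master: NULL
""")
print(replication_health(healthy))  # ok
print(replication_health(broken))   # sql_thread not running
```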
The mk-heartbeat tool from Maatkit
It can be understood as a wrapper around, and an enhancement of, the status parameters
It relies on synchronized clocks and creates the same table on the master and the slave. The table stores the server ID of the slave and the timestamp at which the row was inserted. The tool regularly compares the timestamps of the same row on the master and the slave. If the difference exceeds half a second, the master and slave are considered inconsistent.
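The heartbeat idea can be sketched with two in-memory SQLite databases standing in for the master and the slave. This is not how mk-heartbeat itself is implemented: the table layout, the function names, and the manual `replicate` step are all assumptions made so that the example runs on its own.

```python
import sqlite3

THRESHOLD = 0.5  # seconds; the half-second limit mentioned above


def make_server():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE heartbeat (server_id INTEGER PRIMARY KEY, ts REAL)")
    return db


def beat(master, server_id, now):
    """Write the current timestamp into the heartbeat row on the master."""
    master.execute(
        "REPLACE INTO heartbeat (server_id, ts) VALUES (?, ?)", (server_id, now)
    )


def replicate(master, slave):
    """Stand-in for replication: copy the heartbeat rows to the slave."""
    rows = master.execute("SELECT server_id, ts FROM heartbeat").fetchall()
    slave.executemany("REPLACE INTO heartbeat (server_id, ts) VALUES (?, ?)", rows)


def lag_seconds(slave, server_id, now):
    """Age of the heartbeat row as seen on the slave, or None if it is missing."""
    row = slave.execute(
        "SELECT ts FROM heartbeat WHERE server_id = ?", (server_id,)
    ).fetchone()
    return None if row is None else now - row[0]


master, slave = make_server(), make_server()
beat(master, 1, now=100.0)
replicate(master, slave)
beat(master, 1, now=100.9)  # this heartbeat has not reached the slave yet

lag_before = lag_seconds(slave, 1, now=100.9)
print(lag_before > THRESHOLD)  # True: the slave counts as inconsistent

replicate(master, slave)
lag_after = lag_seconds(slave, 1, now=100.9)
print(lag_after > THRESHOLD)   # False: the slave has caught up
```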