Original GTID practice and analysis

Posted by mtucker6784 on Wed, 26 Jan 2022 09:56:15 +0100

Tip: official account display code will be automatically folded, and recommended horizontal screen reading.

"Part I Introduction to GTID"

On mysql5 6. GTID (Global Transaction Identifier) feature is introduced, which can uniquely identify a transaction in the cluster. During MySQL master-slave replication, the slave node can use GTID to determine the replication location, which is used to replace the traditional way of using binlog file offset. When the master-slave switch occurs, the slave node can automatically find the correct replication location on the new master, It greatly simplifies the maintenance of clusters under complex replication topology and reduces the risk of misoperation in artificially setting replication sites. In addition, GTID based replication can skip the executed transactions and reduce the risk of data inconsistency.

"Part II" GTID practice and analysis "

[composition of GTID] GTID is implemented by server_uuid + gno, for example: 3E11FA47-71CA-11E1-9E33-C80AA9429562:23. server_uuid is the 128 bit uuid generated when the server is started, and gno is the sequence number, which is incremented from 1 on each mysql server. It is the unique identification of the transaction on the instance. GTIDs on the same instance are generally continuous, such as 3E11FA47-71CA-11E1-9E33-C80AA9429562:1-100. If this group of GTIDs comes from different instances, each group of instances is separated by commas; If gno has multiple ranges, the ranges of each group are separated by colons, for example: 3E11FA47-71CA-11E1-9E33-C80AA9429562:23, 3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5 For example, when show slave status, you can see the executed on the slave_ gtid_ Set contains two sets of gtids: the self executing gtid on the slave and the transaction gtid from the master node:

[life cycle of GTID]

  1. When executing and submitting a transaction, the master node will assign a GTID to the transaction, which is responsible for judging the smallest unused transaction sequence number on the instance as the gno of the GTID. If a transaction does not need to be written to the binlog (for example, the transaction is filtered out or read-only), it will not be assigned a GTID. GTID is the server of the instance_ UUID and the smallest transaction sequence number not yet used on this instance.
  2. If a transaction is assigned a GTID, the GTID will be used as GTID when the transaction is committed_ log_ Event is written to binlog. The GTID of the written binlog will be persisted to mysql before binlog rotate or mysql instance exits GTID_ In the executed table.
  3. After the transaction is committed, GTID will be added to the system variable @ @ global GTID_ In executed, this variable represents the GTID set of all committed transactions. In the case of opening binlog, mysql GTID_ The GTID set in the executed table cannot represent the GTID set of the completed transaction executed by the instance at the current time, because the GTID will not be recorded synchronously to the table when the transaction is committed. At this time, some transaction gtids have not been persisted from binlog to the gtid_executed table.
  4. In master-slave replication, binlog is transferred to the slave node and stored in the relay log. The slave node reads the GTID of the transaction from the relay log and stores the gtid_next is set to the GTID.
  5. Before executing the transaction, the slave node will check that no other thread holds the ownership of the GTID, and check that the current transaction GTID does not exist in the gtid_executed to avoid repeated execution of the same transaction. Through the system variable @ @ global GTID_ Owned to maintain the GTID and the thread ID holding the GTID ownership. If the ownership of a GTID has been held by other threads, you need to wait for other threads to release before applying for execution, so as to ensure that only one thread is processing the transaction at the same time. If the GTID has been executed on the instance, the transaction will be skipped automatically to avoid repeated execution.
  6. Use GTID from node_ Next assigns a GTID to the transaction instead of regenerating a new GTID, which ensures the uniqueness of the transaction GTID in the cluster.
  7. If the slave node also opens binlog, the gtid will also be written to binlog when the transaction is committed. The written gtid of binlog will be persisted to mysql before binlog rotate or mysql instance exits gtid_ In the executed table.
  8. If binlog is disabled from the slave node, MySQL will add a statement in the transaction to write GTID directly to MySQL GTID_ The executed table is persisted. From mysql8 Starting from 0, this operation is atomic for both DDL statements and DML statements. In this case, mysql GTID_ The executed table can also represent the complete transaction records applied on the node.

[generation of GTID] The GTID is generated in the transaction submission phase and MySQL in the Flush Stage of Group Commit_ BIN_ LOG::process_ flush_ stage_ Assign is called in the queue function_ automatic_ gtids_ to_ flush_ Group generates gtids for each transaction in the Commit Group. In the same Commit Group, the Leader thread is responsible for generating gtids for all transactions in the group. The corresponding call stack is:

MYSQL_BIN_LOG::ordered_commit()             // stage #1 flushing transactions to binary log. |- process_ flush_ stage_ queue                                                    |-assign_ automatic_ gtids_ to_ flush_ Group / / leader traverses the queue | - gtid_ state::generate_ automatic_ gtid                                   |-get_ automatic_ Gno / / generate gno | - acquire_ Ownership / / apply for ownership

GTID_ state::generate_ automatic_ The GTID function is responsible for generating the GTID and applying for the ownership of the GTID for the thread. One of the key functions to generate the GTID is Gtid_state::get_automatic_gno, which is responsible for judging the smallest unused transaction sequence number on the instance as the gno of GTID:

rpl_gno Gtid_state::get_automatic_gno(rpl_sidno sidno) const {  DBUG_TRACE;  Gtid_set::Const_interval_iterator ivit(&executed_gtids, sidno);  Gtid next_candidate = {sidno,                         sidno == get_server_sidno() ? next_free_gno : 1};  while (true) {    const Gtid_set::Interval *iv = ivit.get();    rpl_gno next_interval_start = iv != nullptr ? iv->start : MAX_GNO;    while (next_candidate.gno < next_interval_start &&           DBUG_EVALUATE_IF("simulate_gno_exhausted", false, true)) {      DBUG_PRINT("debug",                 ("Checking availability of gno= %llu", next_candidate.gno));      if (owned_gtids.is_owned_by(next_candidate, 0)) return next_candidate.gno;      next_candidate.gno++;    }    if (iv == nullptr ||        DBUG_EVALUATE_IF("simulate_gno_exhausted", true, false)) {      my_error(ER_GNO_EXHAUSTED, MYF(0));      return -1;    }    if (next_candidate.gno <= iv->end) next_candidate.gno = iv->end;    ivit.next();  }}

After generating GTID, GTID will be called_ state::acquire_ The ownership function applies for GTID ownership:

enum_return_status Gtid_state::generate_automatic_gtid(    THD *thd, rpl_sidno specified_sidno, rpl_gno specified_gno,    rpl_sidno *locked_sidno) {// ...
    if (automatic_gtid.gno == 0) {      automatic_gtid.gno = get_automatic_gno(automatic_gtid.sidno);      if (automatic_gtid.sidno == get_server_sidno() &&          automatic_gtid.gno != -1)        next_free_gno = automatic_gtid.gno + 1;    }
    if (automatic_gtid.gno != -1)      acquire_ownership(thd, automatic_gtid);    else      ret = RETURN_STATUS_REPORTED_ERROR;// ...}

When allocating GTID, it will be allocated monotonically and incrementally from the minimum GTID available on the current instance. Generally, the allocation of GTID on an instance will not generate holes. If holes are generated in GTID due to special circumstances (such as manually setting GTID_next), it will also be allocated from the minimum unused GTID when allocating GTID in AUTOMATIC mode, so as to eliminate holes. Here is a simple demonstration:

# The GTID sequence number of the executed transaction on this instance is 1-119686mysql > show variables like '% GTID%'+----------------------------------+-----------------------------------------------+| Variable_ name                    | Value                                         |+----------------------------------+-----------------------------------------------+| binlog_ GTID_ simple_ recovery      | ON                                            || enforce_ GTID_ consistency         | ON                                            || GTID_ executed                    | d4255688-0718-11ec-9687-506b4b430198:1-119686 || GTID_ executed_ compression_ period | 1000                                          || GTID_ mode                        | ON                                            || GTID_ next                        | AUTOMATIC                                     || GTID_ owned                       |                                               || GTID_ purged                      |                                               || session_ track_ gtids              | OFF                                           |+----------------------------------+-----------------------------------------------+9 rows in set (0.00 sec)
# Set gtid manually_ Next = 119690 (greater than the maximum allocated sequence number 119686, causing holes) MySQL > set gtid_ next='d4255688-0718-11ec-9687-506b4b430198:119690'; mysql> update sbtest9 set k=k+1 where id=1;
# At this time, the GTID of the executed transaction is empty: d425688-0718-11ec-9687-506b4b430198:1-119686:119690mysql > show variables like '% GTID%'+----------------------------------+------------------------------------------------------+| Variable_ name                    | Value                                                |+----------------------------------+------------------------------------------------------+| binlog_ GTID_ simple_ recovery      | ON                                                   || enforce_ GTID_ consistency         | ON                                                   || GTID_ executed                    | d4255688-0718-11ec-9687-506b4b430198:1-119686:119690 || GTID_ executed_ compression_ period | 1000                                                 || GTID_ mode                        | ON                                                   || GTID_ next                        | d4255688-0718-11ec-9687-506b4b430198:119690          || GTID_ owned                       |                                                      || GTID_ purged                      |                                                      || session_ track_ gtids              | OFF                                                  |+----------------------------------+------------------------------------------------------+
# Restore gtid_ next='automatic'mysql> set gtid_ next='automatic';
# Execute an update statement MySQL > update sbtest9 set K = K + 1 where id = 1;
# You can see that the gtid is allocated from the smallest unused gtid, 119687 instead of 119691mysql > show variables like '% gtid%'+----------------------------------+------------------------------------------------------+| Variable_ name                    | Value                                                |+----------------------------------+------------------------------------------------------+| binlog_ gtid_ simple_ recovery      | ON                                                   || enforce_ gtid_ consistency         | ON                                                   || gtid_ executed                    | d4255688-0718-11ec-9687-506b4b430198:1-119687:119690 || gtid_ executed_ compression_ period | 1000                                                 || gtid_ mode                        | ON                                                   || gtid_ next                        | AUTOMATIC                                            || gtid_ owned                       |                                                      || gtid_ purged                      |                                                      || session_ track_ gtids              | OFF                                                  |+----------------------------------+------------------------------------------------------+

It can be found that in automatic mode, GTID allocation starts from the minimum available sequence number, so the size of GTID cannot represent the actual execution order of transactions.

[maintenance of GTID] MySQL through Gtid_state object to maintain the normal operation of GTID system as a whole. There are some common data structures and variables in the implementation:

  • sid: server uuid, 128 bit
  • gno: the serial number incremented from 1 in each server
  • gtid: (sidno, gno), which is globally unique in the cluster
  • sidno: and server_uuid one-to-one correspondence, int32, used in sid_map as an array index, starting from 1
  • sid_map: bidirectional map < sidno, Sid >, maintaining all servers_ Mapping from UUID to sidno.
  • gtid_set: array (sidno = > link_list (interval)), maintaining the data structure of all gtids
  • Interval: (start_gno, end_gno), a node of the single linked list, which represents an interval of gno. Adding the corresponding sidno can form a gtid interval.

In GTID_ There are also several key objects in state to maintain the life cycle of GTID:

  • owned_gtids is used to maintain the correspondence between the GTID on the current instance and the thread holding the GTID ownership. It is used to ensure that only one thread is processing a transaction at the same time. After allocating GTID for transaction generation in the flush stage of group commit, the ownership of GTID will be applied. At this time, the information of GTID and application thread will be added to owned_ In gtids; After the commit stage transaction is successfully committed, the ownership will be released, and the relevant information will be transferred from owned_ Remove from gtids.
  • executed_gtids is used to maintain the GTID collection of submitted transactions on the instance. Users can use @ @ global GTID_ Owned system variable to view. The transaction is in the commit stage of group commit. After the transaction is successfully committed, it will be synchronously added to the executed_gtids, so the set of gtids it stores is updated in real time.
  • mysql. gtid_ The executed table is used to persist gtids in memory_ Executed value. Although gtid in binlog_ Event can also be used as gtid_ Executed persistence tool, but we can't keep all the binlogs of MySQL all the time, and binlog may not be opened on the slave database. Therefore, we need to persist the information to MySQL in time gtid_ Executed table. It should be noted here that when the master node opens binlog, the gtid will not be written to GITD in real time when the transaction is committed_ Executed table, so the gtid in memory_ executed_ The values of set are and gtid_ The values in the executed table are not necessarily consistent. In the slave database, if binlog is not enabled, the gtid of the committed transaction cannot be persisted to the binlog file (maintenance in the gtid_executed_set in memory is not persistent). Therefore, each time the transaction is committed, the gtid of the executed transaction must be written to the gtid_ In the executed table.
  • lost_gtids maintains the collection of gtids deleted from binlog. Its corresponding MySQL system variable is gtid_purged . Because binlog files need to be cleaned regularly to avoid occupying a lot of disk space, when binlog is cleaned, the GTID in the deleted binlog file will be recorded in GTID_ In purged, it is GTID_ A subset of executed.

group_ Related call stacks in commit:

MYSQL_BIN_LOG::ordered_commit()            // stage #1 flushing transactions to binary log. |- process_ flush_ stage_ queue                           |-fetch_ and_ process_ flush_ stage_ queue       |-ha_ flush_ logs                                                             |-assign_ automatic_ gtids_ to_ flush_ Group / / leader traverses the queue | - gtid_ state::generate_ automatic_ gtid                         |-get_ automatic_ Gno / / generate gno | - acquire_ Ownership / / apply for ownership and join owned_gtids    |-flush_thread_caches                               |-flush_cache_to_file                                                            // stage #2 Syncing binary log file to disk. // ......    // stage #3 Commit all transactions in order. |-change_stage / / this stage is affected by binlog_ order_ Commit parameter limit | - process_ commit_ stage_ queue                          |-ha_ commit_ low                                        |-gtid_ state::update_ commit_ Group / / leader traverses the queue | - gtid_ state::update_ gtids_ impl_ own_ Gtid / / from owned_ Delete in gtids and add executed_gtids |-process_ after_ commit_ stage_ queue                 |- stage_ manager. signal_ done                       | |-finish_ commit                                            // binlog_ order_ When commit = 0, each thread submits | - Gtid_state::update_on_commit/update_on_rollback       |-Gtid_state::update_gtids_impl           |-Gtid_state::update_gtids_impl_own_gtid / / from owned_ Delete in gtids and add executed_gtids

"Summary of Part III"

This paper briefly introduces the implementation of GTID in MySQL from four aspects: the composition, life cycle, generation and maintenance of GTID. In master-slave replication, the emergence of GTID greatly reduces the complexity of maintenance_ Maintaining different GTID collection objects in state depends on the global lock_ sid_ Lock and Sid for each sidno_ Locks, in the high concurrency scenario, there may be a lock conflict bottleneck, and there is some room for optimization. This txsql kernel is being optimized. Please look forward to it.

"Part IV reference documents"

https://dev.mysql.com/doc/refman/5.6/en/replication-gtids-concepts.html https://dev.mysql.com/doc/refman/8.0/en/replication-gtids.html

Tencent database technology team internally supports QQ space, wechat red envelope, Tencent advertising, Tencent music, Tencent News and other companies' self-developed businesses. Externally, relying on the base of CBS+CFS on Tencent cloud, TencentDB related products are supported, such as tdsql-c (original CynosDB), TencentDB for MySQL(CDB), CTSDB, MongoDB, CES, etc. Tencent database technology team focuses on continuously optimizing the database kernel and architecture capabilities, improving database performance and stability, and providing "worry free and reassuring" database services for Tencent self research business and Tencent cloud customers. This official account is intended to promote and share expertise in the database with the vast number of database technology enthusiasts.