[Huawei Cloud Technology Sharing] Database Development: MySQL Seconds_Behind_Brief Master Analysis

Posted by heldenbrau on Fri, 12 Jun 2020 06:20:43 +0200

[Summary] For mysql primary and standby instances, seconds_behind_master is an important parameter to measure the delay between master and slave.Seconds_can be obtained by executing "show slave status;" on slaveBehind_The value of master.

Seconds_Behind_Master

For mysql primary and standby instances, seconds_behind_master is an important parameter to measure the delay between master and slave.Seconds_can be obtained by executing "show slave status;" on slaveBehind_The value of master.

Original implementation

Definition: The number of seconds that the slave SQL thread is behind processing the master binary log.

Type: time_t(long)

The calculation is as follows:

rpl_slave.cc::show_slave_status_send_data()
if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
       (!strcmp(mi->get_master_log_name(),
                mi->rli->get_group_master_log_name()))) {
     if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
       protocol->store(0LL);
     else
       protocol->store_null();
   } else {
     long time_diff = ((long)(time(0) - mi->rli->last_master_timestamp) -
                       mi->clock_diff_with_master);
     protocol->store(
         (longlong)(mi->rli->last_master_timestamp ? max(0L, time_diff) : 0));
   }

There are two main cases:

* The SQL thread waits for the IO thread to get the host binlog, and seconds_behind_master is 0, indicating no delay between standby and host;

* The SQL thread processes the relay log, where seconds_behind_master passes (long)(time(0) - mi->rli->last_Master_Timestamp) - mi->clock_Diff_With_Master calculated;

last_master_timestamp

Definition:

* Time of events in the main library binlog.

• type: time_t (long)

Calculation method:

last_master_timestamp has different ways of calculating depending on whether the standby machine copies in parallel or not.

Non-parallel replication:

rpl_slave.cc:exec_relay_log_event()
if ((!rli->is_parallel_exec() || rli->last_master_timestamp == 0) &&
    !(ev->is_artificial_event() || ev->is_relay_log_event() ||
     (ev->common_header->when.tv_sec == 0) ||
     ev->get_type_code() == binary_log::FORMAT_DESCRIPTION_EVENT ||
     ev->server_id == 0))
{
 rli->last_master_timestamp= ev->common_header->when.tv_sec +
                             (time_t) ev->exec_time;
 DBUG_ASSERT(rli->last_master_timestamp >= 0);
}

In this mode, last_master_timestamp is expressed as the end time of each event, whereWhen.tv_SEC represents the start time of the event, exec_time represents the execution time of the transaction.The value is calculated at apply_Before the event, so when the event has not been executed, last_master_timestamp has been updated.Because exec_time only in Query_Log_Evet exists, so last_master_timestamp changes at different event phases when a transaction is applied.For example, a transaction with two insert statements prints out the type, timestamp, and execution time of the event when the code snippet is called

create table t1(a int PRIMARY KEY AUTO_INCREMENT ,b longblob) engine=innodb;
begin;
insert into t1(b) select repeat('a',104857600);
insert into t1(b) select repeat('a',104857600);
commit;
10T06:41:32.628554Z 11 [Note] [MY-000000] [Repl] event_type: 33 GTID_LOG_EVENT
2020-02-10T06:41:32.628601Z 11 [Note] [MY-000000] [Repl] event_time: 1581316890
2020-02-10T06:41:32.628614Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
2020-02-10T06:41:32.628692Z 11 [Note] [MY-000000] [Repl] event_type: 2   QUERY_EVENT
2020-02-10T06:41:32.628704Z 11 [Note] [MY-000000] [Repl] event_time: 1581316823
2020-02-10T06:41:32.628713Z 11 [Note] [MY-000000] [Repl] event_exec_time: 35
2020-02-10T06:41:32.629037Z 11 [Note] [MY-000000] [Repl] event_type: 19   TABLE_MAP_EVENT
2020-02-10T06:41:32.629057Z 11 [Note] [MY-000000] [Repl] event_time: 1581316823
2020-02-10T06:41:32.629063Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
2020-02-10T06:41:33.644111Z 11 [Note] [MY-000000] [Repl] event_type: 30    WRITE_ROWS_EVENT
2020-02-10T06:41:33.644149Z 11 [Note] [MY-000000] [Repl] event_time: 1581316823
2020-02-10T06:41:33.644156Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
2020-02-10T06:41:43.520272Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 9185ms to flush 3 and evict 0 pages
2020-02-10T06:42:05.982458Z 11 [Note] [MY-000000] [Repl] event_type: 19   TABLE_MAP_EVENT
2020-02-10T06:42:05.982488Z 11 [Note] [MY-000000] [Repl] event_time: 1581316858
2020-02-10T06:42:05.982495Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
2020-02-10T06:42:06.569345Z 11 [Note] [MY-000000] [Repl] event_type: 30    WRITE_ROWS_EVENT
2020-02-10T06:42:06.569376Z 11 [Note] [MY-000000] [Repl] event_time: 1581316858
2020-02-10T06:42:06.569384Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
2020-02-10T06:42:16.506176Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 9352ms to flush 8 and evict 0 pages
2020-02-10T06:42:37.202507Z 11 [Note] [MY-000000] [Repl] event_type: 16    XID_EVENT
2020-02-10T06:42:37.202539Z 11 [Note] [MY-000000] [Repl] event_time: 1581316890
2020-02-10T06:42:37.202546Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0

Parallel replication:

rpl_slave.cc   mts_checkpoint_routine
ts = rli->gaq->empty()
          ? 0
          : reinterpret_cast<Slave_job_group *>(rli->gaq->head_queue())->ts;
 rli->reset_notified_checkpoint(cnt, ts, true);
 /* end-of "Coordinator::"commit_positions" */

There is a distribution queue gaq on the standby machine in this mode, if gaq is empty, set last_commit_timestamp is 0; if gaq is not empty, a checkpoint point lwm is maintained at this time, all transactions before lwm are executed on the standby machine, and last_commit_timestamp is updated to the time after the transaction in which lwm resides has completed execution.The time type is time_t type.

ptr_group->ts = common_header->when.tv_sec +
                   (time_t)exec_time;  // Seconds_behind_master related
rli->rli_checkpoint_seqno++;
if (update_timestamp) {
 mysql_mutex_lock(&data_lock);
 last_master_timestamp = new_ts;
 mysql_mutex_unlock(&data_lock);
}

Under parallel replication, the last_will not be updated until event execution is completeMaster_Timestamp, so seconds_under non-parallel and parallel replicationBehind_Masters differ.

clock_diff_with_master

Definition:

• The difference in seconds between the clock of the master and the clock of the slave (second - first). It must be signed as it may be <0 or >0. clock_diff_with_master is computed when the I/O thread starts; for this the I/O thread does a SELECT UNIX_TIMESTAMP() on the master.

• type: long

rpl_slave.cc::get_master_version_and_clock()
if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
     (master_res= mysql_store_result(mysql)) &&
     (master_row= mysql_fetch_row(master_res)))
 {
   mysql_mutex_lock(&mi->data_lock);
   mi->clock_diff_with_master=
     (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
   DBUG_EXECUTE_IF("dbug.mts.force_clock_diff_eq_0",
     mi->clock_diff_with_master= 0;);
   mysql_mutex_unlock(&mi->data_lock);
 }

The difference is calculated only once and handled when master contacts slave.

Other

exec_time

Definition:

• the difference from the statement's original start timestamp and the time at which it completed executing.

• type: unsigned long

struct timeval end_time;
ulonglong micro_end_time = my_micro_time();
my_micro_time_to_timeval(micro_end_time, &end_time);
exec_time = end_time.tv_sec - thd_arg->query_start_in_secs();

Time Function

(1) time_t time(time_t timer) time_t is of long type and returns a value that is only accurate to seconds;

(2) int gettimeofday (struct timeval *tv, struct timezone *tz) can obtain the current time in microseconds;

(3) timeval structure

#include <time.h>
stuct timeval {
   time_t tv_sec; /*seconds*/
   suseconds_t tv_usec; /*microseconds*/
}

summary

Use seconds_behind_master measures master backup latency only to the second level, and in some scenarios, seconds_behind_master does not accurately reflect the delay between masters and backups.When a primary and standby exception occurs, you can combine seconds_behind_master source code for specific analysis.

Click Attention to learn about Huawei Cloud Fresh Technology for the first time~

Topics: MySQL SQL