MHA of MySQL high availability architecture (Theory + deployment + fault simulation)

Posted by ssidellq on Fri, 18 Feb 2022 14:10:19 +0100

1, MHA overview

  • MHA (Master HA) is an open source MySQL high availability program. It provides the function of automatic master failover for MySQL master-slave replication architecture.
  • When MHA monitors the failure of the master node, it will promote the slave node with the latest data to become a new master node. During this period, MHA will avoid consistency problems by obtaining additional information from other slave nodes.
  • MHA also provides the online switching function of the master node, that is, switching the master/slave node on demand.
  • MHA can achieve failover within 30 seconds and ensure data consistency as much as possible during failover.

2, MHA service composition

2.1 MHA Manager (management node)

  • It is usually deployed separately on an independent machine to manage multiple master/slave clusters (groups). Each master/slave cluster is called an application to manage and coordinate the whole cluster.

2.2 MHA node

  • Run on each MySQL server (master/slave/manager), and speed up failover by monitoring scripts with the functions of parsing and cleaning logs.
  • It is mainly the agent that receives the instructions sent by the management node. The agent needs to run on each mysql node.
  • In short, node is used to collect the bin logs generated from the node server.
  • Compare whether the slave node intended to be promoted to the new master node owns and completes the operation. If it is not sent to the new master node, it will be promoted to the master node after local application.

3, Working principle and characteristics of MHA

  • ① Save binary log events from the master of downtime and crash
  • ② Identify the slave with the latest update
  • ③ Apply different relay logs to other slave
  • ④ Apply binary log events saved from the master
  • ⑤ Upgrade a slave to a new master
  • ⑥ Use another slave to connect to the new master for replication
  • MHA features
    • In the process of automatic failover, MHA tries to save binary logs from the down primary server, so as to ensure no data loss to the greatest extent
    • Using semi synchronous replication can greatly reduce the risk of data loss
    • At present, MHA supports a master-slave architecture with at least three servers, i.e. one master and two slaves

4, MHA construction and fault simulation

4.1 construction environment

host nameIP addressroleAll installation packages (MHA)
manager192.168.131.14manager management nodemanager component, MHA node component
master192.168.131.9mysql master databaseMHA node assembly
slave1192.168.131.10mysql from databaseMHA node assembly
slave2192.168.131.11mysql from databaseMHA node assembly

4.2 deployment ideas

  • Install MySQL database
  • Configure MySQL with one master and two slaves
  • Install MHA software
  • Configure MySQL MHA high availability
  • Simulate master failover

4.3 install MySQL database and synchronize with master-slave

  • Close the firewall and modify the host name (master, salve1, salve2, mha)
[root@localhost ~]# systemctl stop firewalld.service 
[root@localhost ~]# systemctl disable firewalld.service 
[root@localhost ~]# setenforce 0
[root@localhost ~]# hostnamectl set-hostname master
[root@localhost ~]# su
[root@master ~]# 
  • Compilation dependent environment and amake compilation tool installation (three MySQL servers)
[Drag the installation package into the three machines to be installed MySQL Server (one master and two slaves)]
[root@master opt]# ls
cmake-2.8.6.tar.gz  mysql-5.6.36.tar.gz  rh
[root@slave1 opt]# ls
cmake-2.8.6.tar.gz  mysql-5.6.36.tar.gz  rh
[root@slave2 opt]# ls
cmake-2.8.6.tar.gz  mysql-5.6.36.tar.gz  rh

[root@localhost ~]# yum -y install ncurses-devel.x86_64 gcc-c++.x86_64 perl-Module-Install
[root@localhost ~]# cd /opt/
[root@localhost opt]# ls
cmake-2.8.6.tar.gz  mysql-5.6.36.tar.gz  rh
[root@localhost opt]# tar zxvf cmake-2.8.6.tar.gz 
[root@localhost opt]# cd cmake-2.8.6/
[root@localhost cmake-2.8.6]# ./configure 
[root@localhost cmake-2.8.6]# gmake && gmake install
  • Install MySQL 5 six
[root@localhost cmake-2.8.6]# cd /opt/
[root@localhost opt]# tar zxvf mysql-5.6.36.tar.gz 
[root@localhost opt]# cd mysql-5.6.36/
[root@localhost mysql-5.6.36]# cmake \
> -DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
> -DDEFAULT_COLLATION=utf8_general_ci \
[root@localhost mysql-5.6.36]# make -j6 && make install
  • Copy startup scripts and configuration files
[/etc/init.d/: Binary file storage directory of various servers and programs]
[/etc/rc.d/: Executor connection directory at each startup level. Many files in it point to init.d/Some soft links]
[root@localhost mysql-5.6.36]# cp support-files/my-default.cnf /etc/my.cnf
cp: Overwrite"/etc/my.cnf"? y
[root@localhost mysql-5.6.36]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
[root@localhost mysql-5.6.36]# chmod +755 /etc/rc.d/init.d/mysqld 
[root@localhost mysql-5.6.36]# chkconfig --add /etc/init.d/mysqld 
[root@localhost mysql-5.6.36]# chkconfig mysqld --level 35 on
[root@localhost mysql-5.6.36]# echo 'PATH=$PATH:/usr/local/mysql/bin'>> /etc/profile
[root@localhost mysql-5.6.36]# source /etc/profile
[root@localhost mysql-5.6.36]# useradd -s /sbin/nologin mysql
[root@localhost mysql-5.6.36]# chown -R mysql.mysql /usr/local/mysql/
  • MySQL_install_db initialization (both directory file basedir and data file datadir must be present)
[root@localhost mysql-5.6.36]# /usr/local/mysql/scripts/mysql_install_db \
> --basedir=/usr/local/mysql \
> --datadir=/usr/local/mysql/data/ \
> --user=mysql

[Install if an error is reported autoconf [library]
[root@localhost mysql-5.6.36]# /usr/local/mysql/scripts/mysql_install_db --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --user=mysql
FATAL ERROR: please install the following Perl modules before executing /usr/local/mysql/scripts/mysql_install_db:
[root@localhost mysql-5.6.36]# yum -y install autoconf
  • Modify master configuration file
[root@localhost mysql-5.6.36]# vim /etc/my.cnf
 30 server-id=10
 31 log_bin=master-bin
 32 log-slave-updates=true
  • Configure slave server
[root@localhost mysql-5.6.36]# vim /etc/my.cnf
 30 server-id=20
 31 log_bin=master-bin
 32 relay-log=relay-log-bin
 33 relay-log-index=slave-realy-bin.index

[From server 2]
[root@localhost mysql-5.6.36]# vim /etc/my.cnf
 30 server-id=30
 31 log_bin=master-bin
 32 relay-log=relay-log-bin
 33 relay-log-index=slave-realy-bin.index
  • Make two soft links on the master and two slave
[root@localhost mysql-5.6.36]# ln -s /usr/local/mysql/bin/mysql /usr/sbin/
[root@localhost mysql-5.6.36]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
[The first is mysql [command file]
[The second is mysqlbinlog [available for breakpoint recovery]
  • Turn on the service and view the port
[root@localhost mysql-5.6.36]# systemctl start mysqld
[root@localhost mysql-5.6.36]# netstat -natp | grep 3306
tcp6       0      0 :::3306                 :::*                    LISTEN      40198/mysqld  
  • Configure MySQL with one master and two slaves
[root@localhost mysql-5.6.36]# mysqladmin -uroot -p password '5514' [set password]
[root@localhost mysql-5.6.36]# mysql -uroot -p5514

[Two users are authorized on all database nodes, one is used synchronously from the database, and the other is manager Use]
mysql> grant replication slave on *.* to 'myslave'@'192.168.131.%' identified by '111222';  [Synchronize from database]
mysql> grant all privileges on *.* to 'mha'@'192.168.131.%' identified by 'manager';        [manager Use]

[The following three authorizations are for experiments MHA inspect MySQL Add only when the master-slave reports an error to prevent the slave library from being connected to the master library through the host name]
mysql> grant all privileges on *.* to 'mha'@'master' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
mysql> flush privileges;

[stay Master [View binaries and synchronization points on]
mysql> show master status;
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
| master-bin.000002 |     1720 |              |                  |                   |

[In two slave [synchronize with node server]
mysql> change master to master_host='',master_user='myslave',master_password='111222',master_log_file='master-bin.000002',master_log_pos=1720;
mysql> start slave;
mysql> show slave status\G;
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

[Set two slave libraries to read-only mode (required)]
mysql> set global read_only=1;
mysql> flush privileges;

4.4 configure MHA

  • Install MHA dependent environment on all servers (epel source needs to be installed first)
[install epel Source, and no gph Check]
[root@localhost ~]# yum -y install epel-release --nogpgcheck

[root@localhost /]# Yum install - y perl DBD mysql \ [perl for mysql database]
> perl-Config-Tiny \                                  [Fetch the value from the configuration file]
> perl-Log-Dispatch \                                 [log-[log]
> perl-Parallel-ForkManager \                         [Multithreading management]
> perl-ExtUtils-CBuilder \                            [Extension tools CBuilder MakeMaker]
> perl-ExtUtils-MakeMaker \
> perl-CPAN                                           [cpan perl Database in]
  • Install all node components
  • You must first install the node component on all servers, and finally install the manager component on the MHA manager node (because manager depends on the node component)
[root@localhost opt]# ls
cmake-2.8.6  cmake-2.8.6.tar.gz  mha4mysql-node-0.57.tar.gz  mysql-5.6.36  mysql-5.6.36.tar.gz  rh
[root@localhost opt]# tar zxvf mha4mysql-node-0.57.tar.gz
[root@localhost opt]# cd mha4mysql-node-0.57
[root@localhost mha4mysql-node-0.57]# perl Makefile.PL
[root@localhost mha4mysql-node-0.57]# make && make install
  • Install the manager component on the MHA node
  • node must be installed before installing the manager component
[root@localhost opt]# ls
mha4mysql-manager-0.57.tar.gz  mha4mysql-node-0.57  mha4mysql-node-0.57.tar.gz  rh
[root@localhost opt]# tar zxvf mha4mysql-manager-0.57.tar.gz
[root@localhost opt]# cd mha4mysql-manager-0.57
[root@localhost mha4mysql-manager-0.57]# perl Makefile.PL
[root@localhost mha4mysql-manager-0.57]# make && make install

4.5 common scripts and tools generated after the installation of manager and node

[root@localhost mha4mysql-manager-0.57]# ls /usr/local/bin/
apply_diff_relay_logs  json_pp              masterha_check_status  masterha_master_monitor   masterha_stop            purge_relay_logs
config_data            masterha_check_repl  masterha_conf_host     masterha_master_switch    package-stash-conflicts  save_binary_logs
filter_mysqlbinlog     masterha_check_ssh   masterha_manager       masterha_secondary_check  prove
  • After manager is installed, several tools will be generated in the / usr/local/bin directory, mainly including
masterha_check_sshCheck the SSH configuration of MHA
masterha_check_replCheck MySQL replication status
masterha_managerStart the manager script
masterha_check_statusCheck the current MHA operation status
masterha_master_monitorCheck whether the master is down
masterha_master_switchControl failover (manual or automatic)
masterha_conf_hostAdd or delete configured server information
masterha_stopClose manager
  • After node installation, several scripts will be generated in / usr/local/bin directory, which need to be checked, mainly including
save_binary_logsSave and copy binary logs of master
apply_diff_relay_logsIdentify the different relay log events and apply the different events to other slave s
filter_mysqlbinlogRemove unnecessary rollback events (MHA is no longer applicable to this tool)
purge_relay_logsClear relay log (does not block SQL thread)

4.6 configure password free authentication

  • Configure password free authentication to all database nodes on manager
[Press enter to confirm the next step]
[root@localhost mha4mysql-manager-0.57]# ssh-keygen -t rsa
[root@localhost mha4mysql-manager-0.57]# ssh-copy-id
[Press finish yes Input after ssh Login password [5514]
[root@localhost mha4mysql-manager-0.57]# ssh-copy-id
[root@localhost mha4mysql-manager-0.57]# ssh-copy-id
  • Configure password free authentication to slave1 and salve2 on the master
[root@localhost mha4mysql-node-0.57]# ssh-keygen -t rsa
[root@localhost mha4mysql-node-0.57]# ssh-copy-id
[root@localhost mha4mysql-node-0.57]# ssh-copy-id
  • Configure password free authentication to master and salve2 on slave1
[root@localhost mha4mysql-node-0.57]# ssh-keygen -t rsa
[root@localhost mha4mysql-node-0.57]# ssh-copy-id
[root@localhost mha4mysql-node-0.57]# ssh-copy-id
  • Configure password free authentication to master and salve1 on slave2
[root@localhost mha4mysql-node-0.57]# ssh-keygen -t rsa
[root@localhost mha4mysql-node-0.57]# ssh-copy-id
[root@localhost mha4mysql-node-0.57]# ssh-copy-id

4.7 modify MHA related configuration

  • Copy the relevant scripts on the mha node to the / usr/local/bin directory
[root@localhost mha4mysql-manager-0.57]# cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
[After copying, there will be four execution files]
[root@localhost mha4mysql-manager-0.57]# ll /usr/local/bin/scripts/
Total consumption 32
-rwxr-xr-x. 1 1001 1001  3648 5 March 31, 2015 master_ip_failover          [During automatic switching VIP [managed Script]
-rwxr-xr-x. 1 1001 1001  9870 5 March 31, 2015 master_ip_online_change     [When switching Online VIP Management of]
-rwxr-xr-x. 1 1001 1001 11867 5 March 31, 2015 power_manager               [Script to shut down the host after failure]
-rwxr-xr-x. 1 1001 1001  1360 5 March 31, 2015 send_report                 [[script to send alarm after failover]

[Zaijiang master_ip_failover Copy to/usr/local/bin/[table of contents]
[root@localhost mha4mysql-manager-0.57]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
  • Modify master_ip_failover content (delete the original content and add it directly)
[root@localhost mha4mysql-manager-0.57]# vim /usr/local/bin/master_ip_failover 
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
my $vip = '';                              [float IP]
my $brdc = '';                             [[broadcast address]
my $ifdev = 'ens33';                                      [The network card used is ens33]
my $key = '1';                                            [[international serial number]
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";     [use ifoconfig Command to start it and set the floating address]
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";      [have access to ifconfig Command it down [off]        
my $exit_code = 0;                                        [Normal exit (return status code)]
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
$exit_code = 0;
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
exit $exit_code;
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
$exit_code = 0;
if ($@) {
warn $@;
exit $exit_code;
exit $exit_code;
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
exit 1;
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
## A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

sub usage {
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  • Create the MHA software directory and copy the configuration file (here, use the app1.cnf configuration file to manage the mysql node server)
[root@localhost mha4mysql-manager-0.57]# mkdir /etc/masterha
[root@localhost mha4mysql-manager-0.57]# cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/
  • Configure node related information (add node related information after deleting the original configuration directly)
[root@localhost mha4mysql-manager-0.57]# vim /etc/masterha/app1.cnf 

[server default]
manager_log=/var/log/masterha/app1/manager.log               [manager [working directory]
manager_workdir=/var/log/masterha/app1                       [manager [work log]
master_binlog_dir=/usr/local/mysql/data                      [master preservation binlog [location]
master_ip_failover_script=/usr/local/bin/master_ip_failover  [Set automatic failover When to switch scripts, that is, the last configured Script]  
master_ip_online_change_script=/usr/local/bin/master_ip_online_change  [Set the switching script for manual switching]
password=manager                                             [The password here is the password of the user who created monitoring before]
user=mha                                                     [[set monitoring user]
ping_interval=1             [Set monitoring master database and send Ping The event interval of the package is 3 seconds by default. It will be carried out automatically when there is no response after 3 attempts failover]
remote_workdir=/tmp                                          [Set remote MySQL When sending handover again binlog [temporary storage location]
repl_password=111222                                         [Set replication( slaves)[user's password]
repl_user=myslave                                            [Set the account number of the copying user]
report_script=/usr/local/send_report                         [Set the alarm script sent after sending switching (this configuration is not carried out here for the purpose of experiment)]
secondary_check_script=/usr/local/bin/masterha_secondary_check -s -s[Set check script from server]
shutdown_script=""                                           [Set the script to close the fault host after the fault occurs]
ssh_user=root                                                [set up ssh [login user name]



[Set candidate master,If this parameter is set, this library will be promoted to master library after master-slave switching]
[By default, if one slave backward master More than 100 M of relay logs If so, MHA The will not be selected slave As a new master]
[Because for this slave The recovery of takes a long time; By setting check_repl_delay=0,MHA Trigger switching when selecting a new master [replication delay will be ignored]
[This parameter is set for candidate_master=1 The host of is very useful because the candidate host must be new in the process of switching master]
  • For the first configuration, you need to manually turn on the virtual IP on the Master node
[root@master mha4mysql-node-0.57]# /sbin/ifconfig ens33:1

4.8 check the master-slave connection of mysql on the node without password authentication and manager

  • Test ssh password free authentication on the manager node. If it is normal, it will output successfully
  • Test the mysql master-slave connection on the manager node

4.9 start MHA on the manager node

  • --remove_dead_master_conf: this parameter indicates that the ip of the old master database will be removed from the configuration file after the master-slave switch.
  • --manger_log: log storage location.
  • --ignore_last_failover: by default, if MHA detects continuous downtime and the interval between two outages is less than 8 hours, it will not fail. The reason for this restriction is to avoid ping pong effect.
    This parameter also means to ignore the files generated by the last MHA triggered switching. By default, after MHA switching, it will be recorded in the log directory, that is, the log app1 set above failover. Complete file. If the file is found in this directory during the next switch, the switch is not allowed to be triggered unless the file is deleted after the first switch. For convenience, it is set to – ignore here_ last_ failover.
[root@localhost mha4mysql-manager-0.57]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 9558

4.10 and check MHA status and log

  • Viewing the MHA status, you can see that the current master is the master node.
[root@localhost mha4mysql-manager-0.57]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:9558) is running(0:PING_OK), master:
  • Check the MHA log to see that the current master is
[root@localhost mha4mysql-manager-0.57]# cat /var/log/masterha/app1/manager.log | grep "current master"
Wed Feb 24 12:01:16 2021 - [info] Checking SSH publickey authentication settings on the current master.. (current master)

4.11 check whether the VIP address of the master exists

  • Check again after closing MHA

4.12 fault simulation

  • Monitor observation log records on the manager node
[root@localhost mha4mysql-manager-0.57]# tail -f /var/log/masterha/app1/manager.log
  • Stop the mysql service on the Master node
[root@localhost mha4mysql-node-0.57]# systemctl stop mysqld.service 

  • Algorithm of alternative master database for failover
    • Generally, the judgment of the slave library is from the (position/GTID). There are differences in the data, which is closest to the slave of the master and becomes the alternative master.
    • When the data is consistent, select the alternative main library according to the order of the configuration file.
    • Set the weight (candidate_master=1), and force to specify the alternative master according to the weight.
      • By default, if a slave falls behind the master's 100m relay logs, even if it has a weight, it will fail.
      • If check_ repl_ If delay = 0, even if it lags behind a lot of logs, it is forced to select it as the alternative master.

4.13 repair simulated faults

  • Repair mysql
[root@localhost mha4mysql-node-0.57]# systemctl start mysqld.service 
  • Repair master-slave
[In the current main library server slave2 [View binaries and synchronization points]
mysql> show master status;
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
| master-bin.000001 |     1515 |              |                  |                   |
1 row in set (0.00 sec)

[In the original master library server master [execute synchronization]
[root@localhost mha4mysql-node-0.57]# mysql -uroot -p5514
mysql> change master to master_host='',master_user='myslave',master_password='111222',master_log_file='master-bin.000001',master_log_pos=1515;
Query OK, 0 rows affected, 2 warnings (0.03 sec)

mysql> start slave;
  • On the manager node, modify the configuration file app1 cnf
  • Then add this record (master), because it will disappear automatically when it detects failure
[root@localhost mha4mysql-manager-0.57]# vim /etc/masterha/app1.cnf
 28 [server1]
 29 hostname=
 30 port=3306
  • Start MHA on the manager node
[root@localhost mha4mysql-manager-0.57]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
  • Solve the problem of incompatibility and error reporting between Chinese and English characters
    • dos2unix /usr/local/bin/master_ip_failover

5, Summary

  • Role of MHA: it provides automatic master failover for MySQL master-slave replication architecture.
  • MHA consists of MHA Manager (management node) and MHA node (data node)
  • MHA features
    • In the process of automatic failover, MHA tries to save binary logs from the down primary server, so as to ensure no data loss to the greatest extent
    • Using semi synchronous replication can greatly reduce the risk of data loss
    • At present, MHA supports a master-slave architecture with at least three servers, i.e. one master and two slaves
  • Semi synchronous replication: as the name suggests, it is a part of replication, which may be different from the data of the main database. No matter how complete the master-slave mechanism is, the slave database sometimes lags behind the master database.

Topics: MySQL cloud computing