MySQL MHA for high availability architecture

Posted by shiranwas on Wed, 08 Dec 2021 04:21:06 +0100

catalogue

1, Introduction

2, MHA services

working principle

Actual operation

Initialize environment

Modify the hostname of Master, Slave1 and Slave2 nodes

Master slave synchronization

Install MHA

1, Introduction

MHA (Master HA) is an open source MySQL high availability program. It provides the automatic master failover function for MySQL master-slave replication architecture. When MHA monitors a master node failure, it will promote the slave node with the latest data to become a new master node. During this period, MHA will avoid consistency problems by obtaining additional information from other slave nodes. MHA also provides the online switching function of the master node, that is, switching the master/slave node on demand.
MHA is a relatively mature MySQL high availability solution developed by Japanese yoshinorim (formerly worked in DeNA and now on FaceBook). MHA can achieve failover within 30 seconds and ensure data consistency as much as possible during failover. At present, Taobao is also developing a similar product TMHA, which supports one master and one slave.

2, MHA services

      The MHA service has two roles, MHA Manager (management node) and MHA node (data node):
MHA Manager:
Generally, multiple master/slave clusters (groups) are deployed separately on an independent machine to manage. Each master/slave cluster is called one   application is used to manage and coordinate the whole cluster.
MHA node:
Running on each MySQL server (master/slave/manager), it speeds up failover by monitoring scripts with the functions of parsing and cleaning logs.
It is mainly the agent that receives the instructions sent by the management node. The agent needs to run on each mysql node. In short, node is used to collect bin logs generated from the node server. Compare whether the slave node intended to be promoted to the new master node owns and completes the operation. If it is not sent to the new master node, it will be promoted to the master node after local application.

working principle

The working principle of MHA is summarized as follows:
(1) Save binary log events from the master of downtime and crash;
(2) Identify the slave with the latest update;
(3) Apply different relay logs to other slave;
(4) Apply binary log events saved from the master;
(5) Upgrade a slave to a new master;
(6) Use another slave to connect to the new master for replication.

Actual operation

Experimental preparation

MHA manager Node server:

CentOS7.4(64 position) manager/192.168.241.136 ,install MHA node and manager assembly

Master Node server:

CentOS7.4(64 position) mysql1/192.168.241.129,install mysql5.7,MHA node assembly

Slave1 Node server:

CentOS7.4(64 position) mysql2/192.168.241.133 ,install mysql5.7,MHA node assembly

Slave2 Node server:

CentOS7.4(64 position) mysql3/192.168.241.134 ,install mysql5.7,MHA node assembly

Initialize environment

systemctl stop firewalld
systemctl disable firewalld
setenforce 0

Modify the hostname of Master, Slave1 and Slave2 nodes

hostnamectl set-hostname mysql1
hostnamectl set-hostname mysql2
hostnamectl set-hostname mysql3
​
vim /etc/hosts
192.168.241.129 Mysql1
192.168.241.133 Mysql2
192.168.241.134 Mysql3
​

 

Master slave synchronization

modify Master,Slave1,Slave2 Nodal Mysql Master profile/etc/my.cnf 
##Master node##
vim /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin
log-slave-updates = true
​
systemctl restart mysqld
​
​
##Slave1 node##
vim /etc/my.cnf
server-id = 2                       #The server IDs of the three servers cannot be the same
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
​
systemctl restart mysqld
​
​
###Slave2 node##
vim /etc/my.cnf
server-id = 3 
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
​
systemctl restart mysqld

ocal/mysql/bin/
[root@mysql1 ~]#ln -s /usr/local/mysql/bin/{mysql,mysqlbinlog} /usr/sbin/
[root@mysql1 ~]#ls /usr/s

#####Login database##########
grant replication slave on *.* to 'myslave'@'192.168.241.%' identified by 'Wx12345@';
#Authorized master-slave user
grant all privileges on *.* to 'mha'@'192.168.241.%' identified by 'manager';
grant all privileges on *.* to 'mha'@'Mysql1' identified by 'manager';
grant all privileges on *.* to 'mha'@'Mysql2' identified by 'manager';
grant all privileges on *.* to 'mha'@'Mysql3' identified by 'manager';


flush privileges;

show master status;
#View the status of the master

help change master to
change master to master_host='192.168.241.129',master_user='myslave',master_password='Wx12345@',master_log_file='master-bin.000001',master_log_pos=2045;
#Slave node configuration

start slave;
show slave status\G

set global read_only=1;
#Set two slave nodes to read-only mode

verification
create database test1;
use test1;
create table ky15 (id int,name char(20),age int,gender char(20));
insert into ky15 values(1,'wx',2,'m');

##stay slave1,slave2 Upper verification##
select * from kgc.ky15;

 

Install MHA

#MHA dependent environments are installed on all servers. First, install the epel source
yum install epel-release --nogpgcheck -y
​
yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN
​
#To install MHA package, you must first install node components on all servers
#The version of each operating system is different. Here, CentOS 7.6 selects version 0.57.
#You must first install the node component on all servers, and finally install the manager component on the MHA manager node, because manager depends on the node component.
cd /opt
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install
​
​
tar zxvf mha4mysql-manager-0.57.tar.gz 
cd mha4mysql-manager-0.57/
perl Makefile.PL
make && make install
​
cd /usr/local/bin/
#After the manager component is installed, several tools will be generated under / usr/local/bin
ls

#Configure password less authentication to all database nodes on the manager node
ssh-keygen -t rsa 				#Press enter all the way
ssh-copy-id 192.168.241.129
ssh-copy-id 192.168.241.133
ssh-copy-id 192.168.241.134

#After the node component is installed, several scripts will also be generated under / usr/local/bin (these tools are usually triggered by MHAManager scripts without manual operation). The main contents are as follows:
save_binary_logs Save and copy master Binary log for
apply_diff_relay_logs Identify differential relay log events and apply their differential events to other events slave
filter_mysqlbinlog Remove unnecessary ROLLBACK Events( MHA (this tool is no longer used)
purge_relay_logs Clear relay log (no blocking) SQL Thread)


#########Configure password less authentication on all servers##############
stay manager Configure password less authentication to all database nodes on the node
ssh-keygen -t rsa 				#Press enter all the way
ssh-copy-id 192.168.241.129
ssh-copy-id 192.168.241.133
ssh-copy-id 192.168.241.134


The database servers can import each other
ssh-keygen -t rsa 
ssh-copy-id 192.168.241.133
ssh-copy-id 192.168.91.134

ssh-keygen -t rsa 
ssh-copy-id 192.168.241.129
ssh-copy-id 192.168.241.134


ssh-keygen -t rsa 
ssh-copy-id 192.168.241.129
ssh-copy-id 192.168.241.133

 

######stay manager Operation on node###########
(1)stay manager Copy related scripts on node to/usr/local/bin catalogue
cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
//After copying, there will be four execution files
ll /usr/local/bin/scripts/

master_ip_failover  		#Script for VIP management during automatic switching
master_ip_online_change 	#Management of vip during online switching
power_manager 				#Script to shut down the host after a failure
send_report 				#Script for sending alarms after failover

 

(2)##Copy the above script of VIP management during automatic switching to / usr/local/bin directory. Here, use master_ip_failover scripts to manage VIPs and failover
cp /usr/local/bin/scripts/* /usr/local/bin
 
(3)##Modify master_ ip_ Delete all failover, add the following contents and modify relevant parameters
 
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
 
use Getopt::Long;
 
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
#############################Add content section#########################################
my $vip = '192.168.241.188';									#Specify the address of the vip
my $brdc = '192.168.241.255';								#Specify the broadcast address of the vip
my $ifdev = 'ens33';										#Specify the network card bound by vip
my $key = '1';												#Specifies the serial number of the virtual network card bound by the vip
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";		#Represents that the value of this variable is ifconfig ens33:1 192.168.241.188
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";		#Represents that the value of this variable is ifconfig ens33:1 192.168.241.188 down
my $exit_code = 0;											#Specifies that the exit status code is 0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
 
exit &main();
 
sub main {
 
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
 
if ( $command eq "stop" || $command eq "stopssh" ) {
 
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
 
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
## A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
 
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
 
#Delete Note 
:2,87 s/^#//
 
(4)##Create the MHA software directory and copy the configuration file. Here, use the app1.cnf configuration file to manage the mysql node server
 
#Create MHA software directory and copy configuration files
cd /opt/mha4mysql-manager-0.57/samples/conf/
ls
mkdir /etc/masterha
cp app1.cnf /etc/masterha/
 
 
#Modify the app1.cnf configuration file, delete all contents of the original text, and add the following
vim /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.241.133 -s 192.168.241.134
shutdown_script=""
ssh_user=root
user=mha
 
[server1]
hostname=192.168.241.129
port=3306
 
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.241.133
port=3306
 
[server3]
hostname=192.168.241.134
port=3306

 

###Enable virtual IP on the primary node
ifconfig  ens33:1 192.168.241.188/24

(6)##Test ssh password free authentication on the manager node. If it is normal, it will output successfully. As shown below
masterha_check_ssh -conf=/etc/masterha/app1.cnf
 
(7)##Test the mysql master-slave connection on the manager node, and the word MySQL Replication Health is OK appears at the end, indicating that it is normal. As shown below.
masterha_check_repl -conf=/etc/masterha/app1.cnf
 
(8)##Start MHA on the manager node
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
 
(9)##Viewing the MHA status, you can see that the current master is the Mysql1 node.
masterha_check_status --conf=/etc/masterha/app1.cnf
 
(10)see MHA Log to see the current master It's 192.168.59.118,As shown below.
cat /var/log/masterha/app1/manager.log | grep "current master"
 
(11)##Check whether the VIP address 192.168.59.118 of Mysql1 exists. This VIP address will not disappear because the manager node stops MHA service.
ifconfig
 
//To shut down the manager service, you can use the following command.
masterha_stop --conf=/etc/masterha/app1.cnf
 Or it can be used directly kill process ID Close by

 

---------------------- fault simulation  --------------------------------
#Monitor observation logging on the manager node
tail -f /var/log/masterha/app1/manager.log

#Stop the MySQL service on Master node Mysql1
systemctl stop mysqld
 or
pkill -9 mysql

#After a normal automatic switching, the MHA process will exit. HMA will automatically modify the contents of the app1.cnf file and delete the down mysql1 node. Check whether mysql2 takes over VIP
ifconfig

Algorithm of failover alternative master library:
1.It is generally judged that the slave library is from( position/GTID)Judging the pros and cons, there are differences in data, which is closest to master of slave,Become an alternative master.
2.If the data is consistent, select an alternative master database according to the order of the configuration file.
3.Set weighted( candidate_master=1),Force an alternate master by weight.
(1)By default, if one slave backward master 100M of relay logs If there is a weight, it will fail.
(2)If check_repl_delay=0 If, even if it lags behind a lot of logs, it is forced to select it as the alternative master.

Troubleshooting steps:
1.repair mysql
systemctl restart mysqld

2.Repair master-slave
#View binary files and synchronization points on the current main database server Mysql2
show master status;

#Perform synchronization on the original master database server mysql1
change master to master_host='192.168.91.101',master_user='myslave',master_password='123123',master_log_file='master-bin.000002',master_log_pos=154;

start slave;


3.stay manager Modify profile on node app1.cnf(Add this record again, because it will disappear automatically when it detects failure)
vi /etc/masterha/app1.cnf
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.91.100 -s 192.168.91.102
......
[server1]
hostname=192.168.80.101
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.91.100
port=3306

[server3]
hostname=192.168.80.102
port=3306

4.stay manager Start on node MHA
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &



#Solve the problem of incompatible Chinese and English characters and error reporting
dos2unix /usr/local/bin/master_ip_failover 

Topics: Linux Operation & Maintenance Database MySQL server