MHA high availability cluster deployment and failover

Posted by torleone on Sun, 23 Jan 2022 01:33:53 +0100

1, MHA theory

1. MHA definition

  • MHA (Master High Availability) is a mature software suite for failover and master-slave replication in MySQL high-availability environments.
  • MHA exists to solve the problem of MySQL being a single point of failure.
  • When a failure occurs, MHA can complete the failover operation automatically within 0-30 seconds.
  • During failover, MHA preserves data consistency to the greatest extent possible, achieving high availability in the real sense.

2. MHA composition

  • MHA Node (data node)
    An MHA Node runs on every MySQL server.
  • MHA Manager (management node)
    The MHA Manager can be deployed on an independent machine to manage multiple master-slave clusters, or it can be deployed on one of the slave nodes.
    The MHA Manager periodically probes the master node in the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master, then repoints all other slaves at the new master. The entire failover process is transparent to the application.

3. MHA features

● During automatic failover, MHA tries to save the binary logs from the downed master server, ensuring as far as possible that no data is lost
● Using semi-synchronous replication greatly reduces the risk of data loss: as long as at least one slave has received the latest binary log, MHA can apply it to all the other slaves, keeping the data of all nodes consistent
● MHA currently requires a master-slave architecture of at least three servers, i.e. one master and two slaves
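The semi-synchronous replication mentioned in the second point is not configured later in this walkthrough; for reference, a sketch of the my.cnf lines that enable it on MySQL 5.7 (plugin file names as shipped with 5.7; the timeout value is illustrative):

```ini
# Load the semi-sync plugins at startup; the master needs the master-side
# plugin, slaves the slave-side one (loading both is harmless on a
# candidate master that may later be promoted)
plugin-load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 1000   # ms to wait for a slave ack before falling back to async
rpl_semi_sync_slave_enabled = 1
```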

2, Construction of MHA

Experimental ideas

1. MHA architecture
① Database installation
② One master and two slaves
③ MHA construction

2. Fault simulation
① Simulate a failure of the master library
② The candidate master library becomes the new master library
③ The original failed master recovers and rejoins the MHA cluster as a slave

Experimental preparation

| Host | Operating system | IP address | Packages / software / tools |
|------|------------------|------------|-----------------------------|
| MHA manager node server | CentOS 7 | 192.168.217.10 | MHA node component, MHA manager component |
| Master | CentOS 7 | 192.168.217.40 | mysql-boost-5.7.20.tar.gz, MHA node component |
| Slave1 | CentOS 7 | 192.168.217.20 | mysql-boost-5.7.20.tar.gz, MHA node component |
| Slave2 | CentOS 7 | 192.168.217.30 | mysql-boost-5.7.20.tar.gz, MHA node component |
First, check on the three mysql node servers whether the mysql service is started; if not, install mysql first
netstat -natp | grep 3306


1. Modify the hostname of Master, Slave1 and Slave2 nodes

hostnamectl set-hostname master    #192.168.217.40
hostnamectl set-hostname slave1    #192.168.217.20
hostnamectl set-hostname slave2    #192.168.217.30
su



2. Turn off firewall

systemctl stop firewalld
systemctl disable firewalld
setenforce 0

3. Modify the main configuration file /etc/my.cnf of the three MySQL servers
master(192.168.217.40)

vim /etc/my.cnf
[mysqld]
server-id = 20
log_bin = master-bin
log-slave-updates = true

systemctl restart mysqld
ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/


slave1(192.168.217.20),slave2(192.168.217.30)

vim /etc/my.cnf
[mysqld]
server-id = 30
#Use server-id = 40 on slave2; the server IDs of the three servers must all be different
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

systemctl restart mysqld
ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
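The comment about server IDs above is worth automating; a minimal local sketch of that uniqueness check, with the three IDs from this walkthrough hard-coded for illustration:

```shell
# The server-id values used in this walkthrough: master=20, slave1=30, slave2=40
ids="20
30
40"
# Count distinct IDs; replication misbehaves if two nodes share a server-id
total=$(printf '%s\n' "$ids" | wc -l | tr -d ' ')
distinct=$(printf '%s\n' "$ids" | sort -u | wc -l | tr -d ' ')
if [ "$distinct" -eq "$total" ]; then
    echo "server-ids OK"
else
    echo "duplicate server-id detected"
fi
```

In practice the three values would come from `mysql -e "show variables like 'server_id'"` on each node.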




4. Configure mysql one master and two slave

① Create the authorized accounts on all database nodes

mysql -uroot -p
grant replication slave on *.* to 'myslave'@'192.168.217.%' identified by '123456';		
#Replication account used by the slaves to synchronize from the master
grant all privileges on *.* to 'mha'@'192.168.217.%' identified by '123456';		
#Account used by the MHA manager

#Also grant by hostname, in case a node connects to the master via its hostname rather than its IP
grant all privileges on *.* to 'mha'@'master' identified by 'manager';
grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
flush privileges;



② View binaries and synchronization points in the Master node

show master status;

Example: your binary log file name and offset may differ; note down your own values
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 |     1747 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
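The File and Position values in this output are exactly what the slaves' CHANGE MASTER TO statement needs; a small shell sketch that extracts them from the client's tabular output (sample row hard-coded here; in practice pipe in `mysql -e 'show master status'`):

```shell
# Sample 'show master status' output as printed by the mysql client
sample='| File              | Position |
| master-bin.000001 |     1747 |'

# Split the data row on '|' and strip the padding spaces
file=$(printf '%s\n' "$sample" | awk -F'|' '/master-bin/ {gsub(/ /,"",$2); print $2}')
pos=$(printf '%s\n' "$sample" | awk -F'|' '/master-bin/ {gsub(/ /,"",$3); print $3}')

# Compose the statement the slaves will run in the next step
printf "change master to master_log_file='%s',master_log_pos=%s;\n" "$file" "$pos"
```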

③ Configure synchronization on the Slave1 and Slave2 nodes

stop slave;
change master to master_host='192.168.217.40',master_user='myslave',master_password='123456',master_log_file='master-bin.000001',master_log_pos=1747;
# Use the binary log file name and position from your own master here
start slave;
show slave status\G
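In the `show slave status\G` output, the two fields to watch are Slave_IO_Running and Slave_SQL_Running, which must both read Yes; a sketch of an automated check (sample output embedded for illustration):

```shell
# Trimmed sample of 'mysql -e "show slave status\G"' output
status='Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0'

# Replication is healthy only when both threads report Yes
running=$(printf '%s\n' "$status" | grep -c '_Running: Yes')
if [ "$running" -eq 2 ]; then
    echo "replication healthy"
else
    echo "replication broken"
fi
```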




Set the Slave1 and Slave2 nodes to read-only mode

set global read_only=1;

5. Master-slave replication verification
Create a database on the Master

Query the database on the slaves to verify

6. Install the MHA software
① Install the MHA dependency environment on all servers. First install the epel source (this requires internet access),
then install the node component on all servers

mv /etc/yum.repos.d/repos.bak/CentOS-* /etc/yum.repos.d/
yum list

yum install epel-release --nogpgcheck -y

yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN



Note:
The MHA version differs for each operating system; on CentOS 7.4, version 0.57 must be used.
You must install the node component on all servers first, and only then install the manager component on the MHA manager node, because the manager depends on the node component.

② Install the manager component on the MHA manager node

Upload the package mha4mysql-manager-0.57.tar.gz to the /opt directory (the node component tarball is unpacked and installed the same way, with perl Makefile.PL and make && make install, on every server)

cd /opt
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install




#After the manager component is installed, several tools are generated under /usr/local/bin, mainly the following:

| Tool | Purpose |
|------|---------|
| masterha_check_ssh | Check the SSH configuration of MHA |
| masterha_check_repl | Check the MySQL replication status |
| masterha_manager | Script to start the manager |
| masterha_check_status | Detect the current MHA running status |
| masterha_master_monitor | Check whether the master is down |
| masterha_master_switch | Control failover (automatic or manual) |
| masterha_conf_host | Add or remove configured server information |
| masterha_stop | Stop the manager |

#After the node component is installed, several scripts are also generated under /usr/local/bin (these are usually triggered by the MHA Manager scripts and need no manual operation). The main ones are:

| Script | Purpose |
|--------|---------|
| save_binary_logs | Save and copy the master's binary logs |
| apply_diff_relay_logs | Identify differing relay log events and apply them to the other slaves |
| filter_mysqlbinlog | Remove unnecessary ROLLBACK events |

7. Configure password-less (SSH key) authentication on all servers
① On the manager node (192.168.217.10), configure password-less authentication to all database nodes

ssh-keygen -t rsa 				#Press enter all the way
ssh-copy-id 192.168.217.20
ssh-copy-id 192.168.217.30
ssh-copy-id 192.168.217.40

② On the master (192.168.217.40), configure password-less authentication to the database nodes slave1 (192.168.217.20) and slave2 (192.168.217.30)

ssh-keygen -t rsa
ssh-copy-id 192.168.217.20
ssh-copy-id 192.168.217.30


③ On slave1 (192.168.217.20), configure password-less authentication to the database nodes master (192.168.217.40) and slave2 (192.168.217.30)

ssh-keygen -t rsa
ssh-copy-id 192.168.217.30
ssh-copy-id 192.168.217.40

④ On slave2 (192.168.217.30), configure password-less authentication to the database nodes master (192.168.217.40) and slave1 (192.168.217.20)

ssh-keygen -t rsa
ssh-copy-id 192.168.217.40
ssh-copy-id 192.168.217.20
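ssh-keygen and ssh-copy-id prompt interactively by default; when scripting the four exchanges above, ssh-keygen can run unattended with -f and -N (a sketch that writes a throwaway key pair under a temporary directory):

```shell
# -f sets the output path, -N '' an empty passphrase, -q silences the banner
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$keydir/id_rsa"

# Both halves of the key pair should now exist and be non-empty
ok=0
[ -s "$keydir/id_rsa" ] && [ -s "$keydir/id_rsa.pub" ] && ok=1
echo "key files present: $ok"
rm -rf "$keydir"
```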

8. Configure MHA on the manager node

① On the manager node, copy the relevant scripts to the / usr/local/bin directory

cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin

#After copying, there will be four executables
ll /usr/local/bin/scripts/

② Copy the VIP-management script used during automatic switching to the /usr/local/bin directory. Here the master_ip_failover script is used to manage the VIP and failover

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

③ Modify the contents as follows (empty the original file first, then paste in the script below and adjust the VIP-related parameters; the VIP address is user-defined)

echo '' > /usr/local/bin/master_ip_failover

vim /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
#############################Add content section#########################################
my $vip = '192.168.217.200';								#Specify the address of the vip
my $brdc = '192.168.217.255';								#Specify the broadcast address of the vip
my $ifdev = 'ens33';										#Specify the network card bound by vip
my $key = '1';												#Specifies the serial number of the virtual network card bound by the vip
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";		#Represents that the value of this variable is ifconfig ens33:1 192.168.217.200
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";		#Represents that the value of this variable is ifconfig ens33:1 192.168.217.200 down
my $exit_code = 0;											#Specifies that the exit status code is 0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
## A simple system call that disables the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}


④ Create the MHA software directory and copy the configuration file. The app1.cnf configuration file is used here to manage the mysql node servers; configuration files are generally placed in the /etc/ directory

mkdir /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha

vim /etc/masterha/app1.cnf						#Delete the original content, directly copy and modify the IP address of the node server
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=123456
ping_interval=1
remote_workdir=/tmp
repl_password=123456
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.217.20 -s 192.168.217.30
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.217.40
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.217.20
port=3306

[server3]
hostname=192.168.217.30
port=3306


Manually start vip on the Master server

ifconfig ens33:1 192.168.217.200/24

⑤ Test SSH password-less authentication on the manager node. If everything is normal, it finally outputs successfully.

masterha_check_ssh -conf=/etc/masterha/app1.cnf

⑥ Test the MySQL master-slave connection on the manager node. The words "MySQL Replication Health is OK" at the end indicate that replication is normal.

masterha_check_repl -conf=/etc/masterha/app1.cnf

⑦ Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

| Parameter | Meaning |
|-----------|---------|
| --remove_dead_master_conf | After the master-slave switch, the IP address of the old master is removed from the configuration file |
| --manager_log | Log storage location |

--ignore_last_failover: by default, if MHA detects consecutive outages less than 8 hours apart, it will not fail over again; this restriction exists to avoid a ping-pong effect. After each switchover, MHA records it in the log directory configured above as the file app1.failover.complete. If that file is found there during the next switchover, the switchover is refused unless the file has been deleted after the first one. This parameter tells MHA to ignore that marker file; it is set here for convenience.
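The app1.failover.complete marker described above is an ordinary file in the manager work directory, so the gatekeeping logic is easy to picture; a simulation under a temporary directory (not MHA's actual code):

```shell
# Simulate the marker file MHA leaves behind after a completed failover
workdir=$(mktemp -d)
marker="$workdir/app1.failover.complete"
touch "$marker"

# Without --ignore_last_failover, a new failover is refused while a recent
# marker exists (MHA's real check also compares timestamps, 8-hour window)
if [ -e "$marker" ]; then decision1="refused"; else decision1="allowed"; fi

# Deleting the marker (or passing --ignore_last_failover) re-enables failover
rm "$marker"
if [ -e "$marker" ]; then decision2="refused"; else decision2="allowed"; fi

echo "first attempt: $decision1, after cleanup: $decision2"
rm -rf "$workdir"
```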

  • Viewing the MHA status, you can see the current master node.
    masterha_check_status --conf=/etc/masterha/app1.cnf

  • Check the MHA log to see that the current master is 192.168.217.40, as shown below.
    cat /var/log/masterha/app1/manager.log | grep "current master"

Check whether the VIP address 192.168.217.200 exists on Mysql1. This VIP address will not disappear just because the manager node stops the MHA service.
    ifconfig

  • To shut down the manager service, use the following command, or simply kill the manager process ID:
    masterha_stop --conf=/etc/masterha/app1.cnf

9. Fault simulation
Monitor observation logging on the manager node

tail -f /var/log/masterha/app1/manager.log
# tail shows the last 10 lines of the file by default
# -f follows the file, printing new lines as they are appended

Stop the mysql service on the Master

systemctl stop mysqld

After a normal automatic switchover, the MHA process exits. MHA automatically modifies the contents of the app1.cnf file, deleting the downed mysql1 node.

Check whether slave1 takes over the VIP

ifconfig

View log records on the manager node

Algorithm for choosing the alternative master during failover:
1. Generally, slaves are judged by how up to date they are (position/GTID). If the data differs among them, the slave closest to the master becomes the candidate master.
2. If the data is consistent, the candidate master is selected according to the order of the configuration file.
3. If a weight is set (candidate_master=1), the candidate master is forced according to the weight.
① By default, if a slave lags behind the master by more than 100MB of relay logs, it will not be chosen even if it has a weight.
② If check_repl_delay=0 is set, a slave is forcibly selected as the candidate master even if it lags far behind.

10. Fault repair
① Repair mysql (192.168.217.40)

systemctl restart mysqld
systemctl status mysqld

② Repair master-slave (192.168.217.20)

mysql -u root -p123456
#View the binary log file and synchronization point on slave1, the current master server
show master status;

#Perform synchronization on the original master database server
change master to master_host='192.168.217.20',master_user='myslave',master_password='123456',master_log_file='master-bin.000002',master_log_pos=154;

start slave;

③ On the manager node, modify the configuration file app1.cnf (add this record back in, because it is removed automatically when a failure is detected)

vi /etc/masterha/app1.cnf
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.217.40 -s 192.168.217.30
......
[server1]
hostname=192.168.217.20
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.217.40
port=3306

[server3]
hostname=192.168.217.30
port=3306


④ Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

⑤ Check the MHA status again; you can see that the current master is now the Mysql2 node.

masterha_check_status --conf=/etc/masterha/app1.cnf
