How to collect MySQL MHA deployment and running status information through Python

Posted by jmcc on Sun, 03 Oct 2021 02:20:59 +0200

1, Background introduction
When the number of MySQL database instances of the group reaches 2000 + and the scale of MHA cluster reaches hundreds, the timely and efficient management of MHA is a challenge that DBA s must face. MHA cluster node information and operation status are the basis of management. This section mainly introduces how to collect MHA cluster node information and running status through Python. This information will be an important part of CMDB information.

There are hundreds of MHA clusters, more than a dozen MHAManager nodes, and one MHAManager node manages 50-60 clusters. The program we hope to develop can collect all the required MHA Server node information, VIP information, running status information and other information only by deploying and running in these dozen MHAManager nodes, and save the collected data to MySQL database.

2, Implementation logic

2.1 MHA tool program or file called by the program
Tool program or file function
mha_appxxx.cnf configuration file
1. Extract Server information (Server IP) from this file;

2. Extract the files of FailOver Script and Online Change Script.

appxxx_ master_ ip_ The failover script file extracts the defined VIPs and compares them horizontally with the VIPs collected in other places to prevent configuration errors.
appxxx_ master_ ip_ online_ The change script file extracts the defined VIP and compares horizontally to prevent configuration errors.
masterha_check_repl tool program
1. Check MySQL replication status;

2. Resolve the current master node IP;

3. Parse the slave node IP;

4. Analyze the VIP.

masterha_check_status
Check the current MHA running status (running OK or stop).

For ease of understanding, we post MHA_ Content of appxxx.cnf.

[server default]
manager_workdir=/var/log/masterha/app1.log / / set the working directory of manager
manager_log=/var/log/masterha/app1/manager.log / / set the manager log
master_binlog_dir=/data/mysql / / set the location where the master saves binlog so that MHA can find the master's log. Here is the mysql data directory
master_ip_failover_script= /usr/local/bin/appxxx_master_ip_failover / / sets the switching script for automatic failover
master_ip_online_change_script= /usr/local/bin/appxxx_master_ip_online_change / / sets the switching script for manual switching
Password = user password / / set the password of the root user in mysql, which is the password of the monitoring user created earlier
user=root set monitoring user root
ping_interval=1 / / set the time interval between monitoring the main database and sending Ping packets. The default is 3 seconds. When there is no response after three attempts, the system will automatically perform a railover
remote_workdir=/tmp / / set the location where binlog is saved when remote mysql switches
repl_password = user password / / set the password of the copying user
repl_user=repl / / sets the replication user name in the replication environment
report_script=/usr/local/send_report / / sets the script of the alarm sent after switching
shutdown_script = "" / / set the script to close the fault host after the fault occurs (the main function of this script is to close the host and put it in the event of brain fissure, which is not used here)
ssh_user=root / / set the login user name of SSH

[server1]
hostname=110.110.110.50
port=3306

[server2]
hostname=110.110.110.60
port=3306
candidate_master=1 / / set as the candidate master. If this parameter is set, the slave database will be promoted to the master database after the master-slave switch occurs, even if the master database is not the latest slave in the cluster
check_repl_delay=0 / / by default, if a slave lags behind the relay logs of the master by 100m, MHA will not select the slave as a new master because the recovery of the slave takes a long time. Set check_repl_delay=0,MHA triggers the switch. When selecting a new master, the replication delay will be ignored. This parameter is set to candidate_ The host with master = 1 is very useful because the candidate master must be a new master in the process of switching

[server3]
hostname=110.110.110.70
port=3306

2.2. Flow chart of simple procedure

Because it is a simple flow chart, the judgment and abnormality are not indicated in the chart.

3, Main code implementation
Python Inn delivers red envelopes and paper books

3.1. Create a table to save the collected information
The table is named mysqldb_mha_info, the create script is as follows:

create table mysqldb_mha_info (
id int(11) NOT NULL AUTO_INCREMENT,
mha_manager_ip varchar(50) NOT NULL DEFAULT '' COMMENT 'the IP of the cluster where the MHA management node is located',
mha_name varchar(50) NOT NULL DEFAULT 'COMMENT' name of MHA, similar to replica set ',
mha_file_name varchar(250) NOT NULL DEFAULT 'COMMENT' MHA. CNF profile name ',
mha_name_path varchar(250) NOT NULL DEFAULT 'COMMENT' MHA. CNF profile path and name ',
cnf_server1_ip varchar(50) NOT NULL DEFAULT 'COMMENT' node 1 'in MHA cnf configuration file,
cnf_server2_ip varchar(50) NOT NULL DEFAULT 'COMMENT' node 2 in MHA cnf configuration file,
cnf_server3_ip varchar(50) NOT NULL DEFAULT '' COMMENT 'node 3 in MHA cnf configuration file,
failover_script varchar(250) NOT NULL DEFAULT 'COMMENT' file of MHA failover scripts',
failover_script_vip varchar(50) NOT NULL DEFAULT 'COMMENT' VIP defined in MHA failover scripts',
online_script varchar(250) NOT NULL DEFAULT 'COMMENT' file of MHA online change scripts',
online_script_vip varchar(50) NOT NULL DEFAULT 'COMMENT' VIP defined in MHA online change scripts',
script_remark varchar(1500) NOT NULL DEFAULT 'COMMENT' MHA scripts VIP check result ',
masterha_status varchar(10) NOT NULL DEFAULT 'COMMENT' MHA check is enabled, from masterha_check_status check result ',
master_serverip varchar(50) NOT NULL DEFAULT 'COMMENT' MHA check is enabled, from masterha_check_status check result ',
current_master_ip varchar(50) NOT NULL DEFAULT '' COMMENT 'MHA current master node, from check_repl’,
mha_current_vip varchar(50) NOT NULL DEFAULT 'COMMENT' MHA current VIP, from check_repl’,
slave1_ip varchar(50) NOT NULL DEFAULT 'COMMENT' MHA is currently from node 1, from check_repl’,
slave2_ip varchar(50) NOT NULL DEFAULT 'COMMENT' MHA is currently from node 2, from check_repl’,
mha_cnf_remark varchar(1500) NOT NULL DEFAULT 'COMMENT' MHA check conf/cnf check result ',
check_repl_remark varchar(1500) NOT NULL DEFAULT 'COMMENT' MHA check repl check result ',
remark varchar(1500) NOT NULL DEFAULT '' COMMENT 'MHA check result',
creator varchar(50) NOT NULL DEFAULT '',
create_time datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
operator varchar(50) NOT NULL DEFAULT '',
modify_time datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4

3.2. Module connecting DB
The module is named db_conn.py. In this module, MySQL connector is used instead of MySQL dB. Easier installation.

#!/usr/bin/python3

-- coding: UTF-8 --

##Import MySQL DB installation module trouble
import mysql.connector
db = mysql.connector.connect(user='nideuid', password='nidepwd',host='nideseverip',database='DBname',port=XXXX)

3.3. Function realization module
The file is collect_mysqldbmha_info.py, the code is as follows:

#!/usr/bin/python

-- coding: UTF-8 --

import os
import io
import re
import ConfigParser
import socket

import db_conn
mysqldb = db_conn.db
cursor = mysqldb.cursor()

Part 1 get native IP

try:
s=socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
s.connect(('8.8.8.8',80))
mha_manager_ip=s.getsockname()[0]
print('the IP address of the host where the mha manager is located is as follows: ')
print(mha_manager_ip)
finally:
s.close()

##Part 2: loop through the folder where mha cnf is located, and take out the cnf for judgment and inspection
Path='/date/funcation/python/mha_conf'
#fout=open('output file name ',' w ')
for Name in os.listdir(Path) :
Pathname= os.path.join(Path,Name)

print(Pathname)

print(Name)

mha_name = Name.replace(".cnf", ""). replace(".conf", "") ### starts a name for the MHA cluster
##print(mha_name)
##Note that this is r, not w, otherwise an error is reported: IOError: File not open for reading
with open(Pathname,'r') as f:
filecontent=f.read()
#print(filecontent)
remark = ''
####Adjust to configparser. Note that the module names of Python 2 and python are different. configparser and configparser
config =ConfigParser.ConfigParser()
try:
config.read(Pathname)
server_item = config.sections()
##print(server_item)
###start gets the name of the scripts file during MHA switching
mha_failover_script = ''
mha_online_change_script =''
mha_cnf_remark =''
if 'server default' in server_item:
mha_failover_script = config.get('server default','master_ip_failover_script')
###
mha_failover_script=mha_failover_script.replace(" --ssh_user=root", "")
##print(mha_failover_script)
else:
mha_cnf_remark = mha_cnf_remark + ‘mha_failover_script not configured; '
if 'server default' in server_item:
mha_online_change_script = config.get('server default','master_ip_online_change_script')
##print(mha_online_change_script)
else:
mha_cnf_remark = mha_cnf_remark + ‘mha_online_change_script not configured; '
###1.1 end acquisition end
##1.2 start to obtain the node information in the MHA configuration file
server1_host = '' ##MHA cnf configuration file node 1
server2_host = '' ##MHA cnf configuration file node 2
server3_host = '' ##MHA cnf configuration file node 3
if 'server1' in server_item:
server1_host = config.get('server1','hostname')
print(server1_host)
else:
server1_host = ''
mha_cnf_remark = mha_cnf_remark + 'Server1 is not configured;'
print(server1_host)
if 'server2' in server_item:
server2_host = config.get('server2','hostname')
print(server2_host)
else:
server2_host = ''
mha_cnf_remark = mha_cnf_remark + 'Server2 is not configured;'
print(server2_host)
if 'server3' in server_item:
server3_host = config.get('server3','hostname')
print(server3_host)
##else:
##server3_host = ''
##mha_cnf_remark = mha_cnf_remark + 'Server3 is not configured;'
##print(server3_host)
##1.2 end of obtaining server node information
print(mha_cnf_remark)
except Exception as e:
print(e)

#####Part 3 start extracts the configured VIP from mha scripts
mha_remark = ''
mha_failover_my_vip = ''
mha_failover_flush_vip = ''
mha_onlinechange_my_vip = ''
mha_onlinechange_flush_vip =''
if len(mha_failover_script) <> 0 and len(mha_online_change_script) <> 0 :
  ##3.1 handle failover first_ Script to parse the VIP
  with open(mha_failover_script,'r') as f:
    failscript_lines=f.readlines()
    for failscript_line in failscript_lines:
      failscript_ip=re.findall(r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b", failscript_line)
      if failscript_ip:
        if 'my $vip =' in failscript_line:
          mha_failover_my_vip = failscript_ip[0]
          print('Parse out mha_failover_my_vip:')
          print(mha_failover_my_vip)
        if  'my $ssh_flush_vip' in failscript_line:
          mha_failover_flush_vip = failscript_ip[0]
          print('Parse out mha_failover_flush_vip:')
          print(mha_failover_flush_vip)

    ##After reading the file, judge the reading results and judge the two situations
    ## Whether a is undefined
    if mha_failover_my_vip =='':
      mha_remark = mha_remark + 'MHA failover  Not extracted VIP Please check the settings of;'
    ## The other,, is defined, but the values are not equal
    if mha_failover_my_vip <> mha_failover_flush_vip:
      mha_remark = mha_remark + 'MHA failover scripts Two places set in the file VIP Inconsistent, please check;'

  ## 3.2 process online change scripts, parse and extract VIP information
  with open(mha_online_change_script,'r') as f:
    onlinescript_lines=f.readlines()
    for onlinescript_line in onlinescript_lines:
      onlinescript_ip=re.findall(r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b", onlinescript_line)
      if onlinescript_ip:
        if 'my $vip =' in onlinescript_line:
          mha_onlinechange_my_vip = onlinescript_ip[0]
          print('Parse out mha_onlinechange_my_vip:')
          print(mha_onlinechange_my_vip)
        if  'my $ssh_flush_vip' in onlinescript_line:
          mha_onlinechange_flush_vip = onlinescript_ip[0]
          print('Parse out mha_onlinechange_flush_vip:')
          print(mha_onlinechange_flush_vip)
    #### After reading the online change file, judge whether the defined VIP meets the requirements
    if mha_onlinechange_my_vip =='':
      mha_remark = mha_remark + 'MHA online change scripts Not extracted VIP Please check the settings of;'
    if mha_onlinechange_my_vip <> mha_onlinechange_flush_vip:
      mha_remark = mha_remark + 'MHA online change scripts Two places set in the file VIP Inconsistent, please check;'

  #### After reading both files, judge whether the VIP s defined in the two files are consistent
  if mha_onlinechange_my_vip <> mha_failover_my_vip:
    mha_remark = mha_remark + 'MHA online change script  and failover script Medium VIP atypism,Please check.'

else:
  mha_remark = mha_remark + 'MHA init The profile for the switch is not defined correctly scripts,Please check.'
  #print('MHA init's configuration file does not correctly define switching scripts, please check. ')
  print(mha_remark)
#####Part 2 extract the configured VIP from mha scripts

#### Part 4, from masterha_check_status determines the status of the mha process in the execution result
## Execute masterha from_ check_ The masterha parsed in the status result_ Status and master_serverip data
masterha_status =''
master_serverip =''
## Execute masterha from_ check_ Current resolved in repl result_ master ,current_slave1,current_slave2 and mha_current_vip data
current_master = ''
current_slave1 = ''
current_slave2 = ''
mha_current_vip =''
##Judge whether the file is the configuration file of MHA. The judgment method is that there must be sections of server default\server1 in the file
if 'server default' in server_item and 'server1' in server_item :
  ##cmd_mha_status ='/usr/local/bin/masterha_check_status --conf=/etc/mha/opszabbix.cnf'
  cmd_mha_status ='/usr/local/bin/masterha_check_status --conf='+Pathname
  try:
    mha_status=os.popen(cmd_mha_status)
    mha_status_result = mha_status.read()
    print(mha_status_result)  ##The return style is xxxx (PID: -------) is running (0: ping_ok), master: xxx.xxx.xxx.xxx
    ### Judge whether the status is running
    if 'running(0:PING_OK)' in mha_status_result:
      masterha_status='Running'
      ##Grab the Master node of MHA
      ##master_serverip = mha_status_result[mha_status_result.rfind('master:'):]
      master_serverip = mha_status_result.split('master:')[1]
      print(master_serverip)
      print('MHA Normal startup and operation')
    elif 'stopped(2:NOT_RUNNING)' in mha_status_result:
      masterha_status='stopped'
      print('MHA Not started!!!')
  finally:
     if mha_status:
        mha_status.close()
  #### Part 5, from masterha_ check_ In the execution result of repl, judge and resolve the master node, slave node and VIP node
  ##  Judge the condition of the replica set
  cmd_repl_status ='/usr/local/bin/masterha_check_repl --conf='+Pathname
  try:
    ##### Add the 2 > error parameter, and you do not need to print out the debugging information.
    cmd_repl_status_result = cmd_repl_status + '     2> checkrepl.log'
    repl_status=os.popen(cmd_repl_status_result)
    repl_status_result = repl_status.read()
    ##print(repl_status_result)
    if 'MySQL Replication Health is OK' in repl_status_result:
      print('MHA The master-slave of the cluster is normal')
      ###Get ServerIP
      #######The debugging information is output to stream 2, so you must add 2 > & 1, otherwise you can't get the node information and can only catch one VIP.
      cmd_repl_status_info = cmd_repl_status + '     2>&1'
      with os.popen(cmd_repl_status_info,'r') as repl_status_check2:
        #repl_status_lines=repl_status_check2.readlines()
        repl_status_lines=repl_status_check2.readlines()
        ##print(len(repl_status_lines))  ####Print out the number of elements in the list
        for repl_status_line in repl_status_lines:
          ##print('##################  start   ###########################')
          ##print(str(repl_status_line).replace("\n", "").replace("\t", ""))
          ##repl_status_line ='Current Alive Master: 10.200.58.63(10.200.58.63:55988)'
          serverip_result=re.findall(r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b", repl_status_line)
          if serverip_result:
            if 'Current Alive Master:' in repl_status_line:
              current_master = serverip_result[0]
              print('Resolved to master node IP')
              print(current_master)
            elif 'Checking replication health on' in repl_status_line and current_slave1 == '':   ###There may be 2 slave nodes, here is the first slave node
              current_slave1 = serverip_result[0]
              print('Resolved to slave node 1')
              print(current_slave1)
            elif 'Checking replication health on' in repl_status_line and current_slave1 <> '':  ###There may be two slave nodes. Here is the second slave node
              current_slave2 = serverip_result[0]
              print('Resolved to slave node 2')
              print(current_slave2)
            elif 'Checking replication health on' in repl_status_line and current_slave1 <> '':  ###There may be two slave nodes. Here is the second slave node
              print('The cluster has 3 or more slave nodes. Please confirm.')
            if 'down==/sbin/ifconfig ' in repl_status_line:
              mha_current_vip = serverip_result[0]
              print('Resolved to MHA Clustered VIP')
              print(mha_current_vip)

            ##print('including serverip ')
            ##print(serverip_result)
          #else:
            #print('excluding ServerIP ')
        ##else:
          ##print(repl_status_line)
          ##print('##################  end   ###########################')
      ####End of getting IP part
    else:
      print('MHA Cluster master-slave exception,Please check in time')

  finally:
     if repl_status:
        repl_status.close()

else:
   remark = Pathname + '...... no MHA Configuration file for,Please check!'
   print(remark)


##### Part 6, saving data to a table
sql_insert = "insert into mysqldb_mha_info(mha_manager_ip,mha_name,mha_file_name,mha_name_path,cnf_server1_ip,cnf_server2_ip,cnf_server3_ip,failover_script,failover_script_vip,online_script,online_script_vip,masterha_status,master_serverip,current_master_ip,mha_current_vip,slave1_ip,slave2_ip,mha_cnf_remark,script_remark,remark) " \
                  "values('%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s')" % \
                  (mha_manager_ip,mha_name,Name,Pathname,server1_host,server2_host,server3_host,mha_failover_script,mha_failover_my_vip,mha_online_change_script,mha_onlinechange_my_vip,masterha_status,current_master,master_serverip,mha_current_vip,current_slave1,current_slave2,mha_cnf_remark,mha_remark,remark)
##print(sql_insert)
cursor.execute(sql_insert)
mysqldb.commit()

#####

Close cursor

cursor.close()

Close database connection

mysqldb.close()

3.4. Code operation
Python running environment: Python 2.7.5

Execute command:

python /data/XXXX path / collect_mysqldbmha_info.py
Collect regularly, please set cron as required

This is the end of this article about the function implementation of collecting MySQL MHA deployment and running status information through Python

Topics: Python Database MySQL