Setting Up MHA High Availability

    Tech, 2022-07-21

    1. Prepare the environment

    Hostname  IP address    Role
    db01      172.16.1.51   master
    db02      172.16.1.52   slave1, candidate_master
    db03      172.16.1.53   slave2, manager

    2. Install the database

    [root@db01 ~]# yum install mariadb-server -y
    [root@db02 ~]# yum install mariadb-server -y
    [root@db03 ~]# yum install mariadb-server -y

    3. Edit the database configuration files

    #master node configuration
    [root@db01 ~]# vim /etc/my.cnf
    skip_name_resolve = ON
    innodb_file_per_table = ON
    server-id = 1            #every node in the replication cluster must have a unique id
    log-bin = mysql-bin      #enable the binary log
    relay-log = relay-log    #enable the relay log

    #slave node configuration
    [root@db02 ~]# vim /etc/my.cnf
    skip_name_resolve = ON
    innodb_file_per_table = ON
    server-id = 2
    log-bin = mysql-bin
    relay-log = relay-log
    #read_only = ON          #read-only is not set in the config file, but remember to enforce it on the slaves at runtime
    relay_log_purge = 0      #disable automatic purging of relay logs that are no longer needed
    log_slave_updates = 1    #write replicated updates to this node's own binary log

    [root@db03 ~]# vim /etc/my.cnf
    skip_name_resolve = ON
    innodb_file_per_table = ON
    server-id = 3
    log-bin = mysql-bin
    relay-log = relay-log
    #read_only = ON
    relay_log_purge = 0
    log_slave_updates = 1

    #once configured, restart the database (on all 3 nodes)
    [root@db01 ~]# systemctl restart mariadb
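The configuration above requires a distinct server-id on every node. A minimal bash sketch of how that could be checked before starting replication is shown here; `extract_server_id` and `server_ids_unique` are hypothetical helper names, not part of MariaDB or MHA.

```shell
# Hypothetical helper: print the server-id value found in my.cnf-style text on stdin.
extract_server_id() {
    awk -F= '/^[[:space:]]*server[-_]id[[:space:]]*=/ {
        value = $2
        sub(/#.*/, "", value)           # drop a trailing inline comment
        gsub(/[[:space:]]/, "", value)  # drop surrounding whitespace
        print value
        exit
    }'
}

# Hypothetical helper: succeed only if all ids passed as arguments are distinct.
server_ids_unique() {
    local total unique
    total=$#
    unique=$(printf '%s\n' "$@" | sort -u | wc -l | tr -d ' ')
    [ "$total" -eq "$unique" ]
}
```

For example, `extract_server_id < /etc/my.cnf` could be run on each node and the three results passed to `server_ids_unique`.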

    4. Configure one-master, multi-slave replication

    #master node
    MariaDB [(none)]> grant replication slave on *.* to 'slave'@'172.16.1.%' identified by '123456';
    Query OK, 0 rows affected (0.00 sec)

    MariaDB [(none)]> grant all on *.* to 'mhaadmin'@'172.16.1.%' identified by 'mhapass';    #very important
    Query OK, 0 rows affected (0.00 sec)

    MariaDB [(none)]> show master status;
    +------------------+----------+--------------+------------------+
    | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +------------------+----------+--------------+------------------+
    | mysql-bin.000003 |      397 |              |                  |
    +------------------+----------+--------------+------------------+
    1 row in set (0.00 sec)

    #slave nodes (both of them)
    MariaDB [(none)]> set global read_only=1;    #set read-only (prevents accidental writes, very important)
    Query OK, 0 rows affected (0.00 sec)

    MariaDB [(none)]> change master to
        -> master_host='172.16.1.51',
        -> master_user='slave',
        -> master_password='123456',
        -> master_log_file='mysql-bin.000003',
        -> master_log_pos=397;
    Query OK, 0 rows affected (0.10 sec)

    #the same statement on a single line
    change master to master_host='172.16.1.51',master_user='slave',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=397;

    MariaDB [(none)]> start slave;
    Query OK, 0 rows affected (0.00 sec)

    MariaDB [(none)]> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: 172.16.1.51
                      Master_User: slave
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000003
              Read_Master_Log_Pos: 397
                   Relay_Log_File: relay-log.000002
                    Relay_Log_Pos: 529
            Relay_Master_Log_File: mysql-bin.000003
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                  Replicate_Do_DB:
              Replicate_Ignore_DB:
               Replicate_Do_Table:
           Replicate_Ignore_Table:
          Replicate_Wild_Do_Table:
      Replicate_Wild_Ignore_Table:
                       Last_Errno: 0
                       Last_Error:
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 397
                  Relay_Log_Space: 817
                  Until_Condition: None
                   Until_Log_File:
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File:
               Master_SSL_CA_Path:
                  Master_SSL_Cert:
                Master_SSL_Cipher:
                   Master_SSL_Key:
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error:
                   Last_SQL_Errno: 0
                   Last_SQL_Error:
      Replicate_Ignore_Server_Ids:
                 Master_Server_Id: 1
    1 row in set (0.00 sec)
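In the status output, the two fields that show replication is working are Slave_IO_Running and Slave_SQL_Running, and both must read Yes. As a rough sketch, the check could be scripted in bash like this; `replication_healthy` is a hypothetical helper that reads the `show slave status\G` text from standard input.

```shell
# Hypothetical helper: succeed only if both replication threads report "Yes".
# Reads the output of `show slave status\G` on stdin.
replication_healthy() {
    local status io sql
    status=$(cat)
    # Match the field name exactly so similarly named fields are not picked up.
    io=$(printf '%s\n' "$status" | awk '$1 == "Slave_IO_Running:" {print $2; exit}')
    sql=$(printf '%s\n' "$status" | awk '$1 == "Slave_SQL_Running:" {print $2; exit}')
    [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]
}
```

On a slave it might be used as `mysql -e 'show slave status\G' | replication_healthy && echo "replication OK"`, assuming the `mysql` client can log in without prompting.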

    5. Prepare the base environment for MHA

    1) Set up mutual SSH trust between the three machines (run on all three)

    ssh-keygen -t rsa
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.51
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.52
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.53

    #or push the key without the password prompt and host-key confirmation (-p1 passes the root password, "1" in this lab)
    sshpass -p1 ssh-copy-id -i /root/.ssh/id_rsa.pub -o StrictHostKeyChecking=no root@172.16.1.51
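Because the same ssh-copy-id command is repeated for every node, it can be generated from a host list. The bash sketch below only prints the commands so they can be reviewed first; `MHA_NODES` and `print_key_push_commands` are hypothetical names, and the addresses are the three from the environment table.

```shell
# Node addresses taken from the environment table in step 1.
MHA_NODES="172.16.1.51 172.16.1.52 172.16.1.53"

# Hypothetical helper: print (do not run) one ssh-copy-id command per node.
print_key_push_commands() {
    local host
    for host in $MHA_NODES; do
        printf 'ssh-copy-id -i /root/.ssh/id_rsa.pub root@%s\n' "$host"
    done
}
```

Once the printed commands look right, `print_key_push_commands | sh` would run them one after another.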

    2) Install dependencies

    yum install -y perl perl-DBI perl-DBD-MySQL perl-IO-Socket-SSL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes

    3) Install MHA

    #all three nodes need mha4mysql-node
    yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm

    #the manager node (db03) additionally needs mha4mysql-manager
    yum install -y mha4mysql-manager-0.56-0.el6.noarch.rpm

    6. Configure MHA (on the manager node, i.e. db03)

    1) Create the directories

    mkdir -p /etc/mha/scripts
    mkdir -p /var/log/mha/app1

    2) Write the MHA configuration file

    [root@db03 ~]# vim /etc/mha/app1.cnf
    [server default]
    #manager log file
    manager_log=/var/log/mha/app1/manager.log
    #manager working directory
    manager_workdir=/var/log/mha/app1
    #where the master keeps its binlogs
    master_binlog_dir=/var/lib/mysql
    #switch script used during automatic failover
    master_ip_failover_script=/etc/mha/scripts/master_ip_failover
    #switch script used during a manual switchover
    master_ip_online_change_script=/etc/mha/scripts/master_ip_online_change
    user=mhaadmin
    password=mhapass
    ssh_user=root
    #replication user and password used in the replication setup
    repl_user=slave
    repl_password=123456
    #interval between pings, in seconds
    ping_interval=1

    [server1]
    hostname=172.16.1.51
    port=3306

    [server2]
    hostname=172.16.1.52
    port=3306
    #mark this slave as the candidate master; when a failover happens, it is promoted to master
    candidate_master=1
    #By default, MHA will not choose a slave that is more than 100 MB of relay logs behind
    #the master as the new master, because recovering that slave would take a long time.
    #With check_repl_delay=0, MHA ignores replication delay when it picks the new master.
    #This is useful on a host with candidate_master=1, because that candidate must become
    #the new master during the switch.
    check_repl_delay=0

    [server3]
    hostname=172.16.1.53
    port=3306
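The `[serverN]` sections of this file define which hosts MHA manages. As a quick sanity check, the hostnames can be pulled out of the file and compared against the environment table. The following is a minimal bash sketch; `list_mha_hosts` is a hypothetical helper, not an MHA command.

```shell
# Hypothetical helper: print the hostname of every [serverN] section in an
# MHA-style configuration file given as the first argument.
list_mha_hosts() {
    awk -F= '
        # A section header: remember whether it is a numbered [serverN] section.
        /^[[:space:]]*\[/ {
            in_server = ($0 ~ /^[[:space:]]*\[server[0-9]+\]/)
            next
        }
        in_server && /^[[:space:]]*hostname[[:space:]]*=/ {
            value = $2
            sub(/#.*/, "", value)           # drop any trailing comment
            gsub(/[[:space:]]/, "", value)  # drop surrounding whitespace
            print value
        }
    ' "$1"
}
```

With the file above, `list_mha_hosts /etc/mha/app1.cnf` should print the three node addresses, one per line.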

    7. Configure the VIP (manager node)

    1) Automatic failover script "/etc/mha/scripts/master_ip_failover"

    #!/usr/bin/env perl

    use strict;
    use warnings FATAL => 'all';

    use Getopt::Long;

    my (
        $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
        $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
    );

    my $vip = '172.16.1.59/24';
    my $key = '1';
    my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
    my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";

    GetOptions(
        'command=s'          => \$command,
        'ssh_user=s'         => \$ssh_user,
        'orig_master_host=s' => \$orig_master_host,
        'orig_master_ip=s'   => \$orig_master_ip,
        'orig_master_port=i' => \$orig_master_port,
        'new_master_host=s'  => \$new_master_host,
        'new_master_ip=s'    => \$new_master_ip,
        'new_master_port=i'  => \$new_master_port,
    );

    exit &main();

    sub main {

        print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

        if ( $command eq "stop" || $command eq "stopssh" ) {

            my $exit_code = 1;
            eval {
                print "Disabling the VIP on old master: $orig_master_host \n";
                &stop_vip();
                $exit_code = 0;
            };
            if ($@) {
                warn "Got Error: $@\n";
                exit $exit_code;
            }
            exit $exit_code;
        }
        elsif ( $command eq "start" ) {

            my $exit_code = 10;
            eval {
                print "Enabling the VIP - $vip on the new master - $new_master_host \n";
                &start_vip();
                $exit_code = 0;
            };
            if ($@) {
                warn $@;
                exit $exit_code;
            }
            exit $exit_code;
        }
        elsif ( $command eq "status" ) {
            print "Checking the Status of the script.. OK \n";
            exit 0;
        }
        else {
            &usage();
            exit 1;
        }
    }

    sub start_vip() {
        `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
    }

    sub stop_vip() {
        return 0 unless ($ssh_user);
        `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
    }

    sub usage {
        print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
    }

    2) Manual switchover script "/etc/mha/scripts/master_ip_online_change"

    #!/bin/bash
    source /root/.bash_profile

    vip='172.16.1.59/24'    #the VIP
    key='1'

    #the options are read by position, so this relies on the order in which MHA passes them
    command=`echo "$1" | awk -F = '{print $2}'`
    orig_master_host=`echo "$2" | awk -F = '{print $2}'`
    new_master_host=`echo "$7" | awk -F = '{print $2}'`
    orig_master_ssh_user=`echo "${12}" | awk -F = '{print $2}'`
    new_master_ssh_user=`echo "${13}" | awk -F = '{print $2}'`

    #the service NIC must have the same name on every node (eth1 here)
    stop_vip="ssh root@$orig_master_host /usr/sbin/ifconfig eth1:$key down"
    start_vip="ssh root@$new_master_host /usr/sbin/ifconfig eth1:$key $vip"

    if [ "$command" = 'stop' ]
    then
        echo -e "\n\n\n****************************\n"
        echo -e "Disabling the VIP - $vip on old master: $orig_master_host \n"
        $stop_vip
        if [ $? -eq 0 ]
        then
            echo "Disabled the VIP successfully"
        else
            echo "Disabling the VIP failed"
        fi
        echo -e "***************************\n\n\n"
    fi

    if [ "$command" = 'start' ] || [ "$command" = 'status' ]
    then
        echo -e "\n\n\n*************************\n"
        echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
        $start_vip
        if [ $? -eq 0 ]
        then
            echo "Enabled the VIP successfully"
        else
            echo "Enabling the VIP failed"
        fi
        echo -e "***************************\n\n\n"
    fi
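The script above picks its options out by position ($1, $2, $7, ${12}, ${13}), so it only works if the arguments always arrive in the same order. One possible alternative is to look each `--name=value` option up by name. The following is a bash sketch under that assumption; `get_opt` is a hypothetical helper, and the option names in the example are the ones listed in the usage text of the failover script.

```shell
# Hypothetical helper: print the value of --NAME=VALUE found among the
# remaining arguments; return 1 if the option is absent.
get_opt() {
    local name=$1 arg
    shift
    for arg in "$@"; do
        case $arg in
            --"$name"=*)
                printf '%s\n' "${arg#*=}"   # strip everything up to the first "="
                return 0
                ;;
        esac
    done
    return 1
}
```

Inside a switch script this could be used as `command=$(get_opt command "$@")` and `new_master_host=$(get_opt new_master_host "$@")`, which keeps working if the argument order changes.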

    3) Make the scripts executable

    chmod +x /etc/mha/scripts/master_ip_failover
    chmod +x /etc/mha/scripts/master_ip_online_change

    8. Verify that SSH trust works with masterha_check_ssh

    [root@db03 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
    Wed Apr 16 23:17:58 2020 - All SSH connection tests passed successfully.
    #this line means every SSH check passed

    9. Verify MySQL replication with masterha_check_repl (the output below means the check passed)

    [root@db03 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
    IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 172.16.1.59/24===

    Checking the Status of the script.. OK
    Wed Apr 16 27:15:58 2020 - OK.
    Wed Apr 16 27:15:58 2020 - shutdown_script is not defined.
    Wed Apr 16 27:15:58 2020 - Got exit code 0 (Not master dead).

    MySQL Replication Health is OK.

    Notes:

    #Error 1:
    Wed Apr 1 17:47:34 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] Got MySQL error when connecting 172.16.1.53(172.16.1.53:3306) :1130:Host '172.16.1.53' is not allowed to connect to this MariaDB server, but this is not a MySQL crash. Check MySQL server settings.
    Fix: on the master node, grant privileges to mhaadmin again
    grant all on *.* to 'mhaadmin'@'172.16.1.%' identified by 'mhapass';

    #Error 2:
    Wed Apr 1 17:48:05 2020 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln393] 172.16.1.52(172.16.1.52:3306): User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
    Fix: on the master node, grant privileges to slave again
    grant replication slave on *.* to 'slave'@'172.16.1.%' identified by '123456';

    #Error 3:
    Wed Apr 1 22:32:16 2020 - [info] /etc/mha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.1.51 --orig_master_ip=172.16.1.51 --orig_master_port=3306
    Wed Apr 1 22:32:16 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. Can't exec "/etc/mha/scripts/master_ip_failover": Permission denied at /usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm line 68.
    Fix: the message shows that the script /etc/mha/scripts/master_ip_failover is not executable
    chmod +x /etc/mha/scripts/*

    10. Start MHA (note: the MHA monitor exits after it performs one failover and has to be started again)

    1) First bind the VIP on the master (this is only needed once; afterwards the VIP moves automatically)

    [root@db01 ~]# /usr/sbin/ifconfig eth1:1 172.16.1.59/24

    2) Then start MHA monitoring with masterha_manager

    #start MHA
    [root@db03 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &

    #stop MHA
    [root@db03 ~]# masterha_stop --conf=/etc/mha/app1.cnf

    #check whether MHA started
    [root@db03 ~]# tail -f /var/log/mha/app1/manager.log
    #if the last line reads as follows, MHA started successfully
    Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

    3) Check the cluster status

    [root@db03 ~]# masterha_check_status --conf=/etc/mha/app1.cnf
    mha (pid:7598) is running(0:PING_OK), master:172.16.1.51
    #Note: "mha (pid:7598) is running(0:PING_OK)" in the output above means the MHA service is running normally; otherwise the output looks like "mha is stopped(1:NOT_RUNNING)."
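Since the monitor exits after one failover, a periodic check of this status line can tell you when it needs to be started again. A minimal bash sketch that classifies the two status strings quoted above follows; `mha_is_running` is a hypothetical helper, and matching on the literal text is an assumption based only on the sample output shown here.

```shell
# Hypothetical helper: succeed if the status line passed as the first argument
# reports a running monitor, using the strings quoted in the note above.
mha_is_running() {
    case $1 in
        *"is running(0:PING_OK)"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

It could be called as `mha_is_running "$(masterha_check_status --conf=/etc/mha/app1.cnf)" || echo "MHA monitor is not running"`, for example from a cron job on db03.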