MHA High Availability + Atlas Read/Write Splitting

High-Availability Architecture Options

Load balancing (some degree of high availability):
LVS, Nginx
Active/standby (highly available, but requires a switchover; single-active):
Keepalived, MHA, MMM
True high availability (multi-active, concurrent):
NDB Cluster, Oracle RAC, Sybase Cluster, InnoDB Cluster (MGR), PXC, MGC

Evolution of Master-Slave Architectures

Basic Architectures

One master, one slave
One master, multiple slaves
Dual master
Multi-level (cascaded) replication
Circular replication
MGR

Advanced Application Architectures

High-Performance Architectures

1. Read/write splitting (higher read performance):
Atlas; MySQL Router; Cobar; ProxySQL; MaxScale; MyCAT

2. Distributed architectures (higher read and write performance):
Atlas; MyCAT (open source); TDDL; InnoDB Cluster
High-Availability Architectures

1. Single-active (switchable):
MMM (Keepalived + 2 masters + 1 slave); MHA (1 master, 2 slaves); TMHA (1 master, 1 slave)
2. Multi-active:
InnoDB Cluster; MGC (MariaDB Galera Cluster); PXC (Percona XtraDB Cluster); MySQL Cluster (cf. Oracle RAC)

MHA High Availability

How It Works

Failover process when the master goes down:
1. Monitoring (node information is read from the configuration file)
System, network, and SSH connectivity
Replication state, with the master as the focus

2. Master election
(1) If the slaves differ (by position or GTID), the slave closest to the master becomes the candidate master.
(2) If the slaves are consistent, the candidate is chosen in configuration-file order.
(3) If a weight is set (candidate_master=1), that node is forced as the candidate.
1. By default, if a slave is more than 100 MB of relay logs behind the master, the weight is ignored.
2. With check_repl_delay=0, the node is forced as the candidate even if it is far behind.
3. Data compensation
(1) If SSH is reachable: compare the slaves' GTID or position with the master, then save the missing binary-log events to each slave and apply them (save_binary_logs).
(2) If SSH is unreachable: compare the relay-log differences between the slaves (apply_diff_relay_logs).
4. Failover
Promote the candidate to master and bring it into service.
The remaining slaves establish replication from the new master.
5. Application transparency (VIP)
Provided by MHA (via a failover script).
6. Failover notification (send_report)
7. Secondary data compensation (binlog server)
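The election rules in step 2 can be sketched as a toy ranking (illustrative only, not MHA source code): each row below is a hypothetical slave with its replicated position and its candidate_master flag; the flag wins outright, otherwise the highest position wins.

```shell
# Toy ranking of slaves: "host  position  candidate_master_flag".
# Sample data is made up; MHA itself reads this from SHOW SLAVE STATUS.
cat <<'EOF' | sort -k3,3nr -k2,2nr | head -1 | awk '{print $1}'
192.168.177.201 1713 0
192.168.177.202 1500 1
EOF
# prints 192.168.177.202 - the flagged node wins despite the lower position
```

With both flags at 0, the same pipeline would pick 192.168.177.201, the slave with the higher position.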

Package Contents

The Manager package includes the following tools:
masterha_manager — start MHA
masterha_stop — stop the Manager
masterha_check_ssh — check MHA's SSH configuration
masterha_check_repl — check MySQL replication status
masterha_master_monitor — detect whether the master is down
masterha_check_status — check the current MHA running state
masterha_master_switch — control failover (automatic or manual)
masterha_conf_host — add or remove configured server entries

The Node package includes the following tools (normally invoked by Manager scripts; no manual action is needed):
save_binary_logs — save and copy the master's binary logs
apply_diff_relay_logs — identify differing relay-log events and apply the missing events to the other slaves
purge_relay_logs — purge relay logs (does not block the SQL thread)
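Since automatic relay-log removal is usually disabled in an MHA setup (so apply_diff_relay_logs has material to work with), purge_relay_logs is typically scheduled from cron on each node. A hedged sketch of such a cron entry (user, password, workdir, and log paths are placeholders for this article's environment):

```shell
# Write a sample cron entry to a file (adjust credentials and paths):
cat > ./purge_relay_logs.cron <<'EOF'
0 4 * * * /usr/bin/purge_relay_logs --user=mha --password=123456 --disable_relay_log_purge --workdir=/tmp >> /tmp/purge_relay_logs.log 2>&1
EOF
cat ./purge_relay_logs.cron
```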

Setup

Environment

Master | Slave 1 | Slave 2 | MySQL version
192.168.177.200 | 192.168.177.201 | 192.168.177.202 | 5.7.41
node | node | node, manager, binlog server

Create symlinks for the key binaries

ln -s /web/soft/mysql/bin/mysqlbinlog    /usr/bin/mysqlbinlog
ln -s /web/soft/mysql/bin/mysql /usr/bin/mysql

One master, two slaves, with GTID enabled

Master:
192.168.177.200 (node)
Slaves:
192.168.177.201 (node)
192.168.177.202 (node, manager)

Configure SSH trust between all nodes

rm -rf /root/.ssh 
ssh-keygen
cd /root/.ssh
mv id_rsa.pub authorized_keys
scp -r /root/.ssh 192.168.177.201:/root
scp -r /root/.ssh 192.168.177.202:/root

Verify (on the master)

ssh 192.168.177.200 date
ssh 192.168.177.201 date
ssh 192.168.177.202 date

Slave 1

ssh 192.168.177.200 date
ssh 192.168.177.201 date
ssh 192.168.177.202 date

Slave 2

ssh 192.168.177.200 date
ssh 192.168.177.201 date
ssh 192.168.177.202 date

Install the Software

All nodes

Download MHA

https://github.com/yoshinorim/mha4mysql-manager/wiki/Downloads

node:
http://www.mysql.gr.jp/frame/modules/bwiki/index.php?plugin=attach&pcmd=open&file=mha4mysql-node-0.56-0.el6.noarch.rpm&refer=matsunobu
manager:
http://www.mysql.gr.jp/frame/modules/bwiki/index.php?plugin=attach&pcmd=open&file=mha4mysql-manager-0.56-0.el6.noarch.rpm&refer=matsunobu

Install the Node package

yum install perl-DBD-MySQL -y
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

Master

Create the dedicated MHA user

GRANT ALL PRIVILEGES ON *.* TO 'mha'@'192.168.177.%' IDENTIFIED BY '123456';
flush privileges;

After creating the user, confirm on all three nodes that it exists:
select user,host from mysql.user;

Slave 2

Install the Manager package

yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

Manager Configuration

# Create the configuration directory
mkdir -p /etc/mha/
# Create the log directory
mkdir -p /web/logs/mysql/mha/
# Edit the MHA configuration file
vim /etc/mha/mha.cnf

[server default]
#Manager working log
manager_log=/web/logs/mysql/mha/manager
#working/log directory
manager_workdir=/web/logs/mysql/mha
#location of the master's binary logs
master_binlog_dir=/web/logs/mysql/binlog/
user=mha
password=123456
#health-check interval in seconds; failover after 3 missed pings
ping_interval=2
repl_user=repl
repl_password=123456
ssh_user=root
[server1]
hostname=192.168.177.200
port=3306
[server2]
hostname=192.168.177.201
port=3306
[server3]
hostname=192.168.177.202
port=3306

Additional Parameters

Notes:
Who takes over when the master goes down?
1. If all slaves have identical logs, a new master is chosen in configuration-file order by default.
2. If the slaves' logs differ, the slave closest to the master is chosen automatically.
3. If a node has a weight set (candidate_master=1), it is preferred.
However, if that node is more than 100 MB of logs behind the master, it will still not be chosen. Combine with check_repl_delay=0 to disable the lag check and force the candidate.

#Interval between pings sent to the monitored master; failover is triggered after three missed responses
ping_interval=2
#Mark this node as the candidate master: on failover it will be promoted even if it does not hold the latest events in the cluster
candidate_master=1
#By default MHA will not choose a slave that is more than 100 MB of relay logs behind the master, because recovering it would take too long. Setting check_repl_delay=0 makes MHA ignore replication delay when electing a new master. This is useful together with candidate_master=1, since it guarantees the candidate becomes the new master during the switch.
check_repl_delay=0

SSH Trust Check

masterha_check_ssh --conf=/etc/mha/mha.cnf

[root@localhost mha]# masterha_check_ssh --conf=/etc/mha/mha.cnf
Wed Oct 16 15:36:56 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Oct 16 15:36:56 2024 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Wed Oct 16 15:36:56 2024 - [info] Reading server configuration from /etc/mha/mha.cnf..
Wed Oct 16 15:36:56 2024 - [info] Starting SSH connection tests..
Wed Oct 16 15:36:58 2024 - [debug]
Wed Oct 16 15:36:56 2024 - [debug] Connecting via SSH from root@192.168.177.200(192.168.177.200:22) to root@192.168.177.201(192.168.177.201:22)..
Wed Oct 16 15:36:57 2024 - [debug] ok.
Wed Oct 16 15:36:57 2024 - [debug] Connecting via SSH from root@192.168.177.200(192.168.177.200:22) to root@192.168.177.202(192.168.177.202:22)..
Wed Oct 16 15:36:58 2024 - [debug] ok.
Wed Oct 16 15:36:59 2024 - [debug]
Wed Oct 16 15:36:57 2024 - [debug] Connecting via SSH from root@192.168.177.202(192.168.177.202:22) to root@192.168.177.200(192.168.177.200:22)..
Wed Oct 16 15:36:58 2024 - [debug] ok.
Wed Oct 16 15:36:58 2024 - [debug] Connecting via SSH from root@192.168.177.202(192.168.177.202:22) to root@192.168.177.201(192.168.177.201:22)..
Wed Oct 16 15:36:59 2024 - [debug] ok.
Wed Oct 16 15:36:59 2024 - [debug]
Wed Oct 16 15:36:56 2024 - [debug] Connecting via SSH from root@192.168.177.201(192.168.177.201:22) to root@192.168.177.200(192.168.177.200:22)..
Wed Oct 16 15:36:57 2024 - [debug] ok.
Wed Oct 16 15:36:57 2024 - [debug] Connecting via SSH from root@192.168.177.201(192.168.177.201:22) to root@192.168.177.202(192.168.177.202:22)..
Wed Oct 16 15:36:58 2024 - [debug] ok.
Wed Oct 16 15:36:59 2024 - [info] All SSH connection tests passed successfully.

Replication Status Check

masterha_check_repl --conf=/etc/mha/mha.cnf

[root@localhost mha]# masterha_check_repl --conf=/etc/mha/mha.cnf
Wed Oct 16 15:46:48 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Oct 16 15:46:48 2024 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Wed Oct 16 15:46:48 2024 - [info] Reading server configuration from /etc/mha/mha.cnf..
Wed Oct 16 15:46:48 2024 - [info] MHA::MasterMonitor version 0.56.
Wed Oct 16 15:46:49 2024 - [info] GTID failover mode = 1
Wed Oct 16 15:46:49 2024 - [info] Dead Servers:
Wed Oct 16 15:46:49 2024 - [info] Alive Servers:
Wed Oct 16 15:46:49 2024 - [info] 192.168.177.200(192.168.177.200:3306)
Wed Oct 16 15:46:49 2024 - [info] 192.168.177.201(192.168.177.201:3306)
Wed Oct 16 15:46:49 2024 - [info] 192.168.177.202(192.168.177.202:3306)
Wed Oct 16 15:46:49 2024 - [info] Alive Slaves:
Wed Oct 16 15:46:49 2024 - [info] 192.168.177.201(192.168.177.201:3306) Version=5.7.41-log (oldest major version between slaves) log-bin:enabled
Wed Oct 16 15:46:49 2024 - [info] GTID ON
Wed Oct 16 15:46:49 2024 - [info] Replicating from 192.168.177.200(192.168.177.200:3306)
Wed Oct 16 15:46:49 2024 - [info] 192.168.177.202(192.168.177.202:3306) Version=5.7.41-log (oldest major version between slaves) log-bin:enabled
Wed Oct 16 15:46:49 2024 - [info] GTID ON
Wed Oct 16 15:46:49 2024 - [info] Replicating from 192.168.177.200(192.168.177.200:3306)
Wed Oct 16 15:46:49 2024 - [info] Current Alive Master: 192.168.177.200(192.168.177.200:3306)
Wed Oct 16 15:46:49 2024 - [info] Checking slave configurations..
Wed Oct 16 15:46:49 2024 - [info] read_only=1 is not set on slave 192.168.177.201(192.168.177.201:3306).
Wed Oct 16 15:46:49 2024 - [info] read_only=1 is not set on slave 192.168.177.202(192.168.177.202:3306).
Wed Oct 16 15:46:49 2024 - [info] Checking replication filtering settings..
Wed Oct 16 15:46:49 2024 - [info] binlog_do_db= , binlog_ignore_db=
Wed Oct 16 15:46:49 2024 - [info] Replication filtering check ok.
Wed Oct 16 15:46:49 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Wed Oct 16 15:46:49 2024 - [info] Checking SSH publickey authentication settings on the current master..
Wed Oct 16 15:46:50 2024 - [info] HealthCheck: SSH to 192.168.177.200 is reachable.
Wed Oct 16 15:46:50 2024 - [info]
192.168.177.200(192.168.177.200:3306) (current master)
+--192.168.177.201(192.168.177.201:3306)
+--192.168.177.202(192.168.177.202:3306)

Wed Oct 16 15:46:50 2024 - [info] Checking replication health on 192.168.177.201..
Wed Oct 16 15:46:50 2024 - [info] ok.
Wed Oct 16 15:46:50 2024 - [info] Checking replication health on 192.168.177.202..
Wed Oct 16 15:46:50 2024 - [info] ok.
Wed Oct 16 15:46:50 2024 - [warning] master_ip_failover_script is not defined.
Wed Oct 16 15:46:50 2024 - [warning] shutdown_script is not defined.
Wed Oct 16 15:46:50 2024 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

Start MHA

nohup masterha_manager --conf=/etc/mha/mha.cnf --remove_dead_master_conf --ignore_last_failover  < /dev/null> /web/logs/mysql/mha/manager.log 2>&1 &

Check MHA Status

masterha_check_status --conf=/etc/mha/mha.cnf

[root@localhost mha]# masterha_check_status --conf=/etc/mha/mha.cnf
mha (pid:7022) is running(0:PING_OK), master:192.168.177.200

[root@localhost mha]# mysql -umha -p123456 -h 192.168.177.200 -e "show variables like 'server_id'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id | 1 |
+---------------+-------+
[root@localhost mha]# mysql -umha -p123456 -h 192.168.177.201 -e "show variables like 'server_id'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id | 2 |
+---------------+-------+
[root@localhost mha]# mysql -umha -p123456 -h 192.168.177.202 -e "show variables like 'server_id'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id | 3 |
+---------------+-------+

Failure Simulation and Recovery

Stop the master

/etc/init.d/mysql stop

Watch the Manager log: tail -100f /web/logs/mysql/mha/manager.log
The switchover is successful only if the log ends with "successfully".
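That success check can also be scripted. A minimal sketch (the sample string below stands in for the real last line of the Manager log):

```shell
# Assume this is the last line of /web/logs/mysql/mha/manager:
last_line='Master failover to 192.168.177.201(192.168.177.201:3306) completed successfully'
case "$last_line" in
  *"completed successfully") echo switch-ok ;;      # healthy failover
  *)                         echo switch-failed ;;  # inspect the log
esac
# prints switch-ok
```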
[root@localhost mha]# cat /etc/mha/mha.cnf
[server default]
manager_log=/web/logs/mysql/mha/manager
manager_workdir=/web/logs/mysql/mha
master_binlog_dir=/web/logs/mysql/binlog/
password=123456
ping_interval=2
repl_password=123456
repl_user=repl
ssh_user=root
user=mha

[server2]
hostname=192.168.177.201
port=3306

[server3]
hostname=192.168.177.202
port=3306

mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.177.201
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000005
Read_Master_Log_Pos: 1713
Relay_Log_File: localhost-relay-bin.000002
Relay_Log_Pos: 414
Relay_Master_Log_File: mysql-bin.000005
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

[root@localhost mha]# tail -f /web/logs/mysql/mha/manager
Master 192.168.177.200(192.168.177.200:3306) is down!

Check MHA Manager logs at localhost.localdomain:/web/logs/mysql/mha/manager for details.

Started automated(non-interactive) failover.
Selected 192.168.177.201(192.168.177.201:3306) as a new master.
192.168.177.201(192.168.177.201:3306): OK: Applying all logs succeeded.
192.168.177.202(192.168.177.202:3306): OK: Slave started, replicating from 192.168.177.201(192.168.177.201:3306)
192.168.177.201(192.168.177.201:3306): Resetting slave info succeeded.
Master failover to 192.168.177.201(192.168.177.201:3306) completed successfully

Restore the old master

/etc/init.d/mysql start

Rebuild replication (run on the restored master)

tail -200f /web/logs/mysql/mha/manager   # the CHANGE MASTER statement is recorded in the Manager log
CHANGE MASTER TO MASTER_HOST='192.168.177.201', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='123456';
start slave;

Restore the Manager configuration file

vim /etc/mha/mha.cnf 
[server1]
hostname=192.168.177.200
port=3306

Start MHA

nohup masterha_manager --conf=/etc/mha/mha.cnf --remove_dead_master_conf --ignore_last_failover  < /dev/null> /web/logs/mysql/mha/manager.log 2>&1 &

Replication roles have changed

Slave 1 | Master | Slave 2 | MySQL version
192.168.177.200 | 192.168.177.201 | 192.168.177.202 | 5.7.41

MHA VIP Configuration

Parameter

master_ip_failover_script=/etc/mha/master_ip_failover
Note: /etc/mha/master_ip_failover must be prepared in advance.

Add the script

vim /etc/mha/master_ip_failover

chmod +x /etc/mha/master_ip_failover
dos2unix /etc/mha/master_ip_failover   # normalize Windows line endings

VIP failover script

#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip = '192.168.177.218/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig ens33:$key down";    # make sure the interface name matches your system

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    print "\n\nIN Master_ip_failover_command:====$command===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        usage();
        exit 1;
    }
}

sub start_vip {
    my $output = qx{ssh $ssh_user\@$new_master_host "$ssh_start_vip"};
    if ($?) {
        die "Failed to start VIP: $output";
    }
}

sub stop_vip {
    return 0 unless ($ssh_user);
    my $output = qx{ssh $ssh_user\@$orig_master_host "$ssh_stop_vip"};
    if ($?) {
        die "Failed to stop VIP: $output";
    }
}

sub usage {
    print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

Add the script to the Manager configuration

vim /etc/mha/mha.cnf
[server default]
master_ip_failover_script=/etc/mha/master_ip_failover

When configuring the VIP for the first time, bind it manually on the master

# Bind the VIP on the master by hand. The interface must match the one in the script: ens33:1 (1 is the value of $key).
yum install net-tools -y    # only if the ifconfig command is missing
ifconfig ens33:1 192.168.177.218/24
# or, equivalently:
ip address add 192.168.177.218/24 dev ens33:1
ifconfig

Restart the Manager

masterha_stop --conf=/etc/mha/mha.cnf
nohup masterha_manager --conf=/etc/mha/mha.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /web/logs/mysql/mha/manager.log 2>&1 &

# Check the status
masterha_check_status --conf=/etc/mha/mha.cnf
mha (pid:8652) is running(0:PING_OK), master:192.168.177.201

Verification

Check the VIP on the master

Stop the master

/etc/init.d/mysql stop

Check the VIP on the old master and on the slaves

Restore the old master

# Start MySQL
/etc/init.d/mysql start
# Rebuild replication
CHANGE MASTER TO MASTER_HOST='192.168.177.200', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='123456';
# Start the slave
start slave;

Start the Manager and check its status

[root@localhost mha]# nohup masterha_manager --conf=/etc/mha/mha.cnf --remove_dead_master_conf --ignore_last_failover  < /dev/null> /web/logs/mysql/mha/manager.log 2>&1 &
[1] 8652
[root@localhost mha]# masterha_check_status --conf=/etc/mha/mha.cnf
mha (pid:8652) is running(0:PING_OK), master:192.168.177.200

Email Notification

Parameter

report_script=/etc/mha/mail

Add to the Manager configuration

mkdir -p /etc/mha/mail/
vim /etc/mha/mha.cnf
[server default]
report_script=/etc/mha/mail/send

Restart MHA

masterha_stop --conf=/etc/mha/mha.cnf
nohup masterha_manager --conf=/etc/mha/mha.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /web/logs/mysql/mha/manager.log 2>&1 &

Binlog Server

Guards against the case where the Manager cannot reach the failed master, which would leave the slaves with divergent or missing data; requires an additional server.
mkdir /web/data/mysql/binserver

vim /etc/mha/mha.cnf
[binlog1]
no_master=1
hostname=192.168.177.202
#must not be the same as the original binlog directory
master_binlog_dir=/web/data/mysql/binserver/

Pull the master's binlogs

# Pull the current master's binlogs (start at 00000X; later binlogs follow automatically in order)
# Enter the directory created above
cd /web/data/mysql/binserver/
mysqlbinlog -R --host=192.168.177.201 --user=mha --password=123456 --raw --stop-never mysql-bin.000009 &

Note:
The starting log must be the binary log the slaves are currently reading from:
mysql> show slave status \G
*************************** 1. row ***************************
Master_Host: 192.168.177.201
Master_Log_File: mysql-bin.000009

-R: read the binary log from the remote server (short for --read-from-remote-server).
--host=192.168.177.201: IP address of the master to connect to.
--user=mha: user name for the connection.
--password=123456: password for the connection.
--raw: write the binary log in raw binary format.
--stop-never: keep running and continuously read newly produced log events instead of stopping.

Start MHA

nohup masterha_manager --conf=/etc/mha/mha.cnf --remove_dead_master_conf --ignore_last_failover  < /dev/null> /web/logs/mysql/mha/manager.log 2>&1 &

Handling a Failure

When the master goes down, the binlog server stops automatically, and so does the Manager.
Recovery steps:
1. Delete the old binlogs and pull the new master's binlogs into the binlog server.
2. Update the binlog server section of the configuration file.
3. Start MHA again.

Enable Multi-Threaded (Parallel) Replication on the Slaves

-- Number of worker threads the slave uses to apply events in parallel
SET GLOBAL slave_parallel_workers = 4;
-- slave_parallel_type: type of parallel replication
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
-- Restart the replication threads
STOP SLAVE;
START SLAVE;
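SET GLOBAL changes are lost when the server restarts. A sketch of persisting the same two settings in my.cnf (written to a sample file here; merge into your real /etc/my.cnf — option names assume MySQL 5.7):

```shell
# Build a sample my.cnf fragment with the parallel-replication settings
cat > ./my.cnf.sample <<'EOF'
[mysqld]
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 4
EOF
grep -c 'slave_parallel' ./my.cnf.sample   # prints 2
```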

Atlas Read/Write Splitting

MHA+Atlas

Introduction

Atlas is a MySQL-protocol data middleware layer developed and maintained by the Web Platform Infrastructure team at Qihoo 360. It is based on mysql-proxy 0.8.2, with optimizations and additional features.
Inside 360, the MySQL workloads running behind Atlas serve several billion read/write requests per day.
Download:
https://github.com/Qihoo360/Atlas/releases
Notes:
1. Atlas only runs on 64-bit systems.
2. On CentOS 5.x install Atlas-XX.el5.x86_64.rpm; on CentOS 6.x install Atlas-XX.el6.x86_64.rpm.
3. The backend MySQL version should be above 5.1; MySQL 5.6 or later is recommended.

Download

https://objects.githubusercontent.com/github-production-release-asset-2e65be/15279980/39ac0594-8600-11e4-9ad6-7fc203cad2de?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241018%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241018T030903Z&X-Amz-Expires=300&X-Amz-Signature=5bfd9678cb16fee876e6f6a3af89b858a41f5840485524604c559ab8da9bc290&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3DAtlas-2.2.1.el6.x86_64.rpm&response-content-type=application%2Foctet-stream

Installation and Configuration

Install

rpm -ivh Atlas-2.2.1.el6.x86_64.rpm

# Create the log directory
mkdir -p /web/logs/mysql/mysql-proxy/

cd /usr/local/mysql-proxy/
mv test.cnf test.cnf.bak

Master | Slave 1 | Slave 2
192.168.177.200 | 192.168.177.201 | 192.168.177.202

Encrypt the password

# In the Atlas bin directory, encrypt the password:
[root@localhost bin]# ./encrypt 123456
/iZxz+0GRoA=

Configure

vim test.cnf

[mysql-proxy]
#user name and password for the Atlas admin interface
admin-username = atlas
admin-password = 123456
#backend that handles writes (here: the MHA VIP)
proxy-backend-addresses = 192.168.177.218:3306
#read-only backends (the slaves)
proxy-read-only-backend-addresses = 192.168.177.201:3306,192.168.177.202:3306
pwds = repl:/iZxz+0GRoA=,mha:/iZxz+0GRoA=
daemon = true
keepalive = true
event-threads = 8
log-level = message
log-path = /web/logs/mysql/mysql-proxy/
#whether to log SQL statements
sql-log=ON
proxy-address = 0.0.0.0:33060
admin-address = 0.0.0.0:2345
charset=utf8

Start

/usr/local/mysql-proxy/bin/mysql-proxyd test start

Verify

ps -ef |grep proxy
netstat -lntp |grep 33060
tcp 0 0 0.0.0.0:33060 0.0.0.0:* LISTEN 55015/mysql-proxy

netstat -lntp |grep 2345
tcp 0 0 0.0.0.0:2345 0.0.0.0:* LISTEN 55015/mysql-proxy

Test

mysql -u mha -p123456 -h 192.168.177.202 -P 33060
# read (routed to a slave)
select @@server_id;

# write (statements inside an explicit transaction are routed to the master)
mysql> begin;select @@server_id;commit;

Creating a Production User

A developer requests an application user deploy (select, update, insert) with password 123456, allowed to log in from the 192 network segment.
1. Create the user on the master:
grant select,update,insert on *.* to deploy@'192.168.177.%' identified by '123456';
2. Add the production user in Atlas:
/usr/local/mysql-proxy/bin/encrypt 123456    # generate the encrypted password
vim test.cnf
pwds = repl:/iZxz+0GRoA=,mha:/iZxz+0GRoA=,deploy:/iZxz+0GRoA=

/usr/local/mysql-proxy/bin/mysql-proxyd test restart
mysql -u deploy -p123456 -h 192.168.177.202 -P 33060

Basic Atlas Administration

Connect to the admin interface

mysql -uatlas -p123456 -h127.0.0.1 -P2345

Show help

mysql> select * from help;
+----------------------------+---------------------------------------------------------+
| command | description |
+----------------------------+---------------------------------------------------------+
| SELECT * FROM help | shows this help |
| SELECT * FROM backends | lists the backends and their state |
| SET OFFLINE $backend_id | offline backend server, $backend_id is backend_ndx's id |
| SET ONLINE $backend_id | online backend server, ... |
| ADD MASTER $backend | example: "add master 127.0.0.1:3306", ... |
| ADD SLAVE $backend | example: "add slave 127.0.0.1:3306", ... |
| REMOVE BACKEND $backend_id | example: "remove backend 1", ... |
| SELECT * FROM clients | lists the clients |
| ADD CLIENT $client | example: "add client 192.168.1.2", ... |
| REMOVE CLIENT $client | example: "remove client 192.168.1.2", ... |
| SELECT * FROM pwds | lists the pwds |
| ADD PWD $pwd | example: "add pwd user:raw_password", ... |
| ADD ENPWD $pwd | example: "add enpwd user:encrypted_password", ... |
| REMOVE PWD $pwd | example: "remove pwd user", ... |
| SAVE CONFIG | save the backends to config file |
| SELECT VERSION | display the version of Atlas |
+----------------------------+---------------------------------------------------------+

List all backend nodes

mysql> SELECT * FROM backends;
+-------------+----------------------+-------+------+
| backend_ndx | address | state | type |
+-------------+----------------------+-------+------+
| 1 | 192.168.177.218:3306 | up | rw |
| 2 | 192.168.177.201:3306 | up | ro |
| 3 | 192.168.177.202:3306 | up | ro |
+-------------+----------------------+-------+------+

Remove a node dynamically

REMOVE BACKEND 3;

Add a node dynamically

ADD SLAVE 192.168.177.202:3306;

Save

SAVE CONFIG;

Automatic Sharding

To use Atlas's sharding feature, first set the tables parameter in test.cnf.
Format: database.table.sharding-column.subtable-count
For example: a database named school with a table stu, sharded on the id column into 2 subtables, is written as school.stu.id.2. Additional sharded tables are separated by commas.
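A sketch of the routing rule this implies (assumption: Atlas shards an integer id by simple modulo over the subtable count, with subtables named stu_0 and stu_1 — verify against your Atlas version before relying on it):

```shell
# Which subtable does id=11 land in under school.stu.id.2?
id=11
echo "stu_$(( id % 2 ))"   # prints stu_1: rows with odd ids land in subtable stu_1
```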

Other Read/Write Splitting Middleware

MySQL Router ---> MySQL (official)
ProxySQL ---> Percona
MaxScale ---> MariaDB

MHA High Availability + Atlas Read/Write Splitting
https://yftxhy.site/2024/10/18/MHA高可用+Atlas读写分离/
Author: Taozi
Published: October 18, 2024