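High-Availability Clusters with Keepalived
I. High-Availability Clusters
1.1 Cluster Types
- LB: Load Balancing
LVS / HAProxy / nginx (http/upstream, stream/upstream)
- HA: High Availability
Databases, Redis
SPoF (Single Point of Failure): an HA cluster exists to eliminate single points of failure
- HPC: High Performance Computing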
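1.2 System Availability
SLA (Service-Level Agreement): an agreement or contract between a service provider and its customers that both sides accept, covering the quality, level, and performance of the service.
A = MTBF / (MTBF + MTTR)
99.95%: (60*24*30)*(1-0.9995) = 21.6 minutes    # downtime is usually counted per month
Common targets: 99.9%, 99.99%, 99.999%, 99.9999%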
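The downtime arithmetic above can be reproduced for other availability targets. The following is a minimal sketch, assuming a 30-day month as in the text; the function name downtime_minutes is made up for illustration and awk is used only for the floating-point math.

```shell
#!/usr/bin/env bash
# Allowed downtime per 30-day month for a given availability percentage.
# downtime_minutes is a hypothetical helper name, not part of any tool.
downtime_minutes() {
    # minutes in a month * (1 - availability)
    awk -v a="$1" 'BEGIN { printf "%.2f\n", 60 * 24 * 30 * (1 - a / 100) }'
}

for a in 99.9 99.95 99.99 99.999; do
    printf '%s%% -> %s minutes/month\n' "$a" "$(downtime_minutes "$a")"
done
```

For 99.95 the function prints 21.60, which matches the 21.6 minutes worked out above.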
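1.3 System Failures
- Hardware failure: design defects, wear-out, force majeure (unavoidable, non-human causes)
- Software failure: design defects, bugs
1.4 Implementing High Availability
- active/passive: master/backup
- active/active: dual master
- active --> HEARTBEAT --> passive
- active <--> HEARTBEAT <--> active
1.5 VRRP: Virtual Router Redundancy Protocol
- Hardware implementations: routers, layer-3 switches
- Software implementations: keepalived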
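1.5.1 VRRP Terminology
- Virtual router
- Virtual router identifier: VRID (1-255), uniquely identifies a virtual router
- VIP: Virtual IP
- VMAC: Virtual MAC (00-00-5e-00-01-VRID)
- Physical routers:
- master: the active device
- backup: the standby device
- priority: the election priority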
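The VMAC format in the list above puts the VRID in the last byte of the address. A small sketch of that mapping follows; the function name vrrp_vmac is hypothetical, and the 1-255 bound is taken from the VRRP specification rather than from this document.

```shell
#!/usr/bin/env bash
# Derive the VRRP virtual MAC (00-00-5e-00-01-VRID) from a decimal VRID.
# vrrp_vmac is an illustrative helper, not a keepalived command.
vrrp_vmac() {
    local vrid=$1
    # Accept only decimal digits in the range 1..255 (10# forces base 10).
    if ! [[ $vrid =~ ^[0-9]+$ ]] || (( 10#$vrid < 1 || 10#$vrid > 255 )); then
        echo "VRID must be a number between 1 and 255" >&2
        return 1
    fi
    # The last octet is the VRID written as two hex digits.
    printf '00-00-5e-00-01-%02x\n' "$((10#$vrid))"
}

vrrp_vmac 51
```

With virtual_router_id 51 this yields 00-00-5e-00-01-33, since 51 is 0x33.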
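1.5.2 VRRP-Related Techniques
Authentication:
- None
- Simple text authentication: pre-shared key
- MD5
Working modes:
- Master/backup: a single virtual router
- Master/master: master/backup (virtual router 1) plus backup/master (virtual router 2)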
1.6 Keepalived Files
- Package name: keepalived
- Main program: /usr/sbin/keepalived
- Main configuration file: /etc/keepalived/keepalived.conf
- Sample configuration files: /usr/share/doc/keepalived/
- Unit file: /lib/systemd/system/keepalived.service
- Environment file for the unit: /etc/sysconfig/keepalived
Warning: on RHEL 7 you may run into the following bugs
systemctl restart keepalived    # the new configuration may not take effect
systemctl stop keepalived; systemctl start keepalived    # the process may not stop and has to be killed
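II. Global Configuration
! Configuration File for keepalived
global_defs {
    notification_email {
        594233887@qq.com            # recipients of failover mails, one address per line
        timiniglee-zln@163.com
    }
    notification_email_from keepalived@KA1.timinglee.org    # sender address
    smtp_server 127.0.0.1           # mail server address
    smtp_connect_timeout 30         # mail server connection timeout
    router_id KA1.timinglee.org     # identifier of this keepalived host
                                    # the hostname is recommended; duplicate names across nodes do no harm
    vrrp_skip_check_adv_addr        # by default every advertisement is fully checked, which costs performance;
                                    # with this option the check is skipped when an advertisement comes
                                    # from the same router as the previous one
    vrrp_strict                     # strictly follow the VRRP protocol
                                    # with this option the service will not start if:
                                    # 1. no VIP is configured
                                    # 2. unicast peers are configured
                                    # 3. IPv6 addresses are used with VRRP version 2
                                    # leaving this option out is recommended
    vrrp_garp_interval 0            # delay between gratuitous ARP messages, 0 means no delay
    vrrp_gna_interval 0             # delay between unsolicited NA messages
    vrrp_mcast_group4 224.0.0.18    # IPv4 multicast group address
}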
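III. Configuring a Virtual Router
vrrp_instance VI_1 {
    state MASTER
    interface eth0              # physical interface this virtual router binds to, e.g. eth0;
                                # it may differ from the interface that carries the VIP
    virtual_router_id 51        # identifier of the virtual router, range 1-255
                                # must be unique per virtual router, otherwise the service fails to start
                                # all keepalived nodes of the same virtual router must use the same value
                                # make sure the value is unique within the same network
    priority 100                # priority of this node within the virtual router, range 1-254
                                # a larger value means higher priority; each keepalived node uses a different value
    advert_int 1                # interval between VRRP advertisements, default 1s
    authentication {            # authentication
        auth_type AH|PASS       # AH is IPsec authentication (not recommended); PASS is a simple password (recommended)
        auth_pass 1111          # pre-shared key; only the first 8 characters are used
                                # must be identical on all keepalived nodes of the same virtual router
    }
    virtual_ipaddress {         # virtual IPs; production setups may list hundreds of addresses
        # syntax: <IPADDR>/<MASK> brd <IPADDR> dev <STRING> scope <SCOPE> label <LABEL>
        172.25.254.100          # VIP without a device uses the instance interface (eth0 here); without /prefix the mask defaults to /32
        172.25.254.101/24 dev eth1
        172.25.254.102/24 dev eth2 label eth2:1
    }
}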
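IV. Preemptive and Non-Preemptive Modes
1. Non-preemptive mode: nopreempt
Preemptive mode (preempt) is the default: when the higher-priority host comes back online it takes the master role back from the lower-priority host. The VIP then moves back and forth between the KA hosts and causes network jitter. Setting nopreempt is therefore recommended, so that a recovered higher-priority host does not take the master role from the lower-priority host.
In non-preemptive mode, if the original host goes down and the VIP moves to a new host, and that new host later goes down as well, the VIP still moves back to the original host.
2. Delayed preemption: preempt_delay
preempt_delay <N>    # delay preemption by N seconds; range 0 to 1000, default 0 (300 is the value commonly shown in examples)
Note: every keepalived server must use state BACKUP, and vrrp_strict must not be enabled.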
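V. Unicast Configuration for the VIP
[!NOTE] Unicast cannot be used while vrrp_strict is enabled.
unicast_src_ip <IPADDR>     # source IP used for unicast advertisements
unicast_peer {
    <IPADDR>                # IP of the peer host that receives the unicast advertisements
    ......
}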
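VI. Keepalived Notification Scripts
1. Types of notification scripts
Script triggered when the current node becomes the master:
notify_master <STRING>|<QUOTED-STRING>
Script triggered when the current node becomes a backup:
notify_backup <STRING>|<QUOTED-STRING>
Script triggered when the current node enters the fault state:
notify_fault <STRING>|<QUOTED-STRING>
Generic notification hook; a single script can handle all three state transitions above:
notify <STRING>|<QUOTED-STRING>
Script triggered when VRRP is stopped:
notify_stop <STRING>|<QUOTED-STRING>
2. Calling the scripts
Add the following lines at the end of the vrrp_instance VI_1 block:
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"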
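The body of notify.sh is not shown in these notes. The following is a minimal sketch of what such a script could look like, assuming the mail command from the mailx package is installed; the recipient address and the function names are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical /etc/keepalived/notify.sh: mails a short message on a state change.
contact='root@localhost'    # assumed recipient, replace with a real mailbox

# Build the mail subject for a valid state; fail for anything else.
build_subject() {
    case "$1" in
        master|backup|fault)
            echo "${HOSTNAME} to be $1, vip floating" ;;
        *)
            return 1 ;;
    esac
}

notify() {
    local subject
    if ! subject=$(build_subject "$1"); then
        echo "Usage: $0 {master|backup|fault}" >&2
        return 1
    fi
    # mail -s <subject> <recipient> reads the body from stdin (mailx).
    echo "$(date +'%F %T'): vrrp transition, ${subject}" | mail -s "$subject" "$contact"
}

# Run only when a state argument is given, e.g. "notify.sh master".
if [[ $# -gt 0 ]]; then
    notify "$1"
fi
```

Keepalived passes the state as the first argument because of how the notify_master, notify_backup and notify_fault lines are written, so one script serves all three transitions.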
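VII. Dual-Master Architecture
Implementing a master/master Keepalived architecture
In the single-master (master/slave) architecture only one Keepalived host serves traffic at any time: that host is busy while the other sits idle, so utilization is low. A master/master architecture solves this.
Master/master architecture: two or more VIPs run on different keepalived servers, so that the servers provide web access in parallel and resource utilization improves.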
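VIII. High Availability for IPVS
1. IPVS configuration
1.1 Structure of a virtual server definition
virtual_server IP port {
    ...
    real_server {
        ...
    }
    real_server {
        ...
    }
    …
}
1.2 Forms of the virtual_server definition
virtual_server IP port          # IP address and port of the virtual server
virtual_server fwmark int       # IPVS firewall mark, for load balancing based on firewall marks
virtual_server group string     # use a virtual server group
1.3 Virtual server configuration
virtual_server IP port {                        # VIP and port
    delay_loop <INT>                            # interval between health checks of the backend servers
    lb_algo rr|wrr|lc|wlc|lblc|sh|dh            # scheduling algorithm
    lb_kind NAT|DR|TUN                          # cluster type, must be upper case
    persistence_timeout <INT>                   # persistent connection timeout
    protocol TCP|UDP|SCTP                       # service protocol, usually TCP
    sorry_server <IPADDR> <PORT>                # fallback server used when all RSs have failed
    real_server <IPADDR> <PORT> {               # IP and port of the RS
        weight <INT>                            # RS weight
        notify_up <STRING>|<QUOTED-STRING>      # script run when the RS comes up
        notify_down <STRING>|<QUOTED-STRING>    # script run when the RS goes down
        HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK { ... }    # health check method for this host
    }
}
# Note: closing braces must be on separate lines; writing two on the same line, such as }}, causes an error
1.4 Application-layer checks
HTTP_GET|SSL_GET {
    url {
        path <URL_PATH>             # URL to monitor
        status_code <INT>           # response code that counts as healthy, usually 200
    }
    connect_timeout <INTEGER>       # timeout of the check request, comparable to haproxy's timeout server
    nb_get_retry <INT>              # number of retries
    delay_before_retry <INT>        # delay before each retry
    connect_ip <IP ADDRESS>         # IP address of the RS that receives the health check
    connect_port <PORT>             # port of the RS that receives the health check
    bindto <IP ADDRESS>             # source address used for the health check
    bind_port <PORT>                # source port used for the health check
}
1.5 TCP checks
TCP_CHECK {
    connect_ip <IP ADDRESS>         # IP address of the RS that receives the health check
    connect_port <PORT>             # port of the RS that receives the health check
    bindto <IP ADDRESS>             # source address used for the health check
    bind_port <PORT>                # source port used for the health check
    connect_timeout <INTEGER>       # timeout of the check request, comparable to haproxy's timeout server
}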
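IX. High Availability for Other Applications: VRRP Script
1. VRRP script configuration
vrrp_script <SCRIPT_NAME> {
    script <STRING>|<QUOTED-STRING>     # a non-zero return value triggers the OPTIONS below
    OPTIONS
}
track_script {
    SCRIPT_NAME_1
    SCRIPT_NAME_2
}
2. Defining a VRRP script
vrrp_script <SCRIPT_NAME> {             # defines a check script; configured outside global_defs
    script <STRING>|<QUOTED-STRING>     # shell command or path to a script
    interval <INTEGER>                  # interval in seconds, default 1
    timeout <INTEGER>                   # timeout
    weight <INTEGER:-254..254>          # default 0
                                        # negative: when the script returns non-zero, the value is added to
                                        # the node priority, which lowers it (fall)
                                        # positive: when the script returns 0, the value is added to
                                        # the node priority, which raises it (rise)
                                        # a negative value is the usual choice
    fall <INTEGER>                      # consecutive failures before the state becomes failed; 2 or more is recommended
    rise <INTEGER>                      # consecutive successes before a failed server is marked healthy again
    user USERNAME [GROUPNAME]           # user and optional group that run the check script
    init_fail                           # start in the failed state and switch to healthy after a successful check
}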
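The weight rule described above is easy to get backwards, so here is a small sketch that computes the resulting priority. It is an illustration of the rule as stated in these notes, not keepalived's own code; the function name effective_priority is hypothetical, and clamping the result to 1-254 is an assumption based on the priority range given earlier.

```shell
#!/usr/bin/env bash
# effective_priority BASE WEIGHT SCRIPT_EXIT_CODE
# Negative weight: applied when the script fails (non-zero exit).
# Positive weight: applied when the script succeeds (exit 0).
effective_priority() {
    local base=$1 weight=$2 rc=$3
    local prio=$base
    if (( weight < 0 && rc != 0 )); then
        prio=$(( base + weight ))
    elif (( weight > 0 && rc == 0 )); then
        prio=$(( base + weight ))
    fi
    # Assumed clamp to the configurable priority range 1..254.
    if (( prio < 1 )); then prio=1; fi
    if (( prio > 254 )); then prio=254; fi
    echo "$prio"
}

# Example: priority 100 with a hypothetical weight of -30 and a failing check.
effective_priority 100 -30 1
```

With a base priority of 100, a weight of -30 and a failing script, the result is 70. A peer configured with priority 80 would then win the election, which is why the negative weight has to be larger than the priority gap between the nodes.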
3. Calling a VRRP script
vrrp_instance test {
    ... ...
    track_script {
        check_down
    }
}
Keepalived Lab
I. Virtual Router Management
1. IP configuration
[root@ka1 ~]# vmset.sh eth0 172.25.254.10 ka1.hyl.oeg
[root@ka1 ~]# systemctl stop firewalld.service
[root@ka2 ~]# vmset.sh eth0 172.25.254.20 ka2.hyl.oeg
[root@ka2 ~]# systemctl stop firewalld.service
[root@realserver1 ~]# vmset.sh eth0 172.25.254.110 realserver1.hyl.oeg
[root@realserver1 ~]# systemctl stop firewalld.service
[root@realserver2 ~]# vmset.sh eth0 172.25.254.120 realserver2.hyl.oeg
[root@realserver2 ~]# systemctl stop firewalld.service
2. Configure httpd
[root@realserver1 ~]# yum install httpd -y
[root@realserver1 ~]# echo 172.25.254.110 > /var/www/html/index.html
[root@realserver1 ~]# systemctl enable --now httpd
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
[root@realserver2 ~]# yum install httpd -y
[root@realserver2 ~]# echo 172.25.254.120 > /var/www/html/index.html
[root@realserver2 ~]# systemctl enable --now httpd
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
Test
[root@ka1 ~]# curl 172.25.254.110
172.25.254.110
[root@ka1 ~]# curl 172.25.254.120
172.25.254.120
3. Install keepalived
[root@ka1 ~]# yum install keepalived -y
4. Global configuration
Configure ka1
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
Restart the service
[root@ka1 ~]# systemctl enable --now keepalived
Created symlink from /etc/systemd/system/multi-user.target.wants/keepalived.service to /usr/lib/systemd/system/keepalived.service.
[root@ka1 ~]# systemctl restart keepalived
Check the VIP
Install tcpdump
[root@ka1 ~]# yum install tcpdump -y
Packet capture test
[root@ka1 ~]# tcpdump -i eth0 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:05:37.766822 IP 172.25.254.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 100, prio 100, authtype simple, intvl 1s, length 20
11:05:38.768163 IP 172.25.254.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 100, prio 100, authtype simple, intvl 1s, length 20
11:05:39.769328 IP 172.25.254.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 100, prio 100, authtype simple, intvl 1s, length 20
Configure ka2
[root@ka1 ~]# scp /etc/keepalived/keepalived.conf root@172.25.254.20:/etc/keepalived/keepalived.conf
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
Restart the service
[root@ka2 ~]# systemctl enable --now keepalived.service
Created symlink from /etc/systemd/system/multi-user.target.wants/keepalived.service to /usr/lib/systemd/system/keepalived.service.
[root@ka2 ~]# systemctl restart keepalived.service
[root@ka2 ~]# yum install tcpdump -y
Check the IP
II. Enabling Communication with the VIP
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
or
[root@ka1 ~]# systemctl restart keepalived
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived.service
Test
[root@ka2 ~]# ping 172.25.254.100
PING 172.25.254.100 (172.25.254.100) 56(84) bytes of data.
64 bytes from 172.25.254.100: icmp_seq=1 ttl=64 time=0.206 ms
64 bytes from 172.25.254.100: icmp_seq=2 ttl=64 time=0.441 ms
--- 172.25.254.100 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.206/0.323/0.441/0.118 ms
III. Enabling a Dedicated Log File
Edit the sysconfig file
[root@ka1 ~]# vim /etc/sysconfig/keepalived
Edit the rsyslog configuration
[root@ka1 ~]# vim /etc/rsyslog.conf
Restart the services
[root@ka1 ~]# systemctl restart keepalived.service
[root@ka1 ~]# systemctl restart rsyslog.service
Check the log file
[root@ka1 ~]# ll /var/log/keepalived.log
-rw------- 1 root root 2253 Aug 12 11:56 /var/log/keepalived.log
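The edits made in the two files are not shown in these notes. A common approach, sketched below as an assumption, is to start keepalived with -D (detailed log) and -S 6 (syslog facility local6) and to route local6 to its own file in rsyslog. The function name and the choice of facility 6 are illustrative.

```shell
#!/usr/bin/env bash
# enable_keepalived_log SYSCONFIG_FILE RSYSLOG_FILE [FACILITY]
# Hypothetical helper that applies both edits; GNU sed is assumed for -i.
enable_keepalived_log() {
    local sysconfig=$1 rsyslog=$2 facility=${3:-6}
    # keepalived options: -D logs in detail, -S N logs to syslog facility localN.
    sed -i "s|^KEEPALIVED_OPTIONS=.*|KEEPALIVED_OPTIONS=\"-D -S ${facility}\"|" "$sysconfig"
    # Add the rsyslog rule once; skip it if a localN rule is already present.
    grep -q "^local${facility}\.\*" "$rsyslog" ||
        echo "local${facility}.*    /var/log/keepalived.log" >> "$rsyslog"
}

# Intended use on a node (as root), followed by the two restarts shown above:
#   enable_keepalived_log /etc/sysconfig/keepalived /etc/rsyslog.conf
```

Running the function twice leaves a single rsyslog rule, so it is safe to repeat on both ka1 and ka2.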
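IV. Using a Separate Sub-Configuration File
Comment out the instance in the main file
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
Create the directory
[root@ka1 ~]# mkdir -p /etc/keepalived/conf.d
Create the sub-configuration file
[root@ka1 ~]# vim /etc/keepalived/conf.d/172.25.254.100.conf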
[root@ka1 ~]# systemctl restart keepalived.service
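The contents of the sub-configuration file are not reproduced in these notes. The sketch below shows one way the steps could be scripted: it writes an instance into conf.d and adds an include line to the main file. The instance values (vrid 100, priority 100, PASS authentication) follow the packet captures earlier in the lab; the label eth0:1, the /24 prefix and the function name are assumptions.

```shell
#!/usr/bin/env bash
# write_subconfig CONFDIR  (CONFDIR would be /etc/keepalived on a real node)
# Hypothetical helper: creates conf.d/172.25.254.100.conf and includes it once.
write_subconfig() {
    local confdir=$1
    local include_line="include ${confdir}/conf.d/*.conf"
    mkdir -p "${confdir}/conf.d"
    cat > "${confdir}/conf.d/172.25.254.100.conf" <<'EOF'
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 100
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.25.254.100/24 dev eth0 label eth0:1
    }
}
EOF
    # Append the include directive only if it is not already present.
    grep -qxF "$include_line" "${confdir}/keepalived.conf" 2>/dev/null ||
        echo "$include_line" >> "${confdir}/keepalived.conf"
}

# Intended use on ka1 (as root), then restart keepalived:
#   write_subconfig /etc/keepalived
```

Keeping one file per VIP makes it easier to add or remove virtual routers without touching the global section.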
V. Non-Preemptive Mode: nopreempt
Non-preemptive mode
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived.service
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived.service
Test
VI. Delayed Preemption: preempt_delay
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived.service
VII. Switching from Multicast to Unicast
Host configuration
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived.service
Test
[root@ka1 ~]# tcpdump -i eth0 -nn src host 172.25.254.10 and dst 172.25.254.20
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:57:34.119048 IP 172.25.254.10 > 172.25.254.20: VRRPv2, Advertisement, vrid 100, prio 100, authtype simple, intvl 1s, length 20
14:57:35.120354 IP 172.25.254.10 > 172.25.254.20: VRRPv2, Advertisement, vrid 100, prio 100, authtype simple, intvl 1s, length 20
14:57:35.127617 ARP, Reply 172.25.254.10 is-at 00:0c:29:5c:43:47, length 28
14:57:36.122126 IP 172.25.254.10 > 172.25.254.20: VRRPv2, Advertisement, vrid 100, prio 100, authtype simple, intvl 1s, length 20
^C
4 packets captured
4 packets received by filter
0 packets dropped by kernel
[root@ka1 ~]# systemctl stop keepalived
[root@ka2 ~]# tcpdump -i eth0 -nn src host 172.25.254.20 and dst 172.25.254.10
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:00:16.652507 IP 172.25.254.20 > 172.25.254.10: VRRPv2, Advertisement, vrid 100, prio 80, authtype simple, intvl 1s, length 20
15:00:17.653821 IP 172.25.254.20 > 172.25.254.10: VRRPv2, Advertisement, vrid 100, prio 80, authtype simple, intvl 1s, length 20
15:00:18.655321 IP 172.25.254.20 > 172.25.254.10: VRRPv2, Advertisement, vrid 100, prio 80, authtype simple, intvl 1s, length 20
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
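VIII. Mail Notification
Install the mail client
[root@ka1 ~]# yum install mailx -y
[root@ka2 ~]# yum install mailx -y
Configure the QQ mailbox
[root@ka1 ~]# vim /etc/mail.rc
Send a test mail
[root@ka1 ~]# echo test message | mail -s test 3238886180@qq.com
Notification script for Keepalived state changes
Write the script and make it executable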
[root@ka1 ~]# vim /etc/keepalived/mail.sh
[root@ka1 ~]# chmod +x /etc/keepalived/mail.sh
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
Test
[root@ka1 ~]# /etc/keepalived/mail.sh master
[root@ka1 ~]# /etc/keepalived/mail.sh backup
[root@ka2 ~]# /etc/keepalived/mail.sh backup
[root@ka2 ~]# /etc/keepalived/mail.sh fault
[root@ka1 ~]# systemctl stop keepalived
[root@ka1 ~]# systemctl restart keepalived
IX. Dual-Master Architecture
Configure the files on ka1 and ka2
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived
Test
X. keepalived + LVS
Set the VIP on the real servers
[root@realserver1 ~]# ip a a 172.25.254.100/32 dev lo
[root@realserver1 ~]# vi /etc/sysctl.d/arp.conf
[root@realserver1 ~]# scp /etc/sysctl.d/arp.conf root@172.25.254.120:/etc/sysctl.d/arp.conf
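The contents of arp.conf are not shown. Binding the VIP to lo on the real servers suggests LVS DR mode, where the real servers must not answer ARP requests for the VIP. The sketch below prints the sysctl settings commonly used for that purpose; treating this lab as DR mode, and the function name, are assumptions.

```shell
#!/usr/bin/env bash
# Print ARP-related sysctl settings commonly used on LVS DR real servers.
# render_arp_conf is a hypothetical helper name.
render_arp_conf() {
    cat <<'EOF'
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
EOF
}

# Intended use on each real server (as root):
#   render_arp_conf > /etc/sysctl.d/arp.conf && sysctl --system
render_arp_conf
```

arp_ignore = 1 makes the host reply only for addresses configured on the receiving interface, and arp_announce = 2 makes it pick the best local source address for ARP announcements, so the VIP on lo stays invisible to the rest of the network.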
Configure the files on ka1 and ka2
[root@ka1 ~]# yum install ipvsadm -y
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka2 ~]# yum install ipvsadm -y
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived
Test
XI. Switching Master/Backup Roles with a Script
[root@ka1 ~]# vim /etc/keepalived/test.sh
[root@ka1 ~]# chmod +x /etc/keepalived/test.sh
[root@ka1 ~]# cat /etc/keepalived/test.sh
#!/bin/bash
[ ! -f /mnt/lee ]
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka1 ~]# sh /etc/keepalived/test.sh
[root@ka1 ~]# echo $?
0
[root@ka1 ~]# touch /mnt/lee
[root@ka1 ~]# sh /etc/keepalived/test.sh
[root@ka1 ~]# echo $?
1
XII. High Availability for HAProxy
Enable the kernel parameter on both nodes, ka1 and ka2
[root@ka1 ~]# vim /etc/sysctl.conf
net.ipv4.ip_nonlocal_bind = 1
[root@ka1 ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
[root@ka2 ~]# vim /etc/sysctl.conf
[root@ka2 ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
Configure haproxy on ka1 and ka2
[root@ka1 ~]# yum install haproxy -y
[root@ka2 ~]# yum install haproxy -y
[root@ka1 ~]# vim /etc/haproxy/haproxy.cfg
[root@ka1 ~]# systemctl restart haproxy
[root@ka1 ~]# systemctl enable --now haproxy
Created symlink from /etc/systemd/system/multi-user.target.wants/haproxy.service to /usr/lib/systemd/system/haproxy.service.
Real server configuration
[root@realserver1 ~]# ip a d 172.25.254.100/32 dev lo
[root@realserver2 ~]# ip a d 172.25.254.100/32 dev lo
[root@realserver1 ~]# vim /etc/sysctl.d/arp.conf
[root@realserver2 ~]# vim /etc/sysctl.d/arp.conf
Test
[root@realserver1 ~]# curl 172.25.254.110
172.25.254.110
[root@realserver1 ~]# curl 172.25.254.120
172.25.254.120
keepalived configuration
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
Comment out the LVS section
[root@ka2 ~]# systemctl restart keepalived
Check the service
Access test
To address failed access when the VIP on ka1 is lost:
[root@ka1 ~]# systemctl restart haproxy
[root@ka1 ~]# killall -0 haproxy
[root@ka1 ~]# systemctl stop haproxy
[root@ka1 ~]# killall -0 haproxy
haproxy: no process found
[root@ka1 ~]# echo $?
1
[root@ka2 ~]# killall -0 haproxy
-bash: killall: command not found
[root@ka2 ~]# yum install psmisc -y
[root@ka2 ~]# killall -0 haproxy
[root@ka2 ~]# echo $?
0
Write the check script on ka1
[root@ka1 ~]# mkdir /etc/keepalived/scripts
[root@ka1 ~]# vim /etc/keepalived/scripts/haproxy.sh
[root@ka1 ~]# chmod +x /etc/keepalived/scripts/haproxy.sh
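The body of haproxy.sh is not shown. Based on the killall -0 haproxy checks above, a minimal sketch could look like the following. Passing the process name as an argument and the function name check_process are assumptions made for this example; killall comes from the psmisc package installed earlier.

```shell
#!/bin/bash
# Hypothetical /etc/keepalived/scripts/haproxy.sh
# Exit status 0 while a process with the given name exists, non-zero otherwise.
check_process() {
    # Signal 0 sends nothing; it only tests whether a matching process exists.
    killall -0 "$1" &> /dev/null
}

# Run only when a process name is given, e.g. "haproxy.sh haproxy".
if [[ $# -gt 0 ]]; then
    check_process "$1"
    exit $?
fi
```

A vrrp_script block would then call the script with haproxy as its argument and use a negative weight, so that a stopped haproxy lowers the node priority and lets the VIP move to the peer.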
Configure keepalived on ka1 and ka2
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived
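Test
Run the access test in a loop
When the service is stopped, access fails
[root@ka1 ~]# systemctl stop haproxy
When the service is started again, the access loop continues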
[root@ka1 ~]# systemctl start haproxy