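Introduction
What Is Sentinel
In Redis master-slave mode, a failed master cannot serve requests until someone intervenes: an operator has to promote a slave to master by hand and then update the client configuration.
The Sentinel architecture removes this manual step from master-slave deployments.
Redis Sentinel is the high-availability solution for Redis. In production, it improves the availability of the whole Redis system.
Sentinel is essentially a Redis server running in a special mode.
In Sentinel mode, the server accepts only a small command set, chiefly PING, SENTINEL, INFO, SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, and PUNSUBSCRIBE; the exact list varies by Redis version.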
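Main Functions of Sentinel
Redis Sentinel is a distributed system that improves Redis availability: it can replace a failed node without human intervention.
A Sentinel system manages multiple Redis servers. Its main tasks are:
- Monitoring
Sentinel periodically checks whether the master and slave servers are running normally.
- Notification
When a monitored Redis server runs into a problem, Sentinel can notify an administrator or another program through an API.
- Automatic failover
When a master stops working properly, Sentinel starts a failover: it promotes one of the slaves to be the new master and reconfigures the remaining slaves to replicate from it.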
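The notification task can be observed over Pub/Sub: when a failover completes, Sentinel publishes a `+switch-master` event whose payload is `<master-name> <old-ip> <old-port> <new-ip> <new-port>`. The following is a minimal sketch of how a script might read that payload. The function name `describe_switch` is hypothetical, and the sample payload is built from the addresses used in this article rather than captured from a live system.

```shell
#!/bin/sh
# Hypothetical helper: turn the payload of a "+switch-master" event into a readable line.
# Payload layout: <master-name> <old-ip> <old-port> <new-ip> <new-port>
describe_switch() {
    # Split the payload on whitespace into the positional parameters.
    set -- $1
    if [ "$#" -ne 5 ]; then
        echo "unexpected payload" >&2
        return 1
    fi
    echo "$1: master moved from $2:$3 to $4:$5"
}

# Illustrative payload (assumed, not captured output).
describe_switch "redis-master 192.168.108.129 6379 192.168.108.131 6379"
```

To watch live events instead, one option is to run `redis-cli -h 192.168.108.129 -p 16379 PSUBSCRIBE '*'` against any Sentinel and feed the `+switch-master` payloads to a helper like this one.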
Implementation
One Master, Two Slaves, Three Sentinels
Planning Redis
- Host plan
| Hostname | IP address | Role |
| --- | --- | --- |
| redis-master | 192.168.108.129 | master + Sentinel |
| redis-slave-1 | 192.168.108.130 | slave + Sentinel |
| redis-slave-2 | 192.168.108.131 | slave + Sentinel |
- Redis directory layout
| Purpose | Directory | Notes |
| --- | --- | --- |
| Home directory | /usr/local/redis/ | |
| Extraction directory | /usr/local/ | |
| Log directory | /usr/local/redis/logs/ | |
| Data directory | /usr/local/redis/data/ | |
| Configuration directory | /usr/local/redis/conf/ | |
| PID directory | /usr/local/redis/pid/ | |
- Sentinel directory layout
| Purpose | Directory | Notes |
| --- | --- | --- |
| Log directory | /usr/local/redis_16379/logs/ | |
| Data directory | /usr/local/redis_16379/data/ | |
| Configuration directory | /usr/local/redis_16379/conf/ | |
| PID directory | /usr/local/redis_16379/pid/ | |
- Check the hosts
# Check the firewall status, then stop and disable it
systemctl status firewalld
systemctl stop firewalld
systemctl disable firewalld
- Topology diagram
Installing Redis
Hosts involved: redis-master, redis-slave-1, and redis-slave-2
# Install build dependencies
yum -y install wget gcc gcc-c++ make automake autoconf libtool libc
# Download Redis
wget http://download.redis.io/releases/redis-3.2.9.tar.gz
# Extract Redis
tar -zxvf redis-3.2.9.tar.gz -C /usr/local/
# Rename the Redis directory
mv /usr/local/redis-3.2.9 /usr/local/redis
# Build and install Redis
cd /usr/local/redis
make
make PREFIX=/usr/local/redis MALLOC=libc install
# Create the user and directories
useradd redis
mkdir -p /usr/local/redis/{data,conf,logs,pid}
chown -R redis:redis /usr/local/redis/
# Copy the default configuration file
cp /usr/local/redis/redis.conf /usr/local/redis/conf/
# Add the Redis binaries to PATH via /etc/profile
cat >> /etc/profile << 'EOF'
export PATH=$PATH:/usr/local/redis/bin/
EOF
source /etc/profile
Configuring Redis
- master and slaves
cat > /usr/local/redis/conf/redis.conf << 'EOF'
bind 0.0.0.0
port 6379
requirepass 1qaz!QAZ
# Set on every node, master included, so a demoted master can authenticate to the new master after a failover
masterauth 1qaz!QAZ
protected-mode yes
timeout 0
databases 16
daemonize yes
pidfile /usr/local/redis/pid/redis_6379.pid
loglevel notice
logfile "/usr/local/redis/logs/redis.log"
# Data directory
dir /usr/local/redis/data/
# RDB snapshot policy
save 900 1
save 300 10
save 60 10000
# Snapshot file name
dbfilename dump.rdb
EOF
Implementing Sentinel
Configuring Replication
- slave hosts
cat >> /usr/local/redis/conf/redis.conf << EOF
## Master connection settings
# Master address and port
slaveof 192.168.108.129 6379
# Master password
masterauth 1qaz!QAZ
EOF
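Once a slave is running, its replication state can be read from `INFO replication`. The sketch below shows one way to pull a single field out of that output. The helper name `info_field` is hypothetical, and the sample text is illustrative (only the field names `role`, `master_host`, `master_port`, and `master_link_status` are real INFO fields; the values are assumed for this setup).

```shell
#!/bin/sh
# Hypothetical helper: read "INFO replication" text on stdin and print the value of one field.
info_field() {
    # INFO lines look like "key:value" and end with CR LF, so strip the CR first.
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }'
}

# Illustrative text standing in for:
#   redis-cli -h 192.168.108.130 -a '<password>' INFO replication
sample='# Replication
role:slave
master_host:192.168.108.129
master_port:6379
master_link_status:up'

printf '%s\n' "$sample" | info_field role
printf '%s\n' "$sample" | info_field master_link_status
```

On a healthy slave one would expect `role` to be `slave` and `master_link_status` to be `up`; a value of `down` often points to a wrong `masterauth` or an unreachable master.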
Configuring Sentinel
- master and slave hosts
mkdir -p /usr/local/redis_16379/{conf,logs,pid,data}
cat > /usr/local/redis_16379/conf/redis.conf << EOF
bind 0.0.0.0
port 16379
daemonize yes
logfile /usr/local/redis_16379/logs/redis_16379.log
pidfile /usr/local/redis_16379/pid/redis_16379.pid
dir /usr/local/redis_16379/data
# Master to monitor; the trailing 2 is the quorum, the number of Sentinels that must agree the master is down
sentinel monitor redis-master 192.168.108.129 6379 2
# How long (in milliseconds) the master must be unreachable before this Sentinel considers it down
sentinel down-after-milliseconds redis-master 3000
# How many slaves may resync with the new master at the same time; 1 syncs them one by one
sentinel parallel-syncs redis-master 1
# Failover timeout, in milliseconds
sentinel failover-timeout redis-master 180000
# Password used to connect to the monitored Redis instances
sentinel auth-pass redis-master 1qaz!QAZ
EOF
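- Parameters
sentinel monitor redis-master 192.168.108.129 6379 2
sentinel monitor <master-name> <ip> <port> <quorum>
<master-name>: a user-chosen logical name that the Sentinels use to identify the master.
<ip>: the IP address of the master.
<port>: the port the master listens on.
<quorum>: the minimum number of Sentinels that must agree the master is unreachable before it is marked as down and a failover is started. With a value of 2, at least two Sentinels must agree. The failover itself must additionally be authorized by a majority of the Sentinels.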
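The relationship between the quorum and the majority can be worked through with a little arithmetic. This is a sketch under the assumption that a failover needs both conditions: enough reachable Sentinels to meet the quorum, and enough to form a majority of all known Sentinels. The function names `majority` and `can_failover` are hypothetical.

```shell
#!/bin/sh
# Majority of N Sentinels: the smallest count strictly greater than N/2.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Hypothetical helper: can a failover still happen with `alive` of `total` Sentinels?
# Both conditions must hold: the quorum (to agree the master is down)
# and a majority (to authorize the failover).
can_failover() {
    total=$1
    alive=$2
    quorum=$3
    [ "$alive" -ge "$quorum" ] && [ "$alive" -ge "$(majority "$total")" ]
}

majority 3    # prints 2
can_failover 3 2 2 && echo "3 Sentinels, 1 lost: failover possible"
can_failover 3 1 2 || echo "3 Sentinels, 2 lost: failover not possible"
```

With the three Sentinels and quorum of 2 used here, the deployment tolerates the loss of one Sentinel but not two.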
Starting the Redis Master and Slaves
redis-server /usr/local/redis/conf/redis.conf
Starting the Redis Sentinels
redis-sentinel /usr/local/redis_16379/conf/redis.conf
Verifying Sentinel
Verifying the Services
- Redis processes
[root@wang redis]# ps -ef | grep redis
root 33560 1 0 15:14 ? 00:00:00 redis-server 0.0.0.0:6379
root 33607 1 0 15:22 ? 00:00:00 redis-sentinel 0.0.0.0:16379 [sentinel]
root 33611 22305 0 15:22 pts/0 00:00:00 grep --color=auto redis
- Configuration
# Master host
[root@wang redis]# redis-cli -h 192.168.108.129 -p 16379
192.168.108.129:16379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redis-master,status=ok,address=192.168.108.129:6379,slaves=2,sentinels=3
# Slave hosts
[root@localhost redis]# redis-cli -h 192.168.108.130 -p 16379
192.168.108.130:16379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redis-master,status=ok,address=192.168.108.129:6379,slaves=2,sentinels=3
[root@localhost redis]# redis-cli -h 192.168.108.131 -p 16379
192.168.108.131:16379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redis-master,status=ok,address=192.168.108.129:6379,slaves=2,sentinels=3
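The `master0:` line carries everything needed for a quick health check. The following sketch extracts a single attribute from `INFO sentinel` output so that a script can compare it against the expected value. The helper name `master0_attr` is hypothetical; the sample line is the one shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read "INFO sentinel" text on stdin and print one attribute
# (name, status, address, slaves, sentinels) of the first monitored master.
master0_attr() {
    tr -d '\r' | awk -v want="$1" '
        /^master0:/ {
            sub(/^master0:/, "")
            n = split($0, pairs, ",")
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, "=")
                if (kv[1] == want) { print kv[2]; exit }
            }
        }'
}

line='master0:name=redis-master,status=ok,address=192.168.108.129:6379,slaves=2,sentinels=3'
printf '%s\n' "$line" | master0_attr status
printf '%s\n' "$line" | master0_attr sentinels
```

Against a live Sentinel, the input would come from `redis-cli -h 192.168.108.129 -p 16379 INFO sentinel`. For this topology the expected values are `status=ok`, `slaves=2`, and `sentinels=3`.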
Verifying Reads and Writes
# Writes on the slaves are rejected
192.168.108.131:6379> AUTH 1qaz!QAZ
OK
192.168.108.131:6379> set name wangmingqu
(error) READONLY You can't write against a read only slave.
192.168.108.130:6379> AUTH 1qaz!QAZ
OK
192.168.108.130:6379> set name wangmingqu
(error) READONLY You can't write against a read only slave.
# Write on the master, then read from the slaves
192.168.108.129:6379> AUTH 1qaz!QAZ
OK
192.168.108.129:6379> set name wangmingqu
OK
192.168.108.130:6379> get name
"wangmingqu"
192.168.108.131:6379> get name
"wangmingqu"
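Verifying Failover
- Master-slave switchover
# Stop the master. The pattern also matches redis-sentinel, so the Sentinel on this host stops as well
pkill -9 redis
# Observe from the slave hosts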
[root@localhost redis]# redis-cli -h 192.168.108.130 -p 16379
192.168.108.130:16379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redis-master,status=ok,address=192.168.108.131:6379,slaves=2,sentinels=3
[root@localhost redis]# redis-cli -h 192.168.108.131 -p 16379
192.168.108.131:16379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redis-master,status=ok,address=192.168.108.131:6379,slaves=2,sentinels=3
- Reads and writes
# Write on a slave
192.168.108.130:6379> AUTH 1qaz!QAZ
OK
192.168.108.130:6379> set age 18
(error) READONLY You can't write against a read only slave.
# Write on the new master
192.168.108.131:6379> AUTH 1qaz!QAZ
OK
192.168.108.131:6379> set age 18
OK
# Read from a slave
192.168.108.130:6379> get age
"18"
Sentinel Commands
Logging In to the Sentinel Command Line
redis-cli -h 192.168.108.129 -p 16379
Viewing Sentinel Information
192.168.108.129:16379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redis-master,status=ok,address=192.168.108.131:6379,slaves=2,sentinels=3
Viewing All Monitored Masters
192.168.108.129:16379> sentinel masters
1) 1) "name"
2) "redis-master"
3) "ip"
4) "192.168.108.131"
5) "port"
6) "6379"
7) "runid"
8) "06313df278d3d20060b7e12a94efaaa217547ce3"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "690"
19) "last-ping-reply"
20) "690"
21) "down-after-milliseconds"
22) "3000"
23) "info-refresh"
24) "901"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "503084"
29) "config-epoch"
30) "1"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "180000"
39) "parallel-syncs"
40) "1"
Viewing the Master of a Specific Sentinel Group
192.168.108.129:16379> sentinel master redis-master
1) "name"
2) "redis-master"
3) "ip"
4) "192.168.108.131"
5) "port"
6) "6379"
7) "runid"
8) "06313df278d3d20060b7e12a94efaaa217547ce3"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "740"
19) "last-ping-reply"
20) "740"
21) "down-after-milliseconds"
22) "3000"
23) "info-refresh"
24) "6188"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "679090"
29) "config-epoch"
30) "1"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "180000"
39) "parallel-syncs"
40) "1"
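The reply is a flat list of alternating field names and values, which is awkward to scan. When `redis-cli` output is piped, it prints one bare item per line, so the pairs can be joined back together. The sketch below does that with `paste`; the helper name `pairs_to_kv` is hypothetical, and the sample input reuses a few fields from the reply above.

```shell
#!/bin/sh
# Hypothetical helper: join alternating field/value lines (stdin) into "field=value" lines.
pairs_to_kv() {
    tr -d '\r' | paste -d '=' - -
}

# A few lines as redis-cli prints them when its output is piped (no numbering, no quotes).
printf '%s\n' name redis-master ip 192.168.108.131 port 6379 quorum 2 | pairs_to_kv
```

A possible use against a live Sentinel is `redis-cli -h 192.168.108.129 -p 16379 sentinel master redis-master | pairs_to_kv | grep -E '^(flags|num-slaves|quorum)='`, which narrows the reply down to the fields of interest.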
Viewing the Slaves of a Specific Sentinel Group
192.168.108.129:16379> sentinel slaves redis-master
1) 1) "name"
2) "192.168.108.129:6379"
3) "ip"
4) "192.168.108.129"
5) "port"
6) "6379"
7) "runid"
8) "cc9e042f8312ef446118d0171869a67ae4af579d"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "762"
19) "last-ping-reply"
20) "762"
21) "down-after-milliseconds"
22) "3000"
23) "info-refresh"
24) "227"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "766728"
29) "master-link-down-time"
30) "1709799229000"
31) "master-link-status"
32) "err"
33) "master-host"
34) "192.168.108.131"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "1"
2) 1) "name"
2) "192.168.108.130:6379"
3) "ip"
4) "192.168.108.130"
5) "port"
6) "6379"
7) "runid"
8) "cdc7797103531d2aca7ff5c7f5bbbd2d16ece9f8"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "762"
19) "last-ping-reply"
20) "762"
21) "down-after-milliseconds"
22) "3000"
23) "info-refresh"
24) "3356"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "776764"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "192.168.108.131"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "298886"
Viewing the Other Sentinels of a Specific Sentinel Group
192.168.108.129:16379> sentinel sentinels redis-master
1) 1) "name"
2) "9613171fbc3a343ce1fdaa7617471df992deb7c7"
3) "ip"
4) "192.168.108.131"
5) "port"
6) "16379"
7) "runid"
8) "9613171fbc3a343ce1fdaa7617471df992deb7c7"
9) "flags"
10) "sentinel"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "553"
19) "last-ping-reply"
20) "553"
21) "down-after-milliseconds"
22) "3000"
23) "last-hello-message"
24) "395"
25) "voted-leader"
26) "?"
27) "voted-leader-epoch"
28) "0"
2) 1) "name"
2) "66ef8fc36c93f53c757a2b595b8b47f62bb8adb1"
3) "ip"
4) "192.168.108.130"
5) "port"
6) "16379"
7) "runid"
8) "66ef8fc36c93f53c757a2b595b8b47f62bb8adb1"
9) "flags"
10) "sentinel"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "553"
19) "last-ping-reply"
20) "553"
21) "down-after-milliseconds"
22) "3000"
23) "last-hello-message"
24) "600"
25) "voted-leader"
26) "?"
27) "voted-leader-epoch"
28) "0"
Viewing the Master IP and Port of a Specific Sentinel Group
192.168.108.129:16379> sentinel get-master-addr-by-name redis-master
1) "192.168.108.131"
2) "6379"
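This command is what lets clients find the current master without a hard-coded address. The following is a rough sketch of that lookup: it asks each Sentinel in turn and returns the first usable answer. The helper names `addr_from_reply` and `find_master` and the `QUERY` variable are hypothetical; port 16379 matches the Sentinel port used in this article.

```shell
#!/bin/sh
# Hypothetical helper: read the two-line reply of SENTINEL get-master-addr-by-name
# on stdin and print it as host:port. Fails if either line is missing or empty.
addr_from_reply() {
    tr -d '\r' | awk '
        NR == 1 { host = $0 }
        NR == 2 { port = $0 }
        END { if (host == "" || port == "") exit 1; print host ":" port }'
}

# Command used to reach a Sentinel. It is overridable so the logic can be tried
# without a live deployment.
: "${QUERY:=redis-cli}"

# Hypothetical helper: ask each Sentinel in turn until one returns a master address.
# Usage: find_master <master-name> <sentinel-host>...
find_master() {
    name=$1
    shift
    for sentinel in "$@"; do
        if addr=$($QUERY -h "$sentinel" -p 16379 sentinel get-master-addr-by-name "$name" 2>/dev/null | addr_from_reply); then
            echo "$addr"
            return 0
        fi
    done
    return 1
}

# Parsing demo on the reply shown above (as printed when redis-cli output is piped).
printf '%s\n' 192.168.108.131 6379 | addr_from_reply
```

In this setup the call would look like `find_master redis-master 192.168.108.129 192.168.108.130 192.168.108.131`. Trying every Sentinel matters because the first one in the list may be the host that just failed.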
Forcing a Failover
192.168.108.129:16379> sentinel failover redis-master
OK
Forcing Sentinel to Write Its Configuration to Disk
192.168.108.129:16379> sentinel flushconfig
OK
Checking the Number of Usable Sentinels in a Sentinel Group
192.168.108.129:16379> sentinel ckquorum redis-master
OK 3 usable Sentinels. Quorum and failover authorization can be reached
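Because the reply starts with `OK` when the quorum can be reached, the command lends itself to a simple monitoring check. This is a minimal sketch that treats any reply not starting with `OK` as a failure; the helper name `quorum_ok` is hypothetical, and the sample reply is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a SENTINEL ckquorum reply (stdin) starts with "OK".
quorum_ok() {
    tr -d '\r' | head -n 1 | grep -q '^OK '
}

reply='OK 3 usable Sentinels. Quorum and failover authorization can be reached'
if printf '%s\n' "$reply" | quorum_ok; then
    echo "quorum check passed"
else
    echo "quorum check failed"
fi
```

A scheduled job could pipe `redis-cli -h 192.168.108.129 -p 16379 sentinel ckquorum redis-master` into this helper and raise an alert on a non-zero exit status.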