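Introduction to Patroni
Patroni is a template for PostgreSQL high availability built on a distributed configuration store such as ZooKeeper, etcd, or Consul. It is written in Python and can be used to create and customize HA solutions. Patroni keeps cluster state in the distributed key-value store and uses it to elect a new primary when the current primary fails. Replication relies on PostgreSQL's built-in streaming replication, in either synchronous or asynchronous mode.
Because cluster state is recorded in etcd, which accepts a write only when a majority of its members agree, running etcd on several nodes (an odd number, three in this article) guards against split-brain. The setup is also highly automated: Patroni can initialize PostgreSQL instances automatically, tries to bring a STANDBY instance back up after it has been shut down, and elects a new primary automatically when the primary goes down.
etcd is a distributed key-value store written in Go and based on the Raft protocol. Each Patroni process monitors its local PostgreSQL instance and writes the state to etcd. Every Patroni process can read and write the keys in etcd, which is how it learns the state of the PostgreSQL instances on the other hosts.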
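The split-brain protection mentioned above comes from Raft's majority rule: a write, such as Patroni's leader key, succeeds only if more than half of the etcd members acknowledge it. The following is a minimal sketch of that arithmetic; the function names are hypothetical and exist only for illustration.

```python
def quorum(members: int) -> int:
    """Smallest number of etcd members that forms a Raft majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} members: quorum={quorum(n)}, "
              f"tolerated failures={fault_tolerance(n)}")
```

A three-member cluster tolerates one failure, and a fourth member adds no extra tolerance, which is why etcd clusters are usually sized with an odd number of members.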
Environment
- Operating system: CentOS Linux 7.9
- Database version: PostgreSQL 13.1
- Python version: Python 3.8
- etcd version: 3.3
- Patroni version: 2.0.1
Deployment plan
Hostname | IP | Components | Notes |
PG1 | 192.168.21.135 | PostgreSQL, Patroni, etcd | MASTER |
PG2 | 192.168.21.136 | PostgreSQL, Patroni, etcd | STANDBY |
PG3 | 192.168.21.137 | PostgreSQL, Patroni, etcd | STANDBY |
Deployment
1. Install PostgreSQL and set up streaming replication
Omitted.
2. Install the etcd cluster
# PG1 node
sudo yum install -y etcd
vim /etc/etcd/etcd.conf
ETCD_NAME="node1"
ETCD_DATA_DIR="/var/lib/etcd/node1.etcd"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.21.135:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.21.135:2379"
ETCD_LISTEN_PEER_URLS="http://192.168.21.135:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.21.135:2380"
ETCD_INITIAL_CLUSTER="node1=http://192.168.21.135:2380,node2=http://192.168.21.136:2380,node3=http://192.168.21.137:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
# PG2 node
sudo yum install -y etcd
vim /etc/etcd/etcd.conf
ETCD_NAME="node2"
ETCD_DATA_DIR="/var/lib/etcd/node2.etcd"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.21.136:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.21.136:2379"
ETCD_LISTEN_PEER_URLS="http://192.168.21.136:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.21.136:2380"
ETCD_INITIAL_CLUSTER="node1=http://192.168.21.135:2380,node2=http://192.168.21.136:2380,node3=http://192.168.21.137:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
# PG3 node
sudo yum install -y etcd
vim /etc/etcd/etcd.conf
ETCD_NAME="node3"
ETCD_DATA_DIR="/var/lib/etcd/node3.etcd"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.21.137:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.21.137:2379"
ETCD_LISTEN_PEER_URLS="http://192.168.21.137:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.21.137:2380"
ETCD_INITIAL_CLUSTER="node1=http://192.168.21.135:2380,node2=http://192.168.21.136:2380,node3=http://192.168.21.137:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
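The three etcd.conf files above differ only in the node name and IP address, while ETCD_INITIAL_CLUSTER must be identical on every node. As a sketch of that pattern, the snippet below renders the file for one node from a single node table. The `NODES` table and `render_etcd_conf` helper are illustrative assumptions, not part of etcd.

```python
# Node table taken from the deployment plan; the dict name is hypothetical.
NODES = {
    "node1": "192.168.21.135",
    "node2": "192.168.21.136",
    "node3": "192.168.21.137",
}


def render_etcd_conf(name: str, nodes: dict) -> str:
    """Render /etc/etcd/etcd.conf content for the member called `name`."""
    ip = nodes[name]
    # Every member must list the same peers, so build this once from the table.
    initial_cluster = ",".join(
        f"{member}=http://{addr}:2380" for member, addr in nodes.items()
    )
    client_urls = f"http://127.0.0.1:2379,http://{ip}:2379"
    lines = [
        f'ETCD_NAME="{name}"',
        f'ETCD_DATA_DIR="/var/lib/etcd/{name}.etcd"',
        f'ETCD_LISTEN_CLIENT_URLS="{client_urls}"',
        f'ETCD_ADVERTISE_CLIENT_URLS="{client_urls}"',
        f'ETCD_LISTEN_PEER_URLS="http://{ip}:2380"',
        f'ETCD_INITIAL_ADVERTISE_PEER_URLS="http://{ip}:2380"',
        f'ETCD_INITIAL_CLUSTER="{initial_cluster}"',
        'ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"',
        'ETCD_INITIAL_CLUSTER_STATE="new"',
    ]
    return "\n".join(lines) + "\n"
```

Calling `print(render_etcd_conf("node2", NODES))` produces the PG2 file shown above; generating the files this way avoids copy-and-paste mistakes in the peer list.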
3. Install Patroni
# Python is built from source here
# Install the build dependencies
yum install -y libffi-devel libuuid libuuid-devel
# Install Python
tar -xzvf python-3.8.7.tgz
cd python-3.8.7
./configure
make -j 7
make -j 7 install
rm -rf /usr/bin/python
ln -s /usr/local/bin/python3 /usr/bin/python
# Fix yum, which still requires Python 2
vim /usr/bin/yum
Change #!/usr/bin/python to #!/usr/bin/python2
# Install Patroni
pip3 install psycopg2-binary
pip3 install patroni[etcd]
4. Create the Patroni configuration file
# PG1 node
mkdir /etc/patroni
vim /etc/patroni/patroni.yml
scope: etcd_patroni_pg
namespace: /data/
name: pg1
restapi:
  listen: 192.168.21.135:8008
  connect_address: 192.168.21.135:8008
etcd:
  host: 192.168.21.135:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        listen_addresses: "*"
        port: 5432
        max_connections: 2000
        hot_standby: "on"
        max_locks_per_transaction: 128
        max_prepared_transactions: 2000
        max_replication_slots: 10
        max_wal_senders: 10
        max_worker_processes: 256
        track_commit_timestamp: "on"
        wal_keep_size: 0
        wal_level: replica
        wal_log_hints: "on"
        log_directory: "log"
        log_destination: "stderr"
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.21.135:5432
  data_dir: /data/pgdata
  bin_dir: /opt/pg13/bin
  pgpass: /home/postgres/.pgpass
  authentication:
    replication:
      username: replica
      password: replica
    superuser:
      username: postgres
      password: root
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
# PG2 node
mkdir /etc/patroni
vim /etc/patroni/patroni.yml
scope: etcd_patroni_pg
namespace: /data/
name: pg2
restapi:
  listen: 192.168.21.136:8008
  connect_address: 192.168.21.136:8008
etcd:
  host: 192.168.21.136:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        listen_addresses: "*"
        port: 5432
        max_connections: 2000
        hot_standby: "on"
        max_locks_per_transaction: 128
        max_prepared_transactions: 2000
        max_replication_slots: 10
        max_wal_senders: 10
        max_worker_processes: 256
        track_commit_timestamp: "on"
        wal_keep_size: 0
        wal_level: replica
        wal_log_hints: "on"
        log_directory: "log"
        log_destination: "stderr"
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.21.136:5432
  data_dir: /data/pgdata
  bin_dir: /opt/pg13/bin
  pgpass: /home/postgres/.pgpass
  authentication:
    replication:
      username: replica
      password: replica
    superuser:
      username: postgres
      password: root
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
# PG3 node
mkdir /etc/patroni
vim /etc/patroni/patroni.yml
scope: etcd_patroni_pg
namespace: /data/
name: pg3
restapi:
  listen: 192.168.21.137:8008
  connect_address: 192.168.21.137:8008
etcd:
  host: 192.168.21.137:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        listen_addresses: "*"
        port: 5432
        max_connections: 2000
        hot_standby: "on"
        max_locks_per_transaction: 128
        max_prepared_transactions: 2000
        max_replication_slots: 10
        max_wal_senders: 10
        max_worker_processes: 256
        track_commit_timestamp: "on"
        wal_keep_size: 0
        wal_level: replica
        wal_log_hints: "on"
        log_directory: "log"
        log_destination: "stderr"
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.21.137:5432
  data_dir: /data/pgdata
  bin_dir: /opt/pg13/bin
  pgpass: /home/postgres/.pgpass
  authentication:
    replication:
      username: replica
      password: replica
    superuser:
      username: postgres
      password: root
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
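The timing values under bootstrap.dcs are related to one another. The Patroni documentation asks that loop_wait + 2 * retry_timeout stay at or below ttl, and maximum_lag_on_failover is expressed in bytes. The sketch below checks the values used in this article; the helper names are hypothetical.

```python
def timing_is_consistent(ttl: int, loop_wait: int, retry_timeout: int) -> bool:
    """Check the documented rule: loop_wait + 2 * retry_timeout <= ttl."""
    return loop_wait + 2 * retry_timeout <= ttl


def lag_limit_mib(maximum_lag_on_failover: int) -> float:
    """Convert maximum_lag_on_failover from bytes to MiB."""
    return maximum_lag_on_failover / (1024 * 1024)


# Values from the configuration files in this article.
print(timing_is_consistent(ttl=30, loop_wait=10, retry_timeout=10))
print(lag_limit_mib(1048576))
```

With ttl 30, loop_wait 10 and retry_timeout 10 the rule holds exactly (10 + 20 = 30), and a replica lagging more than 1 MiB behind is not eligible for promotion during a failover.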
5. Register Patroni as a systemd service that starts at boot
vim /etc/systemd/system/patroni.service
[Unit]
Description=patroni - a high-availability PostgreSQL
Documentation=https://patroni.readthedocs.io/en/latest/index.html
After=syslog.target network.target etcd.service
Wants=network-online.target
[Service]
Type=simple
User=postgres
Group=postgres
PermissionsStartOnly=true
ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536
KillMode=process
KillSignal=SIGINT
Restart=on-abnormal
RestartSec=30s
TimeoutSec=0
[Install]
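WantedBy=multi-user.target
# Reload systemd and enable the service at boot
sudo systemctl daemon-reload
sudo systemctl enable patroni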
6. Start the cluster
Start the database:
pg_ctl start -D $PGDATA
Start etcd:
sudo systemctl start etcd
Start Patroni:
sudo systemctl start patroni
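Verification
1. Check the status of the cluster nodes
patronictl -c /etc/patroni/patroni.yml list
- Role: the node's current role in the cluster;
- TL: the timeline the node is currently on;
- Lag in MB: the replication lag between the standby and the primary, in MB.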
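The same information can also be read programmatically from Patroni's REST API. The sketch below assumes the `GET /cluster` endpoint on the restapi port (8008 in this article) and a response containing a `members` list with `name`, `role`, `state`, `timeline` and `lag` fields; treat those field names as assumptions to verify against your Patroni version. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_cluster(api_base: str, timeout: float = 3.0) -> dict:
    """Fetch the cluster view from a Patroni REST API, e.g. http://192.168.21.135:8008."""
    with urlopen(f"{api_base}/cluster", timeout=timeout) as resp:
        return json.load(resp)


def summarize(cluster: dict) -> list:
    """Return (name, role, state, timeline, lag) tuples, one per member."""
    return [
        (m.get("name"), m.get("role"), m.get("state"),
         m.get("timeline"), m.get("lag", 0))
        for m in cluster.get("members", [])
    ]


def find_leader(cluster: dict):
    """Return the name of the member reported as leader, or None."""
    for member in cluster.get("members", []):
        if member.get("role") in ("leader", "master"):
            return member.get("name")
    return None
```

For example, `find_leader(fetch_cluster("http://192.168.21.135:8008"))` would return the name of the current primary, provided the node's REST API is reachable.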
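2. Trigger a manual switchover
Pick an available standby node and promote it to primary:
patronictl -c /etc/patroni/patroni.yml switchover
Check the cluster status:
patronictl -c /etc/patroni/patroni.yml list
The old primary is automatically demoted to a standby and follows the new primary, and the TL (timeline) value changes because the promotion starts a new timeline.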
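3. Manually stop Patroni on the primary node
Another failover takes place. After the Patroni service is started again, the cluster status shows that pg2 automatically follows the new primary, pg3.
When a node is out of sync with the primary, or is not running properly, it can be reinitialized as follows so that it rejoins the cluster:
patronictl -c /etc/patroni/patroni.yml reinit etcd_patroni_pg
1. Select the name of the member to reinitialize: pg2
2. Are you sure you want to reinitialize member pg2? [y/N]: y
3. Success: reinitialization performed for member pg2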
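4. Manually stop the PostgreSQL instance on the primary node
As this test shows, once the primary instance is stopped, Patroni quickly detects that the primary is down and first tries to restart it. If the restart fails, a failover is triggered.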
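Binding a VIP
Applications can connect to a Patroni cluster by listing several IPs: the client first checks whether the instance it reached is the MASTER, carries on if it is, and otherwise tries the next IP. Alternatively, Patroni can bind a VIP to the MASTER node; when a switchover or failover happens, the VIP moves to the new primary.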
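The multi-IP approach described above maps onto libpq's multi-host connection strings: with `target_session_attrs=read-write`, libpq (version 10 or later) tries each listed host in turn and keeps the first one that accepts writes, which is the primary. The following sketch only builds such a connection string; `build_dsn` is a hypothetical helper, and the database and user names are placeholders.

```python
def build_dsn(hosts, port=5432, dbname="postgres", user="postgres") -> str:
    """Build a libpq keyword/value DSN that only accepts a writable (primary) host."""
    host_list = ",".join(hosts)
    # libpq accepts one port per host, in the same order as the host list.
    port_list = ",".join(str(port) for _ in hosts)
    return (
        f"host={host_list} port={port_list} dbname={dbname} user={user} "
        "target_session_attrs=read-write"
    )


print(build_dsn(["192.168.21.135", "192.168.21.136", "192.168.21.137"]))
```

The resulting string can be passed to a libpq-based driver, for example `psycopg2.connect(build_dsn([...]))`, assuming the driver is linked against libpq 10 or newer. Compared with a VIP, this needs no extra privileges on the database hosts, but every client must know all node addresses.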
1. Modify the Patroni configuration file on every node (PG1 is shown; adjust the name and IP addresses on the other nodes)
scope: etcd_patroni_pg
namespace: /data/
name: pg1
restapi:
  listen: 192.168.21.135:8008
  connect_address: 192.168.21.135:8008
etcd:
  host: 192.168.21.135:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        listen_addresses: "*"
        port: 5432
        max_connections: 2000
        hot_standby: "on"
        max_locks_per_transaction: 128
        max_prepared_transactions: 2000
        max_replication_slots: 10
        max_wal_senders: 10
        max_worker_processes: 256
        track_commit_timestamp: "on"
        wal_keep_size: 0
        wal_level: replica
        wal_log_hints: "on"
        log_directory: "log"
        log_destination: "stderr"
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.21.135:5432
  data_dir: /data/pgdata
  bin_dir: /opt/pg13/bin
  authentication:
    replication:
      username: replica
      password: replica
    superuser:
      username: postgres
      password: root
  pgpass: /home/postgres/.pgpass
  callbacks:
    on_start: /etc/patroni/patroni_callback.sh
    on_stop: /etc/patroni/patroni_callback.sh
    on_role_change: /etc/patroni/patroni_callback.sh
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Note:
callbacks:
  on_start: /etc/patroni/patroni_callback.sh
  on_stop: /etc/patroni/patroni_callback.sh
  on_role_change: /etc/patroni/patroni_callback.sh
This block, placed in the postgresql section, registers the shell script that binds the VIP.
2. The shell script
#!/bin/bash
# Patroni calls this script with three arguments: callback name, role, cluster scope.
readonly cb_name=$1
readonly role=$2
readonly scope=$3

function usage() {
    echo "Usage: $0 <on_start|on_stop|on_role_change> <role> <scope>"
    exit 1
}

echo "this is patroni callback $cb_name $role $scope"

case $cb_name in
    on_stop)
        # PostgreSQL stopped on this node: release the VIP
        sudo ip addr del 192.168.21.250/24 dev ens33 label ens33:1
        sudo arping -q -A -c 1 -I ens33 192.168.21.250
        #sudo iptables -F
        ;;
    on_start)
        ;;
    on_role_change)
        if [[ $role == 'master' ]]; then
            # Promoted to primary: bind the VIP and announce it with a gratuitous ARP
            sudo ip addr add 192.168.21.250/24 brd 192.168.21.255 dev ens33 label ens33:1
            sudo arping -q -A -c 1 -I ens33 192.168.21.250
            #sudo iptables -F
        elif [[ $role == 'slave' ]] || [[ $role == 'replica' ]] || [[ $role == 'logical' ]]; then
            # Demoted to standby: release the VIP
            sudo ip addr del 192.168.21.250/24 dev ens33 label ens33:1
            sudo arping -q -A -c 1 -I ens33 192.168.21.250
            #sudo iptables -F
        fi
        ;;
    *)
        usage
        ;;
esac
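To make the decision logic of the callback easier to check without touching a network interface, the same mapping from callback name and role to a VIP action can be expressed as a small Python function. This is only an illustrative sketch: `vip_action` and `ip_command` are hypothetical names, the VIP and interface are the ones used in this article, and unlike the shell script an unknown callback name simply yields no action here.

```python
VIP_CIDR = "192.168.21.250/24"   # VIP used in this article
BROADCAST = "192.168.21.255"
DEVICE = "ens33"                 # network interface used in this article


def vip_action(cb_name: str, role: str):
    """Return "add", "del" or None, following the shell script's case statement."""
    if cb_name == "on_stop":
        return "del"
    if cb_name == "on_role_change":
        if role == "master":
            return "add"
        if role in ("slave", "replica", "logical"):
            return "del"
    return None


def ip_command(action):
    """Build the `ip addr` argument list for an action without executing it."""
    label = f"{DEVICE}:1"
    if action == "add":
        return ["ip", "addr", "add", VIP_CIDR, "brd", BROADCAST,
                "dev", DEVICE, "label", label]
    if action == "del":
        return ["ip", "addr", "del", VIP_CIDR, "dev", DEVICE, "label", label]
    return None


print(ip_command(vip_action("on_role_change", "master")))
```

Keeping the decision separate from the command execution makes it possible to dry-run the callback for every callback name and role before it is wired into Patroni.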
Summary
Patroni is a tool for building highly available PostgreSQL clusters. It automates primary/standby switchover, data replication, and cluster health checks.