repmgr cannot fail over automatically

2025/4/18 1:54:56

Stop the primary node and let the standby node take over automatically

[postgres@db223 ~]$ repmgr -f ~/repmgr/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------
1 | db223 | primary | * running | | default | 100 | 15 | host=db223 dbname=repmgr user=repmgr password=repmgr connect_timeout=2
2 | db206 | primary | - failed | ? | default | 100 | | host=db206 dbname=repmgr user=repmgr password=repmgr connect_timeout=2

WARNING: following issues were detected
- unable to connect to node "db206" (ID: 2)

HINT: execute with --verbose option to see connection error messages
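
In repmgr, automatic takeover is performed by the repmgrd daemon, and only when repmgr.conf enables it. The repmgr.conf used here is not shown, so the following is only a sketch of a sanity check that could be run on each node. The function name check_failover_conf is hypothetical; failover, promote_command and follow_command are standard repmgr.conf settings.

```shell
#!/bin/sh
# Hypothetical helper: report which automatic-failover settings are
# missing from a repmgr.conf file. Prints one line per problem and
# returns non-zero if anything is missing.
check_failover_conf() {
    conf="$1"
    rc=0
    # repmgrd only promotes a standby when failover is set to automatic.
    if ! grep -Eq "^[[:space:]]*failover[[:space:]]*=[[:space:]]*'?automatic'?" "$conf"; then
        echo "failover is not set to automatic"
        rc=1
    fi
    # Both commands are needed: one to promote, one to make the
    # remaining standbys follow the new primary.
    for key in promote_command follow_command; do
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
            echo "${key} is not set"
            rc=1
        fi
    done
    return $rc
}
```

It could be called as check_failover_conf ~/repmgr/repmgr.conf. A clean result does not prove failover will work: repmgrd must also be running on every node, and shared_preload_libraries in postgresql.conf must include 'repmgr'.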

Rejoin the old primary to the cluster

[postgres@db206 data]$ repmgr -f ~/repmgr/repmgr.conf node rejoin -d 'host=db223 port=5432 user=repmgr dbname=repmgr password=repmgr' --force-rewind
NOTICE: rejoin target is node "db223" (ID: 1)
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 1
DETAIL: rejoin target server's timeline 15 forked off current database system timeline 14 before current recovery point 120/9B171E00
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "/home/postgres/pg14/bin/pg_rewind -D '/home/postgres/pg14/data' --source-server='host=db223 dbname=repmgr user=repmgr password=repmgr connect_timeout=2'"
NOTICE: 0 files copied to /home/postgres/pg14/data
NOTICE: setting node 2's upstream to node 1
WARNING: unable to ping "host=db206 dbname=repmgr user=repmgr password=repmgr connect_timeout=2"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "/home/postgres/pg14/bin/pg_ctl -w -D '/home/postgres/pg14/data' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 1

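Whatever I tried, automatic failover still would not work. With no other option left, I rebuilt the standby from scratch, and after that it was fine (????)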

[postgres@db206 data]$ repmgr -h db223 -U repmgr -d repmgr -f /home/postgres/repmgr/repmgr.conf standby clone
NOTICE: destination directory "/home/postgres/pg14/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=db223 user=repmgr dbname=repmgr
DETAIL: current installation size is 1752 MB
INFO: replication slot usage not requested; no replication slot will be set up for this standby
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: checking and correcting permissions on existing directory "/home/postgres/pg14/data"
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
/home/postgres/pg14/bin/pg_basebackup -l "repmgr base backup" -D /home/postgres/pg14/data -h db223 -p 5432 -U repmgr -X stream
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /home/postgres/pg14/data start
HINT: after starting the server, you need to re-register this standby with "repmgr standby register --force" to update the existing node record

[postgres@db206 data]$ pg_ctl start
waiting for server to start....2023-08-18 21:25:12.514 CST [11178] LOG: redirecting log output to logging collector process
2023-08-18 21:25:12.514 CST [11178] HINT: Future log output will appear in directory "log".
done
server started

[postgres@db206 data]$ repmgr -f ~/repmgr/repmgr.conf standby register -F
INFO: connecting to local node "db206" (ID: 2)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "db206" (ID: 2) successfully registered
[postgres@db206 data]$ repmgr -f ~/repmgr/repmgr.conf cluster show
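
The output of this final cluster show is not included. As a rough sketch, the table can be scanned for nodes whose Status column does not contain "running". The function name not_running_nodes is hypothetical, and it assumes the pipe-separated layout shown in the first cluster show output.

```shell
#!/bin/sh
# Hypothetical helper: read `repmgr cluster show` output on stdin and
# print the name of every node whose Status column lacks "running".
not_running_nodes() {
    awk -F'|' '
        # Data rows start with a numeric node ID in the first column.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            name = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if ($4 !~ /running/) print name
        }
    '
}
```

A possible use is repmgr -f ~/repmgr/repmgr.conf cluster show | not_running_nodes. Against the first table above it would print db206; empty output means every listed node reports a running status.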