导言
本文主要从测试的角度来阐明group_replication_unreachable_majority_timeout和group_replication_member_expel_timeout参数对集群网络分区的影响,首先这里放一张大图,为多年前学习MGR的时候留下的,也就是这两个参数对集群的影响,也是本文测试重点。
在使用mgr的过程中,我们会经常看到以下报错,大概意思是:“由于网络故障,成员被逐出mgr集群,并将该成员状态更改为ERROR”。这因为mgr集群节点间心跳检测超时,少数派节点被驱逐导致的。
2022-07-31T13:07:30.464239-00:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2022-07-31T13:07:37.458761-00:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
下面结合2个具体案例,分析以下3个参数配置对mgr网络故障的影响和mgr的故障检测流程。集群拓扑:单主模式
节点 IP地址 主机名 状态
node1 172.16.86.61 mysqlwjhtest01 PRIMARY
node2 172.16.86.62 mysqlwjhtest02 SECONDARY
node3 172.16.86.63 mysqlwjhtest03 SECONDARY
本次测试主要包括两步:
- 模拟网络分区,看它对集群各节点的影响。
- 恢复网络连接,看看各节点又是如何反应的。
一、 group_replication_unreachable_majority_timeout参数设置为0
3个相关测试参数
#适用于8.0.21以上版本
group_replication_member_expel_timeout=30 #可疑节点超过30秒被驱逐集群
group_replication_unreachable_majority_timeout=0 #网络分区后少数派节点网络不可达超时时间
group_replication_autorejoin_tries=3 #error状态节点重新加入集群的次数
案例1:模拟secondary节点网络中断
1.1 模拟网络分区故障,在 node3 上执行(当前node3为secondary节点)。通过iptables 命令断开 node3 与 node1、node2 之间的网络连接。
iptables -A INPUT -p tcp -s 172.16.86.61 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.61 -j DROP
iptables -A INPUT -p tcp -s 172.16.86.62 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.62 -j DROP
命令执行完 5s(mgr心跳检测时间,固定值,无法修改)后,各个节点开始响应。
- 首先查看多数派节点(node1 、node2)日志及集群状态信息。显示 node3 处于 UNREACHABLE 状态。
[Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
root@localhost: 19:38: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | UNREACHABLE | SECONDARY | 8.0.28 |
| mysqlwjhtest02 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | ONLINE | PRIMARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
日志显示,检测到node3网络不可达以后,等过了30秒(参数group_replication_member_expel_timeout=30控制)以后,node3被驱逐出集群。集群状态视图只有2个节点了。
2023-05-24T19:38:39.413051+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
2023-05-24T19:39:09.413491+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-24T19:39:09.468173+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-24T19:39:09.468299+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-24T19:39:09.468320+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-24T19:39:09.468336+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 29726 0}, boot key synode is {88e4dbe0 29715 0}, configured event horizon=10, my node identifier is 0'
2023-05-24T19:39:10.399352+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-24T19:39:10.404044+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Group is able to support up to communication protocol version 8.0.27'
2023-05-24T19:39:10.404139+08:00 0 [Warning] [MY-011499] [Repl] Plugin group_replication reported: 'Members removed from the group: mysqlwjhtest03:3307'
2023-05-24T19:39:10.404219+08:00 0 [System] [MY-011503] [Repl] Plugin group_replication reported: 'Group membership changed to mysqlwjhtest02:3307, mysqlwjhtest01:3307 on view 16849117016333239:4.'
2023-05-24T19:39:20.951668+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failure reading from fd=-1 n=18446744073709551615 from 172.16.86.63:13307'
此时查看多数派节点(node1 、node2)状态信息,已经看不到node3节点了。
root@localhost: 20:02: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest02 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | ONLINE | PRIMARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
2 rows in set (0.00 sec)
- 然后查看少数派节点(node3)日志及集群状态信息,显示node1,node2 处于 UNREACHABLE 状态。
[Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
[Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
[ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
并且在整个过程中,集群状态信息显示node1和node2一直处于UNREACHABLE状态。(这是由于group_replication_unreachable_majority_timeout=0,后面展开聊这个参数。)
root@localhost: 19:54: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest02 | UNREACHABLE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | UNREACHABLE | PRIMARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
1.2 恢复网络连接
接下来我们尝试恢复 node3 与 node1、node2 之间的网络。
iptables -F
- 查看原少数派节点(node3)日志,显示心跳检测到node1、node2网络恢复正常,然后auto-rejoin再次加入集群,GCS通信恢复正常,应用xcom cache,更新change master等。
2023-05-24T20:20:52.475071+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.61:13307'
2023-05-24T20:20:53.655956+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.62:13307'
2023-05-24T20:21:03.214971+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 is reachable again.'
2023-05-24T20:21:03.215075+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-24T20:21:03.215100+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-24T20:21:19.209028+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 2 and my address is 172.16.86.63:13307'
2023-05-24T20:21:20.208934+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
2023-05-24T20:21:20.209098+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-24T20:21:20.209131+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
2023-05-24T20:21:23.097685+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 is reachable again.'
2023-05-24T20:21:23.097871+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-24T20:21:23.097916+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-24T20:21:23.107732+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-24T20:21:23.107836+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-24T20:21:23.107854+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-24T20:21:23.107872+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 29726 0}, boot key synode is {88e4dbe0 29715 0}, configured event horizon=10, my node identifier is 4294967295'
2023-05-24T20:21:26.125137+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending a request to a remote XCom failed. Please check the remote node log for more details.'
2023-05-24T20:21:26.125323+08:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2023-05-24T20:21:26.125534+08:00 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2023-05-24T20:21:26.126663+08:00 5344 [System] [MY-013373] [Repl] Plugin group_replication reported: 'Started auto-rejoin procedure attempt 1 of 3'
2023-05-24T20:21:26.127111+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-24T20:21:26.127169+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Failed to push into XCom.'
2023-05-24T20:21:26.291624+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2023-05-24T20:21:26.293673+08:00 13 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2023-05-24T20:21:26.294631+08:00 13 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2023-05-24T20:21:26.294877+08:00 11 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
2023-05-24T20:21:26.296001+08:00 5344 [Note] [MY-011673] [Repl] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2023-05-24T20:21:26.299638+08:00 5344 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Debug messages will be sent to: asynchronous::/data/mysql/data/GCS_DEBUG_TRACE'
2023-05-24T20:21:26.300015+08:00 5344 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-24T20:21:26.300062+08:00 5344 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-24T20:21:26.300132+08:00 5344 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] SSL was not enabled'
2023-05-24T20:21:26.301794+08:00 5346 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 3942, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-24T20:21:26.331516+08:00 5344 [Note] [MY-011670] [Repl] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-05-24T20:21:26.331583+08:00 5348 [Note] [MY-010581] [Repl] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relaylog/mysql-relay-bin-group_replication_applier.000002' position: 4213
2023-05-24T20:21:26.473164+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using XCom as Communication Stack for XCom'
2023-05-24T20:21:26.473594+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] XCom initialized and ready to accept incoming connections on port 13307'
2023-05-24T20:21:26.473792+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Successfully connected to the local XCom via anonymous pipe'
2023-05-24T20:21:26.474619+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] TCP_NODELAY already set'
2023-05-24T20:21:26.474690+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully connected to peer 172.16.86.61:13307. Sending a request to be added to the group'
此时查看原少数派节点(node3)集群状态信息,也恢复正常。
root@localhost: 20:20: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest02 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | ONLINE | PRIMARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
- 查看原多数派节点(node1 、node2)日志,大致就是与node3建立连接,然后集群恢复
2023-05-24T20:21:26.124082+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Node has already been removed: 172.16.86.63:13307 16849122691765501.'
2023-05-24T20:21:26.522275+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Adding new node to the configuration: 172.16.86.63:13307'
2023-05-24T20:21:26.522513+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-24T20:21:26.522574+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-24T20:21:26.522609+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-24T20:21:26.522624+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 2 host 172.16.86.63:13307'
2023-05-24T20:21:26.522641+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 32296 0}, boot key synode is {88e4dbe0 32285 0}, configured event horizon=10, my node identifier is 0'
2023-05-24T20:21:26.522899+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sender task disconnected from 172.16.86.63:13307'
2023-05-24T20:21:26.522931+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connecting to 172.16.86.63:13307'
2023-05-24T20:21:26.523349+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.63:13307'
2023-05-24T20:21:27.452375+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-24T20:21:27.453687+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Received an XCom snapshot request from 172.16.86.63:13307'
上面案例是mysql8.0 mgr secondary节点网络故障过程中的mgr集群变化。最后看到,网络恢复后集群自行恢复,整个过程无需DBA干预。
案例2:模拟primary节点网络中断
2.1 继续模拟网络分区故障,在 node1 上执行(当前node1为primary节点)。通过iptables 命令断开 node1 与 node2、node3 之间的网络连接。
iptables -A INPUT -p tcp -s 172.16.86.62 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.62 -j DROP
iptables -A INPUT -p tcp -s 172.16.86.63 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.63 -j DROP
date
2023年 05月 27日 星期六 09:29:46 CST
命令执行完 5后,各个节点开始响应。
- 首先查看多数派节点(node2 、node3)日志集群状态信息
primary节点故障后会触发主库切换,我们详细看一下时间线:
- 09:29:46 网络中断
- 09:29:51 5秒后,心跳检测超时
- 09:30:21 35秒后,提升node3为主节点
023-05-27T09:29:51.229977+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
2023-05-27T09:30:20.947030+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T09:30:20.947163+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T09:30:20.947191+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T09:30:20.947217+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this co
nfiguration is {88e4dbe0 309739 1}, boot key synode is {88e4dbe0 309728 1}, configured event horizon=10, my node identifier is 1'
2023-05-27T09:30:21.689907+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Group is able to support up to communication protocol version 8.0.2
7'
2023-05-27T09:30:21.689994+08:00 0 [Warning] [MY-011499] [Repl] Plugin group_replication reported: 'Members removed from the group: mysqlwjhtest01:3307'
2023-05-27T09:30:21.690006+08:00 0 [System] [MY-011500] [Repl] Plugin group_replication reported: 'Primary server with address mysqlwjhtest01:3307 left the group. Electin
g new Primary.'
2023-05-27T09:30:21.690097+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member gtid_executed: debbaad4-1fd7-4652-80a4-5c87cfde0db
6:1-16'
2023-05-27T09:30:21.690108+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member applier channel received_transaction_set: debbaad4
-1fd7-4652-80a4-5c87cfde0db6:1-16'
2023-05-27T09:30:21.690123+08:00 0 [System] [MY-011507] [Repl] Plugin group_replication reported: 'A new primary with address mysqlwjhtest03:3307 was elected. The new pri
mary will execute all previous group transactions before allowing writes.'
2023-05-27T09:30:21.690509+08:00 0 [System] [MY-011503] [Repl] Plugin group_replication reported: 'Group membership changed to mysqlwjhtest03:3307, mysqlwjhtest02:3307 on
view 16849117016333239:10.'
2023-05-27T09:30:21.691815+08:00 6189 [System] [MY-013731] [Repl] Plugin group_replication reported: 'The member action "mysql_disable_super_read_only_if_primary" for eve
nt "AFTER_PRIMARY_ELECTION" with priority "1" will be run.'
2023-05-27T09:30:21.692135+08:00 6189 [System] [MY-011566] [Repl] Plugin group_replication reported: 'Setting super_read_only=OFF.'
2023-05-27T09:30:21.692186+08:00 6189 [System] [MY-013731] [Repl] Plugin group_replication reported: 'The member action "mysql_start_failover_channels_if_primary" for eve
nt "AFTER_PRIMARY_ELECTION" with priority "10" will be run.'
2023-05-27T09:30:21.693182+08:00 6181 [Note] [MY-011485] [Repl] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection.'
2023-05-27T09:30:21.693200+08:00 68187 [System] [MY-011510] [Repl] Plugin group_replication reported: 'This server is working as primary member.'
2023-05-27T09:30:31.213720+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failure reading from fd=-1 n=18446744073709551615 from 172.16.86.61
:13307'
多数派节点(node2 、node3)集群状态信息显示,node3节点已经被选举为新主。
#node2、node3显示状态
root@localhost: 09:30: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE | PRIMARY | 8.0.28 |
| mysqlwjhtest02 | ONLINE | SECONDARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
2 rows in set (0.01 sec)
- 查看少数派节点(node1)日志。
node1提供的日志信息较少,只是在心跳检测超时后阻塞了数据写入,这里并没有提到阻塞多长时间。
2023-05-27T09:29:51.277851+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-27T09:29:51.277990+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
2023-05-27T09:29:51.278007+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
查看node1集群状态信息。我们发现网络分区后,少数派节点(node1)依然处于"online primary"状态。
#node1显示状态
root@localhost: 09:30: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | UNREACHABLE | SECONDARY | 8.0.28 |
| mysqlwjhtest02 | UNREACHABLE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | ONLINE | PRIMARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
所以,结合对比多数派节点集群信息,网络分区后,node1、node3均为"online primary"状态,但是少数派节点(node1)会话阻塞了所有更新,即node1处于只读状态。如果mgrvip切换检测逻辑只依赖于"online primary"状态时,这里会有问题。
此时我们在node1尝试写入数据,发现被阻塞。这是因为node1分区后只有自己一个节点,属于少数派节点,不满足组复制的多数派原则,符合预期。
root@localhost: 10:10: [(none)]> create database db1;
...... 一直卡在这里
2.2 恢复网络连接
接下来我们尝试恢复 node1 与 node2、node3 之间的网络。
iptables -F
- 对比查看少数派节点(node1)和多数派节点(node2、node3)上的集群状态变化。
在这里我发现了比较有意思的事情,首先各个节点的UNREACHABLE状态变为ONLINE。但此时少数派节点(node1)集群信息显示,node1依然是"online primary"状态,然后过了两三秒,原少数派节点(node1)中的primary节点才由node1变为node3。从日志来看,网络恢复后节点从UNREACHABLE状态变为ONLINE,然后由于XCOM层配置的改变,最终报错后,重新加入集群。以下是我在原少数派节点(node1)监控到的集群状态变化:
#首先,网络刚刚恢复时,primary为node1
root@localhost: 10:17: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest02 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | ONLINE | PRIMARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
#大概两三秒后primary变为node3。
root@localhost: 10:24: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE | PRIMARY | 8.0.28 |
| mysqlwjhtest02 | ONLINE | SECONDARY | 8.0.28 |
| mysqlwjhtest01 | ONLINE | SECONDARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
原少数派节点(node1)网络恢复过程中的日志
2023-05-27T10:17:07.490608+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.63:13307'
2023-05-27T10:17:08.491033+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.62:13307'
2023-05-27T10:17:08.586063+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-27T10:17:08.586486+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 is reachable again.'
2023-05-27T10:17:08.586546+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-27T10:17:09.202537+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-27T10:17:20.197898+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-27T10:17:20.198008+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
2023-05-27T10:17:20.198026+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
2023-05-27T10:17:20.815474+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-27T10:17:20.815730+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-27T10:17:38.361435+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-27T10:17:38.361659+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 is reachable again.'
2023-05-27T10:17:40.489269+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:40.489357+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T10:17:40.489370+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T10:17:40.489382+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 309739 1}, boot key synode is {88e4dbe0 309728 1}, configured event horizon=10, my node identifier is 4294967295'
2023-05-27T10:17:43.526534+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending a request to a remote XCom failed. Please check the remote node log for more details.'
2023-05-27T10:17:43.526695+08:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2023-05-27T10:17:43.526856+08:00 0 [Warning] [MY-011630] [Repl] Plugin group_replication reported: 'Due to a plugin error, some transactions were unable to be certified and will now rollback.'
2023-05-27T10:17:43.526929+08:00 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2023-05-27T10:17:43.527329+08:00 53521 [ERROR] [MY-011615] [Repl] Plugin group_replication reported: 'Error while waiting for conflict detection procedure to finish on session 53521'
2023-05-27T10:17:43.527405+08:00 53521 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
2023-05-27T10:17:43.533122+08:00 53521 [Note] [MY-010914] [Server] Aborted connection 53521 to db: 'unconnected' user: 'root' host: 'localhost' (Error on observer while running replication hook 'before_commit').
2023-05-27T10:17:43.533134+08:00 53627 [System] [MY-011565] [Repl] Plugin group_replication reported: 'Setting super_read_only=ON.'
2023-05-27T10:17:43.533523+08:00 53628 [System] [MY-013373] [Repl] Plugin group_replication reported: 'Started auto-rejoin procedure attempt 1 of 3'
2023-05-27T10:17:43.533650+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-27T10:17:43.533682+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Failed to push into XCom.'
2023-05-27T10:17:44.837043+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2023-05-27T10:17:44.838114+08:00 13 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2023-05-27T10:17:44.838433+08:00 13 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2023-05-27T10:17:44.838544+08:00 11 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
2023-05-27T10:17:44.839024+08:00 53628 [Note] [MY-011673] [Repl] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2023-05-27T10:17:44.840989+08:00 53628 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Debug messages will be sent to: asynchronous::/data/mysql/data/GCS_DEBUG_TRACE'
2023-05-27T10:17:44.841282+08:00 53628 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-27T10:17:44.841313+08:00 53628 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-27T10:17:44.841357+08:00 53628 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] SSL was not enabled'
2023-05-27T10:17:44.842115+08:00 53630 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 2046, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-27T10:17:44.860923+08:00 53628 [Note] [MY-011670] [Repl] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-05-27T10:17:44.861050+08:00 53632 [Note] [MY-010581] [Repl] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relaylog/mysql-relay-bin-group_replication_applier.000002' position: 2317
2023-05-27T10:17:45.006131+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using XCom as Communication Stack for XCom'
2023-05-27T10:17:45.006450+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] XCom initialized and ready to accept incoming connections on port 13307'
2023-05-27T10:17:45.006546+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Successfully connected to the local XCom via anonymous pipe'
2023-05-27T10:17:45.007262+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] TCP_NODELAY already set'
2023-05-27T10:17:45.007298+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully connected to peer 172.16.86.62:13307. Sending a request to be added to the group'
2023-05-27T10:17:45.007314+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending add_node request to a peer XCom node'
2023-05-27T10:17:47.166146+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Node has not booted. Requesting an XCom snapshot from node number 1 in the current configuration'
2023-05-27T10:17:47.167161+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Installing requested snapshot. Importing all incoming configurations.'
2023-05-27T10:17:47.167358+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.167393+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Creating new server node 0 host 172.16.86.61:13307'
2023-05-27T10:17:47.167502+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Creating new server node 1 host 172.16.86.62:13307'
2023-05-27T10:17:47.167613+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 36985 0}, boot key synode is {88e4dbe0 36974 0}, configured event horizon=10, my node identifier is 0'
2023-05-27T10:17:47.167699+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.167723+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-27T10:17:47.167747+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-27T10:17:47.167761+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Creating new server node 2 host 172.16.86.63:13307'
2023-05-27T10:17:47.167866+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 37375 0}, boot key synode is {88e4dbe0 37364 0}, configured event horizon=10, my node identifier is 0'
2023-05-27T10:17:47.167964+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.167980+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T10:17:47.167993+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T10:17:47.168009+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 309739 1}, boot key synode is {88e4dbe0 309728 1}, configured event horizon=10, my node identifier is 4294967295'
2023-05-27T10:17:47.168089+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.168104+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T10:17:47.168117+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T10:17:47.168131+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 2 host 172.16.86.61:13307'
2023-05-27T10:17:47.168146+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 312618 0}, boot key synode is {88e4dbe0 312607 0}, configured event horizon=10, my node identifier is 2'
2023-05-27T10:17:47.168161+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Finished snapshot installation. My node number is 2'
2023-05-27T10:17:47.168269+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member has joined the group. Local port: 13307'
2023-05-27T10:17:47.168610+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connecting to 172.16.86.62:13307'
2023-05-27T10:17:47.168910+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.62:13307'
2023-05-27T10:17:47.168950+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connecting to 172.16.86.63:13307'
2023-05-27T10:17:47.169306+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.63:13307'
2023-05-27T10:17:47.658374+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] This server adjusted its communication protocol to 8.0.27 in order to join the group.'
2023-05-27T10:17:47.658436+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Group is able to support up to communication protocol version 8.0.27'
2023-05-27T10:17:47.660220+08:00 53628 [System] [MY-013375] [Repl] Plugin group_replication reported: 'Auto-rejoin procedure attempt 1 of 3 finished. Member was able to join the group.'
2023-05-27T10:17:48.661179+08:00 0 [System] [MY-013471] [Repl] Plugin group_replication reported: 'Distributed recovery will transfer data using: Incremental recovery from a group donor'
2023-05-27T10:17:48.661739+08:00 53647 [Note] [MY-011576] [Repl] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2023-05-27T10:17:48.661780+08:00 0 [System] [MY-011503] [Repl] Plugin group_replication reported: 'Group membership changed to mysqlwjhtest03:3307, mysqlwjhtest02:3307, mysqlwjhtest01:3307 on view 16849117016333239:11.'
2023-05-27T10:17:48.693433+08:00 53647 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='mysqlwjhtest02', master_port= 3307, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-27T10:17:48.727090+08:00 53647 [Note] [MY-011580] [Repl] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor 62a5b927-fa01-11ed-82ce-0050569c05d6 at mysqlwjhtest02 port: 3307.'
2023-05-27T10:17:48.727967+08:00 53648 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2023-05-27T10:17:48.730254+08:00 53648 [System] [MY-010562] [Repl] Slave I/O thread for channel 'group_replication_recovery': connected to master 'repl@mysqlwjhtest02:3307',replication started in log 'FIRST' at position 4
2023-05-27T10:17:48.733307+08:00 53649 [Note] [MY-010581] [Repl] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relaylog/mysql-relay-bin-group_replication_recovery.000001' position: 4
2023-05-27T10:17:48.757900+08:00 53647 [Note] [MY-011585] [Repl] Plugin group_replication reported: 'Terminating existing group replication donor connection and purging the corresponding logs.'
2023-05-27T10:17:48.758896+08:00 53649 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_recovery' exiting, replication stopped in log 'mysql-bin.000001' at position 6217
2023-05-27T10:17:48.759258+08:00 53648 [Note] [MY-011026] [Repl] Slave I/O thread killed while reading event for channel 'group_replication_recovery'.
2023-05-27T10:17:48.759324+08:00 53648 [Note] [MY-010570] [Repl] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'mysql-bin.000001', position 6217
2023-05-27T10:17:48.796509+08:00 53647 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='mysqlwjhtest02', master_port= 3307, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-27T10:17:48.831253+08:00 0 [System] [MY-011490] [Repl] Plugin group_replication reported: 'This server was declared online within the replication group.'
2023-05-27T10:29:27.705642+08:00 53768 [Note] [MY-013730] [Server] 'wait_timeout' period of 300 seconds was exceeded for `root`@`localhost`. The idle time since last command was too long.
2023-05-27T10:29:27.705830+08:00 53768 [Note] [MY-010914] [Server] Aborted connection 53768 to db: 'unconnected' user: 'root' host: 'localhost' (The client was disconnected by the server because of inactivity.).
以上两个案例中,我将多数派网络不可达超时参数(group_replication_unreachable_majority_timeout)设置为0,所以心跳检测超时后,可疑节点被标记为unreachable。因为设置了可疑节点被驱逐时间为30秒(group_replication_member_expel_timeout=30)。所以在网络中断35秒后,多数派节点集群中的可疑节点被驱逐。而少数派节点集群中的可疑节点一直处于unreachable状态。
二、参数group_replication_unreachable_majority_timeout对集群的影响
为了便于观察,这里设置为80秒,然后重复案例2的步骤。
group_replication_member_expel_timeout=30 #与上面案例保持不变,设为30秒
group_replication_unreachable_majority_timeout=80 #设置为80秒
注意理解,参数group_replication_unreachable_majority_timeout表示多数派节点不可达超时时间。即网络故障分区后,它只对少数派节点有效。
此时primary节点是node3,所以断开node3与另外两个节点的网络。
iptables -A INPUT -p tcp -s 172.16.86.62 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.62 -j DROP
iptables -A INPUT -p tcp -s 172.16.86.61 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.61 -j DROP
date
#恢复网络:iptables -F
观察主节点网络故障过程中各节点集群状态
- 13:15:56 模拟网络中断
- 13:16:01 5秒时,心跳检测网络中断,少数派节点(node3)日志显示所有更新将被阻塞85秒,并将另外两个节点状态标记为unreachalbe。同时,多数派节点(node1、node2)集群状态视图将node3标记为unreachalbe。
- 13:16:31 35秒时,多数派节点(node1、node2)日志显示,node3对逐出集群,同时选举新主。 少数派节点无日志输出,此时少数派节点(node3)视图的primary依然是自己(但是所有更新被阻塞中)。
- 13:17:26 85秒时,少数派节点(node3)更改自己集群视图状态,由online primary改为error。 多数派节点无日志输出。
- 13:17:56 115秒时,少数派节点(node3)更新集群状态视图,踢出node1、node2节点信息。
5秒后,心跳检测网络中断,所有节点检测到网络不可达,并更新集群状态视图
#少数派节点(node3)
2023-05-27T13:16:01.716803+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
2023-05-27T13:16:01.716934+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-27T13:16:01.716951+08:00 0 [ERROR] [MY-011496] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked for the next 85 seconds. Unless contact with the majority is restored, after this time the member will error out and leave the group. It is possible to use group_replication_force_members to force a new group membership.'
2023-05-27T13:17:26.724015+08:00 71231 [ERROR] [MY-011711] [Repl] Plugin group_replication reported: 'This member could not reach a majority of the members for more than 85 seconds. The member will now leave the group as instructed by the group_replication_unreachable_majority_timeout option.'
#多数派节点(node1、node2)
2023-05-27T13:16:01.719345+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
35秒后,多数派节点(node1、node2)将node3逐出集群,并选择node2为新主库,更新状态视图。 故障节点无日志输出。
2023-05-27T13:16:32.706137+08:00 0 [Warning] [MY-011499] [Repl] Plugin group_replication reported: 'Members removed from the group: mysqlwjhtest03:3307'
2023-05-27T13:16:32.706150+08:00 0 [System] [MY-011500] [Repl] Plugin group_replication reported: 'Primary server with address mysqlwjhtest03:3307 left the group. Electing new Primary.'
2023-05-27T13:16:32.706259+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member gtid_executed: debbaad4-1fd7-4652-80a4-5c87cfde0db6:1-19'
2023-05-27T13:16:32.706272+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member applier channel received_transaction_set: debbaad4-1fd7-4652-80a4-5c87cfde0db6:1-19'
2023-05-27T13:16:32.706284+08:00 0 [System] [MY-011507] [Repl] Plugin group_replication reported: 'A new primary with address mysqlwjhtest02:3307 was elected. The new primary will execute all previous group transactions before allowing writes.'
85秒后,少数派节点(node3)更改自己状态,由online改为error。多数派节点无日志输出。
2023-05-27T13:17:26.724015+08:00 71231 [ERROR] [MY-011711] [Repl] Plugin group_replication reported: 'This member could not reach a majority of the members for more than 85 seconds. The member will now leave the group as instructed by the group_replication_unreachable_majority_timeout option.'
2023-05-27T13:17:26.724483+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-27T13:17:26.724560+08:00 71231 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2023-05-27T13:17:26.725179+08:00 71231 [System] [MY-011565] [Repl] Plugin group_replication reported: 'Setting super_read_only=ON.'
2023-05-27T13:17:26.725340+08:00 71231 [Note] [MY-011647] [Repl] Plugin group_replication reported: 'Going to wait for view modification'
115秒以后,少数派节点(node3)从自己的集群状态视图中踢出node1、node2节点信息,然后auto-rejoin尝试加入集群(如果在85s-115s间网络未恢复,则加入失败)。
2023-05-27T13:17:56.296998+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Timeout while waiting for the group communication engine to exit!'
2023-05-27T13:17:56.297075+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member has failed to gracefully leave the group.'
2023-05-27T13:17:56.396287+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2023-05-27T13:17:56.396530+08:00 72720 [System] [MY-013373] [Repl] Plugin group_replication reported: 'Started auto-rejoin procedure attempt 1 of 3'
2023-05-27T13:17:56.397353+08:00 71913 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2023-05-27T13:17:56.397914+08:00 71913 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2023-05-27T13:17:56.398188+08:00 71911 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
2023-05-27T13:17:56.398830+08:00 72720 [Note] [MY-011673] [Repl] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2023-05-27T13:17:56.400717+08:00 72720 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Debug messages will be sent to: asynchronous::/data/mysql/data/GCS_DEBUG_TRACE'
2023-05-27T13:17:56.400984+08:00 72720 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-27T13:17:56.401014+08:00 72720 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the allowlist. It is mandatory that it is added.'
少数派节点(node3)集群视图变化
#5秒后
root@localhost: 14:05: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;system date;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest01 | UNREACHABLE | SECONDARY | 8.0.28 |
| mysqlwjhtest03 | ONLINE | PRIMARY | 8.0.28 |
| mysqlwjhtest02 | UNREACHABLE | SECONDARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
#85秒后
root@localhost: 14:07: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;system date;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest01 | UNREACHABLE | SECONDARY | 8.0.28 |
| mysqlwjhtest03 | ERROR | | 8.0.28 |
| mysqlwjhtest02 | UNREACHABLE | SECONDARY | 8.0.28 |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
#115秒后
root@localhost: 14:08: [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;system date;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ERROR | | 8.0.28 |
+----------------+--------------+-------------+----------------+
1 row in set (0.01 sec)
测试结果分析
场景1:Secondary节点发生故障
- 网络中断5秒时,心跳检测超时,少数派节点将另外两个失联节点状态变为unreachable。多数派节点也将对放状态变为unreachable。
- 网络中断35秒时,多数派节点将可疑节点逐出集群。
- 网络中断85秒时,少数派节点(故障节点)将自己的状态改为error。(如果在85秒前网络恢复,则集群自动恢复)
- 网络中断115秒时,少数派节点(故障节点)将两个失联节点逐出集群。(如果在115秒前网络恢复,则在115秒时mgr先逐出可疑节点,然后auto-rejoin尝试将error节点重新加入集群,且可以成功加入。)
场景2:Primary节点发生故障
- 网络中断5秒时,心跳检测超时,少数派节点将另外两个失联节点状态变为unreachable。多数派节点也将对放状态变为unreachable。
- 网络中断35秒时,多数派节点将可疑节点逐出集群,同时会选举出新的主库。
- 网络中断85秒时,少数派节点(故障节点)将自己的状态改为error。(如果在85秒前网络恢复,则集群自动恢复)
- 网络中断115秒时,少数派节点(故障节点)将两个失联节点逐出集群。(如果在115秒前网络恢复,则在115秒时mgr先逐出可疑节点,然后auto-rejoin尝试将error节点重新加入集群,且可以成功加入。)
几个时间点计算
- 35秒:心跳检测5s + group_replication_member_expel_timeout=35秒
- 85秒:心跳检测5s + group_replication_unreachable_majority_timeout=85
- 115秒:心跳检测5s + group_replication_unreachable_majority_timeout=80s + group_replication_member_expel_timeout=115秒
综上,3个参数的关系如下
- group_replication_member_expel_timeout 表示多数排驱除unreable节点的时间,也就是容忍多少秒,看看网络是否能够恢复。
- 少数派节点首先读取参数group_replication_unreachable_majority_timeout,到达超时时间后进入error状态。然后从零计时可疑节点驱逐时间(group_replication_member_expel_timeout),到达超时时间后从集群状态视图删除失联的所有节点信息。紧接着触发auto-rejoin重试(group_replication_autorejoin_tries);
- group_replication_unreachable_majority_timeout参数的值必须大于group_replication_member_expel_timeout,否则主库故障后还没有选举新主。
由此可见,当参数group_replication_member_expel_timeout和group_replication_unreachable_majority_timeout的值相同时,即少数派节点进入error状态时,多数派节点恰好选举了新主,可以避免出现双主的现象。
由此可见,当参数group_replication_member_expel_timeout和group_replication_unreachable_majority_timeout的值相同时,即少数派节点进入error状态时,多数派节点恰好选举了新主,可以避免出现双主的现象。
三、故障检测流程
结合上面的案例,我们来看看 Group Repliction 的故障检测流程。
-
1.集群中每个节点都会定期(每秒 1 次)向其它节点发送心跳信息。如果在 5s 内(固定值,无参数调整)没有收到其它节点的心跳信息,则会将该节点标记为可疑节点,同时会将该节点的状态设置为 UNREACHABLE 。如果集群中有等于或超过 1/2 的节点显示为 UNREACHABLE ,则该集群不能对外提供写服务。
-
2.如果在group_replication_member_expel_timeout(从 MySQL 8.0.21 开始,该参数的默认值为 5,单位 s,最大可设置值为3600,即 1 小时)时间内,可疑节点恢复正常,则会直接应用 XCom Cache 中的消息。XCom Cache 的大小由group_replication_message_cache_size 决定,默认是 1G。
-
3.如果在group_replication_member_expel_timeout时间内,可疑节点没有恢复正常,则会被驱逐出集群。少数派节点则,
- 如果参数group_replication_unreachable_majority_timeout 没有设置,网络恢复正常,因为已经被大多数剔除,则设置为error。
- 达到 group_replication_unreachable_majority_timeout 的限制,自动设置为error
如果group_replication_autorejoin_tries不为 0,对于 ERROR 状态的节点,会自动重试,重新加入集群(auto-rejoin)。如果group_replication_autorejoin_tries为 0 或重试失败,则会执行 group_replication_exit_state_action 指定的操作。可选的操作有:
-
READ_ONLY:只读模式。在这种模式下,会将 super_read_only 设置为 ON。默认值。
-
OFFLINE_MODE:离线模式。在这种模式下,会将 offline_mode 和 super_read_only 设置为 ON,此时,只有CONNECTION_ADMIN(SUPER)权限的用户才能登陆,普通用户不能登录。
-
ABORT_SERVER:关闭实例。
四、 XCom Cache
XCom Cache 是 XCom 使用的消息缓存,用来缓存集群节点之间交换的消息。缓存的消息是共识协议的一部分。如果网络不稳定,可能会出现节点失联的情况。
如果节点在一定时间(由 group_replication_member_expel_timeout 决定)内恢复正常,它会首先应用 XCom Cache 中的消息。如果 XCom Cache 没有它需要的所有消息,这个节点会被驱逐出集群。驱逐出集群后,如果 group_replication_autorejoin_tries 不为 0,它会重新加入集群(auto-rejoin)。
重新加入集群会使用 Distributed Recovery 补齐差异数据。相比较直接使用 XCom Cache 中的消息,通过 Distributed Recovery 加入集群需要的时间相对较长,过程也较复杂,并且集群的性能也会受到影响。
所以,我们在设置 XCom Cache 的大小时,需预估 group_replication_member_expel_timeout + 5s 这段时间内的内存使用量。如何预估,后面会介绍相关的系统表。
五、生产配置建议
#适用于8.0.21以上版本
group_replication_member_expel_timeout=30 #可疑节点超过30秒被驱逐集群,默认5秒。注意:如果是少数派节点(故障节点),需要在网络不可达超时时间等待结束后,从零计时30秒。
group_replication_unreachable_majority_timeout=35 #网络不可达超时时间,在少数派节点(故障节点)将自己的集群状态视图改为error的时间。默认值为0,表示一直处于UNREACHABLE状态,如果要设置应该大于心跳超时和group_replication_member_expel_timeout参数的设置总和这里是30+5
group_replication_message_cache_size=5G #XCom cache使用的消息缓存,默认1G,建议设置为5G。
group_replication_autorejoin_tries=3 #error状态节点重新加入集群的次数
注意理解:参数group_replication_unreachable_majority_timeout表示多数派节点不可达,即网络故障分区后,它只对少数派节点有效。
目前我司vip脚本切换逻辑主要判断旧主状态,当旧主从 online primary变为error状态时,触发vip切换。所以我们将前两个参数设置相同值,这样可以保证触发vip切换的那一刻,mgr多数派节点刚好选举了新主。同时也避免了双主现象。
六、注意事项
如果集群中存在 UNREACHABLE 的节点,会有以下限制和不足:
- 不能调整集群的拓扑,包括添加和删除节点。
- 如果 Group Replication 的一致性级别等于 AFTER 或 BEFORE_AND_AFTER,则写操作会一直等待,直到 UNREACHABLE 节点 ONLINE 并应用该操作。
- 集群吞吐量会下降。如果是单主模式,可将 group_replication_paxos_single_leader (MySQL 8.0.27 引入的)设置为 ON 解决这个问题,其改变了孟子协议轮流坐庄的方式。