MGR网络抖动问题分析和group_replication_unreachable_majority_timeout等参数分析

news2024/11/26 6:00:36

导言

本文主要从测试的角度来阐明group_replication_unreachable_majority_timeout和group_replication_member_expel_timeout参数对集群网络分区的影响,首先这里放一张大图,为多年前学习MGR的时候留下的,也就是这两个参数对集群的影响,也是本文测试重点。

在使用mgr的过程中,我们会经常看到以下报错,大概意思是:“由于网络故障,成员被逐出mgr集群,并将该成员状态更改为ERROR”。这因为mgr集群节点间心跳检测超时,少数派节点被驱逐导致的。

2022-07-31T13:07:30.464239-00:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2022-07-31T13:07:37.458761-00:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'

下面结合2个具体案例,分析以下3个参数配置对mgr网络故障的影响和mgr的故障检测流程。集群拓扑:单主模式

节点        IP地址        主机名             状态
node1      172.16.86.61 mysqlwjhtest01    PRIMARY
node2      172.16.86.62 mysqlwjhtest02    SECONDARY
node3      172.16.86.63 mysqlwjhtest03    SECONDARY

本次测试主要包括两步:

  • 模拟网络分区,看它对集群各节点的影响。
  • 恢复网络连接,看看各节点又是如何反应的。

一、 group_replication_unreachable_majority_timeout参数设置为0

3个相关测试参数

#适用于8.0.21以上版本
group_replication_member_expel_timeout=30           #可疑节点超过30秒被驱逐集群
group_replication_unreachable_majority_timeout=0        #网络分区后少数派节点网络不可达超时时间 
group_replication_autorejoin_tries=3                #error状态节点重新加入集群的次数

案例1:模拟secondary节点网络中断

1.1 模拟网络分区故障,在 node3 上执行(当前node3为secondary节点)。通过iptables 命令断开 node3 与 node1、node2 之间的网络连接。

iptables -A INPUT  -p tcp -s 172.16.86.61 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.61 -j DROP
iptables -A INPUT  -p tcp -s 172.16.86.62 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.62 -j DROP

命令执行完 5s(mgr心跳检测时间,固定值,无法修改)后,各个节点开始响应。

  1. 首先查看多数派节点(node1 、node2)日志及集群状态信息。显示 node3 处于 UNREACHABLE 状态。
[Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
root@localhost: 19:38:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | UNREACHABLE  | SECONDARY   | 8.0.28         |
| mysqlwjhtest02 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | ONLINE       | PRIMARY     | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

日志显示,检测到node3网络不可达以后,等过了30秒(参数group_replication_member_expel_timeout=30控制)以后,node3被驱逐出集群。集群状态视图只有2个节点了。

2023-05-24T19:38:39.413051+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
2023-05-24T19:39:09.413491+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-24T19:39:09.468173+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-24T19:39:09.468299+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-24T19:39:09.468320+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-24T19:39:09.468336+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 29726 0}, boot key synode is {88e4dbe0 29715 0}, configured event horizon=10, my node identifier is 0'
2023-05-24T19:39:10.399352+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-24T19:39:10.404044+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Group is able to support up to communication protocol version 8.0.27'
2023-05-24T19:39:10.404139+08:00 0 [Warning] [MY-011499] [Repl] Plugin group_replication reported: 'Members removed from the group: mysqlwjhtest03:3307'
2023-05-24T19:39:10.404219+08:00 0 [System] [MY-011503] [Repl] Plugin group_replication reported: 'Group membership changed to mysqlwjhtest02:3307, mysqlwjhtest01:3307 on view 16849117016333239:4.'
2023-05-24T19:39:20.951668+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failure reading from fd=-1 n=18446744073709551615 from 172.16.86.63:13307'

此时查看多数派节点(node1 、node2)状态信息,已经看不到node3节点了。

root@localhost: 20:02:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest02 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | ONLINE       | PRIMARY     | 8.0.28         |
+----------------+--------------+-------------+----------------+
2 rows in set (0.00 sec)
  1. 然后查看少数派节点(node3)日志及集群状态信息,显示node1,node2 处于 UNREACHABLE 状态。
[Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
[Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
[ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'

并且在整个过程中,集群状态信息显示node1和node2一直处于UNREACHABLE状态。(这是由于group_replication_unreachable_majority_timeout=0,后面展开聊这个参数。)

root@localhost: 19:54:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest02 | UNREACHABLE  | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | UNREACHABLE  | PRIMARY     | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

1.2 恢复网络连接

接下来我们尝试恢复 node3 与 node1、node2 之间的网络。

iptables -F
  1. 查看原少数派节点(node3)日志,显示心跳检测到node1、node2网络恢复正常,然后auto-rejoin再次加入集群,GCS通信恢复正常,应用xcom cache,更新change master等。
2023-05-24T20:20:52.475071+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.61:13307'
2023-05-24T20:20:53.655956+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.62:13307'
2023-05-24T20:21:03.214971+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 is reachable again.'
2023-05-24T20:21:03.215075+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-24T20:21:03.215100+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-24T20:21:19.209028+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 2 and my address is 172.16.86.63:13307'
2023-05-24T20:21:20.208934+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
2023-05-24T20:21:20.209098+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-24T20:21:20.209131+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
2023-05-24T20:21:23.097685+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 is reachable again.'
2023-05-24T20:21:23.097871+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-24T20:21:23.097916+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-24T20:21:23.107732+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-24T20:21:23.107836+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-24T20:21:23.107854+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-24T20:21:23.107872+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 29726 0}, boot key synode is {88e4dbe0 29715 0}, configured event horizon=10, my node identifier is 4294967295'
2023-05-24T20:21:26.125137+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending a request to a remote XCom failed. Please check the remote node log for more details.'
2023-05-24T20:21:26.125323+08:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2023-05-24T20:21:26.125534+08:00 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2023-05-24T20:21:26.126663+08:00 5344 [System] [MY-013373] [Repl] Plugin group_replication reported: 'Started auto-rejoin procedure attempt 1 of 3'
2023-05-24T20:21:26.127111+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-24T20:21:26.127169+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Failed to push into XCom.'
2023-05-24T20:21:26.291624+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2023-05-24T20:21:26.293673+08:00 13 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2023-05-24T20:21:26.294631+08:00 13 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2023-05-24T20:21:26.294877+08:00 11 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
2023-05-24T20:21:26.296001+08:00 5344 [Note] [MY-011673] [Repl] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2023-05-24T20:21:26.299638+08:00 5344 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Debug messages will be sent to: asynchronous::/data/mysql/data/GCS_DEBUG_TRACE'
2023-05-24T20:21:26.300015+08:00 5344 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-24T20:21:26.300062+08:00 5344 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-24T20:21:26.300132+08:00 5344 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] SSL was not enabled'
2023-05-24T20:21:26.301794+08:00 5346 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 3942, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-24T20:21:26.331516+08:00 5344 [Note] [MY-011670] [Repl] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-05-24T20:21:26.331583+08:00 5348 [Note] [MY-010581] [Repl] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relaylog/mysql-relay-bin-group_replication_applier.000002' position: 4213
2023-05-24T20:21:26.473164+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using XCom as Communication Stack for XCom'
2023-05-24T20:21:26.473594+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] XCom initialized and ready to accept incoming connections on port 13307'
2023-05-24T20:21:26.473792+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Successfully connected to the local XCom via anonymous pipe'
2023-05-24T20:21:26.474619+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] TCP_NODELAY already set'
2023-05-24T20:21:26.474690+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully connected to peer 172.16.86.61:13307. Sending a request to be added to the group'

此时查看原少数派节点(node3)集群状态信息,也恢复正常。

root@localhost: 20:20:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest02 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | ONLINE       | PRIMARY     | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
  1. 查看原多数派节点(node1 、node2)日志,大致就是与node3建立连接,然后集群恢复
2023-05-24T20:21:26.124082+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Node has already been removed: 172.16.86.63:13307 16849122691765501.'
2023-05-24T20:21:26.522275+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Adding new node to the configuration: 172.16.86.63:13307'
2023-05-24T20:21:26.522513+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-24T20:21:26.522574+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-24T20:21:26.522609+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-24T20:21:26.522624+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 2 host 172.16.86.63:13307'
2023-05-24T20:21:26.522641+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 32296 0}, boot key synode is {88e4dbe0 32285 0}, configured event horizon=10, my node identifier is 0'
2023-05-24T20:21:26.522899+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sender task disconnected from 172.16.86.63:13307'
2023-05-24T20:21:26.522931+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connecting to 172.16.86.63:13307'
2023-05-24T20:21:26.523349+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.63:13307'
2023-05-24T20:21:27.452375+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-24T20:21:27.453687+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Received an XCom snapshot request from 172.16.86.63:13307'

上面案例是mysql8.0 mgr secondary节点网络故障过程中的mgr集群变化。最后看到,网络恢复后集群自行恢复,整个过程无需DBA干预。

案例2:模拟primary节点网络中断

2.1 继续模拟网络分区故障,在 node1 上执行(当前node1为primary节点)。通过iptables 命令断开 node1 与 node2、node3 之间的网络连接。

iptables -A INPUT  -p tcp -s 172.16.86.62 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.62 -j DROP
iptables -A INPUT  -p tcp -s 172.16.86.63 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.63 -j DROP
date
2023年 05月 27日 星期六 09:29:46 CST

命令执行完 5后,各个节点开始响应。

  1. 首先查看多数派节点(node2 、node3)日志集群状态信息
    primary节点故障后会触发主库切换,我们详细看一下时间线:
  • 09:29:46 网络中断
  • 09:29:51 5秒后,心跳检测超时
  • 09:30:21 35秒后,提升node3为主节点
023-05-27T09:29:51.229977+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
2023-05-27T09:30:20.947030+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T09:30:20.947163+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T09:30:20.947191+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T09:30:20.947217+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this co
nfiguration is {88e4dbe0 309739 1}, boot key synode is {88e4dbe0 309728 1}, configured event horizon=10, my node identifier is 1'
2023-05-27T09:30:21.689907+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Group is able to support up to communication protocol version 8.0.2
7'
2023-05-27T09:30:21.689994+08:00 0 [Warning] [MY-011499] [Repl] Plugin group_replication reported: 'Members removed from the group: mysqlwjhtest01:3307'
2023-05-27T09:30:21.690006+08:00 0 [System] [MY-011500] [Repl] Plugin group_replication reported: 'Primary server with address mysqlwjhtest01:3307 left the group. Electin
g new Primary.'
2023-05-27T09:30:21.690097+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member gtid_executed: debbaad4-1fd7-4652-80a4-5c87cfde0db
6:1-16'
2023-05-27T09:30:21.690108+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member applier channel received_transaction_set: debbaad4
-1fd7-4652-80a4-5c87cfde0db6:1-16'
2023-05-27T09:30:21.690123+08:00 0 [System] [MY-011507] [Repl] Plugin group_replication reported: 'A new primary with address mysqlwjhtest03:3307 was elected. The new pri
mary will execute all previous group transactions before allowing writes.'
2023-05-27T09:30:21.690509+08:00 0 [System] [MY-011503] [Repl] Plugin group_replication reported: 'Group membership changed to mysqlwjhtest03:3307, mysqlwjhtest02:3307 on
 view 16849117016333239:10.'
2023-05-27T09:30:21.691815+08:00 6189 [System] [MY-013731] [Repl] Plugin group_replication reported: 'The member action "mysql_disable_super_read_only_if_primary" for eve
nt "AFTER_PRIMARY_ELECTION" with priority "1" will be run.'
2023-05-27T09:30:21.692135+08:00 6189 [System] [MY-011566] [Repl] Plugin group_replication reported: 'Setting super_read_only=OFF.'
2023-05-27T09:30:21.692186+08:00 6189 [System] [MY-013731] [Repl] Plugin group_replication reported: 'The member action "mysql_start_failover_channels_if_primary" for eve
nt "AFTER_PRIMARY_ELECTION" with priority "10" will be run.'
2023-05-27T09:30:21.693182+08:00 6181 [Note] [MY-011485] [Repl] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection.'
2023-05-27T09:30:21.693200+08:00 68187 [System] [MY-011510] [Repl] Plugin group_replication reported: 'This server is working as primary member.'
2023-05-27T09:30:31.213720+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failure reading from fd=-1 n=18446744073709551615 from 172.16.86.61
:13307'

多数派节点(node2 、node3)集群状态信息显示,node3节点已经被选举为新主。

#node2、node3显示状态
root@localhost: 09:30:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE       | PRIMARY     | 8.0.28         |
| mysqlwjhtest02 | ONLINE       | SECONDARY   | 8.0.28         |
+----------------+--------------+-------------+----------------+
2 rows in set (0.01 sec)
  1. 查看少数派节点(node1)日志。
    node1提供的日志信息较少,只是在心跳检测超时后阻塞了数据写入,这里并没有提到阻塞多长时间。
2023-05-27T09:29:51.277851+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-27T09:29:51.277990+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
2023-05-27T09:29:51.278007+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'

查看node1集群状态信息。我们发现网络分区后,少数派节点(node1)依然处于"online primary"状态。

#node1显示状态
root@localhost: 09:30:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | UNREACHABLE  | SECONDARY   | 8.0.28         |
| mysqlwjhtest02 | UNREACHABLE  | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | ONLINE       | PRIMARY     | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

所以,结合对比多数派节点集群信息,网络分区后,node1、node3均为"online primary"状态,但是少数派节点(node1)会话阻塞了所有更新,即node1处于只读状态。如果mgrvip切换检测逻辑只依赖于"online primary"状态时,这里会有问题。

此时我们在node1尝试写入数据,发现被阻塞。这是因为node1分区后只有自己一个节点,属于少数派节点,不满足组复制的多数派原则,符合预期。

root@localhost: 10:10:  [(none)]> create database db1;
......    一直卡在这里

2.2 恢复网络连接

接下来我们尝试恢复 node1 与 node2、node3 之间的网络。

iptables -F
  1. 对比查看少数派节点(node1)和多数派节点(node2、node3)上的集群状态变化。
    在这里我发现了比较有意思的事情,首先各个节点的UNREACHABLE状态变为ONLINE。但此时少数派节点(node1)集群信息显示,node1依然是"online primary"状态,然后过了两三秒,原少数派节点(node1)中的primary节点才由node1变为node3。从日志来看,网络恢复后节点从UNREACHABLE状态变为ONLINE,然后由于XCOM层配置的改变,最终报错后,重新加入集群。以下是我在原少数派节点(node1)监控到的集群状态变化:
#首先,网络刚刚恢复时,primary为node1
root@localhost: 10:17:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest02 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | ONLINE       | PRIMARY     | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

#大概两三秒后primary变为node3。
root@localhost: 10:24:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ONLINE       | PRIMARY     | 8.0.28         |
| mysqlwjhtest02 | ONLINE       | SECONDARY   | 8.0.28         |
| mysqlwjhtest01 | ONLINE       | SECONDARY   | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

原少数派节点(node1)网络恢复过程中的日志

2023-05-27T10:17:07.490608+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.63:13307'
2023-05-27T10:17:08.491033+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.62:13307'
2023-05-27T10:17:08.586063+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-27T10:17:08.586486+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 is reachable again.'
2023-05-27T10:17:08.586546+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-27T10:17:09.202537+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-27T10:17:20.197898+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-27T10:17:20.198008+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'
2023-05-27T10:17:20.198026+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
2023-05-27T10:17:20.815474+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] A configuration change was detected. Sending a Global View Message to all nodes. My node identifier is 0 and my address is 172.16.86.61:13307'
2023-05-27T10:17:20.815730+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 is reachable again.'
2023-05-27T10:17:38.361435+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2023-05-27T10:17:38.361659+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 is reachable again.'
2023-05-27T10:17:40.489269+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:40.489357+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T10:17:40.489370+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T10:17:40.489382+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 309739 1}, boot key synode is {88e4dbe0 309728 1}, configured event horizon=10, my node identifier is 4294967295'
2023-05-27T10:17:43.526534+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending a request to a remote XCom failed. Please check the remote node log for more details.'
2023-05-27T10:17:43.526695+08:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2023-05-27T10:17:43.526856+08:00 0 [Warning] [MY-011630] [Repl] Plugin group_replication reported: 'Due to a plugin error, some transactions were unable to be certified and will now rollback.'
2023-05-27T10:17:43.526929+08:00 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2023-05-27T10:17:43.527329+08:00 53521 [ERROR] [MY-011615] [Repl] Plugin group_replication reported: 'Error while waiting for conflict detection procedure to finish on session 53521'
2023-05-27T10:17:43.527405+08:00 53521 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
2023-05-27T10:17:43.533122+08:00 53521 [Note] [MY-010914] [Server] Aborted connection 53521 to db: 'unconnected' user: 'root' host: 'localhost' (Error on observer while running replication hook 'before_commit').
2023-05-27T10:17:43.533134+08:00 53627 [System] [MY-011565] [Repl] Plugin group_replication reported: 'Setting super_read_only=ON.'
2023-05-27T10:17:43.533523+08:00 53628 [System] [MY-013373] [Repl] Plugin group_replication reported: 'Started auto-rejoin procedure attempt 1 of 3'
2023-05-27T10:17:43.533650+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-27T10:17:43.533682+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Failed to push into XCom.'
2023-05-27T10:17:44.837043+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2023-05-27T10:17:44.838114+08:00 13 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2023-05-27T10:17:44.838433+08:00 13 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2023-05-27T10:17:44.838544+08:00 11 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
2023-05-27T10:17:44.839024+08:00 53628 [Note] [MY-011673] [Repl] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2023-05-27T10:17:44.840989+08:00 53628 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Debug messages will be sent to: asynchronous::/data/mysql/data/GCS_DEBUG_TRACE'
2023-05-27T10:17:44.841282+08:00 53628 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-27T10:17:44.841313+08:00 53628 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-27T10:17:44.841357+08:00 53628 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] SSL was not enabled'
2023-05-27T10:17:44.842115+08:00 53630 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 2046, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-27T10:17:44.860923+08:00 53628 [Note] [MY-011670] [Repl] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-05-27T10:17:44.861050+08:00 53632 [Note] [MY-010581] [Repl] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relaylog/mysql-relay-bin-group_replication_applier.000002' position: 2317
2023-05-27T10:17:45.006131+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using XCom as Communication Stack for XCom'
2023-05-27T10:17:45.006450+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] XCom initialized and ready to accept incoming connections on port 13307'
2023-05-27T10:17:45.006546+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Successfully connected to the local XCom via anonymous pipe'
2023-05-27T10:17:45.007262+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] TCP_NODELAY already set'
2023-05-27T10:17:45.007298+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully connected to peer 172.16.86.62:13307. Sending a request to be added to the group'
2023-05-27T10:17:45.007314+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending add_node request to a peer XCom node'
2023-05-27T10:17:47.166146+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Node has not booted. Requesting an XCom snapshot from node number 1 in the current configuration'
2023-05-27T10:17:47.167161+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Installing requested snapshot. Importing all incoming configurations.'
2023-05-27T10:17:47.167358+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.167393+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Creating new server node 0 host 172.16.86.61:13307'
2023-05-27T10:17:47.167502+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Creating new server node 1 host 172.16.86.62:13307'
2023-05-27T10:17:47.167613+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 36985 0}, boot key synode is {88e4dbe0 36974 0}, configured event horizon=10, my node identifier is 0'
2023-05-27T10:17:47.167699+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.167723+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.61:13307'
2023-05-27T10:17:47.167747+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.62:13307'
2023-05-27T10:17:47.167761+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Creating new server node 2 host 172.16.86.63:13307'
2023-05-27T10:17:47.167866+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 37375 0}, boot key synode is {88e4dbe0 37364 0}, configured event horizon=10, my node identifier is 0'
2023-05-27T10:17:47.167964+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.167980+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T10:17:47.167993+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T10:17:47.168009+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 309739 1}, boot key synode is {88e4dbe0 309728 1}, configured event horizon=10, my node identifier is 4294967295'
2023-05-27T10:17:47.168089+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Updating physical connections to other servers'
2023-05-27T10:17:47.168104+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 0 host 172.16.86.62:13307'
2023-05-27T10:17:47.168117+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 1 host 172.16.86.63:13307'
2023-05-27T10:17:47.168131+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using existing server node 2 host 172.16.86.61:13307'
2023-05-27T10:17:47.168146+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully installed new site definition. Start synode for this configuration is {88e4dbe0 312618 0}, boot key synode is {88e4dbe0 312607 0}, configured event horizon=10, my node identifier is 2'
2023-05-27T10:17:47.168161+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Finished snapshot installation. My node number is 2'
2023-05-27T10:17:47.168269+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member has joined the group. Local port: 13307'
2023-05-27T10:17:47.168610+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connecting to 172.16.86.62:13307'
2023-05-27T10:17:47.168910+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.62:13307'
2023-05-27T10:17:47.168950+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connecting to 172.16.86.63:13307'
2023-05-27T10:17:47.169306+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Connected to 172.16.86.63:13307'
2023-05-27T10:17:47.658374+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] This server adjusted its communication protocol to 8.0.27 in order to join the group.'
2023-05-27T10:17:47.658436+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Group is able to support up to communication protocol version 8.0.27'
2023-05-27T10:17:47.660220+08:00 53628 [System] [MY-013375] [Repl] Plugin group_replication reported: 'Auto-rejoin procedure attempt 1 of 3 finished. Member was able to join the group.'
2023-05-27T10:17:48.661179+08:00 0 [System] [MY-013471] [Repl] Plugin group_replication reported: 'Distributed recovery will transfer data using: Incremental recovery from a group donor'
2023-05-27T10:17:48.661739+08:00 53647 [Note] [MY-011576] [Repl] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2023-05-27T10:17:48.661780+08:00 0 [System] [MY-011503] [Repl] Plugin group_replication reported: 'Group membership changed to mysqlwjhtest03:3307, mysqlwjhtest02:3307, mysqlwjhtest01:3307 on view 16849117016333239:11.'
2023-05-27T10:17:48.693433+08:00 53647 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='mysqlwjhtest02', master_port= 3307, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-27T10:17:48.727090+08:00 53647 [Note] [MY-011580] [Repl] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor 62a5b927-fa01-11ed-82ce-0050569c05d6 at mysqlwjhtest02 port: 3307.'
2023-05-27T10:17:48.727967+08:00 53648 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2023-05-27T10:17:48.730254+08:00 53648 [System] [MY-010562] [Repl] Slave I/O thread for channel 'group_replication_recovery': connected to master 'repl@mysqlwjhtest02:3307',replication started in log 'FIRST' at position 4
2023-05-27T10:17:48.733307+08:00 53649 [Note] [MY-010581] [Repl] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relaylog/mysql-relay-bin-group_replication_recovery.000001' position: 4
2023-05-27T10:17:48.757900+08:00 53647 [Note] [MY-011585] [Repl] Plugin group_replication reported: 'Terminating existing group replication donor connection and purging the corresponding logs.'
2023-05-27T10:17:48.758896+08:00 53649 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_recovery' exiting, replication stopped in log 'mysql-bin.000001' at position 6217
2023-05-27T10:17:48.759258+08:00 53648 [Note] [MY-011026] [Repl] Slave I/O thread killed while reading event for channel 'group_replication_recovery'.
2023-05-27T10:17:48.759324+08:00 53648 [Note] [MY-010570] [Repl] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'mysql-bin.000001', position 6217
2023-05-27T10:17:48.796509+08:00 53647 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='mysqlwjhtest02', master_port= 3307, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-05-27T10:17:48.831253+08:00 0 [System] [MY-011490] [Repl] Plugin group_replication reported: 'This server was declared online within the replication group.'
2023-05-27T10:29:27.705642+08:00 53768 [Note] [MY-013730] [Server] 'wait_timeout' period of 300 seconds was exceeded for `root`@`localhost`. The idle time since last command was too long.
2023-05-27T10:29:27.705830+08:00 53768 [Note] [MY-010914] [Server] Aborted connection 53768 to db: 'unconnected' user: 'root' host: 'localhost' (The client was disconnected by the server because of inactivity.).

以上两个案例中,我将多数派网络不可达超时参数(group_replication_unreachable_majority_timeout)设置为0,所以心跳检测超时后,可疑节点被标记为unreachable。因为设置了可疑节点被驱逐时间为30秒(group_replication_member_expel_timeout=30)。所以在网络中断35秒后,多数派节点集群中的可疑节点被驱逐。而少数派节点集群中的可疑节点一直处于unreachable状态。

二、参数group_replication_unreachable_majority_timeout对集群的影响

为了便于观察,这里设置为80秒,然后重复案例2的步骤。

group_replication_member_expel_timeout=30         #与上面案例保持不变,设为30秒
group_replication_unreachable_majority_timeout=80    #设置为80秒
注意理解,参数group_replication_unreachable_majority_timeout表示多数派节点不可达超时时间。即网络故障分区后,它只对少数派节点有效。

此时primary节点是node3,所以断开node3与另外两个节点的网络。

iptables -A INPUT  -p tcp -s 172.16.86.62 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.62 -j DROP
iptables -A INPUT  -p tcp -s 172.16.86.61 -j DROP
iptables -A OUTPUT -p tcp -d 172.16.86.61 -j DROP
date
#恢复网络:iptables -F

观察主节点网络故障过程中各节点集群状态

  • 13:15:56 模拟网络中断
  • 13:16:01 5秒时,心跳检测网络中断,少数派节点(node3)日志显示所有更新将被阻塞85秒,并将另外两个节点状态标记为unreachalbe。同时,多数派节点(node1、node2)集群状态视图将node3标记为unreachalbe。
  • 13:16:31 35秒时,多数派节点(node1、node2)日志显示,node3对逐出集群,同时选举新主。 少数派节点无日志输出,此时少数派节点(node3)视图的primary依然是自己(但是所有更新被阻塞中)。
  • 13:17:26 85秒时,少数派节点(node3)更改自己集群视图状态,由online primary改为error。 多数派节点无日志输出。
  • 13:17:56 115秒时,少数派节点(node3)更新集群状态视图,踢出node1、node2节点信息。

5秒后,心跳检测网络中断,所有节点检测到网络不可达,并更新集群状态视图

#少数派节点(node3)
2023-05-27T13:16:01.716803+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest01:3307 has become unreachable.'
2023-05-27T13:16:01.716934+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest02:3307 has become unreachable.'
2023-05-27T13:16:01.716951+08:00 0 [ERROR] [MY-011496] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked for the next 85 seconds. Unless contact with the majority is restored, after this time the member will error out and leave the group. It is possible to use group_replication_force_members to force a new group membership.'
2023-05-27T13:17:26.724015+08:00 71231 [ERROR] [MY-011711] [Repl] Plugin group_replication reported: 'This member could not reach a majority of the members for more than 85 seconds. The member will now leave the group as instructed by the group_replication_unreachable_majority_timeout option.'

#多数派节点(node1、node2)
2023-05-27T13:16:01.719345+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address mysqlwjhtest03:3307 has become unreachable.'

35秒后,多数派节点(node1、node2)将node3逐出集群,并选择node2为新主库,更新状态视图。 故障节点无日志输出。

2023-05-27T13:16:32.706137+08:00 0 [Warning] [MY-011499] [Repl] Plugin group_replication reported: 'Members removed from the group: mysqlwjhtest03:3307'
2023-05-27T13:16:32.706150+08:00 0 [System] [MY-011500] [Repl] Plugin group_replication reported: 'Primary server with address mysqlwjhtest03:3307 left the group. Electing new Primary.'
2023-05-27T13:16:32.706259+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member gtid_executed: debbaad4-1fd7-4652-80a4-5c87cfde0db6:1-19'
2023-05-27T13:16:32.706272+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member applier channel received_transaction_set: debbaad4-1fd7-4652-80a4-5c87cfde0db6:1-19'
2023-05-27T13:16:32.706284+08:00 0 [System] [MY-011507] [Repl] Plugin group_replication reported: 'A new primary with address mysqlwjhtest02:3307 was elected. The new primary will execute all previous group transactions before allowing writes.'

85秒后,少数派节点(node3)更改自己状态,由online改为error。多数派节点无日志输出。

2023-05-27T13:17:26.724015+08:00 71231 [ERROR] [MY-011711] [Repl] Plugin group_replication reported: 'This member could not reach a majority of the members for more than 85 seconds. The member will now leave the group as instructed by the group_replication_unreachable_majority_timeout option.'
2023-05-27T13:17:26.724483+08:00 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2023-05-27T13:17:26.724560+08:00 71231 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2023-05-27T13:17:26.725179+08:00 71231 [System] [MY-011565] [Repl] Plugin group_replication reported: 'Setting super_read_only=ON.'
2023-05-27T13:17:26.725340+08:00 71231 [Note] [MY-011647] [Repl] Plugin group_replication reported: 'Going to wait for view modification'

115秒以后,少数派节点(node3)从自己的集群状态视图中踢出node1、node2节点信息,然后auto-rejoin尝试加入集群(如果在85s-115s间网络未恢复,则加入失败)。

2023-05-27T13:17:56.296998+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Timeout while waiting for the group communication engine to exit!'
2023-05-27T13:17:56.297075+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member has failed to gracefully leave the group.'
2023-05-27T13:17:56.396287+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2023-05-27T13:17:56.396530+08:00 72720 [System] [MY-013373] [Repl] Plugin group_replication reported: 'Started auto-rejoin procedure attempt 1 of 3'
2023-05-27T13:17:56.397353+08:00 71913 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2023-05-27T13:17:56.397914+08:00 71913 [Note] [MY-010587] [Repl] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2023-05-27T13:17:56.398188+08:00 71911 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
2023-05-27T13:17:56.398830+08:00 72720 [Note] [MY-011673] [Repl] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2023-05-27T13:17:56.400717+08:00 72720 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Debug messages will be sent to: asynchronous::/data/mysql/data/GCS_DEBUG_TRACE'
2023-05-27T13:17:56.400984+08:00 72720 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the allowlist. It is mandatory that it is added.'
2023-05-27T13:17:56.401014+08:00 72720 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the allowlist. It is mandatory that it is added.'

少数派节点(node3)集群视图变化

#5秒后
root@localhost: 14:05:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;system date;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest01 | UNREACHABLE  | SECONDARY   | 8.0.28         |
| mysqlwjhtest03 | ONLINE       | PRIMARY     | 8.0.28         |
| mysqlwjhtest02 | UNREACHABLE  | SECONDARY   | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
#85秒后
root@localhost: 14:07:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;system date;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest01 | UNREACHABLE  | SECONDARY   | 8.0.28         |
| mysqlwjhtest03 | ERROR        |             | 8.0.28         |
| mysqlwjhtest02 | UNREACHABLE  | SECONDARY   | 8.0.28         |
+----------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
#115秒后
root@localhost: 14:08:  [(none)]> select MEMBER_HOST,MEMBER_STATE,MEMBER_ROLE,MEMBER_VERSION from performance_schema.replication_group_members;system date;
+----------------+--------------+-------------+----------------+
| MEMBER_HOST    | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+----------------+--------------+-------------+----------------+
| mysqlwjhtest03 | ERROR        |             | 8.0.28         |
+----------------+--------------+-------------+----------------+
1 row in set (0.01 sec)

测试结果分析

场景1:Secondary节点发生故障

  • 网络中断5秒时,心跳检测超时,少数派节点将另外两个失联节点状态变为unreachable。多数派节点也将对放状态变为unreachable。
  • 网络中断35秒时,多数派节点将可疑节点逐出集群。
  • 网络中断85秒时,少数派节点(故障节点)将自己的状态改为error。(如果在85秒前网络恢复,则集群自动恢复)
  • 网络中断115秒时,少数派节点(故障节点)将两个失联节点逐出集群。(如果在115秒前网络恢复,则在115秒时mgr先逐出可疑节点,然后auto-rejoin尝试将error节点重新加入集群,且可以成功加入。)

场景2:Primary节点发生故障

  • 网络中断5秒时,心跳检测超时,少数派节点将另外两个失联节点状态变为unreachable。多数派节点也将对放状态变为unreachable。
  • 网络中断35秒时,多数派节点将可疑节点逐出集群,同时会选举出新的主库。
  • 网络中断85秒时,少数派节点(故障节点)将自己的状态改为error。(如果在85秒前网络恢复,则集群自动恢复)
  • 网络中断115秒时,少数派节点(故障节点)将两个失联节点逐出集群。(如果在115秒前网络恢复,则在115秒时mgr先逐出可疑节点,然后auto-rejoin尝试将error节点重新加入集群,且可以成功加入。)

几个时间点计算

  • 35秒:心跳检测5s + group_replication_member_expel_timeout=35秒
  • 85秒:心跳检测5s + group_replication_unreachable_majority_timeout=85
  • 115秒:心跳检测5s + group_replication_unreachable_majority_timeout=80s + group_replication_member_expel_timeout=115秒

综上,3个参数的关系如下

  1. group_replication_member_expel_timeout 表示多数排驱除unreable节点的时间,也就是容忍多少秒,看看网络是否能够恢复。
  2. 少数派节点首先读取参数group_replication_unreachable_majority_timeout,到达超时时间后进入error状态。然后从零计时可疑节点驱逐时间(group_replication_member_expel_timeout),到达超时时间后从集群状态视图删除失联的所有节点信息。紧接着触发auto-rejoin重试(group_replication_autorejoin_tries);
  3. group_replication_unreachable_majority_timeout参数的值必须大于group_replication_member_expel_timeout,否则主库故障后还没有选举新主。
    由此可见,当参数group_replication_member_expel_timeout和group_replication_unreachable_majority_timeout的值相同时,即少数派节点进入error状态时,多数派节点恰好选举了新主,可以避免出现双主的现象。

由此可见,当参数group_replication_member_expel_timeout和group_replication_unreachable_majority_timeout的值相同时,即少数派节点进入error状态时,多数派节点恰好选举了新主,可以避免出现双主的现象。

三、故障检测流程

结合上面的案例,我们来看看 Group Repliction 的故障检测流程。

  • 1.集群中每个节点都会定期(每秒 1 次)向其它节点发送心跳信息。如果在 5s 内(固定值,无参数调整)没有收到其它节点的心跳信息,则会将该节点标记为可疑节点,同时会将该节点的状态设置为 UNREACHABLE 。如果集群中有等于或超过 1/2 的节点显示为 UNREACHABLE ,则该集群不能对外提供写服务。

     

  • 2.如果在group_replication_member_expel_timeout(从 MySQL 8.0.21 开始,该参数的默认值为 5,单位 s,最大可设置值为3600,即 1 小时)时间内,可疑节点恢复正常,则会直接应用 XCom Cache 中的消息。XCom Cache 的大小由group_replication_message_cache_size 决定,默认是 1G。

  • 3.如果在group_replication_member_expel_timeout时间内,可疑节点没有恢复正常,则会被驱逐出集群。少数派节点则,

  • 如果参数group_replication_unreachable_majority_timeout 没有设置,网络恢复正常,因为已经被大多数剔除,则设置为error。
  • 达到 group_replication_unreachable_majority_timeout 的限制,自动设置为error

如果group_replication_autorejoin_tries不为 0,对于 ERROR 状态的节点,会自动重试,重新加入集群(auto-rejoin)。如果group_replication_autorejoin_tries为 0 或重试失败,则会执行 group_replication_exit_state_action 指定的操作。可选的操作有:

  • READ_ONLY:只读模式。在这种模式下,会将 super_read_only 设置为 ON。默认值。

  • OFFLINE_MODE:离线模式。在这种模式下,会将 offline_mode 和 super_read_only 设置为 ON,此时,只有CONNECTION_ADMIN(SUPER)权限的用户才能登陆,普通用户不能登录。

  • ABORT_SERVER:关闭实例。

四、 XCom Cache

XCom Cache 是 XCom 使用的消息缓存,用来缓存集群节点之间交换的消息。缓存的消息是共识协议的一部分。如果网络不稳定,可能会出现节点失联的情况。

如果节点在一定时间(由 group_replication_member_expel_timeout 决定)内恢复正常,它会首先应用 XCom Cache 中的消息。如果 XCom Cache 没有它需要的所有消息,这个节点会被驱逐出集群。驱逐出集群后,如果 group_replication_autorejoin_tries 不为 0,它会重新加入集群(auto-rejoin)。

重新加入集群会使用 Distributed Recovery 补齐差异数据。相比较直接使用 XCom Cache 中的消息,通过 Distributed Recovery 加入集群需要的时间相对较长,过程也较复杂,并且集群的性能也会受到影响。

所以,我们在设置 XCom Cache 的大小时,需预估 group_replication_member_expel_timeout + 5s 这段时间内的内存使用量。如何预估,后面会介绍相关的系统表。

五、生产配置建议

#适用于8.0.21以上版本
group_replication_member_expel_timeout=30           #可疑节点超过30秒被驱逐集群,默认5秒。注意:如果是少数派节点(故障节点),需要在网络不可达超时时间等待结束后,从零计时30秒。
group_replication_unreachable_majority_timeout=35       #网络不可达超时时间,在少数派节点(故障节点)将自己的集群状态视图改为error的时间。默认值为0,表示一直处于UNREACHABLE状态,如果要设置应该大于心跳超时和group_replication_member_expel_timeout参数的设置总和这里是30+5
group_replication_message_cache_size=5G             #XCom cache使用的消息缓存,默认1G,建议设置为5G。
group_replication_autorejoin_tries=3                #error状态节点重新加入集群的次数
注意理解:参数group_replication_unreachable_majority_timeout表示多数派节点不可达,即网络故障分区后,它只对少数派节点有效。

目前我司vip脚本切换逻辑主要判断旧主状态,当旧主从 online primary变为error状态时,触发vip切换。所以我们将前两个参数设置相同值,这样可以保证触发vip切换的那一刻,mgr多数派节点刚好选举了新主。同时也避免了双主现象。

六、注意事项

如果集群中存在 UNREACHABLE 的节点,会有以下限制和不足:

  • 不能调整集群的拓扑,包括添加和删除节点。
  • 如果 Group Replication 的一致性级别等于 AFTER 或 BEFORE_AND_AFTER,则写操作会一直等待,直到 UNREACHABLE 节点 ONLINE 并应用该操作。
  • 集群吞吐量会下降。如果是单主模式,可将 group_replication_paxos_single_leader (MySQL 8.0.27 引入的)设置为 ON 解决这个问题,其改变了孟子协议轮流坐庄的方式。

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/596023.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

穷途末路的阿里中台

观点| Mr.K 主笔| Wendy.L 来源| 技术领导力(ID&#xff1a;jishulingdaoli) 对于关注K哥公众号多年的朋友来说&#xff0c;“中台”早就是老生常谈的东西了&#xff0c;如果你还不知道它是什么&#xff0c;可以看看之前的中台文章先恶补一下。 关于中台&#xff0c;这些年翻…

chatgpt赋能python:Python倒序range的完整指南

Python倒序range的完整指南 Python是一种高级编程语言&#xff0c;很多人认为它非常容易学习和使用。其中一个非常有用的功能是range()函数&#xff0c;可以生成数字序列。然而&#xff0c;有时候我们需要以相反的顺序生成这个数字序列&#xff0c;这时候倒序range()函数就派上…

5G NTN技术概述及演进分析(下)

5G NTN因其广泛的服务覆盖能力、应对物理攻击或自然灾害的健壮性和灵活性&#xff0c;在交通、公共安全、电子健康、农业、金融、汽⻋等领域有广泛的应用前景。5G NTN与地面通信的差异主要体现在高时延、广覆盖和卫星的运动等方面&#xff0c;这使得其在传输时延、多普勒频移、…

Ae:蒙版跟踪(含脸部跟踪)

蒙版跟踪&#xff0c;就是通过记录画面上蒙版区域的变化&#xff0c;生成蒙版路径 Mask Path的关键帧。 常用于完成合成中的遮挡关系&#xff0c;或者局部添加动态效果&#xff0c;以及实现人物的脸部特效。 蒙版跟踪 要使用蒙版跟踪器&#xff0c;需要先创建并选中蒙版&#x…

设计模式之~备忘录模式

备忘录&#xff08;Memento&#xff09;&#xff1a; 备忘录模式&#xff08;Memento Pattern&#xff09;又称之为快照模式(Snapshop Pattern)或者令牌模式(Token Pattern)。 在不破坏封装性的前提下&#xff0c;捕获一个对象的内部状态&#xff0c;并在该对象之外保存这个状态…

【数据库优化-count()】count()统计行数

目录 1 count(*)为什么性能差&#xff1f; 2 优化count(*)性能 2.1 增加redis缓存 2.2 多线程执行 2.3 减少join的表 3 count的各种用法性能对比 数据库&#xff1a;Mysql8、存储引擎是Innodb。 通常情况下&#xff0c;分页接口一般会查询两次数据库&#xff0c;第一次是…

软考A计划-试题模拟含答案解析-卷十七

点击跳转专栏>Unity3D特效百例点击跳转专栏>案例项目实战源码点击跳转专栏>游戏脚本-辅助自动化点击跳转专栏>Android控件全解手册点击跳转专栏>Scratch编程案例 &#x1f449;关于作者 专注于Android/Unity和各种游戏开发技巧&#xff0c;以及各种资源分享&am…

用ChatGPT写一个数据采集程序

入门教程、案例源码、学习资料、读者群 请访问&#xff1a;python666.cn 大家好&#xff0c;欢迎来到 Crossin的编程教室 &#xff01; 上次我们讨论了ChatGPT在辅助编程学习上的一些用法&#xff1a; 如何用ChatGPT学Python 既然ChatGPT可以理解并生成代码&#xff0c;那么自然…

多线程 —— 线程控制

目录 线程控制1 线程创建2 线程等待3 线程终止4 线程分离 pthread_t id && LWP 线程控制 1 线程创建 功能&#xff1a;创建一个新的线程 原型 int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void*), void *arg); 参数…

chatgpt赋能python:Python中任取一个数

Python中任取一个数 如果你是一名Python编程爱好者或者是从事相关行业的工程师&#xff0c;你可能会知道Python强大的数学计算能力。Python中自带了诸多数学函数和库&#xff0c;例如NumPy、SciPy、pandas等等&#xff0c;可以帮助我们轻松地完成各种计算任务&#xff0c;包括…

MySQL—存储引擎(上)

♥️作者&#xff1a;小刘在C站 ♥️个人主页&#xff1a;小刘主页 ♥️每天分享云计算网络运维课堂笔记&#xff0c;努力不一定有回报&#xff0c;但一定会有收获加油&#xff01;一起努力&#xff0c;共赴美好人生&#xff01; ♥️树高千尺&#xff0c;落叶归根人生不易&…

Ae 脚本:Face Tools 4 脸部工具

Motion Bro 的 AE Face Tools 扩展脚本工具&#xff0c;包含几百种预设&#xff0c;如换脸、换妆、美颜、液化、失真扭曲、面部修饰、蒙板工具、文字跟踪、添加元素、人脸锁定防抖稳定以及遮挡等预设效果&#xff0c;大大扩展了 After Effects 中的人脸跟踪功能。 AE Face Tool…

基于激光雷达和单目视觉融合的SLAM和三维语义重建

文章&#xff1a;SLAM and 3D Semantic Reconstruction Based on the Fusion of Lidar and Monocular Vision 作者&#xff1a;Lu Lou, Yitian Li, Qi Zhang and Hanbing Wei 编辑&#xff1a;点云PCL 欢迎各位加入知识星球&#xff0c;获取PDF论文&#xff0c;欢迎转发朋友圈。…

领导说培养我,怎么看出来他是真培养,还是PUA我?

当你的领导说要培养你的时候&#xff0c;如何分辨他的真实意图&#xff1f; 一位网友问&#xff1a; 领导说培养我&#xff0c;怎么看出来他是真培养&#xff0c;还是PUA&#xff1f; 来看看这位网友的支招&#xff1a; 如果领导做到以下几点&#xff0c;那就是真的培养你。 1.…

实战记录01(QT连接MySQL显示未加载问题解决)

进入实战 学了这么几天就要开发新项目了&#xff0c;所以可能不会每日记录吧&#xff0c;遇到问题可能就发一下&#xff0c; 昨天是又重新写了一个登录和管理界面&#xff0c;写完之后就可以开始链接数据库了。 QT连接MYSQL报错 我一开始是在.pro文件中加上了 QTsql QTmysq…

【计算机视觉 | 目标检测】arxiv 计算机视觉关于目标检测的学术速递(5月31日论文合集)

文章目录 一、检测相关(10篇)1.1 Table Detection for Visually Rich Document Images1.2 DENTEX: An Abnormal Tooth Detection with Dental Enumeration and Diagnosis Benchmark for Panoramic X-rays1.3 TrueDeep: A systematic approach of crack detection with less dat…

C++ STL源码下载及目录结构

掌握一定的c基础&#xff0c;想更深入的了解STL源码实现的读者&#xff0c;可以参照本文下载源码阅读 1、源码下载 ​ SGI STL版本&#xff08;侯捷老师推荐&#xff09;&#xff1a;https://github.com/dongyusheng/csdn-code/tree/master/STLgcc STL&#xff1a;GitHub - gc…

使用 Docker 部署 Jenkins 代理(主从)控制服务器

自动化是 DevOps 的核心。各种自动化工具和技术真正实现了持续集成和持续交付的概念。这些工具多年来发展迅速&#xff0c;但似乎永远存在的一个名字是Jenkins。 我们不会在这篇文章中讨论 CI-CD 的介绍性概念&#xff0c;也不会浪费时间展示 Jenkins 安装步骤。如果您是 Jenk…

Go1.21 速览:骚操作 panic(nil) 将成为历史!以后别这么干了。。。

大家好&#xff0c;我是煎鱼。 在 Go 语言中&#xff0c;返回错误、抛出异常一直是大家比较关注的话题。在抛出异常上&#xff0c;我们一般都是这么用的&#xff1a; func mayPanic() {panic("脑子进煎鱼了") }func main() {defer func() {if r : recover(); r ! nil…

Vagrant编排虚拟机安装与配置

1.安装VirtualBox与Vagrant VirtualBox与Vagrant是支持不通过操作系统的&#xff1b;根据具体操作系统下载对应版本即可。(这里使用windows操作系统搭建) VirtualBox https://www.virtualbox.org/wiki/Downloads Vagrant https://developer.hashicorp.com/vagrant/downloads Va…