How to Replace an OCP Node (Part 2): Using the antman Scripts | OceanBase in Practice

news2024/10/12 10:30:38

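Preface:

OceanBase Cloud Platform (OCP) is the dedicated enterprise-grade management platform for OceanBase databases.

In a production environment, installing OCP is usually the first step: you set up the OCP platform and then rely on it to create, manage, and monitor the production clusters. Later on, a data-center relocation or some other requirement may make it necessary to migrate or replace an OCP server.

The previous article covered replacing OCP with the OAT platform; this one covers replacing an OCP server with the antman scripts. (Note: in this environment OCP is load-balanced by an F5, so the new machine must be configured on the F5 first. The same applies to other load-balancing setups.)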

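Environment background:

If you have worked with OB production environments, you may know that early versions required three separate Docker images when installing OCP: the OCP software, the metadb, and obproxy. Later OCP versions bundle the db and the proxy into a single Docker image. OAT can only manage a metadb deployed with the bundled db+proxy image; where they are deployed separately, you still need the antman scripts for the replacement.

This article focuses on replacement with antman. The software versions in my environment are: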

1. OCP software: ocp-all-in-one:3.3.3-20220906114643

2. metadb: OB2276_x86_20210409

3. proxy: OBP186_20210315

4. antman: t-oceanbase-antman-1.4.3-20220807073355.alios7.x86_64

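Procedure:

(1) Environment check / preparation

  • Check the environment of the replacement machine, including disk partitioning, creating the admin user, and installing Docker. Once everything is installed, run the precheck: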
cd /root/t-oceanbase-antman/clonescripts/
sh precheck.sh -m ocp
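  • Log in to the meta database and check whether any tenant has its primary zone on the node to be replaced. If so, switch the primary zone in advance.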
MySQL [oceanbase]> select * from __all_Server;
+----------------------------+----------------------------+--------------+----------+----+----------------+------------+-----------------+--------+-----------------------+--------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+
| gmt_create                 | gmt_modified               | svr_ip       | svr_port | id | zone           | inner_port | with_rootserver | status | block_migrate_in_time | build_version                                                                        | stop_time | start_service_time | first_sessid | with_partition | last_offline_time |
+----------------------------+----------------------------+--------------+----------+----+----------------+------------+-----------------+--------+-----------------------+--------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+
| 2022-03-17 22:59:19.979627 | 2023-03-20 10:27:01.147283 | 10.10.100.87 |     2882 |  6 | META_OB_ZONE_2 |       2881 |               0 | active |                     0 | 2.2.76_20210406232249-a1e144bdc179fbf473cea37f199e8a76c736b8d4(Apr  6 2021 23:55:12) |         0 |   1679279220991796 |            0 |              1 |  1679278517144838 |
| 2022-03-17 23:54:49.277939 | 2023-03-20 09:29:22.079578 | 10.10.100.9  |     2882 |  7 | META_OB_ZONE_1 |       2881 |               1 | active |                     0 | 2.2.76_20210406232249-a1e144bdc179fbf473cea37f199e8a76c736b8d4(Apr  6 2021 23:55:12) |         0 |   1679275725595691 |            0 |              1 |                 0 |
| 2021-12-21 22:44:16.476503 | 2023-03-20 09:29:22.080425 | 122.44.11.2  |     2882 |  5 | META_OB_ZONE_3 |       2881 |               0 | active |                     0 | 2.2.76_20210406232249-a1e144bdc179fbf473cea37f199e8a76c736b8d4(Apr  6 2021 23:55:12) |         0 |   1640097866698859 |            0 |              1 |                 0 |
+----------------------------+----------------------------+--------------+----------+----+----------------+------------+-----------------+--------+-----------------------+--------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+
MySQL [oceanbase]> select tenant_name,primary_zone  from __all_tenant;
+----------------+----------------------------------------------+
| tenant_name    | primary_zone                                 |
+----------------+----------------------------------------------+
| sys            | META_OB_ZONE_1;META_OB_ZONE_3;META_OB_ZONE_2 |
| ocp_meta       | META_OB_ZONE_1;META_OB_ZONE_3;META_OB_ZONE_2 |
| ocp_monitor    | META_OB_ZONE_1;META_OB_ZONE_3;META_OB_ZONE_2 |
| oms_tt_tenant  | META_OB_ZONE_1;META_OB_ZONE_3;META_OB_ZONE_2 |
| oms_cc7_tenant | META_OB_ZONE_3;META_OB_ZONE_2;META_OB_ZONE_1 |
| oms_ff9_tenant | META_OB_ZONE_1;META_OB_ZONE_2,META_OB_ZONE_3 |
| oms_cc9_tenant | META_OB_ZONE_3;META_OB_ZONE_1,META_OB_ZONE_2 |
| oms_dd_tenant  | META_OB_ZONE_3;META_OB_ZONE_1,META_OB_ZONE_2 |
| obdw_meta      | META_OB_ZONE_3;META_OB_ZONE_1,META_OB_ZONE_2 |
+----------------+----------------------------------------------+
MySQL [oceanbase]> alter tenant sys primary_zone='META_OB_ZONE_2;META_OB_ZONE_3,META_OB_ZONE_1';
Query OK, 0 rows affected (0.04 sec)

MySQL [oceanbase]> alter tenant ocp_meta primary_zone='META_OB_ZONE_2;META_OB_ZONE_3,META_OB_ZONE_1';
Query OK, 0 rows affected (1.27 sec)

MySQL [oceanbase]> alter tenant ocp_monitor primary_zone='META_OB_ZONE_2;META_OB_ZONE_3,META_OB_ZONE_1';
Query OK, 0 rows affected (0.02 sec)

MySQL [oceanbase]> alter tenant oms_tt_tenant primary_zone='META_OB_ZONE_2;META_OB_ZONE_3,META_OB_ZONE_1';
Query OK, 0 rows affected (0.03 sec)

MySQL [oceanbase]> alter tenant oms_ff9_tenant primary_zone='META_OB_ZONE_2;META_OB_ZONE_3,META_OB_ZONE_1';
Query OK, 0 rows affected (0.03 sec)
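  • Because the migration is driven by the antman scripts, obcluster.conf must be edited on the machine that runs them, or copied over from the original OCP host and then checked. The image packages also need to be uploaded to /root/t-oceanbase-antman on that machine.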
55obffocp:~/t-oceanbase-antman # cat obcluster.conf
ZONE1_RS_IP=10.10.100.9
ZONE2_RS_IP=10.10.100.87
ZONE3_RS_IP=122.44.11.2

######  自动配置,无需修改 / AUTO-CONFIGURATION ######

OBSERVER01_HOSTNAME=OCP_META_SERVER_1
OBSERVER02_HOSTNAME=OCP_META_SERVER_2
OBSERVER03_HOSTNAME=OCP_META_SERVER_3
ZONE1_NAME=META_OB_ZONE_1                       -- the zone names used in later commands must match the ones in this file
ZONE2_NAME=META_OB_ZONE_2
ZONE3_NAME=META_OB_ZONE_3
##there must be more than half zone within same region
ZONE1_REGION=OCP_META_REGION
ZONE2_REGION=OCP_META_REGION
ZONE3_REGION=OCP_META_REGION
MYSQL_PORT=2881
RPC_PORT=2882

OCP_VERSION=3.3.3
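  • Check that the default cluster passwords stored on the machine that runs the antman scripts are correct: cd ~/t-oceanbase-antman/tools and run getpass.sh. If they are wrong, fix them with setpass.sh, because the passwords are verified after the obproxy Docker container is migrated and again before the OCP Docker container is migrated.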
55obffocp:~/t-oceanbase-antman/tools # bash setpass.sh -s 0Aa255yK^F  
password file sys in /root/.key already exist!
**********************
Password of root@sys is  CqVgg9}Aut
Password of root@ocp_meta is  r6kS^EINTU
Password of root@ocp_monitor is  pkJv1a{7J7
Password of root@odc is  j{fjdd3X9f
Password of root@oms is  {oOIsE9fdQ

55obffocp:~ # mv .key  .key_bak
55obffocp:~ # cd /root/t-oceanbase-antman/tools/
55obffocp:~/t-oceanbase-antman/tools # bash setpass.sh -s 0Aa255yK^F
**********************
Password of root@sys is  0Aa255yK^F
Password of root@ocp_meta is 
Password of root@ocp_monitor is 
Password of root@odc is 
Password of root@oms is 
55obffocp:~/t-oceanbase-antman/tools # bash setpass.sh -c rSf@jO%6EO
**********************
Password of root@sys is  0Aa255yK^F
Password of root@ocp_meta is  rSf@jO%6EO
Password of root@ocp_monitor is 
Password of root@odc is 
Password of root@oms is 
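
The primary-zone check in the second bullet above can be scripted. The following is a minimal sketch, not part of antman: `gen_switch_sql` is a hypothetical helper name, and it assumes the tenant list is fed in as tab-separated `tenant_name` / `primary_zone` lines (for example from the mysql client with `-N -B`). It relies on the primary_zone syntax visible in the output above, where `;` separates priority levels and `,` groups zones of equal priority, and prints an `alter tenant` statement only for tenants whose highest-priority group contains the zone being vacated.

```shell
# gen_switch_sql OLD_ZONE NEW_PRIMARY_ZONE   (hypothetical helper, not an antman tool)
# Reads "tenant_name<TAB>primary_zone" lines on stdin and prints one
# "alter tenant" statement for every tenant whose highest-priority group
# (the part before the first ';') contains OLD_ZONE.
gen_switch_sql() {
    awk -F '\t' -v old="$1" -v target="$2" '
    {
        split($2, prio, ";")              # prio[1] is the highest-priority group
        n = split(prio[1], zones, ",")    # zones of equal priority are comma-separated
        for (i = 1; i <= n; i++) {
            if (zones[i] == old) {
                printf "alter tenant %s primary_zone=\047%s\047;\n", $1, target
                break
            }
        }
    }'
}

# Example (connection options are placeholders for your own environment):
#   mysql -h<meta_ip> -P2881 -uroot@sys -p -Doceanbase -N -B \
#         -e "select tenant_name, primary_zone from __all_tenant" |
#     gen_switch_sql META_OB_ZONE_1 'META_OB_ZONE_2;META_OB_ZONE_3,META_OB_ZONE_1'
```

Review the generated statements before running them; with the tenant list shown above, they would cover sys, ocp_meta, ocp_monitor, oms_tt_tenant and oms_ff9_tenant, the same five tenants that were switched by hand.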

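(2) Add the new machine with antman's manage script

  • Run antman's manage.sh to add the new machine. (P.S. manage.sh in this version throws an error; it is covered in the error notes at the end of the article.)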
55obffocp:~/t-oceanbase-antman # ./manage.sh -i ob,ocp,obproxy -l 133.55.22.19 -z 1 -R Jnydzycscc@123 -A OceanBase#123
[2023-06-16 16:31:45.375633] INFO [check conf file /root/t-oceanbase-antman/obcluster.conf format ...]
[2023-06-16 16:31:45.381844] INFO [conf file is upper case format.]
[2023-06-16 16:31:45.391446] INFO [SSH_AUTH=password SSH_USER=root SSH_PORT=22 SSH_PASSWORD= SSH_KEY_FILE=/root/.ssh/id_rsa]
LB_MODE=f5
INSTALL_COMPONENTS componets: ob obproxy ocp
CLEAR_COMPONENTS: 
IP_LIST: 133.55.22.19
ZONE_LIST: 1
ROOT_PASSWORD_LIST: Jnydzycscc@123
ADMIN_PASSWORD_LIST: OceanBase#123
[2023-06-16 16:31:45.746503] INFO [INSTALL_COMPONENT: ob START  ######################################]
[2023-06-16 16:31:45.751057] INFO [deploy_ob: check whether OBSERVER port 2881,2882 are in use or not on 133.55.22.19]
[2023-06-16 16:31:45.806500] INFO [deploy_ob: OBSERVER port 2881,2882 are idle on 133.55.22.19]
[2023-06-16 16:31:45.810773] INFO [deploy_ob: installing ob cluster, logfile: /root/t-oceanbase-antman/logs/deploy_ob.log]
cp: '/root/t-oceanbase-antman/OB2276_x86_20210409.tar.gz' and '/root/t-oceanbase-antman/OB2276_x86_20210409.tar.gz' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/install_OB_docker.sh' and '/root/t-oceanbase-antman/install_OB_docker.sh' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/obcluster.conf' and '/root/t-oceanbase-antman/obcluster.conf' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/common/utils.sh' and '/root/t-oceanbase-antman/common/utils.sh' are the same file
skip copy same file
cp: '/root/.key' and '/root/.key' are the same file
skip copy same file
nohup: ignoring input
[2023-06-16 16:31:45.841348] INFO [installing OB docker and starting OB server on 133.55.22.19, pid: 144513, log: /root/t-oceanbase-antman/logs/install_OB_docker.log and /home/admin/logs/ob-server/ inside docker]
[2023-06-16 16:31:45.925592] INFO [load docker image: docker load -i /root/t-oceanbase-antman/OB2276_x86_20210409.tar.gz]
[2023-06-16 16:31:45.930723] INFO [install_OB_docker.sh is still running on 133.55.22.19]
[2023-06-16 16:31:56.021465] INFO [install_OB_docker.sh is still running on 133.55.22.19]
Loaded image: reg.docker.alibaba-inc.com/antman/ob-docker:OB2276_x86_20210409
[2023-06-16 16:32:06.111458] INFO [install_OB_docker.sh is still running on 133.55.22.19]
[2023-06-16 16:32:06.359285] INFO [start container: docker run -d -it --cap-add SYS_RESOURCE --name META_OB_ZONE_1 --net=host     -e OBCLUSTER_NAME=obcluster      -e DEV_NAME=bond0     -e ROOTSERVICE_LIST="10.10.100.9:2882:2881;10.10.100.87:2882:2881;122.44.11.2:2882:2881"     -e DATAFILE_DISK_PERCENTAGE=90     -e CLUSTER_ID=1632654636     -e ZONE_NAME=META_OB_ZONE_1     -e OBPROXY_PORT=2883     -e MYSQL_PORT=2881     -e RPC_PORT=2882     -e OCP_VIP=134.80.173.57     -e OCP_VPORT=80     -e app.password_root='Jnydzycscc@123'     -e app.password_admin='OceanBase#123'     -e OBPROXY_OPTSTR=""     -e OPTSTR="cpu_count=64,system_memory=50G,memory_limit=254G,__min_full_resource_pool_memory=1073741824,_ob_enable_prepared_statement=false,memory_limit_percentage=90"     --cpu-period 100000     --cpu-quota 6400000     --cpuset-cpus 0-63     --memory 256G     -v /home/admin/oceanbase:/home/admin/oceanbase     -v /data/log1:/data/log1     -v /data/1:/data/1     --restart on-failure:5     reg.docker.alibaba-inc.com/antman/ob-docker:OB2276_x86_20210409]
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
4f1c15e8194cc1ae2fffcc124ea9c982b3fda87ce1a6d0038db88435c737af89
[2023-06-16 16:32:16.209761] INFO [install_OB_docker.sh finished and reg.docker.alibaba-inc.com/antman/ob-docker:OB2276_x86_20210409 started on 133.55.22.19]
[2023-06-16 16:32:16.214771] INFO [waiting on observer ready on 133.55.22.19]
[2023-06-16 16:35:16.244133] INFO [waiting on observer ready on 133.55.22.19 for 3 Minitues]
[2023-06-16 16:36:16.264776] INFO [waiting on observer ready on 133.55.22.19 for 4 Minitues]
[2023-06-16 16:37:16.285808] INFO [waiting on observer ready on 133.55.22.19 for 5 Minitues]
[2023-06-16 16:37:16.579057] INFO [observer on 133.55.22.19 is ready]
[2023-06-16 16:37:16.584583] INFO [deploy_ob: installation of ob cluster done]
[2023-06-16 16:37:16.588617] INFO [INSTALL_COMPONENT: ob DONE  ######################################]
[2023-06-16 16:37:16.593604] INFO [INSTALL_COMPONENT: obproxy START  ######################################]

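#### The log is too long to paste in full. As shown above, once the metadb Docker service has been added, the script moves on to adding the obproxy Docker service. ###

Notes:

1) "133.55.22.19" is the actual physical IP of the server that will replace the OCP node.

2) The "-z 1" option adds 133.55.22.19 to the first zone of the OCP environment, i.e. the same zone as the 10.10.100.9 machine found above.

The zone concept here applies only to the meta_ob Docker container on the OCP server; the obproxy and ocp containers have no notion of a zone.

To see which zone the meta_ob container on each OCP server belongs to, refer to the three variables in "obcluster.conf": ZONE1_RS_IP, ZONE2_RS_IP, ZONE3_RS_IP.

3) "-R" and "-A" take the root password and the admin password of the 133.55.22.19 server, respectively.

4) "-i" installs; replace it with "-c" to clear (uninstall).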
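
As a small illustration of note 2), the zone number for "-z" can be looked up from obcluster.conf instead of being read off by eye. This is a sketch under the assumption that the file contains plain `ZONEn_RS_IP=<ip>` lines as in the listing earlier; `zone_index_for_ip` is a hypothetical helper name, not something shipped with antman.

```shell
# zone_index_for_ip CONF_FILE IP   (hypothetical helper, not an antman tool)
# Prints N for the line "ZONEN_RS_IP=IP" in CONF_FILE, or nothing if no
# ZONEn_RS_IP entry matches the given IP.
zone_index_for_ip() {
    awk -F '=' -v ip="$2" '
    $1 ~ /^ZONE[0-9]+_RS_IP$/ {
        value = $2
        sub(/[ \t].*$/, "", value)     # drop trailing blanks or annotations
        if (value == ip) {
            idx = $1
            gsub(/[^0-9]/, "", idx)    # "ZONE1_RS_IP" -> "1"
            print idx
        }
    }' "$1"
}

# Example: find the zone of the server being replaced.
#   zone_index_for_ip /root/t-oceanbase-antman/obcluster.conf 10.10.100.9
```

Passing the IP of the server being replaced gives the zone that the new machine should join.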

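#### If everything went well, you can now log in to the OCP console at port 8080 on the new node's IP, and connect to the meta database through port 2883 on that machine.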

(3) Log in to the sys tenant of OCP's metadb and bring the new meta_ob container online

MySQL [oceanbase]> alter system add server '133.55.22.19:2882' zone 'META_OB_ZONE_1';
Query OK, 0 rows affected (0.02 sec)

MySQL [oceanbase]> select svr_ip, zone, with_rootserver, status, start_service_time from __all_server;
+--------------+----------------+-----------------+--------+--------------------+
| svr_ip       | zone           | with_rootserver | status | start_service_time |
+--------------+----------------+-----------------+--------+--------------------+
| 10.10.100.87 | META_OB_ZONE_2 |               1 | active |   1679279220991796 |
| 10.10.100.9  | META_OB_ZONE_1 |               0 | active |   1679275725595691 |
| 122.44.11.2  | META_OB_ZONE_3 |               0 | active |   1640097866698859 |
| 133.55.22.19 | META_OB_ZONE_1 |               0 | active |                  0 |
+--------------+----------------+-----------------+--------+--------------------+
4 rows in set (0.00 sec)

MySQL [oceanbase]> select svr_ip, zone, with_rootserver, status, start_service_time from __all_server;
+--------------+----------------+-----------------+--------+--------------------+
| svr_ip       | zone           | with_rootserver | status | start_service_time |
+--------------+----------------+-----------------+--------+--------------------+
| 10.10.100.87 | META_OB_ZONE_2 |               1 | active |   1679279220991796 |
| 10.10.100.9  | META_OB_ZONE_1 |               0 | active |   1679275725595691 |
| 122.44.11.2  | META_OB_ZONE_3 |               0 | active |   1640097866698859 |
| 133.55.22.19 | META_OB_ZONE_1 |               0 | active |   1686908200404755 |
+--------------+----------------+-----------------+--------+--------------------+
4 rows in set (0.01 sec)
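
The two queries above show the new server first with start_service_time = 0 and then with a non-zero value, which is the point at which it is actually serving. A minimal sketch of that check follows; `server_is_serving` is a hypothetical helper that expects tab-separated `svr_ip`, `status`, `start_service_time` lines on stdin (for example from the mysql client with `-N -B`).

```shell
# server_is_serving IP   (hypothetical helper)
# Reads "svr_ip<TAB>status<TAB>start_service_time" lines on stdin.
# Succeeds only if IP is listed as active with a non-zero start_service_time.
server_is_serving() {
    awk -F '\t' -v ip="$1" '
        $1 == ip && $2 == "active" && ($3 + 0) > 0 { found = 1 }
        END { if (found) exit 0; exit 1 }'
}

# Example (connection options are placeholders):
#   mysql -h<meta_ip> -P2881 -uroot@sys -p -Doceanbase -N -B \
#         -e "select svr_ip, status, start_service_time from __all_server" |
#     server_is_serving 133.55.22.19 && echo "new server is serving"
```

It makes sense to wait for this check to succeed before taking the old server offline in the next step.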

(4) Log in to the sys tenant of OCP's metadb and take the replaced meta_ob container offline

MySQL [oceanbase]>  alter system delete server '10.10.100.9:2882' zone 'META_OB_ZONE_1';
Query OK, 0 rows affected (0.19 sec)

MySQL [oceanbase]> select svr_ip, zone, with_rootserver, status, start_service_time from __all_server;
+--------------+----------------+-----------------+----------+--------------------+
| svr_ip       | zone           | with_rootserver | status   | start_service_time |
+--------------+----------------+-----------------+----------+--------------------+
| 10.10.100.87 | META_OB_ZONE_2 |               1 | active   |   1679279220991796 |
| 10.10.100.9  | META_OB_ZONE_1 |               0 | deleting |   1679275725595691 |
| 122.44.11.2  | META_OB_ZONE_3 |               0 | active   |   1640097866698859 |
| 133.55.22.19 | META_OB_ZONE_1 |               0 | active   |   1686908200404755 |
+--------------+----------------+-----------------+----------+--------------------+
4 rows in set (0.01 sec)
MySQL [oceanbase]> select svr_ip, zone, with_rootserver, status, start_service_time from __all_server;
+--------------+----------------+-----------------+--------+--------------------+
| svr_ip       | zone           | with_rootserver | status | start_service_time |
+--------------+----------------+-----------------+--------+--------------------+
| 10.10.100.87 | META_OB_ZONE_2 |               1 | active |   1679279220991796 |
| 122.44.11.2  | META_OB_ZONE_3 |               0 | active |   1640097866698859 |
| 133.55.22.19 | META_OB_ZONE_1 |               0 | active |   1686908200404755 |
+--------------+----------------+-----------------+--------+--------------------+
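
The old server stays in the deleting status until its units have been migrated away, and only then does its row disappear from __all_server. Rather than re-running the query by hand, the wait can be sketched as a polling loop. `wait_server_gone` is a hypothetical helper; the command that lists the server IPs is passed in as an argument so the loop itself makes no assumption about how you connect.

```shell
# wait_server_gone IP INTERVAL_SECONDS MAX_TRIES LIST_CMD [ARGS...]
# (hypothetical helper) LIST_CMD must print one svr_ip per line.
# Returns 0 once IP is no longer listed, 1 if it is still listed after
# MAX_TRIES checks, and 2 if LIST_CMD itself fails.
wait_server_gone() {
    wsg_ip=$1; wsg_interval=$2; wsg_tries=$3
    shift 3
    while [ "$wsg_tries" -gt 0 ]; do
        # A failed query must not be mistaken for "server is gone".
        wsg_list=$("$@") || return 2
        if ! printf '%s\n' "$wsg_list" | grep -qxF -- "$wsg_ip"; then
            return 0
        fi
        wsg_tries=$((wsg_tries - 1))
        if [ "$wsg_tries" -gt 0 ]; then sleep "$wsg_interval"; fi
    done
    return 1
}

# Example (list_servers is your own wrapper; connection options are placeholders):
#   list_servers() {
#       mysql -h<meta_ip> -P2881 -uroot@sys -p<password> -Doceanbase -N -B \
#             -e "select svr_ip from __all_server"
#   }
#   wait_server_gone 10.10.100.9 60 30 list_servers || echo "still deleting, check unit migration"
```

If the loop times out, the second case in the error notes at the end of this article describes one cause found in this environment.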

(5) Log in to the ocp_meta tenant and manually update the OCP server information

#### After the previous steps, the OCP console still shows leftover information for the old server, which needs to be replaced. #####

55obffocp:~/t-oceanbase-antman # mysql -h10.10.100.87 -P2883 -uroot@ocp_meta#obcluster -p'rSf@jO%6EO' -Docp -c


MySQL [ocp]>  select * from compute_host where inner_ip_address='10.10.100.9'\G
*************************** 1. row ***************************
              id: 1
            name: ocp1a
     description: NULL
operating_system: 4.12.14-120-default
    architecture: x86_64
inner_ip_address: 10.10.100.9
        ssh_port: 2022
            kind: DEDICATED_PHYSICAL_MACHINE
   publish_ports: NULL
          status: ONLINE
          vpc_id: 1
          idc_id: 1
    host_type_id: 1
   serial_number: NULL
           alias: NULL
     create_time: 2021-09-26 21:04:11
     update_time: 2023-03-20 11:01:58
1 row in set (0.00 sec)

MySQL [ocp]> update compute_host set inner_ip_address='133.55.22.19', name='55obffocp' where inner_ip_address='10.10.100.9';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MySQL [ocp]> select * from compute_host where id =1;
+----+-----------+-------------+---------------------+--------------+------------------+----------+----------------------------+---------------+--------+--------+--------+--------------+---------------+-------+---------------------+---------------------+
| id | name      | description | operating_system    | architecture | inner_ip_address | ssh_port | kind                       | publish_ports | status | vpc_id | idc_id | host_type_id | serial_number | alias | create_time         | update_time         |
+----+-----------+-------------+---------------------+--------------+------------------+----------+----------------------------+---------------+--------+--------+--------+--------------+---------------+-------+---------------------+---------------------+
|  1 | 55obffocp | NULL        | 4.12.14-120-default | x86_64       | 133.55.22.19     |     2022 | DEDICATED_PHYSICAL_MACHINE | NULL          | ONLINE |      1 |      1 |            1 | NULL          | NULL  | 2021-09-26 21:04:11 | 2023-06-16 17:47:19 |
+----+-----------+-------------+---------------------+--------------+------------------+----------+----------------------------+---------------+--------+--------+--------+--------------+---------------+-------+---------------------+---------------------+
1 row in set (0.00 sec)

(6) Clean up the leftover services on the replaced machine

ocp1a:~/t-oceanbase-antman #  ./manage.sh -c ob,ocp,obproxy -l 10.10.100.9 -z 1 -R 'Dt!n(Rg4Av!t' -A OceanBase#123
grep: /etc/system-release: No such file or directory
[2023-06-16 22:45:44.101400] INFO [check conf file /root/t-oceanbase-antman/obcluster.conf format ...]
[2023-06-16 22:45:44.106779] INFO [conf file is upper case format.]
[2023-06-16 22:45:44.114437] INFO [SSH_AUTH=password SSH_USER=root SSH_PORT=22 SSH_PASSWORD= SSH_KEY_FILE=/root/.ssh/id_rsa]
LB_MODE=f5
INSTALL_COMPONENTS componets: 
CLEAR_COMPONENTS: ob obproxy ocp
IP_LIST: 10.10.100.9
ZONE_LIST: 1
ROOT_PASSWORD_LIST: Dt!n(Rg4Av!t
ADMIN_PASSWORD_LIST: OceanBase#123
[2023-06-16 22:45:44.474268] INFO [CLEAR_COMPONENT: ob START  ######################################]
cp: '/root/t-oceanbase-antman/uninstall.sh' and '/root/t-oceanbase-antman/uninstall.sh' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/obcluster.conf' and '/root/t-oceanbase-antman/obcluster.conf' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/common/utils.sh' and '/root/t-oceanbase-antman/common/utils.sh' are the same file
skip copy same file
grep: /etc/system-release: No such file or directory
[2023-06-16 22:45:44.504069] INFO [remove OB server and docker on host: 10.10.100.9]
[2023-06-16 22:45:44.548697] INFO [docker rm -f 62ab623cb4ed]
62ab623cb4ed
[2023-06-16 22:46:01.260706] INFO [remove OB server and docker on host: 10.10.100.9 done!]
[2023-06-16 22:46:01.370808] INFO [uninstall.sh ob  finished and reg.docker.alibaba-inc.com/antman/ob-docker:OB2276_x86_20210409 removed on 10.10.100.9]
[2023-06-16 22:46:01.375914] INFO [OB docker on 10.10.100.9 is removed]
[2023-06-16 22:46:01.380667] INFO [CLEAR_COMPONENT: ob DONE  ######################################]
[2023-06-16 22:46:01.385398] INFO [CLEAR_COMPONENT: obproxy START  ######################################]
cp: '/root/t-oceanbase-antman/uninstall.sh' and '/root/t-oceanbase-antman/uninstall.sh' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/obcluster.conf' and '/root/t-oceanbase-antman/obcluster.conf' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/common/utils.sh' and '/root/t-oceanbase-antman/common/utils.sh' are the same file
skip copy same file
grep: /etc/system-release: No such file or directory
[2023-06-16 22:46:01.416495] INFO [remove obproxy docker on host:10.10.100.9]
[2023-06-16 22:46:01.514215] INFO [docker rm -f 01bdcadf2e11]
01bdcadf2e11
[2023-06-16 22:46:01.765459] INFO [remove obproxy docker on host:10.10.100.9 done!]
[2023-06-16 22:46:01.858848] INFO [uninstall.sh obproxy  finished and reg.docker.alibaba-inc.com/antman/obproxy:OBP186_20210315 removed on 10.10.100.9]
[2023-06-16 22:46:01.863806] INFO [obproxy docker on 10.10.100.9 is removed]
[2023-06-16 22:46:01.868778] INFO [CLEAR_COMPONENT: obproxy DONE  ######################################]
[2023-06-16 22:46:01.873368] INFO [CLEAR_COMPONENT: ocp START  ######################################]
cp: '/root/t-oceanbase-antman/uninstall.sh' and '/root/t-oceanbase-antman/uninstall.sh' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/obcluster.conf' and '/root/t-oceanbase-antman/obcluster.conf' are the same file
skip copy same file
cp: '/root/t-oceanbase-antman/common/utils.sh' and '/root/t-oceanbase-antman/common/utils.sh' are the same file
skip copy same file
grep: /etc/system-release: No such file or directory
[2023-06-16 22:46:01.906934] INFO [remove ocp docker on host:10.10.100.9]
[2023-06-16 22:46:01.944811] INFO [docker rm -f 8b044744a92e]
8b044744a92e
[2023-06-16 22:46:26.162467] INFO [remove ocp docker on host:10.10.100.9 done]
[2023-06-16 22:46:26.253927] INFO [uninstall.sh ocp  finished and reg.docker.alibaba-inc.com/oceanbase/ocp-all-in-one:3.3.3-20220906114643 removed on 10.10.100.9]
[2023-06-16 22:46:26.258281] INFO [ocp docker on 10.10.100.9 is removed]
[2023-06-16 22:46:26.263047] INFO [CLEAR_COMPONENT: ocp DONE  ######################################]
ocp1a:~/t-oceanbase-antman # docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

Errors encountered and how they were handled:

  • Error when running the manage script
55obffocp:~/t-oceanbase-antman # ./manage.sh -i ob,ocp,obproxy -l 133.55.22.19 -z 1 -R Jnydzycscc@123 -A OceanBase#123
[2023-06-16 16:31:03.305062] INFO [check conf file /root/t-oceanbase-antman/obcluster.conf format ...]
[2023-06-16 16:31:03.309079] INFO [conf file is upper case format.]
[2023-06-16 16:31:03.315290] INFO [SSH_AUTH=password SSH_USER=root SSH_PORT=22 SSH_PASSWORD= SSH_KEY_FILE=/root/.ssh/id_rsa]
LB_MODE=f5
INSTALL_COMPONENTS componets: ob obproxy ocp
CLEAR_COMPONENTS: 
IP_LIST: 133.55.22.19
ZONE_LIST: 1
ROOT_PASSWORD_LIST: Jnydzycscc@123
ADMIN_PASSWORD_LIST: OceanBase#123
/root/t-oceanbase-antman/common/utils.sh: line 484: -e: command not found
[2023-06-16 16:31:03.636862] ERROR [: ssh authorization to 133.55.22.19 failed, Please check SSH affinity environment varialbes.]

###### Fixing this problem also requires modifying the script code (the error points at line 484 of common/utils.sh). #########

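  • Long after running alter system delete server, the replaced server had still not been removed and remained in the deleting status. Investigation showed that the memory parameter of OCP's meta database had been tuned at some point, while the newly added server came up with a smaller value, so unit migration got stuck.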
MySQL [oceanbase]> select svr_ip, zone, with_rootserver, status, start_service_time from __all_server;
+--------------+----------------+-----------------+----------+--------------------+
| svr_ip       | zone           | with_rootserver | status   | start_service_time |
+--------------+----------------+-----------------+----------+--------------------+
| 10.10.100.87 | META_OB_ZONE_2 |               1 | active   |   1679279220991796 |
| 10.10.100.9  | META_OB_ZONE_1 |               0 | deleting |   1679275725595691 |
| 122.44.11.2  | META_OB_ZONE_3 |               0 | active   |   1640097866698859 |
| 133.55.22.19 | META_OB_ZONE_1 |               0 | active   |   1686908200404755 |
+--------------+----------------+-----------------+----------+--------------------+
4 rows in set (0.01 sec)


MySQL [oceanbase]> select count(*),svr_ip from  gv$unit  group by svr_ip;
+----------+--------------+
| count(*) | svr_ip       |
+----------+--------------+
|       27 | 133.55.22.19 |
|       33 | 10.10.100.87 |
|       33 | 122.44.11.2  |
|        6 | 10.10.100.9  |
+----------+--------------+
4 rows in set (0.01 sec)
MySQL [oceanbase]> select *   from  gv$unit   where  svr_ip='10.10.100.9';
+---------+----------------+---------------------------------------------+------------------+----------------------------------------+----------------+-----------+-------------+-------------+----------+---------------------+-----------------------+---------+---------+-------------+-------------+----------+----------+---------------+-----------------+
| unit_id | unit_config_id | unit_config_name                            | resource_pool_id | resource_pool_name                     | zone           | tenant_id | tenant_name | svr_ip      | svr_port | migrate_from_svr_ip | migrate_from_svr_port | max_cpu | min_cpu | max_memory  | min_memory  | max_iops | min_iops | max_disk_size | max_session_num |
+---------+----------------+---------------------------------------------+------------------+----------------------------------------+----------------+-----------+-------------+-------------+----------+---------------------+-----------------------+---------+---------+-------------+-------------+----------+----------+---------------+-----------------+
|    1106 |           1090 | config_oms_tt_tenant_META_OB_ZONE_1_S2_gpa  |             1080 | pool_oms_tt_tenant_META_OB_ZONE_1_gpa  | META_OB_ZONE_1 |      NULL | NULL        | 10.10.100.9 |     2882 |                     |                     0 |       3 |       3 | 12884901888 | 12884901888 |     2500 |     2500 |  536870912000 |             750 |
|    1139 |           1094 | oms_unit                                    |             1129 | oms_ff9_tenant_resource_pool           | META_OB_ZONE_1 |      NULL | NULL        | 10.10.100.9 |     2882 |                     |                     0 |       2 |       2 |  5368709120 |  4294967296 |      128 |      128 |    5368709120 |           10000 |
|    1122 |           1097 | config_oms_c55_tenant_META_OB_ZONE_1_S1_ifu |             1088 | pool_oms_c55_tenant_META_OB_ZONE_1_ifu | META_OB_ZONE_1 |      NULL | NULL        | 10.10.100.9 |     2882 |                     |                     0 |     1.5 |     1.5 |  6442450944 |  6442450944 |     1250 |     1250 |  536870912000 |             375 |
|    1126 |           1100 | config_oms_ff6_tenant_META_OB_ZONE_1_S1_uzz |             1092 | pool_oms_ff6_tenant_META_OB_ZONE_1_uzz | META_OB_ZONE_1 |      NULL | NULL        | 10.10.100.9 |     2882 |                     |                     0 |     1.5 |     1.5 |  6442450944 |  6442450944 |     1250 |     1250 |  536870912000 |             375 |
|    1127 |           1101 | config_oms_ff7_tenant_META_OB_ZONE_1_S1_gkj |             1093 | pool_oms_ff7_tenant_META_OB_ZONE_1_gkj | META_OB_ZONE_1 |      NULL | NULL        | 10.10.100.9 |     2882 |                     |                     0 |     1.5 |     1.5 |  6442450944 |  6442450944 |     1250 |     1250 |  536870912000 |             375 |
|    1135 |           1108 | config_oms_cc8_tenant_META_OB_ZONE_1_S1_wwo |             1101 | pool_oms_cc8_tenant_META_OB_ZONE_1_wwo | META_OB_ZONE_1 |      NULL | NULL        | 10.10.100.9 |     2882 |                     |                     0 |     1.5 |     1.5 |  6442450944 |  6442450944 |     1250 |     1250 |  536870912000 |             375 |
+---------+----------------+---------------------------------------------+------------------+----------------------------------------+----------------+-----------+-------------+-------------+----------+---------------------+-----------------------+---------+---------+-------------+-------------+----------+----------+---------------+-----------------+
6 rows in set (0.02 sec)


MySQL [oceanbase]> alter  system migrate unit=1106 destination='133.55.22.19:2882';
ERROR 4624 (HY000): machine resource is not enough to hold a new unit              ---------------- migrating the unit manually reports insufficient resources

MySQL [oceanbase]> select zone,svr_ip, cpu_total, cpu_assigned,cpu_assigned_percent cpu_ass_pct, round(mem_total/1024/1024/1024) mem_total_gb,
    ->        round(mem_assigned/1024/1024/1024) mem_ass_gb, mem_assigned_percent mem_ass_pct, unit_num, migrating_unit_num, leader_count, round(`load`,2) `load`
    -> from __all_virtual_server_stat
    -> order by zone, svr_ip;                                --------------------- checking resources shows that memory is insufficient
+----------------+--------------+-----------+--------------+-------------+--------------+------------+-------------+----------+--------------------+--------------+------+
| zone           | svr_ip       | cpu_total | cpu_assigned | cpu_ass_pct | mem_total_gb | mem_ass_gb | mem_ass_pct | unit_num | migrating_unit_num | leader_count | load |
+----------------+--------------+-----------+--------------+-------------+--------------+------------+-------------+----------+--------------------+--------------+------+
| META_OB_ZONE_1 | 10.10.100.9  |        62 |           11 |          17 |          250 |         40 |          16 |        6 |                  0 |            0 | 0.17 |
| META_OB_ZONE_1 | 133.55.22.19 |        62 |           48 |          77 |          204 |        196 |          96 |       27 |                  0 |            0 | 0.87 |
| META_OB_ZONE_2 | 10.10.100.87 |        62 |           59 |          95 |          250 |        236 |          94 |       33 |                  0 |         2935 | 0.95 |
| META_OB_ZONE_3 | 122.44.11.2  |        62 |           59 |          95 |          250 |        236 |          94 |       33 |                  0 |         1051 | 0.95 |

MySQL [oceanbase]> show parameters  like  '%memory_limit%'
 ;
| META_OB_ZONE_3 | observer | 122.44.11.2  |     2882 | memory_limit                        | NULL      | 300G  | the size of the memory reserved for internal use(for testing purpose). Range: [0M,)                                                                                                                                                    | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| META_OB_ZONE_1 | observer | 133.55.22.19 |     2882 | memory_limit                        | NULL      | 254G  | the size of the memory reserved for internal use(for testing purpose). Range: [0M,)                                                                                                                                                    | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| META_OB_ZONE_2 | observer | 10.10.100.87 |     2882 | memory_limit                        | NULL      | 300G  | the size of the memory reserved for internal use(for testing purpose). Range: [0M,)                                                                                                                                                    | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| META_OB_ZONE_1 | observer | 10.10.100.9  |     2882 | memory_limit                        | NULL      | 300G  | the size of the memory reserved for internal use(for testing purpose). Range: [0M,)                                                                                                                                                    | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |

MySQL [oceanbase]> alter   system  set   memory_limit ='300G'  ;
Query OK, 0 rows affected (0.05 sec)

MySQL [oceanbase]> select count(*),svr_ip from  gv$unit  group by svr_ip;
+----------+--------------+
| count(*) | svr_ip       |
+----------+--------------+
|       33 | 133.55.22.19 |
|       33 | 10.10.100.87 |
|       33 | 122.44.11.2  |
+----------+--------------+
3 rows in set (0.00 sec)


MySQL [oceanbase]> select svr_ip, zone, with_rootserver, status, start_service_time from __all_server;
+--------------+----------------+-----------------+--------+--------------------+
| svr_ip       | zone           | with_rootserver | status | start_service_time |
+--------------+----------------+-----------------+--------+--------------------+
| 10.10.100.87 | META_OB_ZONE_2 |               1 | active |   1679279220991796 |
| 122.44.11.2  | META_OB_ZONE_3 |               0 | active |   1640097866698859 |
| 133.55.22.19 | META_OB_ZONE_1 |               0 | active |   1686908200404755 |
+--------------+----------------+-----------------+--------+--------------------+
3 rows in set (0.00 sec)
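
The `gv$unit` check above can be automated with a small filter over the mysql client's boxed output. This is only a sketch: the function name `check_unit_balance` is made up for illustration, and it assumes the two-column `count(*) | svr_ip` layout shown above.

```shell
#!/bin/sh
# check_unit_balance (hypothetical helper): reads the boxed "count | svr_ip"
# table from stdin and reports whether every server holds the same unit count.
check_unit_balance() {
    awk -F'|' '
        # Data rows look like: |       33 | 133.55.22.19 |
        # Border rows and the header row have no purely numeric second field.
        $2 ~ /^ *[0-9]+ *$/ {
            gsub(/ /, "", $2); gsub(/ /, "", $3)
            n++
            if (n == 1) first = $2
            else if ($2 != first) bad = 1
            printf "%s units=%s\n", $3, $2
        }
        END {
            if (n == 0) { print "no rows parsed"; exit 2 }
            if (bad)    { print "UNBALANCED";     exit 1 }
            print "BALANCED"
        }'
}
```

To use it, pipe the query result into the function. When output is piped, the mysql client needs `-t` (`--table`) to keep the boxed format; the connection options are whatever you already use to log in to the meta database.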

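Summary:

That completes the replacement of an OCP machine using the antman scripts. This article and the previous one on replacing an OCP node with OAT may make the procedure look easy, but I had to go through the whole process several times, so I wrote both articles to help others avoid the same pitfalls. As the previous article showed, when OAT replaces an OCP node, the new machine is added to the metadb as a new zone and the replaced machine is then taken offline, which also involves creating a new resource pool, modifying the Locality, and increasing the replica count. The antman scripts work differently: the new machine joins the same zone as the machine being replaced, the units are migrated within that zone, and the replaced machine is then taken offline. At present, an antman replacement has comparatively less impact on the OCP metadata, while OAT requires fewer command-line operations. For earlier deployments where obproxy runs in its own docker container, antman is the only option; for later versions, choose whichever suits your situation.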
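
The same-zone flow described in the summary can be sketched as a dry-run generator that only prints SQL for review. It assumes the standard OceanBase `ALTER SYSTEM ADD SERVER` and `ALTER SYSTEM DELETE SERVER` statements and the default RPC port 2882 seen in the outputs above. The function name `gen_replace_sql` and the IP addresses in the usage note are hypothetical, and the antman scripts themselves may carry out these steps differently.

```shell
#!/bin/sh
# gen_replace_sql ZONE OLD_IP NEW_IP [PORT]   (hypothetical helper)
# Prints, but does not execute, statements for a same-zone server replacement.
gen_replace_sql() {
    zone=$1; old_ip=$2; new_ip=$3; port=${4:-2882}
    if [ -z "$zone" ] || [ -z "$old_ip" ] || [ -z "$new_ip" ]; then
        echo "usage: gen_replace_sql ZONE OLD_IP NEW_IP [PORT]" >&2
        return 1
    fi
    cat <<EOF
-- 1. Add the new server to the zone of the server being replaced.
ALTER SYSTEM ADD SERVER '${new_ip}:${port}' ZONE '${zone}';
-- 2. Delete the old server; its units are expected to move to the other server(s) in ${zone}.
ALTER SYSTEM DELETE SERVER '${old_ip}:${port}' ZONE '${zone}';
-- 3. Re-run until the old server no longer appears in the list.
SELECT svr_ip, zone, status FROM __all_server WHERE zone = '${zone}';
EOF
}
```

For example, `gen_replace_sql META_OB_ZONE_1 192.0.2.10 192.0.2.20` prints the three statements with placeholder addresses. Printing them first, rather than executing them directly, leaves room to check each one against the actual cluster before running it in the sys tenant.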

Follow the path you have chosen, and do not ask how far it leads.
