Offline deployment of a Kubernetes v1.24.0 cluster with Kubespray v2.21.0 on Rocky Linux 8.7

news2025/1/8 23:58:03

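Table of contents

    • Preface
    • Create 7 virtual machines
    • Requirements
    • Configure the proxy
    • Download the media
    • Initialize the media
    • Install the toolkit
    • Configure SSH trust
    • Write inventory.ini
    • Create offline.yml
    • Deploy the offline repo
    • Deploy kubespray
    • Error 2
    • Error 3
    • Error
      • Error: container-engine/containerd : containerd | Create registry directories
    • Step-by-step execution
    • Custom deployment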

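Preface

Kubespray is a project from the Kubernetes incubator that aims to provide a production-ready Kubernetes deployment. It is built on Ansible playbooks that define the tasks for preparing the systems and deploying the Kubernetes cluster, and it has the following features:

  • Can be deployed on AWS, GCE, Azure, OpenStack, and bare metal.
  • Deploys highly available Kubernetes clusters.
  • Composable: you can choose which network plugin (flannel, calico, canal, weave) to deploy.
  • Supports multiple Linux distributions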
    • RHEL 7 / CentOS 7
    • RHEL 8 / AlmaLinux 8
    • Ubuntu 20.04 / 22.04

This article explains how to deploy Kubernetes to bare-metal nodes with Kubespray. The versions installed are:

  • Rocky Linux 8.7
  • Kubernetes v1.24.10
  • kubespray v2.21.0
  • docker-ce 23.0.2

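Create 7 virtual machines

Create the virtual machines with the vSphere client. For how to create a virtual machine with the vSphere client, see here.

Requirements:

  • OS: Rocky Linux 8.7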

  • CPU: 4

  • MEM: 8G

  • DISK: 30G

Rocky Linux 9.1 beginner's guide

Configure the addresses and hostnames as follows:

10.168.110.21 kube-control-plan01
10.168.110.31 dbscale-control-plan01

10.168.110.41 kube-node01
10.168.110.42 kube-node02
10.168.110.43 kube-node03

10.168.9.199 registry01

Then switch the network configuration of all k8s cluster machines to the offline environment:

100.168.110.21 kube-control-plan01
100.168.110.31 dbscale-control-plan01

100.168.110.41 kube-node01
100.168.110.42 kube-node02
100.168.110.43 kube-node03

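10.168.9.199 registry01 is the distribution address for the images and the yum repository media. It is not switched to the offline network for now.

192.168.26.10 is the installation machine. It stays online and pulls the required images.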

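Requirements

Offline files:

  • yum repository
  • container images
  • pip packages

Support scripts for the target nodes:

  • Install containerd from local files.
  • Start up nginx container as web server to supply Yum/Deb repository and PyPI mirror.
  • Start a Docker private registry.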
  • Load all container images and push them to the private registry.

Configure the proxy

A proxy speeds up downloading the media from sources such as GitHub, Quay, and Docker Hub.

$ vim /root/.bashrc
ENV_PROXY="http://192.168.21.101:7890"

enable_proxy() {
    export HTTP_PROXY="$ENV_PROXY"
    export HTTPS_PROXY="$ENV_PROXY"
    export ALL_PROXY="$ENV_PROXY"
    export NO_PROXY="proxyhost,localhost,*.vsphere.local,*.vm.demo,*.tanzu.demo,192.168.21.101,127.0.0.1/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"

    export http_proxy="$ENV_PROXY"
    export https_proxy="$ENV_PROXY"
    export all_proxy="$ENV_PROXY"
    export no_proxy="proxyhost,localhost,*.vsphere.local,*.vm.demo,*.tanzu.demo,192.168.21.101,127.0.0.1/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"

    #git config --global http.proxy $ENV_PROXY
    #git config --global https.proxy $ENV_PROXY
}



disable_proxy() {
    unset HTTP_PROXY
    unset HTTPS_PROXY
    unset ALL_PROXY
    unset NO_PROXY

    unset http_proxy
    unset https_proxy
    unset all_proxy
    unset no_proxy

    #git config --global --unset http.proxy
    #git config --global --unset https.proxy
}

#source <(kubectl completion bash)

enable_proxy
#disable_proxy

Run:

source /root/.bashrc
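
After sourcing the file it can help to confirm that the proxy variables really are exported before starting a long download. The function below is a minimal sketch of such a check; `proxy_status` is a hypothetical helper name that is not part of the original .bashrc, and it only sees exported variables, which is what `enable_proxy` produces.

```shell
# Hypothetical helper (not in the original .bashrc): report which of the
# proxy variables handled by enable_proxy/disable_proxy are exported.
proxy_status() {
    for name in HTTP_PROXY HTTPS_PROXY ALL_PROXY NO_PROXY \
                http_proxy https_proxy all_proxy no_proxy; do
        # printenv prints the value of an exported variable and fails
        # when the variable is not in the environment.
        if value="$(printenv "$name")" && [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is unset\n' "$name"
        fi
    done
}
```

Calling `proxy_status` after `enable_proxy` should list eight assignments; after `disable_proxy` every line should read "is unset".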

Download the media

Run the following on 192.168.26.10 (the online machine):

 git clone https://github.com/tmurakam/kubespray-offline.git
 cd kubespray-offline

Before downloading the media, install the following:

  • run install-docker.sh to install Docker CE.
  • run install-containerd.sh to install containerd and nerdctl.
  • Set docker environment variable to /usr/local/bin/nerdctl in config.sh.

./install-docker.sh
./install-containerd.sh

After the installation succeeds, download the media by running:

./download-all.sh

All artifacts are stored in the ./outputs directory.
This script calls all of the following scripts:

  • prepare-pkgs.sh
    Setup python, etc.
  • prepare-py.sh
    Setup python venv, install required python packages.
  • get-kubespray.sh
    Download and extract kubespray, if KUBESPRAY_DIR does not exist.
  • pypi-mirror.sh
    Download PyPI mirror files
  • download-kubespray-files.sh
    Download kubespray offline files (containers, files, etc)
  • download-additional-containers.sh
    Download additional containers.
    You can add any container image repoTag to imagelists/*.txt.
  • create-repo.sh
    Download RPM or DEB repositories.
  • copy-target-scripts.sh
    Copy scripts for target node.

Downloading the media took about 30 minutes.
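
If one step of the download fails (for example because of a proxy timeout), it is convenient to rerun only the remaining scripts instead of the whole chain. The following is a minimal sketch of a stop-on-first-failure runner; `run_steps` is a hypothetical helper and is not how download-all.sh itself is implemented.

```shell
# Hypothetical helper: run the given scripts from the current directory
# in order and stop at the first one that exits with a non-zero status.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        if ! "./$step"; then
            echo "FAILED: $step" >&2
            return 1
        fi
    done
}

# Example, from the kubespray-offline checkout, resuming after a failure
# in pypi-mirror.sh:
#   run_steps pypi-mirror.sh download-kubespray-files.sh \
#             download-additional-containers.sh create-repo.sh \
#             copy-target-scripts.sh
```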

Initialize the media

Package:

cd ../
tar czvf kubespray-offline-2.21.0.tar.gz kubespray-offline/

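Log in to the management node.
Copy everything in the outputs directory to the target node (the node that runs ansible), then run the following script in the outputs directory: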

tar zxvf kubespray-offline-2.21.0.tar.gz
cd kubespray-offline/outputs
./setup-all.sh

This script calls all of the following scripts:

  • setup-container.sh
    Install containerd from local files.
    Load nginx and registry images to containerd.
  • start-nginx.sh
    Start nginx container.
  • setup-offline.sh
    Setup yum/deb repo config and PyPI mirror config to use local nginx server.
  • setup-py.sh
    Install python3 and venv from local repo.
  • start-registry.sh
    Start docker private registry container.
  • load-push-images.sh
    Load all container images to containerd.
    Tag and push them to the private registry.
  • extract-kubespray.sh
    Extract kubespray tarball and apply all patches.

You can configure port number of nginx and private registry in config.sh.

Check:

nerdctl ps

The nginx container reports an error: the /usr/share/nginx/ directory does not exist.

Create /usr/share/nginx/ inside the container:

nerdctl inspect nginx |grep -i pid

nsenter -t <pid> -m mkdir -p /usr/share/nginx/
nerdctl restart nginx

Install the toolkit

Create and activate venv:

# Example
$ python3 -m venv ~/.venv/default
$ source ~/.venv/default/bin/activate

Note: For RHEL/CentOS 7, you need to use python 3.8.

# Example
$ /opt/rh/rh-python38/root/usr/bin/python -m venv ~/.venv/default
$ source ~/.venv/default/bin/activate

Extract kubespray and apply patches:

$ ./extract-kubespray.sh
$ cd kubespray-{version}

For Ubuntu 22.04, you need to install build tools to build some python packages.

$ sudo apt install gcc python3-dev libffi-dev libssl-dev

Install ansible:

$ pip install -U pip                # update pip
$ pip install -r requirements.txt   # Install ansible

Configure SSH trust

ssh-copy-id root@100.168.110.21
ssh-copy-id root@100.168.110.31
ssh-copy-id root@100.168.110.41
ssh-copy-id root@100.168.110.42
ssh-copy-id root@100.168.110.43

Write inventory.ini

[all]
kube-control-plan01 ansible_host=100.168.110.21
dbscale-control-plan01 ansible_host=100.168.110.31
kube-node01 ansible_host=100.168.110.41
kube-node02 ansible_host=100.168.110.42
kube-node03 ansible_host=100.168.110.43

[kube_control_plane]
kube-control-plan01

[etcd]
kube-control-plan01

[kube_node]
dbscale-control-plan01
kube-node01
kube-node02
kube-node03

[calico_rr]

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr

Test connectivity with ping:

ansible -i inventory/local/inventory.ini all -m ping

Configure and distribute /etc/hosts:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
100.168.110.21 kube-control-plan01
100.168.110.31 dbscale-control-plan01
100.168.110.41 kube-node01
100.168.110.42 kube-node02
100.168.110.43 kube-node03

Run:

ansible -i inventory/local/inventory.ini all -m copy -a "src=/etc/hosts dest=/etc/hosts"
ansible -i inventory/local/inventory.ini all -m shell -a "cat /etc/hosts"
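
The host entries in /etc/hosts repeat what is already in inventory.ini, so they can be derived from it rather than typed twice. The function below is a minimal sketch under the assumption that every host line in the [all] section carries an `ansible_host=` variable, as in the inventory above; `inventory_to_hosts` is a hypothetical helper name.

```shell
# Hypothetical helper: print "IP hostname" lines for the hosts listed in
# the [all] section of an INI inventory ($1 = path to inventory.ini).
inventory_to_hosts() {
    awk '
        # A section header switches the "inside [all]" flag on or off.
        /^\[/ { in_all = ($0 == "[all]"); next }
        # Inside [all], look for the ansible_host= variable on each line.
        in_all && NF {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ansible_host=/) {
                    sub(/^ansible_host=/, "", $i)
                    print $i, $1
                }
            }
        }
    ' "$1"
}
```

For example, `inventory_to_hosts inventory/local/inventory.ini` prints the five address and hostname pairs, which can be reviewed and then appended to /etc/hosts before it is distributed.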

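Create offline.yml

vim  outputs/kubespray-2.21.0/inventory/local/group_vars/all/offline.yml
# YOUR_HOST=100.168.110.199
# YAML does not expand shell variables: replace every ${YOUR_HOST} below with this address before running the playbooks.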
http_server: "http://${YOUR_HOST}/"
registry_host: "${YOUR_HOST}:35000"

containerd_insecure_registries: # Kubespray #8340
  "${YOUR_HOST}:35000": "http://${YOUR_HOST}:35000"

files_repo: "{{ http_server }}/files"
yum_repo: "{{ http_server }}/rpms"
ubuntu_repo: "{{ http_server }}/debs"

# Registry overrides
kube_image_repo: "{{ registry_host }}"
gcr_image_repo: "{{ registry_host }}"
docker_image_repo: "{{ registry_host }}"
quay_image_repo: "{{ registry_host }}"

# Download URLs: See roles/download/defaults/main.yml of kubespray.
kubeadm_download_url: "{{ files_repo }}/kubernetes/{{ kube_version }}/kubeadm"
kubectl_download_url: "{{ files_repo }}/kubernetes/{{ kube_version }}/kubectl"
kubelet_download_url: "{{ files_repo }}/kubernetes/{{ kube_version }}/kubelet"
# etcd is optional if you **DON'T** use etcd_deployment=host
etcd_download_url: "{{ files_repo }}/kubernetes/etcd/etcd-{{ etcd_version }}-linux-amd64.tar.gz"
cni_download_url: "{{ files_repo }}/kubernetes/cni/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
crictl_download_url: "{{ files_repo }}/kubernetes/cri-tools/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
# If using Calico
calicoctl_download_url: "{{ files_repo }}/kubernetes/calico/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
# If using Calico with kdd
calico_crds_download_url: "{{ files_repo }}/kubernetes/calico/{{ calico_version }}.tar.gz"

runc_download_url: "{{ files_repo }}/runc/{{ runc_version }}/runc.{{ image_arch }}"
nerdctl_download_url: "{{ files_repo }}/nerdctl-{{ nerdctl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
containerd_download_url: "{{ files_repo }}/containerd-{{ containerd_version }}-linux-{{ image_arch }}.tar.gz"

#containerd_insecure_registries:
#    "{{ registry_addr }}":"{{ registry_host }}"

# CentOS/Redhat/AlmaLinux/Rocky Linux
## Docker / Containerd
docker_rh_repo_base_url: "{{ yum_repo }}/docker-ce/$releasever/$basearch"
docker_rh_repo_gpgkey: "{{ yum_repo }}/docker-ce/gpg"

# Fedora
## Docker
docker_fedora_repo_base_url: "{{ yum_repo }}/docker-ce/{{ ansible_distribution_major_version }}/{{ ansible_architecture }}"
docker_fedora_repo_gpgkey: "{{ yum_repo }}/docker-ce/gpg"
## Containerd
containerd_fedora_repo_base_url: "{{ yum_repo }}/containerd"
containerd_fedora_repo_gpgkey: "{{ yum_repo }}/docker-ce/gpg"

# Debian
## Docker
docker_debian_repo_base_url: "{{ debian_repo }}/docker-ce"
docker_debian_repo_gpgkey: "{{ debian_repo }}/docker-ce/gpg"
## Containerd
containerd_debian_repo_base_url: "{{ ubuntu_repo }}/containerd"
containerd_debian_repo_gpgkey: "{{ ubuntu_repo }}/containerd/gpg"
containerd_debian_repo_repokey: 'YOURREPOKEY'

# Ubuntu
## Docker
docker_ubuntu_repo_base_url: "{{ ubuntu_repo }}/docker-ce"
docker_ubuntu_repo_gpgkey: "{{ ubuntu_repo }}/docker-ce/gpg"
## Containerd
containerd_ubuntu_repo_base_url: "{{ ubuntu_repo }}/containerd"
containerd_ubuntu_repo_gpgkey: "{{ ubuntu_repo }}/containerd/gpg"
containerd_ubuntu_repo_repokey: 'YOURREPOKEY'
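
The `${YOUR_HOST}` placeholders in this file are not expanded by YAML or Ansible, so they have to be replaced with the real address before the file is used. The function below is a minimal sketch of that substitution; `render_offline_yml` and the file name `offline.yml.tmpl` in the usage line are hypothetical, and the address is assumed to contain no `/` characters.

```shell
# Hypothetical helper: read a template on stdin and replace the literal
# text ${YOUR_HOST} with the address given as $1.
render_offline_yml() {
    # [$] matches a literal dollar sign; the single-quoted parts keep the
    # shell from expanding ${YOUR_HOST} itself.
    sed 's/[$]{YOUR_HOST}/'"$1"'/g'
}

# Example:
#   render_offline_yml 100.168.110.199 < offline.yml.tmpl \
#       > inventory/local/group_vars/all/offline.yml
```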

Deploy the offline repo

Deploy offline repo configurations which use your yum_repo/ubuntu_repo to all target nodes using ansible.

First, copy offline setup playbook to kubespray directory.

$ cp -r ${outputs_dir}/playbook ${kubespray_dir}

Then execute offline-repo.yml playbook.

$ cd ${kubespray_dir}
ansible -i inventory/local/inventory.ini all -m shell -a  "mv /etc/yum.repos.d /tmp"
ansible -i inventory/local/inventory.ini all -m shell -a  "mkdir /etc/yum.repos.d"
ansible -i inventory/local/inventory.ini all -m copy -a "src=/tmp/yum.repos.d/offline.repo dest=/etc/yum.repos.d/"

$ ansible-playbook -i inventory/local/inventory.ini offline-repo.yml


ansible -i inventory/local/inventory.ini all -m shell -a "yum install -y conntrack"

Deploy kubespray

Result: deploying Kubernetes v1.25.6 with containerd using this kubespray version (v2.21.1) failed. Cause: https://github.com/kubernetes-sigs/kubespray/issues/9956

# Example  
$ ansible-playbook -i inventory/local/inventory.ini --become --become-user=root cluster.yml


$ vi roles/kubernetes/preinstall/tasks/0070-system-packages.yml
...
- name: Install packages requirements
  package:
    name: "{{ required_pkgs | default([]) | union(common_required_pkgs|default([])) }}"
    state: present
  register: pkgs_task_result
  until: pkgs_task_result is succeeded
  retries: "{{ pkg_install_retries }}"
  delay: "{{ retry_stagger | random + 3 }}"
  when: not (ansible_os_family in ["Flatcar", "Flatcar Container Linux by Kinvolk", "ClearLinux"] or is_fedora_coreos)
  tags:
    - bootstrap-os

Error 2

$ vi /root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/kubernetes/preinstall/tasks/0090-etchosts.yml

     1  ---
     2  - name: Hosts | create list from inventory
     3    set_fact:
     4      etc_hosts_inventory_block: |-
     5        {% for item in (groups['k8s_cluster'] + groups['etcd']|default([]) + groups['calico_rr']|default([]))|unique -%}
     6        {% if 'access_ip' in hostvars[item] or 'ip' in hostvars[item] or 'ansible_default_ipv4' in hostvars[item] and 'address' in hostvars[item]['ansible_default_ipv4'] -%}
     7        {{ hostvars[item]['access_ip'] | default(hostvars[item]['ip'] | default(hostvars[item]['ansible_default_ipv4']['address'])) }}
     8        {%- if ('ansible_hostname' in hostvars[item] and item != hostvars[item]['ansible_hostname']) %} {{ hostvars[item]['ansible_hostname'] }}.{{ dns_domain }} {{ hostvars[item]['ansible_hostname'] }} {% else %} {{ item }}.{{ dns_domain }} {{ item }} {% endif %}
     9
    10        {% endif %}
    11        {% endfor %}
    12    delegate_to: localhost
    13    connection: local
    14    delegate_facts: yes
    15    run_once: yes
    16
    17  - name: Hosts | populate inventory into hosts file
    18    blockinfile:
    19      path: /etc/hosts
    20      block: "{{ hostvars.localhost.etc_hosts_inventory_block }}"
    21      state: present
    22      create: yes
    23      backup: yes
    24      unsafe_writes: yes
    25      marker: "# Ansible inventory hosts {mark}"
    26      mode: 0644
    27    when: populate_inventory_to_hosts_file

TASK [kubernetes/preinstall : Hosts | create list from inventory] ******************************************************************************************************************
fatal: [kube-control-plan01 -> localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'address'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/kubernetes/preinstall/tasks/0090-etchosts.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Hosts | create list from inventory\n  ^ here\n"}

NO MORE HOSTS LEFT *****************************************************************************************************************************************************************

PLAY RECAP *************************************************************************************************************************************************************************
dbscale-control-plan01     : ok=73   changed=0    unreachable=0    failed=0    skipped=97   rescued=0    ignored=0
kube-control-plan01        : ok=89   changed=0    unreachable=0    failed=1    skipped=108  rescued=0    ignored=0
kube-node01                : ok=73   changed=0    unreachable=0    failed=0    skipped=97   rescued=0    ignored=0
kube-node02                : ok=73   changed=0    unreachable=0    failed=0    skipped=97   rescued=0    ignored=0
kube-node03                : ok=73   changed=0    unreachable=0    failed=0    skipped=97   rescued=0    ignored=0
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Fix:
Comment out lines 2-25 of /root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/kubernetes/preinstall/tasks/0090-etchosts.yml (lines 26-27 belong to the same task and have to be commented out with it).
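
Commenting out a range of lines by hand is error-prone in a YAML file, so the edit can be scripted. The function below is a minimal sketch; `comment_out_lines` is a hypothetical helper that keeps a `.bak` copy next to the file so the change is easy to revert.

```shell
# Hypothetical helper: prefix lines $2..$3 of file $1 with "#",
# keeping the original content in "$1.bak".
comment_out_lines() {
    cp "$1" "$1.bak" || return 1
    # "2,27 s/^/#/" style address range: only the selected lines change.
    sed "$2,$3 s/^/#/" "$1.bak" > "$1"
}

# Example, covering both tasks in the listing above (lines 2 to 27) so
# that no half-commented task is left behind:
#   comment_out_lines roles/kubernetes/preinstall/tasks/0090-etchosts.yml 2 27
```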

Write /etc/hosts:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost6 localhost6.localdomain6 localhost6.localdomain
# Ansible inventory hosts BEGIN
100.168.110.21 kube-control-plan01.cluster.local kube-control-plan01
100.168.110.31 dbscale-control-plan01.cluster.local dbscale-control-plan01
100.168.110.41 kube-node01.cluster.local kube-node01
100.168.110.42 kube-node02.cluster.local kube-node02
100.168.110.43 kube-node03.cluster.local kube-node03
# Ansible inventory hosts END

Run:

ansible -i inventory/local/inventory.ini all -m copy -a "src=/etc/hosts dest=/etc/hosts"

Error 3

$ vi /root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml


   114  - name: containerd | Create registry directories
   115    file:
   116      path: "{{ containerd_cfg_dir }}/certs.d/{{ item.key }}"
   117      state: directory
   118      mode: 0755
   119      recurse: true
   120    with_dict: "{{ containerd_insecure_registries }}"
   121    when: containerd_insecure_registries is defined
   122
   123  - name: containerd | Write hosts.toml file
   124    blockinfile:
   125      path: "{{ containerd_cfg_dir }}/certs.d/{{ item.key }}/hosts.toml"
   126      owner: "root"
   127      mode: 0640
   128      create: true
   129      block: |
   130        server = "{{ item.value }}"
   131        [host."{{ item.value }}"]
   132          capabilities = ["pull", "resolve", "push"]
   133          skip_verify = true
   134    with_dict: "{{ containerd_insecure_registries }}"
   135    when: containerd_insecure_registries is defined

TASK [container-engine/containerd : containerd Create registry directories] ********************************************************************************************************
fatal: [kube-control-plan01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd Create registry directories\n  ^ here\n"}
fatal: [dbscale-control-plan01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd Create registry directories\n  ^ here\n"}
fatal: [kube-node01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd Create registry directories\n  ^ here\n"}
fatal: [kube-node02]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd Create registry directories\n  ^ here\n"}
fatal: [kube-node03]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd Create registry directories\n  ^ here\n"}

NO MORE HOSTS LEFT *****************************************************************************************************************************************************************

PLAY RECAP *************************************************************************************************************************************************************************
dbscale-control-plan01     : ok=147  changed=1    unreachable=0    failed=1    skipped=264  rescued=0    ignored=0
kube-control-plan01        : ok=173  changed=1    unreachable=0    failed=1    skipped=293  rescued=0    ignored=0
kube-node01                : ok=147  changed=1    unreachable=0    failed=1    skipped=264  rescued=0    ignored=0
kube-node02                : ok=147  changed=1    unreachable=0    failed=1    skipped=264  rescued=0    ignored=0
kube-node03                : ok=147  changed=1    unreachable=0    failed=1    skipped=264  rescued=0    ignored=0
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Fix:

ansible -i inventory/local/inventory.ini all -m file -a "path=/etc/containerd/certs.d/100.168.110.199:35000 state=directory recurse=true mode=0755"

Error:

TASK [container-engine/containerd : containerd | Write hosts.toml file] ***********************************************************************************************************
fatal: [kube-control-plan01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 123, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Write hosts.toml file\n  ^ here\n"}
fatal: [dbscale-control-plan01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 123, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Write hosts.toml file\n  ^ here\n"}
fatal: [kube-node01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 123, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Write hosts.toml file\n  ^ here\n"}
fatal: [kube-node02]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 123, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Write hosts.toml file\n  ^ here\n"}
fatal: [kube-node03]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 123, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Write hosts.toml file\n  ^ here\n"}

NO MORE HOSTS LEFT *****************************************************************************************************************************************************************

PLAY RECAP *********************************************************************************************************************************

Solution: write the hosts.toml file by hand. containerd reads the per-registry configuration from a file named hosts.toml under certs.d/<registry>/, and the server value carries the same http:// URL that the task template above renders from containerd_insecure_registries.

$ vi roles/container-engine/containerd/tasks/main.yml
- name: containerd | Create registry directories
  file:
    path: "{{ containerd_cfg_dir }}/certs.d/{{ item.key }}"
    state: directory
    mode: 0755
    recurse: true
  with_dict: "{{ containerd_insecure_registries }}"
  when: containerd_insecure_registries is defined
$ cat hosts.toml
server = "http://100.168.110.199:35000"
[host."http://100.168.110.199:35000"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true

Run:

ansible -i inventory/local/inventory.ini all -m copy -a "src=./hosts.toml dest=/etc/containerd/certs.d/100.168.110.199:35000/"
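
The file copied here has the same four lines that the "Write hosts.toml file" task would have rendered, so it can be generated from the registry URL instead of being typed. The function below is a minimal sketch; `render_hosts_toml` is a hypothetical helper, and its output simply mirrors the `block:` template of that task.

```shell
# Hypothetical helper: print a hosts.toml for one registry.
# $1 = registry URL including the scheme, e.g. http://100.168.110.199:35000
render_hosts_toml() {
    cat <<EOF
server = "$1"
[host."$1"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF
}

# Example:
#   render_hosts_toml http://100.168.110.199:35000 > hosts.toml
```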

Error:

TASK [etcd : Get currently-deployed etcd version] **********************************************************************************************************************************
fatal: [kube-control-plan01]: FAILED! => {"changed": false, "cmd": "/usr/local/bin/etcd --version", "msg": "[Errno 2] No such file or directory: b'/usr/local/bin/etcd': b'/usr/local/bin/etcd'", "rc": 2, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

Error:

TASK [kubernetes/node : Modprobe nf_conntrack_ipv4] ********************************************************************************************************************************
fatal: [kube-control-plan01]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [dbscale-control-plan01]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-node01]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-node02]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-node03]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
Monday 10 April 2023  09:33:26 -0400 (0:00:00.480)       0:04:30.825 **********

TASK [kubernetes/node : Persist ip_vs modules] ***************************

Error:

TASK [kubernetes/control-plane : Check which kube-control nodes are already members of the cluster] ********************************************************************************
fatal: [kube-control-plan01]: FAILED! => {"changed": false, "cmd": ["/usr/local/bin/kubectl", "get", "nodes", "--selector=node-role.kubernetes.io/control-plane", "-o", "json"], "delta": "0:00:00.028582", "end": "2023-04-10 21:33:37.555599", "msg": "non-zero return code", "rc": 1, "start": "2023-04-10 21:33:37.527017", "stderr": "The connection to the server localhost:8080 was refused - did you specify the right host or port?", "stderr_lines": ["The connection to the server localhost:8080 was refused - did you specify the right host or port?"], "stdout": "", "stdout_lines": []}
...ignoring

Error:

TASK [kubernetes/control-plane : kubeadm | Initialize first master] ****************************************************************************************************************
fatal: [kube-control-plan01]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "300s", "300s", "/usr/local/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--skip-phases=addon/coredns", "--upload-certs"], "delta": "0:01:56.126009", "end": "2023-04-10 21:41:40.621046", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2023-04-10 21:39:44.495037", "stderr": "W0410 21:39:44.514437  222884 utils.go:69] The recommended value for \"clusterDNS\" in \"KubeletConfiguration\" is: [10.233.0.10]; the provided value is: [169.254.25.10]\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists\n\t[WARNING CRI]: container runtime is not running: output: E0410 21:39:44.542333  222894 remote_runtime.go:948] \"Status from runtime service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService\"\ntime=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService\"\n, error: exit status 1\n\t[WARNING FileExisting-conntrack]: conntrack not found in system path\n\t[WARNING FileExisting-socat]: socat not found in system path\n\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-apiserver:v1.25.6: output: E0410 21:39:44.665910  222942 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" 
image=\"100.168.110.199:35000/kube-apiserver:v1.25.6\"\ntime=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"\n, error: exit status 1\n\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-controller-manager:v1.25.6: output: E0410 21:39:44.746281  222977 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-controller-manager:v1.25.6\"\ntime=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"\n, error: exit status 1\n\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-scheduler:v1.25.6: output: E0410 21:39:44.824291  223014 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-scheduler:v1.25.6\"\ntime=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"\n, error: exit status 1\n\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-proxy:v1.25.6: output: E0410 21:39:44.902557  223050 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-proxy:v1.25.6\"\ntime=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"\n, error: exit status 1\n\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/pause:3.8: output: E0410 21:39:44.985206  223086 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = 
Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/pause:3.8\"\ntime=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"\n, error: exit status 1\n\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/coredns/coredns:v1.9.3: output: E0410 21:39:45.063766  223122 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/coredns/coredns:v1.9.3\"\ntime=\"2023-04-10T21:39:45+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"\n, error: exit status 1\nerror execution phase wait-control-plane: couldn't initialize a Kubernetes cluster\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["W0410 21:39:44.514437  222884 utils.go:69] The recommended value for \"clusterDNS\" in \"KubeletConfiguration\" is: [10.233.0.10]; the provided value is: [169.254.25.10]", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists", "\t[WARNING CRI]: container runtime is not running: output: E0410 21:39:44.542333  222894 remote_runtime.go:948] \"Status from runtime service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService\"", "time=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"getting status of runtime: rpc error: code = Unimplemented desc = unknown service 
runtime.v1alpha2.RuntimeService\"", ", error: exit status 1", "\t[WARNING FileExisting-conntrack]: conntrack not found in system path", "\t[WARNING FileExisting-socat]: socat not found in system path", "\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-apiserver:v1.25.6: output: E0410 21:39:44.665910  222942 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-apiserver:v1.25.6\"", "time=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"", ", error: exit status 1", "\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-controller-manager:v1.25.6: output: E0410 21:39:44.746281  222977 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-controller-manager:v1.25.6\"", "time=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"", ", error: exit status 1", "\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-scheduler:v1.25.6: output: E0410 21:39:44.824291  223014 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-scheduler:v1.25.6\"", "time=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"", ", error: exit status 1", "\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/kube-proxy:v1.25.6: output: E0410 21:39:44.902557  223050 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code 
= Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/kube-proxy:v1.25.6\"", "time=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"", ", error: exit status 1", "\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/pause:3.8: output: E0410 21:39:44.985206  223086 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/pause:3.8\"", "time=\"2023-04-10T21:39:44+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"", ", error: exit status 1", "\t[WARNING ImagePull]: failed to pull image 100.168.110.199:35000/coredns/coredns:v1.9.3: output: E0410 21:39:45.063766  223122 remote_image.go:222] \"PullImage from image service failed\" err=\"rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" image=\"100.168.110.199:35000/coredns/coredns:v1.9.3\"", "time=\"2023-04-10T21:39:45+08:00\" level=fatal msg=\"pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\"", ", error: exit status 1", "error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "[init] Using Kubernetes version: v1.25.6\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"\n[certs] Using existing ca certificate authority\n[certs] Using existing apiserver certificate 
and key on disk\n[certs] Using existing apiserver-kubelet-client certificate and key on disk\n[certs] Using existing front-proxy-ca certificate authority\n[certs] Using existing front-proxy-client certificate and key on disk\n[certs] External etcd mode: Skipping etcd/ca certificate authority generation\n[certs] External etcd mode: Skipping etcd/server certificate generation\n[certs] External etcd mode: Skipping etcd/peer certificate generation\n[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation\n[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation\n[certs] Using the existing \"sa\" key\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Starting the kubelet\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". 
This can take up to 5m0s\n[kubelet-check] Initial timeout of 40s passed.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.\n\nUnfortunately, an error has occurred:\n\ttimed out waiting for the condition\n\nThis error is likely caused by:\n\t- The kubelet is not running\n\t- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)\n\nIf you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:\n\t- 'systemctl status kubelet'\n\t- 'journalctl -xeu kubelet'\n\nAdditionally, a control plane component may have crashed or exited when started by the container runtime.\nTo troubleshoot, list all containers using your preferred 
container runtimes CLI.\nHere is one example how you may list all running Kubernetes containers by using crictl:\n\t- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'\n\tOnce you have found the failing container, you can inspect its logs with:\n\t- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'", "stdout_lines": ["[init] Using Kubernetes version: v1.25.6", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"", "[certs] Using existing ca certificate authority", "[certs] Using existing apiserver certificate and key on disk", "[certs] Using existing apiserver-kubelet-client certificate and key on disk", "[certs] Using existing front-proxy-ca certificate authority", "[certs] Using existing front-proxy-client certificate and key on disk", "[certs] External etcd mode: Skipping etcd/ca certificate authority generation", "[certs] External etcd mode: Skipping etcd/server certificate generation", "[certs] External etcd mode: Skipping etcd/peer certificate generation", "[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation", "[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation", "[certs] Using the existing \"sa\" key", "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"", 
"[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"", "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"", "[kubelet-start] Starting the kubelet", "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"", "[control-plane] Creating static Pod manifest for \"kube-apiserver\"", "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-scheduler\"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s", "[kubelet-check] Initial timeout of 40s passed.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: 
Get \"http://localhost:10248/healthz\": dial tcp 127.0.0.1:10248: connect: connection refused.", "", "Unfortunately, an error has occurred:", "\ttimed out waiting for the condition", "", "This error is likely caused by:", "\t- The kubelet is not running", "\t- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)", "", "If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:", "\t- 'systemctl status kubelet'", "\t- 'journalctl -xeu kubelet'", "", "Additionally, a control plane component may have crashed or exited when started by the container runtime.", "To troubleshoot, list all containers using your preferred container runtimes CLI.", "Here is one example how you may list all running Kubernetes containers by using crictl:", "\t- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'", "\tOnce you have found the failing container, you can inspect its logs with:", "\t- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'"]}

NO MORE HOSTS LEFT *****************************************************************************************************************************************************************

PLAY RECAP *************************************************************************************************************************************************************************
dbscale-control-plan01     : ok=436  changed=17   unreachable=0    failed=0    skipped=527  rescued=0    ignored=1
kube-control-plan01        : ok=570  changed=67   unreachable=0    failed=1    skipped=726  rescued=0    ignored=3
kube-node01                : ok=436  changed=17   unreachable=0    failed=0    skipped=526  rescued=0    ignored=1
kube-node02                : ok=436  changed=17   unreachable=0    failed=0    skipped=526  rescued=0    ignored=1
kube-node03                : ok=436  changed=17   unreachable=0    failed=0    skipped=526  rescued=0    ignored=1
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Cause:

  • https://github.com/containerd/containerd/issues/4581

kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all --skip-phases=addon/coredns --upload-certs
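
The log above shows containerd rejecting CRI calls (`unknown service runtime.v1alpha2.RuntimeService`). A frequently reported cause of this message is a containerd `config.toml` that lists `cri` under `disabled_plugins`; whether that applies to this environment is an assumption. The sketch below only checks for that setting. The function name `cri_disabled` is hypothetical, and the config path is the usual default rather than something stated in this article.

```shell
# Hypothetical helper: succeed if the given containerd config disables the CRI plugin.
# Assumption: the config lives at /etc/containerd/config.toml unless a path is passed.
cri_disabled() {
    config="${1:-/etc/containerd/config.toml}"
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config" 2>/dev/null
}

if cri_disabled /etc/containerd/config.toml; then
    echo "CRI plugin is disabled in /etc/containerd/config.toml"
else
    echo "CRI plugin is not disabled (or the config file is missing)"
fi
```

If the check reports the plugin as disabled, removing `cri` from `disabled_plugins` and restarting containerd is the usual next step before retrying the deployment.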

Error

TASK [kubernetes/node : Modprobe nf_conntrack_ipv4] ********************************************************************************************************************************
fatal: [dbscale-control-plan01]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-control-plan01]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-node01]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-node02]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [kube-node03]: FAILED! => {"changed": false, "msg": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "name": "nf_conntrack_ipv4", "params": "", "rc": 1, "state": "present", "stderr": "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64\n", "stderr_lines": ["modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-425.13.1.el8_7.x86_64"], "stdout": "", "stdout_lines": []}
...ignoring

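Note: nf_conntrack_ipv4 has been renamed to nf_conntrack since Linux kernel 4.18+, so this failure is expected on this kernel; the task result is ignored (see "...ignoring" above).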
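
To confirm which module name the running kernel provides, a minimal sketch along these lines can be run on a node. The function name `pick_conntrack_module` is hypothetical; `modinfo` is the standard kmod tool and is assumed to be installed.

```shell
# Hypothetical helper: print the conntrack module name available on this kernel.
pick_conntrack_module() {
    # Prefer the legacy name only if the running kernel still ships it.
    if modinfo nf_conntrack_ipv4 >/dev/null 2>&1; then
        echo nf_conntrack_ipv4
    else
        echo nf_conntrack
    fi
}

echo "conntrack module to load: $(pick_conntrack_module)"
```

On the 4.18.0-425.13.1.el8_7 kernel from the log, the legacy module is missing, so the fallback name is the one that applies.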

Error: container-engine/containerd : containerd | Create registry directories

TASK [container-engine/containerd : containerd | Create registry directories] *****************************************
fatal: [kube-control-plan01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0-0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Create registry directories\n  ^ here\n"}
fatal: [dbscale-control-plan01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0-0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Create registry directories\n  ^ here\n"}
fatal: [kube-node01]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0-0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Create registry directories\n  ^ here\n"}
fatal: [kube-node02]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0-0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Create registry directories\n  ^ here\n"}
fatal: [kube-node03]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'key'\n\nThe error appears to be in '/root/kubespray-offline-2.21.0-0/outputs/kubespray-2.21.0/roles/container-engine/containerd/tasks/main.yml': line 114, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: containerd | Create registry directories\n  ^ here\n"}

Step-by-step execution

TASK [kubernetes/preinstall : Hosts | create list from inventory] ******************************************************
ok: [kube-control-plan01 -> localhost]
Tuesday 11 April 2023  21:59:33 -0400 (0:00:00.376)       0:04:48.518 *********

TASK [kubernetes/preinstall : Hosts | populate inventory into hosts file] **********************************************
changed: [kube-node02]
changed: [kube-node03]
changed: [kube-node01]
changed: [dbscale-control-plan01]
changed: [kube-control-plan01]

ansible-playbook -i inventory/local/inventory.ini --become --become-user=root cluster.yml --tags etchosts
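
Running one tag at a time is easier to repeat with a small wrapper. The sketch below is illustrative only: `run_step` and the `DRY_RUN` variable are hypothetical names, while the `ansible-playbook` arguments are copied from the command above.

```shell
# Hypothetical wrapper: run cluster.yml restricted to the given tag(s).
# With DRY_RUN=1 it only prints the command instead of executing it.
run_step() {
    set -- ansible-playbook -i inventory/local/inventory.ini \
        --become --become-user=root cluster.yml --tags "$1"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command for the etchosts step without running it.
DRY_RUN=1 run_step etchosts
```

Calling `run_step etchosts` without `DRY_RUN=1` executes the playbook for that tag; `ansible-playbook ... --list-tags` shows which tags a playbook offers.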

Custom deployment

Change the Kubernetes version and redeploy.

 git clone https://github.com/tmurakam/kubespray-offline.git
 cd kubespray-offline

Before downloading the media, install the following:

  • run install-docker.sh to install Docker CE.
  • run install-containerd.sh to install containerd and nerdctl.
  • Set docker environment variable to /usr/local/bin/nerdctl in config.sh.

./install-docker.sh
./install-containerd.sh
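
Before pointing `config.sh` at a runtime, it can help to confirm that the CLI is actually present. This is a minimal sketch; `runtime_available` is a hypothetical helper name, and the two CLIs checked are the ones named in the list above.

```shell
# Hypothetical helper: succeed only if the given runtime CLI exists and is executable.
runtime_available() {
    case "$1" in
        */*) [ -f "$1" ] && [ -x "$1" ] ;;          # explicit path
        *)   command -v "$1" >/dev/null 2>&1 ;;     # name looked up in PATH
    esac
}

for cli in docker /usr/local/bin/nerdctl; do
    if runtime_available "$cli"; then
        echo "found: $cli"
    else
        echo "missing: $cli"
    fi
done
```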

After the installation succeeds, download the media.
Temporarily comment out some of the scripts it runs:

cat download-all.sh
#!/bin/bash

run() {
    echo "=> Running: $*"
    $* || {
        echo "Failed in : $*"
        exit 1
    }
}

source ./config.sh

run ./install-docker.sh
run ./precheck.sh
run ./prepare-pkgs.sh
run ./prepare-py.sh
run ./get-kubespray.sh
if $ansible_in_container; then
    run ./build-ansible-container.sh
else
    run ./pypi-mirror.sh
fi
#run ./download-kubespray-files.sh
#run ./download-additional-containers.sh
#run ./create-repo.sh
#run ./copy-target-scripts.sh

echo "Done."
./download-all.sh

After it finishes, modify the following:

# Change the Kubernetes version used when downloading the media
$ vim cache/kubespray-2.21.0/roles/kubespray-defaults/defaults/main.yaml
....
kube_version: v1.25.6
.....

# Change the Kubernetes version used for deployment
$ vim cache/kubespray-2.21.0/inventory/local/group_vars/k8s_cluster/k8s-cluster.yml
...
kube_version: v1.24.10
...
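
After editing, printing `kube_version` from both files makes it easy to see which version each stage will use. This is a minimal sketch: `kube_version_of` is a hypothetical helper, and it assumes each file has a top-level, unquoted `kube_version:` line as shown above.

```shell
# Hypothetical helper: print the first top-level kube_version value in a YAML vars file.
kube_version_of() {
    sed -n 's/^kube_version:[[:space:]]*//p' "$1" | head -n 1
}

for f in \
    cache/kubespray-2.21.0/roles/kubespray-defaults/defaults/main.yaml \
    cache/kubespray-2.21.0/inventory/local/group_vars/k8s_cluster/k8s-cluster.yml
do
    if [ -f "$f" ]; then
        echo "$f: $(kube_version_of "$f")"
    else
        echo "$f: not found"
    fi
done
```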
