Author: slience_me
Table of Contents
- Kubernetes [谷粒商城 Edition] [Recommended to Skip]
- 1. Docker Installation
- 2. Before Installing Kubernetes
- 3. kubeadm, kubelet, kubectl
- 3.1 Overview
- kubeadm
- kubelet
- kubectl
- Common Commands
- 3.2 Installation
- 3.3 kubeadm Initialization
- 3.4 Joining Worker Nodes
- 3.5 Installing the Pod Network Plugin (CNI)
- 3.6 KubeSphere Installation
- 3.7 Stuck
- Configuration Files
- openebs-operator-1.5.0.yaml
- kubesphere-minimal.yaml
- [Q&A] Issue Summary
- 01-CoreDNS not working properly
- 02-KubeSphere installation issues
- 03-KubeSphere-related pods not healthy
Kubernetes [谷粒商城 Edition] [Recommended to Skip]
Official site: Kubernetes
Below is the Kubernetes workflow, starting from installation.
Versions:
kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:07:13Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
helm version
Client: &version.Version{SemVer:"v2.16.3", GitCommit:"1ee0254c86d4ed6887327dabed7aa7da29d7eb0d", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.16.3", GitCommit:"1ee0254c86d4ed6887327dabed7aa7da29d7eb0d", GitTreeState:"dirty"}
PS: As the saying goes, each pitfall is deeper than the last. I stepped into every single one of them; after getting through this, I came out with a much stronger heart and the courage to start over from scratch at any time!
Before following this tutorial, the virtual machine cluster should already be set up. If it is not, see the VMware virtual machine cluster setup tutorial (基于VMware虚拟机集群搭建教程).
1. Docker Installation
I am using ubuntu-20.04.6-live-server-amd64 — see the Docker installation tutorial.
Note: for this exercise Docker 19.03 is recommended; everything else follows the tutorial above.
sudo apt install -y docker-ce=5:19.03.15~3-0~ubuntu-$(lsb_release -cs) \
docker-ce-cli=5:19.03.15~3-0~ubuntu-$(lsb_release -cs) \
containerd.io
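Optionally, you can also hold the Docker packages so apt does not upgrade them past the pinned 19.03 release later (my own suggestion, not part of the original tutorial):
sudo apt-mark hold docker-ce docker-ce-cli containerd.io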
2. Before Installing Kubernetes
Before installing, make a few system-level adjustments so Kubernetes can run properly.
Disable the firewall
sudo systemctl stop ufw
sudo systemctl disable ufw
Disable AppArmor (Ubuntu does not use SELinux; AppArmor is its counterpart)
sudo systemctl stop apparmor
sudo systemctl disable apparmor
Disable swap
sudo swapoff -a # temporarily disable swap
sudo sed -i '/swap/d' /etc/fstab # permanently disable swap
free -g # Swap must show 0 | verify that swap is off
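As an extra check (optional), swapon prints nothing when no swap device is active:
swapon --show # empty output means swap is fully disabled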
Configure the hostname and hosts mapping
sudo hostnamectl set-hostname <newhostname> # change the hostname
sudo vim /etc/hosts # add IP-to-hostname mappings
192.168.137.130 k8s-node1
192.168.137.131 k8s-node2
192.168.137.132 k8s-node3
Ensure bridged IPv4 traffic is passed to iptables
# enable bridged traffic
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
Handle the read-only filesystem issue
# if you hit a "read-only filesystem" error, try remounting:
sudo mount -o remount,rw /
Synchronize time
# if clocks drift, Kubernetes components may misbehave; use ntpdate to sync the time:
sudo apt update
sudo apt install -y ntpdate
sudo ntpdate time.windows.com
sudo timedatectl set-local-rtc 1 # keep the hardware clock in local time
sudo timedatectl set-timezone Asia/Shanghai # set the timezone
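You can confirm the timezone and RTC settings took effect (optional verification):
timedatectl status # check "Time zone" and "RTC in local TZ"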
3. kubeadm, kubelet, kubectl
3.1 Overview
When installing and managing a Kubernetes (K8s) cluster, kubeadm, kubelet, and kubectl are the three most important components, each with its own responsibility:

| Component | Role | Used by |
| --- | --- | --- |
| kubeadm | Initializes and manages the Kubernetes cluster | Cluster administrators |
| kubelet | Runs on every node and manages Pods and containers | Kubernetes nodes |
| kubectl | Command-line tool for operating Kubernetes resources | Developers & operators |
kubeadm
Kubernetes cluster bootstrapping tool
kubeadm is the official tool for quickly deploying and managing a Kubernetes cluster. It is mainly used to:
# initialize the Kubernetes control plane (master node)
kubeadm init
# join a new worker node
kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
# upgrade the Kubernetes version
kubeadm upgrade apply v1.28.0
Features:
✅ Officially recommended; simplifies cluster installation
✅ Automatically generates certificates, config files, and Pod network settings
✅ Can upgrade Kubernetes (but does not manage workloads)
📌 Note: kubeadm is only used for cluster initialization and management; it does not keep running — it is a one-shot command.
kubelet
Runs on every node and manages Pods and containers
kubelet is a core Kubernetes component that runs on every node (master & worker) and is responsible for:
- communicating with the API Server to receive scheduled work
- managing the Pods on its node (create, monitor, restart)
- interacting with the container runtime (e.g. Docker, containerd)
- health checks and automatic recovery of failed Pods
Starting kubelet
On every node, kubelet runs as a system service:
systemctl enable --now kubelet
📌 Note:
- kubelet does not create Pods on its own; it only runs the Pods assigned to it by the API Server.
- If kubelet exits or crashes, all Pods on that node may stop working.
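If the Pods on a node misbehave, a quick way to check whether kubelet itself is healthy is through systemd (a generic troubleshooting sketch, not a step from this tutorial):
systemctl status kubelet # is the service active?
journalctl -u kubelet -f # follow the kubelet logs for errors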
kubectl
Kubernetes command-line tool
kubectl is the Kubernetes command-line client; it talks to the Kubernetes API to manage resources in the cluster.
Common Commands
kubectl cluster-info # show cluster info
kubectl get nodes # list the nodes in the cluster
kubectl get namespaces # list all namespaces
kubectl get pods # list all Pods
kubectl get svc # list all Services
kubectl get deployments # list all Deployments
kubectl get replicasets # list all ReplicaSets
kubectl get configmaps # list all ConfigMaps
kubectl get secrets # list all Secrets
kubectl apply -f <file>.yaml # create or update resources from a YAML file
kubectl create deployment <deployment-name> --image=<image-name> # create a new Deployment
kubectl set image deployment/<deployment-name> <container-name>=<new-image> # update a Deployment's image
kubectl apply -f <updated-file>.yaml # update resource configuration
kubectl delete pod <pod-name> # delete the given Pod
kubectl delete deployment <deployment-name> # delete the given Deployment
kubectl delete svc <service-name> # delete the given Service
kubectl delete pod --all # delete all Pods (e.g. exited containers)
kubectl logs <pod-name> # show logs of the given Pod
kubectl logs <pod-name> -c <container-name> # show logs of a specific container
kubectl logs <pod-name> --previous # show logs of the previous (restarted) container instance
kubectl exec -it <pod-name> -- /bin/bash # run a command in a Pod (open a shell)
kubectl exec -it <pod-name> -c <container-name> -- /bin/bash # run a command in a specific container
kubectl describe pod <pod-name> # show detailed information about a Pod
kubectl describe deployment <deployment-name> # show detailed information about a Deployment
kubectl top nodes # show node resource usage
kubectl top pods # show Pod resource usage
kubectl port-forward pod/<pod-name> <local-port>:<pod-port> # forward a local port to a port inside a Pod
kubectl port-forward svc/<service-name> <local-port>:<service-port> # forward a local port to a Service
kubectl get events # show cluster events
kubectl get events --field-selector involvedObject.kind=Node # show node events
kubectl delete pod <pod-name> --force --grace-period=0 # force-delete an unresponsive Pod
kubectl get pods -o yaml # get Pod details as YAML
kubectl get pods -o json # get Pod details as JSON
kubectl get pods -o wide # get Pods with extra detail
kubectl help # show kubectl help
kubectl get pod --help # show help for a specific command (e.g. get pod)
kubectl get all -o wide # get detailed info on all resources in the current namespace (Pods, Services, Deployments, etc.)
📌 Note:
- kubectl needs a kubeconfig file in order to connect to the Kubernetes API Server.
- The usual kubectl config file path is ~/.kube/config.
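For example, to point kubectl at a kubeconfig stored somewhere other than the default path (the path below is just a placeholder):
export KUBECONFIG=/path/to/admin.conf # hypothetical path
kubectl config view --minify # show the config currently in use
kubectl cluster-info # confirm the connection works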
Summary

| Component | Role | Where it runs |
| --- | --- | --- |
| kubeadm | Initializes & manages the Kubernetes cluster | Runs once, on the master node |
| kubelet | Runs and manages Pods | Long-running on every node (master & worker) |
| kubectl | Operates Kubernetes resources | Local tool for operators & developers |

👉 kubeadm installs, kubelet runs, kubectl operates. 🚀
3.2 Installation
# list all installed packages related to kube
apt list --installed | grep kube
# search for available Kubernetes-related packages
apt search kube
# let apt fetch over https
sudo apt-get install -y apt-transport-https
# download the GPG key
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
# add the apt repository
sudo apt-add-repository "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main"
# list the installable versions
sudo apt-cache madison kubectl
# install the pinned versions
sudo apt-get install kubelet=1.17.3-00 kubeadm=1.17.3-00 kubectl=1.17.3-00
# hold the packages to prevent automatic upgrades
sudo apt-mark hold kubelet kubeadm kubectl
3.3 kubeadm Initialization
First handle this warning: [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver
sudo tee /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
sudo systemctl restart docker
docker info | grep -i cgroup
Run this command only on the master node (the control plane node):
# apiserver-advertise-address: the IP of the chosen master node
kubeadm init \
--apiserver-advertise-address=192.168.137.130 \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--kubernetes-version v1.17.3 \
--service-cidr=10.96.0.0/16 \
--pod-network-cidr=10.244.0.0/16
The output looks like this:
# normal output
# W0317 21:29:46.144585 79912 validation.go:28] Cannot validate kube-proxy config - no validator is available
# W0317 21:29:46.144788 79912 validation.go:28] Cannot validate kubelet config - no validator is available
# ...............
# Your Kubernetes control-plane has initialized successfully!
# To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# You should now deploy a pod network to the cluster.
# Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
# https://kubernetes.io/docs/concepts/cluster-administration/addons/
# Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.137.130:6443 --token mkaq9m.k8d544uq97ppg3jd \
--discovery-token-ca-cert-hash sha256:f18f2a44ba2dec0495d0be020e2e2c28ab66d02ae4b39b496720ff43a5657150
Now run this part:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# check all nodes
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# k8s-node1 NotReady master 3m43s v1.17.3
3.4 Joining Worker Nodes
Run this command on each worker node:
kubeadm join 192.168.137.130:6443 --token mkaq9m.k8d544uq97ppg3jd \
--discovery-token-ca-cert-hash sha256:f18f2a44ba2dec0495d0be020e2e2c28ab66d02ae4b39b496720ff43a5657150
# normal output
# W0317 21:35:27.501099 64808 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
# [preflight] Running pre-flight checks
# [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
# [preflight] Reading configuration from the cluster...
# [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
# [kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
# [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
# [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
# [kubelet-start] Starting the kubelet
# [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
# This node has joined the cluster:
# * Certificate signing request was sent to apiserver and a response was received.
# * The Kubelet was informed of the new secure connection details.
# Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
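Side note: the join token printed by kubeadm init expires after 24 hours by default. If it has expired by the time you add another node, you can generate a fresh join command on the master (standard kubeadm behavior, not shown in the original logs):
kubeadm token create --print-join-command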
# check from the master node
kubectl get nodes
# normal output
# NAME STATUS ROLES AGE VERSION
# k8s-node1 NotReady master 7m4s v1.17.3
# k8s-node2 NotReady <none> 96s v1.17.3
# k8s-node3 NotReady <none> 96s v1.17.3
# check pods across all namespaces
kubectl get pods --all-namespaces
# normal output
# NAMESPACE NAME READY STATUS RESTARTS AGE
# kube-system coredns-7f9c544f75-67njd 0/1 Pending 0 7m46s
# kube-system coredns-7f9c544f75-z82nl 0/1 Pending 0 7m46s
# kube-system etcd-k8s-node1 1/1 Running 0 8m1s
# kube-system kube-apiserver-k8s-node1 1/1 Running 0 8m1s
# kube-system kube-controller-manager-k8s-node1 1/1 Running 0 8m1s
# kube-system kube-proxy-fs2p6 1/1 Running 0 2m37s
# kube-system kube-proxy-x7rkp 1/1 Running 0 2m37s
# kube-system kube-proxy-xpbvt 1/1 Running 0 7m46s
# kube-system kube-scheduler-k8s-node1 1/1 Running 0 8m1s
3.5 Installing the Pod Network Plugin (CNI)
kubectl apply -f \
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# normal output
# namespace/kube-flannel created
# clusterrole.rbac.authorization.k8s.io/flannel created
# clusterrolebinding.rbac.authorization.k8s.io/flannel created
# serviceaccount/flannel created
# configmap/kube-flannel-cfg created
# daemonset.apps/kube-flannel-ds created
# check pods across all namespaces
kubectl get pods --all-namespaces
# normal output
# NAMESPACE NAME READY STATUS RESTARTS AGE
# kube-flannel kube-flannel-ds-66bcs 1/1 Running 0 93s
# kube-flannel kube-flannel-ds-ntwwx 1/1 Running 0 93s
# kube-flannel kube-flannel-ds-vb6n9 1/1 Running 0 93s
# kube-system coredns-7f9c544f75-67njd 1/1 Running 0 12m
# kube-system coredns-7f9c544f75-z82nl 1/1 Running 0 12m
# kube-system etcd-k8s-node1 1/1 Running 0 12m
# kube-system kube-apiserver-k8s-node1 1/1 Running 0 12m
# kube-system kube-controller-manager-k8s-node1 1/1 Running 0 12m
# kube-system kube-proxy-fs2p6 1/1 Running 0 7m28s
# kube-system kube-proxy-x7rkp 1/1 Running 0 7m28s
# kube-system kube-proxy-xpbvt 1/1 Running 0 12m
# kube-system kube-scheduler-k8s-node1 1/1 Running 0 12m
# check from the master node
kubectl get nodes
# normal output
# NAME STATUS ROLES AGE VERSION
# k8s-node1 Ready master 14m v1.17.3
# k8s-node2 Ready <none> 8m52s v1.17.3
# k8s-node3 Ready <none> 8m52s v1.17.3
# get detailed info on all resources, including Pods, Services, etc.
kubectl get all -o wide
# normal output
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
# service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 17m <none>
3.6 KubeSphere Installation
Official site: kubesphere.io
First install Helm (run on the master node).
Helm is the package manager for Kubernetes. A package manager works like apt on Ubuntu, yum on CentOS, or pip in Python: it lets you quickly find, download, and install packages. Helm consists of the client component helm and the server component Tiller; it packages a set of K8s resources for unified management and is the best way to find, share, and use software built for Kubernetes.
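For a feel of what Helm v2 looks like once Tiller is up, here is a generic usage sketch (the chart and release names are just examples, not something this tutorial installs):
helm repo update # refresh chart indexes
helm search mysql # search the configured repos (helm v2 syntax)
helm install stable/mysql --name my-db # install a release named my-db
helm list # list releases managed by Tiller
helm delete --purge my-db # remove the release completely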
# remove any previous installation
sudo rm -rf /usr/local/bin/helm
sudo rm -rf /usr/local/bin/tiller
kubectl get deployments -n kube-system
kubectl delete deployment tiller-deploy -n kube-system
kubectl get svc -n kube-system
kubectl delete service tiller-deploy -n kube-system
kubectl get serviceaccounts -n kube-system
kubectl delete serviceaccount tiller -n kube-system
kubectl get clusterrolebindings
kubectl delete clusterrolebinding tiller
kubectl get configmap -n kube-system
kubectl delete configmap tiller-config -n kube-system
kubectl get pods -n kube-system
kubectl get all -n kube-system
# install (run on master)
HELM_VERSION=v2.16.3 curl -L https://git.io/get_helm.sh | bash # the installed version may differ
# use the following commands and this exact version instead, otherwise it will fail
curl -LO https://get.helm.sh/helm-v2.16.3-linux-amd64.tar.gz
tar -zxvf helm-v2.16.3-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm
sudo mv linux-amd64/tiller /usr/local/bin/tiller
sudo rm -rf linux-amd64
# % Total % Received % Xferd Average Speed Time Time Time Current
# Dload Upload Total Spent Left Speed
# 100 24.0M 100 24.0M 0 0 1594k 0 0:00:15 0:00:15 --:--:-- 1833k
# root@k8s-node1:/home/slienceme# tar -zxvf helm-v2.16.3-linux-amd64.tar.gz
# linux-amd64/
# linux-amd64/README.md
# linux-amd64/LICENSE
# linux-amd64/tiller
# linux-amd64/helm
# verify the version (run on master)
helm version
# normal output
# Client: &version.Version{SemVer:"v2.16.3", GitCommit:"dd2e5695da88625b190e6b22e9542550ab503a47", GitTreeState:"clean"}
# Error: could not find tiller
# make the Helm binaries executable
sudo chmod +x /usr/local/bin/helm
sudo chmod +x /usr/local/bin/tiller
# create RBAC for Tiller (run on master); file contents below
vim helm-rbac.yaml
helm-rbac.yaml contents:
apiVersion: v1
kind: ServiceAccount
metadata:
name: tiller
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: tiller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: tiller
namespace: kube-system
# continuing from the steps above
# apply the configuration
kubectl apply -f helm-rbac.yaml
# normal output
# serviceaccount/tiller created
# clusterrolebinding.rbac.authorization.k8s.io/tiller created
# install Tiller (run on master)
helm init --service-account=tiller --tiller-image=jessestuart/tiller:v2.16.3 --history-max 300
# the default stable chart repository is no longer maintained, so point helm at the new one
helm init --service-account=tiller --tiller-image=jessestuart/tiller:v2.16.3 --history-max 300 --stable-repo-url https://charts.helm.sh/stable
# normal output
# Creating /root/.helm
# Creating /root/.helm/repository
# Creating /root/.helm/repository/cache
# Creating /root/.helm/repository/local
# Creating /root/.helm/plugins
# Creating /root/.helm/starters
# Creating /root/.helm/cache/archive
# Creating /root/.helm/repository/repositories.yaml
# Adding stable repo with URL: https://charts.helm.sh/stable
# Adding local repo with URL: http://127.0.0.1:8879/charts
# $HELM_HOME has been configured at /root/.helm.
#
# Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
#
# Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
# To prevent this, run `helm init` with the --tiller-tls-verify flag.
# For more information on securing your installation see: https://v2.helm.sh/docs/securing_installation/
# possible error
# Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
# Error: error initializing: Looks like "https://kubernetes-charts.storage.googleapis.com" is not a valid chart repository or cannot be reached: Failed to fetch https://kubernetes-charts.storage.googleapis.com/index.yaml : 404 Not Found
# the default stable chart repository is no longer maintained; switch to the new URL as shown above
# verify the installation: if the commands below print their help menus, the install succeeded
helm
tiller
kubectl get pods --all-namespaces
# normal output
# NAMESPACE NAME READY STATUS RESTARTS AGE
# kube-flannel kube-flannel-ds-66bcs 1/1 Running 0 11h
# kube-flannel kube-flannel-ds-ntwwx 1/1 Running 0 11h
# kube-flannel kube-flannel-ds-vb6n9 1/1 Running 0 11h
# kube-system coredns-7f9c544f75-67njd 0/1 CrashLoopBackOff 9 12h
# kube-system coredns-7f9c544f75-z82nl 0/1 CrashLoopBackOff 9 12h
# kube-system etcd-k8s-node1 1/1 Running 0 12h
# kube-system kube-apiserver-k8s-node1 1/1 Running 0 12h
# kube-system kube-controller-manager-k8s-node1 1/1 Running 0 12h
# kube-system kube-proxy-fs2p6 1/1 Running 0 11h
# kube-system kube-proxy-x7rkp 1/1 Running 0 11h
# kube-system kube-proxy-xpbvt 1/1 Running 0 12h
# kube-system kube-scheduler-k8s-node1 1/1 Running 0 12h
# kube-system tiller-deploy-6ffcfbc8df-mzwd7 1/1 Running 0 2m6s
kubectl get nodes -o wide
# normal output
# NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
# k8s-node1 Ready master 12h v1.17.3 192.168.137.130 <none> Ubuntu 20.04.6 LTS 5.4.0-208-generic docker://19.3.15
# k8s-node2 Ready <none> 11h v1.17.3 192.168.137.131 <none> Ubuntu 20.04.6 LTS 5.4.0-208-generic docker://19.3.15
# k8s-node3 Ready <none> 11h v1.17.3 192.168.137.132 <none> Ubuntu 20.04.6 LTS 5.4.0-208-generic docker://19.3.15
# confirm that the master node has a taint
kubectl describe node k8s-node1 | grep Taint
# normal output
# Taints: node-role.kubernetes.io/master:NoSchedule
# if present, remove the taint
kubectl taint nodes k8s-node1 node-role.kubernetes.io/master:NoSchedule-
# normal output
# node/k8s-node1 untainted
OpenEBS is an open-source containerized storage solution that provides persistent storage for Kubernetes. It offers block and file storage and supports dynamic creation and management of storage volumes, using storage engines such as cStor, Jiva, and Mayastor that run on Kubernetes to back persistent volumes for containers.
Main features of OpenEBS (a small PVC sketch follows this list):
- Persistent storage management: provides persistent volumes for containers in the cluster so state survives restarts and migrations.
- Dynamic volume provisioning: volumes are created on demand, avoiding manual storage management.
- High availability and fault tolerance: keeps storage available so container data is not lost when a node fails.
- Easy to scale: storage capacity can be scaled out as needed in containerized environments.
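As a minimal sketch of how this is consumed (my own example, assuming the openebs-hostpath StorageClass installed below), a PersistentVolumeClaim simply references the StorageClass by name:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc            # hypothetical name
spec:
  storageClassName: openebs-hostpath
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
EOF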
# create the namespace
kubectl create ns openebs
# normal output
# namespace/openebs created
# list all current namespaces
kubectl get ns
# normal output
# NAME STATUS AGE
# default Active 13h
# kube-flannel Active 13h
# kube-node-lease Active 13h
# kube-public Active 13h
# kube-system Active 13h
# openebs Active 38s
# install directly from a local manifest file
vim openebs-operator-1.5.0.yaml
kubectl apply -f openebs-operator-1.5.0.yaml
# normal output
# Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
# namespace/openebs configured
# serviceaccount/openebs-maya-operator created
# clusterrole.rbac.authorization.k8s.io/openebs-maya-operator created
# clusterrolebinding.rbac.authorization.k8s.io/openebs-maya-operator created
# deployment.apps/maya-apiserver created
# service/maya-apiserver-service created
# deployment.apps/openebs-provisioner created
# deployment.apps/openebs-snapshot-operator created
# configmap/openebs-ndm-config created
# daemonset.apps/openebs-ndm created
# deployment.apps/openebs-ndm-operator created
# deployment.apps/openebs-admission-server created
# deployment.apps/openebs-localpv-provisioner created
# check the result of the install
kubectl get sc -n openebs
# normal output
# NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
# openebs-device openebs.io/local Delete WaitForFirstConsumer false 62s
# openebs-hostpath openebs.io/local Delete WaitForFirstConsumer false 62s
# openebs-jiva-default openebs.io/provisioner-iscsi Delete Immediate false 62s
# openebs-snapshot-promoter volumesnapshot.external-storage.k8s.io/snapshot-promoter Delete Immediate false 62s
# check the pods again
kubectl get pods --all-namespaces
# normal output
# NAMESPACE NAME READY STATUS RESTARTS AGE
# kube-flannel kube-flannel-ds-66bcs 1/1 Running 2 13h
# kube-flannel kube-flannel-ds-ntwwx 1/1 Running 0 13h
# kube-flannel kube-flannel-ds-vb6n9 1/1 Running 6 13h
# kube-system coredns-57dd576fcb-44722 1/1 Running 1 57m
# kube-system coredns-57dd576fcb-pvhvv 1/1 Running 1 57m
# kube-system etcd-k8s-node1 1/1 Running 5 13h
# kube-system kube-apiserver-k8s-node1 1/1 Running 5 13h
# kube-system kube-controller-manager-k8s-node1 1/1 Running 4 13h
# kube-system kube-proxy-fs2p6 1/1 Running 1 13h
# kube-system kube-proxy-x7rkp 1/1 Running 0 13h
# kube-system kube-proxy-xpbvt 1/1 Running 4 13h
# kube-system kube-scheduler-k8s-node1 1/1 Running 4 13h
# kube-system tiller-deploy-6ffcfbc8df-mzwd7 1/1 Running 1 98m
# openebs maya-apiserver-7f664b95bb-s2zv6 1/1 Running 0 5m8s
# openebs openebs-admission-server-889d78f96-v5pwd 1/1 Running 0 5m8s
# openebs openebs-localpv-provisioner-67bddc8568-ms88s 1/1 Running 0 5m8s
# openebs openebs-ndm-hv64z 1/1 Running 0 5m8s
# openebs openebs-ndm-lpv9t 1/1 Running 0 5m8s
# openebs openebs-ndm-operator-5db67cd5bb-pl6gc 1/1 Running 1 5m8s
# openebs openebs-ndm-sqqv2 1/1 Running 0 5m8s
# openebs openebs-provisioner-c68bfd6d4-sxpb2 1/1 Running 0 5m8s
# openebs openebs-snapshot-operator-7ffd685677-x9wzj 2/2 Running 0 5m8s
# set openebs-hostpath as the default StorageClass:
kubectl patch storageclass openebs-hostpath -p \
'{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# normal output
# storageclass.storage.k8s.io/openebs-hostpath patched
# kubectl taint nodes adds taints to a node, controlling which Pods may be scheduled onto it.
# Combined with tolerations, this gives fine-grained scheduling control and keeps Pods off unsuitable nodes.
# re-apply the taint
kubectl taint nodes k8s-node1 node-role.kubernetes.io/master:NoSchedule
# normal output
# node/k8s-node1 tainted
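For reference, a Pod that should still be allowed onto the tainted master needs a matching toleration in its spec. A minimal illustrative example (not applied anywhere in this tutorial):
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo     # hypothetical name
spec:
  tolerations:
  - key: "node-role.kubernetes.io/master"
    operator: "Exists"
    effect: "NoSchedule"
  containers:
  - name: demo
    image: nginx
EOF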
# minimal KubeSphere install; kubesphere-minimal.yaml is listed below
vim kubesphere-minimal.yaml
kubectl apply -f kubesphere-minimal.yaml
# or install directly from the network
kubectl apply -f https://raw.githubusercontent.com/kubesphere/ks-installer/v2.1.1/kubesphere-minimal.yaml
# normal output
# namespace/kubesphere-system created
# configmap/ks-installer created
# serviceaccount/ks-installer created
# clusterrole.rbac.authorization.k8s.io/ks-installer created
# clusterrolebinding.rbac.authorization.k8s.io/ks-installer created
# deployment.apps/ks-installer created
# watch the installer log and wait patiently for the installation to succeed
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f
# the log is long, so it is listed separately
# normal output
# 2025-03-18T04:46:27Z INFO : shell-operator v1.0.0-beta.5
# 2025-03-18T04:46:27Z INFO : Use temporary dir: /tmp/shell-operator
# 2025-03-18T04:46:27Z INFO : Initialize hooks manager ...
# 2025-03-18T04:46:27Z INFO : Search and load hooks ...
# 2025-03-18T04:46:27Z INFO : Load hook config from '/hooks/kubesphere/installRunner.py'
# 2025-03-18T04:46:27Z INFO : HTTP SERVER Listening on 0.0.0.0:9115
# 2025-03-18T04:46:27Z INFO : Initializing schedule manager ...
# 2025-03-18T04:46:27Z INFO : KUBE Init Kubernetes client
# 2025-03-18T04:46:27Z INFO : KUBE-INIT Kubernetes client is configured successfully
# 2025-03-18T04:46:27Z INFO : MAIN: run main loop
# 2025-03-18T04:46:27Z INFO : MAIN: add onStartup tasks
# 2025-03-18T04:46:27Z INFO : Running schedule manager ...
# 2025-03-18T04:46:27Z INFO : QUEUE add all HookRun@OnStartup
# 2025-03-18T04:46:27Z INFO : MSTOR Create new metric shell_operator_live_ticks
# 2025-03-18T04:46:27Z INFO : MSTOR Create new metric shell_operator_tasks_queue_length
# 2025-03-18T04:46:27Z INFO : GVR for kind 'ConfigMap' is /v1, Resource=configmaps
# 2025-03-18T04:46:27Z INFO : EVENT Kube event 'b85d730a-83a0-4b72-b9a3-0653c27dc89e'
# 2025-03-18T04:46:27Z INFO : QUEUE add TASK_HOOK_RUN@KUBE_EVENTS kubesphere/installRunner.py
# 2025-03-18T04:46:30Z INFO : TASK_RUN HookRun@KUBE_EVENTS kubesphere/installRunner.py
# 2025-03-18T04:46:30Z INFO : Running hook 'kubesphere/installRunner.py' binding 'KUBE_EVENTS' ...
# [WARNING]: No inventory was parsed, only implicit localhost is available
# [WARNING]: provided hosts list is empty, only localhost is available. Note that
# the implicit localhost does not match 'all'
# .................................................................
# .................................................................
# .................................................................
# Start installing monitoring
# **************************************************
# task monitoring status is successful
# total: 1 completed:1
# **************************************************
#####################################################
### Welcome to KubeSphere! ###
#####################################################
# Console: http://192.168.137.130:30880
# Account: admin
# Password: P@88w0rd
# NOTES:
# 1. After logging into the console, please check the
# monitoring status of service components in
# the "Cluster Status". If the service is not
# ready, please wait patiently. You can start
# to use when all components are ready.
# 2. Please modify the default password after login.
#
########################################################
3.7 Stuck
TASK [common : Kubesphere | Deploy openldap] ***********************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "/usr/local/bin/helm upgrade --install ks-openldap /etc/kubesphere/openldap-ha -f /etc/kubesphere/custom-values-openldap.yaml --set fullnameOverride=openldap --namespace kubesphere-system\n", "delta": "0:00:00.710229", "end": "2025-03-18 09:15:43.271042", "msg": "non-zero return code", "rc": 1, "start": "2025-03-18 09:15:42.560813", "stderr": "Error: UPGRADE FAILED: \"ks-openldap\" has no deployed releases", "stderr_lines": ["Error: UPGRADE FAILED: \"ks-openldap\" has no deployed releases"], "stdout": "", "stdout_lines": []}
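In Helm v2, "UPGRADE FAILED: ... has no deployed releases" usually means an earlier attempt left a failed release record behind. One way to recover (my own suggestion, not from the original article) is to purge the failed release and let the installer retry:
# run on the master (helm v2 syntax)
helm list --all | grep openldap # the failed ks-openldap release should show up here
helm delete --purge ks-openldap # remove the failed release record
# restart the installer pod so it re-runs the task
kubectl delete pod -n kubesphere-system -l app=ks-install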
Configuration Files
openebs-operator-1.5.0.yaml
openebs-operator-1.5.0.yaml contents:
# This manifest deploys the OpenEBS control plane components, with associated CRs & RBAC rules
# NOTE: On GKE, deploy the openebs-operator.yaml in admin context
# Create the OpenEBS namespace
apiVersion: v1
kind: Namespace
metadata:
name: openebs
---
# Create Maya Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
name: openebs-maya-operator
namespace: openebs
---
# Define Role that allows operations on K8s pods/deployments
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: openebs-maya-operator
rules:
- apiGroups: ["*"]
resources: ["nodes", "nodes/proxy"]
verbs: ["*"]
- apiGroups: ["*"]
resources: ["namespaces", "services", "pods", "pods/exec", "deployments", "deployments/finalizers", "replicationcontrollers", "replicasets", "events", "endpoints", "configmaps", "secrets", "jobs", "cronjobs"]
verbs: ["*"]
- apiGroups: ["*"]
resources: ["statefulsets", "daemonsets"]
verbs: ["*"]
- apiGroups: ["*"]
resources: ["resourcequotas", "limitranges"]
verbs: ["list", "watch"]
- apiGroups: ["*"]
resources: ["ingresses", "horizontalpodautoscalers", "verticalpodautoscalers", "poddisruptionbudgets", "certificatesigningrequests"]
verbs: ["list", "watch"]
- apiGroups: ["*"]
resources: ["storageclasses", "persistentvolumeclaims", "persistentvolumes"]
verbs: ["*"]
- apiGroups: ["volumesnapshot.external-storage.k8s.io"]
resources: ["volumesnapshots", "volumesnapshotdatas"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apiextensions.k8s.io"]
resources: ["customresourcedefinitions"]
verbs: [ "get", "list", "create", "update", "delete", "patch"]
- apiGroups: ["*"]
resources: [ "disks", "blockdevices", "blockdeviceclaims"]
verbs: ["*" ]
- apiGroups: ["*"]
resources: [ "cstorpoolclusters", "storagepoolclaims", "storagepoolclaims/finalizers", "cstorpoolclusters/finalizers", "storagepools"]
verbs: ["*" ]
- apiGroups: ["*"]
resources: [ "castemplates", "runtasks"]
verbs: ["*" ]
- apiGroups: ["*"]
resources: [ "cstorpools", "cstorpools/finalizers", "cstorvolumereplicas", "cstorvolumes", "cstorvolumeclaims"]
verbs: ["*" ]
- apiGroups: ["*"]
resources: [ "cstorpoolinstances", "cstorpoolinstances/finalizers"]
verbs: ["*" ]
- apiGroups: ["*"]
resources: [ "cstorbackups", "cstorrestores", "cstorcompletedbackups"]
verbs: ["*" ]
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["get", "watch", "list", "delete", "update", "create"]
- apiGroups: ["admissionregistration.k8s.io"]
resources: ["validatingwebhookconfigurations", "mutatingwebhookconfigurations"]
verbs: ["get", "create", "list", "delete", "update", "patch"]
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
- apiGroups: ["*"]
resources: [ "upgradetasks"]
verbs: ["*" ]
---
# Bind the Service Account with the Role Privileges.
# TODO: Check if default account also needs to be there
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: openebs-maya-operator
subjects:
- kind: ServiceAccount
name: openebs-maya-operator
namespace: openebs
roleRef:
kind: ClusterRole
name: openebs-maya-operator
apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: maya-apiserver
namespace: openebs
labels:
name: maya-apiserver
openebs.io/component-name: maya-apiserver
openebs.io/version: 1.5.0
spec:
selector:
matchLabels:
name: maya-apiserver
openebs.io/component-name: maya-apiserver
replicas: 1
strategy:
type: Recreate
rollingUpdate: null
template:
metadata:
labels:
name: maya-apiserver
openebs.io/component-name: maya-apiserver
openebs.io/version: 1.5.0
spec:
serviceAccountName: openebs-maya-operator
containers:
- name: maya-apiserver
imagePullPolicy: IfNotPresent
image: quay.io/openebs/m-apiserver:1.5.0
ports:
- containerPort: 5656
env:
# OPENEBS_IO_KUBE_CONFIG enables maya api service to connect to K8s
# based on this config. This is ignored if empty.
# This is supported for maya api server version 0.5.2 onwards
#- name: OPENEBS_IO_KUBE_CONFIG
# value: "/home/ubuntu/.kube/config"
# OPENEBS_IO_K8S_MASTER enables maya api service to connect to K8s
# based on this address. This is ignored if empty.
# This is supported for maya api server version 0.5.2 onwards
#- name: OPENEBS_IO_K8S_MASTER
# value: "http://172.28.128.3:8080"
# OPENEBS_NAMESPACE provides the namespace of this deployment as an
# environment variable
- name: OPENEBS_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# OPENEBS_SERVICE_ACCOUNT provides the service account of this pod as
# environment variable
- name: OPENEBS_SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
# OPENEBS_MAYA_POD_NAME provides the name of this pod as
# environment variable
- name: OPENEBS_MAYA_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
# If OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG is false then OpenEBS default
# storageclass and storagepool will not be created.
- name: OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG
value: "true"
# OPENEBS_IO_INSTALL_DEFAULT_CSTOR_SPARSE_POOL decides whether default cstor sparse pool should be
# configured as a part of openebs installation.
# If "true" a default cstor sparse pool will be configured, if "false" it will not be configured.
# This value takes effect only if OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG
# is set to true
- name: OPENEBS_IO_INSTALL_DEFAULT_CSTOR_SPARSE_POOL
value: "false"
# OPENEBS_IO_CSTOR_TARGET_DIR can be used to specify the hostpath
# to be used for saving the shared content between the side cars
# of cstor volume pod.
# The default path used is /var/openebs/sparse
#- name: OPENEBS_IO_CSTOR_TARGET_DIR
# value: "/var/openebs/sparse"
# OPENEBS_IO_CSTOR_POOL_SPARSE_DIR can be used to specify the hostpath
# to be used for saving the shared content between the side cars
# of cstor pool pod. This ENV is also used to indicate the location
# of the sparse devices.
# The default path used is /var/openebs/sparse
#- name: OPENEBS_IO_CSTOR_POOL_SPARSE_DIR
# value: "/var/openebs/sparse"
# OPENEBS_IO_JIVA_POOL_DIR can be used to specify the hostpath
# to be used for default Jiva StoragePool loaded by OpenEBS
# The default path used is /var/openebs
# This value takes effect only if OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG
# is set to true
#- name: OPENEBS_IO_JIVA_POOL_DIR
# value: "/var/openebs"
# OPENEBS_IO_LOCALPV_HOSTPATH_DIR can be used to specify the hostpath
# to be used for default openebs-hostpath storageclass loaded by OpenEBS
# The default path used is /var/openebs/local
# This value takes effect only if OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG
# is set to true
#- name: OPENEBS_IO_LOCALPV_HOSTPATH_DIR
# value: "/var/openebs/local"
- name: OPENEBS_IO_JIVA_CONTROLLER_IMAGE
value: "quay.io/openebs/jiva:1.5.0"
- name: OPENEBS_IO_JIVA_REPLICA_IMAGE
value: "quay.io/openebs/jiva:1.5.0"
- name: OPENEBS_IO_JIVA_REPLICA_COUNT
value: "3"
- name: OPENEBS_IO_CSTOR_TARGET_IMAGE
value: "quay.io/openebs/cstor-istgt:1.5.0"
- name: OPENEBS_IO_CSTOR_POOL_IMAGE
value: "quay.io/openebs/cstor-pool:1.5.0"
- name: OPENEBS_IO_CSTOR_POOL_MGMT_IMAGE
value: "quay.io/openebs/cstor-pool-mgmt:1.5.0"
- name: OPENEBS_IO_CSTOR_VOLUME_MGMT_IMAGE
value: "quay.io/openebs/cstor-volume-mgmt:1.5.0"
- name: OPENEBS_IO_VOLUME_MONITOR_IMAGE
value: "quay.io/openebs/m-exporter:1.5.0"
- name: OPENEBS_IO_CSTOR_POOL_EXPORTER_IMAGE
value: "quay.io/openebs/m-exporter:1.5.0"
- name: OPENEBS_IO_HELPER_IMAGE
value: "quay.io/openebs/linux-utils:1.5.0"
# OPENEBS_IO_ENABLE_ANALYTICS if set to true sends anonymous usage
# events to Google Analytics
- name: OPENEBS_IO_ENABLE_ANALYTICS
value: "true"
- name: OPENEBS_IO_INSTALLER_TYPE
value: "openebs-operator"
# OPENEBS_IO_ANALYTICS_PING_INTERVAL can be used to specify the duration (in hours)
# for periodic ping events sent to Google Analytics.
# Default is 24h.
# Minimum is 1h. You can convert this to weekly by setting 168h
#- name: OPENEBS_IO_ANALYTICS_PING_INTERVAL
# value: "24h"
livenessProbe:
exec:
command:
- /usr/local/bin/mayactl
- version
initialDelaySeconds: 30
periodSeconds: 60
readinessProbe:
exec:
command:
- /usr/local/bin/mayactl
- version
initialDelaySeconds: 30
periodSeconds: 60
---
apiVersion: v1
kind: Service
metadata:
name: maya-apiserver-service
namespace: openebs
labels:
openebs.io/component-name: maya-apiserver-svc
spec:
ports:
- name: api
port: 5656
protocol: TCP
targetPort: 5656
selector:
name: maya-apiserver
sessionAffinity: None
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: openebs-provisioner
namespace: openebs
labels:
name: openebs-provisioner
openebs.io/component-name: openebs-provisioner
openebs.io/version: 1.5.0
spec:
selector:
matchLabels:
name: openebs-provisioner
openebs.io/component-name: openebs-provisioner
replicas: 1
strategy:
type: Recreate
rollingUpdate: null
template:
metadata:
labels:
name: openebs-provisioner
openebs.io/component-name: openebs-provisioner
openebs.io/version: 1.5.0
spec:
serviceAccountName: openebs-maya-operator
containers:
- name: openebs-provisioner
imagePullPolicy: IfNotPresent
image: quay.io/openebs/openebs-k8s-provisioner:1.5.0
env:
# OPENEBS_IO_K8S_MASTER enables openebs provisioner to connect to K8s
# based on this address. This is ignored if empty.
# This is supported for openebs provisioner version 0.5.2 onwards
#- name: OPENEBS_IO_K8S_MASTER
# value: "http://10.128.0.12:8080"
# OPENEBS_IO_KUBE_CONFIG enables openebs provisioner to connect to K8s
# based on this config. This is ignored if empty.
# This is supported for openebs provisioner version 0.5.2 onwards
#- name: OPENEBS_IO_KUBE_CONFIG
# value: "/home/ubuntu/.kube/config"
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: OPENEBS_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# OPENEBS_MAYA_SERVICE_NAME provides the maya-apiserver K8s service name,
# that provisioner should forward the volume create/delete requests.
# If not present, "maya-apiserver-service" will be used for lookup.
# This is supported for openebs provisioner version 0.5.3-RC1 onwards
#- name: OPENEBS_MAYA_SERVICE_NAME
# value: "maya-apiserver-apiservice"
livenessProbe:
exec:
command:
- pgrep
- ".*openebs"
initialDelaySeconds: 30
periodSeconds: 60
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: openebs-snapshot-operator
namespace: openebs
labels:
name: openebs-snapshot-operator
openebs.io/component-name: openebs-snapshot-operator
openebs.io/version: 1.5.0
spec:
selector:
matchLabels:
name: openebs-snapshot-operator
openebs.io/component-name: openebs-snapshot-operator
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
name: openebs-snapshot-operator
openebs.io/component-name: openebs-snapshot-operator
openebs.io/version: 1.5.0
spec:
serviceAccountName: openebs-maya-operator
containers:
- name: snapshot-controller
image: quay.io/openebs/snapshot-controller:1.5.0
imagePullPolicy: IfNotPresent
env:
- name: OPENEBS_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
livenessProbe:
exec:
command:
- pgrep
- ".*controller"
initialDelaySeconds: 30
periodSeconds: 60
# OPENEBS_MAYA_SERVICE_NAME provides the maya-apiserver K8s service name,
# that snapshot controller should forward the snapshot create/delete requests.
# If not present, "maya-apiserver-service" will be used for lookup.
# This is supported for openebs provisioner version 0.5.3-RC1 onwards
#- name: OPENEBS_MAYA_SERVICE_NAME
# value: "maya-apiserver-apiservice"
- name: snapshot-provisioner
image: quay.io/openebs/snapshot-provisioner:1.5.0
imagePullPolicy: IfNotPresent
env:
- name: OPENEBS_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# OPENEBS_MAYA_SERVICE_NAME provides the maya-apiserver K8s service name,
# that snapshot provisioner should forward the clone create/delete requests.
# If not present, "maya-apiserver-service" will be used for lookup.
# This is supported for openebs provisioner version 0.5.3-RC1 onwards
#- name: OPENEBS_MAYA_SERVICE_NAME
# value: "maya-apiserver-apiservice"
livenessProbe:
exec:
command:
- pgrep
- ".*provisioner"
initialDelaySeconds: 30
periodSeconds: 60
---
# This is the node-disk-manager related config.
# It can be used to customize the disks probes and filters
apiVersion: v1
kind: ConfigMap
metadata:
name: openebs-ndm-config
namespace: openebs
labels:
openebs.io/component-name: ndm-config
data:
# udev-probe is default or primary probe which should be enabled to run ndm
# filterconfigs contains configs of filters - in their form of include
# and exclude comma separated strings
node-disk-manager.config: |
probeconfigs:
- key: udev-probe
name: udev probe
state: true
- key: seachest-probe
name: seachest probe
state: false
- key: smart-probe
name: smart probe
state: true
filterconfigs:
- key: os-disk-exclude-filter
name: os disk exclude filter
state: true
exclude: "/,/etc/hosts,/boot"
- key: vendor-filter
name: vendor filter
state: true
include: ""
exclude: "CLOUDBYT,OpenEBS"
- key: path-filter
name: path filter
state: true
include: ""
exclude: "loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/dm-,/dev/md"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: openebs-ndm
namespace: openebs
labels:
name: openebs-ndm
openebs.io/component-name: ndm
openebs.io/version: 1.5.0
spec:
selector:
matchLabels:
name: openebs-ndm
openebs.io/component-name: ndm
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
name: openebs-ndm
openebs.io/component-name: ndm
openebs.io/version: 1.5.0
spec:
# By default the node-disk-manager will be run on all kubernetes nodes
# If you would like to limit this to only some nodes, say the nodes
# that have storage attached, you could label those node and use
# nodeSelector.
#
# e.g. label the storage nodes with - "openebs.io/nodegroup"="storage-node"
# kubectl label node <node-name> "openebs.io/nodegroup"="storage-node"
#nodeSelector:
# "openebs.io/nodegroup": "storage-node"
serviceAccountName: openebs-maya-operator
hostNetwork: true
containers:
- name: node-disk-manager
image: quay.io/openebs/node-disk-manager-amd64:v0.4.5
imagePullPolicy: Always
securityContext:
privileged: true
volumeMounts:
- name: config
mountPath: /host/node-disk-manager.config
subPath: node-disk-manager.config
readOnly: true
- name: udev
mountPath: /run/udev
- name: procmount
mountPath: /host/proc
readOnly: true
- name: sparsepath
mountPath: /var/openebs/sparse
env:
# namespace in which NDM is installed will be passed to NDM Daemonset
# as environment variable
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# pass hostname as env variable using downward API to the NDM container
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# specify the directory where the sparse files need to be created.
# if not specified, then sparse files will not be created.
- name: SPARSE_FILE_DIR
value: "/var/openebs/sparse"
# Size(bytes) of the sparse file to be created.
- name: SPARSE_FILE_SIZE
value: "10737418240"
# Specify the number of sparse files to be created
- name: SPARSE_FILE_COUNT
value: "0"
livenessProbe:
exec:
command:
- pgrep
- ".*ndm"
initialDelaySeconds: 30
periodSeconds: 60
volumes:
- name: config
configMap:
name: openebs-ndm-config
- name: udev
hostPath:
path: /run/udev
type: Directory
# mount /proc (to access mount file of process 1 of host) inside container
# to read mount-point of disks and partitions
- name: procmount
hostPath:
path: /proc
type: Directory
- name: sparsepath
hostPath:
path: /var/openebs/sparse
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: openebs-ndm-operator
namespace: openebs
labels:
name: openebs-ndm-operator
openebs.io/component-name: ndm-operator
openebs.io/version: 1.5.0
spec:
selector:
matchLabels:
name: openebs-ndm-operator
openebs.io/component-name: ndm-operator
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
name: openebs-ndm-operator
openebs.io/component-name: ndm-operator
openebs.io/version: 1.5.0
spec:
serviceAccountName: openebs-maya-operator
containers:
- name: node-disk-operator
image: quay.io/openebs/node-disk-operator-amd64:v0.4.5
imagePullPolicy: Always
readinessProbe:
exec:
command:
- stat
- /tmp/operator-sdk-ready
initialDelaySeconds: 4
periodSeconds: 10
failureThreshold: 1
env:
- name: WATCH_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
# the service account of the ndm-operator pod
- name: SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
- name: OPERATOR_NAME
value: "node-disk-operator"
- name: CLEANUP_JOB_IMAGE
value: "quay.io/openebs/linux-utils:1.5.0"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: openebs-admission-server
namespace: openebs
labels:
app: admission-webhook
openebs.io/component-name: admission-webhook
openebs.io/version: 1.5.0
spec:
replicas: 1
strategy:
type: Recreate
rollingUpdate: null
selector:
matchLabels:
app: admission-webhook
template:
metadata:
labels:
app: admission-webhook
openebs.io/component-name: admission-webhook
openebs.io/version: 1.5.0
spec:
serviceAccountName: openebs-maya-operator
containers:
- name: admission-webhook
image: quay.io/openebs/admission-server:1.5.0
imagePullPolicy: IfNotPresent
args:
- -alsologtostderr
- -v=2
- 2>&1
env:
- name: OPENEBS_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: ADMISSION_WEBHOOK_NAME
value: "openebs-admission-server"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: openebs-localpv-provisioner
namespace: openebs
labels:
name: openebs-localpv-provisioner
openebs.io/component-name: openebs-localpv-provisioner
openebs.io/version: 1.5.0
spec:
selector:
matchLabels:
name: openebs-localpv-provisioner
openebs.io/component-name: openebs-localpv-provisioner
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
name: openebs-localpv-provisioner
openebs.io/component-name: openebs-localpv-provisioner
openebs.io/version: 1.5.0
spec:
serviceAccountName: openebs-maya-operator
containers:
- name: openebs-provisioner-hostpath
imagePullPolicy: Always
image: quay.io/openebs/provisioner-localpv:1.5.0
env:
# OPENEBS_IO_K8S_MASTER enables openebs provisioner to connect to K8s
# based on this address. This is ignored if empty.
# This is supported for openebs provisioner version 0.5.2 onwards
#- name: OPENEBS_IO_K8S_MASTER
# value: "http://10.128.0.12:8080"
# OPENEBS_IO_KUBE_CONFIG enables openebs provisioner to connect to K8s
# based on this config. This is ignored if empty.
# This is supported for openebs provisioner version 0.5.2 onwards
#- name: OPENEBS_IO_KUBE_CONFIG
# value: "/home/ubuntu/.kube/config"
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: OPENEBS_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# OPENEBS_SERVICE_ACCOUNT provides the service account of this pod as
# environment variable
- name: OPENEBS_SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
- name: OPENEBS_IO_ENABLE_ANALYTICS
value: "true"
- name: OPENEBS_IO_INSTALLER_TYPE
value: "openebs-operator"
- name: OPENEBS_IO_HELPER_IMAGE
value: "quay.io/openebs/linux-utils:1.5.0"
livenessProbe:
exec:
command:
- pgrep
- ".*localpv"
initialDelaySeconds: 30
periodSeconds: 60
---
kubesphere-minimal.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: kubesphere-system
---
apiVersion: v1
data:
ks-config.yaml: |
---
persistence:
storageClass: ""
etcd:
monitoring: False
endpointIps: 192.168.0.7,192.168.0.8,192.168.0.9
port: 2379
tlsEnable: True
common:
mysqlVolumeSize: 20Gi
minioVolumeSize: 20Gi
etcdVolumeSize: 20Gi
openldapVolumeSize: 2Gi
redisVolumSize: 2Gi
metrics_server:
enabled: False
console:
enableMultiLogin: False # enable/disable multi login
port: 30880
monitoring:
prometheusReplicas: 1
prometheusMemoryRequest: 400Mi
prometheusVolumeSize: 20Gi
grafana:
enabled: False
logging:
enabled: False
elasticsearchMasterReplicas: 1
elasticsearchDataReplicas: 1
logsidecarReplicas: 2
elasticsearchMasterVolumeSize: 4Gi
elasticsearchDataVolumeSize: 20Gi
logMaxAge: 7
elkPrefix: logstash
containersLogMountedPath: ""
kibana:
enabled: False
openpitrix:
enabled: False
devops:
enabled: False
jenkinsMemoryLim: 2Gi
jenkinsMemoryReq: 1500Mi
jenkinsVolumeSize: 8Gi
jenkinsJavaOpts_Xms: 512m
jenkinsJavaOpts_Xmx: 512m
jenkinsJavaOpts_MaxRAM: 2g
sonarqube:
enabled: False
postgresqlVolumeSize: 8Gi
servicemesh:
enabled: False
notification:
enabled: False
alerting:
enabled: False
kind: ConfigMap
metadata:
name: ks-installer
namespace: kubesphere-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: ks-installer
namespace: kubesphere-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: ks-installer
rules:
- apiGroups:
- ""
resources:
- '*'
verbs:
- '*'
- apiGroups:
- apps
resources:
- '*'
verbs:
- '*'
- apiGroups:
- extensions
resources:
- '*'
verbs:
- '*'
- apiGroups:
- batch
resources:
- '*'
verbs:
- '*'
- apiGroups:
- rbac.authorization.k8s.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- apiregistration.k8s.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- apiextensions.k8s.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- tenant.kubesphere.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- certificates.k8s.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- devops.kubesphere.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- monitoring.coreos.com
resources:
- '*'
verbs:
- '*'
- apiGroups:
- logging.kubesphere.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- jaegertracing.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- storage.k8s.io
resources:
- '*'
verbs:
- '*'
- apiGroups:
- admissionregistration.k8s.io
resources:
- '*'
verbs:
- '*'
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: ks-installer
subjects:
- kind: ServiceAccount
name: ks-installer
namespace: kubesphere-system
roleRef:
kind: ClusterRole
name: ks-installer
apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: ks-installer
namespace: kubesphere-system
labels:
app: ks-install
spec:
replicas: 1
selector:
matchLabels:
app: ks-install
template:
metadata:
labels:
app: ks-install
spec:
serviceAccountName: ks-installer
containers:
- name: installer
image: kubesphere/ks-installer:v2.1.1
imagePullPolicy: "Always"
[Q&A] Issue Summary
01-CoreDNS not working properly
# problem description
# root@k8s-node1:/home/slienceme# kubectl get pods --all-namespaces
# NAMESPACE NAME READY STATUS RESTARTS AGE
# kube-flannel kube-flannel-ds-66bcs 1/1 Running 0 12h
# kube-flannel kube-flannel-ds-ntwwx 1/1 Running 0 12h
# kube-flannel kube-flannel-ds-vb6n9 1/1 Running 5 12h
# kube-system coredns-7f9c544f75-67njd 0/1 CrashLoopBackOff 17 12h
# kube-system coredns-7f9c544f75-z82nl 0/1 CrashLoopBackOff 17 12h
# kube-system etcd-k8s-node1 1/1 Running 4 12h
# kube-system kube-apiserver-k8s-node1 1/1 Running 4 12h
# kube-system kube-controller-manager-k8s-node1 1/1 Running 3 12h
# kube-system kube-proxy-fs2p6 1/1 Running 0 12h
# kube-system kube-proxy-x7rkp 1/1 Running 0 12h
# kube-system kube-proxy-xpbvt 1/1 Running 3 12h
# kube-system kube-scheduler-k8s-node1 1/1 Running 3 12h
# kube-system tiller-deploy-6ffcfbc8df-mzwd7 1/1 Running 0 39m
Tracking the problem down led to this GitHub issue: CoreDNS pod goes to CrashLoopBackOff State
The fix is as follows:
You need to modify the CoreDNS configuration in Kubernetes:
Step 1: edit the CoreDNS ConfigMap
kubectl edit configmap coredns -n kube-system
Step 2: modify the forward setting in the Corefile
The current configuration is:
forward . /etc/resolv.conf
Change it to point at an external DNS server, such as Google's public DNS 8.8.8.8 or Cloudflare's 1.1.1.1. For example:
forward . 8.8.8.8
Or, to use multiple DNS servers, list several addresses:
forward . 8.8.8.8 8.8.4.4
Step 3: save and exit
Step 4: verify the change took effect
Check that the CoreDNS Pods are running and the new configuration is active:
kubectl get pods -n kube-system -l k8s-app=kube-dns
If needed, restart the CoreDNS Pods to make sure the new configuration is picked up:
kubectl rollout restart deployment coredns -n kube-system
With these steps, CoreDNS no longer loops DNS requests back to itself, and queries are forwarded to an external DNS server instead.
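To double-check DNS from inside the cluster after the change, you can run a throwaway test Pod (the busybox image/tag here is simply a common choice for DNS debugging):
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default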
02-KubeSphere installation issues
To completely clean up a previous KubeSphere installation and reinstall, follow these steps:
Step 1: delete the previous KubeSphere resources
Use kubectl delete to remove all related resources. This deletes everything created during the KubeSphere installation (Deployments, Pods, ConfigMaps, Services, and so on).
kubectl delete -f kubesphere-minimal.yaml
If you installed KubeSphere from other YAML files as well, delete those too.
- Delete the KubeSphere namespace:
If you no longer need the kubesphere-system namespace, delete it; it contains all KubeSphere-related resources.
kubectl delete namespace kubesphere-system
Note: deleting a namespace deletes every resource inside it.
Step 2: redeploy KubeSphere
Make sure everything has been cleaned up:
You can check whether the KubeSphere resources are completely gone with:
kubectl get namespaces
kubectl get all -n kubesphere-system
Make sure no KubeSphere resources are listed anymore.
Re-apply the kubesphere-minimal.yaml configuration:
To redeploy KubeSphere, run:
kubectl apply -f kubesphere-minimal.yaml
# or
kubectl apply -f https://raw.githubusercontent.com/kubesphere/ks-installer/v2.1.1/kubesphere-minimal.yaml
This restarts the KubeSphere installation from a clean environment.
Step 3: monitor the installation
You can follow the installer Pod's logs to monitor the installation:
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f
Step 4: verify the installation
Check that KubeSphere deployed successfully:
Once the installation finishes, check that the KubeSphere components are running:
kubectl get pods -n kubesphere-system
Make sure all Pods are in the Running or Completed state.
Access the KubeSphere console:
If you exposed the console via Ingress or a NodePort, open it in a browser to confirm the installation finished and you can log in.
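If you are unsure which port the console is exposed on, the ks-console Service shows it (it should match the 30880 NodePort set in kubesphere-minimal.yaml):
kubectl get svc ks-console -n kubesphere-system # look at the PORT(S) column, e.g. 80:30880/TCP
# then open http://<any-node-ip>:30880 in a browser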
03-KubeSphere-related pods not healthy
ks-apigateway-78bcdc8ffc-z7cgm
kubectl get pods -n kubesphere-system
# NAMESPACE NAME READY STATUS RESTARTS AGE
# kubesphere-system ks-account-596657f8c6-l8qzq 0/1 Init:0/2 0 11m
# kubesphere-system ks-apigateway-78bcdc8ffc-z7cgm 0/1 Error 7 11m
# kubesphere-system ks-apiserver-5b548d7c5c-gj47l 1/1 Running 0 11m
# kubesphere-system ks-console-78bcf96dbf-64j7h 1/1 Running 0 11m
# kubesphere-system ks-controller-manager-696986f8d9-kd5bl 1/1 Running 0 11m
# kubesphere-system ks-installer-75b8d89dff-mm5z5 1/1 Running 0 12m
# kubesphere-system redis-6fd6c6d6f9-x2l5w 1/1 Running 0 12m
# check the logs first
kubectl logs -n kubesphere-system ks-apigateway-78bcdc8ffc-z7cgm
# 2025/03/18 04:59:10 [INFO][cache:0xc0000c1130] Started certificate maintenance routine
# [DEV NOTICE] Registered directive 'authenticate' before 'jwt'
# [DEV NOTICE] Registered directive 'authentication' before 'jwt'
# [DEV NOTICE] Registered directive 'swagger' before 'jwt'
# Activating privacy features... done.
# E0318 04:59:15.962304 1 redis.go:51] unable to reach redis hostdial tcp 10.96.171.117:6379: i/o timeout
# 2025/03/18 04:59:15 dial tcp 10.96.171.117:6379: i/o timeout
# unable to reach redis hostdial tcp 10.96.171.117:6379: i/o timeout
# a Redis problem shows up
kubectl logs -n kubesphere-system redis-6fd6c6d6f9-x2l5w
# 1:C 18 Mar 2025 04:47:14.085 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
# ..........
# 1:M 18 Mar 2025 04:47:14.086 * Ready to accept connections
# Redis itself looks fine
# check the services
kubectl get svc -n kubesphere-system
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# ks-account ClusterIP 10.96.85.94 <none> 80/TCP 18m
# ks-apigateway ClusterIP 10.96.114.5 <none> 80/TCP 18m
# ks-apiserver ClusterIP 10.96.229.139 <none> 80/TCP 18m
# ks-console NodePort 10.96.18.47 <none> 80:30880/TCP 18m
# redis ClusterIP 10.96.171.117 <none> 6379/TCP 18m
# restart redis and ks-apigateway
kubectl delete pod -n kubesphere-system redis-6fd6c6d6f9-x2l5w
kubectl delete pod -n kubesphere-system ks-apigateway-78bcdc8ffc-z7cgm
# ...this did not directly fix it; it suddenly started working on its own
ks-account-596657f8c6-l8qzq
# check the logs first
kubectl logs -n kubesphere-system ks-account-596657f8c6-l8qzq
# W0318 06:00:23.940798 1 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
# E0318 06:00:25.758353 1 ldap.go:78] unable to read LDAP response packet: unexpected EOF
# E0318 06:00:25.758449 1 im.go:87] create default users unable to read LDAP response packet: unexpected EOF
# Error: unable to read LDAP response packet: unexpected EOF
# ......
# 2025/03/18 06:00:25 unable to read LDAP response packet: unexpected EOF