Table of Contents
- Disclaimer
- Configure the yum repositories
- Install Docker
- Install kubeadm, kubelet, and kubectl
- Deploy the master node
- Join the other nodes to the cluster
- Install the network plugin
Disclaimer
The k8s tutorials I followed all had problems of one kind or another, and I hit plenty of pitfalls doing the setup myself; in the end it took Baidu, CSDN, and ChatGPT combined to get a working cluster. So I decided to write a standalone post that consolidates those tutorials into one setup reference.
This walkthrough uses three VMs, with addresses ending in 119, 120, and 121, and is intended for learning only. I also recommend snapshotting each VM or server after every step, so that a mistake at any point is easy to roll back.
Configure the yum repositories
On CentOS 7, yum may fail with "Cannot find a valid baseurl for repo: centos-sclo-rh/x86_64" when installing packages. The fix:
cd /etc/yum.repos.d/
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
# If wget doesn't work, download the file in a browser and copy it into place
sudo wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# Then run:
yum clean all
yum update
yum makecache
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux (Linux's mandatory access control subsystem)
sed -i 's/enforcing/disabled/' /etc/selinux/config
# Disable swap (comments out the swap entry in /etc/fstab)
sed -ri 's/.*swap.*/#&/' /etc/fstab
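The sed above only comments out the fstab entry, so it takes effect at the next reboot; to switch swap off for the current session as well:
swapoff -a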
# Set the bridge networking parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
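Note that the net.bridge.* sysctl keys only exist once the br_netfilter kernel module is loaded. A minimal sketch to load it and apply the settings right away, instead of waiting for the reboot at the end of this section:
# Load the bridge netfilter module so the keys above exist
modprobe br_netfilter
# Re-read every file under /etc/sysctl.d/, including k8s.conf
sysctl --system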
# Time synchronization (very important)
yum install ntpdate -y
ntpdate time.windows.com
Note: if a yum download is interrupted and yum ends up locked, remove the lock file with the command below
rm -f /var/run/yum.pid
# On the master node, give each host's IP a name
cat >> /etc/hosts << EOF
192.168.10.119 master
192.168.10.120 node1
192.168.10.121 node2
EOF
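A quick sanity check that the names resolve:
ping -c 2 node1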
Reboot the VM or server once this is configured.
Install Docker
# Fixes "yum install -y docker-ce" reporting that no package is available
yum -y install yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# If a yum download is interrupted and leaves yum locked, remove the lock file
rm -f /var/run/yum.pid
# Install docker-ce
yum -y install docker-ce
Next, open the Aliyun image accelerator page (阿里云镜像加速器) and follow the on-page instructions to configure a registry mirror.
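What that page walks you through amounts to writing a registry-mirrors entry into /etc/docker/daemon.json. A minimal sketch, where <your-id>.mirror.aliyuncs.com is a placeholder for the accelerator address shown in your own Aliyun console:
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://<your-id>.mirror.aliyuncs.com"]
}
EOF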
sudo systemctl daemon-reload
sudo systemctl restart docker
# Start on boot
systemctl enable docker
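To confirm Docker is running and the mirror took effect:
systemctl status docker
docker info | grep -A1 -i "registry mirrors"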
Install kubeadm, kubelet, and kubectl
# Configure the yum repo mirror
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Run the following to import the signing keys
curl -fsSL https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg | sudo gpg --import
curl -fsSL https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg | sudo gpg --import
sudo rpm --import https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
sudo yum clean all
sudo yum makecache
While yum makecache runs it will prompt you to confirm the GPG keys; type y and press Enter.
# Install the packages
yum install kubelet-1.19.4 kubeadm-1.19.4 kubectl-1.19.4 -y
# Start kubelet on boot
systemctl enable kubelet.service
# Check that everything installed
yum list installed | grep kubelet
yum list installed | grep kubeadm
yum list installed | grep kubectl
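You can also confirm the versions directly:
kubeadm version
kubectl version --client
kubelet --version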
Deploy the master node
Run this on the master node:
kubeadm init \
--apiserver-advertise-address=192.168.10.119 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.19.4 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16
Here 192.168.10.119 is the master node's IP address; find yours with ifconfig.
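If kubeadm init fails partway through (an image pull timing out, say), wipe the partial state before retrying:
kubeadm reset -f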
When it succeeds, kubeadm prints follow-up instructions, including the kubeadm join command that the worker nodes will need later.
Copy the commands out of that output and run them; the key three are:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
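If you are working as root, exporting the admin kubeconfig achieves the same thing:
export KUBECONFIG=/etc/kubernetes/admin.conf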
Then try:
kubectl get nodes
The master node should now be listed (it stays NotReady until the network plugin is installed in a later step).
Join the other nodes to the cluster
Copy the kubeadm join command printed by the master earlier and run it on each of the other nodes.
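The command has this shape; the token and hash are placeholders for the values from your own kubeadm init output:
kubeadm join 192.168.10.119:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
If you no longer have that output (tokens expire after 24 hours by default), generate a fresh join command on the master:
kubeadm token create --print-join-command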
Once the join succeeds, confirm from the master node:
kubectl get nodes
Install the network plugin
Upload kube-flannel.yml to the master node and apply it with kubectl apply -f kube-flannel.yml. The file contents:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
    - min: 0
      max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups: ['extensions']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['psp.flannel.unprivileged']
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:amd64
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:amd64
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:arm64
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:arm64
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - arm
      hostNetwork: true
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:arm
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:arm
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-ppc64le
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - ppc64le
      hostNetwork: true
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:ppc64le
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:ppc64le
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-s390x
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - s390x
      hostNetwork: true
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:s390x
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: registry.cn-zhangjiakou.aliyuncs.com/test-lab/coreos-flannel:s390x
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
Apply it:
kubectl apply -f kube-flannel.yml
# If you later find a problem with this file and want to swap it out, remove it first with:
# kubectl delete -f kube-flannel.yml
Once it's applied, start by checking the system pods:
kubectl get pods -n kube-system
You will most likely see two pods (the coredns pair) stuck at 0/1 READY while everything else shows 1/1. Why is that?
And kubectl get nodes still reports the nodes as NotReady:
Inspect one of the nodes with the following command:
# Run on the master; replace <node-name> with the node to inspect (node1 in my case)
kubectl describe node <node-name>
The output here also says the network component isn't actually ready yet (the kubelet reports something like NetworkPluginNotReady: cni config uninitialized):
On the master node, check the kubelet logs:
journalctl -u kubelet -xe
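To narrow the log down to the CNI-related lines, something like:
journalctl -u kubelet --no-pager | grep -i cni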
Searching those two key messages online revealed that /etc/cni/net.d belongs to the flannel network plugin we installed, so flannel had been applied but something about it was still broken.
Flannel is supposed to install its CNI plugin into /opt/cni/bin automatically, so I took a look there:
ls /opt/cni/bin
Several binaries were missing from the listing, flannel among them, which means the CNI plugins had not actually been installed.
With the cause found, the fix is easy: install the CNI plugin on every node:
mkdir -p /opt/cni/bin
wget https://github.com/flannel-io/cni-plugin/releases/download/v1.1.0/flannel-amd64 -O /opt/cni/bin/flannel
chmod +x /opt/cni/bin/flannel
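If other standard plugins referenced by the config are missing too (portmap, for example, which cni-conf.json above delegates to), they ship in the containernetworking plugins bundle; a sketch, assuming the v0.8.6 release:
wget https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz
tar -xzf cni-plugins-linux-amd64-v0.8.6.tgz -C /opt/cni/bin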
Running ls /opt/cni/bin again afterwards showed the full set of plugins.
ls /etc/cni/net.d/ also showed the expected config file, 10-flannel.conflist:
{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}
Then run the following so the CNI configuration is regenerated and picked up by k8s:
# Run on the master node
kubectl delete pod -n kube-system -l app=flannel
# Run on the master and all worker nodes
sudo systemctl restart kubelet
# Run on the master node
[root@master ~]# kubectl get pods -n kube-system -l app=flannel
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-4dk66 1/1 Running 0 33s
kube-flannel-ds-amd64-g76mr 1/1 Running 0 23s
kube-flannel-ds-amd64-zdklx 1/1 Running 0 34s
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 62m v1.19.4
node1 Ready <none> 59m v1.19.4
node2 Ready <none> 59m v1.19.4
Then run:
kubectl get pods -n kube-system
If some pods are still in a bad state, delete them first; k8s will recreate them automatically. For example, for the three abnormal pods I hit (etcd-master, kube-controller-manager-master, and kube-scheduler-master):
kubectl delete pod -n kube-system etcd-master
kubectl delete pod -n kube-system kube-controller-manager-master
kubectl delete pod -n kube-system kube-scheduler-master
kubectl get pods -n kube-system
If that still doesn't fix it, pull the logs of the failing pod and search the errors online:
# Replace <pod-name> with the pod to inspect, as listed by kubectl get pods -n kube-system
# e.g. for etcd-master: kubectl logs -n kube-system etcd-master
kubectl logs -n kube-system <pod-name>
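kubectl describe often surfaces scheduling and image-pull problems that never reach the container log, so it is worth checking alongside the logs:
kubectl describe pod -n kube-system <pod-name>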