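1 Taints
1.1 Introduction to taints
Affinity scheduling works from the Pod's point of view: attributes are added to the Pod to steer it onto particular nodes. Scheduling can also be controlled from the Node's point of view, by setting attributes on the Node that decide whether Pods are allowed to be scheduled onto it. These attributes are called taints.
Once a taint is set on a Node, the Node repels Pods: it refuses to have Pods scheduled onto it and can even evict Pods that are already running there.
A taint has the format key=value:effect. key and value form the taint's label, and effect describes what the taint does. Three effects are supported: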
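- PreferNoSchedule: Kubernetes tries to avoid scheduling Pods onto a Node with this taint, unless no other node is available.
- NoSchedule: Kubernetes does not schedule Pods onto a Node with this taint, but Pods already running on the Node are not affected.
- NoExecute: Kubernetes does not schedule Pods onto a Node with this taint, and it also evicts Pods already running on the Node.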
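To make the key=value:effect format concrete, here is a minimal Python sketch that splits a taint string into its three parts. The Taint class and the parse_taint function are hypothetical names used only for illustration; they are not part of Kubernetes or kubectl, and the real validation rules are stricter than what is shown here.

```python
from dataclasses import dataclass

# The three effects listed above.
VALID_EFFECTS = {"PreferNoSchedule", "NoSchedule", "NoExecute"}


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


def parse_taint(spec: str) -> Taint:
    """Parse 'key=value:effect' (or 'key:effect', with an empty value)."""
    # The effect is everything after the last colon.
    left, sep, effect = spec.rpartition(":")
    if not sep or effect not in VALID_EFFECTS:
        raise ValueError(f"invalid taint spec: {spec!r}")
    # The value is optional; without '=', it stays empty.
    key, _, value = left.partition("=")
    if not key:
        raise ValueError(f"taint key must not be empty: {spec!r}")
    return Taint(key, value, effect)
```

For example, parse_taint("name=nginx:PreferNoSchedule") yields a taint with key "name", value "nginx" and effect "PreferNoSchedule", while a string without a valid effect raises ValueError.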
1.2 Taint commands
# Set a taint
$ kubectl taint nodes node1 key=value:effect
# Remove a taint
$ kubectl taint nodes node1 key:effect-
# Remove all taints with the given key
$ kubectl taint nodes node1 key-
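The three command forms above can be read as operations on a node's list of taints. The sketch below is a simplified Python model of that reading, not kubectl's implementation. apply_taint_arg is a hypothetical helper, and it assumes that adding a taint whose key and effect already exist replaces the old entry (roughly what kubectl does when --overwrite is given).

```python
def apply_taint_arg(taints, arg):
    """Return a new taint list after applying one `kubectl taint` argument.

    taints is a list of (key, value, effect) tuples.
    """
    if arg.endswith("-"):
        # Removal: 'key:effect-' removes one taint, 'key-' removes every
        # taint with that key. A value, if present, is ignored.
        spec = arg[:-1]
        key, sep, effect = spec.partition(":")
        key = key.partition("=")[0]
        return [
            t for t in taints
            if not (t[0] == key and (not sep or t[2] == effect))
        ]
    # Addition: 'key=value:effect' or 'key:effect'.
    left, _, effect = arg.rpartition(":")
    key, _, value = left.partition("=")
    # Assumption: an existing taint with the same key and effect is replaced.
    kept = [t for t in taints if not (t[0] == key and t[2] == effect)]
    return kept + [(key, value, effect)]
```

Starting from an empty list, applying "name=nginx:PreferNoSchedule" and then "name:PreferNoSchedule-" leaves the list empty again.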
1.3 Taint example
1) Set a taint on node1 so that Pods avoid it where possible
[root@master resource_manage]# kubectl taint nodes node1 name=nginx:PreferNoSchedule
node/node1 tainted
2) Create an nginx Pod
[root@master resource_manage]# kubectl run nginx --image=nginx:1.17.1 --port=80
pod/nginx created
3) Check where the Pod was scheduled
[root@master resource_manage]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx 1/1 Running 0 7s 10.244.2.48 node2 <none> <none>
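The Pod was scheduled straight onto node2 rather than node1. If node2 were down and node1 were the only node left, the Pod would still be scheduled onto node1, because PreferNoSchedule is only a soft preference.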
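The behaviour seen in this example can be sketched in a few lines of Python. This is a deliberately simplified model, not the real kube-scheduler, which scores nodes on many more factors. pick_node is a hypothetical function, and its input is assumed to list, per node, the effects of the taints that the Pod does not tolerate.

```python
HARD_EFFECTS = {"NoSchedule", "NoExecute"}


def pick_node(nodes):
    """Pick a node name from {name: [effects of untolerated taints]}.

    Returns None when every node is ruled out by a hard taint.
    """
    # Hard taints filter a node out completely.
    feasible = [
        name for name, effects in nodes.items()
        if not HARD_EFFECTS & set(effects)
    ]
    if not feasible:
        return None
    # PreferNoSchedule only lowers a node's priority; ties break by name.
    return min(
        feasible,
        key=lambda name: (nodes[name].count("PreferNoSchedule"), name),
    )
```

With node1 tainted PreferNoSchedule and node2 untainted, the model picks node2. With node1 as the only node, it still picks node1, and with a NoSchedule taint on the only node it returns None.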
1.4 Viewing a node's taints
[root@master resource_manage]# kubectl describe node node1
Name: node1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=node1
kubernetes.io/os=linux
nodeenv=test
Annotations: flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"ba:fe:1f:25:fe:26"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 192.168.16.41
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 14 Mar 2022 14:41:02 +0800
Taints: name=nginx:PreferNoSchedule
Unschedulable: false
Lease:
HolderIdentity: node1
AcquireTime: <unset>
RenewTime: Sat, 26 Mar 2022 00:00:54 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Mon, 14 Mar 2022 14:43:39 +0800 Mon, 14 Mar 2022 14:43:39 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Fri, 25 Mar 2022 23:58:57 +0800 Mon, 14 Mar 2022 14:41:02 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 25 Mar 2022 23:58:57 +0800 Mon, 14 Mar 2022 14:41:02 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Fri, 25 Mar 2022 23:58:57 +0800 Mon, 14 Mar 2022 14:41:02 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Fri, 25 Mar 2022 23:58:57 +0800 Mon, 14 Mar 2022 14:43:42 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.16.41
Hostname: node1
Capacity:
cpu: 8
ephemeral-storage: 208357992Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32882960Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 192022725110
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32780560Ki
pods: 110
System Info:
Machine ID: f9c2b25f57184e06b8855490b4be6013
System UUID: d1042642-3933-564f-4f2d-279b5e96cead
Boot ID: 8517c1cc-8935-452e-9efb-a34f396b98a5
Kernel Version: 5.4.179-200.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.9
Kubelet Version: v1.21.2
Kube-Proxy Version: v1.21.2
PodCIDR: 10.244.1.0/24
PodCIDRs: 10.244.1.0/24
Non-terminated Pods: (4 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system kube-flannel-ds-gg4jq 100m (1%) 100m (1%) 50Mi (0%) 50Mi (0%) 11d
kube-system kube-proxy-tqzjl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 11d
kubernetes-dashboard dashboard-metrics-scraper-c45b7869d-7ll25 0 (0%) 0 (0%) 0 (0%) 0 (0%) 11d
kubernetes-dashboard kubernetes-dashboard-79b5779bf4-t28b4 0 (0%) 0 (0%) 0 (0%) 0 (0%) 11d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 100m (1%) 100m (1%)
memory 50Mi (0%) 50Mi (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
1.5 Removing the taint
$ kubectl taint nodes node1 name:PreferNoSchedule-
node/node1 untainted
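1.6 Why are Pods not scheduled onto the master node when they are created?
The output of the following command shows that the master node carries the node-role.kubernetes.io/master:NoSchedule taint by default, so newly created Pods are not scheduled onto the master node unless they tolerate that taint.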
[root@master resource_manage]# kubectl describe nodes master
Name: master
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=master
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node-role.kubernetes.io/master=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"02:f6:8e:03:60:51"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 192.168.16.40
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 14 Mar 2022 14:38:03 +0800
Taints: node-role.kubernetes.io/master:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: master
AcquireTime: <unset>
RenewTime: Sat, 26 Mar 2022 00:05:31 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Mon, 14 Mar 2022 14:42:58 +0800 Mon, 14 Mar 2022 14:42:58 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Sat, 26 Mar 2022 00:01:28 +0800 Mon, 14 Mar 2022 14:38:02 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sat, 26 Mar 2022 00:01:28 +0800 Mon, 14 Mar 2022 14:38:02 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sat, 26 Mar 2022 00:01:28 +0800 Mon, 14 Mar 2022 14:38:02 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sat, 26 Mar 2022 00:01:28 +0800 Mon, 14 Mar 2022 14:43:03 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.16.40
Hostname: master
Capacity:
cpu: 8
ephemeral-storage: 208357992Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32882960Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 192022725110
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32780560Ki
pods: 110
System Info:
Machine ID: f9c2b25f57184e06b8855490b4be6013
System UUID: c5d32642-f84c-61ef-ac7f-d65ae6880a51
Boot ID: 9cbc9b25-2cf2-42d8-aa89-1fdab687c447
Kernel Version: 5.4.179-200.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.9
Kubelet Version: v1.21.2
Kube-Proxy Version: v1.21.2
PodCIDR: 10.244.0.0/24
PodCIDRs: 10.244.0.0/24
Non-terminated Pods: (6 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system etcd-master 100m (1%) 0 (0%) 100Mi (0%) 0 (0%) 11d
kube-system kube-apiserver-master 250m (3%) 0 (0%) 0 (0%) 0 (0%) 11d
kube-system kube-controller-manager-master 200m (2%) 0 (0%) 0 (0%) 0 (0%) 11d
kube-system kube-flannel-ds-n76xj 100m (1%) 100m (1%) 50Mi (0%) 50Mi (0%) 11d
kube-system kube-proxy-h27ms 0 (0%) 0 (0%) 0 (0%) 0 (0%) 11d
kube-system kube-scheduler-master 100m (1%) 0 (0%) 0 (0%) 0 (0%) 11d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 750m (9%) 100m (1%)
memory 150Mi (0%) 50Mi (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
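2 Tolerations
2.1 Introduction to tolerations
Sometimes a node has a taint, but certain Pods should still be allowed onto it. That is what tolerations are for. A taint means "reject" and a toleration means "ignore the rejection": the Node uses taints to refuse Pods, and a Pod uses tolerations to get past that refusal.
2.2 Tolerations in practice
1) Set a NoSchedule taint on node1
For this demonstration, keep node1 as the only running node and shut down the other nodes first.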
[root@master resource_manage]# kubectl taint nodes node1 name=nginx:NoSchedule
node/node1 tainted
2) Write the pod_toleration.yaml file, which includes a toleration
apiVersion: v1
kind: Namespace
metadata:
  name: dev
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  tolerations:
  - key: "name"
    operator: "Equal"
    value: "nginx"
    effect: "NoSchedule"
3) Create the resources
[root@master resource_manage]# kubectl apply -f pod_toleration.yaml
namespace/dev created
pod/nginx-pod created
4) Verify
The following command shows that the Pod is still scheduled onto node1 despite the taint:
[root@master resource_manage]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-pod 1/1 Running 0 13s 10.244.2.49 node1 <none> <none>
2.3 Toleration fields
The following command shows the documentation for each field:
[root@master resource_manage]# kubectl explain pod.spec.tolerations
KIND: Pod
VERSION: v1
RESOURCE: tolerations <[]Object>
DESCRIPTION:
If specified, the pod's tolerations.
The pod this Toleration is attached to tolerates any taint that matches the
triple <key,value,effect> using the matching operator <operator>.
FIELDS:
effect <string>
Effect indicates the taint effect to match. Empty means match all taint
effects. When specified, allowed values are NoSchedule, PreferNoSchedule
and NoExecute.
key <string>
Key is the taint key that the toleration applies to. Empty means match all
taint keys. If the key is empty, operator must be Exists; this combination
means to match all values and all keys.
operator <string>
Operator represents a key's relationship to the value. Valid operators are
Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for
value, so that a pod can tolerate all taints of a particular category.
tolerationSeconds <integer>
TolerationSeconds represents the period of time the toleration (which must
be of effect NoExecute, otherwise this field is ignored) tolerates the
taint. By default, it is not set, which means tolerate the taint forever
(do not evict). Zero and negative values will be treated as 0 (evict
immediately) by the system.
value <string>
Value is the taint value the toleration matches to. If the operator is
Exists, the value should be empty, otherwise just a regular string.
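The matching rules described in this output can be written out as a short Python sketch. It follows the field descriptions above (empty effect matches every effect, Exists acts as a wildcard for the value, an empty key requires Exists) and is an illustration only, not the code Kubernetes runs. tolerates and can_schedule are hypothetical names, and taints and tolerations are assumed to be plain dicts that use the same field names as the YAML.

```python
def tolerates(toleration, taint):
    """Return True if the toleration matches the taint."""
    # An empty effect matches all effects.
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")  # defaults to Equal
    if not key:
        # An empty key is only meaningful with Exists: match every key.
        return operator == "Exists"
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True  # wildcard for the value
    return operator == "Equal" and toleration.get("value", "") == taint.get("value", "")


def can_schedule(tolerations, taints):
    """True if every NoSchedule or NoExecute taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )
```

Applied to the example in section 2.2, the toleration with key "name", operator "Equal", value "nginx" and effect "NoSchedule" matches the name=nginx:NoSchedule taint on node1, so can_schedule returns True. A Pod with no tolerations gets False for the same node.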