Pod容器资源限制和探针

一、资源限制

1.Pod和容器的资源请求和限制

2.CPU 资源单位

案例一

案例二

二、健康检查，又称为探针（Probe）

1.探针的三种规则

2.Probe支持三种检查方法

3.探测获得的三种结果

案例一：exec

案例二：httpget

案例三：tcpSocket

案例四：就绪检测readinessProbe1

案例五：就绪检测readinessProbe2

案例六：启动、退出

一、资源限制

当定义Pod 时可以选择性地为每个容器设定所需要的资源数量。最常见的可设定资源是CPU和内存大小，以及其他类型的资源。

当为 Pod 中的容器指定了request资源时，调度器就使用该信息来决定将Pod调度到哪个节点上。当还为容器指定了limit资源时，kubelet就会确保运行的容器不会使用超出所设的limit资源量。kubelet还会为容器预留所设的 request 资源量，供该容器使用。
如果 Pod运行所在的节点具有足够的可用资源，容器且可以使用超出所设置的 request资源量。不过，容器不可以使用超出所设置的limit资源量。
如果给容器设置了内存的 limit值，但未设置内存的 request值，Kubernetes 会自动为其设置与内存 limit相匹配的 request值。类似的，如果给容器设置了CPU的 limit值但未设置cPU的 request值，则Kubernetes自动为其设置CPU 的request 值并使之与CPU 的limit 值匹配。

官网示例:
https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/

1.Pod和容器的资源请求和限制

spec .containers[].resources.requests.cpu		//定义创建容器时预分配的CPU资源
spec.containers[].resources.requests.memory		//定义创建容器时预分配的内存资源
spec.containers[].resources.imits.cpu			//定义cpu 的资源上限
spec.containers[].resources.1imits.memory		//定义内存的资源上限

2.CPU 资源单位

CPU资源的request和limit以cpu为单位。Kubernetes 中的一个cpu相当于1个VCPU (1个超线程)
Kubernetes也支持带小数CPU 的请求。spec.containers[].resources.requests.cpu为0.5 的容器能够获得一个cpu的一半CPU资源(类似于Cgroup对CPU资源的时间分片)。表达式0.1 等价于表达式100m (毫核)，表示每1000 毫秒内容器可以使用的CPU时间总量为0.1*1000 亳秒。

案例一

OOMKilled：资源不足被杀死

apiVersion: v1
kind: Pod
metadata:
  name: ky-web-db
spec:
  containers:
  - name: web
    image: nginx
    env:
    - name: WEB_ROOT_PASSWORD
      value: "password"
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: db
    image: mysql
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "abc123"
    resources:
      requests:
        memory: "64Mi"
        cpu: "0.25"
      limits:
        memory: "128Mi"
        cpu: "500m"

案例二

给足资源

二、健康检查，又称为探针（Probe）

探针是由kubelet对容器执行的定期诊断。

1.探针的三种规则

●livenessProbe ：判断容器是否正在运行。如果探测失败，则kubelet会杀死容器，并且容器将根据 restartPolicy 来设置 Pod 状态。如果容器不提供存活探针，则默认状态为Success。
●readinessProbe ：判断容器是否准备好接受请求。如果探测失败，端点控制器将从与 Pod 匹配的所有 service endpoints 中剔除删除该Pod的IP地址。初始延迟之前的就绪状态默认为Failure。如果容器不提供就绪探针，则默认状态为Success。
●startupProbe（这个1.17版本增加的）：判断容器内的应用程序是否已启动，主要针对于不能确定具体启动时间的应用。如果配置了 startupProbe 探测，在则在 startupProbe 状态为 Success 之前，其他所有探针都处于无效状态，直到它成功后其他探针才起作用。如果 startupProbe 失败，kubelet 将杀死容器，容器将根据 restartPolicy 来重启。如果容器没有配置 startupProbe，则默认状态为 Success。
#注：以上规则可以同时定义。在readinessProbe检测成功之前，Pod的running状态是不会变成ready状态的。

2.Probe支持三种检查方法

●exec ：在容器内执行指定命令。如果命令退出时返回码为0则认为诊断成功。
●tcpSocket ：对指定端口上的容器的IP地址进行TCP检查（三次握手）。如果端口打开，则诊断被认为是成功的。
●==httpGet ==：对指定的端口和路径上的容器的IP地址执行HTTPGet请求。如果响应的状态码大于等于200且小于400，则诊断被认为是成功的

3.探测获得的三种结果

每次探测都将获得以下三种结果之一
●成功：容器通过了诊断。
●失败：容器未通过诊断。
●未知：诊断失败，因此不会采取任何行动

官网示例：
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

案例一：exec

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    imagePullPolicy: IfNotPresent
    args:  
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 60
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      failureThreshold: 1 
      initialDelaySeconds: 5
      periodSeconds: 5

- touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 60

创建一个名为 /tmp/healthy 的文件，暂停执行脚本，等待 30 秒钟，删除后，再次暂停执行脚本，等待额外的 60 秒钟

案例二：httpget

apiVersion: v1
kind: Pod
metadata:
  name: liveness-httpget
  namespace: default
spec:
  containers:
  - name: liveness-httpget-container
    image: soscscs/myapp:v1           #soscscs：nginx1.12
    imagePullPolicy: IfNotPresent      #拉取策略
    ports:
    - name: http
      containerPort: 80
    livenessProbe:              #探针
      httpGet:
        port: http
        path: /index.html
      initialDelaySeconds: 1     #延迟1秒开始探测
      periodSeconds: 3          #每3秒探测一次
      timeoutSeconds: 10         #超时时间10秒

删除index.html页面，报错404，探针失败

案例三：tcpSocket

apiVersion: v1
kind: Pod
metadata:
  name: xzq-tcp-live
spec:
  containers:
  - name: nginx
    image: soscscs/myapp:v1
    livenessProbe:
      initialDelaySeconds: 5  #第一次探测延迟5秒，第6秒开始
      timeoutSeconds: 1
      tcpSocket:
        port: 8080
      periodSeconds: 10     #每10秒探测一次
      failureThreshold: 2   #允许2次失败

案例四：就绪检测readinessProbe1

apiVersion: v1
kind: Pod
metadata:
  name: readiness-httpget
  namespace: default
spec:
  containers:
  - name: readiness-httpget-container
    image: soscscs/myapp:v1
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: 80
        path: /index1.html
      initialDelaySeconds: 1
      periodSeconds: 3
    livenessProbe:
      httpGet:
        port: http
        path: /index.html
      initialDelaySeconds: 1
      periodSeconds: 3
      timeoutSeconds: 10

案例五：就绪检测readinessProbe2

apiVersion: v1
kind: Pod
metadata:
  name: myapp1
  labels:
     app: myapp
spec:
  containers:
  - name: myapp
    image: soscscs/myapp:v1
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: 80
        path: /index.html
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 10 
---
apiVersion: v1
kind: Pod
metadata:
  name: myapp2
  labels:
     app: myapp
spec:
  containers:
  - name: myapp
    image: soscscs/myapp:v1
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: 80
        path: /index.html
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 10 
---
apiVersion: v1
kind: Pod
metadata:
  name: myapp3
  labels:
     app: myapp
spec:
  containers:
  - name: myapp
    image: soscscs/myapp:v1
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: 80
        path: /index.html
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 10 
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  type: ClusterIP
  ports:
  - name: http
    port: 80
    targetPort: 80

删除页面，看效果

//readiness探测失败，Pod 无法进入READY状态，且端点控制器将从 endpoints 中剔除删除该 Pod 的 IP 地址

案例六：启动、退出

apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: lifecycle-demo-container
    image: soscscs/myapp:v1
    lifecycle:   #此为关键字段
      postStart:
        exec:
          command: ["/bin/sh", "-c", "echo Hello from the postStart handler >> /var/log/nginx/message"]      
      preStop:
        exec:
          command: ["/bin/sh", "-c", "echo Hello from the poststop handler >> /var/log/nginx/message"]
    volumeMounts:
    - name: message-log
      mountPath: /var/log/nginx/
      readOnly: false
  initContainers:
  - name: init-myservice
    image: soscscs/myapp:v1
    command: ["/bin/sh", "-c", "echo 'Hello initContainers'   >> /var/log/nginx/message"]
    volumeMounts:
    - name: message-log
      mountPath: /var/log/nginx/
      readOnly: false
  volumes:
  - name: message-log
    hostPath:
      path: /data/volumes/nginx/log/
      type: DirectoryOrCreate