Using the startupProbe on a Kubernetes Pod

2024/10/7 4:34:17


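When we start a Pod, Kubernetes considers it started as soon as it detects that the containers inside have started successfully, and the status then usually shows READY 1/1.

For example: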

root@k8s-master:~# kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
bq-api-demo   1/1     Running   0          34m

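So how does Kubernetes decide whether a Pod has finished starting?

When a container has no probes configured, the default rules are as follows:

Startup check: Kubernetes watches the container's start state. If the container's process has started and is not in a terminated state (for example, it has not crashed), Kubernetes considers the container started.

Readiness check: when no readiness probe is set, Kubernetes assumes the container is ready. This means that as soon as the container is running, the Pod is marked Ready and traffic can be forwarded to it, whether or not the application inside can actually serve requests yet.

Note that these default rules may not be enough to guarantee that the application has fully started and is usable. It is therefore strongly recommended to configure a suitable startup probe (startupProbe) and readiness probe (readinessProbe) in the Pod's YAML, so that Kubernetes can determine more precisely whether the Pod has started and is ready, which improves the reliability and stability of the application.

So in production we should set a startupProbe to let Kubernetes correctly determine that the Pod has finished starting. The readinessProbe is outside the scope of this article.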



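Building two APIs to check whether the application has started

As an example, we create two APIs: one that simulates success and one that simulates failure.

For the successful API we simply use /actuator/info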

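import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.actuate.info.Info;
import org.springframework.boot.actuate.info.InfoContributor;
import org.springframework.core.env.Environment;
import org.springframework.stereotype.Component;

// Adds application details to the response of /actuator/info
@Component
@Slf4j
public class AppVersionInfo implements InfoContributor {

    @Autowired
    private Environment environment;

    @Value("${pom.version}") // https://stackoverflow.com/questions/3697449/retrieve-version-from-maven-pom-xml-in-code
    private String appVersion;

    @Override
    public void contribute(Info.Builder builder) {
        // Logged on every call, so each probe request is visible in the Pod log
        log.info("AppVersionInfo: contribute ...");
        builder.withDetail("app", "Sales API")
                .withDetail("version", appVersion)
                .withDetail("description", "This is a simple Spring Boot application to demonstrate the use of BigQuery in GCP.");
    }
}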

For the failing API we write our own: /test/hello/fail

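import lombok.extern.slf4j.Slf4j;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// ApiResponse is assumed to be the project's own response wrapper (not shown in this article)
@Slf4j
@RestController
@RequestMapping("/test")
public class TestController {

    // Always answers with HTTP 500, so a probe pointed at this endpoint can never succeed
    @GetMapping("/hello/fail")
    public ResponseEntity<ApiResponse<String>> helloFail() {
        log.error("/test/hello/fail ... this api will always return 500 error");
        ApiResponse<String> response = new ApiResponse<>();
        response.setReturnCode(-1);
        response.setReturnMsg("this api will always return 500 error");
        return ResponseEntity.status(500).body(response);
    }
}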



Editing the Pod YAML file

Pay attention to the comments in the startupProbe section

apiVersion: v1 # api version
kind: Pod # type of this resource e.g. Pod/Deployment ..
metadata: 
  name: bq-api-demo
  labels: 
    pod-type: app # custom key value
    pod-version: v1.0.1
  namespace: 'default'
spec: # detail description
  containers: # key point
  - name: bq-api-service # custom name
    image: europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.1
    imagePullPolicy: IfNotPresent # try to use local image first, if no, then pull image from remote
    startupProbe:
      httpGet: # Responses with a status code from 200 to 399 are considered successful
        path: /actuator/info
        port: 8080
      initialDelaySeconds: 20 # Wait 20 seconds after the container starts before the first probe
      failureThreshold: 3 # Startup is considered failed only after 3 consecutive failed probes
      periodSeconds: 5 # Probe every 5 seconds
      timeoutSeconds: 5 # A probe that gets no response within 5 seconds counts as a failure
    ports:
    - name: http8080
      containerPort: 8080 # the port used by the container service
      protocol: TCP
    env:
    - name: JVM_OPTS
      value: '-Xms128m -Xmx2048m'
    resources:
      requests: # at least need 
        cpu: 1000m # 1000m = 1 core
        memory: 1000Mi 
      limits: # at max can use
        cpu: 2000m 
        memory: 2000Mi
    
  restartPolicy: OnFailure
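
As a rough sanity check on these settings, the time the container gets before the kubelet gives up on it can be estimated as initialDelaySeconds + failureThreshold × periodSeconds. The sketch below is a hypothetical helper (the class and method names are made up for this article and are not part of the project). It ignores timeoutSeconds and kubelet scheduling jitter, so the result is an approximation, not an exact bound.

```java
/**
 * Hypothetical helper, not part of the bq-api-service project.
 * Estimates how long a container has to pass its startupProbe before the
 * kubelet restarts it. Ignores timeoutSeconds and kubelet jitter.
 */
public class StartupProbeBudget {

    /** Approximate startup window in seconds. */
    public static int approxWindowSeconds(int initialDelaySeconds,
                                          int failureThreshold,
                                          int periodSeconds) {
        if (initialDelaySeconds < 0 || failureThreshold < 1 || periodSeconds < 1) {
            throw new IllegalArgumentException("invalid probe settings");
        }
        return initialDelaySeconds + failureThreshold * periodSeconds;
    }

    public static void main(String[] args) {
        // Values from the Pod spec above: 20s delay, 3 failures, 5s period.
        System.out.println(approxWindowSeconds(20, 3, 5)); // prints 35
    }
}
```

With the values used here the window is about 35 seconds. For an application that starts more slowly, failureThreshold × periodSeconds should probably be sized to cover the worst-case startup time.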



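Redeploying

# Usage: bash redeployPod.sh [pod_name] [yaml_filename]
# Positional arguments are optional; the defaults below are used when they are omitted.
pod_name=${1:-bq-api-demo}
yaml_filename=${2:-bq-api-service-startup-probe.yaml}
namespace=default

# Delete the specified Pod
kubectl delete pod "$pod_name" -n "$namespace"

# Wait until the Pod has been deleted
echo "Waiting for the pod to be deleted..."
kubectl wait pod "$pod_name" --for=delete -n "$namespace"

# Recreate the Pod from the specified YAML file
kubectl create -f "$yaml_filename" -n "$namespace"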

We can see that Kubernetes still detects that the Pod started successfully

root@k8s-master:~# kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
bq-api-demo   1/1     Running   0          34m

Describing the Pod confirms that the startup rule is in place (see the Startup line):

root@k8s-master:~# kubectl describe pod bq-api-demo
...
Containers:
  bq-api-service:
    Container ID:   docker://15c666bd6e22e174d54ccf8757838a26d89a26562a21edca9174f8bcdb03fa90
    Image:          europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.1
    Image ID:       docker-pullable://europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service@sha256:30fb2cebd2bf82863608037ce41048114c061acbf1182261a748dadefff2372f
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 17 Mar 2024 19:00:14 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  2000Mi
    Requests:
      cpu:     1
      memory:  1000Mi
    Startup:   http-get http://:8080/actuator/info delay=20s timeout=5s period=5s #success=1 #failure=3
    Environment:
      JVM_OPTS:  -Xms128m -Xmx2048m
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j2bpc (ro)
...

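The log shows that the AppVersionInfo endpoint was indeed called. The first request arrives at 19:00:38, about 24 seconds after the container started at 19:00:14, which is consistent with initialDelaySeconds: 20.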

root@k8s-master:~# kubectl logs bq-api-demo

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::               (v2.7.18)

2024-03-17 19:00:15.371  INFO 1 --- [           main] com.home.Application                     : Starting Application v1.1.1 using Java 11.0.16 on bq-api-demo with PID 1 (/app/app.jar started by root in /app)
2024-03-17 19:00:15.375  INFO 1 --- [           main] com.home.Application                     : No active profile set, falling back to 1 default profile: "default"
2024-03-17 19:00:16.601  INFO 1 --- [           main] faultConfiguringBeanFactoryPostProcessor : No bean named 'errorChannel' has been explicitly defined. Therefore, a default PublishSubscribeChannel will be created.
2024-03-17 19:00:16.618  INFO 1 --- [           main] faultConfiguringBeanFactoryPostProcessor : No bean named 'integrationHeaderChannelRegistry' has been explicitly defined. Therefore, a default DefaultHeaderChannelRegistry will be created.
2024-03-17 19:00:17.151  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8080 (http)
2024-03-17 19:00:17.160  INFO 1 --- [           main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2024-03-17 19:00:17.160  INFO 1 --- [           main] org.apache.catalina.core.StandardEngine  : Starting Servlet engine: [Apache Tomcat/9.0.83]
2024-03-17 19:00:17.238  INFO 1 --- [           main] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring embedded WebApplicationContext
2024-03-17 19:00:17.238  INFO 1 --- [           main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 1759 ms
2024-03-17 19:00:17.587  INFO 1 --- [           main] o.s.c.g.a.c.GcpContextAutoConfiguration  : The default project ID is jason-hsbc
2024-03-17 19:00:17.609  INFO 1 --- [           main] o.s.c.g.core.DefaultCredentialsProvider  : Default credentials provider for Google Compute Engine.
2024-03-17 19:00:17.609  INFO 1 --- [           main] o.s.c.g.core.DefaultCredentialsProvider  : Scopes in use by default credentials: [https://www.googleapis.com/auth/pubsub, https://www.googleapis.com/auth/spanner.admin, https://www.googleapis.com/auth/spanner.data, https://www.googleapis.com/auth/datastore, https://www.googleapis.com/auth/sqlservice.admin, https://www.googleapis.com/auth/devstorage.read_only, https://www.googleapis.com/auth/devstorage.read_write, https://www.googleapis.com/auth/cloudruntimeconfig, https://www.googleapis.com/auth/trace.append, https://www.googleapis.com/auth/cloud-platform, https://www.googleapis.com/auth/cloud-vision, https://www.googleapis.com/auth/bigquery, https://www.googleapis.com/auth/monitoring.write]
2024-03-17 19:00:17.704  INFO 1 --- [           main] com.home.api.config.MyInitializer        : Application started...
2024-03-17 19:00:17.705  INFO 1 --- [           main] com.home.api.config.MyInitializer        : https.proxyHost: null
2024-03-17 19:00:17.705  INFO 1 --- [           main] com.home.api.config.MyInitializer        : https.proxyPort: null
2024-03-17 19:00:18.370  INFO 1 --- [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 4 endpoint(s) beneath base path '/actuator'
2024-03-17 19:00:18.510  INFO 1 --- [           main] o.s.i.endpoint.EventDrivenConsumer       : Adding {logging-channel-adapter:_org.springframework.integration.errorLogger} as a subscriber to the 'errorChannel' channel
2024-03-17 19:00:18.510  INFO 1 --- [           main] o.s.i.channel.PublishSubscribeChannel    : Channel 'application.errorChannel' has 1 subscriber(s).
2024-03-17 19:00:18.511  INFO 1 --- [           main] o.s.i.endpoint.EventDrivenConsumer       : started bean '_org.springframework.integration.errorLogger'
2024-03-17 19:00:18.547  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8080 (http) with context path ''
2024-03-17 19:00:18.562  INFO 1 --- [           main] com.home.Application                     : Started Application in 3.869 seconds (JVM running for 4.353)
2024-03-17 19:00:18.598  INFO 1 --- [           main] com.home.Application                     : customParam: null
2024-03-17 19:00:38.644  INFO 1 --- [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring DispatcherServlet 'dispatcherServlet'
2024-03-17 19:00:38.644  INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : Initializing Servlet 'dispatcherServlet'
2024-03-17 19:00:38.646  INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : Completed initialization in 2 ms
2024-03-17 19:00:38.681  INFO 1 --- [nio-8080-exec-1] c.h.api.monitor.endpoint.AppVersionInfo  : AppVersionInfo: contribute ...



Simulating the failure case

First create a new YAML file whose probe targets /test/hello/fail, an endpoint that always returns 500.

    startupProbe:
      httpGet: # Responses with a status code from 200 to 399 are considered successful
        path: /test/hello/fail # always returns 500
        port: 8080
      initialDelaySeconds: 20 # Wait 20 seconds after the container starts before the first probe
      failureThreshold: 3 # Startup is considered failed only after 3 consecutive failed probes
      periodSeconds: 5 # Probe every 5 seconds
      timeoutSeconds: 5 # A probe that gets no response within 5 seconds counts as a failure
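
To make the failure rule concrete, here is a small sketch of how the outcome follows from the status codes the probe sees. It models only the two rules stated in the comments above (a status from 200 to 399 counts as success, and failureThreshold consecutive failures means startup has failed). The class and method names are made up for illustration, and the real kubelet logic has more to it than this.

```java
/**
 * Hypothetical model of the startup probe rules described in this article.
 * Not kubelet code and not part of the bq-api-service project.
 */
public class StartupProbeOutcome {

    /** An HTTP probe succeeds when the status code is in the range 200..399. */
    public static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    /**
     * Returns true if the given sequence of probe results leads to a startup
     * failure, i.e. failureThreshold failures occur before any success.
     * The startup probe stops probing after its first success.
     */
    public static boolean startupFails(int[] statusCodes, int failureThreshold) {
        int consecutiveFailures = 0;
        for (int code : statusCodes) {
            if (isSuccess(code)) {
                return false; // container counts as started
            }
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                return true; // threshold reached: the container is restarted
            }
        }
        return false; // not enough results yet to declare a failure
    }

    public static void main(String[] args) {
        System.out.println(startupFails(new int[] {500, 500, 500}, 3)); // prints true
        System.out.println(startupFails(new int[] {500, 500, 200}, 3)); // prints false
    }
}
```

An endpoint that always returns 500 therefore hits the threshold on the third probe, which is the situation this section sets up.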

Then redeploy

root@k8s-master:~/k8s-s/pods# bash redeployPod.sh bq-api-demo bq-api-service-startup-probe-fail.yaml 
pod "bq-api-demo" deleted
Waiting for the pod to be deleted...
pod/bq-api-demo created

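This time startup fails. After 96 seconds the Pod is still not ready (0/1) and the container has already been restarted 3 times: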

root@k8s-master:~# kubectl get pods -o wide
NAME          READY   STATUS    RESTARTS     AGE   IP            NODE        NOMINATED NODE   READINESS GATES
bq-api-demo   0/1     Running   3 (1s ago)   96s   10.244.3.16   k8s-node3   <none>           <none>

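The output below shows that the cause is the startup probe endpoint returning 500 (see the Events at the end). It was captured about 35 seconds after the Pod was scheduled, so Restart Count is still 0: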

root@k8s-master:~# kubectl describe pod bq-api-demo
Name:         bq-api-demo
Namespace:    default
Priority:     0
Node:         k8s-node3/192.168.0.45
Start Time:   Sun, 17 Mar 2024 20:11:49 +0000
Labels:       pod-type=app
              pod-version=v1.0.1
Annotations:  <none>
Status:       Running
IP:           10.244.3.16
IPs:
  IP:  10.244.3.16
Containers:
  bq-api-service:
    Container ID:   docker://9a95ed5837917f3b527c8f65ec85cec17661ffa5e4ef4e4a6161b2c4cc2dc329
    Image:          europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.1
    Image ID:       docker-pullable://europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service@sha256:30fb2cebd2bf82863608037ce41048114c061acbf1182261a748dadefff2372f
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 17 Mar 2024 20:11:50 +0000
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  2000Mi
    Requests:
      cpu:     1
      memory:  1000Mi
    Startup:   http-get http://:8080/test/hello/fail delay=20s timeout=5s period=5s #success=1 #failure=3
    Environment:
      JVM_OPTS:  -Xms128m -Xmx2048m
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xf7gx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-xf7gx:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  35s               default-scheduler  Successfully assigned default/bq-api-demo to k8s-node3
  Normal   Pulled     34s               kubelet            Container image "europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.1" already present on machine
  Normal   Created    34s               kubelet            Created container bq-api-service
  Normal   Started    34s               kubelet            Started container bq-api-service
  Warning  Unhealthy  5s (x2 over 10s)  kubelet            Startup probe failed: HTTP probe failed with statuscode: 500
