Stateless services on Kubernetes are usually deployed across multiple Pods, so that multiple instances can serve high-concurrency traffic.
However, Kubernetes itself provides no built-in way to query the logs of multiple Pods in one place.
A common solution is the ELK stack.
The approach in this article instead uses a fluentd sidecar together with an emptyDir volume to ship the logs of multiple Pods into a BigQuery table.
Introduction to emptyDir
EmptyDir in Kubernetes is a volume type that provides an empty directory for sharing temporary storage between the containers of a Pod. An emptyDir volume is shared among the containers inside a Pod, and its data is deleted when the Pod is deleted.
Key characteristics of emptyDir volumes:
Temporary storage: emptyDir volumes hold temporary data; when the Pod is deleted, the data in the emptyDir is deleted with it.
Shared between containers: an emptyDir volume can share data between multiple containers in the same Pod, which is very useful whenever containers need to exchange data.
Starts empty: an emptyDir volume starts out empty and can be written to and read from by one or more containers.
Lifecycle: an emptyDir has the same lifecycle as its Pod; when the Pod is deleted, the data in the emptyDir is cleared.
So emptyDir is essentially a mechanism for exchanging data between the containers of a single Pod.
In what scenario does one Pod contain two containers that need to exchange data through files?
The best-known use case is the log-collection sidecar.
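As a minimal sketch of the pattern (container and volume names here are made up for illustration, not part of the real deployment below), one container writes into an emptyDir and another tails it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-dir-demo   # hypothetical name, for illustration only
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: reader
      image: busybox
      command: ["sh", "-c", "tail -F /data/out.log"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      emptyDir: {}   # starts empty; lives and dies with the Pod
```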
下面我就会利用fluentd构建1个日志sidecar 作为例子:
至于fluentd的介绍请参考:
fluentd 简介,日志收集并导入BigQuery
Preparing the fluentd image
The official fluentd image does not include the BigQuery plugin, so we need to build our own image that contains it.
Prepare the Dockerfile
Dockerfile
# Use the official Fluentd image as the base image
FROM fluentd:latest
USER root
RUN mkdir -p /app/logs/
RUN mkdir /app/gcp-key/
# Install the fluent-plugin-bigquery plugin
RUN gem install fluent-plugin-bigquery
# Copy the Fluentd configuration file into the image
COPY fluent.conf /fluentd/etc/fluent.conf
COPY fluentd-ingress-jason-hsbc.json /app/gcp-key/fluentd-ingress-jason-hsbc.json
# Start Fluentd with the specified configuration file
CMD ["fluentd", "-c", "/fluentd/etc/fluent.conf"]
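Assuming the key file and fluent.conf are present in the build context, the image can also be built and smoke-tested locally before wiring up Cloud Build (the local tag is my choice):

```shell
docker build -t fluentd-bigquery:local .
# verify the BigQuery plugin was installed into the image
docker run --rm fluentd-bigquery:local gem list fluent-plugin-bigquery
```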
Putting the service account key used for BigQuery in place
Here fluentd-ingress-jason-hsbc.json is the JSON key of a service account, used to access the BigQuery table.
The question is where to keep fluentd-ingress-jason-hsbc.json; committing sensitive information like this to GitHub is clearly inappropriate.
The solution here is to store it in Google Cloud Secret Manager.
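For example, the key can be uploaded with the gcloud CLI (the secret name matches the one referenced later in cloudbuild-gar.yaml; the Cloud Build service account also needs the Secret Manager Secret Accessor role to read it):

```shell
gcloud secrets create fluentd-ingress-jason-hsbc-key \
  --data-file=fluentd-ingress-jason-hsbc.json
# allow the Cloud Build service account to read the secret
gcloud secrets add-iam-policy-binding fluentd-ingress-jason-hsbc-key \
  --member="serviceAccount:terraform@jason-hsbc.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"
```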
Create a Cloud Build trigger to build the Docker image
My fluentd image project lives at
https://github.com/nvd11/fluentd-home
With this trigger in place, Cloud Build automatically builds a new image whenever the code is updated.
Terraform code:
resource "google_cloudbuild_trigger" "fluentd-bq-gar-trigger" {
  name     = "fluentd-bq-gar-trigger" # trigger names must not contain underscores
  location = var.region_id

  # for a GitHub-hosted repo, use the github block (trigger_template is for Cloud Source Repositories)
  github {
    name  = "fluentd-home"
    owner = "nvd11"
    push {
      branch       = "main"
      invert_regex = false # trigger on pushes to this branch
    }
  }

  filename = "cloudbuild-gar.yaml"
  # projects/jason-hsbc/serviceAccounts/terraform@jason-hsbc.iam.gserviceaccount.com
  service_account = data.google_service_account.cloudbuild_sa.id
}
Writing the Cloud Build YAML for the fluentd image build
cloudbuild-gar.yaml
# builds the fluentd docker image and pushes it to GAR
steps:
  - id: prepare service account json key file
    name: ubuntu
    entrypoint: bash
    args:
      - '-c'
      - |
        ls -l
        echo "$$FLUENTD_INGRESS_JASON_HSBC_KEY" > fluentd-ingress-jason-hsbc.json
        ls -l
        cat ./fluentd-ingress-jason-hsbc.json # debug only: this prints the key into the build log
    secretEnv:
      - 'FLUENTD_INGRESS_JASON_HSBC_KEY'
  - id: build and push docker image
    name: 'gcr.io/cloud-builders/docker'
    entrypoint: bash
    args:
      - '-c'
      - |
        set -x
        echo "Building docker image with tag: $_APP_TAG"
        docker build -t $_GAR_BASE/$PROJECT_ID/$_DOCKER_REPO_NAME/${_APP_NAME}:${_APP_TAG} .
        docker push $_GAR_BASE/$PROJECT_ID/$_DOCKER_REPO_NAME/${_APP_NAME}:${_APP_TAG}
logsBucket: gs://jason-hsbc_cloudbuild/logs/
options: # https://cloud.google.com/cloud-build/docs/build-config#options
  logging: GCS_ONLY # or CLOUD_LOGGING_ONLY https://cloud.google.com/cloud-build/docs/build-config#logging
availableSecrets:
  secretManager:
    - versionName: projects/$PROJECT_ID/secrets/fluentd-ingress-jason-hsbc-key/versions/latest
      env: 'FLUENTD_INGRESS_JASON_HSBC_KEY'
substitutions:
  _DOCKER_REPO_NAME: my-docker-repo
  _APP_NAME: fluentd-bigquery
  _APP_TAG: latest
  _GAR_BASE: europe-west2-docker.pkg.dev
Building the image
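Besides pushing to main, the trigger can also be run by hand; here I assume the --region flag matches the trigger's location (var.region_id, europe-west2 in my setup):

```shell
gcloud builds triggers run fluentd-bq-gar-trigger --branch=main --region=europe-west2
```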
The image was built successfully:
gateman@MoreFine-S500:~/keys$ gcloud artifacts docker images list europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo --include-tags | grep fluentd
europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/fluentd-bigquery sha256:49818bf8f82ef611caadcf5162fbd61a5a3bb34487acd741a9c3c035c54b2fac latest 2024-09-08T23:27:25 2024-09-08T23:27:25 36254422
Preparing the fluentd configuration and putting it into a Kubernetes ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: configmap-fluentd-cloud-order
data:
  fluent.conf: |
    <source>
      @type tail
      path /app/logs/app.log
      pos_file /app/logs/app.log.pos
      tag app.logs
      format none
    </source>
    <match **>
      @type bigquery_insert
      <buffer>
        flush_mode immediate
      </buffer>
      # auth_method compute_engine
      auth_method json_key
      json_key /app/gcp-key/fluentd-ingress-jason-hsbc.json
      # private_key_passphrase notasecret # default
      project jason-hsbc
      dataset LOGS
      auto_create_table true
      table cloud-order-logs
      <inject>
        time_key timestamp
        # time_type unixtime_millis # out of range
        time_type string
        time_format %Y-%m-%d %H:%M:%S.%L
      </inject>
      schema [
        {"name": "timestamp",     "type": "TIMESTAMP"},
        {"name": "project_id",    "type": "STRING"},
        {"name": "dataset",       "type": "STRING"},
        {"name": "table",         "type": "STRING"},
        {"name": "worker",        "type": "STRING"},
        {"name": "ppid",          "type": "STRING"},
        {"name": "inserted_rows", "type": "STRING"},
        {"name": "message",       "type": "STRING"}
      ]
    </match>
Note that this configuration writes the logs into the BigQuery table LOGS.cloud-order-logs.
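To make the data flow concrete: with `format none`, fluentd turns each raw log line into a record with a single `message` field, and the `<inject>` section adds a string `timestamp` field. A rough Python imitation of what one row looks like (the helper name is mine, not a fluentd API):

```python
from datetime import datetime, timezone

def to_record(line: str) -> dict:
    """Mimic what the fluentd config above produces for one log line."""
    # format none: the whole raw line becomes the "message" field
    record = {"message": line.rstrip("\n")}
    # <inject> adds a "timestamp" string, like time_format %Y-%m-%d %H:%M:%S.%L
    now = datetime.now(timezone.utc)
    record["timestamp"] = now.strftime("%Y-%m-%d %H:%M:%S.") + f"{now.microsecond // 1000:03d}"
    return record

rec = to_record("2024-09-08T19:33:13.459Z  INFO 1 --- [main] TomcatWebServer : Tomcat started\n")
print(rec)
```

Each such record is then streamed as one row into the BigQuery table, with the remaining schema columns left null.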
Preparing the deployment.yaml of the cloud-order service
deployment-cloud-order-fluentd-sidecar.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels: # labels of this deployment
    app: cloud-order # custom defined
    author: nvd11
  name: deployment-cloud-order # name of this deployment
  namespace: default
spec:
  replicas: 3 # desired replica count; the replica Pods of a Deployment are typically distributed across multiple nodes
  revisionHistoryLimit: 10 # the number of old ReplicaSets to retain to allow rollback
  selector: # selects the Pods this Deployment manages; mandatory, otherwise we get:
    # error: error validating data: ValidationError(Deployment.spec.selector): missing required field "matchLabels" in io.k8s.apimachinery.pkg.apis.meta.v1.LabelSelector
    matchLabels:
      app: cloud-order
  strategy: # update strategy
    type: RollingUpdate # RollingUpdate or Recreate
    rollingUpdate:
      maxSurge: 25% # the maximum number of Pods that can be created above the desired count during the update
      maxUnavailable: 25% # the maximum number of Pods that can be unavailable during the update
  template: # Pod template
    metadata:
      labels:
        app: cloud-order # must match the selector, otherwise: Invalid value: map[string]string{"app":"bq-api-xxx"}: `selector` does not match template `labels`
    spec: # specification of the Pod
      containers:
        - image: europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/cloud-order:1.1.0 # main container image
          imagePullPolicy: Always
          name: container-cloud-order
          command: ["bash"]
          args:
            - "-c"
            - |
              java -jar -Dserver.port=8080 app.jar --spring.profiles.active=$APP_ENVIRONMENT --logging.file.name=/app/logs/app.log
          env: # set environment variables
            - name: APP_ENVIRONMENT # name of the environment variable
              value: prod # value of the environment variable
          volumeMounts: # the app writes its log here
            - name: volume-log
              mountPath: /app/logs/
              readOnly: false
          ports:
            - containerPort: 8080
              name: cloud-order
        - image: europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/fluentd-bigquery:latest # sidecar image
          imagePullPolicy: Always
          name: container-fluentd
          command: ["sh"]
          args:
            - "-c"
            - |
              fluentd -c /app/config/fluentd.conf
          volumeMounts: # the sidecar reads the log from here
            - name: volume-log
              mountPath: /app/logs/
              readOnly: false
            - name: config-volume
              mountPath: /app/config
              readOnly: false
      volumes:
        - name: volume-log
          emptyDir: {}
        - name: config-volume
          configMap:
            name: configmap-fluentd-cloud-order # name of the ConfigMap
            items:
              - key: fluent.conf # key in the ConfigMap
                path: fluentd.conf # file name; the file will contain only the value of this key
      restartPolicy: Always # restart policy for all containers within the Pod
      terminationGracePeriodSeconds: 10 # time in seconds given to the Pod to terminate gracefully
Note that
the Pod template defines two containers:
one is the main container, the Spring Boot service cloud-order;
the other is the fluentd container (using the image built in the first step of this article).
We also define an emptyDir volume, which is mounted into both containers.
In addition, the sidecar container mounts a ConfigMap volume to read the fluentd.conf configuration.
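Both manifests can then be applied with kubectl; the ConfigMap file name here is my own choice, the Deployment file name is the one used above:

```shell
kubectl apply -f configmap-fluentd-cloud-order.yaml   # hypothetical file name for the ConfigMap manifest
kubectl apply -f deployment-cloud-order-fluentd-sidecar.yaml
```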
Testing
Once the containers are up,
the logs of the two containers in a Pod can be checked separately with:
kubectl logs deployment-cloud-order-54cd5db8f-7rdmq -c <container-name>
If the logs look fine, query the BigQuery table directly:
gateman@MoreFine-S500:emptydir$ bq query --nouse_legacy_sql 'SELECT `timestamp`, message FROM `jason-hsbc`.LOGS.`cloud-order-logs` ORDER BY timestamp desc'
WARNING: This command is using service account impersonation. All API calls will be executed as [terraform@jason-hsbc.iam.gserviceaccount.com].
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| timestamp | message |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.058Z WARN 1 --- [main] JpaBaseConfiguration$JpaWebConfiguration : spring.jpa.open-in-view is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure spring.jpa.open-in-view to disable this warning |
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.395Z INFO 1 --- [main] o.s.b.a.e.web.EndpointLinksResolver : Exposing 4 endpoint(s) beneath base path '/actuator' |
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.459Z INFO 1 --- [main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path '' |
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.460Z INFO 1 --- [main] u.j.c.RefreshScopeRefreshedEventListener : Refreshing cached encryptable property sources on ServletWebServerInitializedEvent |
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.461Z INFO 1 --- [main] CachingDelegateEncryptablePropertySource : Property Source commandLineArgs refreshed |
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.461Z INFO 1 --- [main] CachingDelegateEncryptablePropertySource : Property Source systemProperties refreshed |
| 2024-09-08 19:33:14 | 2024-09-08T19:33:13.461Z INFO 1 --- [main] CachingDelegateEncryptablePropertySource : Property Source systemEnvironment refreshed |
...
The test passed!