Deploying Prometheus
Repository: https://github.com/prometheus-operator/kube-prometheus/tree/release-0.7
References: https://www.cnblogs.com/lidong94/p/14500276.html and https://juejin.cn/post/6865504989695967245?searchId=20240312205710B746697AB0CDB7706DB3
I used kube-prometheus. My Kubernetes version is 1.19, so I picked release 0.7 according to the version compatibility matrix.
wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.7.0.zip
unzip v0.7.0.zip # this produces the kube-prometheus-0.7.0 directory
cd kube-prometheus-0.7.0
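Change the Service type to NodePort, otherwise the services cannot be reached from the public network, and add a nodePort to each of the three Services below.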
vi manifests/grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30099 # added
  selector:
    app: grafana
vi manifests/alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30300 # added
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
vi manifests/prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30100 # added
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
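Note that the nodePort values in the three files must all be different, otherwise they conflict. You could run the create commands at this point, but two monitors (kube-scheduler and kube-controller-manager) would be missing afterwards, so add a bit more configuration first.
The image addresses may also need to be changed. My network can pull the images directly, so I skipped that step; see the references listed above.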
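Since the three nodePort values have to be distinct, a quick check before deploying can catch a copy-paste mistake. The following is a minimal sketch, not part of kube-prometheus: the function name duplicate_nodeports is hypothetical, and it assumes the Service files follow the *-service.yaml naming used in the manifests directory.

```shell
# Hypothetical helper: print every nodePort value that appears more than once
# across the *-service.yaml files in a directory (default: manifests).
duplicate_nodeports() {
  dir="${1:-manifests}"
  # Pull out the number after "nodePort:", ignoring any trailing comment,
  # then keep only the values that occur at least twice.
  grep -h 'nodePort:' "$dir"/*-service.yaml 2>/dev/null \
    | sed -E 's/.*nodePort:[[:space:]]*([0-9]+).*/\1/' \
    | sort | uniq -d
}

# Usage (from the kube-prometheus-0.7.0 directory):
#   duplicate_nodeports manifests
# No output means no value is reused.
```

This only compares the files on disk; it does not know about nodePorts already in use by other Services in the cluster.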
Adding the missing monitors
1. Change the monitoring listen address
vi /etc/kubernetes/manifests/kube-scheduler.yaml
Change the value of this flag to: --bind-address=0.0.0.0
vi /etc/kubernetes/manifests/kube-controller-manager.yaml
Change the value of this flag to: --bind-address=0.0.0.0
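The same change can be made without opening an editor. This is a rough sketch under a few assumptions: GNU sed (for -i), the flag already being present in the manifest, and a hypothetical helper name set_bind_address. It edits the file in place and deliberately does not leave a backup copy next to it, because the kubelet may try to load extra files in the static pod directory as manifests.

```shell
# Hypothetical helper: set the --bind-address flag in a static pod manifest.
# Assumes GNU sed and that the flag already exists in the file.
set_bind_address() {
  manifest="$1"
  addr="${2:-0.0.0.0}"
  # Replace whatever value currently follows "--bind-address=".
  sed -i -E "s/--bind-address=[^[:space:]]+/--bind-address=${addr}/" "$manifest"
}

# Usage (as root on the control-plane node):
#   set_bind_address /etc/kubernetes/manifests/kube-scheduler.yaml
#   set_bind_address /etc/kubernetes/manifests/kube-controller-manager.yaml
```

After the files change, the kubelet should recreate the two static pods on its own.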
Create the following two files in the home directory (~)
vi kube-system-controller-manager.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  ports:
  - name: https-metrics
    port: 10257
  selector:
    component: kube-controller-manager
vi kube-system-schedule.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  ports:
  - name: https-metrics
    port: 10259
  selector:
    component: kube-scheduler
kubectl apply -f kube-system-controller-manager.yaml
kubectl apply -f kube-system-schedule.yaml
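To confirm that the two Services actually select the control-plane pods, it may help to look at their Endpoints. A small sketch; the function name check_control_plane_services is made up for illustration, and only standard kubectl get calls are used.

```shell
# Hypothetical helper: show the two Services created above and their Endpoints.
check_control_plane_services() {
  kubectl -n kube-system get service kube-controller-manager kube-scheduler
  kubectl -n kube-system get endpoints kube-controller-manager kube-scheduler
}

# Usage:
#   check_control_plane_services
```

If the ENDPOINTS column is empty, the selector (component: kube-controller-manager or component: kube-scheduler) probably does not match the labels on the pods.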
Deployment commands
Create
kubectl create -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/
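Rather than guessing when everything has started, one option is to block until the pods in the monitoring namespace report Ready. A minimal sketch; the function name wait_for_monitoring_pods and the 300s default are assumptions, while kubectl wait itself is a standard subcommand.

```shell
# Hypothetical helper: wait until all pods in the monitoring namespace are Ready.
wait_for_monitoring_pods() {
  timeout="${1:-300s}"
  kubectl -n monitoring wait --for=condition=Ready pods --all --timeout="$timeout"
}

# Usage:
#   wait_for_monitoring_pods        # default timeout of 300s
#   wait_for_monitoring_pods 600s   # allow more time for image pulls
```

kubectl wait only considers pods that already exist, so if it returns an error immediately, run it again a few seconds later.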
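Once these commands finish, the services are reachable:
prometheus: http://103.39.226.71:30100/graph
If Prometheus cannot be reached after the first start, check the container logs. If they show errors, try deleting one of the pods; a replacement pod is started automatically, and in my case that seemed to fix it.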
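The log check and the pod restart described above could look roughly like this. The pod name prometheus-k8s-0 and the container name prometheus are assumptions based on the names kube-prometheus normally uses; check kubectl -n monitoring get pods for the real names. Both function names are hypothetical.

```shell
# Hypothetical helper: show the last lines of the Prometheus container log.
prometheus_logs() {
  pod="${1:-prometheus-k8s-0}"   # assumed pod name
  kubectl -n monitoring logs "$pod" -c prometheus --tail=50
}

# Hypothetical helper: delete a pod so that its controller recreates it.
restart_prometheus_pod() {
  pod="${1:-prometheus-k8s-0}"   # assumed pod name
  kubectl -n monitoring delete pod "$pod"
}

# Usage:
#   prometheus_logs
#   restart_prometheus_pod prometheus-k8s-0
```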
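grafana: http://103.39.226.71:30099/login (default username admin, password admin)
In the dashboard list you can open the Default folder; the monitoring dashboards are in that folder.
Grafana's default time zone in these dashboards is UTC, which is 8 hours behind China time and inconvenient for day-to-day monitoring, so it needs to be changed as follows (the commands are run from the manifests directory):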
grep -i timezone grafana-dashboardDefinitions.yaml # list all occurrences; one is lowercase utc, the rest are UTC
# run both of the following; they are different, and both are needed
sed -i 's/UTC/UTC+8/g' grafana-dashboardDefinitions.yaml
sed -i 's/utc/utc+8/g' grafana-dashboardDefinitions.yaml
kubectl apply -f grafana-dashboardDefinitions.yaml
After that, the dashboards show the current local time.
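The two sed commands above rewrite every occurrence of UTC or utc in the file, not just the timezone fields, and running them twice would produce UTC+8+8. A narrower variant is sketched below. It assumes GNU sed and that the dashboards store the setting as "timezone": "UTC" or "timezone": "utc"; the function name shift_dashboard_timezone is hypothetical, and the replacement values are the same ones used above.

```shell
# Hypothetical helper: append "+8" only to the value of "timezone" keys.
# Assumes GNU sed and JSON lines of the form  "timezone": "UTC"  or  "timezone": "utc".
shift_dashboard_timezone() {
  file="$1"
  # The closing quote right after UTC/utc keeps a second run from matching again.
  sed -i -E 's/("timezone":[[:space:]]*")(UTC|utc)"/\1\2+8"/g' "$file"
}

# Usage (from the manifests directory):
#   shift_dashboard_timezone grafana-dashboardDefinitions.yaml
#   kubectl apply -f grafana-dashboardDefinitions.yaml
```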
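The kube-scheduler and kube-controller-manager monitors still showed 0.
I tried to fix this by following https://www.cnblogs.com/zoujiaojiao/p/16921462.html (the files below are in the manifests directory):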
kubectl delete -f prometheus-serviceMonitorKubeScheduler.yaml
kubectl delete -f prometheus-serviceMonitorKubeControllerManager.yaml
Edit the two files so that their contents are as follows
prometheus-serviceMonitorKubeControllerManager.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-controller-manager
prometheus-serviceMonitorKubeScheduler.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-scheduler
kubectl apply -f prometheus-serviceMonitorKubeScheduler.yaml
kubectl apply -f prometheus-serviceMonitorKubeControllerManager.yaml
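These last two commands did not work at first either. I deleted the two resources and recreated them, and after repeating that twice it worked; it may simply have been slow to take effect. Either way, the monitors now display normally.
That is as far as I went: I did not configure persistence or alerting. If you need those, see the references listed at the top.
Delete commands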
kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup