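Project scenario:
Prometheus reports errors when monitoring kube-scheduler and kube-controller-manager.
Problem description:
After deploying kube-prometheus on a cluster built with kubeadm, the monitoring of these two components reports errors.
Cause analysis: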
root@master2:~# kubectl describe servicemonitor -n kube-system kube-controller-manager
The output above shows that the ServiceMonitor looks in the kube-system namespace for a Service carrying the app.kubernetes.io/name label.
root@master2:~# kubectl get svc -n kube-system -l app.kubernetes.io/name=kube-controller-manager
No resources found in kube-system namespace.
No Service in kube-system carries this label, so Prometheus has nothing to scrape.
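Solution:
1) Change the listen address to 0.0.0.0
This cluster was installed with kubeadm, so kube-controller-manager runs as a static Pod. Editing its manifest file is enough: kubelet notices the change and recreates the Pod. Every master node must be configured.
Change - --bind-address=127.0.0.1 to - --bind-address=0.0.0.0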
root@master2:~# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
- --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- --bind-address=0.0.0.0
- --client-ca-file=/etc/kubernetes/pki/ca.crt
...
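The same edit can be scripted instead of done by hand in vim on every master node. The following is a minimal sketch, not part of the original procedure: the function name set_bind_address and the backup directory argument are hypothetical. The backup is written outside /etc/kubernetes/manifests on purpose, because kubelet may treat extra files in that directory as additional manifests.

```shell
#!/bin/sh
# Hypothetical helper: switch --bind-address from 127.0.0.1 to 0.0.0.0
# in a kubeadm static Pod manifest.
# usage: set_bind_address MANIFEST [BACKUP_DIR]
set_bind_address() {
    manifest="$1"
    backup_dir="${2:-/tmp}"   # assumption: any directory outside the manifests dir

    # Keep a copy of the original file before touching it.
    cp "$manifest" "$backup_dir/$(basename "$manifest").bak" || return 1

    tmp="$(mktemp)" || return 1
    sed 's/--bind-address=127\.0\.0\.1/--bind-address=0.0.0.0/' "$manifest" > "$tmp" || return 1

    # Write back through the existing file so its owner and mode are preserved.
    cat "$tmp" > "$manifest"
    rm -f "$tmp"
}

# Example (run as root on each master node):
# set_bind_address /etc/kubernetes/manifests/kube-controller-manager.yaml
```

After the file is rewritten, kubelet recreates the Pod on its own; no kubectl command is needed.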
2) Create a Service that carries the matching label
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app.kubernetes.io/name: kube-controller-manager
  name: kube-controller-manager-monitoring
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.1.27
  - ip: 192.168.1.28
  - ip: 192.168.1.29
  ports:
  - name: https-metrics
    port: 10257
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: kube-controller-manager
  name: kube-controller-manager-monitoring
  namespace: kube-system
spec:
  ports:
  - name: https-metrics
    port: 10257
    protocol: TCP
    targetPort: 10257
  sessionAffinity: None
  type: ClusterIP
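The Service label must match the selector in the ServiceMonitor, and the port name and number must match as well. The Endpoints object must have the same name as the Service, because the Service has no selector and relies on it for its backend addresses.
The fix for kube-scheduler is the same as for kube-controller-manager.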
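For kube-scheduler, the equivalent objects might look like the sketch below. Several things here are assumptions rather than something the text above states: the label value app.kubernetes.io/name=kube-scheduler is inferred from the kube-controller-manager pattern (confirm it with kubectl describe servicemonitor kube-scheduler), the object name kube-scheduler-monitoring and the function name render_scheduler_monitoring are hypothetical, and 10259 is kube-scheduler's default secure port. The bind address in /etc/kubernetes/manifests/kube-scheduler.yaml needs the same change as in step 1.

```shell
#!/bin/sh
# Hypothetical helper: print an Endpoints + Service pair for kube-scheduler.
# usage: render_scheduler_monitoring MASTER_IP [MASTER_IP...]
render_scheduler_monitoring() {
    [ "$#" -ge 1 ] || { echo "usage: render_scheduler_monitoring IP [IP...]" >&2; return 1; }

    cat <<'EOF'
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app.kubernetes.io/name: kube-scheduler
  name: kube-scheduler-monitoring
  namespace: kube-system
subsets:
- addresses:
EOF

    # One address entry per master node IP passed on the command line.
    for ip in "$@"; do
        printf '  - ip: %s\n' "$ip"
    done

    cat <<'EOF'
  ports:
  - name: https-metrics
    port: 10259
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: kube-scheduler
  name: kube-scheduler-monitoring
  namespace: kube-system
spec:
  ports:
  - name: https-metrics
    port: 10259
    protocol: TCP
    targetPort: 10259
  sessionAffinity: None
  type: ClusterIP
EOF
}

# Example, using the master IPs from the manifest above:
# render_scheduler_monitoring 192.168.1.27 192.168.1.28 192.168.1.29 | kubectl apply -f -
```

Generating the manifest from the IP list keeps the Endpoints and Service names in sync, which matters because a selector-less Service finds its backends only through an Endpoints object of the same name.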
Checking Prometheus again, the errors are gone.