1 Project goals
(1) Gain a basic understanding of Prometheus
(2) Deploy a working Prometheus dashboard system
(3) Become familiar with the Prometheus web interface
1.1 Node planning
Hostname | Host IP | Role |
prome-master01 | 10.0.1.10 | Server |
prome-node01 | 10.0.1.20 | Client |
1.2 Prerequisites
OS image: CentOS 7.9
Package download pages:
Releases · prometheus/prometheus (github.com)
Download Grafana | Grafana Labs
Releases · prometheus/node_exporter · GitHub
2 Installing and deploying Prometheus + Grafana + node_exporter
2.1 Deploying Prometheus
Download the latest release from the Releases · prometheus/prometheus (github.com) page.
wget https://github.com/prometheus/prometheus/releases/download/v2.53.2/prometheus-2.53.2.linux-amd64.tar.gz
tar xf prometheus-2.53.2.linux-amd64.tar.gz -C /usr/local
cd /usr/local
mv prometheus-2.53.2.linux-amd64/ prometheus
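Before unpacking a release tarball it can be worth checking its checksum. The following is a minimal sketch, assuming the release page also publishes a sha256sums.txt file next to the tarball; verify_sha256 is a hypothetical helper name, not something shipped with Prometheus.

```shell
# Minimal sketch: compare a file's SHA-256 digest with an expected value.
# verify_sha256 is a hypothetical helper, not part of Prometheus.
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <filename>"; keep only the digest
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "checksum OK: $file"
  else
    echo "checksum MISMATCH: $file" >&2
    return 1
  fi
}

# Example (assumes sha256sums.txt was downloaded from the same release page):
#   expected=$(grep 'prometheus-2.53.2.linux-amd64.tar.gz$' sha256sums.txt | awk '{print $1}')
#   verify_sha256 prometheus-2.53.2.linux-amd64.tar.gz "$expected"
```

The function returns a non-zero status on mismatch, so it can guard the tar step with `&&`.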
Register it as a systemd service.
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus server
Documentation=https://prometheus.io
[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.listen-address=:9090
[Install]
WantedBy=multi-user.target
Start Prometheus:
systemctl daemon-reload
systemctl start prometheus
systemctl enable prometheus
systemctl status prometheus
Visit IP:9090 in a browser. The Prometheus interface is described in detail later.
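Besides opening the page in a browser, the server can be probed from the command line. The sketch below assumes curl is installed and uses the /-/healthy and /-/ready management endpoints of Prometheus; check_endpoint is a hypothetical helper name.

```shell
# Minimal sketch: probe an HTTP endpoint and report whether it answered successfully.
# check_endpoint is a hypothetical helper, not part of Prometheus.
check_endpoint() {
  # -f: treat HTTP errors as failures, -sS: quiet but still print errors,
  # --max-time 5: give up after five seconds
  if curl -fsS --max-time 5 "$1" >/dev/null; then
    echo "OK   $1"
  else
    echo "FAIL $1"
    return 1
  fi
}

# Example (run on prome-master01 once the service is started):
#   check_endpoint http://localhost:9090/-/healthy
#   check_endpoint http://localhost:9090/-/ready
```

A FAIL result right after startup may only mean the server is still initializing, so retrying after a few seconds is reasonable.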
2.2 Deploying Grafana
Download the latest release from the Download Grafana | Grafana Labs page.
sudo yum install -y https://dl.grafana.com/enterprise/release/grafana-enterprise-11.1.4-1.x86_64.rpm
Start Grafana:
systemctl start grafana-server.service
systemctl enable grafana-server.service
systemctl status grafana-server.service
Visit IP:3000 in a browser. An earlier article on Prometheus-operator already walked through the web interface, so it is not repeated here.
2.3 Deploying node_exporter
Download the latest release from the Releases · prometheus/node_exporter · GitHub page.
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar xf node_exporter-1.8.2.linux-amd64.tar.gz -C /usr/local
cd /usr/local
mv node_exporter-1.8.2.linux-amd64/ node_exporter
Register it as a systemd service.
vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=Prometheus node_exporter
Documentation=https://prometheus.io
[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter/node_exporter --collector.systemd --collector.systemd.unit-include=(docker|sshd|nginx).service
[Install]
WantedBy=multi-user.target
Start the service:
systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter
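To confirm that the systemd collector is exporting the selected units, the metrics output can be filtered for node_systemd_unit_state. This is a minimal sketch that assumes the usual text exposition format with name and state labels; active_units is a hypothetical helper name.

```shell
# Minimal sketch: list systemd units reported as active by node_exporter.
# Reads metrics in the text exposition format from stdin.
# active_units is a hypothetical helper, not part of node_exporter.
active_units() {
  grep '^node_systemd_unit_state{' \
    | grep 'state="active"' \
    | awk '$NF == 1' \
    | sed 's/.*name="\([^"]*\)".*/\1/'
}

# Example (assumes node_exporter is listening on its default port 9100):
#   curl -s http://localhost:9100/metrics | active_units
```

Each unit exposes one sample per possible state, and only the sample for the current state has the value 1, which is why the filter checks both the label and the value.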
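Add the new scrape job to /usr/local/prometheus/prometheus.yml by appending the following three lines to the end of the file:
  - job_name: 'linux'
    static_configs:
      - targets: ['10.0.1.10:9100','10.0.1.20:9100']
The complete file then reads: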
# my global config
global:
  scrape_interval: 15s
  evaluation_interval: 15s
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: 'linux'
    static_configs:
      - targets: ['10.0.1.10:9100','10.0.1.20:9100']
Restart Prometheus:
systemctl restart prometheus
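A YAML indentation mistake in prometheus.yml will stop the server from starting, so it can help to validate the file before restarting. The sketch below wraps the promtool binary that ships in the Prometheus tarball; the default paths are assumptions based on the install location used above, and check_prometheus_config is a hypothetical helper name.

```shell
# Minimal sketch: validate a Prometheus config file with promtool.
# check_prometheus_config is a hypothetical helper; the default paths assume
# the tarball was unpacked to /usr/local/prometheus.
check_prometheus_config() {
  promtool="${1:-/usr/local/prometheus/promtool}"
  config="${2:-/usr/local/prometheus/prometheus.yml}"
  if [ ! -x "$promtool" ]; then
    echo "promtool not found at $promtool" >&2
    return 2
  fi
  # promtool exits non-zero when the file fails validation
  "$promtool" check config "$config"
}

# Example:
#   check_prometheus_config && systemctl restart prometheus
```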
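3 Prometheus pages
3.1 Graph page
- Autocomplete: completes metric names, label information, and built-in keywords
- Table view: runs an instant query
- Graph view: runs a range query
- Resolution: the step between plotted points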
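The resolution setting corresponds to the step parameter of the query_range HTTP API, which decides how many points a graph is evaluated at. The sketch below illustrates that relationship; points_in_range and range_query are hypothetical helper names, and range_query assumes curl is available.

```shell
# Minimal sketch: how range length and step (resolution) relate.
# Both helpers are hypothetical, not part of Prometheus.

# $1 = range length in seconds, $2 = step in seconds
points_in_range() {
  # One point at the start, then one more per full step
  echo $(( $1 / $2 + 1 ))
}

# $1 = base URL, $2 = PromQL expression,
# $3 = start, $4 = end (Unix seconds), $5 = step in seconds
range_query() {
  curl -sG "$1/api/v1/query_range" \
    --data-urlencode "query=$2" \
    --data-urlencode "start=$3" \
    --data-urlencode "end=$4" \
    --data-urlencode "step=$5"
}

# Example: a one-hour graph at 15 s resolution
#   points_in_range 3600 15
#   now=$(date +%s); range_query http://localhost:9090 up $((now - 3600)) "$now" 15
```

A smaller step gives a finer graph but makes the query evaluate more points, so a coarse step is usually enough for long time ranges.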
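3.2 Targets page
Shows the scrape jobs and their targets.
3.3 Flags page
Shows the command-line flags.
3.4 Status page
Shows runtime information and build information.
3.5 TSDB status page
Shows storage status information.
Helps locate heavy queries.
3.6 Service discovery page
Shows the results of service discovery.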
4 Prometheus configuration file
4.1 Meaning of each configuration section
global:
  # How frequently to scrape targets by default
  [ scrape_interval: <duration> | default = 1m ]
  # How long until a scrape request times out
  [ scrape_timeout: <duration> | default = 10s ]
  # How frequently to evaluate rules
  [ evaluation_interval: <duration> | default = 1m ]
  # Labels added to time series or alerts when communicating with external systems (federation, remote storage, Alertmanager)
  external_labels:
    [ <labelname>: <labelvalue> ... ]
  # File to which PromQL queries are logged. Reloading the configuration reopens the file
  [ query_log_file: <string> ]
# rule_files specifies a list of globs; rules and alerts are read from all matching files
rule_files:
  [ - <filepath_glob> ... ]
# List of scrape configurations
scrape_configs:
  [ - <scrape_config> ... ]
# alerting specifies settings related to Alertmanager
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]
# Settings related to remote write
remote_write:
  [ - <remote_write> ... ]
# Settings related to remote read
remote_read:
  [ - <remote_read> ... ]
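A Prometheus instance can serve the following purposes, each driven by one major section of the configuration file:
- scrape_configs: scrape configuration; the instance acts as a collector
- rule_files: files containing alerting rules and recording (pre-aggregation) rules
- remote_read: remote read (query) section
- remote_write: remote write section
- alerting: Alertmanager connection section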
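To see which of these major sections a given configuration file actually uses, the top-level keys can be listed. This is a minimal sketch that relies on plain text matching rather than a YAML parser, so it assumes top-level keys start in column one; list_config_sections is a hypothetical helper name.

```shell
# Minimal sketch: print the top-level sections present in a Prometheus config.
# list_config_sections is a hypothetical helper; it matches unindented
# "key:" lines and therefore skips comments and nested keys.
list_config_sections() {
  grep -E '^[a-z_]+:' "$1" | sed 's/:.*//'
}

# Example:
#   list_config_sections /usr/local/prometheus/prometheus.yml
```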
4.2 <scrape_config>
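- job_name: prometheus
  # true keeps the timestamps exposed by the target; false uses the time at which Prometheus scraped the sample
  honor_timestamps: true
  # How often this job scrapes its targets
  scrape_interval: 15s
  # Scrape timeout
  scrape_timeout: 15s
  # HTTP path on which the target exposes metrics; defaults to /metrics (probe-style exporters use /probe, for example)
  metrics_path: /metrics
  # Protocol scheme used for scraping: http or https
  scheme: http
  # Whether to follow HTTP redirects
  follow_redirects: true
  static_configs:
    - targets:
        - localhost:9090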
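Prometheus rejects a scrape configuration whose scrape_timeout is greater than its scrape_interval, so the two values are worth checking together. The sketch below handles only simple single-unit durations such as 15s, 1m or 2h; to_seconds and timeout_fits_interval are hypothetical helper names.

```shell
# Minimal sketch: check that scrape_timeout does not exceed scrape_interval.
# Both helpers are hypothetical. Only single-unit durations (s, m, h) are
# handled; compound values such as 1m30s are reported as unsupported.
to_seconds() {
  num=${1%[smh]}      # strip one trailing unit letter
  unit=${1#"$num"}    # what was stripped (empty if there was no unit)
  case "$num" in
    ''|*[!0-9]*) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
  case "$unit" in
    s) echo "$num" ;;
    m) echo $((num * 60)) ;;
    h) echo $((num * 3600)) ;;
    *) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

# $1 = scrape_interval, $2 = scrape_timeout
timeout_fits_interval() {
  interval=$(to_seconds "$1") || return 2
  timeout=$(to_seconds "$2") || return 2
  [ "$timeout" -le "$interval" ]
}

# Example with the values from the job above:
#   timeout_fits_interval 15s 15s && echo "timeout is valid"
```

Equal values pass the check, which matches the job above where both settings are 15s.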