Using the Nightingale Server Monitoring Software (Part 2)

news 2024/11/17 1:38:35

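Table of contents

  • I. Installing the collector
    • 1. Introduction to Categraf
    • 2. Deploying Categraf
    • 3. Test server deployment
    • 4. System monitoring plugins
    • 5. GPU monitoring plugin
    • 6. Service monitoring plugins
  • II. Monitoring dashboards
    • 1. Machine list
    • 2. System monitoring
    • 3. Service monitoring
  • III. Alerting configuration
    • 1. Email notification
    • 2. Alert rules
    • 3. Alert self-healing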


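I. Installing the collector

1. Introduction to Categraf

Categraf must be deployed on every machine you want to monitor, because collecting metrics such as CPU, memory, and processes requires reading information from the local operating system.
Categraf pushes the collected data to the server using the Prometheus Remote Write protocol.

Grafana dashboard marketplace
Categraf plugin documentation
Categraf deployment guide
Categraf download page
Example download file: categraf-v0.3.45-linux-amd64.tar.gz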

2. Deploying Categraf

Some monitoring plugins are hard to configure when Categraf runs in Docker, so this guide deploys the Categraf binary directly.

  1. Delete the plugins you do not need
    categraf-v0.3.45-linux-amd64/conf/input.*
  2. Edit the plugin configuration files (*.toml)
  3. Edit the Categraf configuration file config.toml
[global]
hostname = "machine-label"
[[writers]]
url = "http://192.168.6.226:17000/prometheus/v1/write"
[ibex]
enable = true
servers = ["192.168.6.226:20090"]
[heartbeat]
url = "http://192.168.6.226:17000/v1/n9e/heartbeat"
  4. Copy Categraf
    Copy all files and folders inside categraf-v0.3.45-linux-amd64 to /home/monitor/categraf on the target machine
  5. Install and start Categraf
cd /home/monitor/categraf && chmod +x categraf && sudo ./categraf --install && sudo ./categraf --start
  • Other commands
# Install as a service: equivalent to adding the service file and running systemctl daemon-reload
sudo ./categraf  --install
# Uninstall the service: equivalent to systemctl stop categraf plus deleting the service file
# If Categraf was installed before, uninstall it first
sudo ./categraf  --remove
# Start Categraf as a service: equivalent to systemctl start categraf
sudo ./categraf  --start
# Stop the Categraf service: equivalent to systemctl stop categraf
sudo ./categraf  --stop
# Show the Categraf service status: equivalent to systemctl status categraf
sudo ./categraf  --status
# Show which mysql metrics are collected
sudo ./categraf --test --inputs mysql

3. Test server deployment

[Figure: test server deployment]

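4. System monitoring plugins

  • cpu plugin: collects local CPU usage, idle percentage, etc.
    input.cpu/cpu.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15
# Whether to collect metrics for each individual core
collect_per_cpu = false
  • disk plugin: collects disk usage, inode usage, etc.
    input.disk/disk.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

# Only collect the listed mount points
# mount_points = ["/"]

# Ignore mount points by filesystem type
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs", "nsfs", "CDFS"]

# Ignore these mount points
ignore_mount_points = ["/boot", "/var/lib/kubelet/pods"]
  • diskio plugin: collects disk read/write IO metrics
    input.diskio/diskio.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

# Only collect the listed devices
# devices = ["sda", "sdb", "vd*"]
  • kernel plugin: collects OS boot time, number of context switches, etc.
    input.kernel/kernel.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15
  • mem plugin: collects memory usage, etc.
    input.mem/mem.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

# Whether to collect platform-specific metrics
collect_platform_fields = true
  • net plugin: collects network interface traffic, packet counts, etc.
    input.net/net.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

# Whether to collect protocol statistics on Linux
# collect_protocol_stats = false

# Only collect the listed network interfaces
# interfaces = ["eth0"]
  • netstat plugin: collects the number of time_wait connections, established connections, etc.
    input.netstat/netstat.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

disable_summary_stats = false

# With many network connections, collecting connection statistics is resource-intensive, so it is disabled here
disable_connection_stats = true

tcp_ext = false
ip_ext = false
  • ntp plugin: monitors the machine's clock offset
    input.ntp/ntp.toml
# Collection interval, in seconds
interval = 15

# NTP servers
ntp_servers = ["ntp.aliyun.com"]

# Response timeout
timeout = 5
  • processes plugin: collects how many processes are running, how many are sleeping, and the total
    input.processes/processes.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

# Force collection via the ps command
# force_ps = false

# Force collection via /proc
# force_proc = false
  • system plugin: collects system load information
    input.system/system.toml; the default configuration can be used
# Collection interval, in seconds
interval = 15

# Whether to collect the system_n_users metric
# collect_user_number = false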

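5. GPU monitoring plugin

  • nvidia_smi plugin: monitors NVIDIA GPU information
    input.nvidia_smi/nvidia_smi.toml
# Collection interval, in seconds
interval = 15

# Local command to execute
nvidia_smi_command = "nvidia-smi"

# Run `nvidia-smi --help-query-gpu` to list the available fields
# `AUTO` detects the fields to query automatically
query_field_names = "AUTO"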

6. Service monitoring plugins

  • docker plugin: monitors Docker containers
    input.docker/docker.toml
# Collection interval, in seconds
interval = 15

[[instances]]
# interval = global.interval * interval_times
interval_times = 1

## Docker Endpoint
endpoint = "unix:///var/run/docker.sock"

# Containers to include/exclude
container_name_include = []
container_name_exclude = []

gather_services = false
gather_extend_memstats = false

container_id_label_enable = true
container_id_label_short_style = false

timeout = "5s"

perdevice_include = []

total_include = ["cpu", "blkio", "network"]

docker_label_include = []
docker_label_exclude = ["annotation*", "io.kubernetes*", "*description*", "*maintainer*", "*hash", "*author*", "*org_*", "*date*", "*url*", "*docker_compose*"]
  • mtail (log) plugin: extracts content from logs and converts it into monitoring metrics
    input.mtail/mtail.toml
# Collection interval, in seconds
interval = 15

[[instances]]
progs = "/home/monitor/categraf/conf/input.mtail/prog1" # directory containing the log parsing rule files
logs = ["/home/logs/example/all.log"] # log files to read
labels = { log="6.221-example-log" } # labels attached to the metrics
override_timezone = "Asia/Shanghai" # time zone
emit_metric_timestamp = "true" # emit timestamps with the metrics

input.mtail/prog1/rule_error.mtail

gauge error_num
/ERROR.*/ {
      error_num++
}

input.mtail/prog1/rule_info.mtail

gauge info_num
/INFO.*/ {
      info_num++
}

input.mtail/prog1/rule_login.mtail

gauge login_num
/登录账户.*/ {
      login_num++
}
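
To preview what the three rules above would count before pointing the plugin at a real log, the same patterns can be run over a few lines in Python. This is only a rough simulation: mtail uses RE2 and keeps running state, whereas this sketch simply counts lines where each pattern matches anywhere in the line. The sample log lines and the function name `count_matches` are made up for illustration.

```python
import re
from collections import Counter

# The same patterns as rule_error.mtail, rule_info.mtail and rule_login.mtail.
RULES = {
    "error_num": re.compile(r"ERROR.*"),
    "info_num": re.compile(r"INFO.*"),
    "login_num": re.compile(r"登录账户.*"),
}


def count_matches(lines):
    """Count, per rule, how many lines contain a match for its pattern."""
    counts = Counter({name: 0 for name in RULES})
    for line in lines:
        for name, pattern in RULES.items():
            # search() matches anywhere in the line, like an unanchored mtail pattern
            if pattern.search(line):
                counts[name] += 1
    return dict(counts)


# Hypothetical log lines, not taken from a real system.
SAMPLE_LOG = [
    "2024-11-17 01:00:00 INFO 登录账户 admin",
    "2024-11-17 01:00:05 ERROR database timeout",
    "2024-11-17 01:00:10 INFO heartbeat ok",
]

if __name__ == "__main__":
    print(count_matches(SAMPLE_LOG))
```

Note that a single line can increment more than one metric, as the first sample line does for both info_num and login_num.
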
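  • mysql plugin: connects to a MySQL instance, runs a set of SQL statements, parses the output, and reports it as monitoring data
    input.mysql/mysql.toml
# Collection interval, in seconds
interval = 15

# Define instances; each instance corresponds to one MySQL server
[[instances]]
address = "192.168.6.200:3306"
username = "root"
password = "123456"

# Custom connection parameters, such as whether to use TLS
parameters = "tls=false"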
  • nginx plugin: monitors nginx status; this plugin depends on nginx's http_stub_status_module
    input.nginx/nginx.toml
# Collection interval, in seconds
interval = 15

[[instances]]
# URL of the nginx stub_status page
urls = ["http://192.168.6.223:8080/nginx_status"]

response_timeout = "5s"

The nginx service must have the http_stub_status_module module enabled.
Add the following to nginx.conf:

http {
    server {
        listen 8080;
        location /nginx_status {
            stub_status on;
            access_log off;
            allow 192.168.6.226;    # allow this IP
            deny all;               # deny all other IPs
        }
    }
}

The status page is then served at:
http://192.168.6.223:8080/nginx_status
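
The stub_status page returns a small plain-text report, which the plugin turns into metrics. As an illustration of that parsing step (not Categraf's actual code), here is a minimal sketch that extracts the seven counters from the standard stub_status text. The sample numbers are the ones used in the nginx documentation, and the dictionary keys are names chosen for this example.

```python
import re

# Example stub_status response body.
SAMPLE_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse the text of a stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The totals line consists of exactly three integers.
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unexpected stub_status format")
    return {
        "active": int(active.group(1)),
        "accepts": int(totals.group(1)),
        "handled": int(totals.group(2)),
        "requests": int(totals.group(3)),
        "reading": int(states.group(1)),
        "writing": int(states.group(2)),
        "waiting": int(states.group(3)),
    }


if __name__ == "__main__":
    print(parse_stub_status(SAMPLE_STATUS))
```

If this function raises `ValueError` on the text fetched from the URL above, the location is probably not serving stub_status (for example, an error page is returned because of the allow/deny rules).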

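  • redis plugin: connects to Redis, runs the info command, parses the result, and reports it as monitoring data
    input.redis/redis.toml
# Collection interval, in seconds
interval = 15

# Define instances; each instance corresponds to one Redis server
[[instances]]
address = "192.168.6.223:6379"
username = ""
password = ""
pool_size = 2

# Whether to collect the slowlog
gather_slowlog = true

# Maximum number of slowlog entries to collect
slowlog_max_len = 100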
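
Step 1 of the deployment asks for unused plugin directories under conf/ to be deleted. With the plugins configured in sections 4 to 6 as the keep list, a dry-run helper could look like the sketch below. It only reports which `input.*` directories would be removed and deletes nothing; the helper names and the two extra directories in the demo (`input.kafka`, `input.oracle`) are illustrative.

```python
import tempfile
from pathlib import Path

# Plugins configured in this article (sections 4 to 6).
KEEP = frozenset({
    "cpu", "disk", "diskio", "kernel", "mem", "net", "netstat", "ntp",
    "processes", "system", "nvidia_smi", "docker", "mtail", "mysql",
    "nginx", "redis",
})


def plugins_to_remove(conf_dir, keep=KEEP):
    """Return the sorted names of input.* directories that are not in keep."""
    prefix = "input."
    return sorted(
        path.name
        for path in Path(conf_dir).glob(prefix + "*")
        if path.is_dir() and path.name[len(prefix):] not in keep
    )


def demo():
    """Build a throwaway conf directory and report what would be pruned."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("input.cpu", "input.redis", "input.kafka", "input.oracle"):
            (Path(tmp) / name).mkdir()
        (Path(tmp) / "config.toml").write_text("")
        return plugins_to_remove(tmp)


if __name__ == "__main__":
    print(demo())
```

Against a real unpacked release, call `plugins_to_remove("categraf-v0.3.45-linux-amd64/conf")` and review the list before deleting anything.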

II. Monitoring dashboards

1. Machine list

  • Dashboard JSON
{
    "name": "Machine list",
    "tags": "",
    "ident": "",
    "configs": {
        "panels": [
            {
                "type": "table",
                "id": "77bf513a-8504-4d33-9efe-75aaf9abc9e4",
                "layout": {
                    "h": 11,
                    "i": "77bf513a-8504-4d33-9efe-75aaf9abc9e4",
                    "isResizable": true,
                    "w": 24,
                    "x": 0,
                    "y": 5
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "avg(system_uptime{ident=~\"$ident\"}) by (ident)",
                        "refId": "A",
                        "legend": "Uptime"
                    },
                    {
                        "expr": "avg(cpu_usage_active{cpu=\"cpu-total\", ident=~\"$ident\"}) by (ident)",
                        "legend": "CPU usage",
                        "refId": "B"
                    },
                    {
                        "expr": "avg(mem_used_percent{ident=~\"$ident\"}) by (ident)",
                        "legend": "Memory usage",
                        "refId": "C"
                    },
                    {
                        "expr": "avg(mem_total{ident=~\"$ident\"}) by (ident)",
                        "legend": "Total memory",
                        "refId": "D"
                    },
                    {
                        "expr": "avg(disk_used_percent{ident=~\"$ident\",path=\"/\"}) by (ident)",
                        "legend": "Disk usage",
                        "refId": "E"
                    },
                    {
                        "expr": "avg(disk_total{ident=~\"$ident\"}) by (ident)",
                        "refId": "F",
                        "legend": "Total disk"
                    },
                    {
                        "expr": "avg(rate(net_bytes_recv{ident=~\"$ident\"}[1m])) by(ident)",
                        "refId": "G",
                        "legend": "Network in"
                    },
                    {
                        "expr": "avg(rate(net_bytes_sent{ident=~\"$ident\"}[1m])) by(ident)",
                        "refId": "H",
                        "legend": "Network out"
                    },
                    {
                        "expr": "avg(nvidia_smi_utilization_gpu_ratio{ident=~\"$ident\"}) by (ident)",
                        "refId": "I",
                        "legend": "GPU usage"
                    },
                    {
                        "expr": "avg(nvidia_smi_memory_used_bytes/nvidia_smi_memory_total_bytes{ident=~\"$ident\"}) by (ident)",
                        "refId": "J",
                        "legend": "GPU memory usage"
                    },
                    {
                        "expr": "avg(nvidia_smi_memory_total_bytes{ident=~\"$ident\"}) by (ident)",
                        "refId": "K",
                        "legend": "Total GPU memory"
                    },
                    {
                        "expr": "avg(ntp_offset_ms{ident=~\"$ident\"}) by (ident)",
                        "refId": "L",
                        "legend": "NTP offset (ms)"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {
                            "renameByName": {
                                "ident": "Machine"
                            }
                        }
                    }
                ],
                "name": "Machine list",
                "maxPerRow": 4,
                "custom": {
                    "showHeader": true,
                    "colorMode": "background",
                    "calc": "lastNotNull",
                    "displayMode": "labelValuesToRows",
                    "aggrDimension": "ident",
                    "sortColumn": "ident",
                    "sortOrder": "ascend",
                    "linkMode": "cellLink"
                },
                "options": {
                    "standardOptions": {}
                },
                "overrides": [
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "A"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "humantimeSeconds"
                            }
                        }
                    },
                    {
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "B"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "percent",
                                "decimals": 1
                            },
                            "valueMappings": []
                        }
                    },
                    {
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "C"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "percent",
                                "decimals": 1
                            },
                            "valueMappings": []
                        },
                        "type": "special"
                    },
                    {
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "D"
                        },
                        "properties": {
                            "standardOptions": {
                                "decimals": 1,
                                "util": "bytesIEC"
                            },
                            "valueMappings": []
                        },
                        "type": "special"
                    },
                    {
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "E"
                        },
                        "properties": {
                            "standardOptions": {
                                "decimals": 1,
                                "util": "percent"
                            },
                            "valueMappings": []
                        },
                        "type": "special"
                    },
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "F"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "bytesIEC",
                                "decimals": 0
                            }
                        }
                    },
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "G"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "bytesSecIEC",
                                "decimals": 1
                            }
                        }
                    },
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "H"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "bytesSecIEC",
                                "decimals": 1
                            }
                        }
                    },
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "I"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "percentUnit",
                                "decimals": 1
                            }
                        }
                    },
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "J"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "percentUnit",
                                "decimals": 1
                            }
                        }
                    },
                    {
                        "type": "special",
                        "matcher": {
                            "id": "byFrameRefID",
                            "value": "K"
                        },
                        "properties": {
                            "standardOptions": {
                                "util": "bytesIEC",
                                "decimals": 1
                            }
                        }
                    }
                ]
            }
        ],
        "var": [
            {
                "definition": "prometheus",
                "name": "prom",
                "type": "datasource"
            },
            {
                "allOption": true,
                "datasource": {
                    "cate": "prometheus",
                    "value": "${prom}"
                },
                "definition": "label_values(system_load1,ident)",
                "multi": true,
                "name": "ident",
                "type": "query"
            }
        ],
        "version": "3.0.0"
    }
}
  • Dashboard result
    [Screenshot: machine list dashboard]
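
In a table panel like this one, each column's unit comes from an override whose matcher value equals a target's refId, so a typo in either place silently leaves a column unformatted. The sketch below is one way to cross-check the two lists before importing a dashboard. It relies only on the field names visible in the JSON above; the function name `audit_overrides` and the trimmed sample are illustrative.

```python
import json


def audit_overrides(dashboard):
    """Compare each panel's override matchers with the refIds of its targets.

    Returns {panel name: {"unknown": [...], "unstyled": [...]}} for panels
    where the two lists differ. "unknown" are override values with no matching
    target; "unstyled" are targets with no matching override.
    """
    report = {}
    for panel in dashboard.get("configs", {}).get("panels", []):
        target_ids = [t["refId"] for t in panel.get("targets", []) if "refId" in t]
        override_ids = [
            o["matcher"]["value"]
            for o in panel.get("overrides", [])
            if "value" in o.get("matcher", {})
        ]
        unknown = [r for r in override_ids if r not in target_ids]
        unstyled = [r for r in target_ids if r not in override_ids]
        if unknown or unstyled:
            report[panel.get("name", panel.get("id", "?"))] = {
                "unknown": unknown,
                "unstyled": unstyled,
            }
    return report


# A trimmed-down stand-in for the dashboard above: two targets, one override.
SAMPLE = json.loads("""
{
  "configs": {
    "panels": [
      {
        "name": "Machine list",
        "targets": [{"refId": "A"}, {"refId": "L"}],
        "overrides": [{"matcher": {"id": "byFrameRefID", "value": "A"}}]
      }
    ]
  }
}
""")

if __name__ == "__main__":
    print(audit_overrides(SAMPLE))
```

Run against the full machine list JSON, this would flag refId L (the NTP offset query) as unstyled, since the overrides only cover A through K. That is harmless here, because the value is a plain number of milliseconds.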

2. System monitoring

  • Dashboard JSON
{
    "name": "System monitoring",
    "tags": "",
    "ident": "",
    "configs": {
        "panels": [
            {
                "type": "timeseries",
                "id": "043c26de-d19f-4fe8-a615-2b7c10ceb828",
                "layout": {
                    "h": 7,
                    "w": 8,
                    "x": 0,
                    "y": 0,
                    "i": "043c26de-d19f-4fe8-a615-2b7c10ceb828",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "cpu_usage_active{ident=~\"$ident\"}",
                        "refId": "A",
                        "legend": "{{ident}}-usage"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "CPU usage",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "util": "percent",
                        "min": 0,
                        "max": 101,
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off",
                            "standardOptions": {
                                "min": null,
                                "max": null,
                                "decimals": null
                            }
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "239aacdf-1982-428b-b240-57f4ce7f946d",
                "layout": {
                    "h": 7,
                    "w": 8,
                    "x": 8,
                    "y": 0,
                    "i": "239aacdf-1982-428b-b240-57f4ce7f946d",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "mem_used_percent{ident=~\"$ident\"}",
                        "refId": "A",
                        "legend": "{{ident}}-usage"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Memory usage",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "util": "percent",
                        "min": 0,
                        "max": 101,
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off",
                            "standardOptions": {
                                "decimals": null,
                                "min": null,
                                "max": null
                            }
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "bbd1ebda-99f6-419c-90a5-5f84973976dd",
                "layout": {
                    "h": 7,
                    "w": 8,
                    "x": 16,
                    "y": 0,
                    "i": "bbd1ebda-99f6-419c-90a5-5f84973976dd",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "rate(diskio_read_bytes{ident=~\"$ident\"}[1m])",
                        "legend": "{{ident}}-{{name}}-read IO",
                        "refId": "A"
                    },
                    {
                        "expr": "rate(diskio_write_bytes{ident=~\"$ident\"}[1m])",
                        "legend": "{{ident}}-{{name}}-write IO",
                        "refId": "B"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Disk IO",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "util": "bytesSecIEC",
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "f2ee5d32-737c-4095-b6b7-b15b778ffdb9",
                "layout": {
                    "h": 7,
                    "w": 8,
                    "x": 0,
                    "y": 7,
                    "i": "f2ee5d32-737c-4095-b6b7-b15b778ffdb9",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "rate(net_bytes_recv{ident=~\"$ident\"}[1m])",
                        "legend": "{{ident}}-inbound",
                        "refId": "A"
                    },
                    {
                        "expr": "rate(net_bytes_sent{ident=~\"$ident\"}[1m])",
                        "legend": "{{ident}}-outbound",
                        "refId": "B"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Network traffic",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "util": "bytesSecIEC",
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "6be9a2be-1d4c-488d-b695-aa1d82df3a3c",
                "layout": {
                    "h": 7,
                    "w": 8,
                    "x": 8,
                    "y": 7,
                    "i": "e164a7cb-394c-4670-b83c-e9321a08cbe6",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "nvidia_smi_utilization_gpu_ratio{ident=~\"$ident\"}",
                        "legend": "{{ident}}-usage",
                        "refId": "A"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "GPU usage",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "util": "percentUnit",
                        "min": 0,
                        "max": 1.01,
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "7873f825-1e41-45e9-a1ee-792a87fd4351",
                "layout": {
                    "h": 7,
                    "w": 8,
                    "x": 16,
                    "y": 7,
                    "i": "37ced102-b020-4e3f-8247-6b2c9240a762",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "nvidia_smi_memory_used_bytes/nvidia_smi_memory_total_bytes{ident=~\"$ident\"}",
                        "legend": "{{ident}}-usage",
                        "refId": "A"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "GPU memory usage",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "util": "percentUnit",
                        "min": 0,
                        "max": 1.01,
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            }
        ],
        "var": [
            {
                "definition": "prometheus",
                "name": "prom",
                "type": "datasource"
            },
            {
                "allOption": true,
                "datasource": {
                    "cate": "prometheus",
                    "value": "${prom}"
                },
                "definition": "label_values(system_load1,ident)",
                "multi": true,
                "name": "ident",
                "type": "query"
            }
        ],
        "version": "3.0.0"
    }
}
  • Dashboard preview
    [screenshot]

3. Service monitoring

  • Dashboard JSON
{
    "name": "Service monitoring",
    "tags": "",
    "ident": "",
    "configs": {
        "panels": [
            {
                "type": "timeseries",
                "id": "043c26de-d19f-4fe8-a615-2b7c10ceb828",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 0,
                    "y": 0,
                    "i": "043c26de-d19f-4fe8-a615-2b7c10ceb828",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "mysql_global_status_threads_connected{ident=~\"$ident\"}",
                        "refId": "A",
                        "legend": "{{ident}}-current connections"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "MySQL connections",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "min": null,
                        "max": null,
                        "decimals": null
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off",
                            "standardOptions": {
                                "min": null,
                                "max": null,
                                "decimals": null
                            }
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "bbd1ebda-99f6-419c-90a5-5f84973976dd",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 8,
                    "y": 0,
                    "i": "bbd1ebda-99f6-419c-90a5-5f84973976dd",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "mysql_global_status_slow_queries{ident=~\"$ident\"}",
                        "legend": "{{ident}}-slow queries",
                        "refId": "A"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "MySQL slow queries",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "decimals": null
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "3ca8db64-b25e-4e72-8dac-187cec4886ae",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 16,
                    "y": 0,
                    "i": "7174939f-2742-47bd-a023-5d1d3698bf76",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "mtail_login_num{ident=~\"$ident\"}",
                        "legend": "{{ident}}-logins",
                        "refId": "A",
                        "time": {
                            "start": "now-24h",
                            "end": "now"
                        }
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Login log entries",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "093b192e-e991-4590-ab4b-aa768159e00f",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 0,
                    "y": 6,
                    "i": "a18a3bd3-8c2b-4fa2-81f3-7b0d00b49cc9",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "redis_connected_clients{ident=~\"$ident\"}",
                        "refId": "A",
                        "legend": "{{ident}}-current connections"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Redis connections",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "min": null,
                        "max": null,
                        "decimals": null
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0.01,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off",
                            "standardOptions": {
                                "min": null,
                                "max": null,
                                "decimals": null
                            }
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "2674442f-937f-4027-806b-10b2286b14f6",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 8,
                    "y": 6,
                    "i": "c8c061df-894d-458e-a89d-86a8428c52c9",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "redis_used_memory{ident=~\"$ident\"}",
                        "legend": "{{ident}}-memory",
                        "refId": "A"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Redis memory used",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "decimals": null
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "d26e8bc3-16a0-4a60-9aa9-36d71b85abc5",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 16,
                    "y": 6,
                    "i": "0a3310ea-74ca-48fa-8c18-52c1b0f71235",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "mtail_error_num{ident=~\"$ident\"}",
                        "legend": "{{ident}}-errors",
                        "refId": "A",
                        "time": {
                            "start": "now-24h",
                            "end": "now"
                        }
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Error log entries",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "7fa2cdbe-b782-4b71-bd7e-2cdba7455e77",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 0,
                    "y": 12,
                    "i": "9a2e4d49-7a4f-4627-b2f6-cbe0e4ab04b1",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "nginx_active{ident=~\"$ident\"}",
                        "refId": "A",
                        "legend": "{{ident}}-active connections"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Nginx active connections",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "min": null,
                        "max": null,
                        "decimals": null
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off",
                            "standardOptions": {
                                "min": null,
                                "max": null,
                                "decimals": null
                            }
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "0cb01432-ea29-41f4-8e6f-e6b9b71e90ab",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 8,
                    "y": 12,
                    "i": "8bf97e38-e840-4804-a686-28bb65fec78d",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "docker_n_containers_running{ident=~\"$ident\"}",
                        "refId": "A",
                        "legend": "{{ident}}-running containers"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Docker running containers",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "min": null,
                        "max": null,
                        "decimals": null
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off",
                            "standardOptions": {
                                "min": null,
                                "max": null,
                                "decimals": null
                            }
                        }
                    }
                ]
            },
            {
                "type": "timeseries",
                "id": "936b934b-6340-4743-8c12-821c63210fd6",
                "layout": {
                    "h": 6,
                    "w": 8,
                    "x": 16,
                    "y": 12,
                    "i": "c6da1998-c1e3-4486-a24c-58e26d349206",
                    "isResizable": true
                },
                "version": "3.0.0",
                "datasourceCate": "prometheus",
                "datasourceValue": "${prom}",
                "targets": [
                    {
                        "expr": "docker_container_mem_usage{ident=~\"$ident\"}",
                        "legend": "{{ident}}-{{container_name}}-memory",
                        "refId": "A"
                    }
                ],
                "transformations": [
                    {
                        "id": "organize",
                        "options": {}
                    }
                ],
                "name": "Docker container memory usage",
                "maxPerRow": 4,
                "options": {
                    "tooltip": {
                        "mode": "all",
                        "sort": "desc"
                    },
                    "legend": {
                        "displayMode": "hidden",
                        "behaviour": "showItem"
                    },
                    "standardOptions": {
                        "decimals": 0
                    },
                    "thresholds": {
                        "steps": [
                            {
                                "color": "#634CD9",
                                "value": null,
                                "type": "base"
                            }
                        ]
                    }
                },
                "custom": {
                    "drawStyle": "lines",
                    "lineInterpolation": "smooth",
                    "spanNulls": false,
                    "lineWidth": 2,
                    "fillOpacity": 0,
                    "gradientMode": "none",
                    "stack": "off",
                    "scaleDistribution": {
                        "type": "linear"
                    }
                },
                "overrides": [
                    {
                        "matcher": {
                            "id": "byFrameRefID"
                        },
                        "properties": {
                            "rightYAxisDisplay": "off"
                        }
                    }
                ]
            }
        ],
        "var": [
            {
                "definition": "prometheus",
                "name": "prom",
                "type": "datasource"
            },
            {
                "allOption": true,
                "datasource": {
                    "cate": "prometheus",
                    "value": "${prom}"
                },
                "definition": "label_values(system_load1,ident)",
                "multi": true,
                "name": "ident",
                "type": "query"
            }
        ],
        "version": "3.0.0"
    }
}
  • Dashboard preview
    [screenshot]
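
The panels in the dashboard JSON above are positioned with `x` offsets of 0, 8 and 16 and a width of 8, which suggests a 24-column grid. When such a file is edited by hand, a quick check before importing can catch overlapping or out-of-range panels. The following is only a sketch: `check_dashboard` and `rectangles_overlap` are hypothetical helper names, and the 24-column width is an assumption inferred from the JSON rather than taken from the Nightingale documentation.

```python
from itertools import combinations

GRID_COLUMNS = 24  # assumption: inferred from w=8 panels at x=0, 8, 16


def rectangles_overlap(a, b):
    """Return True if two layout dicts (x, y, w, h) share at least one grid cell."""
    return (
        a["x"] < b["x"] + b["w"]
        and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"]
        and b["y"] < a["y"] + a["h"]
    )


def check_dashboard(dashboard):
    """Collect layout problems and list the PromQL expressions of every panel."""
    problems = []
    panels = dashboard["configs"]["panels"]
    for panel in panels:
        layout = panel["layout"]
        if layout["x"] + layout["w"] > GRID_COLUMNS:
            problems.append(f"{panel['name']}: exceeds {GRID_COLUMNS} columns")
    for first, second in combinations(panels, 2):
        if rectangles_overlap(first["layout"], second["layout"]):
            problems.append(f"{first['name']} overlaps {second['name']}")
    queries = {
        panel["name"]: [target["expr"] for target in panel.get("targets", [])]
        for panel in panels
    }
    return problems, queries


# Trimmed-down sample with the same shape as the dashboard JSON above.
sample = {
    "name": "Service monitoring",
    "configs": {
        "panels": [
            {
                "name": "MySQL connections",
                "layout": {"h": 6, "w": 8, "x": 0, "y": 0},
                "targets": [{"expr": 'mysql_global_status_threads_connected{ident=~"$ident"}'}],
            },
            {
                "name": "MySQL slow queries",
                "layout": {"h": 6, "w": 8, "x": 8, "y": 0},
                "targets": [{"expr": 'mysql_global_status_slow_queries{ident=~"$ident"}'}],
            },
        ]
    },
}

problems, queries = check_dashboard(sample)
print(problems)
for name, exprs in queries.items():
    print(name, exprs)
```

For a real export, the same function can be fed the result of `json.load()` on the saved dashboard file.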

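III. Alert configuration

1. Email notification

  • Configure SMTP
    [screenshot]
  • Configure the user's email address
    [screenshot]
  • Configure the email notification template
    [screenshot]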
<!DOCTYPE html>
	<html lang="en">
	<head>
		<meta charset="UTF-8">
		<meta http-equiv="X-UA-Compatible" content="ie=edge">
		<title>Nightingale alert notification</title>
		<style type="text/css">
			.wrapper {
				background-color: #f8f8f8;
				padding: 15px;
				height: 100%;
			}
			.main {
				width: 600px;
				padding: 30px;
				margin: 0 auto;
				background-color: #fff;
				font-size: 12px;
				font-family: verdana,'Microsoft YaHei',Consolas,'Deja Vu Sans Mono','Bitstream Vera Sans Mono';
			}
			header {
				border-radius: 2px 2px 0 0;
			}
			header .title {
				font-size: 14px;
				color: #333333;
				margin: 0;
			}
			header .sub-desc {
				color: #333;
				font-size: 14px;
				margin-top: 6px;
				margin-bottom: 0;
			}
			hr {
				margin: 20px 0;
				height: 0;
				border: none;
				border-top: 1px solid #e5e5e5;
			}
			em {
				font-weight: 600;
			}
			table {
				margin: 20px 0;
				width: 100%;
			}
	
			table tbody tr{
				font-weight: 200;
				font-size: 12px;
				color: #666;
				height: 32px;
			}
			.succ {
				background-color: green;
				color: #fff;
			}
			.fail {
				background-color: red;
				color: #fff;
			}
			.succ th, .succ td, .fail th, .fail td {
				color: #fff;
			}
			table tbody tr th {
				width: 80px;
				text-align: right;
			}
			.text-right {
				text-align: right;
			}
			.body {
				margin-top: 24px;
			}
			.body-text {
				color: #666666;
				-webkit-font-smoothing: antialiased;
			}
			.body-extra {
				-webkit-font-smoothing: antialiased;
			}
			.body-extra.text-right a {
				text-decoration: none;
				color: #333;
			}
			.body-extra.text-right a:hover {
				color: #666;
			}
			.button {
				width: 200px;
				height: 50px;
				margin-top: 20px;
				text-align: center;
				border-radius: 2px;
				background: #2D77EE;
				line-height: 50px;
				font-size: 20px;
				color: #FFFFFF;
				cursor: pointer;
			}
			.button:hover {
				background: rgb(25, 115, 255);
				border-color: rgb(25, 115, 255);
				color: #fff;
			}
			footer {
				margin-top: 10px;
				text-align: right;
			}
			.footer-logo {
				text-align: right;
			}
			.footer-logo-image {
				width: 108px;
				height: 27px;
				margin-right: 10px;
			}
			.copyright {
				margin-top: 10px;
				font-size: 12px;
				text-align: right;
				color: #999;
				-webkit-font-smoothing: antialiased;
			}
		</style>
	</head>
	<body>
	<div class="wrapper">
		<div class="main">
			<header>
				<h3 class="title">{{.RuleName}}</h3>
				<p class="sub-desc"></p>
			</header>
			<hr>
			<div class="body">
				<table cellspacing="0" cellpadding="0" border="0">
					<tbody>
					{{if .IsRecovered}}
					<tr class="succ">
						<th>Status:</th>
						<td>S{{.Severity}} Recovered</td>
					</tr>
					{{else}}
					<tr class="fail">
						<th>Status:</th>
						<td>S{{.Severity}} Triggered</td>
					</tr>
					{{end}}
	
					{{if not .IsRecovered}}
					<tr>
						<th>Trigger value:</th>
						<td>{{.TriggerValue}}</td>
					</tr>
					{{end}}
	
					{{if .TargetIdent}}
					<tr>
						<th>Target:</th>
						<td>{{.TargetIdent}}</td>
					</tr>
					{{end}}
					<tr>
						<th>Metric tags:</th>
						<td>{{.TagsJSON}}</td>
					</tr>

					{{$time_duration := sub now.Unix .FirstTriggerTime }}
					{{if .IsRecovered}}
					<tr>
						<th>Duration:</th>
						<td>{{humanizeDurationInterface $time_duration}}</td>
					</tr>
					<tr>
						<th>Recovered at:</th>
						<td>{{timeformat .LastEvalTime}}</td>
					</tr>
					{{else}}
					<tr>
						<th>Triggered at:</th>
						<td>
							{{timeformat .TriggerTime}}
						</td>
					</tr>
					{{end}}
					</tbody>
				</table>
			</div>
		</div>
	</div>
	</body>
	</html>
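
The template computes the duration row as `sub now.Unix .FirstTriggerTime`, that is, the number of seconds between the first trigger and the moment the mail is rendered, and then passes it to `humanizeDurationInterface`. The sketch below mirrors that arithmetic in Python to make the calculation concrete. `alert_duration` and `humanize_duration` are hypothetical names, the timestamps are made up, and the output format is illustrative only; the text Nightingale actually renders may look different.

```python
import time


def humanize_duration(seconds):
    """Format a number of seconds as e.g. '1h 2m 5s' (illustrative format only)."""
    seconds = max(int(seconds), 0)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if secs or not parts:
        parts.append(f"{secs}s")
    return " ".join(parts)


def alert_duration(first_trigger_time, now=None):
    """Mirror `sub now.Unix .FirstTriggerTime`: seconds since the first trigger."""
    now = int(time.time()) if now is None else now
    return now - first_trigger_time


first_trigger_time = 1_700_000_000   # hypothetical Unix timestamp
now = first_trigger_time + 3725      # 1 hour, 2 minutes, 5 seconds later
print(humanize_duration(alert_duration(first_trigger_time, now)))  # 1h 2m 5s
```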

2. Alert rules

  • CPU usage above 90%
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "CPU usage above 90%",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "",
    "rule_config": {
      "inhibit": true,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "cpu_usage_active > 90",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "0"
    ],
    "enable_days_of_weeks": [
      [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "0"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [
      "email"
    ],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
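
In the rule above, `prom_eval_interval` is 15 and `prom_for_duration` is 60, so the query runs every 15 seconds and the condition has to keep holding for 60 seconds before a notification goes out. The following sketch models that behaviour in the Prometheus-style "for" sense. It is a simplified model under that assumption, not Nightingale's actual engine, and `first_firing_time` is a hypothetical helper name; the CPU readings are made up.

```python
def first_firing_time(samples, threshold, for_duration):
    """Return the timestamp at which the alert would fire, or None.

    samples: list of (timestamp_seconds, value) pairs, one per evaluation.
    The condition must hold at every evaluation for at least `for_duration`
    seconds; a single evaluation at or below the threshold resets the timer.
    """
    pending_since = None
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = timestamp
            if timestamp - pending_since >= for_duration:
                return timestamp
        else:
            pending_since = None
    return None


EVAL_INTERVAL = 15   # prom_eval_interval from the rule
FOR_DURATION = 60    # prom_for_duration from the rule

# Hypothetical cpu_usage_active readings, one per evaluation.
values = [95, 96, 50, 92, 93, 94, 97, 98]
samples = [(i * EVAL_INTERVAL, v) for i, v in enumerate(values)]
print(first_firing_time(samples, threshold=90, for_duration=FOR_DURATION))  # 105
```

The dip to 50 at t=30 resets the pending state, so the 60 seconds are counted again from t=45 and the alert fires at t=105. A short spike that lasts fewer than 60 seconds never fires.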
  • MySQL: more than 10 slow queries within 1 minute
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "MySQL: more than 10 slow queries within 1 minute",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 120,
    "prom_ql": "",
    "rule_config": {
      "inhibit": false,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "increase(mysql_global_status_slow_queries[1m]) > 10",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "0"
    ],
    "enable_days_of_weeks": [
      [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "0"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [
      "email"
    ],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
  • MySQL connection usage above 80%
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "MySQL connection usage above 80%",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 120,
    "prom_ql": "",
    "rule_config": {
      "inhibit": false,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "avg by (instance) (mysql_global_status_threads_connected) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 80",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "0"
    ],
    "enable_days_of_weeks": [
      [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "0"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [
      "email"
    ],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
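
The query in this rule divides the per-instance average of `mysql_global_status_threads_connected` by the per-instance average of `mysql_global_variables_max_connections` and multiplies by 100. The arithmetic can be sketched in Python as below; the sample values and the helper names `avg_by_instance` and `connection_usage_percent` are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Sample = Tuple[str, float]  # (instance label, sample value)


def avg_by_instance(samples: Iterable[Sample]) -> Dict[str, float]:
    """Average the values per instance, like `avg by (instance) (...)`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for instance, value in samples:
        totals[instance] += value
        counts[instance] += 1
    return {inst: totals[inst] / counts[inst] for inst in totals}


def connection_usage_percent(
    connected: Iterable[Sample], max_connections: Iterable[Sample]
) -> Dict[str, float]:
    """Connected threads as a percentage of max_connections, per instance.

    Simplification: instances with a zero or missing limit are skipped,
    whereas PromQL would yield Inf/NaN or drop the unmatched series.
    """
    used = avg_by_instance(connected)
    limit = avg_by_instance(max_connections)
    return {
        inst: used[inst] / limit[inst] * 100
        for inst in used
        if limit.get(inst, 0) > 0
    }


if __name__ == "__main__":
    connected = [("db1", 130), ("db1", 126), ("db2", 10)]
    limits = [("db1", 151), ("db2", 151)]
    usage = connection_usage_percent(connected, limits)
    firing = sorted(inst for inst, pct in usage.items() if pct > 80)
    print(firing)
```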
  • Memory usage above 85%
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "Memory usage above 85%",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "",
    "rule_config": {
      "inhibit": true,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "mem_used_percent > 85",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "0"
    ],
    "enable_days_of_weeks": [
      [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "0"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [
      "email"
    ],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
  • Disk usage above 80%
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "Disk usage above 80%",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "",
    "rule_config": {
      "inhibit": true,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "disk_used_percent > 80",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 30,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "0",
      "1",
      "2",
      "3",
      "4",
      "5",
      "6"
    ],
    "enable_days_of_weeks": [
      [
        "0",
        "1",
        "2",
        "3",
        "4",
        "5",
        "6"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
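
Unlike the other rules in this section, which list `email`, the export above has an empty `notify_channels` list. If that is unintended, it is easy to miss in a long export. The following is a minimal sketch of a pre-import check, assuming the export is a JSON array in the format shown here; `rules_without_channels` is a hypothetical helper, not part of Nightingale.

```python
import json
from typing import List


def rules_without_channels(rules_json: str) -> List[str]:
    """Return the names of rules whose `notify_channels` is empty or missing."""
    rules = json.loads(rules_json)
    return [
        rule.get("name", "<unnamed>")
        for rule in rules
        if not rule.get("notify_channels")
    ]


if __name__ == "__main__":
    # Two trimmed-down rules: one with a channel, one without.
    sample = json.dumps([
        {"name": "rule-a", "notify_channels": ["email"]},
        {"name": "rule-b", "notify_channels": []},
    ])
    print(rules_without_channels(sample))
```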
  • Network inbound traffic above 6 MiB/s
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "Network inbound traffic above 6 MiB/s",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "",
    "rule_config": {
      "inhibit": false,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "rate(net_bytes_recv[1m]) / 1024 / 1024 > 6",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "0"
    ],
    "enable_days_of_weeks": [
      [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "0"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [
      "email"
    ],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
  • Network outbound traffic above 6 MiB/s
[
  {
    "cate": "prometheus",
    "datasource_ids": [
      0
    ],
    "name": "Network outbound traffic above 6 MiB/s",
    "note": "",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [
      1
    ],
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "",
    "rule_config": {
      "inhibit": false,
      "queries": [
        {
          "keys": {
            "labelKey": "",
            "valueKey": ""
          },
          "prom_ql": "rate(net_bytes_sent[1m]) / 1024 / 1024 > 6",
          "severity": 1
        }
      ]
    },
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_stimes": [
      "00:00"
    ],
    "enable_etime": "23:59",
    "enable_etimes": [
      "23:59"
    ],
    "enable_days_of_week": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "0"
    ],
    "enable_days_of_weeks": [
      [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "0"
      ]
    ],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [
      "email"
    ],
    "notify_repeat_step": 60,
    "notify_max_number": 3,
    "recover_duration": 60,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
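
The two network rules above compare `rate(net_bytes_recv[1m]) / 1024 / 1024` and `rate(net_bytes_sent[1m]) / 1024 / 1024` with 6, that is, a byte counter converted to a per-second rate and then to MiB/s. A simplified Python sketch of that conversion follows. `simple_rate` is a hypothetical helper that only takes the difference between the first and last sample; the real PromQL `rate()` also handles counter resets and extrapolates to the window boundaries.

```python
from typing import List, Tuple

MIB = 1024 * 1024  # bytes per MiB


def simple_rate(samples: List[Tuple[int, float]]) -> float:
    """Per-second increase of a counter between the first and last sample.

    samples: (timestamp_seconds, counter_value) pairs in ascending time order.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be increasing")
    return (v1 - v0) / (t1 - t0)


def exceeds_limit(samples: List[Tuple[int, float]], limit_mib_per_s: float = 6.0) -> bool:
    """True if the throughput over the window is above the limit in MiB/s."""
    return simple_rate(samples) / MIB > limit_mib_per_s


if __name__ == "__main__":
    # A counter growing by 7 MiB every second, sampled every 15 seconds for 1 minute.
    window = [(t, t * 7 * MIB) for t in range(0, 61, 15)]
    print(simple_rate(window) / MIB, exceeds_limit(window))
```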

3. Alert self-healing

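  • Self-healing configuration
    [screenshot]
  • Testing alert self-healing
    Alert self-healing > Self-healing scripts > Create
    [screenshot]
    Alert self-healing > Self-healing scripts > Create task on the test script > Save and execute immediately > Execution history > click the task listed under the title
    [screenshot]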

