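HertzBeat is an agentless monitoring platform with strong support for custom monitoring. It can monitor application services, middleware, databases, operating systems, and cloud-native components, and it supports alert thresholds and alert notifications (email, WeChat, DingTalk, Feishu). I will not introduce the software in detail here; if you are interested, see the official documentation (https://hertzbeat.com/docs).
Today I mainly want to share how to configure and run HertzBeat with Docker.
HertzBeat stores historical monitoring data in a time-series database. The official documentation lists two compatible time-series databases, IoTDB and TDengine; I use TDengine here.
First, pull the required Docker images onto the virtual machine (Docker is already installed, version 20.10.21; for installation, see the DockerCE category on my blog).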
# docker pull tancloud/hertzbeat
Using default tag: latest
latest: Pulling from tancloud/hertzbeat
751ef25978b2: Pull complete
140e22108c7d: Pull complete
993077aca88e: Pull complete
d1a940e1e4e8: Pull complete
4f368e97aba5: Pull complete
4f4fb700ef54: Pull complete
Digest: sha256:ed3d981673ee34e2d462ba0dda415f62aeec2380ccd0f45a8f1e481d05b2c735
Status: Downloaded newer image for tancloud/hertzbeat:latest
docker.io/tancloud/hertzbeat:latest
# docker pull tdengine/tdengine:2.4.0.12
2.4.0.12: Pulling from tdengine/tdengine
2f94e549220a: Pull complete
0c7809c5a70c: Pull complete
354dceb62d94: Pull complete
ded68138e6c3: Pull complete
a049546d9313: Pull complete
c67be503641a: Pull complete
1f27396f6efc: Pull complete
fe556ec02776: Pull complete
Digest: sha256:0209b13bc6bffaac98fb05df58a86b06d998877d786efcdf59e68299b538d8bd
Status: Downloaded newer image for tdengine/tdengine:2.4.0.12
docker.io/tdengine/tdengine:2.4.0.12
Next, start the TDengine database with the following command:
docker run -dti -p 6030-6049:6030-6049 -p 6030-6049:6030-6049/udp \
-v /data/taosdata:/var/lib/taos \
-e TZ=Asia/Shanghai \
--name tdengine tdengine/tdengine:2.4.0.12
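The default password of the database root user is taosdata. Here I change it to a different password and create the hertzbeat database in the same session: enter the container with docker exec and run the commands below.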
# docker exec -it tdengine /bin/bash
root@77a11dd2b845:~/TDengine-server-2.4.0.12# taos
taos> show databases;
taos> CREATE DATABASE hertzbeat KEEP 90 DAYS 10 BLOCKS 6 UPDATE 1;
Query OK, 0 of 0 row(s) in database (0.001995s)
taos> alter user root pass 'YourPassword';
Query OK, 0 of 0 row(s) in database (0.002024s)
taos> quit
root@77a11dd2b845:~/TDengine-server-2.4.0.12# exit
exit
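Note:
CREATE DATABASE hertzbeat KEEP 90 DAYS 10 BLOCKS 6 UPDATE 1
This statement creates a database named hertzbeat whose data is kept for 90 days (data older than 90 days is deleted automatically), with one data file per 10 days, 6 memory blocks, and data updates allowed.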
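To confirm that the new password works and that the hertzbeat database exists, one option is to query TDengine over its REST interface, the same interface the TAOS-RS JDBC URL in application.yml relies on. The following is a minimal Python sketch, assuming the REST service listens on port 6041 (inside the published 6030-6049 range) and accepts HTTP Basic authentication at /rest/sql, as described in the TDengine 2.x documentation. The function names, host, and password are placeholders of mine, not part of HertzBeat or TDengine.

```python
import base64
import json
import urllib.request


def build_rest_request(host, user, password, sql, port=6041):
    """Build (but do not send) a POST request for TDengine's /rest/sql endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"http://{host}:{port}/rest/sql",
        data=sql.encode("utf-8"),  # the SQL statement is sent as the raw request body
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


def run_sql(host, user, password, sql, port=6041, timeout=5):
    """Send the statement and return the decoded JSON reply."""
    request = build_rest_request(host, user, password, sql, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (placeholder host and password, adjust to your environment):
# reply = run_sql("192.168.223.199", "root", "YourPassword", "show databases")
# print(json.dumps(reply, indent=2))
```

If the password is wrong or the port is not reachable, run_sql raises an exception from urllib, which is usually enough to tell the two cases apart before starting HertzBeat.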
Next, we edit HertzBeat's two key configuration files, application.yml and sureness.yml.
# application.yml
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
server:
  port: 1157
spring:
  application:
    name: ${HOSTNAME:@hertzbeat@}${PID}
  profiles:
    active: prod
  mvc:
    static-path-pattern: /**
  jackson:
    default-property-inclusion: ALWAYS
  web:
    resources:
      static-locations:
        - classpath:/dist/
        - classpath:../dist/
  # need to disable spring boot mongodb auto config, or default mongodb connection tried and failed..
  autoconfigure:
    exclude: org.springframework.boot.autoconfigure.mongo.MongoAutoConfiguration, org.springframework.boot.autoconfigure.data.mongo.MongoDataAutoConfiguration
  thymeleaf:
    prefix: classpath:/templates/
    check-template-location: true
    cache: true
    suffix: .html
    mode: HTML
management:
  endpoints:
    web:
      exposure:
        include: '*'
    enabled-by-default: off
sureness:
  auths:
    - digest
    - basic
    - jwt
  jwt:
    secret: 'CyaFv0bwq2Eik0jdrKUtsA6bx3sDJeFV643R
             LnfKefTjsIfJLBa2YkhEqEGtcHDTNe4CU6+9
             8tVt4bisXQ13rbN0oxhUZR73M6EByXIO+SV5
             dKhaX0csgOCTlCxq20yhmUea6H6JIpSE2Rwp'
---
spring:
  config:
    activate:
      on-profile: prod
  datasource:
    driver-class-name: org.h2.Driver
    username: sa
    password: 123456
    url: jdbc:h2:./data/hertzbeat;MODE=MYSQL
    hikari:
      max-lifetime: 120000
  jpa:
    hibernate:
      ddl-auto: update
  # Not required: configure this only if you need email notifications;
  # the spring.mail block can be removed otherwise
  mail:
    # Attention: this is the mail server address.
    # For QQ Mail use smtp.qq.com; for QQ enterprise mail use smtp.exmail.qq.com
    host: smtp.exmail.qq.com
    username: example@tancloud.cn
    # Attention: this is not the email account password; it must be an email authorization code
    password: example
    port: 465
    default-encoding: UTF-8
    properties:
      mail:
        smtp:
          socketFactoryClass: javax.net.ssl.SSLSocketFactory
          ssl:
            enable: true
        debug: false
warehouse:
  store:
    td-engine:
      enabled: true
      driver-class-name: com.taosdata.jdbc.rs.RestfulDriver
      url: jdbc:TAOS-RS://192.168.223.199:6041/hertzbeat
      username: root
      password: YourPassword
    iot-db:
      enabled: false
      host: 127.0.0.1
      rpc-port: 6667
      username: root
      password: root
      # org.apache.iotdb.session.util.Version: V_0_12 || V_0_13
      version: V_0_13
      # if iotdb version >= 0.13 use default queryTimeoutInMs = -1; else use default queryTimeoutInMs = 0
      query-timeout-in-ms: -1
      # data expire time, unit:ms, default '7776000000'(90 days, -1:never expire)
      expire-time: '7776000000'
    memory:
      enabled: true
      init-size: 1024
    redis:
      enabled: false
      host: 127.0.0.1
      port: 6379
      password: 123456
alerter:
  # custom console url
  console-url: https://console.tancloud.cn
  # base of alert eval interval time, unit:ms. The next time is 2 times the previous time.
  alert-eval-interval-base: 600000
  # max of alert eval interval time, unit:ms
  max-alert-eval-interval: 86400000
  # system alert(available alert, reachable alert...) trigger times
  system-alert-trigger-times: 1
Note:
In the configuration file above, I only changed the td-engine settings in the warehouse block:
td-engine:
  enabled: true
  driver-class-name: com.taosdata.jdbc.rs.RestfulDriver
  url: jdbc:TAOS-RS://192.168.223.199:6041/hertzbeat
  username: root
  password: YourPassword
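Several values in application.yml are given in milliseconds, which makes them hard to read at a glance. The sketch below, in Python, converts days to milliseconds for expire-time and lists the alert evaluation intervals implied by alert-eval-interval-base and max-alert-eval-interval. The doubling schedule is my reading of the comment "The next time is 2 times the previous time"; HertzBeat's actual implementation may differ, and both function names are my own.

```python
def days_to_ms(days):
    """Convert whole days to milliseconds, the unit expire-time uses."""
    return days * 24 * 60 * 60 * 1000


def eval_intervals(base_ms, max_ms):
    """List intervals that start at base_ms and double until capped at max_ms."""
    if base_ms <= 0:
        raise ValueError("base_ms must be positive")
    intervals = []
    current = base_ms
    while current < max_ms:
        intervals.append(current)
        current *= 2
    intervals.append(max_ms)  # every later interval stays at the cap
    return intervals


# Values taken from the configuration file above.
print(days_to_ms(90))                     # 7776000000
print(eval_intervals(600000, 86400000))   # 10 minutes doubling up to 24 hours
```

With the defaults, the schedule starts at 10 minutes and reaches the 24-hour cap on the ninth step.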
# sureness.yml: HertzBeat user and permission configuration
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## -- sureness.yml text data source -- ##
# Resources loaded into the matching dictionary, i.e. protected resources together with the roles allowed to access them
# Resources not configured here are still protected by authentication by default, but are not authorized
# eg: /api/v1/source1===get===[admin] means the resource /api/v1/source1===get can be accessed by the admin role only
# eg: /api/v1/source2===get===[] means the resource /api/v1/source2===get cannot be accessed by any role
resourceRole:
  - /api/account/auth/refresh===post===[admin,user,guest]
  - /api/apps/**===get===[admin,user,guest]
  - /api/monitor/**===get===[admin,user,guest]
  - /api/monitor/**===post===[admin,user]
  - /api/monitor/**===put===[admin,user]
  - /api/monitor/**===delete===[admin]
  - /api/monitors/**===get===[admin,user,guest]
  - /api/monitors/**===post===[admin,user]
  - /api/monitors/**===put===[admin,user]
  - /api/monitors/**===delete===[admin]
  - /api/alert/**===get===[admin,user,guest]
  - /api/alert/**===post===[admin,user]
  - /api/alert/**===put===[admin,user]
  - /api/alert/**===delete===[admin]
  - /api/alerts/**===get===[admin,user,guest]
  - /api/alerts/**===post===[admin,user]
  - /api/alerts/**===put===[admin,user]
  - /api/alerts/**===delete===[admin]
  - /api/notice/**===get===[admin,user,guest]
  - /api/notice/**===post===[admin,user]
  - /api/notice/**===put===[admin,user]
  - /api/notice/**===delete===[admin]
  - /api/tag/**===get===[admin,user,guest]
  - /api/tag/**===post===[admin,user]
  - /api/tag/**===put===[admin,user]
  - /api/tag/**===delete===[admin]
  - /api/summary/**===get===[admin,user,guest]
  - /api/summary/**===post===[admin,user]
  - /api/summary/**===put===[admin,user]
  - /api/summary/**===delete===[admin]
# Resources excluded from protection; they can be accessed directly without authentication or authorization
# eg: /api/v1/source3===get means /api/v1/source3===get can be accessed by anyone, with no login, authentication, or authorization
excludedResource:
  - /api/account/auth/**===*
  - /api/i18n/**===get
  - /api/apps/hierarchy===get
  - /actuator/**===get
  # web ui static front-end resources
  - /===get
  - /dashboard/**===get
  - /monitors/**===get
  - /alert/**===get
  - /account/**===get
  - /setting/**===get
  - /passport/**===get
  - /**/*.html===get
  - /**/*.js===get
  - /**/*.css===get
  - /**/*.ico===get
  - /**/*.ttf===get
  - /**/*.png===get
  - /**/*.gif===get
  - /**/*.jpg===get
  - /**/*.svg===get
  - /**/*.json===get
  # swagger ui resources
  - /swagger-resources/**===get
  - /v2/api-docs===get
  - /v3/api-docs===get
  # h2 database
  - /h2-console/**===*
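# User account information
# Four accounts are defined below: admin, tom, guest, and lili
# eg: admin has the roles [admin,user]; its password is hertzbeat
# eg: tom has the role [user]; its password is hertzbeat
# eg: lili has the role [guest]; its plaintext password is lili and its salted password is 1A676730B0C7F54654B0E09184448289
account:
  - appId: admin
    credential: hertzbeat
    role: [admin,user]
  - appId: tom
    credential: hertzbeat
    role: [user]
  - appId: guest
    credential: hertzbeat
    role: [guest]
  - appId: lili
    # Note: Digest authentication does not support accounts with salted, hashed passwords
    # The salted password is computed as MD5(password+salt)
    # The original password of this account is lili
    credential: 1A676730B0C7F54654B0E09184448289
    salt: 123
    role: [guest]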
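Each resourceRole entry in the file above follows the pattern path===method===[roles], and a separator with the wrong number of equals signs is easy to miss by eye. The Python sketch below checks that pattern. It is a hypothetical helper of my own, not Sureness's real parser, and the regular expression only reflects the entries shown in this file.

```python
import re

# path===method===[role1,role2], with exactly three '=' between the parts
RESOURCE_ROLE = re.compile(
    r"^(?P<path>[^=\s]+)===(?P<method>[a-z*]+)===\[(?P<roles>[^\]]*)\]$"
)


def parse_resource_role(entry):
    """Split one resourceRole entry into (path, method, roles).

    Raises ValueError when the entry does not match the expected pattern.
    """
    match = RESOURCE_ROLE.match(entry.strip())
    if match is None:
        raise ValueError(f"malformed resourceRole entry: {entry!r}")
    roles = [r.strip() for r in match.group("roles").split(",") if r.strip()]
    return match.group("path"), match.group("method"), roles


print(parse_resource_role("/api/monitor/**===get===[admin,user,guest]"))
```

Running every entry of resourceRole through such a check before mounting the file into the container catches separator typos early.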
Note: I made no changes here. The password is the default one, hertzbeat.
account:
  - appId: admin
    credential: hertzbeat
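If you prefer not to keep a plaintext credential in sureness.yml, the comments in the file say a salted value can be stored instead, computed as MD5(password+salt). The Python sketch below follows that description. The upper-case hex output and the UTF-8 encoding are my assumptions, based on the format of the lili entry; the function name is my own.

```python
import hashlib


def salted_md5(password, salt):
    """Return MD5(password + salt) as an upper-case hex string."""
    digest = hashlib.md5((password + salt).encode("utf-8")).hexdigest()
    return digest.upper()


# Compare the result with the credential stored for the lili account.
print(salted_md5("lili", "123"))
```

Keep in mind the limitation stated in the file itself: Digest authentication does not support accounts with salted passwords. MD5 is also a weak hash by current standards, so this reduces casual exposure rather than providing strong protection.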
Create the two configuration files above in the /data directory, then run HertzBeat with the following command.
docker run -dti -p 1157:1157 \
-e LANG=zh_CN.UTF-8 \
-e TZ=Asia/Shanghai \
-v /data/hertzbeat/data:/opt/hertzbeat/data \
-v /data/hertzbeat/logs:/opt/hertzbeat/logs \
-v /data/application.yml:/opt/hertzbeat/config/application.yml \
-v /data/sureness.yml:/opt/hertzbeat/config/sureness.yml \
--restart=always \
--name hertzbeat tancloud/hertzbeat:latest
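One pitfall with the -v options above: if a host path does not exist, Docker creates it as a directory, so a missing /data/application.yml ends up mounted as an empty directory instead of a file. The Python sketch below is a small pre-flight check to run before docker run. The function name and the default base directory are my own choices for illustration.

```python
from pathlib import Path

# The two files bind-mounted into /opt/hertzbeat/config by the command above.
REQUIRED_FILES = ("application.yml", "sureness.yml")


def missing_config_files(base_dir="/data"):
    """Return the required config files that are not regular files under base_dir."""
    base = Path(base_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


missing = missing_config_files("/data")
if missing:
    print("Create these files before starting the container:", ", ".join(missing))
else:
    print("Both configuration files are in place.")
```

A path that exists as a directory is reported as missing too, which covers the case where an earlier docker run already created it.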
Once both are running, use docker ps to check the TDengine and HertzBeat containers.
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4bb57f02dedd tancloud/hertzbeat:latest "./bin/entrypoint.sh" 2 hours ago Up 2 hours 0.0.0.0:1157->1157/tcp, :::1157->1157/tcp hertzbeat
77a11dd2b845 tdengine/tdengine:2.4.0.12 "/tini -- /usr/bin/e…" 4 hours ago Up 4 hours 0.0.0.0:6030-6049->6030-6049/tcp, 0.0.0.0:6030-6049->6030-6049/udp, :::6030-6049->6030-6049/tcp, :::6030-6049->6030-6049/udp tdengine
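HertzBeat management page on my local virtual machine: http://192.168.223.199:1157/
The dashboard gives an overview of the monitors already configured, and it looks quite good.
Below are trend charts for some of the monitors (a time-series database must be installed and running):
1) Server availability monitoring
2) Web page availability monitoring
3) Port availability monitoring
4) System resource usage monitoring
5) Middleware monitoring
Note: To monitor Tomcat middleware across servers, enable JMX in catalina.sh (the IP address must not be 127.0.0.1 or localhost) and also configure a monitoring user in tomcat-users.xml.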
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote -Djava.rmi.server.hostname=192.168.223.199 -Dcom.sun.management.jmxremote.port=9011 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
<role rolename="manager"/>
<user username="monitor" password="YourPassword" roles="manager"/>
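Before adding the Tomcat monitor in HertzBeat, it can help to confirm from the HertzBeat host that the JMX port configured above is reachable. The Python sketch below performs a plain TCP connection test. It only shows that something is listening on the port, not that JMX itself works: unless com.sun.management.jmxremote.rmi.port is also set, the JVM uses an additional, randomly chosen RMI port that a firewall may still block. The function name is my own.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Host and port taken from the CATALINA_OPTS line above:
# print(port_open("192.168.223.199", 9011))
```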
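The HertzBeat configuration is as follows:
Alert test:
Stop the Tomcat service and then start it again, and the alert list below appears; it is quite clear.
The alert center shows more detailed alert records:
In summary, this software covers a wide range of monitoring targets and is simple to configure. However, compared with the currently popular Grafana, its GUI for presenting monitoring data is still relatively basic, and I hope it will be improved later. The software's security also needs work: passwords are stored in plaintext in configuration files, which would certainly not be allowed in a production environment (especially at large companies).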