1. Requirements
At home I have three Linux hosts in the living room running virtual machines, plus one Windows host running WSL2 (Ubuntu 20.04).
They are:
Hardware | Network connection | OS | IP | VMs |
---|---|---|---|---|
EUC i5 7250U 16G | Wi-Fi | Win10 | 10.0.1.223 | WSL2 - dynamic (random) IP |
MoreFine S500 R7 5800H 64G | Ethernet | Zorin OS 16.2 (Ubuntu 20.04 LTS) | 10.0.1.198 | vm1 - 10.0.1.156, vm2 - 10.0.1.157, vm3 - 10.0.1.158 |
新创云 i7 5500U 8G | Ethernet | Ubuntu Server 22.04 LTS | 10.0.1.107 | vm0 - 10.0.1.151, vm1 - 10.0.1.152 |
ThinkPad X230 i5 3210M 16G | Ethernet | Ubuntu Server 22.04 LTS | 10.0.1.22 | vm0 - 10.0.1.154 |
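Keeping four servers running around the clock is not economical. Most of my services are configured in WSL2, that is, on the EUC i5; the other three Linux servers exist only to run K8S projects and should normally be powered off.
Electricity here costs 0.61 yuan per kWh. Assuming each Linux server idles at 40 W, the daily cost is 0.04 kW × 24 h × 3 hosts × 0.61 yuan = 1.76 yuan, which comes to about 641 yuan a year!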
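The arithmetic above can be checked with a few lines of shell. This is only a sketch: the function name `power_cost` and its argument order are made up here, and the 40 W idle draw is the same assumption as in the text.

```bash
#!/bin/bash
# power_cost WATTS HOSTS PRICE_PER_KWH DAYS
# Prints the cost of keeping HOSTS machines idling at WATTS each for DAYS days.
# (Hypothetical helper; awk does the floating-point arithmetic.)
power_cost() {
    awk -v w="$1" -v n="$2" -v p="$3" -v d="$4" \
        'BEGIN { printf "%.2f\n", w / 1000 * 24 * n * p * d }'
}

power_cost 40 3 0.61 1     # cost for one day
power_cost 40 3 0.61 365   # cost for one year
```

Changing the first argument to a measured idle wattage gives a more realistic figure for each machine.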
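So the requirement is:
Create two jobs in the Jenkins instance running in WSL2, one to power on and one to shut down the other three Linux hosts. Before shutdown, all running VMs must be shut down first; after power-on, all VMs must be started.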
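2. Prerequisites
Remote wake-up (power-on) has two conditions:
- The host must be connected by Ethernet cable; wake-up does not work over Wi-Fi.
- The host sending the wake-up and the host being woken must be on the same network segment. This causes a problem: my Jenkins job runs in WSL2, and WSL2's IP is not on the same segment as the Windows 10 host, so normal access to it already requires port forwarding. That means we cannot wake the other hosts directly from WSL2; the wake-up command has to be executed by Windows 10.
In other words: client -> WSL2 Jenkins -> Windows 10 -> wake the other Linux hosts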
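The same-segment requirement follows from how Wake-on-LAN works: the sender broadcasts a "magic packet", 6 bytes of 0xFF followed by the target MAC address repeated 16 times, and broadcasts are normally not forwarded across routers or NAT. The sketch below only builds that payload as a hex string so the layout is visible; it does not send anything, and the function name `magic_packet_hex` is made up for illustration.

```bash
#!/bin/bash
# magic_packet_hex MAC
# Prints the Wake-on-LAN payload for MAC as lowercase hex:
# 6 x ff, then the 6-byte MAC repeated 16 times (102 bytes = 204 hex digits).
magic_packet_hex() {
    # Strip ':' and '-' separators and lowercase the hex digits.
    mp_mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    # A MAC address is exactly 12 hex digits.
    if ! printf '%s\n' "$mp_mac" | grep -Eq '^[0-9a-f]{12}$'; then
        echo "invalid MAC address: $1" >&2
        return 1
    fi
    mp_payload="ffffffffffff"
    mp_i=0
    while [ "$mp_i" -lt 16 ]; do
        mp_payload="$mp_payload$mp_mac"
        mp_i=$((mp_i + 1))
    done
    printf '%s\n' "$mp_payload"
}

# Show the sync bytes plus the first two copies of the MAC.
magic_packet_hex "70:70:fc:00:85:5b" | cut -c1-36
```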
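3. Check the Linux host's wired NIC settings, MAC address, and whether Wake-on-LAN is enabled
First determine which wired interface the Linux host is actually using, and its MAC address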
ifconfig
cat /etc/netplan/00-installer-config.yaml
Then check whether Wake-on-LAN is enabled on that wired NIC
gateman@MoreFine-S500:~$ sudo ethtool eno1
Settings for eno1:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Link partner advertised pause frame use: Symmetric Receive-only
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
MDI-X: Unknown
Supports Wake-on: pumbg
Wake-on: d
Link detected: yes
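First, the sudo ethtool xxx command must be run as root; otherwise the Wake-on settings are not shown.
Supports Wake-on: pumbg --> if the letters include g, this NIC supports wake-up by magic packet; otherwise Wake-on-LAN has to be enabled in the motherboard BIOS first
Wake-on: d -> d means disabled, g means enabled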
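This check can be scripted. The sketch below reads `ethtool` output on stdin and reports whether the letter g appears in the two Wake-on lines; the function name `wol_status` and the `supported=`/`enabled=` output format are invented here.

```bash
#!/bin/bash
# wol_status: read `ethtool <iface>` output on stdin and print
# "supported=yes|no enabled=yes|no" based on the letter g.
wol_status() {
    awk -F': *' '
        /Supports Wake-on:/ { supported = $2 }   # e.g. "pumbg"
        /^[ \t]*Wake-on:/   { current = $2 }     # e.g. "d" or "g"
        END {
            printf "supported=%s enabled=%s\n",
                (index(supported, "g") ? "yes" : "no"),
                (index(current, "g") ? "yes" : "no")
        }'
}

# Usage (ethtool needs root to show the Wake-on lines):
#   sudo ethtool eno1 | wol_status
```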
4. Enable Wake-on-LAN on the NIC
gateman@MoreFine-S500:~$ sudo ethtool -s eno1 wol g
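Note that the command above enables wol g only temporarily; after a reboot it will most likely revert to disabled.
So we have to change the NIC configuration so that wol g is set at every boot.
Add wakeonlan: true to the netplan configuration file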
gateman@MoreFine-S500:~$ sudo vi /etc/netplan/00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    eno1:
      dhcp4: false
      wakeonlan: true
    enp2s0:
      dhcp4: false
    wlp3s0:
      dhcp4: false
  bridges:
    br0:
      interfaces: [eno1]
      dhcp4: no
      addresses: [10.0.1.198/24]
      routes:
        - to: default
          via: 10.0.1.1
      nameservers:
        addresses: [119.29.29.29, 8.8.8.8]
  version: 2
Remember to apply the settings
sudo netplan apply
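5. Install the wake-up tool Fing
After step 4, we should be able to wake the hosts by running the following command from another Linux host on the same segment
wakeonlan xx:xx:xx:xx:xx:xx # the NIC's MAC address
Strangely, the command above could not wake my X230; it probably needs some additional parameter.
Fortunately there is a very handy network tool called Fing (its basic features are free).
After installing it, I found that the X230 could be woken.
More importantly, Fing provides command-line tools (CLI) for both Windows and Linux, which makes it easy to integrate.
URL: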
https://www.fing.com/products/development-toolkit
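Because I actually need to send the wake-up from the Windows host (WSL2 is not on the same segment), I downloaded the Windows version.
After installation, a wake-up can be tested with the following command
fing --wol 70:70:fc:00:85:5b@10.0.1.198/24 # both the MAC address and the IP must be provided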
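6. Set up the Ansible host list
In fact, by the end of step 5 the remote power-on and shutdown path already works.
The remaining steps only set up one power-on job and one shutdown job in Jenkins.
My Jenkins runs in WSL2. First, put the IPs of the three physical Linux hosts into one group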
vi /etc/ansible/hosts
[physical_servers]
10.0.1.107
10.0.1.122
10.0.1.198
And of course don't forget to install WSL2's SSH key on all three servers, for example
ssh-copy-id -i ~/.ssh/id_rsa.pub gateman@x230
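From here on we can let Ansible handle the whole group instead of dealing with each host separately.
7. Shutdown job - write a shell script and an Ansible playbook to shut down all KVM VMs on the hosts
Shell script: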
gateman@DESKTOP-UIU9RFJ:/opt/apps/playbooks/remoteserver$ cat shutdown_all_vms.sh
#!/bin/bash
echo "=== shutting down all kvm vms ==="
# ask every running domain to shut down gracefully
for i in $(virsh list | grep running | awk '{print $2}')
do
    echo "shutting down $i"
    virsh shutdown "$i"
done
sleep 10 # give the guests 10 seconds to shut down
virsh list --all
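A fixed `sleep 10` may not be enough for every guest to finish shutting down. As a hedged alternative, the sketch below polls until a listing command prints nothing or a timeout expires. `wait_until_empty` is a made-up helper, not part of virsh, and the `virsh list --name --state-running` call in the usage comment is assumed to print one running domain per line.

```bash
#!/bin/bash
# wait_until_empty TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds until it prints nothing.
# Returns 0 once the output is empty, 1 if TIMEOUT seconds of waiting pass first.
# Only the time spent sleeping is counted, not the time COMMAND itself takes.
wait_until_empty() {
    wue_timeout=$1
    wue_interval=$2
    shift 2
    wue_waited=0
    while [ -n "$("$@")" ]; do
        if [ "$wue_waited" -ge "$wue_timeout" ]; then
            return 1
        fi
        sleep "$wue_interval"
        wue_waited=$((wue_waited + wue_interval))
    done
    return 0
}

# Usage inside shutdown_all_vms.sh, in place of `sleep 10`:
#   wait_until_empty 120 5 virsh list --name --state-running \
#       || echo "some VMs are still running after 120 seconds"
```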
Playbook:
It has just two steps: 1. upload the script, 2. run the script
gateman@DESKTOP-UIU9RFJ:/opt/apps/playbooks/remoteserver$ cat shutdown_all_vms.yml
---
- hosts: "{{servers}}"
  remote_user: "{{ansible_user}}"
  gather_facts: false
  tasks:
    - name: print debug msg for parameters
      debug:
        msg:
          - "servers is: {{servers}}"
          - "ansible_user is: {{ansible_user}}"
    - name: copy shutdown vm shell script to remote server
      copy:
        src: ./shutdown_all_vms.sh
        dest: /tmp
        backup: no
        mode: "0775"
    - name: execute the shutdown vm script
      script: ./shutdown_all_vms.sh
      args:
        chdir: /tmp
      environment: # https://stackoverflow.com/questions/59522902/vms-are-not-visible-to-virsh-command-executed-using-ansible-shell-task
        LIBVIRT_DEFAULT_URI: qemu:///system
      register: cmdresult
    - name: show stdout cmdresult
      debug:
        msg: "{{ cmdresult.stdout }}"
    - name: show stderr cmdresult
      debug:
        msg: "{{ cmdresult.stderr }}"
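8. Shutdown job - set up passwordless sudo
The shutdown command requires root.
There are two options. One is to run Ansible as root, which means installing the SSH key under the root account of all three hosts; that is too dangerous.
The other is to let a regular user run sudo without a password (provided the user is in the sudo group).
Here I choose the second.
vi /etc/sudoers # edit the line below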
# Allow members of group sudo to execute any command
%sudo ALL=(ALL:ALL) NOPASSWD:ALL
Now users in the sudo group can shut the machine down with sudo shutdown -h now without being asked for a password
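9. Shutdown job - write an Ansible playbook to shut down the physical hosts
It simply runs sudo shutdown -h now. Remember to add ignore_unreachable: true: the SSH connection drops as the host powers off, and without this setting Ansible would treat the job as failed.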
gateman@DESKTOP-UIU9RFJ:/opt/apps/playbooks/remoteserver$ cat shutdown_server.yml
---
- hosts: "{{servers}}"
  remote_user: "{{ansible_user}}"
  gather_facts: false
  tasks:
    - name: print debug msg for parameters
      debug:
        msg:
          - "servers is: {{servers}}"
          - "ansible_user is: {{ansible_user}}"
    - name: shut down the physical server
      shell: sudo shutdown -h now
      ignore_errors: true
      ignore_unreachable: true # this will work for shutdown cases
10. Shutdown job - write the Jenkins jobs that shut down the hosts
First write a common job that runs a given Ansible playbook
pipeline {
agent { node { label 'master' } }
stages {
stage('display parameters') {
steps {
echo "servers is ${servers}"
echo "ansible_user is ${ansible_user}"
echo "playbook_path is ${playbook_path}"
}
}
stage('run playbook'){
steps {
script {
sh "ansible-playbook -e \"servers=${servers} ansible_user=${ansible_user}\" -vv ${playbook_path}"
}
}
}
}
post {
failure {
emailext (
subject: "FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: """<p>FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]':</p>
<p>Check console output at "<a href="${env.BUILD_URL}">${env.JOB_NAME} [${env.BUILD_NUMBER}]</a>"</p>""",
to: "nvd11@163.com",
from: "nvd11@163.com"
)
}
}
}
Then write the shutdown job.
It has just two steps: first run the first playbook to shut down all the VMs, then shut down the physical machines
def shut_vm_plybk = "/opt/apps/playbooks/remoteserver/shutdown_all_vms.yml"
def shut_server_plybk = "/opt/apps/playbooks/remoteserver/shutdown_server.yml"
pipeline {
agent { node { label 'master' } }
stages {
stage ('display parameters') {
steps {
echo "servers is ${servers}"
echo "ansible_user is ${ansible_user}"
}
}
stage ('shutdown all vms in the servers') {
steps {
build job: 'common_run_playbook',
parameters: [[$class: 'StringParameterValue', name: 'servers', value: "${servers}"],
[$class: 'StringParameterValue', name: 'ansible_user', value: "${ansible_user}"],
[$class: 'StringParameterValue', name: 'playbook_path', value: "${shut_vm_plybk}"]
]
}
}
stage ('shutdown physical server') {
steps {
build job: 'common_run_playbook',
parameters: [[$class: 'StringParameterValue', name: 'servers', value: "${servers}"],
[$class: 'StringParameterValue', name: 'ansible_user', value: "${ansible_user}"],
[$class: 'StringParameterValue', name: 'playbook_path', value: "${shut_server_plybk}"]
]
}
}
}
post {
success {
emailext (
subject: "SUCCESSFUL: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: """<p>SUCCESSFUL: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]':</p>
<p>Check console output at "<a href="${env.BUILD_URL}">${env.JOB_NAME} [${env.BUILD_NUMBER}]</a>"</p>""",
to: "nvd11@163.com",
from: "nvd11@163.com"
)
}
failure {
emailext (
subject: "FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: """<p>FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]':</p>
<p>Check console output at "<a href="${env.BUILD_URL}">${env.JOB_NAME} [${env.BUILD_NUMBER}]</a>"</p>""",
to: "nvd11@163.com",
from: "nvd11@163.com"
)
}
}
}
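11. Power-on job - write the Jenkins script that remotely wakes the physical machines
Next comes the power-on job.
Before the remote wake-up the servers are all offline, so we cannot reach them with Ansible.
So this job has three steps:
- Use the ansible --list-hosts command to get the IP list of the physical_servers group defined in /etc/ansible/hosts
- Look up the MAC address for each IP (by defining a map of MAC addresses)
- Run fing.exe on Windows to wake each IP (in a loop)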
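As a rough sketch of these three steps outside Jenkins, the shell below parses text shaped like `ansible --list-hosts` output, looks each IP up in a small table, and prints the `MAC@IP/24` target string that fing is given. The function names are invented; the two MAC entries are the ones used in this job, and an IP without an entry is skipped rather than passed on.

```bash
#!/bin/bash
# parse_host_list: read `ansible <group> --list-hosts` output on stdin,
# drop the "hosts (N):" header and print one trimmed IP per line.
parse_host_list() {
    awk 'NF && $1 != "hosts" { print $1 }'
}

# mac_for_ip IP: print the MAC for a known IP, or fail for an unknown one.
mac_for_ip() {
    case "$1" in
        10.0.1.107) echo "00:e0:0a:f2:12:26" ;;
        10.0.1.122) echo "3c:97:0e:59:14:87" ;;
        *) return 1 ;;
    esac
}

# wol_targets: turn IPs on stdin into "MAC@IP/24" lines, skipping unknown IPs.
wol_targets() {
    while read -r wt_ip; do
        if wt_mac=$(mac_for_ip "$wt_ip"); then
            printf '%s@%s/24\n' "$wt_mac" "$wt_ip"
        else
            echo "no MAC configured for $wt_ip, skipping" >&2
        fi
    done
}

# Usage:
#   ansible physical_servers --list-hosts | parse_host_list | wol_targets
```

Each printed line can then be handed to `fing.exe --wol` on the Windows side.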
def playbook_path="/opt/apps/playbooks"
def ip_list_str = ""
def ip_list = []
def mac_map = ["10.0.1.107":"00:e0:0a:f2:12:26",
"10.0.1.122":"3c:97:0e:59:14:87",
]
def start_vm_plybk = "/opt/apps/playbooks/remoteserver/startup_all_vms.yml"
pipeline {
agent { node { label 'master' } }
stages {
stage('display parameters') {
steps {
echo "servers is ${servers}"
echo "ansible_user is ${ansible_user}"
}
}
stage('use ansible --list host command to get the ip list'){
steps {
script {
ip_list_str = sh(returnStdout: true, script: 'ansible physical_servers --list-hosts | grep -v hosts')
ip_list = ip_list_str.split('\n') as List
}
echo "ip_list_str = ${ip_list_str}"
println ip_list
}
}
stage('loop the ip list to start them'){
steps {
script {
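for(ip in ip_list){
ip = ip.trim()
echo "get a ip --> ${ip}"
def mac_addr = mac_map[ip]
if (mac_addr == null) { // no entry in mac_map for this ip, so it cannot be woken
echo "no mac address configured for ${ip}, skipping"
continue
}
echo "mac address --> ${mac_addr}"
sh " \"/mnt/d/Program Files (x86)/Fing/bin/fing.exe\" --wol ${mac_addr}@${ip}/24"
}
sleep(90) // the x230 boots slowly, so wait a minute and a half before running the VM start-up step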
}
}
}
...
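The fixed `sleep(90)` could be replaced by polling until each host answers. The sketch below retries a probe command until it succeeds or a timeout passes; `wait_until_ok` is a hypothetical helper, and the `nc -z` probe of port 22 in the usage comment assumes a netcat supporting `-z` and `-w` is installed in WSL2.

```bash
#!/bin/bash
# wait_until_ok TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds until it exits with status 0.
# Returns 0 on success, 1 if TIMEOUT seconds of waiting pass first.
# Only the time spent sleeping is counted, not the time COMMAND itself takes.
wait_until_ok() {
    wuo_timeout=$1
    wuo_interval=$2
    shift 2
    wuo_waited=0
    until "$@"; do
        if [ "$wuo_waited" -ge "$wuo_timeout" ]; then
            return 1
        fi
        sleep "$wuo_interval"
        wuo_waited=$((wuo_waited + wuo_interval))
    done
    return 0
}

# Usage: wait up to 3 minutes for SSH on the X230 to accept connections.
#   wait_until_ok 180 5 nc -z -w 2 10.0.1.122 22
```

A slow machine then only delays the job as long as it actually needs, and a host that never comes up is reported as a failure instead of being ignored.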
12. Power-on job - write a shell script and an Ansible playbook to start all KVM VMs on the hosts
Once the hosts are up, Ansible can be used again.
This step is very similar to step 7.
Shell script:
gateman@DESKTOP-UIU9RFJ:/opt/apps/playbooks/remoteserver$ cat startup_all_vms.sh
#!/bin/bash
echo "=== starting all kvm vms that contain the word vm ==="
for i in $(virsh list --all | grep vm | awk '{print $2}')
do virsh start $i
done
sleep 10
virsh list --all
playbook:
gateman@DESKTOP-UIU9RFJ:/opt/apps/playbooks/remoteserver$ cat startup_all_vms.yml
---
- hosts: "{{servers}}"
  remote_user: "{{ansible_user}}"
  gather_facts: false
  tasks:
    - name: print debug msg for parameters
      debug:
        msg:
          - "servers is: {{servers}}"
          - "ansible_user is: {{ansible_user}}"
    - name: copy startup vm shell script to remote server
      copy:
        src: ./startup_all_vms.sh
        dest: /tmp
        backup: no
        mode: "0775"
    - name: execute the startup vm script
      script: ./startup_all_vms.sh
      args:
        chdir: /tmp
      environment: # https://stackoverflow.com/questions/59522902/vms-are-not-visible-to-virsh-command-executed-using-ansible-shell-task
        LIBVIRT_DEFAULT_URI: qemu:///system
      register: cmdresult
    - name: show stdout cmdresult
      debug:
        msg: "{{ cmdresult.stdout }}"
    - name: show stderr cmdresult
      debug:
        msg: "{{ cmdresult.stderr }}"
13. Power-on job - finish the Jenkins script that remotely wakes the physical machines and starts their VMs
This simply completes the job from step 11
def playbook_path="/opt/apps/playbooks"
def ip_list_str = ""
def ip_list = []
def mac_map = ["10.0.1.107":"00:e0:0a:f2:12:26",
"10.0.1.122":"3c:97:0e:59:14:87",
]
def start_vm_plybk = "/opt/apps/playbooks/remoteserver/startup_all_vms.yml"
pipeline {
agent { node { label 'master' } }
stages {
stage('display parameters') {
steps {
echo "servers is ${servers}"
echo "ansible_user is ${ansible_user}"
}
}
stage('use ansible --list host command to get the ip list'){
steps {
script {
ip_list_str = sh(returnStdout: true, script: 'ansible physical_servers --list-hosts | grep -v hosts')
ip_list = ip_list_str.split('\n') as List
}
echo "ip_list_str = ${ip_list_str}"
println ip_list
}
}
stage('loop the ip list to start them'){
steps {
script {
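for(ip in ip_list){
ip = ip.trim()
echo "get a ip --> ${ip}"
def mac_addr = mac_map[ip]
if (mac_addr == null) { // no entry in mac_map for this ip, so it cannot be woken
echo "no mac address configured for ${ip}, skipping"
continue
}
echo "mac address --> ${mac_addr}"
sh " \"/mnt/d/Program Files (x86)/Fing/bin/fing.exe\" --wol ${mac_addr}@${ip}/24"
}
sleep(90) // wait for the slowest host to finish booting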
}
}
}
stage ('startup all vms in the servers') {
steps {
build job: 'common_run_playbook',
parameters: [[$class: 'StringParameterValue', name: 'servers', value: "${servers}"],
[$class: 'StringParameterValue', name: 'ansible_user', value: "${ansible_user}"],
[$class: 'StringParameterValue', name: 'playbook_path', value: "${start_vm_plybk}"]
]
}
}
stage('completed') {
steps {
println 'build is completed'
}
}
}
post {
success {
emailext (
subject: "SUCCESSFUL: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: """<p>SUCCESSFUL: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]':</p>
<p>Check console output at "<a href="${env.BUILD_URL}">${env.JOB_NAME} [${env.BUILD_NUMBER}]</a>"</p>""",
to: "nvd11@163.com",
from: "nvd11@163.com"
)
}
failure {
emailext (
subject: "FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: """<p>FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]':</p>
<p>Check console output at "<a href="${env.BUILD_URL}">${env.JOB_NAME} [${env.BUILD_NUMBER}]</a>"</p>""",
to: "nvd11@163.com",
from: "nvd11@163.com"
)
}
}
}
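That is everything we set out to do.
We now have two jobs for power-on and shutdown! To switch the machines on or off, just open the Jenkins web page from the computer in my room and run these two jobs; there is no need to walk to the rack and press the buttons by hand.
How can we quickly verify whether the physical machines and VMs have actually been shut down or started?
Two ways:
1. Use KVM's Virtual Machine Manager (virt-manager), but this KVM client has no Windows version, so that is out.
2. A more suitable way? Grafana + Prometheus, of course.
Reference: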
https://blog.csdn.net/nvd11/article/details/128030197