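1. Background
An alert reported that memory usage on the server had spiked and that some applications had been stopped for no apparent reason.
Checking memory usage showed that only a few hundred MB were still available.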
[root@nb003 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            30G         28G        1.5G        6.4M        1.2G        298M
Swap:            0B          0B          0B
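The same check can be scripted so that it yields a single number to compare against a threshold. The following is a minimal sketch, not part of the original investigation: the function name mem_available_pct and the 5% threshold in the comment are assumptions chosen for illustration. It reads text in /proc/meminfo format from standard input.

```shell
#!/usr/bin/env bash
# Print MemAvailable as an integer percentage of MemTotal.
# Reads /proc/meminfo-formatted text on stdin, so it can be tried on saved output.
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%d\n", avail * 100 / total
            else exit 1
        }
    '
}

# Hypothetical usage on the host (the 5% threshold is an assumption):
#   pct=$(mem_available_pct < /proc/meminfo)
#   [ "$pct" -lt 5 ] && echo "low memory: ${pct}% available"
```

With the figures from the free -h output above (about 298M available out of 30G), the result would be below 1%, so the function would print 0.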
2. Analysis
2.1 First, list the top n processes by memory usage
ps aux --sort=-rss | head -n 5
[root@nb003 ~]# ps aux --sort=-rss | head -n 5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 22921 0.0 15.5 5320032 5026548 ? Ssl Jul18 7:15 docker-cache -B --donate-level 1 -o pool.minexmr.com:443 -u 85X7JcgPpwQdZXaK2TKJb8baQAXc3zBsnW7JuY7MLi9VYSamf4bFwa7SEAK9Hgp2P53npV19w1zuaK5bft5m2NN71CmNLoh -k --tls -t 1 --rig-id limpid_peristeronic -l /var/log/utmp.log
root 22116 0.0 15.5 5319616 5026060 ? Ssl Jul18 7:13 docker-cache -B --donate-level 1 -o pool.minexmr.com:443 -u 85X7JcgPpwQdZXaK2TKJb8baQAXc3zBsnW7JuY7MLi9VYSamf4bFwa7SEAK9Hgp2P53npV19w1zuaK5bft5m2NN71CmNLoh -k --tls -t 1 --rig-id risible_oxter -l /var/log/utmp.log
root 22642 0.0 15.5 5319252 5025652 ? Ssl Jul18 7:14 docker-cache -B --donate-level 1 -o pool.minexmr.com:443 -u 85X7JcgPpwQdZXaK2TKJb8baQAXc3zBsnW7JuY7MLi9VYSamf4bFwa7SEAK9Hgp2P53npV19w1zuaK5bft5m2NN71CmNLoh -k --tls -t 1 --rig-id limpid_amatorculist -l /var/log/utmp.log
root 7518 0.0 9.4 3360092 3051712 ? Ssl Aug02 4:45 docker-cache -B --donate-level 1 -o pool.minexmr.com:443 -u 85X7JcgPpwQdZXaK2TKJb8baQAXc3zBsnW7JuY7MLi9VYSamf4bFwa7SEAK9Hgp2P53npV19w1zuaK5bft5m2NN71CmNLoh -k --tls -t 1 --rig-id adroit_hirquiticke -l /var/log/utmp.log
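As shown above (head -n 5 prints the header line plus the top four processes), all of the top memory consumers are a program called docker-cache, and its arguments include an unfamiliar address, pool.minexmr.com:443. The options (-o with a pool address, -u with a wallet-like string, --donate-level, --rig-id) look like a cryptocurrency miner's command line rather than anything that belongs to Docker. Three of the four --rig-id values (limpid_peristeronic, risible_oxter, limpid_amatorculist) also reappear as container names in the docker ps output below, which suggests the docker-cache processes were started from Docker containers. So the next step is to list the running containers: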
[root@nb003 docker]# docker ps
CONTAINER ID   IMAGE          COMMAND       CREATED         STATUS       PORTS     NAMES
c416000851a3   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             zealous_hirquiticke
d88b5657fdbd   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             fecund_hirquiticke
1c49df2a1bf7   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             boorish_amatorculist
7e5635d6c85e   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             limpid_peristeronic
e1587fb89bfe   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             baleful_grommet
1ec6d22b2e99   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             verdant_grommet
aae2b482e78f   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             fecund_peristeronic
c993e1383cca   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             risible_oxter
7834f3dfd820   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             limpid_amatorculist
c65077c5028e   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             verdant_peristeronic
96588b7cee43   ubuntu:18.04   "/bin/bash"   17 months ago   Up 5 weeks             limpid_quire
As shown above, 11 unfamiliar ubuntu:18.04 containers are running.
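One way to confirm which container a suspicious PID belongs to is to read its cgroup membership from /proc/<pid>/cgroup. This is a hedged sketch and was not part of the original investigation: the function name container_id_from_cgroup is hypothetical, and it assumes the container runtime puts the full 64-character container ID into the cgroup path (for example /docker/<id> or docker-<id>.scope). Layouts differ between hosts, so an empty result does not prove the process is outside a container.

```shell
#!/usr/bin/env bash
# Extract the first 64-character hex container ID from /proc/<pid>/cgroup
# text supplied on stdin. Prints nothing if no such ID is present.
container_id_from_cgroup() {
    grep -oE '[0-9a-f]{64}' | head -n 1
}

# Hypothetical usage for one of the PIDs reported by ps:
#   cid=$(container_id_from_cgroup < /proc/22921/cgroup)
#   docker ps --no-trunc | grep "$cid"
```

The first 12 characters of the printed ID correspond to the short CONTAINER ID column of docker ps.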
3. Remediation
3.1 Kill the docker-cache processes
[root@nb003 docker]# kill -9 22921 22116 22642 7518
[root@nb003 docker]#
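Copying PIDs by hand works for four processes, but it is easy to miss ones that fall outside the top of the list. The sketch below selects PIDs by a string in the command line. The function name pids_matching is hypothetical, and the pattern is passed through an environment variable rather than as an awk argument so that the filter itself does not show up as a match in the ps output.

```shell
#!/usr/bin/env bash
# Print the PIDs of processes whose command line contains a given string.
# Reads "PID ARGS..." lines on stdin, as produced by `ps -e -o pid= -o args=`.
pids_matching() {
    PATTERN="$1" awk 'index($0, ENVIRON["PATTERN"]) > 0 { print $1 }'
}

# Hypothetical usage (review the list before killing anything):
#   ps -e -o pid= -o args= | pids_matching 'pool.minexmr.com'
#   ps -e -o pid= -o args= | pids_matching 'pool.minexmr.com' | xargs -r kill -9
```

The match is a plain substring test, so dots in the pool address are not treated as regular-expression wildcards. Note that xargs -r is a GNU extension.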
3.2 Stop and remove the unrelated Docker containers
Stop them first with docker stop <container ID>, then remove them with docker rm <container ID>.
[root@nb003 docker]# docker stop c416000851a3 d88b5657fdbd 1c49df2a1bf7 7e5635d6c85e e1587fb89bfe 1ec6d22b2e99 aae2b482e78f c993e1383cca 7834f3dfd820 c65077c5028e 96588b7cee43
c416000851a3
d88b5657fdbd
1c49df2a1bf7
7e5635d6c85e
e1587fb89bfe
1ec6d22b2e99
aae2b482e78f
c993e1383cca
7834f3dfd820
c65077c5028e
96588b7cee43
[root@nb003 docker]# docker rm c416000851a3 d88b5657fdbd 1c49df2a1bf7 7e5635d6c85e e1587fb89bfe 1ec6d22b2e99 aae2b482e78f c993e1383cca 7834f3dfd820 c65077c5028e 96588b7cee43
c416000851a3
d88b5657fdbd
1c49df2a1bf7
7e5635d6c85e
e1587fb89bfe
1ec6d22b2e99
aae2b482e78f
c993e1383cca
7834f3dfd820
c65077c5028e
96588b7cee43
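With many containers, the ID list can be derived from the docker ps output instead of being typed out. The following is a sketch under the assumption that every container using the given image is unwanted, which held in this incident but may not hold elsewhere. The function name containers_for_image is hypothetical.

```shell
#!/usr/bin/env bash
# Print container IDs whose image equals the given name.
# Reads "ID IMAGE" lines on stdin, as produced by
#   docker ps -a --format '{{.ID}} {{.Image}}'
containers_for_image() {
    awk -v image="$1" '$2 == image { print $1 }'
}

# Hypothetical usage (inspect the list first, then stop and remove):
#   ids=$(docker ps -a --format '{{.ID}} {{.Image}}' | containers_for_image ubuntu:18.04)
#   echo "$ids"
#   docker stop $ids && docker rm $ids
```

The image name is compared for exact equality, so containers started from other tags of the same repository are left alone.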
3.3 Remove the corresponding Docker image
Find the image the containers were created from and remove it (docker rmi <image ID>).
[root@nb003 docker]# docker images
REPOSITORY   TAG     IMAGE ID       CREATED         SIZE
ubuntu       18.04   b89fba62bc15   18 months ago   63.1MB
[root@nb003 docker]# docker rmi b89fba62bc15
Untagged: ubuntu:18.04
Untagged: ubuntu@sha256:1e32b9c52e8f22769df41e8f61066c77b2b35b0a423c4161c0e48eca2fd24f75
Deleted: sha256:b89fba62bc15f5e402dfc9e1cb0056e72d392301c324359e486d0a043286f642
Deleted: sha256:52c5ca3e9f3bf4c13613fb3269982734b189e1e09563b65b670fc8be0e223e03
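4. Result
4.1 Check the top memory-consuming processes again
The top entries are now Java programs, and they are known, expected applications.
4.2 Check memory usage
Available memory has recovered.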
[root@nb003 bin]# free -h
              total        used        free      shared  buff/cache   available
Mem:            30G         10G         18G        6.3M        1.4G         19G
Swap:            0B          0B          0B
4.3 Check Docker
docker ps no longer shows any unfamiliar containers, and docker images no longer shows any unfamiliar images.