Dirty pages, swappiness, and finding the processes that use swap

news2024/11/18 11:24:46

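In short, the whole article comes down to this: do not over-allocate memory that will not be used. Dirty blocks are not written to swap, but they still occupy physical memory, waste space, and can trigger a lot of unnecessary swapping activity (pages rarely end up in swap, but even deciding whether they should takes time).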

To verify which PIDs are using the swap area, the command below can be used:

for file in /proc/*/status ; do awk '/VmSwap|Name/{printf "%s %s", $2, $3}END{ print ""}' "$file"; done | sort -k 2 -n -r

This gives the same result as pressing f and then s in top.

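Some third-party web pages also claim that setting vm.swappiness to 10 means swap is used only once 90% of available memory has been allocated. This is not correct, because vm.swappiness does not work that way (it weights stealing pages from the page cache against swapping out because of memory shortage). Likewise, setting vm.swappiness to 100 does not mean swap is used immediately after boot.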

Applies to:

Linux OS - Version Oracle Linux 5.1 to Oracle Linux 9.0 [Release OL5U1 to OL9]
Oracle Cloud Infrastructure - Version N/A and later
Linux x86
Linux x86-64

Goal

This document explains dirty pages in Oracle Linux.

Solution

Whenever an application or database process needs to bring a virtual memory page into physical memory but no free physical pages are left, the OS must evict some older pages.
If the old page was never modified, it does not need to be saved: it can simply be read again from the data file.
If the old page has been modified, it must be preserved somewhere so the application/database can use it again later. Such a page is called a dirty page (comparable to a dirty block in a database).
The OS stores such dirty pages in swap files, so they can be removed from physical memory and a 'new' page can take their place. [Note: why not write them straight to the file? The OS has its own notion of commit as well.]
If a lot of data is moved from the page cache to the dirty page area (that is, swapped out), this can cause a significant IO bottleneck when the swap device sits on a local disk (sda), and further problems when that same local disk also holds the root (OS) file system. [Note: in most setups one disk serves both the OS and swap.]

Page cache in Linux is simply a disk cache; it improves OS performance under intensive reads and writes on files.
Further details can be found in the KM note:

How to Check Whether a System is Under Memory Pressure (Doc ID 1502301.1)

Dirty pages are a by-product of the page cache, as explained in the example above.
Dirty pages also appear whenever an application writes to or creates a file: the first write happens in the page cache, which is why creating a 100 MB file can be really fast, because the pages are allocated in memory:

# dd if=/dev/zero of=testfile.txt bs=1M count=100
10+0 records in
10+0 records out
10485760 bytes (100 MB) copied, 0,1121043 s, 866 MB/s

That is because the file is created in memory rather than on the actual disk, hence the very fast response time.
The OS records this in /proc/meminfo, in the 'Dirty' row.

Before the command above is executed, note down the 'Dirty' row of /proc/meminfo:

# more /proc/meminfo | grep -i dirty
Dirty: 96 kB

After command is executed:

# more /proc/meminfo | grep -i dirty
Dirty: 102516 kB

Periodically the OS or the application/database initiates a sync, which writes the actual testfile.txt to disk (an application can also issue the sync itself):

# more /proc/meminfo | grep -i dirty
Dirty: 76 kB
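
The before/after check above can be wrapped in a small helper. This is only a sketch: the function names meminfo_kb and show_dirty are made up for illustration, and the optional file argument is an assumption added so the helper can be tried on a saved copy of /proc/meminfo. It also prints the Writeback row, which shows data currently being written out.

```shell
#!/bin/sh
# Print one /proc/meminfo field in kB. The optional second argument (a file
# path) is an illustrative addition for reading a saved copy.
meminfo_kb() {
    awk -v f="$1:" '$1 == f { print $2 }' "${2:-/proc/meminfo}"
}

# Print the Dirty and Writeback rows on one line.
show_dirty() {
    printf 'Dirty=%s kB Writeback=%s kB\n' \
        "$(meminfo_kb Dirty "${1:-/proc/meminfo}")" \
        "$(meminfo_kb Writeback "${1:-/proc/meminfo}")"
}

# Only touch the live file when it exists (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    show_dirty
fi
```

Running it once after the dd command, then running sync, then running it again should show the Dirty value falling back, as in the listings above.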

Oracle Database, for example, does not allow such writes to stay only in memory, because if the OS crashes or a SAN LUN fails, data would be compromised.
That is why Oracle Database requires data to be 'in sync': every write must be confirmed by the backend (disk/LUN) before the database issues more write requests.

Normally databases/applications drop the cache periodically, so dirty pages are written to disk in small chunks. [Note: dropping the cache triggers the sync automatically.]
In some cases dirty pages can grow in size, for instance when the application/database has not configured its page cache mechanism properly. [Note: the OS and the application manage the page cache together.]

So dirty pages can be written to swap files (swap area) but also to a dedicated region on disk (LUN/file system). [Note: they can go to swap or directly to the file.]
If, for example, we create more than 100 MB of swapped-out data that is later read back from the swap file, we may cause unnecessary IO on the swap device.
Enterprise systems place swap files and the swap area on solid state drives (SSD) or a dedicated LUN, so local disk performance is not impacted (normally the swap region is created on the local disk). [Note: swap should get a dedicated disk as well.]
In some cases the application/database has internal issues: dirty pages are written out to swap but never reused. The swap area then keeps growing, causing unnecessary IO on the local disk and large swap usage under the OS.


To find out at what stage the OS tries to flush dirty pages back to the disk layer, check the official kernel documentation on virtual memory and look for settings such as:

vm.dirty_background_ratio
vm.dirty_ratio
vm.swappiness

and

dirty_background_ratio
dirty_ratio
dirty_background_bytes
dirty_expire_centisecs

The settings above need to be tuned per database/application requirement; the OS has no 'best practice' values for them, and they depend on the DB/APP load and configuration.
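
Before tuning anything, it helps to see the values currently in effect. The sketch below reads the tunables named above from /proc/sys/vm; the function name show_vm_tunables and its optional directory argument are illustrative assumptions (the argument only exists so the function can be pointed at a copy).

```shell
#!/bin/sh
# Print the current value of each writeback/swap tunable listed above.
# The directory defaults to /proc/sys/vm; the override is illustrative only.
show_vm_tunables() {
    sysctl_dir=${1:-/proc/sys/vm}
    for name in dirty_background_ratio dirty_ratio dirty_background_bytes \
                dirty_expire_centisecs swappiness; do
        if [ -r "$sysctl_dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$sysctl_dir/$name")"
        else
            printf 'vm.%s = (not available)\n' "$name"
        fi
    done
}

show_vm_tunables
```

The same values can be read one at a time with sysctl, for example `sysctl vm.swappiness`.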

Because the OS tends to keep everything in the page cache, whenever an application/database demands free pages in physical memory the OS has to reallocate some pages and mark them as dirty. [Note: newly allocated memory pages are all dirty; a page that was never modified does not need to be kept, as stated earlier.]
This process works fine if the application/database side is properly tuned and scaled. Otherwise it leads to really aggressive swapping, because the OS has to write all dirty pages out to the swap disk; this can be controlled via the vm.swappiness setting.
If the application/database swaps aggressively, it can cause heavy IO writes on the swap device and lead to serious system stalls. Always make sure applications/databases are properly configured in terms of memory management.

As explained, not all pages are marked as dirty: unused pages are mostly discarded rather than marked as dirty (it all depends on whether the already-allocated pages were modified or not).

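[Note: the point of all of the above is: do not over-allocate memory that will not be used. Dirty blocks are not written to swap, but they take up space and cause unnecessary swapping.]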

To verify which PIDs are using the swap area, the command below can be used:

for file in /proc/*/status ; do awk '/VmSwap|Name/{printf "%s %s", $2, $3}END{ print ""}' "$file"; done | sort -k 2 -n -r

This gives the same result as pressing f and then s in top.
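
The one-liner prints the process name and swap size but not the PID itself. A variant that adds the PID and skips processes with nothing in swap might look like the sketch below. The function name swap_by_pid and the optional root-directory argument are assumptions made for illustration; only the first word of a process name is printed.

```shell
#!/bin/sh
# List "PID NAME VmSwap_kB" for every process with pages in swap, largest first.
# proc_root defaults to /proc; the argument is only for trying this on a copy.
swap_by_pid() {
    proc_root=${1:-/proc}
    for status in "$proc_root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Kernel threads have no VmSwap row, so they are skipped by the END test.
        awk '
            $1 == "Name:"   { name = $2 }
            $1 == "Pid:"    { pid = $2 }
            $1 == "VmSwap:" { swap = $2 }
            END { if (swap + 0 > 0) print pid, name, swap }
        ' "$status" 2>/dev/null
    done | sort -k 3 -n -r
}

# Show the 20 largest swap consumers.
swap_by_pid | head -n 20
```

Processes can exit while the loop runs, which is why unreadable status files are skipped and awk errors are discarded.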

Options for releasing 'consumed' swap space are really limited. Normally, if a PID exits properly or is simply shut down, its swap space is reclaimed; but a PID that is killed or ends abnormally (a segfault, for example) may still leave swap space consumed. Another option is to reboot, since running swapoff and swapon can cause serious issues or even lead to a system panic. [Note: a process that exits normally releases its swap; one that ends abnormally may not, even after being killed.]

To understand why swap is still consumed even if swappiness is set to 0, refer to the KM note listed below.

References

NOTE:2328563.1 - Oracle Linux: Setting vm.swappiness=0 Does Not Completely Disable Swap Usage

---------------------------Setting vm.swappiness=0 Does Not Completely Disable Swap Usage

Why does adding vm.swappiness=0 to /etc/sysctl.conf not completely disable swap usage?

Solution

Explanation on vm.swappiness setting from kernel documentation:

Swappiness

This control is used to define how aggressive the kernel will swap
memory pages. Higher values will increase agressiveness, lower values
decrease the amount of swap. A value of 0 instructs the kernel not to
initiate swap until the amount of free and file-backed pages is less
than the high water mark in a zone.

So if swap is present, it will be used when needed. vm.swappiness=0 discourages the kernel from using it but does not prevent it.

The example below from the top command under OL7 might be confusing:

top - 12:21:27 up 2 days, 16:57, 4 users, load average: 1.62, 1.95, 2.13
Tasks: 539 total, 1 running, 538 sleeping, 0 stopped, 0 zombie
%Cpu(s): 13.8 us, 3.7 sy, 0.0 ni, 79.2 id, 3.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 36383529+total, 1715556 free, 32909801+used, 33021720 buff/cache
KiB Swap: 5388604 total, 103444 free, 4185160 used. 33684252 avail Mem

Available memory is shown as about 30 GB, but it is best to verify MemAvailable in /proc/meminfo:

cat /proc/meminfo | grep MemAvailable

MemAvailable: 2440244 kB

What is MemAvailable:

An estimate of how much memory is available for starting new applications, without
swapping.
Calculated from MemFree, SReclaimable, the size of the file LRU lists, and the
low watermarks in each zone.

The estimate takes into account that the system needs some page cache to function well,
and that not all reclaimable slab will be reclaimable, due to items being in use.
The impact of those factors will vary from system to system.

Hence even if the system shows 30 GB as free, the actual MemAvailable is quite low, which may cause higher swap usage even if swappiness is set to 0.
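
To put the MemAvailable figure in proportion, it can be expressed as a share of MemTotal. The following is a minimal sketch; the helper names meminfo_kb and mem_available_pct and the optional file argument are illustrative assumptions, not part of any Oracle or kernel tooling.

```shell
#!/bin/sh
# Print one /proc/meminfo field in kB. The optional second argument (a file
# path) is an illustrative addition for reading a saved copy.
meminfo_kb() {
    awk -v f="$1:" '$1 == f { print $2 }' "${2:-/proc/meminfo}"
}

# Print MemAvailable as an integer percentage of MemTotal.
mem_available_pct() {
    total=$(meminfo_kb MemTotal "${1:-/proc/meminfo}")
    avail=$(meminfo_kb MemAvailable "${1:-/proc/meminfo}")
    # MemAvailable is missing on old kernels; fail instead of guessing.
    [ -n "$total" ] && [ -n "$avail" ] && [ "$total" -gt 0 ] || return 1
    echo $(( avail * 100 / total ))
}

if pct=$(mem_available_pct 2>/dev/null); then
    echo "MemAvailable is ${pct}% of MemTotal"
fi
```

A low percentage alongside a large 'free' or 'avail' figure elsewhere is the situation the note describes.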

Another explanation for swap usage may be related to dirty pages (KM note 2304722.1) or to missing database tuning, which is explained in 1295478.1.

Some third-party web pages also claim that setting vm.swappiness to 10 makes swap space utilized only once 90% of free memory is allocated. This is not true, because vm.swappiness does not work like this (it is the ratio of stealing pages from the cache versus swapping due to insufficient memory). Similarly, setting vm.swappiness to 100 does not mean swap will be used immediately after boot.

[Note: the default setting is 60, so swap starts to be considered once memory usage reaches 40%. That is correct, surely! Yet the note says those pages are wrong.]

The official statement on vm.swappiness can be found in the official kernel memory documentation.

As a side note, swap usage can be lowered by enabling HugePages, but this mostly applies to Oracle Database cases, since enabling HugePages does not stop the system or application layer from swapping. [Note: HugePages only covers the SGA; swapping of the PGA cannot be avoided this way.]
Reference KM note:

HugePages on Oracle Linux 64-bit (Doc ID 361468.1)

And statement:

"The HugePages configuration described in this document does not cause the O/S components to use HugePages. HugePages will be used by applications which explicitly make use of HugePages in their code (like majority of Oracle RDBMS SGA given proper configuration). Therefore, you will still see swap usage on the system as the regular O/S components, or non-HugePages-aware applications use swappable pages."

Also, running the well-known command found on third-party websites, e.g.:

# swapoff -a && swapon -a

is neither supported nor recommended. If the system already struggles with memory and swap, customers should validate their configuration settings or properly tune the APP/DB side:

How to Calculate Memory Usage on Linux (Doc ID 1630754.1)

rather than disabling the swap device and dumping out the pages allocated on it. [Note: are they dumped back into memory, or to a file?]

Hence unexpected results from the commands above will not be debugged by Oracle Linux support in case of issues (while the swap device is being dumped out, many services/PIDs may still rely on what was put there, leading to uncontrolled results).

------

Purpose

This document briefly discusses Linux swapping and its nature, with references to database workloads.

Scope

This document is useful for Linux and database administrators for configuring, evaluating and monitoring systems.

Details

Linux OS is a virtual memory system like any other modern operating system. The Virtual Memory Management system of Linux includes:

  • Paging
  • Swapping
  • HugePages
  • Slab allocator
  • Shared memory

When almost all of the available physical memory (RAM) is in use, the Linux kernel starts to swap pages out to swap (disk space), or worse, it may start swapping out entire processes. Another scenario is that it starts killing processes using the Out-of-Memory (OOM) Killer (see Document 452000.1). [Note: it is process memory that goes to swap, not page cache pages.]

Swap Usage on Linux

To check swap usage on Linux, use one of the following:

  • free: Look for low (or zero) values for Swap / used:

# free -m
             total       used       free     shared    buffers     cached
Mem:          4018       3144        873          0         66       2335
-/+ buffers/cache:        742       3276
Swap:         4690          0       4690

  • meminfo: Look for SwapTotal = SwapFree:

# grep Swap /proc/meminfo
SwapCached:            0 kB
SwapTotal:       4803392 kB
SwapFree:        4803380 kB

  • top: Look for low (preferably zero) values of Swap / used:

# top

...
Mem:   4115320k total,  3219408k used,   895912k free,    68260k buffers
Swap:  4803392k total,       12k used,  4803380k free,  2390804k cached
...

  • vmstat: Look for si / so values to be zero:

# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0     12 871592  69308 2405188    0    0   103    36  275  500 14 13 71  1
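
The meminfo check above can be scripted by subtracting SwapFree from SwapTotal. This is a sketch only; the function name swap_used_kb and its optional file argument are illustrative assumptions.

```shell
#!/bin/sh
# Print used swap in kB, computed as SwapTotal - SwapFree.
# The optional argument (a file path) is only for reading a saved copy.
swap_used_kb() {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    used=$(swap_used_kb)
    if [ "$used" -gt 0 ]; then
        echo "swap in use: ${used} kB"
    else
        echo "no swap in use"
    fi
fi
```

With the sample values shown above (SwapTotal 4803392 kB, SwapFree 4803380 kB) the result is 12 kB, matching the 12k reported by top. For vmstat, note that the first report gives averages since boot; running `vmstat 1 5` prints fresh one-second samples of si and so.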

Why is Swapping Bad

On Linux especially, try to avoid swapping because:

  • Swapping potentially makes every memory page access a thousand (or more) times slower (and the Linux swapping mechanism is not particularly fast).
  • As more memory is swapped, more operations take longer
  • As operations take longer, more requests queue up to be served
  • The demand for resources increases exponentially

Because of the scenario above, if a memory-bound application (like a database) is running and swapping starts, most of the time there is no recovery. [Note: this holds for the SGA.]


The Oracle Database SGA pages are pageable on Linux by default, and those pages can potentially be swapped out if the system runs out of memory. Using HugePages is one way to keep the Oracle SGA from being swapped out at all; still, one needs to be careful with the configuration. To learn all about HugePages, read Document 361323.1 and its references.
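
Whether a HugePages pool is configured, and how much of it is in use, can be read from /proc/meminfo. The sketch below is illustrative: the function name hugepages_summary, the output labels, and the optional file argument are assumptions, and it does not check whether the SGA actually fits in the pool.

```shell
#!/bin/sh
# Summarize the HugePages pool from /proc/meminfo.
# The optional argument (a file path) is only for reading a saved copy.
hugepages_summary() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "HugePages_Free:"  { free = $2 }
        $1 == "Hugepagesize:"    { size_kb = $2 }
        END {
            # pool_MB is the memory set aside for the whole pool.
            printf "total=%d free=%d in_use=%d page_kB=%d pool_MB=%d\n",
                   total, free, total - free, size_kb, total * size_kb / 1024
        }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    hugepages_summary
fi
```

A total of 0 means no HugePages pool is configured, so the SGA would be using ordinary swappable pages.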

Conclusions

  • Make sure the total SGA and PGA fit in RAM, leaving a decent amount of memory for process spaces and system services. See the database installation guides for more information
  • Consider using HugePages on Linux
  • Be very careful with memory configuration (HugePages, Automatic Memory Management, Swap, VLM)
  • Monitor OS continuously for memory usage and swapping
     


