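I. The table_relation_vacuum function
1. Function definition
At the end of the previous post (https://blog.csdn.net/Hehuyi_In/article/details/128749517), we mentioned the table_relation_vacuum function (in tableam.h); this post picks up from there.
As noted there, both manual VACUUM and autovacuum end up in this function, which requires a level-4 lock (ShareUpdateExclusiveLock) on the table. It serves lazy vacuum only, so VACUUM FULL, CLUSTER, and ANALYZE do not go through it.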
static inline void
table_relation_vacuum(Relation rel, struct VacuumParams *params,
                      BufferAccessStrategy bstrategy)
{
    rel->rd_tableam->relation_vacuum(rel, params, bstrategy);
}
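There is a catch here: relation_vacuum is not an ordinary function but a function-pointer member of the table access method's callback struct (TableAmRoutine), so searching the source for a function body of that name finds nothing. For heap tables the pointer is assigned in heapam_handler.c; here we simply use gdb to trace where the call lands.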
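To make the indirection concrete, here is a minimal sketch of the same dispatch pattern in plain C. All Demo* and demo_* names are hypothetical stand-ins, not PostgreSQL's real types; the only point is that the wrapper calls whatever function the access method stored in the callback slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Relation and TableAmRoutine. */
struct DemoRelation;

typedef struct DemoTableAmRoutine
{
    /* Callback slot, playing the role of TableAmRoutine.relation_vacuum. */
    int (*relation_vacuum) (struct DemoRelation *rel);
} DemoTableAmRoutine;

typedef struct DemoRelation
{
    const DemoTableAmRoutine *rd_tableam;   /* the AM's callback table */
    int         vacuum_calls;               /* demo-only call counter */
} DemoRelation;

/* The "heap" implementation, playing the role of heap_vacuum_rel(). */
static int
demo_heap_vacuum_rel(DemoRelation *rel)
{
    rel->vacuum_calls++;
    return 1;                   /* 1 = handled by the heap implementation */
}

/* The heap AM's callback table: this is where the slot gets its value. */
static const DemoTableAmRoutine demo_heapam_methods = {
    .relation_vacuum = demo_heap_vacuum_rel,
};

/* Same shape as table_relation_vacuum(): an indirect call via the slot. */
static inline int
demo_table_relation_vacuum(DemoRelation *rel)
{
    return rel->rd_tableam->relation_vacuum(rel);
}
```

Because the wrapper never names the callee, reading the wrapper alone cannot tell you which function runs; you have to find where the callback table is filled in, or trace the call at run time as done next.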
2. Tracing the callee
b table_relation_vacuum
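The trace shows that the call actually lands in heap_vacuum_rel (in vacuumlazy.c). After quite a detour, we finally get to see what lazy vacuum really looks like. Before studying it, though, let's go over a few prerequisites so we don't get lost.
II. Background
1. The TransactionIdPrecedesOrEquals function
This is an old friend we studied before; it tells which of two transaction IDs is older. For how it works, see: https://blog.csdn.net/Hehuyi_In/article/details/102869893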
/*
 * TransactionIdPrecedesOrEquals --- is id1 logically <= id2?
 */
bool
TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2)
{
    int32       diff;

    if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
        return (id1 <= id2);

    diff = (int32) (id1 - id2);
    return (diff <= 0);
}
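The signed-difference trick is easier to see with numbers. Below is a small standalone re-implementation for experimenting; the Demo* names are hypothetical, and the assumption that normal XIDs start at 3 reflects PostgreSQL's convention of reserving the lowest IDs as special values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoXid;

/* Assumption: XIDs below 3 are special (invalid/bootstrap/frozen). */
#define DEMO_FIRST_NORMAL_XID ((DemoXid) 3)

static bool
demo_xid_is_normal(DemoXid xid)
{
    return xid >= DEMO_FIRST_NORMAL_XID;
}

/* Is id1 logically <= id2, taking 32-bit wraparound into account? */
static bool
demo_xid_precedes_or_equals(DemoXid id1, DemoXid id2)
{
    int32_t     diff;

    /* Special XIDs are compared as plain unsigned numbers. */
    if (!demo_xid_is_normal(id1) || !demo_xid_is_normal(id2))
        return id1 <= id2;

    /* Unsigned subtraction wraps; reinterpreting as signed gives the order. */
    diff = (int32_t) (id1 - id2);
    return diff <= 0;
}
```

For example, 100 precedes 200 because the difference is -100. More interestingly, 4294967000 also precedes 100: the unsigned subtraction wraps around to a value that reads as -396 when reinterpreted as signed, so an XID issued just before the counter wrapped is still treated as older than one issued just after.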
2. The LVRelState struct
LV stands for Lazy Vacuum; judging by the name, this struct holds the per-relation state of a vacuum run.
typedef struct LVRelState
{
/* Target heap relation and its indexes */
Relation rel;
Relation *indrels;
int nindexes;
/* Wraparound failsafe has been triggered? */
bool failsafe_active;
/* Consider index vacuuming bypass optimization? */
bool consider_bypass_optimization;
/* Doing index vacuuming, index cleanup, rel truncation? */
bool do_index_vacuuming;
bool do_index_cleanup;
bool do_rel_truncate;
/* Buffer access strategy and parallel state */
BufferAccessStrategy bstrategy;
LVParallelState *lps;
/* Statistics from pg_class when we start out */
BlockNumber old_rel_pages; /* previous value of pg_class.relpages */
double old_live_tuples; /* previous value of pg_class.reltuples */
/* rel's initial relfrozenxid and relminmxid */
TransactionId relfrozenxid;
MultiXactId relminmxid;
/* VACUUM operation's cutoff for pruning */
TransactionId OldestXmin;
/* VACUUM operation's cutoff for freezing XIDs and MultiXactIds */
TransactionId FreezeLimit;
MultiXactId MultiXactCutoff;
/* Error reporting state */
char *relnamespace;
char *relname;
char *indname;
BlockNumber blkno; /* used only for heap operations */
OffsetNumber offnum; /* used only for heap operations */
VacErrPhase phase;
/*
* State managed by lazy_scan_heap() follows
*/
LVDeadTuples *dead_tuples; /* items to vacuum from indexes */
BlockNumber rel_pages; /* total number of pages */
BlockNumber scanned_pages; /* number of pages we examined */
BlockNumber pinskipped_pages; /* # of pages skipped due to a pin */
BlockNumber frozenskipped_pages; /* # of frozen pages we skipped */
BlockNumber tupcount_pages; /* pages whose tuples we counted */
BlockNumber pages_removed; /* pages remove by truncation */
BlockNumber lpdead_item_pages; /* # pages with LP_DEAD items */
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
/* Statistics output by us, for table */
double new_rel_tuples; /* new estimated total # of tuples */
double new_live_tuples; /* new estimated total # of live tuples */
/* Statistics output by index AMs */
IndexBulkDeleteResult **indstats;
/* Instrumentation counters */
int num_index_scans;
int64 tuples_deleted; /* # deleted from table */
int64 lpdead_items; /* # deleted from indexes */
int64 new_dead_tuples; /* new estimated total # of dead items in
* table */
int64 num_tuples; /* total number of nonremovable tuples */
int64 live_tuples; /* live tuples (reltuples estimate) */
} LVRelState;
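III. The heap_vacuum_rel function
According to its header comment, this function vacuums a single heap table, cleans out its indexes, and updates the relpages and reltuples statistics. By the time it is entered, the transaction has already been started and the level-4 lock on the table has been acquired.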
/*
* heap_vacuum_rel() -- perform VACUUM for one heap relation
*
* This routine vacuums a single heap, cleans out its indexes, and
* updates its relpages and reltuples statistics.
*
* At entry, we have already established a transaction and opened
* and locked the relation.
*/
void
heap_vacuum_rel(Relation rel, VacuumParams *params,
BufferAccessStrategy bstrategy)
{
LVRelState *vacrel;
PGRUsage ru0;
TimestampTz starttime = 0;
WalUsage walusage_start = pgWalUsage;
WalUsage walusage = {0, 0, 0};
long secs;
int usecs;
double read_rate,
write_rate;
bool aggressive; /* should we scan all unfrozen pages? */
bool scanned_all_unfrozen; /* actually scanned all such pages? */
char **indnames = NULL;
TransactionId xidFullScanLimit;
MultiXactId mxactFullScanLimit;
BlockNumber new_rel_pages;
BlockNumber new_rel_allvisible;
double new_live_tuples;
TransactionId new_frozen_xid;
MultiXactId new_min_multi;
ErrorContextCallback errcallback;
PgStat_Counter startreadtime = 0;
PgStat_Counter startwritetime = 0;
TransactionId OldestXmin;
TransactionId FreezeLimit;
MultiXactId MultiXactCutoff;
…
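First, the freeze-related input parameters are used to compute the various cutoff values (every argument passed with & is an output), which then decide whether this run uses aggressive mode. Per the comment above, aggressive = true means all unfrozen pages must be scanned. For background on freezing, see the earlier post "postgresql_internals-14 study notes (3): freezing and rebuild" on Hehuyi_In's CSDN blog.
/* Compute the cutoff values from the freeze parameters (all & arguments are outputs); they decide below whether this VACUUM is aggressive */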
vacuum_set_xid_limits(rel,
params->freeze_min_age,
params->freeze_table_age,
params->multixact_freeze_min_age,
params->multixact_freeze_table_age,
&OldestXmin, &FreezeLimit, &xidFullScanLimit,
&MultiXactCutoff, &mxactFullScanLimit);
/*
* If the table's relfrozenxid <= xidFullScanLimit (the next XID to be assigned minus vacuum_freeze_table_age), an aggressive scan is triggered; MultiXact IDs are handled the same way. Setting DISABLE_PAGE_SKIPPING also forces an aggressive scan.
*/
aggressive = TransactionIdPrecedesOrEquals(rel->rd_rel->relfrozenxid,
xidFullScanLimit);
aggressive |= MultiXactIdPrecedesOrEquals(rel->rd_rel->relminmxid,
mxactFullScanLimit);
if (params->options & VACOPT_DISABLE_PAGE_SKIPPING)
aggressive = true;
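The decision above can be tried out with concrete numbers using the sketch below. It is deliberately simplified and rests on assumptions: the limit is computed as "next XID minus the table-age setting" and kept out of the special-XID range, all names are hypothetical, and the real vacuum_set_xid_limits does more (for example, it also produces the MultiXact limits, which are omitted here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoXid;

/* Assumption: XIDs below 3 are special and must not be used as a limit. */
#define DEMO_FIRST_NORMAL_XID ((DemoXid) 3)

/* Wraparound-aware "id1 <= id2"; this short form is for normal XIDs only. */
static bool
demo_xid_precedes_or_equals(DemoXid id1, DemoXid id2)
{
    return (int32_t) (id1 - id2) <= 0;
}

/* Simplified limit: next XID minus the table-age setting. */
static DemoXid
demo_full_scan_limit(DemoXid next_xid, uint32_t freeze_table_age)
{
    DemoXid     limit = next_xid - freeze_table_age;

    if (limit < DEMO_FIRST_NORMAL_XID)
        limit = DEMO_FIRST_NORMAL_XID;
    return limit;
}

/* Aggressive if relfrozenxid is at or behind the limit, or if forced. */
static bool
demo_is_aggressive(DemoXid relfrozenxid, DemoXid next_xid,
                   uint32_t freeze_table_age, bool disable_page_skipping)
{
    DemoXid     limit = demo_full_scan_limit(next_xid, freeze_table_age);

    return disable_page_skipping ||
        demo_xid_precedes_or_equals(relfrozenxid, limit);
}
```

With a next XID of 200,000,000 and a table-age setting of 150,000,000, the limit is 50,000,000: a table whose relfrozenxid is 40,000,000 gets an aggressive scan, while one at 60,000,000 does not unless page skipping is disabled.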
Next, vacrel is allocated and its fields are initialized according to the options; see the LVRelState definition above for the meaning of each field.
vacrel = (LVRelState *) palloc0(sizeof(LVRelState));
/* Set up high level stuff about rel */
vacrel->rel = rel;
/* Open the table's indexes; returns their count and an array of the opened index relations */
vac_open_indexes(vacrel->rel, RowExclusiveLock, &vacrel->nindexes,
&vacrel->indrels);
vacrel->failsafe_active = false;
vacrel->consider_bypass_optimization = true;
/*
* The index_cleanup param either disables index vacuuming and cleanup or
* forces it to go ahead when we would otherwise apply the index bypass
* optimization. The default is 'auto', which leaves the final decision
* up to lazy_vacuum().
*
* The truncate param allows user to avoid attempting relation truncation,
* though it can't force truncation to happen.
*/
Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
params->truncate != VACOPTVALUE_AUTO);
vacrel->do_index_vacuuming = true;
vacrel->do_index_cleanup = true;
vacrel->do_rel_truncate = (params->truncate != VACOPTVALUE_DISABLED);
if (params->index_cleanup == VACOPTVALUE_DISABLED)
{
/* Force disable index vacuuming up-front */
vacrel->do_index_vacuuming = false;
vacrel->do_index_cleanup = false;
}
else if (params->index_cleanup == VACOPTVALUE_ENABLED)
{
/* Force index vacuuming. Note that failsafe can still bypass. */
vacrel->consider_bypass_optimization = false;
}
else
{
/* Default/auto, make all decisions dynamically */
Assert(params->index_cleanup == VACOPTVALUE_AUTO);
}
vacrel->bstrategy = bstrategy;
vacrel->old_rel_pages = rel->rd_rel->relpages;
vacrel->old_live_tuples = rel->rd_rel->reltuples;
vacrel->relfrozenxid = rel->rd_rel->relfrozenxid;
vacrel->relminmxid = rel->rd_rel->relminmxid;
/* Set cutoffs for entire VACUUM */
vacrel->OldestXmin = OldestXmin;
vacrel->FreezeLimit = FreezeLimit;
vacrel->MultiXactCutoff = MultiXactCutoff;
vacrel->relnamespace = get_namespace_name(RelationGetNamespace(rel));
vacrel->relname = pstrdup(RelationGetRelationName(rel));
vacrel->indname = NULL;
vacrel->phase = VACUUM_ERRCB_PHASE_UNKNOWN;
…
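The way the index_cleanup and truncate parameters map onto the four boolean flags can be summarized in a small standalone function. This is only a sketch of the branching shown above; the DemoOptValue enum and DemoIndexFlags struct are hypothetical mirrors of the real types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the option values used in the excerpt above. */
typedef enum DemoOptValue
{
    DEMO_OPT_AUTO,
    DEMO_OPT_DISABLED,
    DEMO_OPT_ENABLED
} DemoOptValue;

typedef struct DemoIndexFlags
{
    bool        consider_bypass_optimization;
    bool        do_index_vacuuming;
    bool        do_index_cleanup;
    bool        do_rel_truncate;
} DemoIndexFlags;

/*
 * Resolve the flags. truncate is expected to be ENABLED or DISABLED here,
 * matching the Assert in the excerpt.
 */
static DemoIndexFlags
demo_resolve_flags(DemoOptValue index_cleanup, DemoOptValue truncate)
{
    /* Defaults: everything on; truncation on unless explicitly disabled. */
    DemoIndexFlags f = {true, true, true, truncate != DEMO_OPT_DISABLED};

    if (index_cleanup == DEMO_OPT_DISABLED)
    {
        /* Index work is switched off up front. */
        f.do_index_vacuuming = false;
        f.do_index_cleanup = false;
    }
    else if (index_cleanup == DEMO_OPT_ENABLED)
    {
        /* Index work is forced: the bypass optimization is not considered. */
        f.consider_bypass_optimization = false;
    }
    /* AUTO: keep the defaults and let later code decide dynamically. */
    return f;
}
```

Note the asymmetry: DISABLED turns two flags off, ENABLED only removes the bypass optimization from consideration, and AUTO changes nothing at this point.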
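lazy_scan_heap is the core function of lazy vacuum. Roughly, it first scans the table (consulting the visibility map, i.e. the vm file) to find dead tuples and pages with free space, then computes the number of live tuples, and finally cleans up the table and its indexes.
/* Do the vacuuming (the core function) */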
lazy_scan_heap(vacrel, params, aggressive);
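After lazy_scan_heap returns, the remaining steps are:
- Close the table's indexes
- Compute whether all unfrozen pages were actually scanned (which aggressive mode guarantees) and set scanned_all_unfrozen accordingly
- Call lazy_truncate_heap to truncate empty pages at the end of the file (optional); this space is released back to the operating system. Note that this function briefly takes a level-8 lock via ConditionalLockRelation(vacrel->rel, AccessExclusiveLock), which may affect the application workload
- Update the statistics in pg_class
- Free the index statistics and index names held in vacrel
/* Done with indexes, close them */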
vac_close_indexes(vacrel->nindexes, vacrel->indrels, NoLock);
/*
* Compute whether we actually scanned the all unfrozen pages. If we did,
* we can adjust relfrozenxid and relminmxid.
*
* NB: We need to check this before truncating the relation, because that
* will change ->rel_pages.
*/
if ((vacrel->scanned_pages + vacrel->frozenskipped_pages)
< vacrel->rel_pages)
{
Assert(!aggressive);
scanned_all_unfrozen = false;
}
else
scanned_all_unfrozen = true;
/*
* Optionally truncate the relation, i.e. try to cut empty pages off the end of the file.
*/
if (should_attempt_truncation(vacrel))
{
/*
* Update error traceback information. This is the last phase during
* which we add context information to errors, so we don't need to
* revert to the previous phase.
*/
update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_TRUNCATE,
vacrel->nonempty_pages,
InvalidOffsetNumber);
lazy_truncate_heap(vacrel);
}
/*
* Update statistics in pg_class.
*/
new_rel_pages = vacrel->rel_pages;
new_live_tuples = vacrel->new_live_tuples;
visibilitymap_count(rel, &new_rel_allvisible, NULL);
if (new_rel_allvisible > new_rel_pages)
new_rel_allvisible = new_rel_pages;
new_frozen_xid = scanned_all_unfrozen ? FreezeLimit : InvalidTransactionId;
new_min_multi = scanned_all_unfrozen ? MultiXactCutoff : InvalidMultiXactId;
vac_update_relstats(rel,
new_rel_pages,
new_live_tuples,
new_rel_allvisible,
vacrel->nindexes > 0,
new_frozen_xid,
new_min_multi,
false);
…
/* Cleanup index statistics and index names */
for (int i = 0; i < vacrel->nindexes; i++)
{
if (vacrel->indstats[i])
pfree(vacrel->indstats[i]);
if (indnames && indnames[i])
pfree(indnames[i]);
}
}
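The bookkeeping at the end of the function (deciding whether relfrozenxid may advance, and clamping the all-visible page count) comes down to three small rules. The sketch below restates them as standalone functions with made-up names; treating 0 as the invalid XID is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoBlockNumber;
typedef uint32_t DemoXid;

/* Assumption: 0 stands for "no valid XID" in this sketch. */
#define DEMO_INVALID_XID ((DemoXid) 0)

/* Every page was either scanned or skipped because it was all-frozen. */
static bool
demo_scanned_all_unfrozen(DemoBlockNumber scanned_pages,
                          DemoBlockNumber frozenskipped_pages,
                          DemoBlockNumber rel_pages)
{
    return (scanned_pages + frozenskipped_pages) >= rel_pages;
}

/* Only a complete pass may hand a new relfrozenxid to the stats update. */
static DemoXid
demo_new_frozen_xid(bool scanned_all_unfrozen, DemoXid freeze_limit)
{
    return scanned_all_unfrozen ? freeze_limit : DEMO_INVALID_XID;
}

/* The all-visible count is never allowed to exceed the page count. */
static DemoBlockNumber
demo_clamp_allvisible(DemoBlockNumber allvisible, DemoBlockNumber rel_pages)
{
    return allvisible > rel_pages ? rel_pages : allvisible;
}
```

For a 100-page table with 70 pages scanned and 20 all-frozen pages skipped, 10 pages were skipped for other reasons, so the pass is incomplete and no new frozen XID is passed on. With 80 scanned and 20 frozen-skipped, the pass is complete and FreezeLimit is passed on as the new value.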
As you can see, we have fallen into yet another rabbit hole, lazy_scan_heap; the next post will dig into that function.
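References:
PostgreSQL数据库内核分析 (book; the title translates as "PostgreSQL Database Kernel Analysis")
"PostgreSQL Source Code Walkthrough (128): MVCC #12 (the vacuum process, heap_vacuum_rel function)", ITPUB blog
http://blog.itpub.net/6906/viewspace-2564641/
"How PostgreSQL Freezing Is Implemented", 13446560's tech blog, 51CTO
https://www.pudn.com/news/6277722b517cd20ea491bf39.html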