postgres 源码解析43 元组的插入流程详解 heap_insert

news2025/1/18 5:43:44

本文讲解postgres中元组的插入流程,深入了解其实现原理。同时此过程涉及元组xmin/xmax与标识位的设置细节,与事务的可见性部分密切相关相关,借此复习一下。

heappage结构

在这里插入图片描述

执行流程框架图

在这里插入图片描述

heap_prepare_insert

该函数执行内容较为简单,主要是设置HeapTuple结构体的头信息,包括标识位和xmin/xmax

/*
 * Subroutine for heap_insert(). Prepares a tuple for insertion. This sets the
 * tuple header fields and toasts the tuple if necessary.  Returns a toasted
 * version of the tuple if it was toasted, or the original tuple if not. Note
 * that in any case, the header fields are also set in the original tuple.
 */
static HeapTuple
heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
					CommandId cid, int options)
{
	/*
	 * To allow parallel inserts, we need to ensure that they are safe to be
	 * performed in workers. We have the infrastructure to allow parallel
	 * inserts in general except for the cases where inserts generate a new
	 * CommandId (eg. inserts into a table having a foreign key column).
	 */
	if (IsParallelWorker())                         
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
				 errmsg("cannot insert tuples in a parallel worker")));

	tup->t_data->t_infomask &= ~(HEAP_XACT_MASK);      
	tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
	tup->t_data->t_infomask |= HEAP_XMAX_INVALID;			// 先设置 XMAX_INVALID信息
	HeapTupleHeaderSetXmin(tup->t_data, xid);				// 设置xim
	if (options & HEAP_INSERT_FROZEN)						
		HeapTupleHeaderSetXminFrozen(tup->t_data);

	HeapTupleHeaderSetCmin(tup->t_data, cid);				// 设置cid
	HeapTupleHeaderSetXmax(tup->t_data, 0); /* for cleanliness */		// 设置 xmax 
	tup->t_tableOid = RelationGetRelid(relation);			// 元组对应的表信息 OID

	/*
	 * If the new tuple is too big for storage or contains already toasted
	 * out-of-line attributes from some other relation, invoke the toaster.
	 *
	 * 如果该元组超过TOAST元组阈值时,需要进行 toast元组的插入工作
	 */
	if (relation->rd_rel->relkind != RELKIND_RELATION &&
		relation->rd_rel->relkind != RELKIND_MATVIEW)
	{
		/* toast table entries should never be recursively toasted */
		Assert(!HeapTupleHasExternal(tup));
		return tup;
	}
	else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
		return heap_toast_insert_or_update(relation, tup, NULL, options);
	else
		return tup;
}

RelationPutHeapTuple

该函数的功能是将上述准备的元组插入到指定页中,并更新该元组在页内的偏移量信息

/*
 * RelationPutHeapTuple - place tuple at specified page
 *
 * !!! EREPORT(ERROR) IS DISALLOWED HERE !!!  Must PANIC on failure!!!
 *
 * Note - caller must hold BUFFER_LOCK_EXCLUSIVE on the buffer.
 */
void
RelationPutHeapTuple(Relation relation,
					 Buffer buffer,
					 HeapTuple tuple,
					 bool token)
{
	Page		pageHeader;
	OffsetNumber offnum;

	/*
	 * A tuple that's being inserted speculatively should already have its
	 * token set.
	 */
	Assert(!token || HeapTupleHeaderIsSpeculative(tuple->t_data));

	/*
	 * Do not allow tuples with invalid combinations of hint bits to be placed
	 * on a page.  This combination is detected as corruption by the
	 * contrib/amcheck logic, so if you disable this assertion, make
	 * corresponding changes there.
	 */
	Assert(!((tuple->t_data->t_infomask & HEAP_XMAX_COMMITTED) &&
			 (tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI)));

	/* Add the tuple to the page */
	// 从buffer中获取对应物理页地址
	pageHeader = BufferGetPage(buffer);			

	// 在页中获取指定空间位置并填充相关数据,细节在此处
	offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
						 tuple->t_len, InvalidOffsetNumber, false, true);

	if (offnum == InvalidOffsetNumber)
		elog(PANIC, "failed to add tuple to page");

	/* Update tuple->t_self to the actual position where it was stored */
	// 更新项指针的偏移量 ItemPointerData
	ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);

	/*
	 * Insert the correct position into CTID of the stored tuple, too (unless
	 * this is a speculative insertion, in which case the token is held in
	 * CTID field instead)
	 * 填充 heaptupleHeader CTID信息
	 */
	if (!token)
	{
		ItemId		itemId = PageGetItemId(pageHeader, offnum);
		HeapTupleHeader item = (HeapTupleHeader) PageGetItem(pageHeader, itemId);

		item->t_ctid = tuple->t_self;
	}
}

PageAddItemExtended

向page中增加一个Item,并按照heap页的结构插入对应位置,如果含有空闲项指针但未足够多的空间空间,则返回无效的 InvalidOffsetNumber,反之返回Item结构在page中偏移量。


/*
 *	PageAddItemExtended 
 *
 *	Add an item to a page.  Return value is the offset at which it was
 *	inserted, or InvalidOffsetNumber if the item is not inserted for any
 *	reason.  A WARNING is issued indicating the reason for the refusal.
 *
 *	offsetNumber must be either InvalidOffsetNumber to specify finding a
 *	free line pointer, or a value between FirstOffsetNumber and one past
 *	the last existing item, to specify using that particular line pointer.
 * 
 *	If offsetNumber is valid and flag PAI_OVERWRITE is set, we just store
 *	the item at the specified offsetNumber, which must be either a
 *	currently-unused line pointer, or one past the last existing item.
 * 
 *  如果offsetNumber有效且设置 PAI_OVERWRITE标识,需判断item是否使用,如使用则打印warnin日志,
 * 返回InvalidOffsetNumber;
 * 
 *	If offsetNumber is valid and flag PAI_OVERWRITE is not set, insert
 *	the item at the specified offsetNumber, moving existing items later
 *	in the array to make room.
 * 
 * 如果offsetNumber有效但未设置 PAI_OVERWRITE标识,则插入指定位置,并将原位置后的item向后移动
 *	If offsetNumber is not valid, then assign a slot by finding the first
 *	one that is both unused and deallocated.
 *
 * 如果offsetNumber无效,则分配第一个未使用的item槽
 * 
 *	If flag PAI_IS_HEAP is set, we enforce that there can't be more than
 *	MaxHeapTuplesPerPage line pointers on the page.
 *
 * 如果设置 PAI_IS_HEAP,则 offsetNumber 不许超过页内最大元组数
 *	!!! EREPORT(ERROR) IS DISALLOWED HERE !!!
 */
OffsetNumber
PageAddItemExtended(Page page,
					Item item,
					Size size,
					OffsetNumber offsetNumber,
					int flags)
{
	PageHeader	phdr = (PageHeader) page;
	Size		alignedSize;
	int			lower;
	int			upper;
	ItemId		itemId;
	OffsetNumber limit;
	bool		needshuffle = false;

	/*
	 * Be wary about corrupted page pointers
	 */
	if (phdr->pd_lower < SizeOfPageHeaderData ||
		phdr->pd_lower > phdr->pd_upper ||
		phdr->pd_upper > phdr->pd_special ||
		phdr->pd_special > BLCKSZ)
		ereport(PANIC,
				(errcode(ERRCODE_DATA_CORRUPTED),
				 errmsg("corrupted page pointers: lower = %u, upper = %u, special = %u",
						phdr->pd_lower, phdr->pd_upper, phdr->pd_special)));

	/*
	 * Select offsetNumber to place the new item at
	 */
	limit = OffsetNumberNext(PageGetMaxOffsetNumber(page));

	/* was offsetNumber passed in? */
	if (OffsetNumberIsValid(offsetNumber))
	{
		/* yes, check it */
		if ((flags & PAI_OVERWRITE) != 0)
		{
			if (offsetNumber < limit)
			{
				itemId = PageGetItemId(phdr, offsetNumber);
				if (ItemIdIsUsed(itemId) || ItemIdHasStorage(itemId))
				{
					elog(WARNING, "will not overwrite a used ItemId");
					return InvalidOffsetNumber;
				}
			}
		}
		else
		{
			if (offsetNumber < limit)
				needshuffle = true; /* need to move existing linp's */
		}
	}
	else
	{
		/* offsetNumber was not passed in, so find a free slot */
		/* if no free slot, we'll put it at limit (1st open slot) */
		if (PageHasFreeLinePointers(phdr))
		{
			/*
			 * Scan line pointer array to locate a "recyclable" (unused)
			 * ItemId.
			 *
			 * Always use earlier items first.  PageTruncateLinePointerArray
			 * can only truncate unused items when they appear as a contiguous
			 * group at the end of the line pointer array.
			 */
			for (offsetNumber = FirstOffsetNumber;
				 offsetNumber < limit;	/* limit is maxoff+1 */
				 offsetNumber++)
			{
				itemId = PageGetItemId(phdr, offsetNumber);

				/*
				 * We check for no storage as well, just to be paranoid;
				 * unused items should never have storage.  Assert() that the
				 * invariant is respected too.
				 */
				Assert(ItemIdIsUsed(itemId) || !ItemIdHasStorage(itemId));

				if (!ItemIdIsUsed(itemId) && !ItemIdHasStorage(itemId))
					break;
			}
			if (offsetNumber >= limit)
			{
				/* the hint is wrong, so reset it */
				PageClearHasFreeLinePointers(phdr);
			}
		}
		else
		{
			/* don't bother searching if hint says there's no free slot */
			offsetNumber = limit;
		}
	}

	/* Reject placing items beyond the first unused line pointer */
	if (offsetNumber > limit)
	{
		elog(WARNING, "specified item offset is too large");
		return InvalidOffsetNumber;
	}

	/* Reject placing items beyond heap boundary, if heap */
	if ((flags & PAI_IS_HEAP) != 0 && offsetNumber > MaxHeapTuplesPerPage)
	{
		elog(WARNING, "can't put more than MaxHeapTuplesPerPage items in a heap page");
		return InvalidOffsetNumber;
	}

	/*
	 * Compute new lower and upper pointers for page, see if it'll fit.
	 *
	 * Note: do arithmetic as signed ints, to avoid mistakes if, say,
	 * alignedSize > pd_upper.
	 */
	if (offsetNumber == limit || needshuffle)
		lower = phdr->pd_lower + sizeof(ItemIdData);
	else
		lower = phdr->pd_lower;

	alignedSize = MAXALIGN(size);

	upper = (int) phdr->pd_upper - (int) alignedSize;

	if (lower > upper)
		return InvalidOffsetNumber;

	/*
	 * OK to insert the item.  First, shuffle the existing pointers if needed.
	 */
	itemId = PageGetItemId(phdr, offsetNumber);

	if (needshuffle)
		memmove(itemId + 1, itemId,
				(limit - offsetNumber) * sizeof(ItemIdData));

	/* set the line pointer */
	ItemIdSetNormal(itemId, upper, size);

	/*
	 * Items normally contain no uninitialized bytes.  Core bufpage consumers
	 * conform, but this is not a necessary coding rule; a new index AM could
	 * opt to depart from it.  However, data type input functions and other
	 * C-language functions that synthesize datums should initialize all
	 * bytes; datumIsEqual() relies on this.  Testing here, along with the
	 * similar check in printtup(), helps to catch such mistakes.
	 *
	 * Values of the "name" type retrieved via index-only scans may contain
	 * uninitialized bytes; see comment in btrescan().  Valgrind will report
	 * this as an error, but it is safe to ignore.
	 */
	VALGRIND_CHECK_MEM_IS_DEFINED(item, size);

	/* copy the item's data onto the page */
	memcpy((char *) page + upper, item, size);

	/* adjust page header */
	phdr->pd_lower = (LocationIndex) lower;
	phdr->pd_upper = (LocationIndex) upper;

	return offsetNumber;
}

以重用item为例【ItemIdData结构体为4字节,大小固定 重用的依据】:
如图所示:由于经过更新或者删除操作,项指针 item3空闲

  1. 首先会遍历页内已有的所有item,找到第一个未使用的item进行分配
    2)重用的话空闲下边界不变(pd_lower),上边界会发生改变 (pd_upper - sizelen);
  2. 调用memcpy进行数据的拷贝,并更新pageheader结构体信息
    在这里插入图片描述

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/85119.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

课设项目之——教学辅助系统(学生考试监考系统)

在考试场中为学生监考十分枯燥&#xff0c;因此&#xff0c;建立一个可靠的作弊检测系统来识别学生是否存在作弊行为。 使用一个名为 Yolo3 的训练模型和一个名为 coco 的数据集&#xff0c;我们测试了考场中学生的书籍和手机&#xff0c;并将他们标记为作弊者。 使用haarcasc…

如何将dxf或dwg等CAD文件与卫星影像地图叠加进行绘图设计?

引言&#xff1a; 在测绘、电力、水利、规划或道路设计等GIS相关行业中&#xff0c;通常会用AutoCAD进行矢量地图数据的绘制&#xff0c;而这些地图数据通常又是建立在投影平面坐标的基础上进行绘制的。 为了确保地图数据的准确性与精度的要求&#xff0c;这些地图数据经常会…

将一个乱序数组变为有序数组的最少交换次数

给定一个包含1-n的数列&#xff0c;通过交换任意两个元素给数列重新排序。求最少需要多少次交换&#xff0c;能把数组排成按1-n递增的顺序 总之就是将这个位置应该出现的元素和这个位置现在的元素交换位置 代码实现&#xff1a; 核心&#xff1a;记住一点&#xff0c;hashmap用…

【debug】时序预测的结果都是一个趋势

时序预测的结果都是一个趋势现象原因solutionother solutions现象 预测的是一个序列。 在测试集中随机取20个来看&#xff0c;所有的预测序列都是一个趋势&#xff0c;但是大小有所区别。 举例图片 原因 目前来看是数据的问题&#xff0c;应该是样本不均衡&#xff0c;某一…

简单个人网页制作 个人介绍网页模板 静态HTML留言表单页面网站模板 大学生个人主页网页

&#x1f389;精彩专栏推荐&#x1f447;&#x1f3fb;&#x1f447;&#x1f3fb;&#x1f447;&#x1f3fb; ✍️ 作者简介: 一个热爱把逻辑思维转变为代码的技术博主 &#x1f482; 作者主页: 【主页——&#x1f680;获取更多优质源码】 &#x1f393; web前端期末大作业…

[ Linux ] 一篇带你理解Linux下线程概念

目录 1.Linux线程的概念 1.1什么是线程 1.1.1如何验证一个进程内有多个线程&#xff1f; 1.2线程的优点 1.3线程的缺点 1.4 线程异常 1.5 线程用途 2.Linux进程与线程 2.1进程和线程 2.2 进程和线程的关系 2.3如何看待之前学习的单进程&#xff1f; 1.Linux线程的概…

迪杰斯特拉算法求图的最短路径(java)

迪杰斯特拉算法 图的最短路径的解法 单源最短路径 从一个点开始&#xff0c;可以找到其中任意一个点的最短路径。 多源最短路径 从任何一个点开始&#xff0c;可以找到其中任何一个点的最短路径。 解题过程 给定一个带权有向图G(G, V), 另外&#xff0c;还给定 V 中的一…

力扣(LeetCode)1832. 判断句子是否为全字母句(C++)

哈希集合1 哈希集合记录 262626 个字母是否出现&#xff0c;一次遍历字符串&#xff0c;维护哈希集合&#xff0c;同时维护答案。遍历完成&#xff0c;仅当答案等于 262626 &#xff0c;句子是全字母句。 class Solution { public:bool checkIfPangram(string sentence) {boo…

轻松提高性能和并发度,springboot简单几步集成缓存

目录 1、缘由 2、技术介绍 2.1、技术调研 2.2、spring支持的cache 2.3、cache的核心注解 2.3.1 EnableCaching 2.3.2 Cacheable 2.3.3 CachePut 2.3.4 CacheEvict 2.4 cache的架构 2.5 cachemanager的实现类 3、搞个例子 3.1 为什么使用redis 作为缓存 3.2 代码走起…

【虚幻引擎】UE4/UE5数字孪生与前端Web页面匹配

一、数字孪生 数字孪生是一种多维动态的数字映射&#xff0c;可大幅提高效能。数字孪生是充分利用物理模型、传感器更新、运行历史等数据&#xff0c;集成多学科、多物理量、多尺度、多概率的仿真过程&#xff0c;在虚拟空间中完成对现实体的复制和映射&#xff0c;从而反映物理…

MySQL常用窗口函数

1、窗口函数概念 窗口的概念非常重要&#xff0c;它可以理解为记录集合&#xff0c;窗口函数也就是在满足某种条件的记录集合上执行的特殊函数对于每条记录都要在此窗口内执行函数&#xff0c;有的函数随着记录不同&#xff0c;窗口大小都是固定的&#xff0c;这种属于静态窗口…

c语言:枚举类型—enum

枚举类型一.常见形式二.枚举和宏定义三.枚举的意义四.插个小知识一.常见形式 这里举一个例子&#xff0c;我想要枚举颜色 注意一下细节&#xff0c;所有成员间用逗号隔开&#xff0c;最后一个成员后不加标点符号 这里看上去和定义结构体和联合体的样式一样&#xff0c;但其实前…

minio安装部署和minIO-Client的使用

minio安装部署和minIO-Client的使用 一、服务器安装minio 1.进行下载 下载地址&#xff1a; GNU/Linux https://dl.min.io/server/minio/release/linux-amd64/minio2.新建minio安装目录&#xff0c;执行如下命令 mkdir -p /home/minio/data把二进制文件上传到安装目录后&a…

【PAT甲级 - C++题解】1128 N Queens Puzzle

✍个人博客&#xff1a;https://blog.csdn.net/Newin2020?spm1011.2415.3001.5343 &#x1f4da;专栏地址&#xff1a;PAT题解集合 &#x1f4dd;原题地址&#xff1a; 题目详情 - 1128 N Queens Puzzle (pintia.cn) &#x1f511;中文翻译&#xff1a;皇后问题 &#x1f4e3;…

第9章 无线网络和移动网络

目录 9.1 无线局域网 WLAN 9.1.1 无线局域网的组成 1. 无线局域网 WLAN (Wireless Local Area Network) 2. IEEE 802.11 3. 移动自组网络 9.1.2 802.11 局域网的物理层 9.1.3 802.11 局域网的 MAC 层协议 1. CSMA/CA 协议 2. 时间间隔 DIFS 的重要性 3. MAC两个子层…

acwing基础课——Floyd

由数据范围反推算法复杂度以及算法内容 - AcWing 常用代码模板3——搜索与图论 - AcWing 基本思想&#xff1a; floyd算法的原理是基于动态规划的基础上实现的&#xff0c;因为是稠密图我们通过邻接矩阵来存储&#xff0c;我们将各点距离初始化为正无穷(该点到自己的距离为0)&…

软件测试基础理论体系学习8-什么是验收测试?验收测试的内容是什么?过程是什么?有什么测试策略?

8-什么是验收测试&#xff1f;验收测试的内容是什么&#xff1f;过程是什么&#xff1f;有什么测试策略&#xff1f;1 验收测试的主要内容1.1 简介和说明1.2 验收测试的目的1.3 验收测试的任务1.4 验收测试主要内容1.4.1 验收测试标准1.4.2 配置复审1.4.3 α、β测试2 验收测试…

基于intel低功耗平台边缘计算解决方案助力半导体设备升级

半导体芯片是现代电子领域的大脑。事实上&#xff0c;在通信、计算、零售、医疗保健和运输应用领域&#xff0c;半导体芯片为各种先进技术提供了基础。2020年全球半导体销售额增长6.5%&#xff0c;相关制造设备的生产需求也相应增加。 某业内日本半导体设备制造厂商&#xff0…

营销不知道怎么做,不妨试试社交新零售电商结合新型引流模式

大家好&#xff0c;我是林工&#xff0c;如今移动互联网的快速发展为社交媒体分销提供了生根发芽的土壤&#xff0c;人们不仅仅满足于单方面的购物消费&#xff0c;开始利用社交属性&#xff0c;出现了具有强大带货能力和分销能力的人群&#xff0c;也就是当时的代购和微商&…

BEVFormer-accelerate:基于EasyCV加速BEVFormer

作者&#xff1a;贺弘 夕陌 谦言 临在 导言 BEVFormer是一种纯视觉的自动驾驶感知算法&#xff0c;通过融合环视相机图像的空间和时序特征显式的生成具有强表征能力的BEV特征&#xff0c;并应用于下游3D检测、分割等任务&#xff0c;取得了SOTA的结果。我们在EasyCV开源框架&…