postgres source code analysis 41: creating the btree index file--1

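The earlier posts listed below covered the physical creation of the index file. This post covers how btree index tuples are filled into that file. We start with the data structures and then walk through the execution flow.
postgres source code analysis 37: full walkthrough of table creation–1
postgres source code analysis 37: full walkthrough of table creation–2
postgres source code analysis 37: full walkthrough of table creation–3
postgres source code analysis 37: full walkthrough of table creation–4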

Data structures and page layout

[Figure: btree data structures and page layout]

btree handler

bthandler returns an IndexAmRoutine that records the btree access method parameters and the callback functions it supports.

/*
 * Btree handler function: return IndexAmRoutine with access method parameters
 * and callbacks.
 */
Datum
bthandler(PG_FUNCTION_ARGS)
{
	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

	amroutine->amstrategies = BTMaxStrategyNumber;
	amroutine->amsupport = BTNProcs;
	amroutine->amoptsprocnum = BTOPTIONS_PROC;
	amroutine->amcanorder = true;
	amroutine->amcanorderbyop = false;
	amroutine->amcanbackward = true;
	amroutine->amcanunique = true;
	amroutine->amcanmulticol = true;
	amroutine->amoptionalkey = true;
	amroutine->amsearcharray = true;
	amroutine->amsearchnulls = true;
	amroutine->amstorage = false;
	amroutine->amclusterable = true;
	amroutine->ampredlocks = true;
	amroutine->amcanparallel = true;
	amroutine->amcaninclude = true;
	amroutine->amusemaintenanceworkmem = false;
	amroutine->amparallelvacuumoptions =
		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
	amroutine->amkeytype = InvalidOid;

	amroutine->ambuild = btbuild;
	amroutine->ambuildempty = btbuildempty;
	amroutine->aminsert = btinsert;
	amroutine->ambulkdelete = btbulkdelete;
	amroutine->amvacuumcleanup = btvacuumcleanup;
	amroutine->amcanreturn = btcanreturn;
	amroutine->amcostestimate = btcostestimate;
	amroutine->amoptions = btoptions;
	amroutine->amproperty = btproperty;
	amroutine->ambuildphasename = btbuildphasename;
	amroutine->amvalidate = btvalidate;
	amroutine->amadjustmembers = btadjustmembers;
	amroutine->ambeginscan = btbeginscan;
	amroutine->amrescan = btrescan;
	amroutine->amgettuple = btgettuple;
	amroutine->amgetbitmap = btgetbitmap;
	amroutine->amendscan = btendscan;
	amroutine->ammarkpos = btmarkpos;
	amroutine->amrestrpos = btrestrpos;
	amroutine->amestimateparallelscan = btestimateparallelscan;
	amroutine->aminitparallelscan = btinitparallelscan;
	amroutine->amparallelrescan = btparallelrescan;

	PG_RETURN_POINTER(amroutine);
}
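To make the handler pattern concrete, here is a minimal, self-contained sketch. It is not PostgreSQL code: `MiniAmRoutine`, `mini_bthandler`, `mini_btbuild` and `mini_index_build` are hypothetical names, and the struct keeps only three of the many IndexAmRoutine fields. The point it illustrates is that generic code calls through the returned struct and never names the access method directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for IndexAmRoutine. */
typedef struct MiniAmRoutine
{
	int		amstrategies;	/* number of operator strategies */
	bool	amcanunique;	/* can the AM enforce uniqueness? */
	/* build callback: fills index_out from heap, returns tuples indexed */
	int		(*ambuild) (const int *heap, int ntuples, int *index_out);
} MiniAmRoutine;

/* Toy build callback: insertion-sorts the heap values into index_out. */
int
mini_btbuild(const int *heap, int ntuples, int *index_out)
{
	for (int i = 0; i < ntuples; i++)
	{
		int		j = i;

		while (j > 0 && index_out[j - 1] > heap[i])
		{
			index_out[j] = index_out[j - 1];
			j--;
		}
		index_out[j] = heap[i];
	}
	return ntuples;
}

/* Toy handler: returns the parameters and callbacks, like bthandler does. */
const MiniAmRoutine *
mini_bthandler(void)
{
	static const MiniAmRoutine routine = {5, true, mini_btbuild};

	return &routine;
}

/* Generic caller: it only sees the routine, not which AM is behind it. */
int
mini_index_build(const MiniAmRoutine *am, const int *heap, int ntuples,
				 int *index_out)
{
	assert(am != NULL && am->ambuild != NULL);
	return am->ambuild(heap, ntuples, index_out);
}
```

Calling `mini_index_build(mini_bthandler(), heap, n, out)` dispatches to `mini_btbuild` through the function pointer, which is the same indirection the real callbacks such as ambuild rely on.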

The key callback functions are briefly described below:

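Function	Description
btbuild	Fills the index file with the generated index tuples. It scans the base table to fetch the heap tuples and builds index tuples from them; in the version shown here this scan goes through table_index_build_scan (called IndexBuildHeapScan before PostgreSQL 12)
btbuildempty	Builds an empty btree index in the init fork (INIT_FORKNUM)
btinsert	Inserts an index tuple into the btree. For a unique index, the insertion also checks whether an equal index tuple already exists
btbulkdelete	Deletes all index entries that point to a set of heap tuples. A callback function decides, entry by entry, whether the referenced tuple belongs to that set
btvacuumcleanup	Usually called after a VACUUM operation to do extra work, such as updating the information in the index meta page
btcanreturn	Checks whether the index supports index-only scans; for btree it always returns true
btcostestimate	Estimates the cost of an index scan
btoptions	Parses and validates the reloptions array of an index; called only when the index has a non-null reloptions array
btbeginscan	Starts a scan on a btree index. It builds the index scan descriptor IndexScanDesc, which the later steps of the scan rely on
btrescan	Restarts a scan, optionally with new scan keys
btgettuple	Fetches the next tuple in the given scan descriptor, moving in the given direction; on success the TID of that tuple is saved
btgetbitmap	Fetches all matching tuples and adds them to a bitmap
btendscan	Ends the scan and releases the associated resources
btmarkpos	Saves the current scan position, stored in the opaque field of the scan descriptor
btrestrpos	Restores the scan to the most recently marked position; the counterpart of btmarkpos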
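The scan callbacks in the table form a lifecycle: begin, (re)scan with a key, fetch tuples one by one, optionally mark and restore a position, then end. The sketch below walks through that lifecycle over a sorted int array. Everything in it is hypothetical (`ToyScan` and the `toy_*` functions are not PostgreSQL APIs), the "scan key" is a single lower bound, and the mark stores the next position to return, which is a simplification of what btmarkpos keeps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical scan descriptor over a sorted int array (not IndexScanDesc). */
typedef struct ToyScan
{
	const int  *keys;		/* sorted "index" entries */
	int			nkeys;
	int			lower;		/* scan key: return entries >= lower */
	int			pos;		/* next position to return */
	int			marked;		/* position saved by toy_markpos */
	int			current;	/* last entry returned (stands in for the TID) */
} ToyScan;

/* Like btbeginscan: only allocates and initializes the descriptor. */
ToyScan *
toy_beginscan(const int *keys, int nkeys)
{
	ToyScan    *scan = malloc(sizeof(ToyScan));

	assert(scan != NULL);
	scan->keys = keys;
	scan->nkeys = nkeys;
	scan->lower = 0;
	scan->pos = 0;
	scan->marked = 0;
	scan->current = 0;
	return scan;
}

/* Like btrescan: installs a new scan key and repositions the scan. */
void
toy_rescan(ToyScan *scan, int lower)
{
	scan->lower = lower;
	scan->pos = 0;
	while (scan->pos < scan->nkeys && scan->keys[scan->pos] < lower)
		scan->pos++;
	scan->marked = scan->pos;
}

/* Like btgettuple (forward direction only): false when the scan is done. */
bool
toy_gettuple(ToyScan *scan)
{
	if (scan->pos >= scan->nkeys)
		return false;
	scan->current = scan->keys[scan->pos++];
	return true;
}

void toy_markpos(ToyScan *scan)  { scan->marked = scan->pos; }
void toy_restrpos(ToyScan *scan) { scan->pos = scan->marked; }
void toy_endscan(ToyScan *scan)  { free(scan); }
```

A caller would typically run `toy_beginscan`, then `toy_rescan` with a key, loop on `toy_gettuple` until it returns false, and finish with `toy_endscan`.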

Index-related system catalogs

1 pg_am

postgres=# select * from pg_am;

 oid  | amname |      amhandler       | amtype 
------+--------+----------------------+--------
    2 | heap   | heap_tableam_handler | t
  403 | btree  | bthandler            | i
  405 | hash   | hashhandler          | i
  783 | gist   | gisthandler          | i
 2742 | gin    | ginhandler           | i
 4000 | spgist | spghandler           | i
 3580 | brin   | brinhandler          | i
(7 rows)

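Note: pg_am records the handler function of each access method. Rows with amtype i are index access methods; the row with amtype t (heap) is a table access method.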

2 pg_index

                                                Table "pg_catalog.pg_index"
       Column        |     Type     | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
---------------------+--------------+-----------+----------+---------+----------+-------------+--------------+-------------
 indexrelid          | oid          |           | not null |         | plain    |             |              | 
 indrelid            | oid          |           | not null |         | plain    |             |              | 
 indnatts            | smallint     |           | not null |         | plain    |             |              | 
 indnkeyatts         | smallint     |           | not null |         | plain    |             |              | 
 indisunique         | boolean      |           | not null |         | plain    |             |              | 
 indnullsnotdistinct | boolean      |           | not null |         | plain    |             |              | 
 indisprimary        | boolean      |           | not null |         | plain    |             |              | 
 indisexclusion      | boolean      |           | not null |         | plain    |             |              | 
 indimmediate        | boolean      |           | not null |         | plain    |             |              | 
 indisclustered      | boolean      |           | not null |         | plain    |             |              | 
 indisvalid          | boolean      |           | not null |         | plain    |             |              | 
 indcheckxmin        | boolean      |           | not null |         | plain    |             |              | 
 indisready          | boolean      |           | not null |         | plain    |             |              | 
 indislive           | boolean      |           | not null |         | plain    |             |              | 
 indisreplident      | boolean      |           | not null |         | plain    |             |              | 
 indkey              | int2vector   |           | not null |         | plain    |             |              | 
 indcollation        | oidvector    |           | not null |         | plain    |             |              | 
 indclass            | oidvector    |           | not null |         | plain    |             |              | 
 indoption           | int2vector   |           | not null |         | plain    |             |              | 
 indexprs            | pg_node_tree | C         |          |         | extended |             |              | 
 indpred             | pg_node_tree | C         |          |         | extended |             |              | 
Indexes:
    "pg_index_indexrelid_index" PRIMARY KEY, btree (indexrelid)
    "pg_index_indrelid_index" btree (indrelid)
Access method: heap

Column	Description
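indexrelid	OID of the pg_class entry for this index
indrelid	OID of the pg_class entry for the table this index is built on
indnatts	Total number of columns in the index (duplicates pg_class.relnatts); this count includes both key and included attributes
indnkeyatts	Number of key columns in the index, not counting any included columns, which are merely stored and do not participate in the index semantics
indisunique	If true, this is a unique index
indnullsnotdistinct	Only used for unique indexes. If false, the index treats null values as distinct, so a column can contain several nulls; if true, null values are treated as equal
indisprimary	If true, this index represents the primary key of the table (indisunique is always true when this is true)
indisexclusion	If true, this index supports an exclusion constraint
indimmediate	If true, the uniqueness check is enforced immediately on insertion (irrelevant if indisunique is false)
indisclustered	If true, the table was last clustered on this index
indisvalid	If true, the index is currently valid for queries. False means the index is possibly incomplete: it must still be modified by INSERT/UPDATE operations, but it cannot safely be used for queries. If it is a unique index, the uniqueness property is not guaranteed either
indcheckxmin	If true, queries must not use the index until the xmin of this pg_index row is below their TransactionXmin, because the table may contain broken HOT chains with incompatible rows that they can see
indisready	If true, the index is currently ready for inserts. False means the index must be ignored by INSERT/UPDATE operations
indislive	If false, the index is in the process of being dropped and must be ignored for all purposes (including HOT-safety decisions)
indisreplident	If true, this index has been chosen as the "replica identity" using ALTER TABLE … REPLICA IDENTITY USING INDEX …
indkey	An array of indnatts values that indicate which table columns this index indexes. For example, a value of 1 3 would mean that the first and the third table columns make up the index entries. Key columns come before non-key (included) columns. A zero in this array indicates that the corresponding index attribute is an expression over the table columns rather than a simple column reference
indcollation	For each column in the index key (indnkeyatts values), the OID of the collation to use for the index, or zero if the column is not of a collatable data type
indclass	For each column in the index key (indnkeyatts values), the OID of the operator class to use. See pg_opclass for details
indoption	An array of indnkeyatts values that store per-column flag bits. The meaning of the bits is defined by the index's access method
indexprs	Expression trees (in nodeToString() representation) for index attributes that are not simple column references. This is a list with one element for each zero entry in indkey. Null if all index attributes are simple references
indpred	Expression tree (in nodeToString() representation) for the partial index predicate. Null if this is not a partial index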
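The relationship between indnatts, indnkeyatts and indkey can be sketched in a few lines. This is an illustration only: `ToyAttrKind` and `toy_classify_attr` are hypothetical names, and the function assumes the rules stated in the table (key columns first, included columns after them, a zero in indkey marks an expression).

```c
#include <assert.h>

/* Hypothetical classification of one index attribute. */
typedef enum ToyAttrKind
{
	TOY_KEY_COLUMN,			/* key attribute that is a plain table column */
	TOY_KEY_EXPRESSION,		/* key attribute whose indkey entry is 0 */
	TOY_INCLUDED_COLUMN		/* stored only, not part of the index semantics */
} ToyAttrKind;

/*
 * Classify attribute i (0-based) of an index described by indkey,
 * indnatts and indnkeyatts, following the column descriptions above.
 */
ToyAttrKind
toy_classify_attr(const short *indkey, int indnatts, int indnkeyatts, int i)
{
	assert(indnkeyatts <= indnatts);
	assert(i >= 0 && i < indnatts);

	if (i >= indnkeyatts)
		return TOY_INCLUDED_COLUMN;		/* included columns follow the keys */
	return indkey[i] == 0 ? TOY_KEY_EXPRESSION : TOY_KEY_COLUMN;
}
```

For instance, an index declared on `(a, lower(b)) INCLUDE (c)`, where a and c are the first and third table columns, would have indnatts = 3, indnkeyatts = 2 and an indkey of `1 0 3`, so the three attributes classify as key column, key expression and included column.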

3 pg_opclass

                                        Table "pg_catalog.pg_opclass"
    Column    |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
--------------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 oid          | oid     |           | not null |         | plain   |             |              | 
 opcmethod    | oid     |           | not null |         | plain   |             |              | 
 opcname      | name    |           | not null |         | plain   |             |              | 
 opcnamespace | oid     |           | not null |         | plain   |             |              | 
 opcowner     | oid     |           | not null |         | plain   |             |              | 
 opcfamily    | oid     |           | not null |         | plain   |             |              | 
 opcintype    | oid     |           | not null |         | plain   |             |              | 
 opcdefault   | boolean |           | not null |         | plain   |             |              | 
 opckeytype   | oid     |           | not null |         | plain   |             |              | 
Indexes:
    "pg_opclass_oid_index" PRIMARY KEY, btree (oid)
    "pg_opclass_am_name_nsp_index" UNIQUE CONSTRAINT, btree (opcmethod, opcname, opcnamespace)
Access method: heap

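pg_opclass defines index access method operator classes. Each operator class defines the semantics for index columns of a particular data type and a particular index access method. An operator class essentially specifies that a particular operator family is applicable to a particular indexable column data type. The operators of the family that are actually usable with the indexed column are those that accept the column's data type as their left-hand input.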

Column	Description
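opcmethod	Index access method the operator class is for
opcname	Name of the operator class
opcnamespace	Namespace of the operator class
opcowner	Owner of the operator class
opcfamily	Operator family containing this operator class
opcintype	Data type that the operator class indexes
opcdefault	True if this operator class is the default for opcintype
opckeytype	Type of the data stored in the index; zero means it is the same as opcintype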
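The opckeytype rule is a simple fallback, sketched below under the assumption that OIDs are modeled as unsigned integers. `ToyOid` and `toy_stored_key_type` are hypothetical names, not PostgreSQL identifiers.

```c
#include <assert.h>

typedef unsigned int ToyOid;	/* hypothetical stand-in for Oid */

/*
 * Type actually stored in the index for an operator class:
 * opckeytype when it is set, otherwise the indexed type opcintype.
 */
ToyOid
toy_stored_key_type(ToyOid opcintype, ToyOid opckeytype)
{
	assert(opcintype != 0);
	return opckeytype != 0 ? opckeytype : opcintype;
}
```

Btree operator classes normally leave opckeytype at zero, so the index stores values of the indexed type itself.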

4 pg_opfamily

                                      Table "pg_catalog.pg_opfamily"
    Column    | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
--------------+------+-----------+----------+---------+---------+-------------+--------------+-------------
 oid          | oid  |           | not null |         | plain   |             |              | 
 opfmethod    | oid  |           | not null |         | plain   |             |              | 
 opfname      | name |           | not null |         | plain   |             |              | 
 opfnamespace | oid  |           | not null |         | plain   |             |              | 
 opfowner     | oid  |           | not null |         | plain   |             |              | 
Indexes:
    "pg_opfamily_oid_index" PRIMARY KEY, btree (oid)
    "pg_opfamily_am_name_nsp_index" UNIQUE CONSTRAINT, btree (opfmethod, opfname, opfnamespace)
Access method: heap

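pg_opfamily defines operator families. Each operator family is a collection of operators and associated support routines that implement the semantics specified for a particular index access method. In addition, the operators in a family are all "compatible", in a way that is specified by the access method. The operator family concept allows cross-data-type operators to be used with indexes and to be reasoned about using knowledge of the access method semantics.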

Column	Description
opfmethod	Index access method the operator family is for
opfname	Name of the operator family
opfnamespace	Namespace of the operator family
opfowner	Owner of the operator family

5 pg_amop

                                           Table "pg_catalog.pg_amop"
     Column     |   Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
----------------+----------+-----------+----------+---------+---------+-------------+--------------+-------------
 oid            | oid      |           | not null |         | plain   |             |              | 
 amopfamily     | oid      |           | not null |         | plain   |             |              | 
 amoplefttype   | oid      |           | not null |         | plain   |             |              | 
 amoprighttype  | oid      |           | not null |         | plain   |             |              | 
 amopstrategy   | smallint |           | not null |         | plain   |             |              | 
 amoppurpose    | "char"   |           | not null |         | plain   |             |              | 
 amopopr        | oid      |           | not null |         | plain   |             |              | 
 amopmethod     | oid      |           | not null |         | plain   |             |              | 
 amopsortfamily | oid      |           | not null |         | plain   |             |              | 
Indexes:
    "pg_amop_oid_index" PRIMARY KEY, btree (oid)
    "pg_amop_fam_strat_index" UNIQUE CONSTRAINT, btree (amopfamily, amoplefttype, amoprighttype, amopstrategy)
    "pg_amop_opr_fam_index" UNIQUE CONSTRAINT, btree (amopopr, amoppurpose, amopfamily)
Access method: heap

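pg_amop stores information about the operators associated with access method operator families. There is one row for each operator that is a member of an operator family. A family member can be either a search operator or an ordering operator. An operator can appear in more than one family, but within a single family it cannot appear in more than one search position or in more than one ordering position. (It is allowed, though unlikely, for an operator to be used for both search and ordering purposes.)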

Column	Description
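amopfamily	Operator family this entry belongs to
amoplefttype	Left-hand input data type of the operator
amoprighttype	Right-hand input data type of the operator
amopstrategy	Operator strategy number
amoppurpose	Operator purpose: s for search, o for ordering
amopopr	OID of the operator
amopmethod	Index access method the operator family is for
amopsortfamily	For an ordering operator, the B-tree operator family that this entry sorts according to; zero for a search operator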
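For btree, amopstrategy takes the values 1 to 5, standing for <, <=, =, >= and >. The sketch below shows how a strategy number can be evaluated on top of a three-way comparison. The `TOY_BT_*` constants mirror those numbers, but the names, `toy_int_cmp` and `toy_strategy_matches` are hypothetical and restricted to plain ints.

```c
#include <assert.h>
#include <stdbool.h>

/* Strategy numbers as btree uses them in amopstrategy (names are made up). */
enum
{
	TOY_BT_LESS = 1,			/* <  */
	TOY_BT_LESS_EQUAL = 2,		/* <= */
	TOY_BT_EQUAL = 3,			/* =  */
	TOY_BT_GREATER_EQUAL = 4,	/* >= */
	TOY_BT_GREATER = 5			/* >  */
};

/* Three-way comparison: negative, zero or positive. */
int
toy_int_cmp(int a, int b)
{
	return (a > b) - (a < b);
}

/* Does "left <op> right" hold for the operator in the given strategy slot? */
bool
toy_strategy_matches(int strategy, int left, int right)
{
	int		cmp = toy_int_cmp(left, right);

	assert(strategy >= TOY_BT_LESS && strategy <= TOY_BT_GREATER);
	switch (strategy)
	{
		case TOY_BT_LESS:			return cmp < 0;
		case TOY_BT_LESS_EQUAL:		return cmp <= 0;
		case TOY_BT_EQUAL:			return cmp == 0;
		case TOY_BT_GREATER_EQUAL:	return cmp >= 0;
		default:					return cmp > 0;		/* TOY_BT_GREATER */
	}
}
```

The design point is that the five operators of a family all agree with one ordering, which is why a single comparison result is enough to answer any of them.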

6 pg_amproc

                                           Table "pg_catalog.pg_amproc"
     Column      |   Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
-----------------+----------+-----------+----------+---------+---------+-------------+--------------+-------------
 oid             | oid      |           | not null |         | plain   |             |              | 
 amprocfamily    | oid      |           | not null |         | plain   |             |              | 
 amproclefttype  | oid      |           | not null |         | plain   |             |              | 
 amprocrighttype | oid      |           | not null |         | plain   |             |              | 
 amprocnum       | smallint |           | not null |         | plain   |             |              | 
 amproc          | regproc  |           | not null |         | plain   |             |              | 
Indexes:
    "pg_amproc_oid_index" PRIMARY KEY, btree (oid)
    "pg_amproc_fam_proc_index" UNIQUE CONSTRAINT, btree (amprocfamily, amproclefttype, amprocrighttype, amprocnum)
Access method: heap

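pg_amproc stores information about the support functions associated with access method operator families. There is one row for each support function that belongs to an operator family.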

Column	Description
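amprocfamily	Operator family this entry belongs to
amproclefttype	Left-hand input data type of the associated operator
amprocrighttype	Right-hand input data type of the associated operator
amprocnum	Support function number
amproc	OID of the function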
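The unique index on (amprocfamily, amproclefttype, amprocrighttype, amprocnum) shown above suggests how a support function is located: by that four-part key. The sketch below models the lookup with a linear search over an in-memory table. `ToyAmproc`, `toy_amproc_rows` and `toy_lookup_support_proc` are hypothetical, the family OID 9999 is made up, and the two rows are only illustrative (23 and 20 are the type OIDs of int4 and int8, and support function number 1 is the btree comparison function).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory image of a few pg_amproc rows. */
typedef struct ToyAmproc
{
	unsigned	family;		/* amprocfamily (9999 is a made-up OID) */
	unsigned	lefttype;	/* amproclefttype */
	unsigned	righttype;	/* amprocrighttype */
	short		procnum;	/* amprocnum */
	const char *proc;		/* amproc, shown by name instead of OID */
} ToyAmproc;

const ToyAmproc toy_amproc_rows[] = {
	{9999, 23, 23, 1, "btint4cmp"},		/* int4 vs int4 comparison */
	{9999, 23, 20, 1, "btint48cmp"},	/* int4 vs int8 comparison */
};

/* Find the support function for the four-part key, or NULL if absent. */
const char *
toy_lookup_support_proc(const ToyAmproc *rows, size_t nrows,
						unsigned family, unsigned lefttype,
						unsigned righttype, short procnum)
{
	assert(rows != NULL);
	for (size_t i = 0; i < nrows; i++)
	{
		if (rows[i].family == family &&
			rows[i].lefttype == lefttype &&
			rows[i].righttype == righttype &&
			rows[i].procnum == procnum)
			return rows[i].proc;
	}
	return NULL;
}
```

Having separate rows for (int4, int4) and (int4, int8) within one family is what lets cross-data-type comparisons use the same index.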