Greenplum Database Optimizer: Join Types

news2024/11/19 9:38:28


Join type syntax support

The FROM clause accepts JOIN expressions as well as a list of table names, and the grammar separates joined_table from table_ref. As the comment in the grammar puts it: It may seem silly to separate joined_table from table_ref, but there is method in SQL's madness: if you don't do it this way you get reduce-reduce conflicts, because it's not clear to the parser generator whether to expect alias_clause after ')' or not.

/* table_ref is where an alias clause can be attached. */
table_ref:	/* other alternatives omitted in this excerpt */
			| joined_table
				{ $$ = (Node *) $1; }
			| '(' joined_table ')' alias_clause
				{ $2->alias = $4; $$ = (Node *) $2; }

joined_table:
			'(' joined_table ')' { $$ = $2; }
			| table_ref CROSS JOIN table_ref
				{ /* CROSS JOIN is same as unqualified inner join */
					JoinExpr *n = makeNode(JoinExpr); n->jointype = JOIN_INNER; n->isNatural = FALSE;
					n->larg = $1; n->rarg = $4; n->usingClause = NIL; n->quals = NULL; $$ = n; }
			| table_ref join_type JOIN table_ref join_qual
				{   JoinExpr *n = makeNode(JoinExpr); n->jointype = $2; n->isNatural = FALSE;
					n->larg = $1; n->rarg = $4;
					if ($5 != NULL && IsA($5, List)) n->usingClause = (List *) $5; /* USING clause */
					else n->quals = $5; /* ON clause */
					$$ = n; }
			| table_ref JOIN table_ref join_qual
				{   /* letting join_type reduce to empty doesn't work */
					JoinExpr *n = makeNode(JoinExpr); n->jointype = JOIN_INNER; n->isNatural = FALSE;
					n->larg = $1; n->rarg = $3;
					if ($4 != NULL && IsA($4, List)) n->usingClause = (List *) $4; /* USING clause */
					else n->quals = $4; /* ON clause */
					$$ = n; }
			| table_ref NATURAL join_type JOIN table_ref
				{   JoinExpr *n = makeNode(JoinExpr); n->jointype = $3; n->isNatural = TRUE;
					n->larg = $1; n->rarg = $5; n->usingClause = NIL; /* figure out which columns later... */ n->quals = NULL; /* fill later */
					$$ = n; }
			| table_ref NATURAL JOIN table_ref
				{   /* letting join_type reduce to empty doesn't work */
					JoinExpr *n = makeNode(JoinExpr); n->jointype = JOIN_INNER; n->isNatural = TRUE;
					n->larg = $1; n->rarg = $4; n->usingClause = NIL; /* figure out which columns later... */ n->quals = NULL; /* fill later */
					$$ = n; }

For the same reason we must treat 'JOIN' and 'join_type JOIN' separately, rather than allowing join_type to expand to empty; if we try it, the parser generator can't figure out when to reduce an empty join_type right after table_ref. The join forms supported by the grammar above are summarized below. join_type accepts FULL [OUTER], LEFT [OUTER], RIGHT [OUTER] and INNER (tokens FULL, LEFT, RIGHT and INNER_P, with OUTER_P as optional noise).

| Parser syntax | jointype | isNatural | usingClause | quals |
|---|---|---|---|---|
| CROSS JOIN (same as an unqualified inner join) | JOIN_INNER | FALSE | NIL | NULL |
| join_type JOIN | join_type | FALSE | join_qual USING clause | join_qual ON clause |
| JOIN | JOIN_INNER | FALSE | join_qual USING clause | join_qual ON clause |
| NATURAL join_type JOIN | join_type | TRUE | NIL | NULL |
| NATURAL JOIN | JOIN_INNER | TRUE | NIL | NULL |

join_type:	FULL join_outer							{ $$ = JOIN_FULL; }
			| LEFT join_outer						{ $$ = JOIN_LEFT; }
			| RIGHT join_outer						{ $$ = JOIN_RIGHT; }
			| INNER_P								{ $$ = JOIN_INNER; }
/* OUTER is just noise... */
join_outer: OUTER_P									{ $$ = NULL; }
			| /*EMPTY*/								{ $$ = NULL; }
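
The mapping from syntax form to JoinExpr fields can be sketched in plain C. This is only an illustration under stated assumptions: MiniJoinExpr, QualKind and the three make_* helpers are hypothetical names, and the struct replaces the real node trees (usingClause, quals) with boolean flags. It is not the parser's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* The four codes the parser can emit, in the order of the JoinType enum. */
typedef enum { JOIN_INNER, JOIN_LEFT, JOIN_FULL, JOIN_RIGHT } JoinType;

/* Hypothetical tag for what join_qual produced. */
typedef enum { QUAL_USING, QUAL_ON } QualKind;

/* Hypothetical, flattened stand-in for JoinExpr (flags instead of node trees). */
typedef struct {
    JoinType jointype;
    bool     isNatural;
    bool     hasUsingClause;   /* stands for usingClause != NIL */
    bool     hasQuals;         /* stands for quals != NULL */
} MiniJoinExpr;

/* table_ref CROSS JOIN table_ref: an unqualified inner join. */
MiniJoinExpr make_cross_join(void)
{
    MiniJoinExpr n = { JOIN_INNER, false, false, false };
    return n;
}

/* table_ref [join_type] JOIN table_ref join_qual.
 * Pass JOIN_INNER for a bare JOIN, as the grammar action does. */
MiniJoinExpr make_qualified_join(JoinType jt, QualKind qual)
{
    MiniJoinExpr n = { jt, false, false, false };
    assert(qual == QUAL_USING || qual == QUAL_ON);
    n.hasUsingClause = (qual == QUAL_USING);   /* USING (...) list */
    n.hasQuals = (qual == QUAL_ON);            /* ON expression */
    return n;
}

/* table_ref NATURAL [join_type] JOIN table_ref.
 * The join columns and quals are filled in later, so both start empty. */
MiniJoinExpr make_natural_join(JoinType jt)
{
    MiniJoinExpr n = { jt, true, false, false };
    return n;
}
```

For example, make_qualified_join(JOIN_LEFT, QUAL_USING) corresponds to the table row for join_type JOIN with a USING clause, and make_natural_join(JOIN_INNER) to the NATURAL JOIN row.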

Two tables are prepared first, Student and Course. The C_S_Id column of Student is a foreign key that references the primary key column C_Id of Course.

JOIN_INNER
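PG syntax CROSS JOIN (CROSS JOIN is the same as an unqualified inner join), internal type JOIN_INNER. A cross join returns the Cartesian product of the two joined tables, so the number of result rows is the product of the two tables' row counts. With no join condition: select * from Student s cross join Course c
PG syntax INNER_P (INNER JOIN), internal type JOIN_INNER. An inner join returns only the row pairs that satisfy the ON condition, that is, the data present in both tables: select * from Student s inner join Course c on s.C_S_Id=c.C_Id
PG syntax JOIN with a USING clause or an ON clause, internal type JOIN_INNER: same as above.
PG syntax NATURAL INNER_P JOIN, internal type JOIN_INNER. Frankly, this kind of join has little practical value, but since it is defined in the SQL2 standard, here are some examples. A natural join does not name its join columns: SQL looks for columns that have the same name in both tables, joins on equality of all of them, and returns each such column only once. An ON or USING clause is not allowed. In PostgreSQL the select list is not restricted to *, although * is the usual way to see the merged columns. NATURAL can be specified for every join type except CROSS JOIN. A few examples follow.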

SELECT * FROM ORDERS O NATURAL INNER JOIN CUSTOMERS C;
SELECT * FROM ORDERS O NATURAL LEFT OUTER JOIN CUSTOMERS C;
SELECT * FROM ORDERS O NATURAL RIGHT OUTER JOIN CUSTOMERS C;
SELECT * FROM ORDERS O NATURAL FULL OUTER JOIN CUSTOMERS C;

PG syntax NATURAL JOIN, internal type JOIN_INNER: same as above.

JOIN_FULL
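PG syntax FULL OUTER_P/FULL (FULL [OUTER] JOIN) with a USING clause or an ON clause, internal type JOIN_FULL. A full outer join (full join / full outer join) returns all rows of both tables. Row pairs that satisfy the ON condition are joined; where table a has no matching row its columns are returned as null, and where table b has no matching row its columns are returned as null. The result is the union of the left and right outer joins. select * from Student s full join Course c on s.C_S_Id=c.C_Id
PG syntax NATURAL FULL OUTER_P/FULL JOIN, internal type JOIN_FULL.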

JOIN_LEFT
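PG syntax LEFT OUTER_P/LEFT (LEFT [OUTER] JOIN) with a USING clause or an ON clause, internal type JOIN_LEFT. A left outer join (left join / left outer join) is driven by the left table: it returns every row of the left table, with values from the right table where the ON condition matches and null where it does not. select * from Student s left join Course c on s.C_S_Id=c.C_Id
PG syntax NATURAL LEFT OUTER_P/LEFT JOIN, internal type JOIN_LEFT.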

JOIN_RIGHT
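PG syntax RIGHT OUTER_P/RIGHT (RIGHT [OUTER] JOIN) with a USING clause or an ON clause, internal type JOIN_RIGHT. A right outer join (right join / right outer join) is driven by the right table: it returns every row of the right table, with values from the left table where the ON condition matches and null where it does not. select * from Student s right join Course c on s.C_S_Id=c.C_Id
PG syntax NATURAL RIGHT OUTER_P/RIGHT JOIN, internal type JOIN_RIGHT.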

/* JoinType - enums for types of relation joins
 * JoinType determines the exact semantics of joining two relations using
 * a matching qualification.  For example, it tells what to do with a tuple
 * that has no match in the other relation.
 * This is needed in both parsenodes.h and plannodes.h, so put it here... */
typedef enum JoinType{
	/* The canonical kinds of joins according to the SQL JOIN syntax. Only these codes can appear in parser output (e.g., JoinExpr nodes). */
	JOIN_INNER,					/* matching tuple pairs only */
	JOIN_LEFT,					/* pairs + unmatched LHS tuples */
	JOIN_FULL,					/* pairs + unmatched LHS + unmatched RHS */
	JOIN_RIGHT,					/* pairs + unmatched RHS tuples */
} JoinType;
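
The enum comments ("pairs + unmatched LHS tuples" and so on) can be made concrete with a small nested-loop join over integer keys. This is a minimal sketch, not executor code: nested_loop_join, JoinedRow and MAX_SIDE are hypothetical names, equality of two ints stands in for the ON condition, and -1 marks the null-extended side of an output row.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { JOIN_INNER, JOIN_LEFT, JOIN_FULL, JOIN_RIGHT } JoinType;

/* One output row: indexes into the two inputs, or -1 for a null-extended side. */
typedef struct { int lhs; int rhs; } JoinedRow;

#define MAX_SIDE 64   /* hypothetical bound so the sketch needs no malloc */

/* Joins lhs and rhs on key equality and returns the number of rows written to out. */
size_t nested_loop_join(JoinType jt,
                        const int *lhs, size_t nl,
                        const int *rhs, size_t nr,
                        JoinedRow *out, size_t cap)
{
    bool   rhs_matched[MAX_SIDE] = { false };
    size_t n = 0;

    assert(nl <= MAX_SIDE && nr <= MAX_SIDE);
    assert(cap >= nl * nr + nl + nr);          /* generous upper bound on output */

    for (size_t i = 0; i < nl; i++) {
        bool matched = false;
        for (size_t j = 0; j < nr; j++) {
            if (lhs[i] == rhs[j]) {            /* matching tuple pair */
                out[n++] = (JoinedRow){ (int) i, (int) j };
                matched = true;
                rhs_matched[j] = true;
            }
        }
        /* LEFT and FULL keep unmatched LHS tuples, null-extended on the right. */
        if (!matched && (jt == JOIN_LEFT || jt == JOIN_FULL))
            out[n++] = (JoinedRow){ (int) i, -1 };
    }

    /* RIGHT and FULL keep unmatched RHS tuples, null-extended on the left. */
    if (jt == JOIN_RIGHT || jt == JOIN_FULL)
        for (size_t j = 0; j < nr; j++)
            if (!rhs_matched[j])
                out[n++] = (JoinedRow){ -1, (int) j };

    return n;
}
```

With made-up keys {1, 2, 9} on the left and {1, 2, 3} on the right, the inner join yields the two matching pairs, the left and right joins each add one null-extended row, and the full join adds both.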

Join types produced by conversion

/* JoinType - enums for types of relation joins
 * JoinType determines the exact semantics of joining two relations using
 * a matching qualification.  For example, it tells what to do with a tuple
 * that has no match in the other relation.
 * This is needed in both parsenodes.h and plannodes.h, so put it here... */
typedef enum JoinType{
	/* Semijoins and anti-semijoins (as defined in relational theory) do not
	 * appear in the SQL JOIN syntax, but there are standard idioms for
	 * representing them (e.g., using EXISTS).  The planner recognizes these
	 * cases and converts them to joins.  So the planner and executor must
	 * support these codes.  NOTE: in JOIN_SEMI output, it is unspecified
	 * which matching RHS row is joined to.  In JOIN_ANTI output, the row is
	 * guaranteed to be null-extended. */
	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
	JOIN_LASJ_NOTIN,			/* Left Anti Semi Join with Not-In semantics: If any NULL values are produced by inner side, return no join results. Otherwise, same as LASJ */
} JoinType;

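A semi join (SEMI JOIN) returns a row of the first table when the second table contains one or more matching rows. Unlike an ordinary JOIN, each row of the first table is returned at most once. A SEMI JOIN usually cannot be written directly in SQL; it is obtained by converting an IN or EXISTS subquery. SQL examples: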

SELECT * FROM employees WHERE dept_name IN ( SELECT dept_name FROM departments ) 
SELECT * FROM employees WHERE EXISTS ( SELECT * FROM departments WHERE employees.dept_name = departments.dept_name )

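An anti join (ANTI JOIN) is the opposite of a semi join: it returns a row of the first table when the second table contains no matching row. An ANTI JOIN usually cannot be written directly in SQL; it is obtained by converting a NOT IN or NOT EXISTS subquery. SQL examples: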

SELECT * FROM employees WHERE dept_name NOT IN ( SELECT dept_name FROM departments ) 
SELECT * FROM employees WHERE NOT EXISTS ( SELECT * FROM departments WHERE employees.dept_name = departments.dept_name )
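
The difference between the three converted join types shows up most clearly when the inner side contains NULL. The sketch below follows the enum comments quoted above, under a few assumptions: the helper names and the NullableInt type are hypothetical, outer-side keys are plain non-null ints (NULL on the outer side is not modeled), and key equality stands in for the join condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Nullable key for the inner (RHS) side; a hypothetical helper type. */
typedef struct { bool isnull; int value; } NullableInt;

/* True if at least one RHS key equals key; a NULL key never compares equal. */
bool has_match(int key, const NullableInt *rhs, size_t nr)
{
    for (size_t j = 0; j < nr; j++)
        if (!rhs[j].isnull && rhs[j].value == key)
            return true;
    return false;
}

/* JOIN_SEMI: one copy of each LHS row that has match(es). out must hold nl ints. */
size_t semi_join(const int *lhs, size_t nl,
                 const NullableInt *rhs, size_t nr, int *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < nl; i++)
        if (has_match(lhs[i], rhs, nr))
            out[n++] = lhs[i];
    return n;
}

/* JOIN_ANTI: one copy of each LHS row that has no match (NOT EXISTS behaviour). */
size_t anti_join(const int *lhs, size_t nl,
                 const NullableInt *rhs, size_t nr, int *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < nl; i++)
        if (!has_match(lhs[i], rhs, nr))
            out[n++] = lhs[i];
    return n;
}

/* JOIN_LASJ_NOTIN: any NULL on the inner side empties the result,
 * otherwise it behaves like the anti join above. */
size_t lasj_notin(const int *lhs, size_t nl,
                  const NullableInt *rhs, size_t nr, int *out)
{
    for (size_t j = 0; j < nr; j++)
        if (rhs[j].isnull)
            return 0;
    return anti_join(lhs, nl, rhs, nr, out);
}
```

With outer keys {10, 20, 20, 30} and inner keys {20, 20, 40}, semi_join keeps the two rows with key 20 exactly once each, while anti_join and lasj_notin both keep 10 and 30. Adding a NULL to the inner side leaves anti_join unchanged but makes lasj_notin return nothing.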

As the comment above indicates, these JOIN types are produced when sublinks are pulled up and converted into joins. The call stacks are:

pull_up_sublinks --> pull_up_sublinks_jointree_recurse --> pull_up_sublinks_qual_recurse --> convert_ANY_sublink_to_join --> result->jointype = JOIN_SEMI
pull_up_sublinks --> pull_up_sublinks_jointree_recurse --> pull_up_sublinks_qual_recurse --> convert_EXISTS_sublink_to_join --> result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
pull_up_sublinks --> pull_up_sublinks_jointree_recurse --> pull_up_sublinks_qual_recurse --> convert_IN_to_antijoin --> JoinExpr *join_expr = make_join_expr(NULL, subq_indx, JOIN_LASJ_NOTIN)

subquery_planner --> [if we have any outer joins, try to reduce them to plain inner joins] reduce_outer_joins --> reduce_outer_joins_pass2 --> jointype = JOIN_ANTI see if we can reduce JOIN_LEFT to JOIN_ANTI
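
The outcome of the three sublink call stacks can be condensed into one small decision function. This is a sketch of the result only, not of the planner code: SublinkForm, ConvertedJoinType and jointype_for_sublink are hypothetical names, and the under_not flag mirrors the expression used in convert_EXISTS_sublink_to_join.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the three codes produced by sublink pull-up are modeled here. */
typedef enum { JOIN_SEMI, JOIN_ANTI, JOIN_LASJ_NOTIN } ConvertedJoinType;

/* Hypothetical classification of the sublink being pulled up. */
typedef enum { SUBLINK_ANY, SUBLINK_EXISTS, SUBLINK_NOT_IN } SublinkForm;

/* Returns the join type each conversion routine assigns.
 * under_not only matters for EXISTS and is ignored for the other forms. */
ConvertedJoinType jointype_for_sublink(SublinkForm form, bool under_not)
{
    switch (form) {
    case SUBLINK_ANY:        /* convert_ANY_sublink_to_join */
        return JOIN_SEMI;
    case SUBLINK_EXISTS:     /* convert_EXISTS_sublink_to_join */
        return under_not ? JOIN_ANTI : JOIN_SEMI;
    case SUBLINK_NOT_IN:     /* convert_IN_to_antijoin */
        return JOIN_LASJ_NOTIN;
    }
    assert(0 && "unknown sublink form");
    return JOIN_SEMI;
}
```

So an IN subquery maps to JOIN_SEMI, EXISTS and NOT EXISTS map to JOIN_SEMI and JOIN_ANTI, and NOT IN maps to JOIN_LASJ_NOTIN.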

Join types used internally by the optimizer

/* JoinType - enums for types of relation joins
 * JoinType determines the exact semantics of joining two relations using
 * a matching qualification.  For example, it tells what to do with a tuple
 * that has no match in the other relation.
 * This is needed in both parsenodes.h and plannodes.h, so put it here... */
typedef enum JoinType{
	/* These codes are used internally in the planner, but are not supported
	 * by the executor (nor, indeed, by most of the planner). */
	JOIN_UNIQUE_OUTER,			/* LHS path must be made unique */
	JOIN_UNIQUE_INNER,			/* RHS path must be made unique */

	/* GPDB: Like JOIN_UNIQUE_OUTER/INNER, these codes are used internally
	 * in the planner, but are not supported by the executor or by most of the
	 * planner. A JOIN_DEDUP_SEMI join indicates a semi-join, but to be
	 * implemented by performing a normal inner join, and eliminating the
	 * duplicates with a UniquePath above the join. That can be useful in
	 * an MPP environment, if performing the join as an inner join avoids
	 * moving the larger of the two relations. */
	JOIN_DEDUP_SEMI,			/* inner join, LHS path must be made unique afterwards */
	JOIN_DEDUP_SEMI_REVERSE		/* inner join, RHS path must be made unique afterwards */
} JoinType;
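
Why these internal codes can stand in for a semi join is easiest to see on a toy example. The sketch below computes the same semi join three ways: directly, by making the RHS unique before an inner join (the JOIN_UNIQUE_INNER idea), and by running an inner join and removing duplicate LHS rows afterwards (the JOIN_DEDUP_SEMI idea). The function names and MAX_ROWS are hypothetical, and the LHS row index is assumed to serve as the row identity used for deduplication.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROWS 32   /* hypothetical bound so the sketch needs no malloc */

/* Plain semi join: emit each LHS row index once if any RHS key matches. */
size_t semi_join_direct(const int *lhs, size_t nl,
                        const int *rhs, size_t nr, size_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nl; i++)
        for (size_t j = 0; j < nr; j++)
            if (lhs[i] == rhs[j]) { out[n++] = i; break; }
    return n;
}

/* JOIN_UNIQUE_INNER idea: make the RHS unique, then run an ordinary inner join. */
size_t semi_via_unique_inner(const int *lhs, size_t nl,
                             const int *rhs, size_t nr, size_t *out)
{
    int    uniq[MAX_ROWS];
    size_t nu = 0, n = 0;

    assert(nr <= MAX_ROWS);
    for (size_t j = 0; j < nr; j++) {          /* collect distinct RHS keys */
        size_t k = 0;
        while (k < nu && uniq[k] != rhs[j])
            k++;
        if (k == nu)
            uniq[nu++] = rhs[j];
    }
    for (size_t i = 0; i < nl; i++)            /* inner join: at most one match per row */
        for (size_t k = 0; k < nu; k++)
            if (lhs[i] == uniq[k])
                out[n++] = i;
    return n;
}

/* JOIN_DEDUP_SEMI idea: inner join first, then make the LHS rows unique. */
size_t semi_via_dedup(const int *lhs, size_t nl,
                      const int *rhs, size_t nr, size_t *out)
{
    size_t joined[MAX_ROWS * MAX_ROWS];
    size_t nj = 0, n = 0;

    assert(nl <= MAX_ROWS && nr <= MAX_ROWS);
    for (size_t i = 0; i < nl; i++)            /* inner join, duplicates included */
        for (size_t j = 0; j < nr; j++)
            if (lhs[i] == rhs[j])
                joined[nj++] = i;
    for (size_t k = 0; k < nj; k++)            /* joined is grouped by LHS index */
        if (k == 0 || joined[k] != joined[k - 1])
            out[n++] = joined[k];
    return n;
}
```

All three functions return the same LHS row indexes for the same inputs; which strategy is cheaper depends on the data, and in an MPP setting on which relation would have to be moved, as the GPDB comment above notes.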

These JOIN types determine the exact semantics of joining two relations using a matching qualification; for example, they say what to do with a tuple that has no match in the other relation. They are used in make_join_rel, which works as follows. make_join_rel: Find or create a join RelOptInfo that represents the join of the two given rels, and add to it path information for paths created with the two rels as outer and inner rel. (The join rel may already contain paths generated from other pairs of rels that add up to the same set of base rels.)

  1. Construct Relids set that identifies the joinrel. Relids joinrelids = bms_union(rel1->relids, rel2->relids);
  2. Check validity and determine join type. join_is_legal(root, rel1, rel2, joinrelids, &sjinfo, &reversed)
  3. Find or build the join RelOptInfo, and compute the restrictlist that goes with this particular joining. RelOptInfo *joinrel = build_join_rel(root, joinrelids, rel1, rel2, sjinfo, &restrictlist);
  4. For sjinfo->jointype == JOIN_INNER: add_paths_to_joinrel(root, joinrel, rel1, rel2, JOIN_INNER, sjinfo, restrictlist); add_paths_to_joinrel(root, joinrel, rel2, rel1, JOIN_INNER, sjinfo, restrictlist);
  5. For sjinfo->jointype == JOIN_LEFT: add_paths_to_joinrel(root, joinrel, rel1, rel2, JOIN_LEFT, sjinfo, restrictlist); add_paths_to_joinrel(root, joinrel, rel2, rel1, JOIN_RIGHT, sjinfo, restrictlist);
  6. For sjinfo->jointype == JOIN_FULL: add_paths_to_joinrel(root, joinrel, rel1, rel2, JOIN_FULL, sjinfo, restrictlist); add_paths_to_joinrel(root, joinrel, rel2, rel1, JOIN_FULL, sjinfo, restrictlist);
  7. For sjinfo->jointype == JOIN_SEMI: We might have a normal semijoin, or a case where we don't have enough rels to do the semijoin but can unique-ify the RHS and then do an innerjoin (see comments in join_is_legal). In the latter case we can't apply JOIN_SEMI joining.
     Where JOIN_SEMI joining does apply: add_paths_to_joinrel(root, joinrel, rel1, rel2, JOIN_SEMI, sjinfo, restrictlist); add_paths_to_joinrel(root, joinrel, rel1, rel2, JOIN_DEDUP_SEMI, sjinfo, restrictlist); add_paths_to_joinrel(root, joinrel, rel2, rel1, JOIN_DEDUP_SEMI_REVERSE, sjinfo, restrictlist);
     If we know how to unique-ify the RHS and one input rel is exactly the RHS (not a superset) we can consider unique-ifying it and then doing a regular join: create_unique_path(root, rel2, rel2->cheapest_total_path, sjinfo); add_paths_to_joinrel(root, joinrel, rel1, rel2, JOIN_UNIQUE_INNER, sjinfo, restrictlist); add_paths_to_joinrel(root, joinrel, rel2, rel1, JOIN_UNIQUE_OUTER, sjinfo, restrictlist);
  8. For sjinfo->jointype == JOIN_ANTI or JOIN_LASJ_NOTIN: add_paths_to_joinrel(root, joinrel, rel1, rel2, sjinfo->jointype, sjinfo, restrictlist)
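
The dispatch in steps 4 to 8 can be summarized as a function that lists which (outer rel, inner rel, join type) combinations are handed to add_paths_to_joinrel. This is a rough sketch of the control flow described above, not the planner's code: Attempt and list_join_attempts are hypothetical names, rels are reduced to the numbers 1 and 2, and the two boolean parameters are stand-ins for the conditions that the real function checks on sjinfo and the input rels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    JOIN_INNER, JOIN_LEFT, JOIN_FULL, JOIN_RIGHT,
    JOIN_SEMI, JOIN_ANTI, JOIN_LASJ_NOTIN,
    JOIN_UNIQUE_OUTER, JOIN_UNIQUE_INNER,
    JOIN_DEDUP_SEMI, JOIN_DEDUP_SEMI_REVERSE
} JoinType;

/* One add_paths_to_joinrel call: which rel is outer, which is inner, and the type. */
typedef struct { int outer_rel; int inner_rel; JoinType jointype; } Attempt;

#define MAX_ATTEMPTS 5   /* the JOIN_SEMI case can produce up to five calls */

/* Fills out (at least MAX_ATTEMPTS entries) and returns the number of calls.
 * semi_applies:  hypothetical flag, true when plain JOIN_SEMI joining is possible.
 * rhs_unique_ok: hypothetical flag, true when the RHS can be unique-ified. */
size_t list_join_attempts(JoinType sj_type, bool semi_applies,
                          bool rhs_unique_ok, Attempt *out)
{
    size_t n = 0;

    assert(out != NULL);
    switch (sj_type) {
    case JOIN_INNER:
        out[n++] = (Attempt){ 1, 2, JOIN_INNER };
        out[n++] = (Attempt){ 2, 1, JOIN_INNER };
        break;
    case JOIN_LEFT:                      /* swapping the inputs turns LEFT into RIGHT */
        out[n++] = (Attempt){ 1, 2, JOIN_LEFT };
        out[n++] = (Attempt){ 2, 1, JOIN_RIGHT };
        break;
    case JOIN_FULL:
        out[n++] = (Attempt){ 1, 2, JOIN_FULL };
        out[n++] = (Attempt){ 2, 1, JOIN_FULL };
        break;
    case JOIN_SEMI:
        if (semi_applies) {
            out[n++] = (Attempt){ 1, 2, JOIN_SEMI };
            out[n++] = (Attempt){ 1, 2, JOIN_DEDUP_SEMI };
            out[n++] = (Attempt){ 2, 1, JOIN_DEDUP_SEMI_REVERSE };
        }
        if (rhs_unique_ok) {
            out[n++] = (Attempt){ 1, 2, JOIN_UNIQUE_INNER };
            out[n++] = (Attempt){ 2, 1, JOIN_UNIQUE_OUTER };
        }
        break;
    case JOIN_ANTI:
    case JOIN_LASJ_NOTIN:                /* only one orientation is tried */
        out[n++] = (Attempt){ 1, 2, sj_type };
        break;
    default:                             /* other codes are not covered by this sketch */
        break;
    }
    return n;
}
```

Calling list_join_attempts(JOIN_SEMI, true, true, attempts) returns five entries, matching the five add_paths_to_joinrel calls listed in step 7, while the anti-join cases return a single entry.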
