相关:
《Postgresql源码(105)分区表剪枝代码分析》
《Postgresql源码(106)Generic Plan与Custom Plan的区别(以分区表为例)》
实例
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2006m01 PARTITION OF measurement FOR VALUES FROM ('2006-01-01') TO ('2006-02-01');
CREATE TABLE measurement_y2006m02 PARTITION OF measurement FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
CREATE TABLE measurement_y2006m04 PARTITION OF measurement FOR VALUES FROM ('2006-04-01') TO ('2006-05-01');
CREATE TABLE measurement_y2006m05 PARTITION OF measurement FOR VALUES FROM ('2006-05-01') TO ('2006-06-01');
insert into measurement values (1, '2006-01-03', floor(random() * 100), floor(random() * 100));
insert into measurement values (2, '2006-02-04', floor(random() * 100), floor(random() * 100));
insert into measurement values (3, '2006-03-05', floor(random() * 100), floor(random() * 100));
insert into measurement values (4, '2006-03-06', floor(random() * 100), floor(random() * 100));
insert into measurement values (5, '2006-03-07', floor(random() * 100), floor(random() * 100));
insert into measurement values (6, '2006-03-08', floor(random() * 100), floor(random() * 100));
insert into measurement values (7, '2006-04-09', floor(random() * 100), floor(random() * 100));
insert into measurement values (8, '2006-04-10', floor(random() * 100), floor(random() * 100));
insert into measurement values (9, '2006-05-11', floor(random() * 100), floor(random() * 100));
insert into measurement values (10, '2006-05-12', floor(random() * 100), floor(random() * 100));
prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;
explain analyze execute p1('2006-04-05', '2006-04-20');
首先看Custom Plan(强制choose_custom_plan返回true)
postgres=# explain analyze execute p1('2006-04-05', '2006-04-20');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Seq Scan on measurement_y2006m04 measurement (cost=0.00..37.75 rows=9 width=16) (actual time=0.017..0.020 rows=6 loops=1)
Filter: ((logdate >= '2006-04-05'::date) AND (logdate <= '2006-04-20'::date))
Planning Time: 1620.157 ms
Execution Time: 0.041 ms
再看Generic Plan(强制choose_custom_plan返回false)
postgres=# explain analyze execute p1('2006-04-05', '2006-04-20');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..188.97 rows=45 width=16) (actual time=0.027..0.034 rows=6 loops=1)
Subplans Removed: 4
-> Seq Scan on measurement_y2006m04 measurement_1 (cost=0.00..37.75 rows=9 width=16) (actual time=0.024..0.028 rows=6 loops=1)
Filter: ((logdate >= $1) AND (logdate <= $2))
Planning Time: 28898.081 ms
Execution Time: 0.102 ms
(6 rows)
问题一:看起来都剪枝了?
是的,从计划上来看,只能看到4月表,确实都剪枝了。
但仔细看还是有区别的,Generic Plan多了一行Subplans Removed: 4
。这表示Generic计划实际是没剪枝的,是运行时做的prune。
证明:查看GetCachedPlan生成plan,可以看到下面的计划树把每个月分片都戴上了,明显未剪枝:
计划中是没剪枝的,但在计划中记录了裁剪的信息:make_partition_pruneinfo。
然后在执行器启动阶段:
#0 ExecInitAppend (node=0x17435a8, estate=0x173b3b0, eflags=16) at nodeAppend.c:111
#1 0x00000000007566d8 in ExecInitNode (node=0x17435a8, estate=0x173b3b0, eflags=16) at execProcnode.c:182
#2 0x000000000074c12c in InitPlan (queryDesc=0x1737b48, eflags=16) at execMain.c:938
#3 0x000000000074b112 in standard_ExecutorStart (queryDesc=0x1737b48, eflags=16) at execMain.c:265
#4 0x000000000074ae78 in ExecutorStart (queryDesc=0x1737b48, eflags=0) at execMain.c:144
#5 0x00000000006a69fc in ExplainOnePlan (plannedstmt=0x1743510, into=0x0, es=0x1744b70, queryString=0x17287d0 "prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;", params=0x1744e58, queryEnv=0x0, planduration=0x7ffc0dc5c470, bufusage=0x0) at explain.c:579
#6 0x00000000006d3a35 in ExplainExecuteQuery (execstmt=0x16370a0, into=0x0, es=0x1744b70, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", params=0x0, queryEnv=0x0) at prepare.c:642
#7 0x00000000006a6807 in ExplainOneUtility (utilityStmt=0x16370a0, into=0x0, es=0x1744b70, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", params=0x0, queryEnv=0x0) at explain.c:486
#8 0x00000000006a62da in ExplainOneQuery (query=0x1637368, cursorOptions=2048, into=0x0, es=0x1744b70, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", params=0x0, queryEnv=0x0) at explain.c:375
#9 0x00000000006a5f8b in ExplainQuery (pstate=0x1658dc0, stmt=0x16370d8, params=0x0, dest=0x1658d28) at explain.c:281
#10 0x00000000009935b5 in standard_ProcessUtility (pstmt=0x16375f0, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1658d28, qc=0x7ffc0dc5c9b0) at utility.c:870
#11 0x0000000000992d3e in ProcessUtility (pstmt=0x16375f0, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1658d28, qc=0x7ffc0dc5c9b0) at utility.c:530
#12 0x0000000000991947 in PortalRunUtility (portal=0x16d6250, pstmt=0x16375f0, isTopLevel=true, setHoldSnapshot=true, dest=0x1658d28, qc=0x7ffc0dc5c9b0) at pquery.c:1158
#13 0x00000000009916b3 in FillPortalStore (portal=0x16d6250, isTopLevel=true) at pquery.c:1031
#14 0x000000000099101c in PortalRun (portal=0x16d6250, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x16376e0, altdest=0x16376e0, qc=0x7ffc0dc5cb80) at pquery.c:763
#15 0x000000000098ac9f in exec_simple_query (query_string=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');") at postgres.c:1250
#16 0x000000000098f1e0 in PostgresMain (dbname=0x16620d8 "postgres", username=0x16620b8 "mingjie") at postgres.c:4593
#17 0x00000000008cc052 in BackendRun (port=0x1658840) at postmaster.c:4511
#18 0x00000000008cb9eb in BackendStartup (port=0x1658840) at postmaster.c:4239
#19 0x00000000008c7fe8 in ServerLoop () at postmaster.c:1806
#20 0x00000000008c78ba in PostmasterMain (argc=1, argv=0x1630110) at postmaster.c:1478
#21 0x00000000007cdaff in main (argc=1, argv=0x1630110) at main.c:202
ExecInitAppend函数会根据优化器记录的信息,node->part_prune_info:
来对执行计划进行运行时剪枝:ExecInitPartitionPruning
从上图part_prune_info的结构来看,运行是剪枝和优化器剪枝用的是一套逻辑,都是先走prune step,最后combine,详见上一篇。
问题二:为什么Generic Plan不能剪枝?
Generic Plan是不太参数的计划,也就是只能靠
prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;
生成计划,无法拿到参数也就无法剪枝了。
代码上有什么区别呢?
对比下prune_append_rel_partitions的执行流程区别:
prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;
explain analyze execute p1('2006-04-05', '2006-04-20');