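From "A Brief Analysis of the ORCA Optimizer: Overview of the Main Flow" we know that requirements must be placed on the optimizer before the actual optimizer engine starts running, such as the required columns and required sort orders discussed later. CQueryContext is the class that carries these requirements. A CQueryContext is constructed by the PqcGenerate function, whose signature is:
static CQueryContext *PqcGenerate(
    CMemoryPool *mp,                          /* memory pool */
    CExpression *pexpr,                       /* expression representing the query */
    ULongPtrArray *pdrgpulQueryOutputColRefId, /* array of output column reference id */
    CMDNameArray *pdrgpmdname,                /* array of output column names */
    BOOL fDeriveStats); // generate the query context for the given expression and array of output column ref ids
Its call site is shown in the figure above. The arguments are the outputs of the previous step, such as the EXPR tree that represents the query. The resulting CQueryContext instance is assigned to CEngine's m_pqc member during CEngine::Init.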
CQueryContext *CQueryContext::PqcGenerate(CMemoryPool *mp, CExpression *pexpr, ULongPtrArray *pdrgpulQueryOutputColRefId, CMDNameArray *pdrgpmdname, BOOL fDeriveStats) {
    CColRefSet *pcrs = GPOS_NEW(mp) CColRefSet(mp);
    CColRefArray *colref_array = GPOS_NEW(mp) CColRefArray(mp);
    COptCtxt *poptctxt = COptCtxt::PoctxtFromTLS();
    CColumnFactory *col_factory = poptctxt->Pcf();
    // Collect required column references (colref_array)
    const ULONG length = pdrgpulQueryOutputColRefId->Size();
    for (ULONG ul = 0; ul < length; ul++) {
        ULONG *pul = (*pdrgpulQueryOutputColRefId)[ul];
        CColRef *colref = col_factory->LookupColRef(*pul);
        pcrs->Include(colref);
        colref_array->Append(colref);
    }
    // Collect required properties (prpp) at the top level:
    // By default no sort order requirement is added, unless the root operator in
    // the input logical expression is a LIMIT. This is because Orca always
    // attaches top level Sort to a LIMIT node.
    COrderSpec *pos = NULL;
    CExpression *pexprResult = pexpr;
    COperator *popTop = PopTop(pexpr);
    if (COperator::EopLogicalLimit == popTop->Eopid()) {
        // top level operator is a limit, copy order spec to query context
        pos = CLogicalLimit::PopConvert(popTop)->Pos();
        pos->AddRef();
    } else {
        pos = GPOS_NEW(mp) COrderSpec(mp); // no order required
    }
    CDistributionSpec *pds = NULL;
    BOOL fDML = CUtils::FLogicalDML(pexpr->Pop());
    poptctxt->MarkDMLQuery(fDML);
    // DML commands do not have distribution requirement. Otherwise the
    // distribution requirement is Singleton.
    if (fDML) {
        pds = GPOS_NEW(mp) CDistributionSpecAny(COperator::EopSentinel);
    } else {
        pds = GPOS_NEW(mp) CDistributionSpecSingleton(CDistributionSpecSingleton::EstMaster);
    }
    // By default, no rewindability requirement needs to be satisfied at the top level
    CRewindabilitySpec *prs = GPOS_NEW(mp) CRewindabilitySpec(
        CRewindabilitySpec::ErtNone, CRewindabilitySpec::EmhtNoMotion);
    // Ensure order, distribution and rewindability meet 'satisfy' matching at the top level
    CEnfdOrder *peo = GPOS_NEW(mp) CEnfdOrder(pos, CEnfdOrder::EomSatisfy);
    CEnfdDistribution *ped =
        GPOS_NEW(mp) CEnfdDistribution(pds, CEnfdDistribution::EdmSatisfy);
    CEnfdRewindability *per =
        GPOS_NEW(mp) CEnfdRewindability(prs, CEnfdRewindability::ErmSatisfy);
    // Required CTEs are obtained from the CTEInfo global information in the optimizer context
    CCTEReq *pcter = poptctxt->Pcteinfo()->PcterProducers(mp);
    // NB: Partition propagation requirements are not initialized here. They are
    // constructed later based on derived relation properties (CPartInfo) by
    // CReqdPropPlan::InitReqdPartitionPropagation().
    CReqdPropPlan *prpp = GPOS_NEW(mp) CReqdPropPlan(pcrs, peo, ped, per, pcter);
    // Finally, create the CQueryContext
    pdrgpmdname->AddRef();
    return GPOS_NEW(mp) CQueryContext(mp, pexprResult, prpp, colref_array, pdrgpmdname, fDeriveStats);
}
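The decisions PqcGenerate makes at the top level can be condensed into a small standalone sketch. Everything here is a simplified, hypothetical stand-in (OpKind, DistReq, QueryRoot, TopLevelReqs and DeriveTopLevelReqs are not GPORCA names), and it assumes the root operator alone decides both the ordering and the DML check, which glosses over the PopTop helper used in the real function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for ORCA concepts; none of these names exist in GPORCA.
enum class OpKind { Limit, Dml, Other };
enum class DistReq { Any, Singleton };

// The root of the logical query tree, reduced to what the sketch needs.
struct QueryRoot {
    OpKind kind;
    std::vector<std::string> limitOrder;  // order spec carried by a Limit root
};

// Plays the role of CReqdPropPlan plus the output column array.
struct TopLevelReqs {
    std::vector<unsigned> requiredCols;  // output column reference ids
    std::vector<std::string> sortKeys;   // empty means "no order required"
    DistReq distribution = DistReq::Singleton;
    bool rewindable = false;             // never required at the top level
};

// Follows the same decision structure as the function above:
// 1. every output column is required;
// 2. an order is required only when the root is a Limit;
// 3. DML accepts any distribution, everything else must be a singleton.
TopLevelReqs DeriveTopLevelReqs(const QueryRoot &root,
                                const std::vector<unsigned> &outputColIds) {
    TopLevelReqs reqs;
    reqs.requiredCols = outputColIds;
    if (root.kind == OpKind::Limit) {
        reqs.sortKeys = root.limitOrder;
    }
    reqs.distribution =
        (root.kind == OpKind::Dml) ? DistReq::Any : DistReq::Singleton;
    return reqs;
}
```

Calling DeriveTopLevelReqs with a Limit root copies its order spec into the requirements, while a DML root relaxes the distribution requirement to Any; in both cases rewindability stays unrequired, matching the ErtNone default above.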
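Required-property (CReqdProp) objects are also instantiated in one more place, PrpCreate: CPhysical::PrpCreate(CMemoryPool *mp) { return GPOS_NEW(mp) CReqdPropRelational(); }
Its purpose is to "Create base container of required properties". It is mainly used by CExpressionHandle's ComputeChildReqdProps and ComputeChildReqdCols functions to create, for each non-scalar child, the instance that stores that child's required properties, namely CReqdPropRelational. CReqdPropRelational and CReqdPropPlan are both subclasses of CReqdProp, as later sections will show.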
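The factory pattern described here can be illustrated with a small standalone sketch. All types below (ReqdProp, ReqdPropRelational, ReqdPropPlan, Operator, RelationalOp, Child, CreateChildContainers) are hypothetical miniatures written for this post, not the GPORCA classes, and the sketch assumes one empty container is created per non-scalar child.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical miniature of the required-property hierarchy.
struct ReqdProp {
    virtual ~ReqdProp() = default;
    virtual bool FPlan() const = 0;  // true for plan-level requirements
};

struct ReqdPropRelational : ReqdProp {
    bool FPlan() const override { return false; }
};

struct ReqdPropPlan : ReqdProp {
    bool FPlan() const override { return true; }
};

// An operator exposes a factory that returns an empty container of the
// required-property type appropriate for it.
struct Operator {
    virtual ~Operator() = default;
    virtual std::unique_ptr<ReqdProp> PrpCreate() const = 0;
};

// Hypothetical operator whose containers are relational requirements.
struct RelationalOp : Operator {
    std::unique_ptr<ReqdProp> PrpCreate() const override {
        return std::make_unique<ReqdPropRelational>();
    }
};

struct Child {
    bool fScalar;  // scalar children get no required-property container
};

// Creates one empty container per non-scalar child, in child order.
std::vector<std::unique_ptr<ReqdProp>> CreateChildContainers(
    const Operator &op, const std::vector<Child> &children) {
    std::vector<std::unique_ptr<ReqdProp>> containers;
    for (const Child &child : children) {
        if (!child.fScalar) {
            containers.push_back(op.PrpCreate());
        }
    }
    return containers;
}
```

Because callers only see the ReqdProp base class, the code that walks the children does not need to know which concrete subclass the operator chose to create.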
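Because the CQueryContext instance is assigned to CEngine's m_pqc member during CEngine::Init, the requirements that CQueryContext places on the optimizer are managed by CEngine from then on. Tracing the uses of m_pqc shows that the functions mainly called on it are m_pqc->Prpp() and m_pqc->FDeriveStats(); only m_pqc->Prpp() is examined here. Its most important use is in CEngine::Optimize, which hands the CReqdPropPlan *m_prpp member over to the COptimizationContext class, as the code shows. This confirms the stated purpose of that class: "Optimization context object stores properties required to hold on the plan generated by the optimizer". CReqdPropPlan *m_prpp; // required plan properties
is only one member of COptimizationContext; there are other properties the optimizer has to keep track of as well. The remaining callers of m_pqc->Prpp() all belong to the extraction of the optimized execution plan, and will be explained after the analysis of the paper Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer.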
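The hand-off from the engine to the optimization context can be sketched as follows. This is a minimal illustration under stated assumptions: ReqdPlanProps, QueryContext, OptimizationContext, Engine and MakeRootContext are hypothetical simplifications, and std::shared_ptr stands in for ORCA's manual AddRef/Release reference counting.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, flattened version of the required plan properties.
struct ReqdPlanProps {
    std::string order;
    std::string distribution;
};

// Carries the requirements placed on the optimizer for one query.
class QueryContext {
public:
    QueryContext(std::shared_ptr<const ReqdPlanProps> prpp, bool fDeriveStats)
        : m_prpp(std::move(prpp)), m_fDeriveStats(fDeriveStats) {}
    std::shared_ptr<const ReqdPlanProps> Prpp() const { return m_prpp; }
    bool FDeriveStats() const { return m_fDeriveStats; }

private:
    std::shared_ptr<const ReqdPlanProps> m_prpp;
    bool m_fDeriveStats;
};

// Stores the properties that the generated plan must satisfy.
class OptimizationContext {
public:
    explicit OptimizationContext(std::shared_ptr<const ReqdPlanProps> prpp)
        : m_prpp(std::move(prpp)) {}
    const ReqdPlanProps &Prpp() const { return *m_prpp; }

private:
    std::shared_ptr<const ReqdPlanProps> m_prpp;  // required plan properties
};

class Engine {
public:
    // Mirrors the idea of CEngine::Init: remember the query context.
    void Init(std::shared_ptr<const QueryContext> pqc) { m_pqc = std::move(pqc); }

    // Mirrors the idea of the hand-off in Optimize: the root optimization
    // context shares the required plan properties held by the query context.
    OptimizationContext MakeRootContext() const {
        assert(m_pqc != nullptr && "Init must be called first");
        return OptimizationContext(m_pqc->Prpp());
    }

private:
    std::shared_ptr<const QueryContext> m_pqc;
};
```

The root optimization context and the query context end up pointing at the same requirements object, so nothing is copied and the requirements stay alive for as long as either owner needs them.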