2022-11-21 MySQL Column-Store Engine: Architectural and Implementation Defects, Part 2


Abstract:

A collection of the poor implementations in the existing code. Past mistakes are lessons for the future: this post nails them to the pillar of shame so they do not happen again.

The bad designs:

I. DGMaterializedIterator::GetNextPackrow

Implementation:

int DimensionGroupMaterialized::DGMaterializedIterator::GetNextPackrow(int dim, int ahead) {
  MEASURE_FET("DGMaterializedIterator::GetNextPackrow(int dim, int ahead)");
  if (ahead == 0) return GetCurPackrow(dim);
  IndexTable *cur_t = t[dim];
  if (cur_t == NULL) return -1;
  uint64_t end_block = cur_t->EndOfCurrentBlock(cur_pos);
  if (next_pack[dim] >= no_obj || uint64_t(next_pack[dim]) >= end_block) return -1;
  uint64_t ahead_pos = 0;
  //	cout << "dim " << dim << ",  " << next_pack[dim] << " -> " <<
  // ahead1[dim] << "  " <<
  // ahead2[dim] << "  " << ahead3[dim] << "    (" << ahead << ")" << endl;
  if (ahead == 1)
    ahead_pos = t[dim]->Get64InsideBlock(next_pack[dim]);
  else if (ahead == 2 && ahead1[dim] != -1)
    ahead_pos = t[dim]->Get64InsideBlock(ahead1[dim]);
  else if (ahead == 3 && ahead2[dim] != -1)
    ahead_pos = t[dim]->Get64InsideBlock(ahead2[dim]);
  else if (ahead == 4 && ahead3[dim] != -1)
    ahead_pos = t[dim]->Get64InsideBlock(ahead3[dim]);
  if (ahead_pos == 0) return -1;
  return int((ahead_pos - 1) >> p_power);

  return -1;
}
    int64_t *next_pack;  // beginning of the next pack (or no_obj if there is no
                         // other pack)
    int64_t *ahead1, *ahead2,
        *ahead3;  // beginning of the three next packs after next_pack, or
                  // -1 if not determined properly

Problems:

  1. The pack-prefetch implementation details are exposed directly: any caller of this interface has to know how DGMaterializedIterator handles prefetching internally.
  2. DGMaterializedIterator's implementation is coupled to its callers' logic; a caller must understand the class internals before it can use the class at all.
  3. There is no higher-level abstraction, only iterator-level encapsulation. No thought went into what the class is for, so business logic is scattered across the implementation details of many classes. That is a disaster for maintenance and extension, and it makes problems hard to localize quickly.
  4. Before architecting a large feature module, the most basic step is to write each function well and design each class well; only when the details are right can you talk about the big picture. Nobody whose functions are riddled with logic holes and whose classes are a mess can produce a good feature design, and this class is spectacularly badly designed.

Suggestions:

  1. First decide how the functionality is partitioned and what each class is responsible for, then decouple the business logic at exactly the right level of abstraction.
  2. Design the classes and their interactions according to object-oriented principles.
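One way the suggestion could look in code. This is a minimal sketch with hypothetical names (PackrowCursor, Peek), not the engine's actual API: the lookahead bookkeeping stays private, so callers ask "the packrow `ahead` positions from here" and never see ahead1/ahead2/ahead3.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: the lookahead state is a private detail of the cursor.
class PackrowCursor {
 public:
  explicit PackrowCursor(std::vector<int> packs) : packs_(std::move(packs)) {}

  // Returns the packrow `ahead` positions from the current one, or -1 if it
  // cannot be determined -- the same convention GetNextPackrow uses.
  int Peek(std::size_t ahead) const {
    std::size_t idx = pos_ + ahead;
    return idx < packs_.size() ? packs_[idx] : -1;
  }
  int Current() const { return Peek(0); }
  bool Valid() const { return pos_ < packs_.size(); }
  void Next() { ++pos_; }

 private:
  std::vector<int> packs_;  // materialized pack numbers for one dimension
  std::size_t pos_ = 0;     // lookahead bookkeeping lives here, not in callers
};
```

With this shape, a caller that wants to prefetch simply calls `Peek(1)`, `Peek(2)`, ..., and the cursor is free to change how far it actually reads ahead without touching any call site.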

II. AggregatePackrow uses magic numbers as error codes

Call site and implementation:

          int grouping_result = AggregatePackrow(gbw, &mit, cur_tuple);
          if (sender) {
            sender->SetAffectRows(gbw.NumOfGroups());
          }
          if (grouping_result == 2) throw common::KilledException();
          if (grouping_result != 5) packrows_found++;  // for statistics
          if (grouping_result == 1) break;             // end of the aggregation

int AggregationAlgorithm::AggregatePackrow(GroupByWrapper &gbw, MIIterator *mit, int64_t cur_tuple) {
  int64_t packrow_length = mit->GetPackSizeLeft();
  if (!gbw.AnyTuplesLeft(cur_tuple, cur_tuple + packrow_length - 1)) {
    mit->NextPackrow();
    return 5;
  }

Problems:

  1. A magic number gives the caller no direct sense of what the error means.
  2. Bare numbers cannot be searched for meaningfully, so when diagnosing a failure you cannot jump from the error code to the code that produced it.

Suggestions:

  1. Define the error codes as an enumeration and return the enum values.
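A sketch of that suggestion. PackrowResult and its member names are assumed here, not taken from the source; the numeric values mirror the call sites shown above (1 = end of aggregation, 2 = killed, 5 = packrow omitted).

```cpp
#include <cassert>

// Assumed names; values inferred from the call sites quoted above.
enum class PackrowResult : int {
  kOk = 0,              // packrow aggregated normally
  kFinished = 1,        // end of the aggregation
  kKilled = 2,          // query killed -> caller throws KilledException
  kOverflow = 3,        // aggregation overflow (the code also uses 4)
  kPackrowOmitted = 5,  // whole packrow skipped; not counted in statistics
};

// The call site then reads as self-documenting, greppable code:
//   PackrowResult r = AggregatePackrow(gbw, &mit, cur_tuple);
//   if (r == PackrowResult::kKilled) throw common::KilledException();
//   if (r != PackrowResult::kPackrowOmitted) packrows_found++;
//   if (r == PackrowResult::kFinished) break;
```

Because `enum class` does not implicitly convert to `int`, any remaining comparison against a raw number fails to compile, which flushes out every leftover magic-number check.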

III. Self-inconsistent logic: the code crashes on its own invariant

Implementation:

  std::unique_ptr<GroupByWrapper> gbw_ptr(new GroupByWrapper(*gb_sharding));
  gbw_ptr->FillDimsUsed(dims);
  gbw_ptr->SetDistinctTuples(mit->NumOfTuples());
  if (!gbw_ptr->IsOnePass()) gbw_ptr->InitTupleLeft(mit->NumOfTuples());

GroupByWrapper::GroupByWrapper(const GroupByWrapper &sec)
    : distinct_watch(sec.p_power), m_conn(sec.m_conn), gt(sec.gt) {
  p_power = sec.p_power;
  attrs_size = sec.attrs_size;
  just_distinct = sec.just_distinct;
  virt_col = new vcolumn::VirtualColumn *[attrs_size];
  input_mode = new GBInputMode[attrs_size];
  is_lookup = new bool[attrs_size];
  attr_mapping = new int[attrs_size];  // output attr[j] <-> gt group[attr_mapping[j]]
  dist_vals = new int64_t[attrs_size];

  for (int i = 0; i < attrs_size; i++) {
    attr_mapping[i] = sec.attr_mapping[i];
    virt_col[i] = sec.virt_col[i];
    input_mode[i] = sec.input_mode[i];
    is_lookup[i] = sec.is_lookup[i];
    dist_vals[i] = sec.dist_vals[i];
  }
  no_grouping_attr = sec.no_grouping_attr;
  no_aggregated_attr = sec.no_aggregated_attr;
  no_more_groups = sec.no_more_groups;
  no_groups = sec.no_groups;
  no_attr = sec.no_attr;
  pack_not_omitted = new bool[no_attr];
  packrows_omitted = 0;
  packrows_part_omitted = 0;
  for (int i = 0; i < no_attr; i++) pack_not_omitted[i] = sec.pack_not_omitted[i];

  tuple_left = NULL;
  if (sec.tuple_left) tuple_left = new Filter(*sec.tuple_left);  // a copy of filter
  // init distinct_watch to make copy ctor has all Initialization logic
  distinct_watch.Initialize(no_attr);
  for (int gr_a = 0; gr_a < no_attr; gr_a++) {
    if (gt.AttrDistinct(gr_a)) {
      distinct_watch.DeclareAsDistinct(gr_a);
    }
  }
}

  if (sec.tuple_left) tuple_left = new Filter(*sec.tuple_left);  // a copy of filter

void GroupByWrapper::InitTupleLeft(int64_t n) {
  DEBUG_ASSERT(tuple_left == NULL);
  tuple_left = new Filter(n, p_power);
  tuple_left->Set();
}
  DEBUG_ASSERT(tuple_left == NULL);

Problems:

  1. In GroupByWrapper's copy constructor, tuple_left may already be assigned (a copy of sec.tuple_left).
  2. InitTupleLeft is called right after the GroupByWrapper is created, and it DEBUG_ASSERTs that tuple_left is NULL -- so the server goes down whenever the copy constructor has already set it.
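One possible shape of a fix, shown with simplified stand-ins (this Filter and GroupByWrapper are toy versions, not the engine's classes): InitTupleLeft resets an already-copied filter instead of asserting it away, so the two code paths no longer contradict each other.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the engine's Filter.
struct Filter {
  int64_t n;
  explicit Filter(int64_t n_) : n(n_) {}
};

struct GroupByWrapper {
  Filter *tuple_left = nullptr;

  GroupByWrapper() = default;
  GroupByWrapper(const GroupByWrapper &sec) {
    // same behavior as the real copy ctor: clone the filter if present
    if (sec.tuple_left) tuple_left = new Filter(*sec.tuple_left);
  }
  ~GroupByWrapper() { delete tuple_left; }

  void InitTupleLeft(int64_t n) {
    delete tuple_left;  // drop a filter inherited from the copy; no-op on null
    tuple_left = new Filter(n);
  }
};
```

The alternative fix is to keep the assertion and forbid the copy constructor from cloning tuple_left; either way, the invariant has to be stated once and enforced consistently, not asserted in one place and violated in another.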

IV. Redundant logic inside a function, executed more times than necessary

Implementation:

Problems:

  1. Once the else branch is entered, IsType_JoinSimple ends up being evaluated twice.
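The original function body was not captured in the post, so the following is only a generic illustration of the fix: evaluate the repeated predicate once and reuse the result. The IsType_JoinSimple below is a stand-in counter, not the engine's function.

```cpp
#include <cassert>

static int g_calls = 0;  // counts predicate evaluations, for demonstration

bool IsType_JoinSimple() {  // stand-in for the real predicate
  ++g_calls;
  return true;
}

int Dispatch(bool fast_path) {
  const bool join_simple = IsType_JoinSimple();  // evaluated exactly once
  if (fast_path && join_simple) return 1;
  if (join_simple) return 2;  // previously re-invoked IsType_JoinSimple() here
  return 0;
}
```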

V. Blind refactoring without understanding the original design: forcing multi-threading onto a function whose logic only works serially

AggregationAlgorithm::MultiDimensionalGroupByScan

Original code:

void AggregationAlgorithm::MultiDimensionalGroupByScan(GroupByWrapper &gbw, _int64& limit, _int64& offset, map<int,vector<PackOrderer::OrderingInfo> > &oi, ResultSender* sender, bool limit_less_than_no_groups)
{
	MEASURE_FET("TempTable::MultiDimensionalGroupByScan(...)");
	bool first_pass = true;
	_int64 cur_tuple = 0;				// tuples are numbered according to tuple_left filter (not used, if tuple_left is null)
	_int64 displayed_no_groups = 0;

	// Determine dimensions to be iterated
	bool no_dims_found = true;
	DimensionVector dims(mind->NoDimensions());
	gbw.FillDimsUsed(dims);
	for(int i = 0; i < mind->NoDimensions(); i++)
		if(dims[i]) {
			no_dims_found = false;
			break;
		}
	if(no_dims_found)
		dims[0] = true;							// at least one dimension is needed

	vector<PackOrderer> po(mind->NoDimensions());
	// do not use pack orderer if there are too many expected groups
	// (more than 50% of tuples)
	if(gbw.UpperApproxOfGroups() < mind->NoTuples() / 2) {
		map<int,vector<PackOrderer::OrderingInfo> >::iterator oi_it;
		bool one_group = (gbw.UpperApproxOfGroups() == 1);
		for(oi_it = oi.begin(); oi_it!= oi.end(); oi_it++)
			PackOrderer::ChoosePackOrderer(po[(*oi_it).first],(*oi_it).second, one_group);
	}
	MIIterator mit(mind, dims, po);

	factor = mit.Factor();
	if(mit.NoTuples() == NULL_VALUE_64 || mit.NoTuples() > MAX_ROW_NUMBER) {	// 2^47, a limit for filter below
		throw OutOfMemoryRCException("Aggregation is too large.");
	}
	gbw.SetDistinctTuples(mit.NoTuples());

#ifndef __BH_COMMUNITY__
	AggregationWorkerEnt ag_worker(gbw, this);
	if(gbw.MayBeParallel() && ag_worker.MayBeParallel(mit) 
		&& !limit_less_than_no_groups)	// if we are going to skip groups, we cannot do it in parallel
		ag_worker.CheckThreads(mit);	// CheckThreads() must be executed if we want to be parallel
#else
	AggregationWorker ag_worker(gbw, this);
#endif

	if(!gbw.IsOnePass())
		gbw.InitTupleLeft(mit.NoTuples());
	bool rewind_needed = false;
	bool was_prefetched = false;
	try {
		do {
			if(rccontrol.isOn())  {
				if(gbw.UpperApproxOfGroups() == 1 || first_pass)
					rccontrol.lock(m_conn->GetThreadID()) << "Aggregating: " << mit.NoTuples() << " tuples left." << unlock;
				else
					rccontrol.lock(m_conn->GetThreadID()) << "Aggregating: " << gbw.TuplesNoOnes() << " tuples left, " << displayed_no_groups << " gr. found so far" << unlock;
			}
			cur_tuple = 0;
			gbw.ClearNoGroups();			// count groups locally created in this pass
			gbw.ClearDistinctBuffers();		// reset buffers for a new contents
			gbw.AddAllGroupingConstants(mit);
			ag_worker.Init(mit);
			if(rewind_needed)
				mit.Rewind();	// aggregated rows will be massively omitted packrow by packrow
			rewind_needed = true;
			was_prefetched = false;
			for(uint i = 0; i < t->NoAttrs(); i++) {		// left as uninitialized (NULL or 0)
				if(t->GetAttrP(i)->mode == DELAYED) {
					MIDummyIterator m(1);
					t->GetAttrP(i)->term.vc->LockSourcePacks(m);
				}
			}

			while(mit.IsValid()) { // First stage - some distincts may be delayed
				if(m_conn->killed())
					throw KilledRCException();

				/// Grouping on a packrow 
				_int64 packrow_length = mit.GetPackSizeLeft();
				if(ag_worker.ThreadsUsed() == 1) {
					if(was_prefetched == false) {
						for(int i = 0; i < gbw.NoAttr(); i++)
							if(gbw.GetColumn(i))
								gbw.GetColumn(i)->InitPrefetching(mit);
						was_prefetched = true;
					}

					int grouping_result = AggregatePackrow(gbw, &mit, cur_tuple);
					if(grouping_result == 2)
						throw KilledRCException();
					if(grouping_result != 5)
						packrows_found++;				// for statistics
					if(grouping_result == 1)
						break;							// end of the aggregation
					if(!gbw.IsFull() && gbw.MemoryBlocksLeft() == 0) {
						gbw.SetAsFull();
					}
				} else {
					if(was_prefetched) {
						for(int i = 0; i < gbw.NoAttr(); i++)
							if(gbw.GetColumn(i))
								gbw.GetColumn(i)->StopPrefetching();
						was_prefetched = false;
					}
					MIInpackIterator lmit(mit);
					int grouping_result = ag_worker.AggregatePackrow(lmit, cur_tuple);
					if(grouping_result != 5)
						packrows_found++;				// for statistics
					if(grouping_result == 1)
						break;
					if(grouping_result == 2)
						throw KilledRCException();
					if(grouping_result == 3 || grouping_result == 4)
						throw NotImplementedRCException("Aggregation overflow.");
					if(mit.BarrierAfterPackrow()) {
						ag_worker.Barrier();
					}
					ag_worker.ReevaluateNumberOfThreads(mit);
					mit.NextPackrow();
				}
				cur_tuple += packrow_length;
			}
			MultiDimensionalDistinctScan(gbw, mit);		// if not needed, no effect
			ag_worker.Commit();

			
			// Now it is time to prepare output values
			if(first_pass) {
				first_pass = false;
				_int64 upper_groups = gbw.NoGroups() + gbw.TuplesNoOnes();		// upper approximation: the current size + all other possible rows (if any)
				t->CalculatePageSize(upper_groups);
				if(upper_groups > gbw.UpperApproxOfGroups())
					upper_groups = gbw.UpperApproxOfGroups();					// another upper limitation: not more than theoretical number of combinations

				MIDummyIterator m(1);
				for(uint i = 0; i < t->NoAttrs(); i++) {
					t->GetAttrP(i)->CreateBuffer(upper_groups);					// note: may be more than needed
					if(t->GetAttrP(i)->mode == DELAYED)
						t->GetAttrP(i)->term.vc->LockSourcePacks(m);
				}
			}
			rccontrol.lock(m_conn->GetThreadID()) << "Generating output." << unlock;
			gbw.RewindRows();
			while(gbw.RowValid()) {		
				// copy GroupTable into TempTable, row by row
				if(t->NoObj() >= limit)
					break;
				AggregateFillOutput(gbw, gbw.GetCurrentRow(), offset);		// offset is decremented for each row, if positive
				if(sender && t->NoObj() > 65535) {
					TempTable::RecordIterator iter = t->begin();	
					for(_int64 i = 0; i < t->NoObj(); i++) {
						sender->Send(iter);
						++iter;
					}
					displayed_no_groups += t->NoObj();
					limit -= t->NoObj();
					t->SetNoObj(0);
				}
				gbw.NextRow();
			}
			if(sender) {
				TempTable::RecordIterator iter = t->begin();	
				for(_int64 i = 0; i < t->NoObj(); i++) {
					sender->Send(iter);
					++iter;
				}
				displayed_no_groups += t->NoObj();
				limit -= t->NoObj();
				t->SetNoObj(0);
			} else 
				displayed_no_groups = t->NoObj();
			if(t->NoObj() >= limit)
				break;
			if(gbw.AnyTuplesLeft())
				gbw.ClearUsed();				// prepare for the next pass, if needed
		} while(gbw.AnyTuplesLeft());			// do the next pass, if anything left
	} catch(...) {
		ag_worker.Commit(false);
		throw;
	}
	if(rccontrol.isOn())
		rccontrol.lock(m_conn->GetThreadID()) << "Aggregated (" << displayed_no_groups
									<< " gr). Omitted packrows: " << gbw.packrows_omitted << " + "
									<< gbw.packrows_part_omitted << " partially, out of " << packrows_found << " total." << unlock;
}

Core loop:

			while(mit.IsValid()) { // First stage - some distincts may be delayed
				if(m_conn->killed())
					throw KilledRCException();

				/// Grouping on a packrow 
				_int64 packrow_length = mit.GetPackSizeLeft();
				if(ag_worker.ThreadsUsed() == 1) {
					if(was_prefetched == false) {
						for(int i = 0; i < gbw.NoAttr(); i++)
							if(gbw.GetColumn(i))
								gbw.GetColumn(i)->InitPrefetching(mit);
						was_prefetched = true;
					}

					int grouping_result = AggregatePackrow(gbw, &mit, cur_tuple);
					if(grouping_result == 2)
						throw KilledRCException();
					if(grouping_result != 5)
						packrows_found++;				// for statistics
					if(grouping_result == 1)
						break;							// end of the aggregation
					if(!gbw.IsFull() && gbw.MemoryBlocksLeft() == 0) {
						gbw.SetAsFull();
					}
				} else {
					if(was_prefetched) {
						for(int i = 0; i < gbw.NoAttr(); i++)
							if(gbw.GetColumn(i))
								gbw.GetColumn(i)->StopPrefetching();
						was_prefetched = false;
					}
					MIInpackIterator lmit(mit);
					int grouping_result = ag_worker.AggregatePackrow(lmit, cur_tuple);
					if(grouping_result != 5)
						packrows_found++;				// for statistics
					if(grouping_result == 1)
						break;
					if(grouping_result == 2)
						throw KilledRCException();
					if(grouping_result == 3 || grouping_result == 4)
						throw NotImplementedRCException("Aggregation overflow.");
					if(mit.BarrierAfterPackrow()) {
						ag_worker.Barrier();
					}
					ag_worker.ReevaluateNumberOfThreads(mit);
					mit.NextPackrow();
				}
				cur_tuple += packrow_length;
			}

The mangled rewrite:

void AggregationAlgorithm::MultiDimensionalGroupByScan(GroupByWrapper &gbw, int64_t &limit, int64_t &offset,
                                                       ResultSender *sender, bool limit_less_than_no_groups) {
  MEASURE_FET("TempTable::MultiDimensionalGroupByScan(...)");
  bool first_pass = true;
  // tuples are numbered according to tuple_left filter (not used, if tuple_left
  // is null)
  int64_t cur_tuple = 0;
  int64_t displayed_no_groups = 0;

  // Determine dimensions to be iterated
  bool no_dims_found = true;
  DimensionVector dims(mind->NumOfDimensions());
  gbw.FillDimsUsed(dims);
  for (int i = 0; i < mind->NumOfDimensions(); i++)
    if (dims[i]) {
      no_dims_found = false;
      break;
    }
  if (no_dims_found) dims[0] = true;  // at least one dimension is needed

  std::vector<PackOrderer> po(mind->NumOfDimensions());
  MIIterator mit(mind, dims, po);

  factor = mit.Factor();
  if (mit.NumOfTuples() == common::NULL_VALUE_64 ||
      mit.NumOfTuples() > common::MAX_ROW_NUMBER) {  // 2^47, a limit for filter below
    throw common::OutOfMemoryException("Aggregation is too large.");
  }
  gbw.SetDistinctTuples(mit.NumOfTuples());

  int thd_cnt = 1;
  if (ParallelAllowed(gbw) && !limit_less_than_no_groups) {
    thd_cnt = std::thread::hardware_concurrency() / 4;  // For concurrence reason, don't swallow all cores once.
  }

  AggregationWorkerEnt ag_worker(gbw, mind, thd_cnt, this);

  if (!gbw.IsOnePass()) gbw.InitTupleLeft(mit.NumOfTuples());
  bool rewind_needed = false;
  try {
    do {
      if (rccontrol.isOn()) {
        if (gbw.UpperApproxOfGroups() == 1 || first_pass)
          rccontrol.lock(m_conn->GetThreadID())
              << "Aggregating: " << mit.NumOfTuples() << " tuples left." << system::unlock;
        else
          rccontrol.lock(m_conn->GetThreadID()) << "Aggregating: " << gbw.TuplesNoOnes() << " tuples left, "
                                                << displayed_no_groups << " gr. found so far" << system::unlock;
      }
      cur_tuple = 0;
      gbw.ClearNoGroups();         // count groups locally created in this pass
      gbw.ClearDistinctBuffers();  // reset buffers for a new contents
      gbw.AddAllGroupingConstants(mit);
      ag_worker.Init(mit);
      if (rewind_needed)
        mit.Rewind();  // aggregated rows will be massively omitted packrow by
                       // packrow
      rewind_needed = true;
      for (uint i = 0; i < t->NumOfAttrs(); i++) {  // left as uninitialized (NULL or 0)
        if (t->GetAttrP(i)->mode == common::ColOperation::DELAYED) {
          MIDummyIterator m(1);
          t->GetAttrP(i)->term.vc->LockSourcePacks(m);
        }
      }
      if (ag_worker.ThreadsUsed() > 1) {
        ag_worker.DistributeAggreTaskAverage(mit);
      } else {
        while (mit.IsValid()) {  // need muti thread
                                 // First stage -
                                 //  some distincts may be delayed
          if (m_conn->Killed()) throw common::KilledException();

          // Grouping on a packrow
          int64_t packrow_length = mit.GetPackSizeLeft();
          int grouping_result = AggregatePackrow(gbw, &mit, cur_tuple);
          if (sender) {
            sender->SetAffectRows(gbw.NumOfGroups());
          }
          if (grouping_result == 2) throw common::KilledException();
          if (grouping_result != 5) packrows_found++;  // for statistics
          if (grouping_result == 1) break;             // end of the aggregation
          if (!gbw.IsFull() && gbw.MemoryBlocksLeft() == 0) {
            gbw.SetAsFull();
          }
          cur_tuple += packrow_length;
        }
      }
      gbw.ClearDistinctBuffers();              // reset buffers for a new contents
      MultiDimensionalDistinctScan(gbw, mit);  // if not needed, no effect
      ag_worker.Commit();

      // Now it is time to prepare output values
      if (first_pass) {
        first_pass = false;
        int64_t upper_groups = gbw.NumOfGroups() + gbw.TuplesNoOnes();  // upper approximation: the current size +
                                                                     // all other possible rows (if any)
        t->CalculatePageSize(upper_groups);
        if (upper_groups > gbw.UpperApproxOfGroups())
          upper_groups = gbw.UpperApproxOfGroups();  // another upper limitation: not more
                                                     // than theoretical number of
                                                     // combinations

        MIDummyIterator m(1);
        for (uint i = 0; i < t->NumOfAttrs(); i++) {
          if (t->GetAttrP(i)->mode == common::ColOperation::GROUP_CONCAT) {
            t->GetAttrP(i)->SetTypeName(common::CT::VARCHAR);
            t->GetAttrP(i)->OverrideStringSize(tianmu_group_concat_max_len);
          }
          t->GetAttrP(i)->CreateBuffer(upper_groups);  // note: may be more than needed
          if (t->GetAttrP(i)->mode == common::ColOperation::DELAYED) t->GetAttrP(i)->term.vc->LockSourcePacks(m);
        }
      }
      rccontrol.lock(m_conn->GetThreadID()) << "Group/Aggregate end. Begin generating output." << system::unlock;
      rccontrol.lock(m_conn->GetThreadID()) << "Output rows: " << gbw.NumOfGroups() + gbw.TuplesNoOnes()
                                            << ", output table row limit: " << t->GetPageSize() << system::unlock;
      int64_t output_size = (gbw.NumOfGroups() + gbw.TuplesNoOnes()) * t->GetOneOutputRecordSize();
      gbw.RewindRows();
      if (t->GetPageSize() >= (gbw.NumOfGroups() + gbw.TuplesNoOnes()) && output_size > (1L << 29) &&
          !t->HasHavingConditions() && tianmu_sysvar_parallel_filloutput) {
        // Turn on parallel output when:
        // 1. output page is large enough to hold all output rows
        // 2. output result is larger than 512MB
        // 3. no have condition
        rccontrol.lock(m_conn->GetThreadID()) << "Start parallel output" << system::unlock;
        ParallelFillOutputWrapper(gbw, offset, limit, mit);
      } else {
        while (gbw.RowValid()) {
          // copy GroupTable into TempTable, row by row
          if (t->NumOfObj() >= limit) break;
          AggregateFillOutput(gbw, gbw.GetCurrentRow(),
                              offset);  // offset is decremented for each row, if positive
          if (sender && t->NumOfObj() > (1 << mind->ValueOfPower()) - 1) {
            TempTable::RecordIterator iter = t->begin();
            for (int64_t i = 0; i < t->NumOfObj(); i++) {
              sender->Send(iter);
              ++iter;
            }
            displayed_no_groups += t->NumOfObj();
            limit -= t->NumOfObj();
            t->SetNumOfObj(0);
          }
          gbw.NextRow();
        }
      }
      if (sender) {
        TempTable::RecordIterator iter = t->begin();
        for (int64_t i = 0; i < t->NumOfObj(); i++) {
          sender->Send(iter);
          ++iter;
        }
        displayed_no_groups += t->NumOfObj();
        limit -= t->NumOfObj();
        t->SetNumOfObj(0);
      } else
        displayed_no_groups = t->NumOfObj();
      if (t->NumOfObj() >= limit) break;
      if (gbw.AnyTuplesLeft()) gbw.ClearUsed();  // prepare for the next pass, if needed
    } while (gbw.AnyTuplesLeft());               // do the next pass, if anything left
  } catch (...) {
    ag_worker.Commit(false);
    throw;
  }
  if (rccontrol.isOn())
    rccontrol.lock(m_conn->GetThreadID())
        << "Generating output end. "
        << "Aggregated (" << displayed_no_groups << " group). Omitted packrows: " << gbw.packrows_omitted << " + "
        << gbw.packrows_part_omitted << " partially, out of " << packrows_found << " total." << system::unlock;
}

Core loop:

      if (ag_worker.ThreadsUsed() > 1) {
        ag_worker.DistributeAggreTaskAverage(mit);
      } else {
        while (mit.IsValid()) {  // need muti thread
                                 // First stage -
                                 //  some distincts may be delayed
          if (m_conn->Killed()) throw common::KilledException();

          // Grouping on a packrow
          int64_t packrow_length = mit.GetPackSizeLeft();
          int grouping_result = AggregatePackrow(gbw, &mit, cur_tuple);
          if (sender) {
            sender->SetAffectRows(gbw.NumOfGroups());
          }
          if (grouping_result == 2) throw common::KilledException();
          if (grouping_result != 5) packrows_found++;  // for statistics
          if (grouping_result == 1) break;             // end of the aggregation
          if (!gbw.IsFull() && gbw.MemoryBlocksLeft() == 0) {
            gbw.SetAsFull();
          }
          cur_tuple += packrow_length;
        }
      }

In the single-threaded path, the loop over the iterator assigns values to the blocks of GroupByWrapper's filter, for use in the next pass:

(gdb) bt
#0  Tianmu::core::AggregationAlgorithm::AggregatePackrow (this=0x7ff52da4b580, gbw=..., mit=0x7ff52da4aee0, cur_tuple=41953)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/aggregation_algorithm.cpp:580
#1  0x0000000003005b74 in Tianmu::core::AggregationAlgorithm::MultiDimensionalGroupByScan (this=0x7ff52da4b580, gbw=..., limit=@0x7ff52da4b208: 7422784, offset=@0x7ff52da4b608: 0, sender=0x0, 
    limit_less_than_no_groups=false) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/aggregation_algorithm.cpp:280
#2  0x00000000030053ca in Tianmu::core::AggregationAlgorithm::Aggregate (this=0x7ff52da4b580, just_distinct=false, limit=@0x7ff52da4b600: -1, offset=@0x7ff52da4b608: 0, sender=0x0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/aggregation_algorithm.cpp:196
#3  0x0000000002df1e3e in Tianmu::core::TempTable::Materialize (this=0x7fd1f8001e10, in_subq=false, sender=0x7fd1f8931ed0, lazy=false)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/temp_table.cpp:1972
#4  0x0000000002d3a414 in Tianmu::core::Engine::Execute (this=0x7cef220, thd=0x7fd1f80125f0, lex=0x7fd1f8014918, result_output=0x7fd1f8ab1c20, unit_for_union=0x0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/engine_execute.cpp:426
#5  0x0000000002d395b6 in Tianmu::core::Engine::HandleSelect (this=0x7cef220, thd=0x7fd1f80125f0, lex=0x7fd1f8014918, result=@0x7ff52da4bd18: 0x7fd1f8ab1c20, setup_tables_done_option=0, 
    res=@0x7ff52da4bd14: 0, optimize_after_tianmu=@0x7ff52da4bd0c: 1, tianmu_free_join=@0x7ff52da4bd10: 1, with_insert=0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/engine_execute.cpp:232
#6  0x0000000002e21e47 in Tianmu::dbhandler::TIANMU_HandleSelect (thd=0x7fd1f80125f0, lex=0x7fd1f8014918, result=@0x7ff52da4bd18: 0x7fd1f8ab1c20, setup_tables_done_option=0, res=@0x7ff52da4bd14: 0, 
    optimize_after_tianmu=@0x7ff52da4bd0c: 1, tianmu_free_join=@0x7ff52da4bd10: 1, with_insert=0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/handler/ha_rcengine.cpp:82
#7  0x0000000002462f6a in execute_sqlcom_select (thd=0x7fd1f80125f0, all_tables=0x7fd1f8b3ec88) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:5182
#8  0x000000000245c2ee in mysql_execute_command (thd=0x7fd1f80125f0, first_level=true) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:2831
#9  0x0000000002463f33 in mysql_parse (thd=0x7fd1f80125f0, parser_state=0x7ff52da4ceb0) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:5621
#10 0x00000000024591cb in dispatch_command (thd=0x7fd1f80125f0, com_data=0x7ff52da4d650, command=COM_QUERY) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:1495
#11 0x00000000024580f7 in do_command (thd=0x7fd1f80125f0) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:1034
#12 0x000000000258accd in handle_connection (arg=0x9c1ef60) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/conn_handler/connection_handler_per_thread.cc:313
#13 0x0000000002c71102 in pfs_spawn_thread (arg=0x8140030) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/perfschema/pfs.cc:2197
#14 0x00007ff57ca91ea5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ff57acc6b0d in clone () from /lib64/libc.so.6

(gdb) bt
#0  Tianmu::core::Filter::Set (this=0x7fd1f8b8f990, b=0, n=41953) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/filter.cpp:243
#1  0x0000000002d5c465 in Tianmu::core::Filter::Set (this=0x7fd1f8b8f990, n=41953) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/filter.h:74
#2  0x0000000003074929 in Tianmu::core::DistinctWrapper::SetAsOmitted (this=0x7ff52da4b220, attr=3, obj=41953)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/groupby_wrapper.h:44
#3  0x0000000003073071 in Tianmu::core::GroupByWrapper::DistinctlyOmitted (this=0x7ff52da4b220, attr=3, obj=41953)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/groupby_wrapper.cpp:534
#4  0x0000000003007427 in Tianmu::core::AggregationAlgorithm::AggregatePackrow (this=0x7ff52da4b580, gbw=..., mit=0x7ff52da4aee0, cur_tuple=41953)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/aggregation_algorithm.cpp:580
#5  0x0000000003005b74 in Tianmu::core::AggregationAlgorithm::MultiDimensionalGroupByScan (this=0x7ff52da4b580, gbw=..., limit=@0x7ff52da4b208: 7422784, offset=@0x7ff52da4b608: 0, sender=0x0, 
    limit_less_than_no_groups=false) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/aggregation_algorithm.cpp:280
#6  0x00000000030053ca in Tianmu::core::AggregationAlgorithm::Aggregate (this=0x7ff52da4b580, just_distinct=false, limit=@0x7ff52da4b600: -1, offset=@0x7ff52da4b608: 0, sender=0x0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/aggregation_algorithm.cpp:196
#7  0x0000000002df1e3e in Tianmu::core::TempTable::Materialize (this=0x7fd1f8001e10, in_subq=false, sender=0x7fd1f8931ed0, lazy=false)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/temp_table.cpp:1972
#8  0x0000000002d3a414 in Tianmu::core::Engine::Execute (this=0x7cef220, thd=0x7fd1f80125f0, lex=0x7fd1f8014918, result_output=0x7fd1f8ab1c20, unit_for_union=0x0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/engine_execute.cpp:426
#9  0x0000000002d395b6 in Tianmu::core::Engine::HandleSelect (this=0x7cef220, thd=0x7fd1f80125f0, lex=0x7fd1f8014918, result=@0x7ff52da4bd18: 0x7fd1f8ab1c20, setup_tables_done_option=0, 
    res=@0x7ff52da4bd14: 0, optimize_after_tianmu=@0x7ff52da4bd0c: 1, tianmu_free_join=@0x7ff52da4bd10: 1, with_insert=0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/core/engine_execute.cpp:232
#10 0x0000000002e21e47 in Tianmu::dbhandler::TIANMU_HandleSelect (thd=0x7fd1f80125f0, lex=0x7fd1f8014918, result=@0x7ff52da4bd18: 0x7fd1f8ab1c20, setup_tables_done_option=0, res=@0x7ff52da4bd14: 0, 
    optimize_after_tianmu=@0x7ff52da4bd0c: 1, tianmu_free_join=@0x7ff52da4bd10: 1, with_insert=0)
    at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/tianmu/handler/ha_rcengine.cpp:82
#11 0x0000000002462f6a in execute_sqlcom_select (thd=0x7fd1f80125f0, all_tables=0x7fd1f8b3ec88) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:5182
#12 0x000000000245c2ee in mysql_execute_command (thd=0x7fd1f80125f0, first_level=true) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:2831
#13 0x0000000002463f33 in mysql_parse (thd=0x7fd1f80125f0, parser_state=0x7ff52da4ceb0) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:5621
#14 0x00000000024591cb in dispatch_command (thd=0x7fd1f80125f0, com_data=0x7ff52da4d650, command=COM_QUERY) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:1495
#15 0x00000000024580f7 in do_command (thd=0x7fd1f80125f0) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/sql_parse.cc:1034
#16 0x000000000258accd in handle_connection (arg=0x9c1ef60) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/sql/conn_handler/connection_handler_per_thread.cc:313
#17 0x0000000002c71102 in pfs_spawn_thread (arg=0x8140030) at /home/jenkins/workspace/stonedb5.7-zsl-centos7.9-30-119-20220805/storage/perfschema/pfs.cc:2197
#18 0x00007ff57ca91ea5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007ff57acc6b0d in clone () from /lib64/libc.so.6

Problems:

  1. AggregationAlgorithm::AggregatePackrow assigns to the blocks of GroupByWrapper's filter while iterating.
  2. The rewrite ignores how the existing code works: it forces multiple threads onto logic that was only ever meant to be iterated by a single thread, with no regard for how the underlying block data is implemented.

Suggestions:

  1. Until you understand the cause-and-effect in the original business logic, do not hack away at it based on guesswork.
  2. If you cannot implement it, leave it blank and say so; do not dump a pile of garbage that runs in the opposite direction just to look busy.
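If the single-threaded filter writes really must be parallelized, one general pattern (a sketch of the technique, not the engine's actual fix) is to give each worker a private bitmap for its slice and merge the bitmaps after all workers join, so no thread ever writes the shared Filter blocks concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Each worker marks tuples only in its own bitmap; the shared result is
// written serially after the join, which acts as the barrier.
std::vector<uint8_t> ParallelMark(std::size_t n_tuples, unsigned n_threads) {
  std::vector<std::vector<uint8_t>> local(n_threads,
                                          std::vector<uint8_t>(n_tuples, 0));
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < n_threads; ++t) {
    pool.emplace_back([&local, t, n_tuples, n_threads] {
      // disjoint stride: thread t handles tuples t, t+n_threads, ...
      for (std::size_t i = t; i < n_tuples; i += n_threads) local[t][i] = 1;
    });
  }
  for (auto &th : pool) th.join();  // all workers done before the merge

  std::vector<uint8_t> merged(n_tuples, 0);  // the shared "filter", one writer
  for (const auto &bm : local)
    for (std::size_t i = 0; i < n_tuples; ++i)
      merged[i] = static_cast<uint8_t>(merged[i] | bm[i]);
  return merged;
}
```

The point is the shape, not the bitmap: parallelism is only safe here after the shared-mutation point has been identified and isolated, which is exactly the analysis the rewrite skipped.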

