InfluxDB SHOW SERIES语句按照什么顺序返回？

news2026/2/12 0:18:56

在这里插入图片描述本作品采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可。

本作品 (李兆龙博文, 由李兆龙创作)，由李兆龙确认，转载请注明版权。

文章目录

引言
样例
SHOW SERIES比较原理
结论
结束语

引言

influxdb的计算引擎为了做到自底而上的合并逻辑，必然存在某种row的排序规则，这些规则在查询计划的创建阶段就已经确定。

比如group by tags，此类查询基于tag的排序；而group by time的查询，则基于time去做排序；group by tags, time 则是基于tags和time；至于show则是基于field排序；特殊的比如show series cardinality在改写为select后实际是基于tags排序；

事实上要完成一个流式引擎必须自底而上遵循某种排序规则；例如子查询内部的排序规则和内部排序规则不同是则需要重排，这就没法做到流式返回。

样例

show series不是基于serieskey或者基于展开后的tag做排序，简单的看两个show series 返回的case：

"yottadb_partition_replicas_num_lzl_ywq,account_id2=Mqw0vQ\\=\\="
"yottadb_partition_replicas_num_lzl_ywq,account_id4=Mqw0vQ\\=\\="
"yottadb_partition_replicas_num_lzl_ywq,account_id5=Mqw0vQ\\=\\="
"yottadb_partition_replicas_num_lzl_ywq,account_id7=Mqw0vQ\\=\\="
"yottadb_partition_replicas_num_lzl_ywq,account_id8=Mqw0vQ\\=\\="
"yottadb_partition_replicas_num_lzl_ywq,account_id9=Mqw0vQ\\=\\="

account_id2	account_id4	account_id5	account_id7	account_id8	account_id9
Mqw0vQ\=\=	null	null	null	null	null
null	Mqw0vQ\=\=	null	null	null	null
null	null	Mqw0vQ\=\=	null	null	null
null	null	null	Mqw0vQ\=\=	null	null
null	null	null	null	Mqw0vQ\=\=	null
null	null	null	null	null	Mqw0vQ\=\=

这个case表示不是以serieskey内部的tag本身做排序；因为以这个规则排序展开后account_id9应该排在最前面，因为null本身小于任何值。

"car1,city=city_0dddd,id=2,type=type_2"
"car1,city=city_0dddd\\ ,id=2,type=type_2"
"car1,city=city_1,id=3,type=type_0"

空格的ascall码小于 ‘,’ ，所以也不是以serieskey排序

SHOW SERIES比较原理

我们回到show series，引擎内部迭代器调用栈帧如下：

v1/coordinator/statement_executor.go:createIteratorsV2
influxql/query/select.go:Select
influxql/query/compile.go: (c *compiledStatement) Prepare 已经把shards赋值给preparedStatement.ic
influxql/query/select.go: (p *preparedStatement) Select
influxql/query/select.go:buildCursor
influxql/query/select.go:buildAuxIterator
tsdb/shard.go:(a Shards) CreateIterator
tsdb/shard.go:(s *Shard) CreateIterator
tsdb/engine/tsm1/engine.go:CreateIterator
tsdb/shard.go:(a Shards) CreateIterator
tsdb/shard.go:(a Shards) createSeriesIterator
tsdb/index.go:NewSeriesPointIterator
tsdb/index.go:MeasurementIterator (is IndexSet) measurementIterator()
tsdb/index/tsi1/index.go: (i *Index) MeasurementIterator()
tsdb/index/tsi1/partition.go (p *Partition) MeasurementIterator()
tsdb/index/tsi1/file_set.go (fs *FileSet) MeasurementIterator()
tsdb/index/tsi1/log_file.go (f *LogFile) MeasurementIterator()

事实上createSeriesIterator之下的MeasurementIterator都是基于measurement name做比较，真正的比较是shards中的seriesPointIterator，其首先会获取每一个measurement的全部时间序列对应的tsid，随后从sfile中获取tsid对应的serieskey，直接原地调用sort，排序seriesKeys，实际调用CompareSeriesKeys作为比较函数。

核心比较函数为CompareSeriesKeys：

func CompareSeriesKeys(a, b []byte) int {
	// Handle 'nil' keys.
	if len(a) == 0 && len(b) == 0 {
		return 0
	} else if len(a) == 0 {
		return -1
	} else if len(b) == 0 {
		return 1
	}

	// Read total size.
	_, a = ReadSeriesKeyLen(a)
	_, b = ReadSeriesKeyLen(b)

	// Read names.
	name0, a := ReadSeriesKeyMeasurement(a)
	name1, b := ReadSeriesKeyMeasurement(b)

	// Compare names, return if not equal.
	if cmp := bytes.Compare(name0, name1); cmp != 0 {
		return cmp
	}

	// Read tag counts.
	tagN0, a := ReadSeriesKeyTagN(a)
	tagN1, b := ReadSeriesKeyTagN(b)

	// Compare each tag in order.
	for i := 0; ; i++ {
		// Check for EOF.
		if i == tagN0 && i == tagN1 {
			return 0
		} else if i == tagN0 {
			return -1
		} else if i == tagN1 {
			return 1
		}

		// Read keys.
		var key0, key1, value0, value1 []byte
		key0, value0, a = ReadSeriesKeyTag(a)
		key1, value1, b = ReadSeriesKeyTag(b)

		// Compare keys & values.
		if cmp := bytes.Compare(key0, key1); cmp != 0 {
			return cmp
		} else if cmp := bytes.Compare(value0, value1); cmp != 0 {
			return cmp
		}
	}
}

结论

所以事实上show series返回的真正顺序遵循以下规则：

以measurement name排序
每个measuremnet基于serieskey内部的tag本身做排序

其实很好理解，因为从tsi中获取measurement对应的tsid后又从sfile获取对应的serieskey，此时的serieskey为了解析的性能，事实上进行了二进制编码的，无法直接拿来比较；其次此时是没有schema的概念的，所以如果基于tag去排序效率会及其低下，因为需要先计算合并后的schema结构，然后再生成新的row，所以搞了个这么个性能不错，但是奇奇怪怪的排序方法；

这种比较方式可以应用到流式框架，但是意义不大，因为本身这种排序就没有规定一个全局顺序（不同的tagkey都直接拿来比较了），而且与influxql本来的排序方式差别很大，额外的开发工作很多；

从用户的角度看，show series的顺序没有什么意义，就算希望有意义，当前的show series也无法满足，所以最好的方式就是直接hash merge，不给用户保证show series的返回顺序，influxdb的官网也没有保证show series的顺序。