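Contents
- I. How Elasticsearch computes the document _score
- 1. Boolean model
- 2. Relevance score algorithm
- 2. Analyzing how the _score of a document is computed
- II. How analyzers work
- 1. Character filter, tokenizer, token filter
- 2. A brief look at the built-in analyzers
- 3. Custom analyzers
- 3.1 The default analyzer: standard
- 3.2 Changing analyzer settings
- 3.3 Building your own analyzer
- 3.4 The IK analyzer in detail
- III. Highlighting
- 1. Highlighting overview
- 2. Common highlighters
- 3. Fast vector highlighter
- 4. Configuring highlight fragments
- IV. Aggregations in depth
- 1. Bucket and metric
- 2. Aggregation examples
- 2.1 histogram: interval statistics
- 2.2 date_histogram: grouping by date interval
- 2.3 _global bucket
- 2.4 aggs + order (aggregation + sorting)
- 2.5 search + aggs (query + aggregation)
- 2.6 filter + aggs (filter + aggregation)
- 2.7 Using filter inside an aggregation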
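Cluster node roles
Node roles are set in elasticsearch.yml in the es config folder:
Master node: node.master: true
Data node: node.data: true
- Client node
When both the master and data settings are false, the node only routes requests, handles searches, and distributes indexing operations. In essence it behaves as a smart load balancer. Dedicated client nodes are very useful in larger clusters: they coordinate between master and data nodes, and because a client node that joins the cluster receives the cluster state, it can route requests directly based on that state.
- Data node
Data nodes store the index data and mainly handle document create, read, update, and delete operations as well as aggregations. They place heavy demands on CPU, memory, and IO. When tuning, monitor the state of the data nodes and add new nodes to the cluster when resources run short.
- Master node
A master-eligible node is mainly responsible for cluster-level operations such as creating or deleting indices, tracking which nodes are part of the cluster, and deciding which shards are allocated to which nodes. A stable master node is very important for cluster health. By default any node in the cluster may be elected master, while indexing and search consume a lot of CPU, memory, and IO, so separating master nodes from data nodes is a good way to keep the cluster stable.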
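I. How Elasticsearch computes the document _score
1. Boolean model
Step 1: based on the user's query, first filter out the docs (documents) that contain the specified terms (keywords).
For example, for the query "hello world":
query "hello world" is split into terms --> hello / world / hello & world
Step 2: filter according to your conditions
bool --> must / must_not / should as conditions --> filter --> must contain / must not contain / may contain
No scoring has happened at this point.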
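The filtering step can be sketched in a few lines of Python. This is only an illustration over made-up, already tokenized documents; `bool_filter` is a hypothetical helper, not an Elasticsearch API. It assumes the usual rule that a should clause becomes mandatory only when no must clause is present.

```python
# Toy documents, already split into lowercase terms (hypothetical data).
docs = {
    1: {"hello", "you", "and", "world"},
    2: {"hello", "how", "are", "you"},
    3: {"hi", "world"},
}

def bool_filter(docs, must=(), must_not=(), should=()):
    """Return the ids of documents that pass the boolean conditions (no scoring)."""
    matched = []
    for doc_id, terms in docs.items():
        if not all(t in terms for t in must):
            continue  # a required term is missing
        if any(t in terms for t in must_not):
            continue  # an excluded term is present
        # Without must clauses, at least one should term has to match.
        if should and not must and not any(t in terms for t in should):
            continue
        matched.append(doc_id)
    return matched

print(bool_filter(docs, must=["hello"], must_not=["how"]))  # [1]
print(bool_filter(docs, should=["hello", "world"]))         # [1, 2, 3]
```

The function only decides which documents stay in the result set; ranking them is the job of the relevance score.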
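2. Relevance score algorithm
This algorithm computes how closely the text in an index matches the search text.
Elasticsearch uses the term frequency/inverse document frequency algorithm, TF/IDF for short. The slash is only part of the name: the TF and IDF factors are multiplied, not divided. Since version 5.0 the default similarity is BM25, which builds on the same TF and IDF ideas.
Step 3: compute the score
Term frequency (TF): how many times each term of the search text appears in the field text. The more occurrences, the more relevant.
For example
Search request: hello world
It is split into hello and world, and each document is checked for how often these terms occur. The more occurrences, the higher the score.
doc1: hello you, and world is very good
doc2: hello, how are you
Here doc1 contains both terms and doc2 only hello, so doc1 scores higher on TF.
Inverse document frequency (IDF): how many times each term of the search text appears across all documents in the whole index. The more occurrences, the less relevant.
(Think of searching for words like 的 or 是, roughly "of" and "is" in Chinese: they appear in almost every document of the index, so matching them says very little. IDF exists to account for this.)
For example
Search request: hello world
doc1: hello, july is good
doc2: hi world, how are you
Each document matches one term, so the document whose matching term is rarer across the whole index gets the higher score.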
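Besides the TF and IDF described above, one more factor plays a role.
Field-length norm: the length of the field. The longer the field, the weaker the relevance.
For example
Search request: hello world
doc1: { "title": "hello july", "content": "...... 1000 words" }
doc2: { "title": "my baby", "content": "...... 1000 words, hi world" }
hello and world occur equally often across the index, but doc1 is more relevant because its match sits in the title field, which is much shorter.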
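To make the three factors concrete, here is a minimal Python sketch. The formulas (tf as the square root of the frequency, idf as 1 + ln(N / (df + 1)), norm as 1 / sqrt(field length)) follow the shape of Lucene's classic TF/IDF similarity in simplified form; this is an assumption for illustration, not the exact computation Elasticsearch performs, and `tokenize` and `score` are hypothetical helpers.

```python
import math
import re

def tokenize(text):
    """Very rough analyzer: lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())

def score(query, doc, all_docs):
    """Sum tf * idf * norm over the query terms that occur in doc."""
    doc_terms = tokenize(doc)
    if not doc_terms:
        return 0.0
    norm = 1 / math.sqrt(len(doc_terms))             # field-length norm
    total = 0.0
    for term in tokenize(query):
        freq = doc_terms.count(term)
        if freq == 0:
            continue
        tf = math.sqrt(freq)                          # term frequency
        df = sum(term in tokenize(d) for d in all_docs)
        idf = 1 + math.log(len(all_docs) / (df + 1))  # inverse document frequency
        total += tf * idf * norm
    return total

docs = ["hello you, and world is very good", "hello, how are you"]
for d in docs:
    print(round(score("hello world", d, docs), 3), d)
```

With the two documents from the TF example, the first one gets the higher score because it matches both terms and world is the rarer of the two.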
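2. Analyzing how the _score of a document is computed
Use the _explain API.
A simple query as an example:
GET /test_index08/_explain/3
{"query":{"match":{"f":"hello"}}}
Result
The response contains the idf and tf components described above. A basic understanding is enough for now: the math behind es scoring is fairly involved and is not covered here.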
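II. How analyzers work
1. Character filter, tokenizer, token filter
- Tokenization and normalization
The configured analyzer splits the data that is about to be stored in es: given a sentence, it breaks the sentence into individual words and normalizes each word (tense conversion, singular/plural conversion, and so on).
The workflow can be divided into three steps
Step 1: character filter: preprocess the text before it is tokenized. The most common case is filtering or converting content, for example stripping HTML tags or converting special symbols such as & --> and.
Step 2: tokenizer: split into tokens, hello you and me --> hello, you, and, me
Step 3: token filter: lowercase, stop word, synonym (for example case conversion, stop word removal, synonym handling).
Only the fully processed result is used to build the inverted index.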
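The three stages can be imitated in plain Python. This is a toy model of the pipeline, not how Elasticsearch implements it; the stop word list and the function names are assumptions made for the sketch.

```python
import re

STOP_WORDS = {"the", "a"}  # assumed stop list for this sketch

def char_filter(text):
    """Step 1: preprocess the raw text."""
    text = re.sub(r"<[^>]+>", "", text)  # drop HTML tags
    return text.replace("&", " and ")     # & --> and

def tokenizer(text):
    """Step 2: split the text into tokens."""
    return re.findall(r"\w+", text)

def token_filter(tokens):
    """Step 3: lowercase, then remove stop words."""
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in STOP_WORDS]

def analyze(text):
    return token_filter(tokenizer(char_filter(text)))

print(analyze("<b>Tom & Jerry</b> are in THE house"))
# ['tom', 'and', 'jerry', 'are', 'in', 'house']
```

The order matters: the stop word check runs after lowercasing, which is why THE is removed as well.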
2. A brief look at the built-in analyzers
Test text: Set the shape to semi-transparent by calling set_trans(5)
- standard analyzer
Result: set, the, shape, to, semi, transparent, by, calling, set_trans, 5 (standard is the default analyzer)
- simple analyzer
Result: set, the, shape, to, semi, transparent, by, calling, set, trans
- whitespace analyzer
Result: Set, the, shape, to, semi-transparent, by, calling, set_trans(5)
- stop analyzer
Result: removes stop words such as a, the, it
Example
POST _analyze
{
"analyzer": "standard",
"text": "Set the shape to semi-transparent by calling set_trans(5)"
}
Detailed result
{
"tokens" : [
{
"token" : "set",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "the",
"start_offset" : 4,
"end_offset" : 7,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "shape",
"start_offset" : 8,
"end_offset" : 13,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "to",
"start_offset" : 14,
"end_offset" : 16,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "semi",
"start_offset" : 17,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "transparent",
"start_offset" : 22,
"end_offset" : 33,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "by",
"start_offset" : 34,
"end_offset" : 36,
"type" : "<ALPHANUM>",
"position" : 6
},
{
"token" : "calling",
"start_offset" : 37,
"end_offset" : 44,
"type" : "<ALPHANUM>",
"position" : 7
},
{
"token" : "set_trans",
"start_offset" : 45,
"end_offset" : 54,
"type" : "<ALPHANUM>",
"position" : 8
},
{
"token" : "5",
"start_offset" : 55,
"end_offset" : 56,
"type" : "<NUM>",
"position" : 9
}
]
}
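For comparison with the standard analyzer output above, the simple and whitespace analyzers can be approximated in Python. These are rough stand-ins written for this article (the real analyzers handle Unicode letters, not just ASCII), but they reproduce the results listed for the test text.

```python
import re

text = "Set the shape to semi-transparent by calling set_trans(5)"

def simple_analyze(text):
    """Split on anything that is not a letter, then lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def whitespace_analyze(text):
    """Split on whitespace only; case and punctuation are left alone."""
    return text.split()

print(simple_analyze(text))
# ['set', 'the', 'shape', 'to', 'semi', 'transparent', 'by', 'calling', 'set', 'trans']
print(whitespace_analyze(text))
# ['Set', 'the', 'shape', 'to', 'semi-transparent', 'by', 'calling', 'set_trans(5)']
```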
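3. Custom analyzers
3.1 The default analyzer: standard
standard tokenizer: splits on word boundaries
standard token filter: does nothing (it was removed in version 7.0)
lowercase token filter: converts all letters to lowercase
stop token filter (disabled by default): removes stop words such as a, the, it
3.2 Changing analyzer settings
Enable stop words for English.
For example
Create an index named my_index, where es_std is the name of the custom analyzer and stopwords enables the English stop word list.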
PUT /my_index
{
"settings": {
"analysis": {
"analyzer": {
"es_std": {
"type": "standard",
"stopwords": "_english_"
}
}
}
}
}
Tokenizing with the default analyzer
GET /my_index/_analyze
{
"analyzer": "standard",
"text": "a dog is in the house"
}
Result
{
"tokens" : [
{
"token" : "a",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "dog",
"start_offset" : 2,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "is",
"start_offset" : 6,
"end_offset" : 8,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "in",
"start_offset" : 9,
"end_offset" : 11,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "the",
"start_offset" : 12,
"end_offset" : 15,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "house",
"start_offset" : 16,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 5
}
]
}
Testing the custom analyzer
GET /my_index/_analyze
{
"analyzer": "es_std",
"text": "a dog is in the house"
}
Result
{
"tokens" : [
{
"token" : "dog",
"start_offset" : 2,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "house",
"start_offset" : 16,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 5
}
]
}
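Note that dog and house keep positions 1 and 5: the stop filter drops tokens but does not renumber the ones that remain. A small Python sketch of that behavior follows; the stop word set is only a subset chosen for this example and `standard_with_stop` is a hypothetical helper.

```python
# Subset of English stop words, assumed for this sketch.
ENGLISH_STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "the", "to"}

def standard_with_stop(text):
    """Tokenize on whitespace, lowercase, and drop stop words while keeping positions."""
    tokens = []
    for position, word in enumerate(text.lower().split()):
        if word not in ENGLISH_STOP_WORDS:
            tokens.append({"token": word, "position": position})
    return tokens

print(standard_with_stop("a dog is in the house"))
# [{'token': 'dog', 'position': 1}, {'token': 'house', 'position': 5}]
```

Keeping the gaps is what lets phrase queries still measure the original distance between words.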
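3.3 Building your own analyzer
Create an index my_index2 that converts & in the content to and. &Toand is a user-chosen name for a char filter of type mapping (a mapping table); multiple rules are separated by commas. A stop word filter removes the and a from the text; my_stopwords is a user-chosen name and its type is stop (stop words). my_analyzer is the name of the custom analyzer and its type is custom. html_strip is built into es and strips HTML tags automatically, lowercase converts upper case to lower case, and "tokenizer": "standard" means the analyzer is built on top of the standard tokenizer.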
PUT /my_index2
{
"settings": {
"analysis": {
"char_filter": {
"&Toand": {
"type": "mapping",
"mappings": [
"&=> and",
"!=> not"
]
}
},
"filter": {
"my_stopwords": {
"type": "stop",
"stopwords": [
"the",
"a"
]
}
},
"analyzer": {
"my_analyzer": {
"type": "custom",
"char_filter": [
"html_strip",
"&Toand"
],
"tokenizer": "standard",
"filter": [
"lowercase",
"my_stopwords"
]
}
}
}
}
}
Test it
GET /my_index2/_analyze
{
"text": "tom&jerry are a friend in the house, <a>, HAHA!!",
"analyzer": "my_analyzer"
}
Result
{
"tokens" : [
{
"token" : "tomandjerry",
"start_offset" : 0,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "are",
"start_offset" : 10,
"end_offset" : 13,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "friend",
"start_offset" : 16,
"end_offset" : 22,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "in",
"start_offset" : 23,
"end_offset" : 25,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "house",
"start_offset" : 30,
"end_offset" : 35,
"type" : "<ALPHANUM>",
"position" : 6
},
{
"token" : "hahanotnot",
"start_offset" : 42,
"end_offset" : 48,
"type" : "<ALPHANUM>",
"position" : 7
}
]
}
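3.4 The IK analyzer in detail
IK configuration files: in the config directory
What the main files do:
- IKAnalyzer.cfg.xml: configures custom dictionaries
- main.dic: IK's built-in Chinese dictionary with more than 270,000 entries; any word listed here is kept together as one token
- quantifier.dic: unit-related words
- suffix.dic: suffixes
- surname.dic: Chinese surnames
- stopword.dic: English stop words
main.dic: contains the built-in Chinese words; text is tokenized according to the words in this file.
stopword.dic: contains the English stop words.
How do you add a custom dictionary to the IK analyzer?
Method 1:
Add the dictionary you need and register its path in the configuration file.
For example, I created a folder named custom under the config directory and put a file custom.dic in it.
Edit the IKAnalyzer.cfg.xml
configuration file (on every node):
<entry key="ext_dict">custom/custom.dic</entry>
This method requires restarting es to take effect.
Method 2 (IK hot update):
Host the custom.dic file at a fixed address, for example 192.168.5.5:8888/custom.dic, and point the configuration of every es node at that same address (in IKAnalyzer.cfg.xml this is the remote_ext_dict entry). To update the dictionary, edit the hosted file directly; es does not need to be restarted.
Method 3 (modify the source):
Download the IK plugin source and modify it so that it reads the dictionary from MySQL.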
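III. Highlighting
1. Highlighting overview
Highlight the parts of the results that match the query, similar to what Baidu search results look like.
Highlighting demo
First create an index and add one document.
Specify the analyzer for individual fields.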
PUT /test_highlight
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"content": {
"type": "text",
"analyzer": "ik_max_word"
}
}
}
}
Or set the index's default analyzer
PUT /test_highlight
{
"settings" : {
"index" : {
"analysis.analyzer.default.type": "ik_max_word"
}
}
}
Insert data
PUT /test_highlight/_doc/1
{
"title": "这是july写的第一篇文章",
"content": "大家好,这是我写的第一篇文章,特别喜欢这个文章"
}
Highlight the query matches
GET /test_highlight/_search
{
"query": {
"match": {
"title": "文章"
}
},
"highlight": {
"fields": {
"title": {}
}
}
}
Result
{
"took" : 416,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "test_highlight",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"title" : "这是july写的第一篇文章",
"content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
},
"highlight" : {
"title" : [
"这是july写的第一篇<em>文章</em>"
]
}
}
]
}
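}
Matches are wrapped in <em></em> tags, which the page can then style, for example in red. So if the specified field contains the search term, the term is highlighted inside that field's text.
Note: only title is highlighted here, because title is the only field in the query. If you also want content highlighted, content has to appear in the query; by default, adding it only under highlight has no effect (unless require_field_match is set to false). See the following example.
GET /test_highlight/_search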
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "文章"
}
},
{
"match": {
"content": "文章"
}
}
]
}
},
"highlight": {
"fields": {
"title": {},
"content": {}
}
}
}
Result
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.68324494,
"hits" : [
{
"_index" : "test_highlight",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.68324494,
"_source" : {
"title" : "这是july写的第一篇文章",
"content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
},
"highlight" : {
"title" : [
"这是july写的第一篇<em>文章</em>"
],
"content" : [
"大家好,这是我写的第一篇<em>文章</em>,特别喜欢这个<em>文章</em>"
]
}
}
]
}
}
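What the highlighter returns can be pictured as a simple wrap of each matched term. The Python function below is only an illustration of that output shape; its `pre_tag` and `post_tag` parameters mirror the `pre_tags` and `post_tags` highlight options of Elasticsearch, while the function itself is hypothetical and ignores analysis entirely.

```python
def highlight(text, terms, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of each term in the given tags."""
    for term in terms:
        text = text.replace(term, pre_tag + term + post_tag)
    return text

print(highlight("这是july写的第一篇文章", ["文章"]))
# 这是july写的第一篇<em>文章</em>
print(highlight("大家好,这是我写的第一篇文章,特别喜欢这个文章", ["文章"]))
# 大家好,这是我写的第一篇<em>文章</em>,特别喜欢这个<em>文章</em>
```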
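2. Common highlighters
-
plain highlighter: the classic Lucene highlighter. It was the default in older versions; since 6.0 the default is the unified highlighter.
-
postings highlighter: requires index_options=offsets. Since 6.0 it has been replaced by the unified highlighter, which uses the same postings offsets when they are indexed.
The postings approach is faster than plain because the text to highlight does not need to be re-analyzed. It also uses less disk.
How to use the postings approach for highlight queries
Specify the mapping as follows when creating the index.
For example: to highlight the content field, set "index_options": "offsets".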
PUT /test_highlight
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"content": {
"type": "text",
"analyzer": "ik_max_word",
"index_options": "offsets"
}
}
}
}
The query is the same as with the default highlighter
GET /test_highlight/_search
{
"query": {
"match": {
"content": "文章"
}
},
"highlight": {
"fields": {
"content": {}
}
}
}
3. Fast vector highlighter
If index-time term vectors are set in the mapping, the fast vector highlighter is used.
For large fields (over 1 MB) it performs better.
How to use it
For example: to highlight the content field, set "term_vector": "with_positions_offsets"
PUT /test_highlight
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"content": {
"type": "text",
"analyzer": "ik_max_word",
"term_vector" : "with_positions_offsets"
}
}
}
}
The query is again the same.
How to force a specific highlighter type in a query
GET /test_highlight/_search
{
"query": {
"match": {
"content": "文章"
}
},
"highlight": {
"fields": {
"content": {
"type": "plain"
}
}
}
}
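4. Configuring highlight fragments
Scenario: you want to highlight 'java', and the field in question holds more than 10,000 characters. You probably do not need the whole text back, only a small part of it, and you do not need every match shown at once, only the first few highlighted ones.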
GET /test_highlight/_search
{
"query" : {
"match": { "content": "文章" }
},
"highlight" : {
"fields" : {
"content" : {"fragment_size" : 5, "number_of_fragments" : 3 }
}
}
}
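fragment_size: the length of each returned fragment in characters; the default is 100.
number_of_fragments: the highlighted text may fall into several fragments; this sets how many of them are returned.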
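The effect of the two settings can be sketched with a deliberately naive fragmenter in Python. It cuts the text into fixed windows, whereas real highlighters break on sentence or word boundaries, so a term that straddles two windows would be missed here. The function name and its defaults are assumptions for this illustration.

```python
def fragments(text, term, fragment_size=100, number_of_fragments=5,
              pre_tag="<em>", post_tag="</em>"):
    """Cut text into fixed-size windows and return the first few that contain term."""
    windows = [text[i:i + fragment_size] for i in range(0, len(text), fragment_size)]
    hits = [w.replace(term, pre_tag + term + post_tag) for w in windows if term in w]
    return hits[:number_of_fragments]

long_text = "learn java now. " * 4  # four 16-character sentences
result = fragments(long_text, "java", fragment_size=16, number_of_fragments=3)
print(result)
# ['learn <em>java</em> now. ', 'learn <em>java</em> now. ', 'learn <em>java</em> now. ']
```

Four windows match, but only three come back because number_of_fragments caps the list.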
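IV. Aggregations in depth
1. Bucket and metric
In Elasticsearch, bucket and metric are two important types of aggregation. They are used to group, filter, and compute over search results.
Bucket:
An aggregation that divides documents into segments, or buckets. A bucket can be seen as a classification step: a bucket aggregation groups the search results by some rule into several distinct buckets.
Common bucket types:
- Terms Bucket: groups by the values of a field, similar to GROUP BY in SQL.
- Date Histogram Bucket: groups documents by time interval, such as per day, per week, or per month.
- Range Bucket: groups by numeric ranges, for example by price range.
Metric:
An aggregation that computes over the documents in a bucket. Metrics are usually applied to data that has already been grouped, to produce summary values.
Common metric types:
- Sum Metric: sums the values of a field.
- Avg Metric: averages the values of a field.
- Max Metric: takes the maximum value of a field.
- Min Metric: takes the minimum value of a field.
- Cardinality Metric: counts the distinct values of a field.
For example, given an index of product sales records with a "category" field for the product type, we can use a Terms Bucket to group by product type and then apply a metric such as Sum Metric to compute the total sales per product type.
This can be done with the following Elasticsearch query: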
{
"aggs": {
"sales_by_category": {
"terms": { "field": "category" },
"aggs": {
"total_sales": { "sum": { "field": "price" } }
}
}
}
}
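The query above first uses a Terms Bucket to group all products by product type and then uses a Sum Metric to add up the prices within each group, giving the total sales per product type. sales_by_category is a user-chosen name for the grouping.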
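The same bucket-then-metric idea, written out in Python over a handful of made-up sales records. The records and the helper name are hypothetical; the result keys reuse the names from the query above.

```python
from collections import defaultdict

# Hypothetical sales records for illustration.
sales = [
    {"category": "phone", "price": 3000},
    {"category": "phone", "price": 5000},
    {"category": "laptop", "price": 8000},
]

def sales_by_category(records):
    """Terms bucket on category, sum metric on price."""
    buckets = defaultdict(lambda: {"doc_count": 0, "total_sales": 0})
    for record in records:
        bucket = buckets[record["category"]]    # bucket: pick the group
        bucket["doc_count"] += 1
        bucket["total_sales"] += record["price"]  # metric: accumulate inside it
    return dict(buckets)

print(sales_by_category(sales))
# {'phone': {'doc_count': 2, 'total_sales': 8000}, 'laptop': {'doc_count': 1, 'total_sales': 8000}}
```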
2. Aggregation examples
Create the index and insert data.
PUT /cars
{
"mappings": {
"properties": {
"price": {
"type": "long"
},
"color": {
"type": "keyword"
},
"brand": {
"type": "keyword"
},
"model": {
"type": "keyword"
},
"sold_date": {
"type": "date"
},
"remark": {
"type": "text",
"analyzer": "ik_max_word"
}
}
}
}
Add data
POST /cars/_bulk
{"index":{}}
{"price":258000,"color":"金色","brand":"大众","model":"大众迈腾","sold_date":"2021-10-28","remark":"大众中档车"}
{"index":{}}
{"price":123000,"color":"金色","brand":"大众","model":"大众速腾","sold_date":"2021-11-05","remark":"大众神车"}
{"index":{}}
{"price":239800,"color":"白色","brand":"标志","model":"标志508","sold_date":"2021-05-18","remark":"标志品牌全球上市车型"}
{"index":{}}
{"price":148800,"color":"白色","brand":"标志","model":"标志408","sold_date":"2021-07-02","remark":"比较大的紧凑型车"}
{"index":{}}
{"price":1998000,"color":"黑色","brand":"大众","model":"大众辉腾","sold_date":"2021-08-19","remark":"大众最让人肝疼的车"}
{"index":{}}
{"price":218000,"color":"红色","brand":"奥迪","model":"奥迪A4","sold_date":"2021-11-05","remark":"小资车型"}
{"index":{}}
{"price":489000,"color":"黑色","brand":"奥迪","model":"奥迪A6","sold_date":"2022-01-01","remark":"政府专用?"}
{"index":{}}
{"price":1899000,"color":"黑色","brand":"奥迪","model":"奥迪A 8","sold_date":"2022-02-12","remark":"很贵的大A6"}
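① Count sales grouped by color
This only groups the documents, without any further statistics. The most basic aggregation in ES is terms, roughly GROUP BY with COUNT in SQL.
ES sorts the buckets by default, in descending order of doc_count. You can sort by the _key metadata (the field value of the bucket) or by the _count metadata (the document count of the bucket) to get a different ordering.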
GET /cars/_search
{
"aggs": {
"group_by_color": {
"terms": {
"field": "color",
"order": {
"_count": "desc"
}
}
}
}
}
Result: hits holds the matched documents, and aggregations holds the aggregation output.
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "黑色",
"doc_count" : 3
},
{
"key" : "白色",
"doc_count" : 2
},
{
"key" : "金色",
"doc_count" : 2
},
{
"key" : "红色",
"doc_count" : 1
}
]
}
}
}
If you do not want the documents returned, just set size to 0.
GET /cars/_search
{
"size": 0,
"aggs": {
"group_by_color": {
"terms": {
"field": "color",
"order": {
"_count": "desc"
}
}
}
}
}
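What this terms aggregation computes can be reproduced in Python from the color values of the eight cars. The tie-break by key for buckets with equal counts is an assumption made so the order matches the response shown earlier.

```python
from collections import Counter

# Color of each of the eight cars, in insertion order.
colors = ["金色", "金色", "白色", "白色", "黑色", "红色", "黑色", "黑色"]

def group_by_color(colors):
    """Count documents per color, ordered by count descending, then by key."""
    counts = Counter(colors)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(group_by_color(colors))
# [('黑色', 3), ('白色', 2), ('金色', 2), ('红色', 1)]
```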
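② Average price per color (drill-down analysis: aggs nested inside aggs)
This example first groups by color and then computes a statistic over the documents in each group; that per-group statistic is the metric. Sorting works here as well: because the statistic has been given the name avg_by_price, the buckets can be ordered by that name.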
GET /cars/_search
{
"size": 0,
"aggs": {
"group_by_color": {
"terms": {
"field": "color",
"order": {
"avg_by_price": "asc"
}
},
"aggs": {
"avg_by_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "金色",
"doc_count" : 2,
"avg_by_price" : {
"value" : 190500.0
}
},
{
"key" : "白色",
"doc_count" : 2,
"avg_by_price" : {
"value" : 194300.0
}
},
{
"key" : "红色",
"doc_count" : 1,
"avg_by_price" : {
"value" : 218000.0
}
},
{
"key" : "黑色",
"doc_count" : 3,
"avg_by_price" : {
"value" : 1462000.0
}
}
]
}
}
}
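The numbers in this response can be checked with a few lines of Python over the (price, color) pairs of the sample data. This is a plain re-computation for illustration, not something Elasticsearch runs.

```python
from collections import defaultdict

# (price, color) for each of the eight cars.
cars = [
    (258000, "金色"), (123000, "金色"), (239800, "白色"), (148800, "白色"),
    (1998000, "黑色"), (218000, "红色"), (489000, "黑色"), (1899000, "黑色"),
]

def avg_price_by_color(cars):
    """Bucket by color, average the price per bucket, order by the average ascending."""
    groups = defaultdict(list)
    for price, color in cars:
        groups[color].append(price)
    averages = {color: sum(prices) / len(prices) for color, prices in groups.items()}
    return sorted(averages.items(), key=lambda kv: kv[1])

print(avg_price_by_color(cars))
# [('金色', 190500.0), ('白色', 194300.0), ('红色', 218000.0), ('黑色', 1462000.0)]
```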
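Setting size to 0 means no documents are returned, only the aggregated data, which speeds up the query. If you do need the documents, set size to whatever fits your case.
③ Average price per color and per brand within each color
Query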
GET /cars/_search
{
"aggs": {
"group_by_color": {
"terms": {
"field": "color",
"order": {
"avg_by_price_color": "asc"
}
},
"aggs": {
"avg_by_price_color": {
"avg": {
"field": "price"
}
},
"group_by_brand": {
"terms": {
"field": "brand",
"order": {
"avg_by_price_brand": "desc"
}
},
"aggs": {
"avg_by_price_brand": {
"avg": {
"field": "price"
}
}
}
}
}
}
}
}
Result
{
"took" : 13,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "金色",
"doc_count" : 2,
"avg_by_price_color" : {
"value" : 190500.0
},
"group_by_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "大众",
"doc_count" : 2,
"avg_by_price_brand" : {
"value" : 190500.0
}
}
]
}
},
{
"key" : "白色",
"doc_count" : 2,
"avg_by_price_color" : {
"value" : 194300.0
},
"group_by_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "标志",
"doc_count" : 2,
"avg_by_price_brand" : {
"value" : 194300.0
}
}
]
}
},
{
"key" : "红色",
"doc_count" : 1,
"avg_by_price_color" : {
"value" : 218000.0
},
"group_by_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "奥迪",
"doc_count" : 1,
"avg_by_price_brand" : {
"value" : 218000.0
}
}
]
}
},
{
"key" : "黑色",
"doc_count" : 3,
"avg_by_price_color" : {
"value" : 1462000.0
},
"group_by_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "大众",
"doc_count" : 1,
"avg_by_price_brand" : {
"value" : 1998000.0
}
},
{
"key" : "奥迪",
"doc_count" : 2,
"avg_by_price_brand" : {
"value" : 1194000.0
}
}
]
}
}
]
}
}
}
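Grouping first by color and then again by brand inside each group is called drill-down analysis (that is, nested definitions).
aggs can also be defined side by side, in the following format.
GET /index_name/_search
{
"aggs" : {
"group_name_1" : {},
"group_name_2" : {}
}
}
Example: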
GET /cars/_search
{
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
}
},
"avg_by_price_color": {
"avg": {
"field": "price"
}
}
}
}
Result
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"avg_by_price_color" : {
"value" : 671700.0
},
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "黑色",
"doc_count" : 3
},
{
"key" : "白色",
"doc_count" : 2
},
{
"key" : "金色",
"doc_count" : 2
},
{
"key" : "红色",
"doc_count" : 1
}
]
}
}
}
④ Maximum, minimum, and total price per color
Query
GET /cars/_search
{
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"max_price": {
"max": {
"field": "price"
}
},
"min_price": {
"min": {
"field": "price"
}
},
"sum_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "黑色",
"doc_count" : 3,
"max_price" : {
"value" : 1998000.0
},
"min_price" : {
"value" : 489000.0
},
"sum_price" : {
"value" : 4386000.0
}
},
{
"key" : "白色",
"doc_count" : 2,
"max_price" : {
"value" : 239800.0
},
"min_price" : {
"value" : 148800.0
},
"sum_price" : {
"value" : 388600.0
}
},
{
"key" : "金色",
"doc_count" : 2,
"max_price" : {
"value" : 258000.0
},
"min_price" : {
"value" : 123000.0
},
"sum_price" : {
"value" : 381000.0
}
},
{
"key" : "红色",
"doc_count" : 1,
"max_price" : {
"value" : 218000.0
},
"min_price" : {
"value" : 218000.0
},
"sum_price" : {
"value" : 218000.0
}
}
]
}
}
}
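Several metrics on the same bucket are simply several computations over the same group of documents. A Python re-computation over the (price, color) pairs of the sample data, using the metric names from the query:

```python
from collections import defaultdict

# (price, color) for each of the eight cars.
cars = [
    (258000, "金色"), (123000, "金色"), (239800, "白色"), (148800, "白色"),
    (1998000, "黑色"), (218000, "红色"), (489000, "黑色"), (1899000, "黑色"),
]

def price_stats_by_color(cars):
    """For each color bucket, compute the max, min, and sum of price."""
    groups = defaultdict(list)
    for price, color in cars:
        groups[color].append(price)
    return {
        color: {"max_price": max(p), "min_price": min(p), "sum_price": sum(p)}
        for color, p in groups.items()
    }

stats = price_stats_by_color(cars)
print(stats["黑色"])
# {'max_price': 1998000, 'min_price': 489000, 'sum_price': 4386000}
```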
⑤ The highest-priced model of each brand
Query
GET cars/_search
{
"size": 0,
"aggs": {
"group_by_brand": {
"terms": {
"field": "brand"
},
"aggs": {
"top_car": {
"top_hits": {
"size": 1,
"sort": [
{
"price": {
"order": "desc"
}
}
],
"_source": {
"includes": [
"model",
"price"
]
}
}
}
}
}
}
}
Result
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "大众",
"doc_count" : 3,
"top_car" : {
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : null,
"_source" : {
"price" : 1998000,
"model" : "大众辉腾"
},
"sort" : [
1998000
]
}
]
}
}
},
{
"key" : "奥迪",
"doc_count" : 3,
"top_car" : {
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : null,
"_source" : {
"price" : 1899000,
"model" : "奥迪A 8"
},
"sort" : [
1899000
]
}
]
}
}
},
{
"key" : "标志",
"doc_count" : 2,
"top_car" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : null,
"_source" : {
"price" : 239800,
"model" : "标志508"
},
"sort" : [
239800
]
}
]
}
}
}
]
}
}
}
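In effect, top_hits with size 1 and a descending sort on price keeps the single most expensive document per bucket. A Python sketch of the same selection over the sample data (the helper name is hypothetical):

```python
# (price, brand, model) for each of the eight cars.
cars = [
    (258000, "大众", "大众迈腾"), (123000, "大众", "大众速腾"),
    (239800, "标志", "标志508"), (148800, "标志", "标志408"),
    (1998000, "大众", "大众辉腾"), (218000, "奥迪", "奥迪A4"),
    (489000, "奥迪", "奥迪A6"), (1899000, "奥迪", "奥迪A 8"),
]

def top_car_by_brand(cars):
    """Keep only the highest-priced car seen so far for each brand."""
    best = {}
    for price, brand, model in cars:
        if brand not in best or price > best[brand]["price"]:
            best[brand] = {"model": model, "price": price}
    return best

print(top_car_by_brand(cars))
```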
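2.1 histogram: interval statistics
histogram is similar to terms: it also groups documents into buckets, but it groups by value ranges of a field.
For example: count the cars sold and their average price per price range of 1,000,000. With the histogram aggregation, field is set to the price field and the range width is 1,000,000 (interval: 1000000). ES then divides price into the ranges [0, 1000000), [1000000, 2000000), [2000000, 3000000), and so on. Along with the ranges, histogram counts the documents in each one, like terms does, and nested aggs can run further analysis on the documents of each bucket.
Query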
GET /cars/_search
{
"aggs": {
"histogram_by_price": {
"histogram": {
"field": "price",
"interval": 1000000
},
"aggs": {
"avg_by_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"histogram_by_price" : {
"buckets" : [
{
"key" : 0.0,
"doc_count" : 6,
"avg_by_price" : {
"value" : 246100.0
}
},
{
"key" : 1000000.0,
"doc_count" : 2,
"avg_by_price" : {
"value" : 1948500.0
}
}
]
}
}
}
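The bucket key is the price rounded down to a multiple of the interval. A short Python check of the two buckets above, using the prices from the sample data:

```python
from collections import defaultdict

prices = [258000, 123000, 239800, 148800, 1998000, 218000, 489000, 1899000]

def histogram(prices, interval):
    """Bucket each price by floor(price / interval) * interval and average per bucket."""
    buckets = defaultdict(list)
    for price in prices:
        key = (price // interval) * interval
        buckets[key].append(price)
    return {
        key: {"doc_count": len(values), "avg_by_price": sum(values) / len(values)}
        for key, values in sorted(buckets.items())
    }

print(histogram(prices, 1000000))
# {0: {'doc_count': 6, 'avg_by_price': 246100.0}, 1000000: {'doc_count': 2, 'avg_by_price': 1948500.0}}
```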
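2.2 date_histogram: grouping by date interval
date_histogram groups a date field into intervals, for example sales per month or per year.
For example: count the cars sold and the total sales amount per month. In date_histogram, field names the field to group on; interval sets the interval (possible values: year, quarter, month, week, day, hour, minute, second); format sets the date format of the bucket keys; min_doc_count sets the minimum number of documents a bucket needs to be returned (default 0, in which case empty intervals are returned as buckets too); extended_bounds sets the start and end of the range (by default the range runs from the interval of the smallest date in the field to the interval of the largest).
Example: total price per month over 2021 to 2022.
Syntax for versions before es 7.x (on 7.x it still runs but is deprecated in favor of calendar_interval or fixed_interval)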
GET /cars/_search
{
"aggs": {
"histogram_by_date": {
"date_histogram": {
"field": "sold_date",
"interval": "month",
"format": "yyyy-MM-dd",
"min_doc_count": 1,
"extended_bounds": {
"min": "2021-01-01",
"max": "2022-12-31"
}
},
"aggs": {
"sum_by_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
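The monthly grouping this request asks for can be re-computed in Python from the sample data. The bucket key is the first day of the month, formatted like the yyyy-MM-dd setting in the request. Unlike the real aggregation, this sketch only produces months that contain at least one sale, which corresponds to min_doc_count: 1.

```python
from collections import defaultdict
from datetime import date

# (sold_date, price) for each of the eight cars.
sales = [
    ("2021-10-28", 258000), ("2021-11-05", 123000), ("2021-05-18", 239800),
    ("2021-07-02", 148800), ("2021-08-19", 1998000), ("2021-11-05", 218000),
    ("2022-01-01", 489000), ("2022-02-12", 1899000),
]

def sum_by_month(sales):
    """Bucket by calendar month and sum the price in each bucket."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum_by_price": 0})
    for sold_date, price in sales:
        key = date.fromisoformat(sold_date).replace(day=1).isoformat()
        buckets[key]["doc_count"] += 1
        buckets[key]["sum_by_price"] += price
    return dict(sorted(buckets.items()))

for month, bucket in sum_by_month(sales).items():
    print(month, bucket)
```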
Result
#! Deprecation: [interval] on [date_histogram] is deprecated, use [fixed_interval] or [calendar_interval] in the future.
{
"took" : 12,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"histogram_by_date" : {
"buckets" : [
{
"key_as_string" : "2021-05-01",
"key" : 1619827200000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 239800.0
}
},
{
"key_as_string" : "2021-07-01",
"key" : 1625097600000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 148800.0
}
},
{
"key_as_string" : "2021-08-01",
"key" : 1627776000000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 1998000.0
}
},
{
"key_as_string" : "2021-10-01",
"key" : 1633046400000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 258000.0
}
},
{
"key_as_string" : "2021-11-01",
"key" : 1635724800000,
"doc_count" : 2,
"sum_by_price" : {
"value" : 341000.0
}
},
{
"key_as_string" : "2022-01-01",
"key" : 1640995200000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 489000.0
}
},
{
"key_as_string" : "2022-02-01",
"key" : 1643673600000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 1899000.0
}
}
]
}
}
}
Syntax for ES 7.x and later
Query: replace the interval keyword with calendar_interval.
GET /cars/_search
{
"aggs": {
"histogram_by_date": {
"date_histogram": {
"field": "sold_date",
"calendar_interval": "month",
"format": "yyyy-MM-dd",
"min_doc_count": 1,
"extended_bounds": {
"min": "2021-01-01",
"max": "2022-12-31"
}
},
"aggs": {
"sum_by_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"histogram_by_date" : {
"buckets" : [
{
"key_as_string" : "2021-05-01",
"key" : 1619827200000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 239800.0
}
},
{
"key_as_string" : "2021-07-01",
"key" : 1625097600000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 148800.0
}
},
{
"key_as_string" : "2021-08-01",
"key" : 1627776000000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 1998000.0
}
},
{
"key_as_string" : "2021-10-01",
"key" : 1633046400000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 258000.0
}
},
{
"key_as_string" : "2021-11-01",
"key" : 1635724800000,
"doc_count" : 2,
"sum_by_price" : {
"value" : 341000.0
}
},
{
"key_as_string" : "2022-01-01",
"key" : 1640995200000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 489000.0
}
},
{
"key_as_string" : "2022-02-01",
"key" : 1643673600000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 1899000.0
}
}
]
}
}
}
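To make the bucketing concrete, here is a minimal Python sketch of what the monthly date_histogram with a sum sub-aggregation computes. It is not how Elasticsearch implements it; the function name month_histogram and the CARS list are illustrative assumptions, with the dates and prices copied from the sample documents above.

```python
from collections import defaultdict
from datetime import date

# (sold_date, price) pairs copied from the sample cars documents.
CARS = [
    ("2021-10-28", 258000),
    ("2021-11-05", 123000),
    ("2021-05-18", 239800),
    ("2021-07-02", 148800),
    ("2021-08-19", 1998000),
    ("2021-11-05", 218000),
    ("2022-01-01", 489000),
    ("2022-02-12", 1899000),
]


def month_histogram(rows, min_doc_count=1):
    """Group rows by calendar month and sum the price in each bucket."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum_by_price": 0})
    for sold_date, price in rows:
        d = date.fromisoformat(sold_date)
        key = d.replace(day=1).isoformat()  # bucket key = first day of the month
        buckets[key]["doc_count"] += 1
        buckets[key]["sum_by_price"] += price
    # Drop buckets below min_doc_count and return the rest in key order.
    return [
        {"key_as_string": key, **agg}
        for key, agg in sorted(buckets.items())
        if agg["doc_count"] >= min_doc_count
    ]


if __name__ == "__main__":
    for bucket in month_histogram(CARS):
        print(bucket)
```

Each document lands in the bucket keyed by the first day of its month, which is why the two cars sold on 2021-11-05 share the 2021-11-01 bucket.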
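2.3 global bucket
When aggregating, it is sometimes necessary to compare a subset of the data with the whole data set.
Example:
Compute the average price of one brand's cars alongside the average price of all cars. global defines a global bucket that ignores the query conditions and runs its sub-aggregations over every document in the index.
Query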
GET /cars/_search
{
"size": 0,
"query": {
"match": {
"brand": "大众"
}
},
"aggs": {
"volkswagen_of_avg_price": {
"avg": {
"field": "price"
}
},
"all_avg_price": {
"global": {},
"aggs": {
"all_of_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"all_avg_price" : {
"doc_count" : 8,
"all_of_price" : {
"value" : 671700.0
}
},
"volkswagen_of_avg_price" : {
"value" : 793000.0
}
}
}
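The difference between the query-scoped average and the global one can be sketched in plain Python. This is only an illustration of the arithmetic, assuming the eight sample documents above; brand_vs_global and the result key names are hypothetical.

```python
# brand and price copied from the sample cars documents.
CARS = [
    {"brand": "大众", "price": 258000},
    {"brand": "大众", "price": 123000},
    {"brand": "标志", "price": 239800},
    {"brand": "标志", "price": 148800},
    {"brand": "大众", "price": 1998000},
    {"brand": "奥迪", "price": 218000},
    {"brand": "奥迪", "price": 489000},
    {"brand": "奥迪", "price": 1899000},
]


def avg_price(rows):
    """Average of the price field over the given rows."""
    return sum(r["price"] for r in rows) / len(rows)


def brand_vs_global(rows, brand):
    """Average over the query matches next to the average over all rows."""
    matched = [r for r in rows if r["brand"] == brand]  # what the query selects
    return {
        "brand_avg_price": avg_price(matched),  # ordinary agg: sees only the matches
        "all_avg_price": {                      # global bucket: ignores the query
            "doc_count": len(rows),
            "all_of_price": avg_price(rows),
        },
    }


if __name__ == "__main__":
    print(brand_vs_global(CARS, "大众"))
```

The ordinary aggregation sees only the three matching documents, while the global bucket sees all eight, which matches doc_count 8 in the response.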
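2.4 aggs+order (aggregation + sorting)
Sort the aggregated buckets.
Example:
Count the cars sold and sum the sales amount for each brand, sorted by total sales amount in descending order.
Query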
GET /cars/_search
{
"aggs": {
"group_of_brand": {
"terms": {
"field": "brand",
"order": {
"sum_of_price": "desc"
}
},
"aggs": {
"sum_of_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"group_of_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "奥迪",
"doc_count" : 3,
"sum_of_price" : {
"value" : 2606000.0
}
},
{
"key" : "大众",
"doc_count" : 3,
"sum_of_price" : {
"value" : 2379000.0
}
},
{
"key" : "标志",
"doc_count" : 2,
"sum_of_price" : {
"value" : 388600.0
}
}
]
}
}
}
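As a rough equivalent of the terms aggregation ordered by its sum sub-aggregation, the following Python sketch groups the sample documents by brand and sorts the buckets by total price. The name terms_by_brand is an assumption made for illustration.

```python
from collections import defaultdict

# (brand, price) pairs copied from the sample cars documents.
CARS = [
    ("大众", 258000),
    ("大众", 123000),
    ("标志", 239800),
    ("标志", 148800),
    ("大众", 1998000),
    ("奥迪", 218000),
    ("奥迪", 489000),
    ("奥迪", 1899000),
]


def terms_by_brand(rows):
    """One bucket per brand, ordered by summed price in descending order."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum_of_price": 0})
    for brand, price in rows:
        buckets[brand]["doc_count"] += 1
        buckets[brand]["sum_of_price"] += price
    ordered = sorted(
        buckets.items(),
        key=lambda item: item[1]["sum_of_price"],
        reverse=True,  # "desc" in the order clause
    )
    return [{"key": brand, **agg} for brand, agg in ordered]


if __name__ == "__main__":
    for bucket in terms_by_brand(CARS):
        print(bucket)
```

The sort key is the metric computed inside each bucket, which is what the order clause expresses by naming sum_of_price.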
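With multiple levels of aggs (drill-down aggregation), a bucket aggregation can likewise be ordered by a metric nested inside it: the order clause refers to the sub-aggregation by its name.
Example:
For each brand, sum the sales amount per color and sort the colors by that sum in descending order. As with grouped sorting in SQL, the ordering applies only within each group (the colors inside one brand), not across groups.
Query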
GET /cars/_search
{
"aggs": {
"group_by_brand": {
"terms": {
"field": "brand"
},
"aggs": {
"group_by_color": {
"terms": {
"field": "color",
"order": {
"sum_of_price": "desc"
}
},
"aggs": {
"sum_of_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
}
}
Result
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "VIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 1899000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A 8",
"sold_date" : "2022-02-12",
"remark" : "很贵的大A6。。。"
}
}
]
},
"aggregations" : {
"group_by_brand" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "大众",
"doc_count" : 3,
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "黑色",
"doc_count" : 1,
"sum_of_price" : {
"value" : 1998000.0
}
},
{
"key" : "金色",
"doc_count" : 2,
"sum_of_price" : {
"value" : 381000.0
}
}
]
}
},
{
"key" : "奥迪",
"doc_count" : 3,
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "黑色",
"doc_count" : 2,
"sum_of_price" : {
"value" : 2388000.0
}
},
{
"key" : "红色",
"doc_count" : 1,
"sum_of_price" : {
"value" : 218000.0
}
}
]
}
},
{
"key" : "标志",
"doc_count" : 2,
"group_by_color" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "白色",
"doc_count" : 2,
"sum_of_price" : {
"value" : 388600.0
}
}
]
}
}
]
}
}
}
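The "sort within each group only" behaviour can be sketched in Python as follows. The function name nested_terms is hypothetical, and ordering the outer brand buckets by document count (ties broken by key) is an assumption chosen because it reproduces the order seen in the response.

```python
from collections import defaultdict

# (brand, color, price) triples copied from the sample cars documents.
CARS = [
    ("大众", "金色", 258000),
    ("大众", "金色", 123000),
    ("标志", "白色", 239800),
    ("标志", "白色", 148800),
    ("大众", "黑色", 1998000),
    ("奥迪", "红色", 218000),
    ("奥迪", "黑色", 489000),
    ("奥迪", "黑色", 1899000),
]


def nested_terms(rows):
    """Brand buckets, each holding color buckets sorted by summed price."""
    by_brand = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [count, sum]
    for brand, color, price in rows:
        by_brand[brand][color][0] += 1
        by_brand[brand][color][1] += price
    result = []
    for brand, colors in by_brand.items():
        # Inner order: sum_of_price descending, applied inside this brand only.
        inner = sorted(colors.items(), key=lambda kv: kv[1][1], reverse=True)
        result.append({
            "key": brand,
            "doc_count": sum(count for count, _ in colors.values()),
            "group_by_color": [
                {"key": color, "doc_count": count, "sum_of_price": total}
                for color, (count, total) in inner
            ],
        })
    # Outer order (assumed): doc_count descending, ties broken by key.
    result.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return result


if __name__ == "__main__":
    for bucket in nested_terms(CARS):
        print(bucket)
```

Note that the black 大众 bucket (1998000) and the black 奥迪 bucket (2388000) are never compared with each other; each is only ranked against the other colors of its own brand.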
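2.5 search+aggs (conditional query + aggregation)
Aggregation is similar to the GROUP BY clause in SQL, and search is similar to the WHERE clause. In ES the two can be combined in a single request to run more complex search statistics: the aggregations are computed over the documents that match the query.
Example:
Count the cars sold and sum the sales amount per quarter for one brand.
Query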
GET /cars/_search
{
"query": {
"match": {
"brand": "大众"
}
},
"aggs": {
"histogram_by_date": {
"date_histogram": {
"field": "sold_date",
"calendar_interval": "quarter",
"min_doc_count": 1
},
"aggs": {
"sum_by_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
Result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.9444616,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 0.9444616,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 0.9444616,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 0.9444616,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
}
]
},
"aggregations" : {
"histogram_by_date" : {
"buckets" : [
{
"key_as_string" : "2021-07-01T00:00:00.000Z",
"key" : 1625097600000,
"doc_count" : 1,
"sum_by_price" : {
"value" : 1998000.0
}
},
{
"key_as_string" : "2021-10-01T00:00:00.000Z",
"key" : 1633046400000,
"doc_count" : 2,
"sum_by_price" : {
"value" : 381000.0
}
}
]
}
}
}
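The WHERE-then-GROUP-BY analogy can be written out as a short Python sketch: first keep the documents of one brand, then bucket them by calendar quarter. The names quarter_start and quarterly_sales are illustrative assumptions.

```python
from collections import defaultdict
from datetime import date

# (brand, sold_date, price) triples copied from the sample cars documents.
CARS = [
    ("大众", "2021-10-28", 258000),
    ("大众", "2021-11-05", 123000),
    ("标志", "2021-05-18", 239800),
    ("标志", "2021-07-02", 148800),
    ("大众", "2021-08-19", 1998000),
    ("奥迪", "2021-11-05", 218000),
    ("奥迪", "2022-01-01", 489000),
    ("奥迪", "2022-02-12", 1899000),
]


def quarter_start(d):
    """First day of the calendar quarter that contains d."""
    first_month = (d.month - 1) // 3 * 3 + 1
    return date(d.year, first_month, 1)


def quarterly_sales(rows, brand):
    """Filter by brand (the query), then bucket by quarter (the aggregation)."""
    buckets = defaultdict(lambda: [0, 0])  # [doc_count, sum_by_price]
    for row_brand, sold_date, price in rows:
        if row_brand != brand:
            continue  # documents outside the query never reach the aggregation
        key = quarter_start(date.fromisoformat(sold_date)).isoformat()
        buckets[key][0] += 1
        buckets[key][1] += price
    return [(key, count, total) for key, (count, total) in sorted(buckets.items())]


if __name__ == "__main__":
    print(quarterly_sales(CARS, "大众"))
```

Only the three 大众 documents reach the bucketing step, so the output has two quarters, as in the response above.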
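2.6 filter+aggs (filter + aggregation)
A filter can also be combined with aggs to aggregate over filtered data.
Example:
Compute the average price of cars priced between 100,000 and 500,000.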
GET /cars/_search
{
"query": {
"constant_score": {
"filter": {
"range": {
"price": {
"gte": 100000,
"lte": 500000
}
}
}
}
},
"aggs": {
"avg_by_price": {
"avg": {
"field": "price"
}
}
}
}
Result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "T4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 239800,
"color" : "白色",
"brand" : "标志",
"model" : "标志508",
"sold_date" : "2021-05-18",
"remark" : "标志品牌全球上市车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UIR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 148800,
"color" : "白色",
"brand" : "标志",
"model" : "标志408",
"sold_date" : "2021-07-02",
"remark" : "比较大的紧凑型车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UoR_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 218000,
"color" : "红色",
"brand" : "奥迪",
"model" : "奥迪A4",
"sold_date" : "2021-11-05",
"remark" : "小资车型"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "U4R_-4cBUF6rBrkiDpRJ",
"_score" : 1.0,
"_source" : {
"price" : 489000,
"color" : "黑色",
"brand" : "奥迪",
"model" : "奥迪A6",
"sold_date" : "2022-01-01",
"remark" : "政府专用?"
}
}
]
},
"aggregations" : {
"avg_by_price" : {
"value" : 246100.0
}
}
}
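For comparison, the same range filter plus average can be checked by hand with a few lines of Python. The sketch assumes the eight sample prices above; avg_in_range is a hypothetical helper name.

```python
# Prices copied from the sample cars documents.
PRICES = [258000, 123000, 239800, 148800, 1998000, 218000, 489000, 1899000]


def avg_in_range(prices, gte, lte):
    """Return (doc_count, average) over prices with gte <= price <= lte."""
    selected = [p for p in prices if gte <= p <= lte]  # both bounds inclusive
    if not selected:
        return 0, None  # nothing matched, so there is no average to report
    return len(selected), sum(selected) / len(selected)


if __name__ == "__main__":
    print(avg_in_range(PRICES, 100000, 500000))
```

Both bounds are inclusive, mirroring gte and lte in the range filter, and the two cars priced above one million are excluded before the average is taken.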
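2.7 Using filter inside an aggregation
filter can also be used inside the aggs clause; where the filter is placed determines what it filters.
Example: sum one brand's sales over the last year. A filter placed inside aggs applies only to that aggregation: it narrows the documents already matched by query, and leaves the hits and any other aggregations untouched. A filter placed outside aggs (in query) applies to the whole request, so it restricts both the hits and every aggregation.
The range bound uses date math relative to now:
① now-12M means 12 months ago; appending /M (now-12M/M) rounds the result to the month.
② now-1y means 1 year ago; appending /y (now-1y/y) rounds the result to the year.
③ d is the unit for days, for example now-30d.
Query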
GET /cars/_search
{
"query": {
"match": {
"brand": "大众"
}
},
"aggs": {
"count_last_year": {
"filter": {
"range": {
"sold_date": {
"gte": "now-12M"
}
}
},
"aggs": {
"sum_of_price_last_year": {
"sum": {
"field": "price"
}
}
}
}
}
}
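Result (doc_count is 0 because none of the three matched documents had a sold_date within 12 months of the time the query was run)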
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.9444616,
"hits" : [
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "TYR_-4cBUF6rBrkiDpRJ",
"_score" : 0.9444616,
"_source" : {
"price" : 258000,
"color" : "金色",
"brand" : "大众",
"model" : "大众迈腾",
"sold_date" : "2021-10-28",
"remark" : "大众中档车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "ToR_-4cBUF6rBrkiDpRJ",
"_score" : 0.9444616,
"_source" : {
"price" : 123000,
"color" : "金色",
"brand" : "大众",
"model" : "大众速腾",
"sold_date" : "2021-11-05",
"remark" : "大众神车"
}
},
{
"_index" : "cars",
"_type" : "_doc",
"_id" : "UYR_-4cBUF6rBrkiDpRJ",
"_score" : 0.9444616,
"_source" : {
"price" : 1998000,
"color" : "黑色",
"brand" : "大众",
"model" : "大众辉腾",
"sold_date" : "2021-08-19",
"remark" : "大众最让人肝疼的车"
}
}
]
},
"aggregations" : {
"count_last_year" : {
"meta" : { },
"doc_count" : 0,
"sum_of_price_last_year" : {
"value" : 0.0
}
}
}
}
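The effect of now-12M depends on when the query runs, which the following Python sketch makes explicit by passing the current date as a parameter. It works on whole dates only (Elasticsearch's now also carries a time of day), and the names minus_months and sum_last_12_months, as well as the example dates used for now, are assumptions made for illustration.

```python
import calendar
from datetime import date

# (brand, sold_date, price) triples copied from the sample cars documents.
CARS = [
    ("大众", "2021-10-28", 258000),
    ("大众", "2021-11-05", 123000),
    ("标志", "2021-05-18", 239800),
    ("标志", "2021-07-02", 148800),
    ("大众", "2021-08-19", 1998000),
    ("奥迪", "2021-11-05", 218000),
    ("奥迪", "2022-01-01", 489000),
    ("奥迪", "2022-02-12", 1899000),
]


def minus_months(d, months):
    """Shift d back by a number of calendar months, clamping the day."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def sum_last_12_months(rows, brand, now):
    """Query by brand, then apply the date filter inside the aggregation."""
    cutoff = minus_months(now, 12)  # plays the role of "now-12M"
    matched = [r for r in rows if r[0] == brand]  # hits: not affected by the filter
    recent = [r for r in matched if date.fromisoformat(r[1]) >= cutoff]
    return {
        "hits": len(matched),
        "doc_count": len(recent),
        "sum_of_price_last_year": sum(r[2] for r in recent),
    }


if __name__ == "__main__":
    print(sum_last_12_months(CARS, "大众", date(2022, 9, 1)))  # hypothetical "now"
    print(sum_last_12_months(CARS, "大众", date(2023, 5, 1)))  # hypothetical "now"
```

With a later value for now, the filter bucket is empty while the hit count stays at 3, which is the pattern in the response above: the filter inside aggs narrows the aggregation but not the hits.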