五、浅析[ElasticSearch]底层原理与分组聚合查询

news2024/11/7 15:33:10

目录

  • 一、ElasticSearch文档分值_score计算底层原理
    • 1.boolean model
    • 2.relevance score算法
    • 2、分析一个document上的_score是如何被计算出来的
  • 二、分词器工作流程
    • 1.character filter、tokenizer、token filter
    • 2、内置分词器的简单介绍
    • 3、定制分词器
      • 3.1默认的分词器--standard
      • 3.2修改分词器的设置
      • 3.3定制化自己的分词器
      • 3.4 ik分词器详解
  • 三、高亮显示
    • 1.高亮简述
    • 2.常用的highlight
    • 3.fast vector highlight
    • 4.高亮片段fragment的设置
  • 四、 聚合搜索技术深入
    • 1.bucket和metric
    • 2聚合操作案例
      • 2.1聚合操作之histogram 区间统计
      • 2.2date_histogram区间分组
      • 2.3_global bucket
      • 2.4 aggs+order(聚合+排序)
      • 2.5search+aggs (条件查询+聚合)
      • 2.6filter+aggs(过滤+聚合)
      • 2.7聚合中使用filter

集群节点介绍

es配置文件夹中

主节点:node.master:true
数据节点: node.data: true
  1. 客户端节点
      当主节点和数据节点配置都设置为false的时候,该节点只能处理路由请求,处理搜索,分发索引操作等,从本质上来说该客户节点表现为智能负载平衡器。独立的客户端节点在一个比较大的集群中是非常有用的,他协调主节点和数据节点,客户端节点加入集群可以得到集群的状态,根据集群的状态可以直接路由请求。

  2. 数据节点
      数据节点主要是存储索引数据的节点,主要对文档进行增删改查操作,聚合操作等。数据节点对cpu,内存,io要求较高, 在优化的时候需要监控数据节点的状态,当资源不够的时候,需要在集群中添加新的节点。

  3. 主节点
      主资格节点的主要职责是和集群操作相关的内容,如创建或删除索引,跟踪哪些节点是群集的一部分,并决定哪些分片分配给相关的节点。稳定的主节点对集群的健康是非常重要的,默认情况下任何一个集群中的节点都有可能被选为主节点,索引数据和搜索查询等操作会占用大量的cpu,内存,io资源,为了确保一个集群的稳定,分离主节点和数据节点是一个比较好的选择。

一、ElasticSearch文档分值_score计算底层原理

1.boolean model

第一步、根据用户的query条件,先过滤出包含指定term(关键字)的doc(文档)
例如查询"hello world"

query "hello world"  拆分不同的term-->  hello / world / hello & world

第二步、根据你的条件进行筛选

bool --> must/must not/should 筛选条件--> 过滤 --> 包含 / 不包含 / 可能包含

到这里还没有进行打分。

2.relevance score算法

该算法是计算出一个索引中的文本,与搜索文本,他们之间的关联匹配程度。
Elasticsearch使用的是 term frequency/inverse document frequency算法,简称为TF/IDF算法(TF除以IDF)。
第三步、开始计算

  1. Term frequency(TF):搜索文本中的各个词条在field文本中出现了多少次,出现次数越多,就越相关。
    例如
    搜索请求:hello world
    会拆成hello和world。去文档中去找这些关键字出现的次数。出现次数越多,分数越高。
doc1:hello you, and world is very good

doc2:hello, how are you
  1. Inverse document frequency(IDF):搜索文本中的各个词条在整个索引的所有文档中出现了多少次,出现的次数越多,就越不相关。
    (可以这么理解,就比如你搜索的关键字为:'的,是’这些关键字几乎在整个索引存在很多。考虑到类似这一情况进行的该算法。)
    例如
    搜索请求:hello world
doc1:hello, july is good

doc2:hi world, how are you

此外处理上述的tf和idf外还有一个因素有关
3. Field-length norm:field长度,field越长,相关度越弱

例如
搜索请求:hello world

doc1:{ "title": "hello july", "content": "...... 1000个单词" }
doc2:{ "title": "my baby", "content": "...... 1000个单词,hi world" }

hello world在整个index中出现的次数是一样多的,但是,doc1更相关,title 字段中内容更短。

2、分析一个document上的_score是如何被计算出来的

使用_explain进行一个简单的查询举例。

GET /test_index08/_doc/3/_explain
{"query":{"match":{"f":"hello"}}}

结果
包含上述所说的idf和tf等相关分数,这里先简单了解。es的计算分数涉及到的数学知识还是比较复杂的这里不展开讲解了。
在这里插入图片描述


二、分词器工作流程

1.character filter、tokenizer、token filter

  • 切分词语和normalization

根据指定的分词器,把要保存到es中的数据进行切分,给你一段句子,然后将这段句子拆分成一个一个的单个的单词,同时对每个单词进行normalization(时态转换,单复数转换等)。

工作流程大致可以分为三个步骤
第一步:character filter:在一段文本进行分词之前,先进行预处理,比如说最常见的就是,过滤一些内容(把html标签过滤掉,把一些特殊符号进行转换& --> and,&转and等。)

第二步:tokenizer:分词,hello you and me --> hello, you, and, me

第三步:token filter:lowercase,stop word,synonymom,(例如处理大小写转换,停用词的处理,同义词的处理等。)

经过各种处理后,最后处理好的结果才会拿去建立倒排索引。

2、内置分词器的简单介绍

测试内容:Set the shape to semi-transparent by calling set_trans(5)

  • standard analyze
    结果:set, the, shape, to, semi, transparent, by, calling, set_trans, 5(默认的是standard分词器)
  • simple analyzer
    结果:set, the, shape, to, semi, transparent, by, calling, set, trans
  • whitespace analyzer
    结果:Set, the, shape, to, semi-transparent, by, calling, set_trans(5)
  • stop analyzer
    结果:移除停用词,比如a the it等等

举例

POST _analyze
{
  "analyzer": "standard",
  "text": "Set the shape to semi-transparent by calling set_trans(5)"
}

详细结果

{
  "tokens" : [
    {
      "token" : "set",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "the",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "shape",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "to",
      "start_offset" : 14,
      "end_offset" : 16,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "semi",
      "start_offset" : 17,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "transparent",
      "start_offset" : 22,
      "end_offset" : 33,
      "type" : "<ALPHANUM>",
      "position" : 5
    },
    {
      "token" : "by",
      "start_offset" : 34,
      "end_offset" : 36,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
      "token" : "calling",
      "start_offset" : 37,
      "end_offset" : 44,
      "type" : "<ALPHANUM>",
      "position" : 7
    },
    {
      "token" : "set_trans",
      "start_offset" : 45,
      "end_offset" : 54,
      "type" : "<ALPHANUM>",
      "position" : 8
    },
    {
      "token" : "5",
      "start_offset" : 55,
      "end_offset" : 56,
      "type" : "<NUM>",
      "position" : 9
    }
  ]
}

3、定制分词器

3.1默认的分词器–standard

standard tokenizer:以单词边界进行切分

standard token filter:什么都不做

lowercase token filter:将所有字母转换为小写

stop token filer(默认被禁用):移除停用词,比如a the it等等

3.2修改分词器的设置

英文环境下,启用停用词。
例如
创建一个名为my_index的索引,其中es_std为自定义分词器名称,stopwords为设置英文环境下启用停用词。

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "es_std": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  }
}

默认分词器分词

GET /my_index/_analyze
{
  "analyzer": "standard", 
  "text": "a dog is in the house"
}

结果

{
  "tokens" : [
    {
      "token" : "a",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "dog",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "is",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "in",
      "start_offset" : 9,
      "end_offset" : 11,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "the",
      "start_offset" : 12,
      "end_offset" : 15,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "house",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 5
    }
  ]
}

测试自定义分词器的分词结果

GET /my_index/_analyze
{
  "analyzer": "es_std", 
  "text": "a dog is in the house"
}

结果

{
  "tokens" : [
    {
      "token" : "dog",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "house",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 5
    }
  ]
}

3.3定制化自己的分词器

创建一个my_index2索引,要求内容中的 & 转换成and,其中&Toand名称是自定义的,类型为mapping(映射关系),多个条件使用逗号分隔,设置停用词文本中有the、a把他过滤掉,其中my_stopwords名称自定义,类型为stop(停用词)。my_analyzer为自定分词的名称,类型为custom(自定义分词器),html_strip为es中自带的,自动过滤掉html标签,lowercase作用是大写转小写,“tokenizer”: "standard"表示在standard分词器基础上进行扩展。

PUT /my_index2
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&Toand": {
          "type": "mapping",
          "mappings": [
            "&=> and",
            "!=> not"
          ]
        }
      },
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": [
            "the",
            "a"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "&Toand"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_stopwords"
          ]
        }
      }
    }
  }
}

进行测试

GET /my_index2/_analyze
{
  "text": "tom&jerry are a friend in the house, <a>, HAHA!!",
  "analyzer": "my_analyzer"
}

结果

{
  "tokens" : [
    {
      "token" : "tomandjerry",
      "start_offset" : 0,
      "end_offset" : 9,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "are",
      "start_offset" : 10,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "friend",
      "start_offset" : 16,
      "end_offset" : 22,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "in",
      "start_offset" : 23,
      "end_offset" : 25,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "house",
      "start_offset" : 30,
      "end_offset" : 35,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
      "token" : "hahanotnot",
      "start_offset" : 42,
      "end_offset" : 48,
      "type" : "<ALPHANUM>",
      "position" : 7
    }
  ]
}

3.4 ik分词器详解

ik配置文件地址:config目录下
在这里插入图片描述
文件主要作用:

  1. IKAnalyzer.cfg.xml:用来配置自定义词库
  2. main.dic:ik原生内置的中文词库,总共有27万多条,只要是这些单词,都会被分在一起
  3. quantifier.dic:放了一些单位相关的词
  4. suffix.dic:放了一些后缀
  5. surname.dic:中国的姓氏
  6. stopword.dic:英文停用词
  7. main.dic:包含了原生的中文词语,会按照这个里面的词语去分词
  8. stopword.dic:包含了英文的停用词

如何对IK分词器自定义词库?
方法1:
增加需要自定义的词库,更改指定配置文件中的内容,把增加的词库地址配置进去。
例如,我在config目录下新建了一个文件夹叫custom,然后里边有一个custom.dic文件
修改IKAnalyzer.cfg.xml配置文件内容(每个节点都要修改)

<entry key="ext_dict">custom/custom.dic</entry>

这种方法需要重启es,才能生效。

方法2(IK热更新):
把整个custom.dic文件放到一个指定的地址上,比如192.168.5.5:8888/custom.dic。当配置es 的时候把地址统一写成这个地址,此时你要更新custom.dic内容时,直接对它进行修改即可。也不需要再重启es了。

方法3(修改源码):
修改es中的源码,使其读取mysql中的词库。下载源码进行修改。


三、高亮显示

1.高亮简述

多查询的内容,进行高亮显示,类似百度搜索的结果。
在这里插入图片描述
高亮演示
先新建一个索引并增加一条数据。
指定某些字段使用的分词器。

PUT /test_highlight
{
  "mappings": {

      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word"
        }
      }
    }
  
}

或者设置索引默认分词器

PUT /test_highlight
{
    "settings" : {
        "index" : {
            "analysis.analyzer.default.type": "ik_max_word"
        }
    }
}

插入数据

PUT /test_highlight/_doc/1
{
  "title": "这是july写的第一篇文章",
  "content": "大家好,这是我写的第一篇文章,特别喜欢这个文章"
}

查询内容进行高亮

GET /test_highlight/_doc/_search
{
  "query": {
    "match": {
      "title": "文章"
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

结果

{
  "took" : 416,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "test_highlight",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "这是july写的第一篇文章",
          "content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
        },
        "highlight" : {
          "title" : [
            "这是july写的第一篇<em>文章</em>"
          ]
        }
      }
    ]
  }
}

<em></em>标签,会变成红色,所以说你的指定的field中,如果包含了那个搜索词的话,就会在那个field的文本中,对搜索词进行红色的高亮显示

注意:这里只有query中的title条件这一个字段进行高亮,如果你想让content也高亮的话,content字段需要出现在query中,如果只是添加在highlight中是不生效的!请看如下举例

GET /test_highlight/_doc/_search 
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "文章"
          }
        },
        {
          "match": {
            "content": "文章"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "content": {}
    }
  }
}

结果

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.68324494,
    "hits" : [
      {
        "_index" : "test_highlight",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.68324494,
        "_source" : {
          "title" : "这是july写的第一篇文章",
          "content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
        },
        "highlight" : {
          "title" : [
            "这是july写的第一篇<em>文章</em>"
          ],
          "content" : [
            "大家好,这是我写的第一篇<em>文章</em>,特别喜欢这个<em>文章</em>"
          ]
        }
      }
    ]
  }
}

2.常用的highlight

  • plain highlight,lucene highlight,默认

  • posting highlight,index_options=offsets

posting性能比plain要高,因为不需要重新对高亮文本进行分词。对磁盘的消耗更少。

高亮查询如何使用posting方式
在新建索引时,指定mapping格式如下。
例如:要对content字段进行高亮,设置"index_options": “offsets”。

PUT /test_highlight
{
  "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "index_options": "offsets"
        }
      }
  }
}

查询方式和默认高亮是一样的

GET /test_highlight/_doc/_search 
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {}
    }
  }
}

3.fast vector highlight

index-time term vector设置在mapping中,就会用fast verctor highlight。
对大field而言(大于1mb),性能更高
如何使用
例如:要对content字段进行高亮,设置"term_vector" : “with_positions_offsets”
PUT /test_highlight

{
  "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "term_vector" : "with_positions_offsets"
        }
      }
  }
}

查询方式也是一样的。
如何强制使用指定高亮类型查询

GET /test_highlight/_doc/_search 
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "type": "plain"
      }
    }
  }
}

4.高亮片段fragment的设置

场景:你需要高亮的内容’java’,对应字段中内容超过1w个字。那么我可能不需要把所有内容都拿出来,只需要拿出来一小部分就可以,也不需要把所有匹配的一下子都展示出来,只展示前边几个高亮的就可以。

GET /test_highlight/_search
{
    "query" : {
        "match": { "content": "文章" }
    },
    "highlight" : {
        "fields" : {
            "content" : {"fragment_size" : 5, "number_of_fragments" : 3 }
        }
    }
}

fragment_size: 默认是100,设置获取内容的长度。
number_of_fragments:你可能你的高亮的fragment文本片段有多个片段,你可以指定就显示几个片段。

四、 聚合搜索技术深入

1.bucket和metric

在Elasticsearch中,bucket和metric是两种重要的聚合(Aggregation)类型。它们被用于在搜索结果中分组、过滤和计算数据。
Bucket:是一个用于将文档分成段或者桶的聚合操作。我们可以将Bucket看作是一种分类操作,通过Bucket聚合可以将搜索结果按照某种规则进行分组,形成多个不同的Bucket。

常见的Bucket类型有:

  • Terms Bucket:按照指定字段的值进行分组,类似于SQL中的GROUP BY。
  • Date Histogram Bucket:按照时间间隔对文档进行分组,比如每天、每周、每月等。
  • Range Bucket:按照数值范围进行分组,例如按照价格区间进行分组。

Metric:是对Bucket中的文档进行计算的聚合操作。Metric通常会应用于已经分组的数据上,从而计算出汇总数据。
常见的Metric类型有:

  • Sum Metric:对指定字段的数值进行求和计算。
  • Avg Metric:对指定字段的数值进行平均计算。
  • Max Metric:对指定字段的数值取最大值。
  • Min Metric:对指定字段的数值取最小值。
  • Cardinality Metric:对指定字段的不同值进行计数。

举个例子,如果我们有一个包含产品销售记录的索引,其中有字段"category"表示产品类型,那么我们可以使用Terms Bucket对每种产品类型进行分组,然后再应用某些Metric,如Sum Metric来计算每种产品类型的总销售额。
这可以通过以下Elasticsearch查询实现:

{
    "aggs": {
        "sales_by_category": {
            "terms": { "field": "category" },
            "aggs": {
                "total_sales": { "sum": { "field": "price" } }
            }
        }
    }
}

上述查询首先使用Terms Bucket将所有产品按照产品类型进行分组,然后使用Sum Metric对每个分组内的价格进行求和,最终得到每个产品类型的总销售额。其中sales_by_category为自定的分组名称。

2聚合操作案例

新建索引,并插入数据。

PUT /cars
{
  "mappings": {
    "properties": {
      "price": {
        "type": "long"
      },
      "color": {
        "type": "keyword"
      },
      "brand": {
        "type": "keyword"
      },
      "model": {
        "type": "keyword"
      },
      "sold_date": {
        "type": "date"
      },
      "remark": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

添加数据

POST /cars/_bulk
{"index":{}}
{"price":258000,"color":"金色","brand":"大众","model":"大众迈腾","sold_date":"2021-10-28","remark":"大众中档车"}
{"index":{}}
{"price":123000,"color":"金色","brand":"大众","model":"大众速腾","sold_date":"2021-11-05","remark":"大众神车"}
{"index":{}}
{"price":239800,"color":"白色","brand":"标志","model":"标志508","sold_date":"2021-05-18","remark":"标志品牌全球上市车型"}
{"index":{}}
{"price":148800,"color":"白色","brand":"标志","model":"标志408","sold_date":"2021-07-02","remark":"比较大的紧凑型车"}
{"index":{}}
{"price":1998000,"color":"黑色","brand":"大众","model":"大众辉腾","sold_date":"2021-08-19","remark":"大众最让人肝疼的车"}
{"index":{}}
{"price":218000,"color":"红色","brand":"奥迪","model":"奥迪A4","sold_date":"2021-11-05","remark":"小资车型"}
{"index":{}}
{"price":489000,"color":"黑色","brand":"奥迪","model":"奥迪A6","sold_date":"2022-01-01","remark":"政府专用?"}
{"index":{}}
{"price":1899000,"color":"黑色","brand":"奥迪","model":"奥迪A 8","sold_date":"2022-02-12","remark":"很贵的大A6"}

①根据color分组统计销售数量
只执行聚合分组,不做复杂的聚合统计。在ES中最基础的聚合为terms,相当于SQL中的count。
在ES中默认为分组数据做排序,使用的是doc_count数据执行降序排列。可以使用_key元数据,根据分组后的字段数据执行不同的排序方案,也可以根据_count元数据,根据分组后的统计值执行不同的排序方案。

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

结果,其中hits展示的是元数据内容,aggregations展示的是聚合后的内容。

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3
        },
        {
          "key" : "白色",
          "doc_count" : 2
        },
        {
          "key" : "金色",
          "doc_count" : 2
        },
        {
          "key" : "红色",
          "doc_count" : 1
        }
      ]
    }
  }
}

如果不想要元数据则需设置一下size即可。

GET /cars/_search
{
  "size": 0, 
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

②统计不同color车辆的平均价格(下钻分析,aggs嵌套aggs)
本案例先根据color执行聚合分组,在此分组的基础上,对组内数据执行聚合统计,这个组内数据的聚合统计就是metric。同样可以执行排序,因为组内有聚合统计,且对统计数据给予了命名avg_by_price,所以可以根据这个聚合统计数据字段名执行排序逻辑。

GET /cars/_search
{
  "size": 0, 
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "avg_by_price": "asc"
        }
      },
      "aggs": {
        "avg_by_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "金色",
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 190500.0
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 194300.0
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "avg_by_price" : {
            "value" : 218000.0
          }
        },
        {
          "key" : "黑色",
          "doc_count" : 3,
          "avg_by_price" : {
            "value" : 1462000.0
          }
        }
      ]
    }
  }
}

size可以设置为0,表示不返回ES中的文档,只返回ES聚合之后的数据,提高查询速度,当然如果你需要这些文档的话,也可以按照实际情况进行设置。

③统计不同color不同brand中车辆的平均价格

查询

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "avg_by_price_color": "asc"
        }
      },
      "aggs": {
        "avg_by_price_color": {
          "avg": {
            "field": "price"
          }
        },
        "group_by_brand": {
          "terms": {
            "field": "brand",
            "order": {
              "avg_by_price_brand": "desc"
            }
          },
          "aggs": {
            "avg_by_price_brand": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

结果

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "金色",
          "doc_count" : 2,
          "avg_by_price_color" : {
            "value" : 190500.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "大众",
                "doc_count" : 2,
                "avg_by_price_brand" : {
                  "value" : 190500.0
                }
              }
            ]
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "avg_by_price_color" : {
            "value" : 194300.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "标志",
                "doc_count" : 2,
                "avg_by_price_brand" : {
                  "value" : 194300.0
                }
              }
            ]
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "avg_by_price_color" : {
            "value" : 218000.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "奥迪",
                "doc_count" : 1,
                "avg_by_price_brand" : {
                  "value" : 218000.0
                }
              }
            ]
          }
        },
        {
          "key" : "黑色",
          "doc_count" : 3,
          "avg_by_price_color" : {
            "value" : 1462000.0
          },
          "group_by_brand" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "大众",
                "doc_count" : 1,
                "avg_by_price_brand" : {
                  "value" : 1998000.0
                }
              },
              {
                "key" : "奥迪",
                "doc_count" : 2,
                "avg_by_price_brand" : {
                  "value" : 1194000.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

先根据color聚合分组,在组内根据brand再次聚合分组,这种操作可以称为下钻分析。(即嵌套定义)
aggs也可水平定义,、格式如下。

GET /index_name/type_name/_search
{
"aggs" : {
"分组名称1" : {},
"分组名称2" : {}
}
}

举例:

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color"
      }
    },
    "avg_by_price_color": {
      "avg": {
        "field": "price"
      }
    }
  }

}

结果

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_by_price_color" : {
      "value" : 671700.0
    },
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3
        },
        {
          "key" : "白色",
          "doc_count" : 2
        },
        {
          "key" : "金色",
          "doc_count" : 2
        },
        {
          "key" : "红色",
          "doc_count" : 1
        }
      ]
    }
  }
}

④统计不同color中的最大和最小价格、总价
查询

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color"
      },
      "aggs": {
        "max_price": {
          "max": {
            "field": "price"
          }
        },
        "min_price": {
          "min": {
            "field": "price"
          }
        },
        "sum_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3,
          "max_price" : {
            "value" : 1998000.0
          },
          "min_price" : {
            "value" : 489000.0
          },
          "sum_price" : {
            "value" : 4386000.0
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "max_price" : {
            "value" : 239800.0
          },
          "min_price" : {
            "value" : 148800.0
          },
          "sum_price" : {
            "value" : 388600.0
          }
        },
        {
          "key" : "金色",
          "doc_count" : 2,
          "max_price" : {
            "value" : 258000.0
          },
          "min_price" : {
            "value" : 123000.0
          },
          "sum_price" : {
            "value" : 381000.0
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "max_price" : {
            "value" : 218000.0
          },
          "min_price" : {
            "value" : 218000.0
          },
          "sum_price" : {
            "value" : 218000.0
          }
        }
      ]
    }
  }
}

⑤统计不同品牌汽车中价格排名最高的车型
查询

GET cars/_search
{
  "size": 0,
  "aggs": {
    "group_by_brand": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "top_car": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "price": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [
                "model",
                "price"
              ]
            }
          }
        }
      }
    }
  }
}

结果

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "大众",
          "doc_count" : 3,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "UYR_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 1998000,
                    "model" : "大众辉腾"
                  },
                  "sort" : [
                    1998000
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "VIR_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 1899000,
                    "model" : "奥迪A 8"
                  },
                  "sort" : [
                    1899000
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 2,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "T4R_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 239800,
                    "model" : "标志508"
                  },
                  "sort" : [
                    239800
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}

2.1聚合操作之histogram 区间统计

histogram类似terms,也是进行bucket分组操作的,是根据一个field,实现数据区间分组。
例如:以100万为一个范围,统计不同范围内车辆的销售量和平均价格。那么使用histogram的聚合的时候,field指定价格字段price。区间范围是100万(即interval : 1000000)。这个时候ES会将price价格区间划分为: [0, 1000000), [1000000, 2000000), [2000000, 3000000)等,依次类推。在划分区间的同时,histogram会类似terms进行数据数量的统计(count),可以通过嵌套aggs对聚合分组后的组内数据做再次聚合分析。

查询

GET /cars/_search
{
  "aggs": {
    "histogram_by_price": {
      "histogram": {
        "field": "price",
        "interval": 1000000
      },
      "aggs": {
        "avg_by_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_price" : {
      "buckets" : [
        {
          "key" : 0.0,
          "doc_count" : 6,
          "avg_by_price" : {
            "value" : 246100.0
          }
        },
        {
          "key" : 1000000.0,
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 1948500.0
          }
        }
      ]
    }
  }
}

2.2date_histogram区间分组

date_histogram可以对date类型的field执行区间聚合分组,如每月销量,每年销量等。
如:以月为单位,统计不同月份汽车的销售数量及销售总金额。这个时候可以使用date_histogram实现聚合分组,其中field来指定用于聚合分组的字段,interval指定区间范围(可选值有:year、quarter、month、week、day、hour、minute、second),format指定日期格式化,min_doc_count指定每个区间的最少document(如果不指定,默认为0,当区间范围内没有document时,也会显示bucket分组),extended_bounds指定起始时间和结束时间(如果不指定,默认使用字段中日期最小值所在范围和最大值所在范围为起始和结束时间)。

举例:统计2021年到2022年这个区间统计总价。
es7.x之前版本的语法

GET /cars/_search
{
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": "2021-01-01",
          "max": "2022-12-31"
        }
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

#! Deprecation: [interval] on [date_histogram] is deprecated, use [fixed_interval] or [calendar_interval] in the future.
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-01",
          "key" : 1619827200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 239800.0
          }
        },
        {
          "key_as_string" : "2021-07-01",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 148800.0
          }
        },
        {
          "key_as_string" : "2021-08-01",
          "key" : 1627776000000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01",
          "key" : 1633046400000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 258000.0
          }
        },
        {
          "key_as_string" : "2021-11-01",
          "key" : 1635724800000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 341000.0
          }
        },
        {
          "key_as_string" : "2022-01-01",
          "key" : 1640995200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 489000.0
          }
        },
        {
          "key_as_string" : "2022-02-01",
          "key" : 1643673600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1899000.0
          }
        }
      ]
    }
  }
}

es7.x版本之后的语法
查询
把关键字interval换成calendar_interval

GET /cars/_search
{
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "calendar_interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": "2021-01-01",
          "max": "2022-12-31"
        }
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-01",
          "key" : 1619827200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 239800.0
          }
        },
        {
          "key_as_string" : "2021-07-01",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 148800.0
          }
        },
        {
          "key_as_string" : "2021-08-01",
          "key" : 1627776000000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01",
          "key" : 1633046400000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 258000.0
          }
        },
        {
          "key_as_string" : "2021-11-01",
          "key" : 1635724800000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 341000.0
          }
        },
        {
          "key_as_string" : "2022-01-01",
          "key" : 1640995200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 489000.0
          }
        },
        {
          "key_as_string" : "2022-02-01",
          "key" : 1643673600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1899000.0
          }
        }
      ]
    }
  }
}

2.3_global bucket

在聚合统计数据的时候,有些时候需要对比部分数据和总体数据。
例如:
统计某品牌车辆平均价格和所有车辆平均价格。global是用于定义一个全局bucket,这个bucket会忽略query的条件,检索所有document进行对应的聚合统计。
查询

GET /cars/_search
{
  "size": 0,
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "volkswagen_of_avg_price": {
      "avg": {
        "field": "price"
      }
    },
    "all_avg_price": {
      "global": {},
      "aggs": {
        "all_of_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all_avg_price" : {
      "doc_count" : 8,
      "all_of_price" : {
        "value" : 671700.0
      }
    },
    "volkswagen_of_avg_price" : {
      "value" : 793000.0
    }
  }
}

2.4 aggs+order(聚合+排序)

对聚合统计数据进行排序。
例如:
统计每个品牌的汽车销量和销售总额,按照销售总额的降序排列。
查询

GET /cars/_search
{
  "aggs": {
    "group_of_brand": {
      "terms": {
        "field": "brand",
        "order": {
          "sum_of_price": "desc"
        }
      },
      "aggs": {
        "sum_of_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_of_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "sum_of_price" : {
            "value" : 2606000.0
          }
        },
        {
          "key" : "大众",
          "doc_count" : 3,
          "sum_of_price" : {
            "value" : 2379000.0
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "sum_of_price" : {
            "value" : 388600.0
          }
        }
      ]
    }
  }
}

如果有多层aggs,执行下钻聚合的时候,也可以根据最内层聚合数据执行排序。(即外层排序的内容可以使用里层的别名进行排序)
例如
统计每个品牌中每种颜色车辆的销售总额,并根据销售总额降序排列。这就像SQL中的分组排序一样,

只能组内数据排序,而不能跨组实现排序。

查询

GET /cars/_search
{
  "aggs": {
    "group_by_brand": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "group_by_color": {
          "terms": {
            "field": "color",
            "order": {
              "sum_of_price": "desc"
            }
          },
          "aggs": {
            "sum_of_price": {
              "sum": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

结果

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "大众",
          "doc_count" : 3,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "黑色",
                "doc_count" : 1,
                "sum_of_price" : {
                  "value" : 1998000.0
                }
              },
              {
                "key" : "金色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 381000.0
                }
              }
            ]
          }
        },
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "黑色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 2388000.0
                }
              },
              {
                "key" : "红色",
                "doc_count" : 1,
                "sum_of_price" : {
                  "value" : 218000.0
                }
              }
            ]
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "白色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 388600.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

2.5search+aggs (条件查询+聚合)

聚合类似SQL中的group by子句,search类似SQL中的where子句。在ES中是完全可以将search和aggregations整合起来,执行相对更复杂的搜索统计。
例如:
统计某品牌车辆每个季度的销量和销售额。
查询

GET /cars/_search
{
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "calendar_interval": "quarter",
        "min_doc_count": 1
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.9444616,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-07-01T00:00:00.000Z",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01T00:00:00.000Z",
          "key" : 1633046400000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 381000.0
          }
        }
      ]
    }
  }
}

2.6filter+aggs(过滤+聚合)

filter也可以和aggs组合使用实现过滤聚合分析。
例如:
统计10万–50万之间的车辆的平均价格。

GET /cars/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 100000,
            "lte": 500000
          }
        }
      }
    }
  },
  "aggs": {
    "avg_by_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

结果

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_by_price" : {
      "value" : 246100.0
    }
  }
}

2.7聚合中使用filter

filter也可以使用在aggs句法中,filter的范围决定了其过滤的范围。
如:统计某品牌汽车最近一年的销售总额。将filter放在aggs内部,代表这个过滤器只对query搜索得到的结果执行filter过滤。如果filter放在aggs外部,过滤器则会过滤所有的数据。

①12M/M 表示 12 个月。
②1y/y 表示 1年。
③d 表示天

查询

GET /cars/_search
{
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "count_last_year": {
      "filter": {
        "range": {
          "sold_date": {
            "gte": "now-12M"
          }
        }
      },
      "aggs": {
        "sum_of_price_last_year": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

结果

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.9444616,
    "hits" : [
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      }
    ]
  },
  "aggregations" : {
    "count_last_year" : {
      "meta" : { },
      "doc_count" : 0,
      "sum_of_price_last_year" : {
        "value" : 0.0
      }
    }
  }
}

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/684104.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

《教我兄弟学Android逆向13 xpose改机开发01-环境设置》

上一篇 《教我兄弟学Android逆向12 编写xposed模块》 我们了解了xpose的基本原理并一起搭建了xpose的hook环境&#xff0c;你也很好的完成了课后作业&#xff0c;但是通过后面的测试练习你发现xpose在不同系统环境的安装方法是不一样的,在我们之前的低系统手机上面直接安装就能…

LNMP六个实验

目录 访问状态统计配置 基于授权的访问控制 基于客户端的访问控制 基于域名的 Nginx 虚拟主机 基于IP 的 Nginx 虚拟主机 基于端口的 Nginx 虚拟主机 总结 访问状态统计配置 查看已安装的 Nginx 是否包含 HTTP_STUB_STATUS 模块 修改 nginx.conf 配置文件&#xff0c;…

Python基础四

目录 一、Python数据类型--列表 1.列表的下标 2.访问列表中的元素 3.更新列表元素 4.删除列表元素 5.列表脚本操作符 6.列表截取与拼接 截取 拼接 7.嵌套列表 8.列表比较 二、Python内置函数--列表相关 一、Python数据类型--列表 Python中的列表类似于java的数组 列…

Rust语言从入门到入坑——(7)Rust 错误处理

文章目录 0 引入1、可恢复错误2、可恢复错误递归3、不可恢复错误4、kind 方法5、总结 0 引入 Rust 有一套独特的处理异常情况的机制&#xff0c;程序中一般会出现两种错误&#xff1a;可恢复错误和不可恢复错误。 1、可恢复错误的典型案例是文件访问错误&#xff0c;如果访问一…

RVEA算法

RVEA 1 目标函数2 预备知识3 参考向量引导选择4 更新参考向量5 流程6 代码7 运行效果 1 目标函数 min ⁡ X f ( X ) ( f 1 ( X ) , f 2 ( X ) , . . . , f M ( X ) ) \min_{\small{X}} \pmb{f(\small{X})} (f_1(\small{X}), f_2(\small{X}), ..., f_M(\small{X})) Xmin​f(X)…

数据结构——快速排序的介绍

快速排序 快速排序是霍尔(Hoare)于1962年提出的一种二叉树结构的交换排序方法。快速排序是一种常用的排序算法&#xff0c;其基本思想是通过选择一个元素作为"基准值"&#xff0c;将待排序序列分割成两个子序列&#xff0c;其中一个子序列的元素都小于等于基准值&am…

DAY29:回溯算法(四)组合总和+组合总和Ⅱ

文章目录 39.组合总和思路伪代码为什么传入i而不是i1&#xff0c;不会导致无限循环 完整版剪枝优化剪枝修改完整版补充&#xff1a;std::sort升降序的问题&#xff08;默认升序&#xff09; 40.组合总和Ⅱ思路最开始的写法debug测试&#xff1a;逻辑错误修改完整版&#xff1a;…

营销策划报告个人心得

营销策划报告个人心得篇1 20__年3月27日晚&#xff0c;我们12级市营专业的同学们早早的来到了大成楼。根据网络营销与策划课程实训要求&#xff0c;覃希老师邀请了校外企业经理给我们进行产品培训。 我们按实训项目分别安排在C302和C304教室培训&#xff0c;培训中同学们认认真…

基于深度学习的高精度水果检测识别系统(PyTorch+Pyside6+YOLOv5模型)

摘要&#xff1a;基于深度学习的高精度水果&#xff08;苹果、香蕉、葡萄、橘子、菠萝和西瓜&#xff09;检测识别系统可用于日常生活中或野外来检测与定位水果目标&#xff0c;利用深度学习算法可实现图片、视频、摄像头等方式的水果目标检测识别&#xff0c;另外支持结果可视…

OpenGL 模板测试

1.示例效果图 选中模型对象&#xff0c;出现模型轮廓。 2.简介 当片段着色器处理完一个片段之后&#xff0c;模板测试会开始执行&#xff0c;和深度测试一样&#xff0c;它也可能会丢弃片段。接下来&#xff0c;被保留的片段会进入深度测试&#xff0c;它可能会丢弃更多的片…

MongoDB集群管理(三)

MongoDB集群管理 集群介绍 为什么使用集群 随着业务数据和并发量的增加&#xff0c;若只使用一台MongoDB服务器&#xff0c;存在着断电和数据风险的问题&#xff0c;故采用Mongodb复制集的方式&#xff0c;来提高项目的高可用、安全性等性能。 MongoDB复制是将数据同步到多个…

LNMP (Nginx网站服务) nginx 平滑升级

目录 1.1 Nginx的简介 1.2 Apache与Nginx的区别 Nginx对比Apache的优势&#xff1a; 1.3 Nginx的进程 Nginx的两个进程&#xff1a; 同步&#xff0c;异步&#xff0c;阻塞&#xff0c;非阻塞的概念补充 阻塞与非阻塞 同步和异步 2.1 编译安装Nginx 2.1 .1 关闭防火墙…

两点边值问题的有限元方法以及边值条件的处理示例

文章目录 引言题目补全方程刚度矩阵构造基底边值条件非齐次左边值条件非齐次右边值条件非齐次非齐次边值条件有限元方程 求数值解直接求总刚度矩阵先求单元刚度矩阵 引言 本文参考李荣华教授的《偏微分方程数值解法》一书 题目 对于非齐次第二边值问题 { − d d x ( p d u …

陶哲轩甩出调教GPT-4聊天记录,点击领取大佬的研究助理

量子位 | 公众号 QbitAI 鹅妹子嘤&#xff0c;天才数学家陶哲轩搞数学研究&#xff0c;已经离不开普通人手里的“数学菜鸡”GPT了&#xff01; 就在他最新解决的一个数学难题下面&#xff0c;陶哲轩明确指出自己“使用了GPT-4”&#xff0c;后者给他提出了一种可行的解决方法…

【FFmpeg实战】avformat_find_stream_info() 函数源码解析

转载自地址&#xff1a;https://cloud.tencent.com/developer/article/1873836 先来看一下 avformat_find_stream_info() 的头文件里的注释对该函数的介绍&#xff0c;本文我们基于 FFmpeg n4.2 版本的源码分析。 /*** Read packets of a media file to get stream informatio…

Apikit 自学日记:创建 API 文档

Apikit 中一共有5种创建API文档的方式&#xff1a; 新建API文档 导入API文档&#xff0c;详情可查看《导入、导出API文档》 从模板添加API文档&#xff0c;详情可查看《API文档模板》 自动生成API文档&#xff0c;详情可查看《自动生成API文档》 IDEA插件注释同步API文档 …

linux 在线安装 Redis

博主介绍&#xff1a; ✌博主从事应用安全和大数据领域&#xff0c;有8年研发经验&#xff0c;5年面试官经验&#xff0c;Java技术专家✌ Java知识图谱点击链接&#xff1a;体系化学习Java&#xff08;Java面试专题&#xff09; &#x1f495;&#x1f495; 感兴趣的同学可以收…

生成式AI掀起产业智能化新浪潮|爱分析报告

报告摘要 大模型支撑的生成式AI&#xff0c;让人类社会有望步入通用人工智能时代&#xff0c;拥有广阔的应用前景&#xff0c;有望赋能千行百业。当前生成式AI的落地整体处于初级阶段&#xff0c;不同模态的落地时间表差异明显&#xff0c;企业需求主要集中在数字化程度高、容…

地平线旭日x3派部署yolov8

地平线旭日x3派部署yolov8 总体流程1.导出onnx模型导出YOLOV8_onnxruntime.py验证onnxutils.py 2.在开发机转为bin模型2.1准备数据图片2.2转换必备的yaml文件2.3开始转换 3.开发机验证**quantized_model.onnx4.板子运行bin模型 资源链接 总体流程 1.导出onnx模型 导出 使用y…

03 | 事务隔离:为什么你改了我还看不见?

以下出自 《MySQL 实战 45 讲》 03 | 事务隔离&#xff1a;为什么你改了我还看不见&#xff1f; 隔离性与隔离级别 当数据库上有多个事务同时执行的时候&#xff0c;就可能出现脏读&#xff08;dirty read&#xff09;、不可重复读&#xff08;non-repeatable read&#xff0…