4. Search Helper Features

news2024/11/15 13:41:29

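Elasticsearch (ES) offers basic search as well as exact matching on typed fields, analyzed full-text matching, range search, geo-coordinate search, paginated queries, and more.

4.1 Search helper features

As the saying goes, "to do a good job, one must first sharpen one's tools." Before covering the search and matching features that ES provides, we first introduce its search helper features. For example, to improve search performance you can have a search return only a subset of fields. To present results better you need result counts and pagination. When you hit a performance bottleneck you need to profile the time spent in each stage of a search. And when results do not match expectations you need to analyze how each document was scored.

4.1.1 Specifying the returned fields

For performance reasons the search response should be slimmed down, so return only the fields you actually need.

Example

  1. Create the index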
PUT /hoteld
{ 
  "mappings": { 
    "properties": { 
      "title": {     
        "type": "text" 
      }, 
      "city": {     
        "type": "keyword"  
      }, 
      "price": {   
        "type": "double" 
      }, 
      "create_time": { 
        "type": "date", 
        "format": "yyyy-MM-dd HH:mm:ss" 
      }, 
      "amenities": { 
        "type": "text" 
      }, 
      "full_room": {  
        "type": "boolean" 
      }, 
      "location": {  
        "type": "geo_point" 
      }, 
      "praise": { 
        "type": "integer" 
      } 
    } 
  } 
}  
  2. Insert documents
POST /_bulk 
{"index":{"_index":"hoteld","_id":"001"}}
{"title":"文雅酒店","city":"青岛","price":556,"create_time":"2020-04-18 12:00:00","amenities":"浴池,普通停车场/充电停车场","full_room":false,"location":{"lat":36.083078,"lon":120.37566},"praise":10}
{"index":{"_index":"hoteld","_id":"002"}} 
{"title":"金都嘉怡假日酒店","city":"北京","price":337.00,"create_time":"2021-03-15 20:00:00","amenities":"wifi,充电停车场/可升降停车场","full_room":false,"location":{"lat":39.915153,"lon":116.4030},"praise":60}
{"index":{"_index":"hoteld","_id":"003"}} 
{"title":"金都欣欣酒店","city":"天津","price":200.00,"create_time":"2021-05-09 16:00:00","amenities":"提供假日party,免费早餐,可充电停车场","full_room":true,"location":{"lat":39.186555,"lon":117.162007},"praise":30} 
{"index":{"_index":"hoteld","_id":"004"}}
{"title":"金都酒店","city":"北京","price":500,"create_time":"2021-02-18 08:00:00","amenities":"浴池(假日需预定),室内游泳池,普通停车场","full_room":true,"location":{"lat":39.915343,"lon":116.4239},"praise":20}
{"index":{"_index":"hoteld","_id":"005"}} 
{"title":"文雅精选酒店","city":"北京","price":800.00,"create_time":"2021-01-01 08:00:00","amenities":"浴池(假日需预定),wifi,室内游泳池,普通停车场","full_room":true,"location":{"lat":39.918229,"lon":116.422011},"praise":20}

DSL

GET /hoteld/_search
{
  "_source": ["title","city"],
  "query": {
    "term": {
      "city": {
        "value": "天津"
      }
    }
  }
}
  1. Request method: GET
  2. "_source": an array whose elements are the fields to return
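
To make the effect of "_source" concrete, here is a minimal Java sketch that trims a document down to the requested fields. It is only an illustrative model, not how ES implements source filtering: the class and method names are hypothetical, and it handles plain top-level field names only, whereas ES also accepts wildcard patterns and nested paths.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SourceFilterSketch {

    // Hypothetical helper: keeps only the listed top-level fields of a document.
    static Map<String, Object> filterSource(Map<String, Object> source, List<String> includes) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : includes) {
            if (source.containsKey(field)) {
                out.put(field, source.get(field));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Subset of document 003 from the bulk request above
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "金都欣欣酒店");
        doc.put("city", "天津");
        doc.put("price", 200.00);
        doc.put("praise", 30);

        // Same field list as the "_source" array in the DSL
        System.out.println(filterSource(doc, List.of("title", "city")));
    }
}
```

The smaller the field list, the less data each hit carries back over the network, which is the point of the slimming described in this section.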

The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch
    /**
     * Query that returns only specific fields.
     */
    public void findByQuerySource(){
        // Return only title and city, matching the "_source" array in the DSL above
        FetchSourceFilter fetchSourceFilter = new FetchSourceFilter(new String[]{"title", "city"}, null);
        NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
                .withSourceFilter(fetchSourceFilter).withQuery(QueryBuilders.termQuery("city", "天津")).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(nativeSearchQuery, Hotel1.class);
        for (SearchHit<Hotel1> hotel1SearchHit : search) {
            Hotel1 content = hotel1SearchHit.getContent();
            System.out.println(content);
        }
    }
2. elasticsearch-rest-high-level-client
/**
     * Return only specific fields
     */
    public void findByQuerySource(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.fetchSource(new String[]{"city"},null);
        searchSourceBuilder.query(QueryBuilders.termQuery("city","天津"));
        searchRequest.source(searchSourceBuilder);

        try {
            SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : search.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String city = (String)sourceAsMap.get("city");
                System.out.println(city);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4.1.2 Counting

To improve the search experience, return the total number of documents that match the filter conditions.

DSL

GET /${index_name}/_count
{
  "query":{
    ....
  }
}

  1. Request method: GET
  2. index_name: the index name
  3. _count: the count endpoint
  4. query: optional filter conditions

The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch
    /**
     * Get the number of matching documents
     * @return
     */
    public Long getDataCount(){
        NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.termQuery("city", "天津"))
                .build();
        long hoteld = elasticsearchRestTemplate.count(nativeSearchQuery, IndexCoordinates.of("hoteld"));
        System.out.println(hoteld);
        return hoteld;
    }
2. elasticsearch-rest-high-level-client
/**
     * Get the number of documents that match the condition
     * @return
     */
    public Long getDataCount(){
        CountRequest countRequest = new CountRequest("hoteld");
        countRequest.query(QueryBuilders.termQuery("city","天津"));
        try {
            CountResponse countResponse = restHighLevelClient.count(countRequest, RequestOptions.DEFAULT);
            long count = countResponse.getCount();
            return count;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return 0L;
    }

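4.1.3 Result pagination

Pagination in ES differs from pagination in a relational database. from defaults to 0 and size defaults to 10. The difference arises because ES does not page through a single ordered result set. Suppose you request the 10th page with 10 hits per page (from=90, size=10) on an index with three shards and one coordinating node. Each shard has to return its top 100 documents (from + size), so the three shards send 300 documents in total. The coordinating node then sorts them, picks the 10 documents that belong to the requested page, and returns them.
In other words, ES is not suited to deep pagination. By default, from + size may not exceed 10,000. To change this limit, set the max_result_window setting on the index.
For example:
PUT /hotel/_settings
{ "index": { "max_result_window": 20000 } }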
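
The cost of deep pagination can be checked with a little arithmetic. The following Java sketch is a simplified model under the assumptions stated in this section (every shard returns its top from + size hits, and from + size is limited by max_result_window). The class and method names are hypothetical and nothing here calls ES.

```java
public class DeepPagingSketch {

    // Offset of the first hit for a 1-based page number.
    static int fromForPage(int page, int size) {
        return (page - 1) * size;
    }

    // Each shard must return its own top (from + size) hits.
    static int hitsPerShard(int from, int size) {
        return from + size;
    }

    // The coordinating node merges the candidates from every shard.
    static long hitsAtCoordinator(int from, int size, int shards) {
        return (long) hitsPerShard(from, size) * shards;
    }

    // A request is accepted only while from + size stays within max_result_window.
    static boolean withinWindow(int from, int size, int maxResultWindow) {
        return from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        int size = 10;
        int from = fromForPage(10, size);                                         // 10th page
        System.out.println("from = " + from);                                     // 90
        System.out.println("per shard = " + hitsPerShard(from, size));            // 100
        System.out.println("coordinator = " + hitsAtCoordinator(from, size, 3));  // 300
        System.out.println(withinWindow(9990, 10, 10000));                        // true
        System.out.println(withinWindow(9991, 10, 10000));                        // false
    }
}
```

The work grows with the page depth rather than with the page size, which is why a request for ten hits far into the result set is much more expensive than the first page.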

DSL

GET /${index_name}/_search
{
  "from":0,
  "size":10,
  "query":{
    ...
  }
}

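  1. Request method: GET
  2. index_name: the index name
  3. from: offset of the first hit to return, starting at 0 (an offset, not a page number)
  4. size: number of hits per page, 10 by default

The Java API comes in two forms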

1. spring-boot-starter-data-elasticsearch

 /**
     * Paginated query
     */
    public void getPageQueryData(){
        PageRequest of = PageRequest.of(0, 2);
        NativeSearchQuery build = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchAllQuery()).withPageable(of)
                .build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(build, Hotel1.class);
        for (SearchHit<Hotel1> hotel1SearchHit : search) {
            Hotel1 content = hotel1SearchHit.getContent();
            System.out.println(content);
        }
    }

2. elasticsearch-rest-high-level-client
    /**
     * Paginated query
     */
    public void getPageQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().from(0).size(2).query(QueryBuilders.matchAllQuery());
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : search.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                System.out.println(sourceAsMap);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4.1.4 Performance profiling

When using ES you may run into slow searches. If the DSL being executed is long, set "profile": true to see which part is slow.

DSL

POST /${index_name}/_search
{
  "profile": true,
  "query":{
    ...
  }
}

-- Result
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    ... matched hits
  },
  "profile" : {
    "shards" : [
      {
        "id" : "[3YHg2n4cRlquBb6iSBtkXQ][hoteld][0]",   -- shard info
        "searches" : [
          {
            "query" : [
              {
                "type" : "TermQuery",
                "description" : "city:天津",
                "time_in_nanos" : 439700,
                "breakdown" : {
                  "set_min_competitive_score_count" : 0,
                  "match_count" : 0,
                  "shallow_advance_count" : 0,
                  "set_min_competitive_score" : 0,
                  "next_doc" : 6500,
                  "match" : 0,
                  "next_doc_count" : 1,
                  "score_count" : 1,
                  "compute_max_score_count" : 0,
                  "compute_max_score" : 0,
                  "advance" : 1000,
                  "advance_count" : 1,
                  "score" : 3200,
                  "build_scorer_count" : 2,
                  "create_weight" : 375200,
                  "shallow_advance" : 0,
                  "create_weight_count" : 1,
                  "build_scorer" : 53800
                }
              }
            ],
            "rewrite_time" : 1500,
            "collector" : [
              {
                "name" : "SimpleTopScoreDocCollector",
                "reason" : "search_top_hits",
                "time_in_nanos" : 10900
              }
            ]
          }
        ],
        "aggregations" : [ ]
      }
    ]
  }
}


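  1. profile: the keyword that enables performance profiling.

Profiling itself adds overhead, so it is not recommended in production.

You can also click the Search Profiler link in Kibana's Dev Tools.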

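4.1.5 Score analysis

The explain API shows how a given document is scored for a given query, which is handy when troubleshooting production issues.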

DSL

GET /${index_name}/_explain/${doc_id}
{
  "query":{
    ....
  }
}

  1. _explain: the keyword for score analysis
{
  "_index" : "hoteld",
  "_type" : "_doc",
  "_id" : "002",
  "matched" : true,
  "explanation" : {                       // split into two sub-queries
    "value" : 0.91718745,
    "description" : "sum of:",
    "details" : [
      {
        "value" : 0.45859373,  // sub-query score
        "description" : "weight(title:金 in 1) [PerFieldSimilarity], result of:",
        "details" : [
          {
            "value" : 0.45859373, // sub-query score
            "description" : "score(freq=1.0), computed as boost * idf * tf from:",
            "details" : [
              {
                "value" : 2.2,
                "description" : "boost",
                "details" : [ ]
              },
              {
                "value" : 0.5389965,
                "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                "details" : [
                  {
                    "value" : 3,
                    "description" : "n, number of documents containing term",
                    "details" : [ ]
                  },
                  {
                    "value" : 5,
                    "description" : "N, total number of documents with field",
                    "details" : [ ]
                  }
                ]
              },
              {
                "value" : 0.38674033,
                "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                "details" : [
                  {
                    "value" : 1.0,
                    "description" : "freq, occurrences of term within document",
                    "details" : [ ]
                  },
                  {
                    "value" : 1.2,
                    "description" : "k1, term saturation parameter",
                    "details" : [ ]
                  },
                  {
                    "value" : 0.75,
                    "description" : "b, length normalization parameter",
                    "details" : [ ]
                  },
                  {
                    "value" : 8.0,
                    "description" : "dl, length of field",
                    "details" : [ ]
                  },
                  {
                    "value" : 5.6,
                    "description" : "avgdl, average length of field",
                    "details" : [ ]
                  }
                ]
              }
            ]
          }
        ]
      },
      {
        "value" : 0.45859373,   // sub-query score
        "description" : "weight(title:都 in 1) [PerFieldSimilarity], result of:",
        "details" : [
          {
            "value" : 0.45859373, // sub-query score
            "description" : "score(freq=1.0), computed as boost * idf * tf from:",
            "details" : [
              {
                "value" : 2.2,
                "description" : "boost",
                "details" : [ ]
              },
              {
                "value" : 0.5389965,
                "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                "details" : [
                  {
                    "value" : 3,
                    "description" : "n, number of documents containing term",
                    "details" : [ ]
                  },
                  {
                    "value" : 5,
                    "description" : "N, total number of documents with field",
                    "details" : [ ]
                  }
                ]
              },
              {
                "value" : 0.38674033,
                "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                "details" : [
                  {
                    "value" : 1.0,
                    "description" : "freq, occurrences of term within document",
                    "details" : [ ]
                  },
                  {
                    "value" : 1.2,
                    "description" : "k1, term saturation parameter",
                    "details" : [ ]
                  },
                  {
                    "value" : 0.75,
                    "description" : "b, length normalization parameter",
                    "details" : [ ]
                  },
                  {
                    "value" : 8.0,
                    "description" : "dl, length of field",
                    "details" : [ ]
                  },
                  {
                    "value" : 5.6,
                    "description" : "avgdl, average length of field",
                    "details" : [ ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
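
The numbers in the explain output can be reproduced by hand from the formulas printed in its description fields. The Java sketch below recomputes one sub-query score using exactly the values shown above (n=3, N=5, freq=1, k1=1.2, b=0.75, dl=8, avgdl=5.6, boost=2.2). The class and method names are hypothetical. ES reports single-precision values, so the results agree only up to rounding.

```java
public class Bm25ExplainSketch {

    // idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
    static double idf(long n, long totalDocs) {
        return Math.log(1 + (totalDocs - n + 0.5) / (n + 0.5));
    }

    // tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl))
    static double tf(double freq, double k1, double b, double dl, double avgdl) {
        return freq / (freq + k1 * (1 - b + b * dl / avgdl));
    }

    // score, computed as boost * idf * tf
    static double score(double boost, double idf, double tf) {
        return boost * idf * tf;
    }

    public static void main(String[] args) {
        double idf = idf(3, 5);                       // about 0.5390
        double tf = tf(1.0, 1.2, 0.75, 8.0, 5.6);     // about 0.3867
        double perTerm = score(2.2, idf, tf);         // about 0.4586
        System.out.println("idf = " + idf);
        System.out.println("tf = " + tf);
        System.out.println("per-term score = " + perTerm);
        // The top-level explanation is "sum of:" the two identical term scores
        System.out.println("total = " + (perTerm + perTerm)); // about 0.9172
    }
}
```

Both terms have the same n, freq, and dl here, which is why the two sub-query scores in the output are identical.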


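4.2 Rich search and matching features

ES provides different search methods for different data types: term for keyword fields, match for text fields, range for value intervals on numeric types, suggest for prefix matching, and so on.

4.2.1 Querying all documents

Similar to select * from table in a relational database, ES provides the match_all query. It does not compute a relevance score per document: every document gets the same score, which equals boost (1 by default).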

DSL

GET /${index_name}/_search
{
  "query": {
    "match_all": {
      "boost": 2
    }
  }
}


{
  "took" : 16,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 2.0,
    "hits" : [
      {
        "_index" : "hoteld",
        "_type" : "_doc",
        "_id" : "001",
        "_score" : 2.0,
        "_source" : {
          "title" : "文雅酒店",
          "city" : "青岛",
          "price" : 556,
          "create_time" : "2020-04-18 12:00:00",
          "amenities" : "浴池,普通停车场/充电停车场",
          "full_room" : false,
          "location" : {
            "lat" : 36.083078,
            "lon" : 120.37566
          },
          "praise" : 10
        }
      }
      ]
  }
}


The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch

    /**
     * Query all documents
     */
    public void getMatchAllList(){
        NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder().withQuery(QueryBuilders.matchAllQuery().boost(2.0f)).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(nativeSearchQuery, Hotel1.class);
        for (SearchHit<Hotel1> hotel1SearchHit : search) {
            Hotel1 content = hotel1SearchHit.getContent();
            System.out.println(content);
        }
    }

2. elasticsearch-rest-high-level-client
 /**
     * Query all documents
     */
    public void getMatchAllList(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        SearchSourceBuilder query = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());
        searchRequest.source(query);
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : searchResponse.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

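4.2.2 term-level queries

A term query is like equals in Java: it matches the exact value. It works on keyword, numeric, date, and boolean fields, and on arrays of those types. A text field is analyzed into tokens before it is written to the inverted index, so a term query on it is compared against the individual tokens rather than the original string and is generally not suitable.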

DSL

GET /${index_name}/_search
{
  "query":{
    "term":{
      "field":{
        "value":"value"
      }
    }
  }
}

  1. term: the keyword
  2. field: the name of the field to query
  3. value: the value to search for
GET /hoteld/_search
{
  "query": {
    "term": {
      "city": {
       "value": "天津"  // keyword
      }
    }
  }
}

{
  "query": {
    "term": {
      "price": {
       "value": "200" // double
      }
    }
  }
}

{
  "query": {
    "term": {
      "create_time": {
       "value": "2021-05-09 16:00:00" // date
      }
    }
  }
}

{
  "query": {
    "term": {
      "full_room": {
       "value": true  // boolean
      }
    }
  }
}


The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch
/**
     * term query
     */
    public void findTermQuery(){
        NativeSearchQuery query = new NativeSearchQueryBuilder().withQuery(QueryBuilders.termQuery("city", "天津")).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(query, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
/**
     * term query
     */
    public void findTermQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(QueryBuilders.termQuery("city","天津"));
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : search.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4.2.3 terms-level queries

terms is the multi-value version of term. term matches a single value, like = in SQL, whereas terms is like IN (...).

DSL

GET /${index_name}/_search
{
  "query":{
    "terms":{
      "field":[
        "value",
        "value"...
      ]
    }
  }
}

  1. terms: the keyword
  2. field: the field name
GET /hoteld/_search
{
  "query": {
    "terms": {
      "city":["北京","天津"]
    }
  }
}

The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch
/**
     * terms query
     */
    public void findTermsQuery(){
        NativeSearchQuery query = new NativeSearchQueryBuilder().withQuery(QueryBuilders.termsQuery("city", "天津","北京")).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(query, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
/**
     * terms query
     */
    public void findTermsQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(QueryBuilders.termsQuery("city","天津","北京"));
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : search.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

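4.2.4 range query

A range query is typically used on numeric and date fields and returns the documents whose value falls within the given range.

  1. gt: greater than
  2. lt: less than
  3. gte: greater than or equal to
  4. lte: less than or equal to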

DSL

GET /${index_name}/_search
{
  "query":{
    "range":{
      "field":{
        "gt":"",
        "lt":"".....
      }
    }
  }
}

GET /hoteld/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 700,
        "lte": 800
      }
    }
  }
}

// Date query: the values must use the date format defined in the mapping
GET /hoteld/_search
{
  "query": {
    "range": {
      "create_time": {
        "gte": "2021-01-01 08:00:00",
        "lte": "2021-03-01 08:00:00"
      }
    }
  }
}

The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch
    /**
     * range query
     */
    public void findRangeQuery(){
        NativeSearchQuery query = new NativeSearchQueryBuilder().withQuery(QueryBuilders.rangeQuery("price").gte(700).lte(800)).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(query, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
/**
     * range query
     */
    public void findRangeQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(QueryBuilders.rangeQuery("price").gte(700).lte(800));
        searchRequest.source(sourceBuilder);
        try {
            SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : search.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

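4.2.5 exists query

An exists query matches documents in which a field has a value. A field counts as having a value when it is not null, not an empty array, and not an array that contains only null ([null]).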

DSL

GET /${index_name}/_search
{
  "query":{
    "exists":{
      "field":"key"
    }
  }
}

  1. exists: the keyword
  2. key: the field name

Test:

PUT /hotel_1
{
  "mappings": {
    "properties":{
    "title":{
      "type":"text"
    },
    "tag":{
      "type":"keyword"
    }
    }
  }
}

POST /hotel_1/_doc/001
{
  "title":"环球酒店",
  "tag":null
}

POST /hotel_1/_doc/002
{
  "title":"环球酒店",
  "tag":[]
}

POST /hotel_1/_doc/003
{
  "title":"环球酒店",
  "tag":[null]
}

The tag field in the three documents is null, [], and [null] respectively. A query for documents whose tag field has a value matches none of them.
GET /hotel_1/_search
{
  "query":{
    "exists":{
      "field":"tag"
    }
  }
}

Result:
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
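
The rule behind this result can be written as a small predicate. The Java sketch below is a hypothetical model of the "has a value" check described in this section (null, [], and [null] do not count). It operates on plain Java objects standing in for parsed JSON values and does not call ES.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ExistsSketch {

    // Returns true when the value would count as present for an exists query
    // under the rules above: not null, and for arrays at least one non-null element.
    static boolean hasValue(Object value) {
        if (value == null) {
            return false;
        }
        if (value instanceof List) {
            for (Object element : (List<?>) value) {
                if (element != null) {
                    return true;
                }
            }
            return false; // empty array, or an array holding only nulls
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasValue(null));                            // false, like doc 001
        System.out.println(hasValue(Collections.emptyList()));         // false, like doc 002
        System.out.println(hasValue(Collections.singletonList(null))); // false, like doc 003
        System.out.println(hasValue("wifi"));                          // true
        System.out.println(hasValue(Arrays.asList(null, "wifi")));     // true
    }
}
```

The last case shows that an array only needs one non-null element for the field to count as present.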


The Java API comes in two forms

1. spring-boot-starter-data-elasticsearch
    /**
     * exists query
     */
    public void findExistsQuery(){
        NativeSearchQuery query = new NativeSearchQueryBuilder().withQuery(QueryBuilders.existsQuery("tag")).build();
        SearchHits<Map> search = elasticsearchRestTemplate.search(query, Map.class, IndexCoordinates.of("hotel_1"));
        if (search.hasSearchHits()) {
            for (SearchHit<Map> hotel1SearchHit : search) {
                Map content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
/**
     * exists query
     */
    public void findExistsQuery(){
        SearchRequest searchRequest = new SearchRequest("hotel_1");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(QueryBuilders.existsQuery("tag"));
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : searchResponse.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

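4.2.6 Boolean query

A compound query combines several conditions to narrow the result set. The boolean query is a common compound query. Its results are scored according to how well each sub-query matches.

The boolean query supports four kinds of sub-queries.

Clause      Function
must        The document must match the condition; think of it as &&
should      The document may match the condition; think of it as ||
must_not    The document must not match the condition; think of it as !
filter      The document must match the filter condition; no scoring is performed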
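
The matching side of the four clauses can be modelled with plain predicates. The Java sketch below is a hypothetical illustration that ignores scoring entirely. It assumes the default ES behaviour that should clauses are only required to match when the bool query has no must or filter clause. The class and method names are not part of any ES client.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class BoolQuerySketch {

    // Decides whether a document matches a bool query (matching only, no scoring).
    static <T> boolean matches(T doc,
                               List<Predicate<T>> must,
                               List<Predicate<T>> should,
                               List<Predicate<T>> mustNot,
                               List<Predicate<T>> filter) {
        // must and filter: every clause has to match (&&)
        boolean required = must.stream().allMatch(p -> p.test(doc))
                && filter.stream().allMatch(p -> p.test(doc));
        // must_not: no clause may match (!)
        boolean excluded = mustNot.stream().anyMatch(p -> p.test(doc));
        // should: at least one has to match (||) only when there is no must/filter clause
        boolean shouldOk = should.isEmpty() || !must.isEmpty() || !filter.isEmpty()
                || should.stream().anyMatch(p -> p.test(doc));
        return required && !excluded && shouldOk;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("city", "北京");
        doc.put("price", 500.0);

        Predicate<Map<String, Object>> inBeijing = d -> "北京".equals(d.get("city"));
        Predicate<Map<String, Object>> priceInRange = d -> {
            double price = ((Number) d.get("price")).doubleValue();
            return price >= 350 && price <= 500;
        };
        List<Predicate<Map<String, Object>>> none = List.of();

        System.out.println(matches(doc, List.of(inBeijing, priceInRange), none, none, none)); // true
        System.out.println(matches(doc, none, none, List.of(inBeijing), none));               // false
    }
}
```

must and filter accept the same documents in this model. They differ only in scoring, which the sketch leaves out.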

4.2.6.1 must query

must is equivalent to a logical AND, and the score of each sub-query is added to the document's score.

DSL
GET /hoteld/_search
{
  "query": {
    "bool": {
      "must": [
       {
         "term": {
           "city": {
             "value": "北京"
           }
         }
       },
       {
         "range": {
           "price": {
             "gte": 350,
             "lte": 500
           }
         }
       }
      ]
    }
  }
}

  1. bool: the keyword
  2. must: the keyword; an array that can hold multiple conditions
The Java API comes in two forms
1. spring-boot-starter-data-elasticsearch
/**
     * must query
     */
    public void findMustQuery(){
        TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("city", "北京");
        RangeQueryBuilder price = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery().must(termsQueryBuilder).must(price);
        NativeSearchQuery build = new NativeSearchQueryBuilder().withQuery(queryBuilder).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(build, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
/**
     * must query
     */
    public void findMustQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("city", "北京");
        boolQueryBuilder.must(termQueryBuilder);
        RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        boolQueryBuilder.must(rangeQueryBuilder);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(boolQueryBuilder);
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : searchResponse.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4.2.6.2 should query

should is equivalent to a logical OR (||). The score of each matching clause also contributes to the total score.

DSL
GET /${index_name}/_search
{
  "query":{
    "bool":{
      "should":[
        {
          "match":{}
        },
        {
          "term":{}
        }
      ]
    }
  }
}

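  1. bool: the keyword that marks a boolean query
  2. should: an OR query; when the bool query has no must or filter clause, a document matches as soon as one of the clauses matches.

Example: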

GET /hoteld/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {
          "city": {
            "value": "天津"
          }
        }},
        {
          "range": {
            "price": {
              "gte": 350,
              "lte": 500
            }
          }
        }
      ]
    }
  }
}

The Java API comes in two forms
1. spring-boot-starter-data-elasticsearch
/**
     * should query, also known as an OR query
     */
    public void findShouldQuery(){
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("city", "天津");
        RangeQueryBuilder price = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        BoolQueryBuilder should = QueryBuilders.boolQuery().should(termQueryBuilder).should(price);
        NativeSearchQuery build = new NativeSearchQueryBuilder().withQuery(should).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(build, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
    /**
     * should query, also known as an OR query
     */
    public void findShouldQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("city", "天津");
        RangeQueryBuilder price = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        boolQueryBuilder.should(termQueryBuilder).should(price);
        SearchSourceBuilder query = new SearchSourceBuilder().query(boolQueryBuilder);
        searchRequest.source(query);
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : searchResponse.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

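4.2.6.3 must_not query

must_not is a logical NOT: a matching document must not match any of its sub-queries. must_not clauses run in filter context, so they do not contribute to the score.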

DSL
GET /hoteld/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"term": {
          "city": {
            "value": "天津"
          }
        }},
        {
          "range": {
            "price": {
              "gte": 350,
              "lte": 500
            }
          }
        }
      ]
    }
  }
}

The Java API comes in two forms
1. spring-boot-starter-data-elasticsearch
 /**
     * must_not query
     */
    public void findMustNotQuery(){
        TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("city", "北京");
        RangeQueryBuilder price = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery().mustNot(termsQueryBuilder).mustNot(price);
        NativeSearchQuery build = new NativeSearchQueryBuilder().withQuery(queryBuilder).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(build, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
/**
     * must_not query, also known as a NOT query
     */
    public void findMustNotQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("city", "北京");
        boolQueryBuilder.mustNot(termQueryBuilder);
        RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        boolQueryBuilder.mustNot(rangeQueryBuilder);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(boolQueryBuilder);
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : searchResponse.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4.2.6.4 filter查询

filter查询和其他布尔查询不太一样。其他布尔查询会关注子查询分数情。filter不关注分数。并且还会缓存部分子查询结果。

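How the filter query works
  1. Suppose there are five documents. The inverted index of the city field for these five documents is:

  2. The inverted index of the full_room field for the same five documents is:

  3. Take the query "hotels in 北京 (Beijing) that are not fully booked" as an example.
    1. When ES executes the filter conditions, it first checks whether a bitset (bitmap) for city = 北京 already exists in the cache. A bitset represents a contiguous range of values in the most compact form. If the cache holds the bitset, ES uses it directly. If not, ES runs the query, assembles a bitset from the query result, and puts it into the cache. ES likewise checks whether a bitset exists for full_room = false: if so, it is taken from the cache; otherwise ES runs the query, builds the bitset from the result, and caches it.
    2. Suppose no bitset exists for city = 北京. The bitset is then built as follows:
      1. ES first searches for the documents whose city is 北京; here the matching documents are doc1 and doc5. It then builds a bitset array that covers all documents. Each element of the array indicates whether the document at that position matches the query condition: 0 means no match, 1 means match. In this example, doc1 and doc5 match 北京, so their positions hold 1; doc2, doc3 and doc4 do not match, so their positions hold 0. The resulting bitset array is [1,0,0,0,1]. A bitset is used to record which documents match a query because the structure saves space and also makes later operations faster. If the cache holds no bitset for the full_room field, ES builds the bitset for full_room = false in the same way.

  4. Next, ES iterates over the bitset arrays of the query conditions and filters documents by whether they are hits. When a request has several filter conditions, ES starts with the sparsest array, because a sparse array rules out more documents. In the request above the city array is the sparsest, so ES filters by city first and then by full_room. After both arrays have been applied, only doc1 and doc5 remain.
  5. If a bitset is already in the cache, ES filters with the cached array directly. In other words, bitsets are reusable. This reuse mechanism is called the filter cache.
  6. The filter cache tracks every filter query, and ES caches the bitsets of only some of them. First, the filter condition must have appeared in the most recent 256 queries; second, the number of times it appeared must exceed a certain threshold.
  7. The filter cache is also updated automatically. If the city of a document is modified, the corresponding bitset is updated accordingly.
  8. A filter query takes no part in score calculation, which reduces the cost further.
  9. When your business logic filters on fields that need no score calculation, use a filter query.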
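
The bitset steps above can be sketched in plain Java with java.util.BitSet. This is only an illustration of the idea, not how ES implements it internally: the class name FilterBitsetSketch, the helper methods and the field values for doc1 to doc5 are all assumptions made for the example, and the city is written as "Beijing" to keep the source ASCII-only.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

final class FilterBitsetSketch {

    /** Builds one bit per document: bit i is set when values[i] equals the wanted value. */
    static BitSet buildBitset(String[] values, String wanted) {
        BitSet bits = new BitSet(values.length);
        for (int i = 0; i < values.length; i++) {
            if (values[i].equals(wanted)) {
                bits.set(i);
            }
        }
        return bits;
    }

    /** Intersects a non-empty list of bitsets, starting from the sparsest (fewest set bits). */
    static BitSet intersectSparsestFirst(List<BitSet> filters) {
        List<BitSet> ordered = new ArrayList<>(filters);
        ordered.sort(Comparator.comparingInt(BitSet::cardinality));
        // Clone so that the (possibly cached) input bitsets are never modified.
        BitSet result = (BitSet) ordered.get(0).clone();
        for (int i = 1; i < ordered.size(); i++) {
            result.and(ordered.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical field values for doc1..doc5 (index 0 is doc1).
        String[] city = {"Beijing", "Tianjin", "Qingdao", "Shanghai", "Beijing"};
        String[] fullRoom = {"false", "false", "true", "false", "false"};

        BitSet cityBits = buildBitset(city, "Beijing");    // [1,0,0,0,1]
        BitSet roomBits = buildBitset(fullRoom, "false");  // [1,1,0,1,1]

        BitSet hits = intersectSparsestFirst(List.of(roomBits, cityBits));
        System.out.println(hits); // prints {0, 4}, i.e. doc1 and doc5
    }
}
```

Sorting by cardinality mirrors the "sparsest array first" rule: the city bitset has two set bits and the full_room bitset has four, so the city bitset seeds the result.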
DSL
GET /hoteld/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {
          "city": {
            "value": "北京"
          }
        }},
        {
          "range": {
            "price": {
              "gte": 350,
              "lte": 500
            }
          }
        }
      ]
    }
  }
}

The Java API can be used in two ways:
1. spring-boot-starter-data-elasticsearch
/**
     * filter query
     */
    public void findFilterQuery(){
        TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("city", "北京");
        RangeQueryBuilder price = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery().filter(termsQueryBuilder).filter(price);
        NativeSearchQuery build = new NativeSearchQueryBuilder().withQuery(queryBuilder).build();
        SearchHits<Hotel1> search = elasticsearchRestTemplate.search(build, Hotel1.class);
        if (search.hasSearchHits()) {
            for (SearchHit<Hotel1> hotel1SearchHit : search) {
                Hotel1 content = hotel1SearchHit.getContent();
                System.out.println(content);
            }
        }
    }

2. elasticsearch-rest-high-level-client
 /**
     * filter query
     */
    public void findFilterQuery(){
        SearchRequest searchRequest = new SearchRequest("hoteld");
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("city", "北京");
        boolQueryBuilder.filter(termQueryBuilder);
        RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("price").gte(350).lte(500);
        boolQueryBuilder.filter(rangeQueryBuilder);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(boolQueryBuilder);
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : searchResponse.getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                String id = hit.getId();
                System.out.println(sourceAsMap + "id: " + id);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4.2.6.5 Constant Score query

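If you do not want term frequency to influence the ranking of the search results and only want to filter on whether a text field contains a given term, use a constant_score query. Suppose we need to find the hotels whose amenities field contains the term 停车场 (parking lot).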

DSL
GET /hoteld/_search
{
  "_source":["amenities"],
  "query":{
    "constant_score":{
      "filter":{
        "match":{
          "amenities":"停车场"
        }
      },
      "boost":1.2
    }
  }
}

Query result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {         // shards
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.2,          // maximum score
    "hits" : [
      {
        "_index" : "hoteld",
        "_type" : "_doc",
        "_id" : "001",
        "_score" : 1.2,         // constant score, so it does not affect the ranking
        "_source" : {
          "amenities" : "浴池,普通停车场/充电停车场"
        }
      },
      {
        "_index" : "hoteld",
        "_type" : "_doc",
        "_id" : "002",
        "_score" : 1.2,
        "_source" : {
          "amenities" : "wifi,充电停车场/可升降停车场"
        }
      },
      {
        "_index" : "hoteld",
        "_type" : "_doc",
        "_id" : "003",
        "_score" : 1.2,
        "_source" : {
          "amenities" : "提供假日party,免费早餐,可充电停车场"
        }
      },
      {
        "_index" : "hoteld",
        "_type" : "_doc",
        "_id" : "004",
        "_score" : 1.2,
        "_source" : {
          "amenities" : "浴池(假日需预定),室内游泳池,普通停车场"
        }
      },
      {
        "_index" : "hoteld",
        "_type" : "_doc",
        "_id" : "005",
        "_score" : 1.2,
        "_source" : {
          "amenities" : "浴池(假日需预定),wifi,室内游泳池,普通停车场"
        }
      }
    ]
  }
}
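
Why every hit above carries the same score can be illustrated with a small plain-Java model. It is a deliberately simplified sketch and not ES scoring: the class ConstantScoreSketch, its methods and the English stand-in amenity strings are assumptions made for the example, and a substring count stands in for term frequency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConstantScoreSketch {

    /** Counts non-overlapping occurrences of term in text (a crude stand-in for term frequency). */
    static int termFrequency(String text, String term) {
        if (term.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(term, from)) >= 0) {
            count++;
            from += term.length();
        }
        return count;
    }

    /** Constant scoring: every document that contains the term gets exactly the boost value. */
    static Map<String, Float> constantScore(Map<String, String> docs, String term, float boost) {
        Map<String, Float> scores = new LinkedHashMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            if (termFrequency(doc.getValue(), term) > 0) {
                scores.put(doc.getKey(), boost);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the amenities field of three documents.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("001", "bath, parking, charging parking"); // term appears twice
        docs.put("004", "bath, indoor pool, parking");      // term appears once
        docs.put("999", "wifi, breakfast");                 // term absent

        for (Map.Entry<String, String> doc : docs.entrySet()) {
            System.out.println(doc.getKey() + " tf=" + termFrequency(doc.getValue(), "parking"));
        }
        // Both matching documents receive the same score despite different term frequencies.
        System.out.println(constantScore(docs, "parking", 1.2f));
    }
}
```

In this model the filter only decides membership; the boost value is the whole score, which is why the ranking does not depend on how often the term occurs.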
