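Aggregation Grouping and Aggregation Calculation Queries in Elasticsearch with RestHighLevelClient

😊 @ Author: 一恍过去

💖 @ Home page: https://blog.csdn.net/zhuocailing3390

🎊 @ Community: Java Tech Stack Exchange

🎉 @ Topic: Aggregation grouping and aggregation calculation queries in Elasticsearch with RestHighLevelClient

⏱️ @ Created: 2022-08-22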

1. pom dependencies

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.7</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.8.0</version>
        </dependency>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.8.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.8.2</version>
        </dependency>
    </dependencies>

2. Configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @Author: 
 * @Date: 2022/8/13 10:47
 * @Description:
 **/
@Configuration
public class ElasticsearchConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                // Address of the ES node to connect to
                RestClient.builder(new HttpHost("192.168.80.121", 9200, "http"))
        );
    }
}

3. Preparing the data

Create the index:

# Create the index
PUT http://192.168.80.121:9200/cars

# Request body
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "color": {
        "type": "keyword"
      },
      "make": {
        "type": "keyword"
      },
      "price": {
        "type": "float"
      },
      "sold": {
        "type": "keyword"
      }
    }
  }
}

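Bulk-insert the data:

Reference: "Bulk-importing data with HTTP requests"

# Bulk import
POST http://192.168.80.121:9200/cars/_bulk

# Note: every action line and every document must sit on its own line, and the body must end with a newline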

{"index": {"_index": "cars", "_type": "_doc", "_id": 1}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2022-10-28" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 2}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2022-11-05" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 3}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2022-05-18" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 4}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2022-07-02" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 5}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2022-08-19" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 6}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2022-11-05" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 7}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2022-01-01" }

{"index": {"_index": "cars", "_type": "_doc", "_id": 8}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2022-02-12" }
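
The newline rule above is easy to get wrong when the bulk body is assembled in code. The following is a minimal sketch in plain Java (no Elasticsearch classes) of building such a newline-delimited body. The class name `BulkBodySketch`, the method `buildBulkBody`, and the 1..n ID assignment are assumptions made for illustration, and `_type` is left out because it is optional in 7.x.

```java
import java.util.List;

/** Sketch: assemble a newline-delimited body for the _bulk endpoint. */
final class BulkBodySketch {

    /**
     * Pairs each pre-serialized JSON document with an "index" action line.
     * IDs are assigned 1..n, an assumption chosen to match the sample data.
     * Each document is assumed to contain no line breaks.
     */
    static String buildBulkBody(String index, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        int id = 1;
        for (String doc : jsonDocs) {
            // Action/metadata line, terminated by a newline.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(id++).append("\"}}\n");
            // Source line; the trailing newline also closes the whole body.
            body.append(doc).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = buildBulkBody("cars", List.of(
            "{\"price\":10000,\"color\":\"red\",\"make\":\"honda\",\"sold\":\"2022-10-28\"}",
            "{\"price\":20000,\"color\":\"red\",\"make\":\"honda\",\"sold\":\"2022-11-05\"}"));
        System.out.print(body);
    }
}
```

The resulting string can be sent as the request body of the POST shown above; two documents produce four lines, each ending in a newline.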


Verify the data:

# Verify
GET http://192.168.80.121:9200/cars/_search

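4. Basic concepts

Note: in ES, string fields used for aggregation, sorting, or exact-match filtering must not be analyzed (tokenized), so map them as type keyword rather than text.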

The sample documents use the fields defined in section 3: price, color, make, and sold.

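Elasticsearch offers many kinds of aggregations. The two most commonly used are buckets and metrics:

Bucket

A bucket aggregation groups documents by some criterion (like SQL's GROUP BY). Each resulting group is called a bucket in ES.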
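
To make the bucket idea concrete, here is a rough in-memory analogue in plain Java: it groups the color values of the eight sample cars and counts each group. This is not the Elasticsearch API, only a sketch of what grouping into buckets means; the class name `BucketSketch` and its members are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Sketch: group values into buckets and count the members of each bucket. */
final class BucketSketch {

    /** Colors of the eight sample cars, in _id order. */
    static final List<String> COLORS = Arrays.asList(
        "red", "red", "green", "blue", "green", "red", "red", "blue");

    /** Returns "key=count" entries, largest bucket first. */
    static List<String> countBuckets(List<String> values) {
        // TreeMap keeps keys sorted; the stable sort below therefore
        // leaves buckets with equal counts in ascending key order.
        Map<String, Long> counts = values.stream()
            .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
        return counts.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countBuckets(COLORS)); // [red=4, blue=2, green=2]
    }
}
```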

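Metrics

Once documents are grouped, we usually run a calculation over each group, such as an average, maximum, minimum, or sum. ES calls these calculations metrics.

Commonly used metric aggregations:

avg: average value
max: maximum value
min: minimum value
percentiles: percentile values
stats: returns avg, max, min, sum, and count together
sum: sum of values
top_hits: the top matching documents
value_count: number of values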
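
As a rough analogue of the `stats` metric from the list above, the sketch below computes count, min, max, sum, and average over the eight sample prices using `DoubleSummaryStatistics` from the Java standard library. It runs in memory and does not call Elasticsearch; the class name `StatsSketch` is an assumption for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

/** Sketch: the five values a stats-style metric reports, computed in memory. */
final class StatsSketch {

    /** Prices of the eight sample cars, in _id order. */
    static final double[] PRICES = {10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000};

    static DoubleSummaryStatistics priceStats(double[] prices) {
        // One pass over the values collects count, min, max and sum;
        // the average is derived from sum and count.
        return DoubleStream.of(prices).summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics s = priceStats(PRICES);
        System.out.println("count=" + s.getCount());   // count=8
        System.out.println("min=" + s.getMin());       // min=10000.0
        System.out.println("max=" + s.getMax());       // max=80000.0
        System.out.println("sum=" + s.getSum());       // sum=212000.0
        System.out.println("avg=" + s.getAverage());   // avg=26500.0
    }
}
```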

5. Bucket aggregation (group-by query)

Here we split the cars into buckets by their color field.

Request:

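// @Api and @ApiOperation come from swagger-annotations, which the pom above does not list
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;
import java.io.IOException;
import java.util.List;

@Api(tags = "查询操作")
@RestController
@RequestMapping("/query")
@Slf4j
public class QueryController {
    @Resource
    private RestHighLevelClient restHighLevelClient;

    /**
     * Group-by (bucket) aggregation query
     *
     * @throws IOException
     */
    @ApiOperation(value = "聚合分组查询", notes = "聚合分组查询")
    @GetMapping("/group")
    public void group() throws IOException {
        SearchRequest request = new SearchRequest();
        // Query the cars index
        request.indices("cars");

        // Group by the color field
        SearchSourceBuilder builder = new SearchSourceBuilder();
        // Only the aggregation result matters here, so set size to 0 to return no hits
        builder.size(0);
        // Name the aggregation `colorGroup` and order buckets by document count: false = desc, true = asc
        AggregationBuilder aggregationBuilder = AggregationBuilders.terms("colorGroup").field("color").order(BucketOrder.count(false));
        builder.aggregation(aggregationBuilder);

        // Run the query
        request.source(builder);
        SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
        // Read the buckets
        Aggregations aggregations = response.getAggregations();
        ParsedStringTerms colorGroup = aggregations.get("colorGroup");
        List<? extends Terms.Bucket> buckets = colorGroup.getBuckets();
        for (Terms.Bucket bucket : buckets) {
            System.out.println("color：" + bucket.getKey() + "，" + "count：" + bucket.getDocCount());
        }
    }
}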

Response:

color：red，count：4
color：blue，count：2
color：green，count：2

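6. Metric aggregation (aggregate calculation)

A metric aggregation runs a calculation over all matching documents directly, without grouping them first. Available calculations include avg, max, min, sum, stats, and percentiles.

Request: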

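// @Api and @ApiOperation come from swagger-annotations, which the pom above does not list
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.metrics.Avg;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;
import java.io.IOException;

@Api(tags = "查询操作")
@RestController
@RequestMapping("/query")
@Slf4j
public class QueryController {
    @Resource
    private RestHighLevelClient restHighLevelClient;

    /**
     * Metric aggregation query
     *
     * @throws IOException
     */
    @ApiOperation(value = "聚合计算查询", notes = "聚合计算查询")
    @GetMapping("/aggs")
    public void aggs() throws IOException {
        SearchRequest request = new SearchRequest();
        // Query the cars index
        request.indices("cars");

        // Average the price field
        SearchSourceBuilder builder = new SearchSourceBuilder();
        // Only the aggregation result matters here, so set size to 0 to return no hits
        builder.size(0);
        AggregationBuilder aggregationBuilder = AggregationBuilders.avg("avgPrice").field("price");
        builder.aggregation(aggregationBuilder);

        // Run the query
        request.source(builder);
        SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
        // Read the result
        Aggregations aggregations = response.getAggregations();
        Avg avgPrice = aggregations.get("avgPrice");
        double value = avgPrice.getValue();
        System.out.println("Average: " + value);
    }

}

Response:

Average: 26500.0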

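7. Metrics inside buckets (grouped aggregate calculation)

Here the documents are first grouped by a field and a calculation is then run inside each group. Available calculations include avg, max, min, sum, stats, and percentiles.

Request: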

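// @Api and @ApiOperation come from swagger-annotations, which the pom above does not list
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.metrics.Avg;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;
import java.io.IOException;
import java.util.List;

@Api(tags = "查询操作")
@RestController
@RequestMapping("/query")
@Slf4j
public class QueryController {
    @Resource
    private RestHighLevelClient restHighLevelClient;

    /**
     * Grouped metric aggregation query
     *
     * @throws IOException
     */
    @ApiOperation(value = "分组聚合计算查询", notes = "分组聚合计算查询")
    @GetMapping("/aggsGroup")
    public void aggsGroup() throws IOException {
        SearchRequest request = new SearchRequest();
        // Query the cars index
        request.indices("cars");

        // Group by color, then average price inside each group
        SearchSourceBuilder builder = new SearchSourceBuilder();
        // Only the aggregation result matters here, so set size to 0 to return no hits
        builder.size(0);
        // Name the aggregation `colorGroup` and order buckets by document count: false = desc, true = asc
        AggregationBuilder aggregationBuilder = AggregationBuilders.terms("colorGroup").field("color").order(BucketOrder.count(false));
        builder.aggregation(aggregationBuilder);

        // Attach a sub-aggregation that averages price within each bucket
        AggregationBuilder avgBuilder = AggregationBuilders.avg("avgPrice").field("price");
        aggregationBuilder.subAggregation(avgBuilder);

        // Run the query
        request.source(builder);
        SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
        // Read the buckets
        Aggregations aggregations = response.getAggregations();
        ParsedStringTerms colorGroup = aggregations.get("colorGroup");
        List<? extends Terms.Bucket> buckets = colorGroup.getBuckets();
        for (Terms.Bucket bucket : buckets) {
            // Read the metric computed inside this bucket
            Aggregations bucketAggregations = bucket.getAggregations();
            Avg avgPrice = bucketAggregations.get("avgPrice");
            double value = avgPrice.getValue();
            System.out.println("color：" + bucket.getKey() + "，" + "count：" + bucket.getDocCount() + "，" + "avg：" + value);
        }
    }

}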

Response:

color：red，count：4，avg：32500.0
color：blue，count：2，avg：20000.0
color：green，count：2，avg：21000.0
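
The per-color averages can be cross-checked without a cluster. The sketch below reproduces the group-then-average step in plain Java over the eight sample cars. It is an in-memory analogue, not the Elasticsearch API; the `Car` class and the name `BucketAvgSketch` are assumptions for illustration, and the result is sorted by color rather than by bucket size.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Sketch: average price per color, computed in memory. */
final class BucketAvgSketch {

    /** Minimal stand-in for a car document; only the fields used here. */
    static final class Car {
        private final String color;
        private final double price;

        Car(String color, double price) {
            this.color = color;
            this.price = price;
        }

        String color() { return color; }
        double price() { return price; }
    }

    /** The eight sample cars, in _id order. */
    static final List<Car> CARS = Arrays.asList(
        new Car("red", 10000), new Car("red", 20000), new Car("green", 30000),
        new Car("blue", 15000), new Car("green", 12000), new Car("red", 20000),
        new Car("red", 80000), new Car("blue", 25000));

    /** Groups by color (the bucket) and averages price in each group (the metric). */
    static Map<String, Double> avgPriceByColor(List<Car> cars) {
        return cars.stream().collect(
            Collectors.groupingBy(Car::color, TreeMap::new, Collectors.averagingDouble(Car::price)));
    }

    public static void main(String[] args) {
        // {blue=20000.0, green=21000.0, red=32500.0}
        System.out.println(avgPriceByColor(CARS));
    }
}
```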

Original article: https://blog.csdn.net/zhuocailing3390/article/details/126356968
