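Official documentation
The completion suggester provides auto-complete (search-as-you-type) functionality. It is a navigational feature that guides users to relevant results as they type, improving search precision.
Ideally, auto-complete should be as fast as the user types, giving instant feedback relevant to what has already been typed. The completion suggester is therefore optimized for speed: it uses data structures that enable fast lookups but are costly to build and are held in memory.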
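As a rough illustration of why a prebuilt, sorted in-memory structure makes prefix lookups cheap, here is a minimal sketch. The `PrefixIndex` class is hypothetical and uses a `TreeSet` as a stand-in; it is not the data structure Elasticsearch actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy prefix index: a sorted set stands in for the suggester's in-memory structure. */
final class PrefixIndex {
    private final TreeSet<String> terms = new TreeSet<>();

    /** Build step: every term is inserted into the sorted set up front. */
    void add(String term) {
        terms.add(term);
    }

    /** Lookup step: jump to the first term >= prefix, then scan while the prefix still matches. */
    List<String> complete(String prefix, int size) {
        List<String> out = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix) || out.size() >= size) {
                break;
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        PrefixIndex index = new PrefixIndex();
        index.add("yakelihe");
        index.add("kemoji");
        index.add("yakeli");
        System.out.println(index.complete("yake", 10)); // [yakeli, yakelihe]
    }
}
```

All terms sharing a prefix sit next to each other in sorted order, so a lookup touches only the matching terms instead of scanning the whole set.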
Source code: https://gitcode.com/medcl/elasticsearch-analysis-pinyin/overview
Import the project into IntelliJ IDEA via Git (https://gitcode.com/medcl/elasticsearch-analysis-pinyin.git).
Switch to the 7.x branch.
In pom.xml, set elasticsearch.version to 7.15.0.

Run the Maven package command:
mvn clean package "-Dmaven.test.skip=true"
The command produces the plugin archive, elasticsearch-analysis-pinyin-7.15.0.zip.

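Upload elasticsearch-analysis-pinyin-7.15.0.zip to the Elasticsearch installation directory on the server and change the file's owner to the user that runs Elasticsearch. In the author's case that user is hadoop.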
chown hadoop:hadoop elasticsearch-analysis-pinyin-7.15.0.zip
From the Elasticsearch installation directory, install the plugin:
sh ./bin/elasticsearch-plugin install file:elasticsearch-analysis-pinyin-7.15.0.zip

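Restart Elasticsearch:
# Stop: replace <pid> with the Elasticsearch process ID (a plain kill sends SIGTERM, which lets the node shut down cleanly)
kill <pid>
# Start
sh ./bin/elasticsearch -d
Check that the plugin was installed successfully: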
GET _cat/plugins

Pinyin plugin parameters
The plugin provides an analyzer, a tokenizer, and a token filter, all named pinyin.
The tokenizer options used in the index definition:
| Option | Description |
|---|---|
| keep_separate_first_letter | Emit each first letter as a separate term, e.g. 刘德华 > l, d, h. Defaults to false. |
| keep_full_pinyin | Emit the full pinyin of each character, e.g. 刘德华 > liu, de, hua. Defaults to true. |
| keep_original | Also keep the original input as a term. Defaults to false. |
| limit_first_letter_length | Maximum length of the first-letter result. Defaults to 16. |
| lowercase | Lowercase non-Chinese letters. Defaults to true. |
| remove_duplicated_term | Remove duplicated terms to save index space, e.g. de的 > de. Defaults to false. |
PUT hot_word
{
"settings": {
"routing": {
"allocation": {
"include": {
"_tier_preference": "data_content"
}
}
},
"number_of_shards": "1",
"max_result_window": "10000",
"number_of_replicas": "1",
"analysis": {
"analyzer": {
"pinyin_analyzer": {
"tokenizer": "my_pinyin"
}
},
"tokenizer": {
"my_pinyin": {
"type": "pinyin",
"keep_separate_first_letter": true,
"keep_full_pinyin": true,
"keep_original": true,
"limit_first_letter_length": 16,
"lowercase": true,
"remove_duplicated_term": true
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
},
"suggest": {
"type": "completion",
"analyzer": "pinyin_analyzer",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
}
}
}
}
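Parameters of the completion field type:
| Parameter | Description |
|---|---|
| analyzer | The index analyzer to use. Defaults to simple. |
| search_analyzer | The search analyzer to use. Defaults to the value of analyzer. |
| preserve_separators | Whether to preserve separators. Defaults to true. If disabled, a suggestion for foof can match a field that starts with Foo Fighters. |
| preserve_position_increments | Whether to enable position increments. Defaults to true. If disabled and a stop-word analyzer is used, a suggestion for b can match a field that starts with The Beatles. Note: you can achieve the same result by indexing two inputs, Beatles and The Beatles; if you are able to enrich your data, there is no need to change a simple analyzer. |
| max_input_length | Limits the length of a single input. Defaults to 50 UTF-16 code points. The limit is applied only at index time to reduce the total number of characters per input string, which keeps very long inputs from bloating the underlying data structure. Most use cases are unaffected by the default, because prefix completions seldom grow beyond a handful of characters. |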
PUT hot_word/_doc/1?refresh
{
"name":"压克力盒",
"suggest":{
"input":["压克力盒"],
"weight":10
}
}
PUT hot_word/_doc/2?refresh
{
"name":"亚克力盒",
"suggest":{
"input":["亚克力盒"],
"weight":10
}
}
PUT hot_word/_doc/3?refresh
{
"name":"刻磨機",
"suggest":{
"input":["刻磨機"],
"weight":10
}
}
PUT hot_word/_doc/4?refresh
{
"name":"刻模机",
"suggest":{
"input":["刻模机"],
"weight":10
}
}
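Parameters of the suggest field:
| Parameter | Description | Notes |
|---|---|---|
| input | The input to store; either an array of strings or a single string. Required. | The value must not contain the following UTF-16 control characters: \u0000 (null), \u001f (information separator one), \u001e (information separator two). |
| weight | A positive integer, or a string containing a positive integer, that defines a weight and lets you rank suggestions. Optional. | |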
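Because an input containing one of the reserved control characters cannot be indexed, it can help to check values on the client before sending them. A minimal sketch, assuming a hypothetical `SuggestInputValidator` helper; treating an empty string as invalid is an assumption of this sketch.

```java
/** Client-side check for values destined for the "input" of a completion field. */
final class SuggestInputValidator {
    private static final char NULL_CHAR = 0x0000;      // null
    private static final char INFO_SEP_TWO = 0x001E;   // information separator two
    private static final char INFO_SEP_ONE = 0x001F;   // information separator one

    /** Returns true if the input is non-empty and free of the reserved characters. */
    static boolean isValidInput(String input) {
        if (input == null || input.isEmpty()) {
            return false; // assumption: empty inputs are treated as invalid here
        }
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == NULL_CHAR || c == INFO_SEP_TWO || c == INFO_SEP_ONE) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `SuggestInputValidator.isValidInput(value)` before building the document lets the application reject or clean a bad value with a clear message.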
GET hot_word/_search?pretty
{
"_source": ["suggest"],
"suggest": {
"song-suggest": {
"prefix": "关键词",
"completion": {
"field": "suggest",
"size": 10,
"skip_duplicates": true
}
}
}
}
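Parameters:
| Parameter | Description |
|---|---|
| field | The name of the completion field to run the query against. Required. |
| size | The number of suggestions to return. Defaults to 5. |
| skip_duplicates | Whether duplicate suggestions are filtered out. Defaults to false. |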
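To make the effect of size and skip_duplicates concrete, here is a small sketch of the ranking step: order candidates by weight, optionally drop repeated texts, and keep at most size entries. The `SuggestionRanking` and `Option` classes are hypothetical and only mimic the observable behavior; they are not how Elasticsearch implements it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SuggestionRanking {
    /** A hypothetical (text, weight) pair standing in for one suggestion option. */
    static final class Option {
        final String text;
        final int weight;

        Option(String text, int weight) {
            this.text = text;
            this.weight = weight;
        }
    }

    /** Sorts by weight (highest first), optionally drops repeated texts, and keeps at most size entries. */
    static List<String> top(List<Option> options, int size, boolean skipDuplicates) {
        List<Option> sorted = new ArrayList<>(options);
        sorted.sort(Comparator.comparingInt((Option o) -> o.weight).reversed());
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Option o : sorted) {
            if (out.size() >= size) {
                break;
            }
            if (skipDuplicates && !seen.add(o.text)) {
                continue; // this text was already emitted with a higher weight
            }
            out.add(o.text);
        }
        return out;
    }
}
```

With skip_duplicates enabled, two documents that store the same input text yield a single suggestion, which is usually what a search box wants.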
Tests:
Chinese input: the keywords 亚, 亚克, 亚克力, and 亚克力盒 all return completions as expected. The response for 亚克力:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"song-suggest" : [
{
"text" : "亚克力",
"offset" : 0,
"length" : 3,
"options" : [
{
"text" : "亚克力盒",
"_index" : "hot_word",
"_type" : "_doc",
"_id" : "2",
"_score" : 10.0,
"_source" : {
"suggest" : {
"input" : [
"亚克力盒"
],
"weight" : 10
}
}
},
{
"text" : "压克力盒",
"_index" : "hot_word",
"_type" : "_doc",
"_id" : "1",
"_score" : 10.0,
"_source" : {
"suggest" : {
"input" : [
"压克力盒"
],
"weight" : 10
}
}
}
]
}
]
}
}
Pinyin input: ya, yake, and yakelihe all return completions as expected. The response for yakelihe:
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"song-suggest" : [
{
"text" : "yakelihe",
"offset" : 0,
"length" : 8,
"options" : [
{
"text" : "亚克力盒",
"_index" : "hot_word",
"_type" : "_doc",
"_id" : "2",
"_score" : 10.0,
"_source" : {
"suggest" : {
"input" : [
"亚克力盒"
],
"weight" : 10
}
}
},
{
"text" : "压克力盒",
"_index" : "hot_word",
"_type" : "_doc",
"_id" : "1",
"_score" : 10.0,
"_source" : {
"suggest" : {
"input" : [
"压克力盒"
],
"weight" : 10
}
}
}
]
}
]
}
}
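First-letter input: y, yk, ykl, and yklh all return completions as expected. First-letter completion requires keep_separate_first_letter to be set to true in the pinyin tokenizer.
Homophone input: 鸭 and 鸭课 return completions as expected.
Simplified/traditional input: 刻模机 returns both the simplified and the traditional entries as expected.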
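These results follow from the fact that both the indexed input and the typed prefix pass through the pinyin analyzer, so the comparison effectively happens on pinyin rather than on the original characters. A minimal sketch of that idea, assuming a tiny hand-written character table (hypothetical; the real plugin ships a full dictionary and emits separate tokens rather than one concatenated string). The characters are written as Unicode escapes so the file compiles under any source encoding: 亚 (U+4E9A), 压 (U+538B), 鸭 (U+9E2D), 克 (U+514B), 课 (U+8BFE), 力 (U+529B), 盒 (U+76D2).

```java
import java.util.HashMap;
import java.util.Map;

final class PinyinPrefixDemo {
    // Tiny hypothetical character-to-pinyin table, just enough for the sample documents.
    private static final Map<Character, String> PINYIN = new HashMap<>();
    static {
        PINYIN.put('\u4e9a', "ya"); // U+4E9A
        PINYIN.put('\u538b', "ya"); // U+538B
        PINYIN.put('\u9e2d', "ya"); // U+9E2D
        PINYIN.put('\u514b', "ke"); // U+514B
        PINYIN.put('\u8bfe', "ke"); // U+8BFE
        PINYIN.put('\u529b', "li"); // U+529B
        PINYIN.put('\u76d2', "he"); // U+76D2
    }

    /** Maps each known character to its pinyin; other characters (e.g. Latin letters) pass through. */
    static String toPinyin(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            String p = PINYIN.get(c);
            sb.append(p != null ? p : String.valueOf(c));
        }
        return sb.toString();
    }

    /** Compares input and candidate after both have been converted to pinyin. */
    static boolean matches(String input, String candidate) {
        return toPinyin(candidate).startsWith(toPinyin(input));
    }
}
```

In this sketch, `matches("\u9e2d\u8bfe", "\u4e9a\u514b\u529b\u76d2")` compares "yake" against "yakelihe", which is why a homophone such as 鸭课 can still complete to 亚克力盒.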
The completion suggester also supports fuzzy queries, which means you can still get results even if the search input contains a typo.
GET hot_word/_search?pretty
{
"_source": ["suggest"],
"suggest": {
"song-suggest": {
"prefix": "关键词",
"completion": {
"field": "suggest",
"size": 10,
"skip_duplicates": true,
"fuzzy": {
"fuzziness": 2,
"min_length":4,
"prefix_length":4,
"transpositions": false,
"unicode_aware":true
}
}
}
}
}
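Parameters of the fuzzy option:
| Parameter | Description |
|---|---|
| fuzziness | The fuzziness factor, i.e. the maximum edit distance. Defaults to AUTO. |
| transpositions | If true, swapping two adjacent characters counts as one change instead of two. Defaults to true. |
| min_length | Minimum length of the input before fuzzy suggestions are returned. Defaults to 3. |
| prefix_length | Minimum length of the input that is not checked for fuzzy alternatives. Defaults to 1. |
| unicode_aware | If true, all measurements (edit distance, transpositions, lengths) are taken in Unicode code points instead of bytes. This is slightly slower, so it defaults to false. |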
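The interplay of fuzziness, min_length, and prefix_length can be sketched with a plain edit-distance check. This is a simplified model under stated assumptions: the hypothetical `FuzzyPrefix` class works on whole strings in UTF-16 units and ignores transpositions (matching "transpositions": false in the query), whereas Elasticsearch applies fuzziness to the analyzed form of the input.

```java
final class FuzzyPrefix {
    /**
     * Returns true if some prefix of candidate is within fuzziness edits of input.
     * Inputs shorter than minLength must match exactly, and the first prefixLength
     * characters of the input are never fuzzed.
     */
    static boolean matches(String input, String candidate, int fuzziness, int minLength, int prefixLength) {
        if (input.length() < minLength) {
            return candidate.startsWith(input);
        }
        int fixed = Math.min(prefixLength, input.length());
        if (!candidate.regionMatches(0, input, 0, fixed)) {
            return false;
        }
        int n = input.length();
        int m = candidate.length();
        // prev[j] = edit distance between the processed part of input and candidate[0..j)
        int[] prev = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= n; i++) {
            int[] cur = new int[m + 1];
            cur[0] = i;
            for (int j = 1; j <= m; j++) {
                int cost = input.charAt(i - 1) == candidate.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        // The input only has to match a prefix of the candidate, so take the best column.
        int best = Integer.MAX_VALUE;
        for (int d : prev) {
            best = Math.min(best, d);
        }
        return best <= fuzziness;
    }
}
```

With the settings from the query (fuzziness 2, min_length 4, prefix_length 4), an input such as "yakeai" still matches "yakelihe" because its first four characters agree and the remainder is within the allowed edit distance, while "yaxeai" is rejected at the fixed prefix.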
Test with the input 亚可爱; completions are returned as expected:
{
"took" : 44,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"song-suggest" : [
{
"text" : "亚可爱",
"offset" : 0,
"length" : 3,
"options" : [
{
"text" : "亚克力盒",
"_index" : "hot_word",
"_type" : "_doc",
"_id" : "2",
"_score" : 40.0,
"_source" : {
"suggest" : {
"input" : [
"亚克力盒"
],
"weight" : 10
}
}
},
{
"text" : "压克力盒",
"_index" : "hot_word",
"_type" : "_doc",
"_id" : "1",
"_score" : 40.0,
"_source" : {
"suggest" : {
"input" : [
"压克力盒"
],
"weight" : 10
}
}
}
]
}
]
}
}
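Implementation with easy-es (https://www.easy-es.cn/). With this framework it is advisable to manage the index manually, because automatic index management is not yet very stable.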
<dependency>
    <groupId>org.dromara.easy-es</groupId>
    <artifactId>easy-es-boot-starter</artifactId>
    <version>v2.0.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.14.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.14.0</version>
</dependency>
/**
 * Auto-completion field
 */
@IndexField(fieldType = FieldType.NESTED, nestedClass = SearchHotWordV2IDX.HotSuggest.class)
private SearchHotWordV2IDX.HotSuggest suggest;

@Data
public static class HotSuggest {
    @IndexField(fieldType = FieldType.INTEGER)
    private Integer weight;
    @IndexField(fieldType = FieldType.KEYWORD)
    private List<String> input;
}
private static final String HOT_KEYWORD_SUGGEST_NAME = "hot-suggest";

public List<String> autoComplete(String keyword, Boolean fuzzy) {
    LambdaEsQueryWrapper<SearchHotWordV2IDX> queryWrapper = new LambdaEsQueryWrapper<>();
    queryWrapper.select(SearchHotWordV2IDX::getSuggest);
    // Lowercase the keyword and keep at most its first 10 characters; reject null input
    String newKeyword = Optional.ofNullable(keyword)
            .map(String::toLowerCase)
            .map(k -> StrUtil.sub(k, 0, 10))
            .orElseThrow(() -> new IllegalArgumentException("keyword must not be null"));
    // Suggest builder
    SuggestBuilder suggestBuilder = new SuggestBuilder();
    // Completion suggestion builder for the "suggest" field
    CompletionSuggestionBuilder completionSuggestionBuilder = new CompletionSuggestionBuilder("suggest")
            .size(20)
            .skipDuplicates(true);
    if (Boolean.TRUE.equals(fuzzy)) { // null-safe: a null flag means no fuzzy matching
        // Enable fuzzy matching
        FuzzyOptions fuzzyOptions = FuzzyOptions.builder()
                .setFuzziness(2)
                .setFuzzyMinLength(4)
                .setFuzzyPrefixLength(4).build();
        completionSuggestionBuilder.prefix(newKeyword, fuzzyOptions);
    } else {
        completionSuggestionBuilder.prefix(newKeyword);
    }
    // Register the completion suggestion with the suggest builder
    suggestBuilder.addSuggestion(HOT_KEYWORD_SUGGEST_NAME, completionSuggestionBuilder);
    SearchSourceBuilder searchSourceBuilder = searchHotWordV2EsMapper.getSearchSourceBuilder(queryWrapper);
    // Attach the suggest builder to the search source
    searchSourceBuilder.suggest(suggestBuilder);
    // Hand the customized search source back to the query wrapper
    queryWrapper.setSearchSourceBuilder(searchSourceBuilder);
    // Execute the search
    SearchResponse response = searchHotWordV2EsMapper.search(queryWrapper);
    Suggest suggest = response.getSuggest();
    if (suggest == null) {
        return Collections.emptyList();
    }
    // Extract the completion results; return an empty list rather than null when there are none
    CompletionSuggestion completionSuggestion = suggest.getSuggestion(HOT_KEYWORD_SUGGEST_NAME);
    if (completionSuggestion == null) {
        return Collections.emptyList();
    }
    return completionSuggestion.getOptions().stream()
            .map(option -> option.getText().string())
            .collect(Collectors.toList());
}
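The keyword normalization at the top of autoComplete (lowercase, then keep the first 10 characters) can also be written with the JDK alone. The sketch below is one possible variant, not the author's code: the `KeywordNormalizer` class is hypothetical, and it counts Unicode code points so that a supplementary character is never cut in half.

```java
import java.util.Locale;

final class KeywordNormalizer {
    /** Lowercases the keyword and keeps at most maxCodePoints code points. */
    static String normalize(String keyword, int maxCodePoints) {
        if (keyword == null) {
            throw new IllegalArgumentException("keyword must not be null");
        }
        // Locale.ROOT keeps the result independent of the server's default locale
        String lower = keyword.toLowerCase(Locale.ROOT);
        if (lower.codePointCount(0, lower.length()) <= maxCodePoints) {
            return lower;
        }
        // offsetByCodePoints yields the char index right after maxCodePoints code points
        return lower.substring(0, lower.offsetByCodePoints(0, maxCodePoints));
    }
}
```

Truncating the keyword keeps very long pastes from being sent as a prefix, and 10 characters stays well under the field's max_input_length of 50.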