Elasticsearch Operations

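1. Introduction to the inverted index


Inverted index: each article is split into terms (tokenized), and an index entry is built for every term.
Indexing whole article bodies this way can make the index grow explosively, so the terms are indexed against titles, and the titles are in turn linked to the articles, as shown below:

| Term (index key) | Postings: (article, <positions>, occurrence count)     |
| ---------------- | ------------------------------------------------------ |
| today            | (article 1, <2, 10>, 2) (article 3, <8>, 1)            |
| Sunday           | (article 2, <12, 25, 100>, 3)                          |
| go out to play   | (article 5, <11, 24, 89>, 3) (article 1, <8, 19>, 2)   |
 
The entry for "today" records which articles contain the term, the positions where it appears, and how many times it appears.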
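
The table above can be sketched in a few lines of Python. This is only an illustration of the data structure, not how Lucene stores it: the `build_inverted_index` helper, the toy whitespace-style analyzer and the sample articles are all assumptions made for this example.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Toy analyzer (assumption): lowercase, then split on non-word characters.
        tokens = re.findall(r"\w+", text.lower())
        for position, token in enumerate(tokens):
            index[token].setdefault(doc_id, []).append(position)
    return index


# Hypothetical articles, not taken from the table above.
docs = {
    "article1": "today we go out and today we play",
    "article2": "sunday is quiet",
    "article3": "what a day today",
}
index = build_inverted_index(docs)

# Each posting answers: which article, at which positions, how many times.
for doc_id, positions in index["today"].items():
    print(doc_id, positions, len(positions))
```

Looking up a term is then a single dictionary access, which is why a search never has to scan the article texts themselves.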

2. Index operations (an index is roughly a database)

2.1 Create an index


PUT ymq
{
  "settings": {
    "index":{
      "number_of_shards":5,
      "number_of_replicas":1
    }
  }
}

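2.2 View index settings


# Settings of a single index
GET ymq/_settings
# Settings of all indices
GET _all/_settings
# Settings of specific indices
GET ymq,ymq2/_settings
# Settings of all indices (same result as _all)
GET _settings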

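2.3 Modify index settings (rarely needed; mostly used to change the replica count)


# Set the replica count to 2. The number of primary shards is fixed when the
# index is created, so it has to be chosen up front.
# The replica count can be changed later, although the request can fail,
# for example when the index carries a read-only block.
PUT ymq/_settings
{
  "number_of_replicas": 2
}
 
# Clear the read_only_allow_delete block on all indices (Elasticsearch sets it
# when a node's disk usage passes the flood-stage watermark).
PUT _all/_settings
{
  "index": {
    "blocks": {
      "read_only_allow_delete": false
    }
  }
}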

2.4 Delete an index

DELETE ymq

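3. Mapping management (a mapping type is roughly a table)

3.1 Introduction

Indices created in Elasticsearch 6.0.0 or later contain only a single mapping type.

Indices created in 5.x with multiple mapping types keep working as before in Elasticsearch 6.x. Specifying types is deprecated in the 7.x APIs, and mapping types are removed entirely in 8.0.

Note: if an index has not been created, indexing a document into it creates the index automatically.

3.2 Create a mapping (type, table)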


PUT books
{
  "mappings": {
    "properties":{
      "title":{
        "type":"text"
      },
      "price":{
        "type":"integer"
      },
      "addr":{
        "type":"keyword"
      },
      "company":{
        "properties":{
          "name":{"type":"text"},
          "company_addr":{"type":"text"},
          "employee_count":{"type":"integer"}
        }
      },
      "publish_date":{"type":"date","format":"yyyy-MM-dd"}
      
    }
    
  }
}

3.3 View mappings


GET books/_mapping
GET _all/_mapping

3.4 Special case: a document can be indexed even when neither the index nor the mapping exists


PUT ymq2/_doc/1
{
  "title":"白雪公主和十个小矮人",
  "price":"99",
  "addr":"黑暗森里",
  "publish_date":"2018-05-19",
  "name":"ymq"
}
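
When no mapping exists, Elasticsearch infers one from the first document (dynamic mapping). The sketch below imitates the default rules in plain Python to show what the request above would produce. It is a simplification under stated assumptions: `guess_field_type` is a hypothetical helper, date detection is reduced to the `yyyy-MM-dd` shape, and the `keyword` sub-field that real dynamic mapping adds to strings is left out.

```python
import re


def guess_field_type(value):
    """Rough imitation of default dynamic mapping for a single JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        # Simplified date detection (assumption): only the yyyy-MM-dd shape.
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            return "date"
        return "text"
    raise TypeError(f"unsupported value: {value!r}")


# Same field names as the document indexed into ymq2; values shortened.
doc = {
    "title": "Snow White",
    "price": "99",
    "addr": "dark forest",
    "publish_date": "2018-05-19",
    "name": "ymq",
}
inferred = {field: guess_field_type(value) for field, value in doc.items()}
print(inferred)
```

Note that `price` is sent as the string "99", so it is inferred as text rather than a number; numeric detection for strings is off by default. Defining the mapping explicitly, as in 3.2, avoids that kind of surprise.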

4. Basic document CRUD (a document is roughly a row)

4.1 Index a document


PUT books/_doc/1
{
  "title":"大头儿子小偷爸爸",
  "price":100,  
  "addr":"北京天安门",
  "company":{
    "name":"我爱北京天安门",
    "company_addr":"我的家在东北松花江傻姑娘",
    "employee_count":10
  },
  "publish_date":"2019-08-19"
}
 
PUT books/_doc/2
{
  "title":"白雪公主和十个小矮人",
  "price":"99", 
  "addr":"黑暗森里",
  "publish_date":"2018-05-19"
}
 
PUT books/_doc/3
{
  "title":"白雪公主和十个小矮人",
  "price":"99", 
  "addr":"黑暗森里",
  "publish_date":"2018-05-19",
   "name":"lqz"
}

4.2 Get a document


 
# Format: index name/default type name (_doc)/id
GET books/_doc/1

4.3 Two ways to modify a document

4.3.1 Full replacement (not recommended; the whole document is overwritten)


PUT lqz/_doc/1
{
  "name":"顾老二",
  "age":30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

4.3.2 Partial update


# Only the fields listed under "doc" change; the rest of the document is kept.
POST lqz/_update/1
{
  "doc": {
    "desc": "皮肤很黄，武器很长，性格很直",
    "tags": ["很黄","很长", "很直"]
  }
}
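
The difference between the two approaches can be pictured with ordinary dictionaries. This is a minimal sketch, assuming a plain dict as a stand-in for the index; `index_document` and `update_document` are hypothetical helpers, and the merge here only handles top-level fields, whereas the real update API also merges nested objects.

```python
def index_document(store, doc_id, body):
    """Full replacement, like PUT index/_doc/id: the old document is discarded."""
    store[doc_id] = dict(body)


def update_document(store, doc_id, partial):
    """Partial update: merge the given fields into the existing document."""
    store[doc_id] = {**store[doc_id], **partial}


store = {}
index_document(store, 1, {"name": "gu", "age": 30, "tags": ["a", "b"]})

update_document(store, 1, {"tags": ["c"]})
print(store[1])  # name and age survive; tags is replaced as a whole

index_document(store, 1, {"tags": ["d"]})
print(store[1])  # only tags is left, name and age are gone
```

This is why full replacement is discouraged for small changes: any field that is not resent is lost.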

4.4 Delete a document

DELETE lqz/_doc/4

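5. Querying documents

5.1 Difference between term and match

5.1.1 Introduction

term: an exact-match query. The search text is not analyzed (tokenized) before the lookup, so it must be exactly one of the tokens that were produced when the document was indexed.

match: the search text is analyzed first, and each resulting token is then matched against the index. Compared with the exact lookup done by term, match is an analyzed full-text search.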
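
The distinction can be simulated without a cluster. The sketch below is an assumption-laden toy, not the Elasticsearch implementation: `analyze` stands in for the standard analyzer (lowercase, split on non-word characters), and `term_query` and `match_query` are hypothetical helpers that ignore scoring.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def term_query(docs, field, value):
    """Exact lookup: the value is NOT analyzed before it is compared with the tokens."""
    return [doc_id for doc_id, doc in docs.items() if value in analyze(doc[field])]


def match_query(docs, field, text):
    """Analyze the query text first, then match documents containing ANY token."""
    wanted = set(analyze(text))
    return [doc_id for doc_id, doc in docs.items() if wanted & set(analyze(doc[field]))]


docs = {
    1: {"desc": "beautiful cat"},
    2: {"desc": "beautiful dog"},
    3: {"desc": "dog"},
}

print(term_query(docs, "desc", "beautiful"))       # [1, 2]
print(term_query(docs, "desc", "beautiful dog"))   # []
print(match_query(docs, "desc", "beautiful dog"))  # [1, 2, 3]
```

The second call returns nothing because "beautiful dog" as a whole is not a token in any document, while match splits it into two tokens and accepts a hit on either one.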

5.1.2 Create the index and mapping (without the IK analyzer) and insert data


# Create the index and its mapping
PUT lqz
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  },
  "mappings": {
    "properties":{
      "title":{
        "type":"text"
      },
      "desc":{
        "type":"text"
      },
      "price":{
        "type":"integer"
      },
      "addr":{
        "type":"keyword"
      },
      "company":{
        "properties":{
          "name":{"type":"text"},
          "company_addr":{"type":"text"},
          "employee_count":{"type":"integer"}
        }
      },
      "publish_date":{"type":"date","format":"yyyy-MM-dd"}
      
    }
    
  }
}
 
# Insert data
 
PUT lqz/_doc/1
{
  "title":"so beautiful zero",
  "price":100,  
  "addr":"北京天安门",
  "desc":"beautiful cat",
  "company":{
    "name":"我爱北京天安门",
    "company_addr":"我的家在东北松花江傻姑娘",
    "employee_count":10
  },
  "publish_date":"2019-08-19"
}
 
PUT lqz/_doc/2
{
  "title":"so beautiful one",
  "price":200,  
  "addr":"北京天安门",
  "desc":"beautiful dog",
  "company":{
    "name":"我爱北京天安门",
    "company_addr":"我的家在东北松花江傻姑娘",
    "employee_count":10
  },
  "publish_date":"2019-08-19"
}
 
 
PUT lqz/_doc/3
{
  "title":"so beautiful tow",
  "price":698,  
  "addr":"北京天安门",
  "desc":"dog",
  "company":{
    "name":"我爱北京天安门",
    "company_addr":"我的家在东北松花江傻姑娘",
    "employee_count":10
  },
  "publish_date":"2019-08-19"
}

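5.2 term

5.2.1 term and terms

term: the search text is not analyzed; the query looks up the given term as-is.

terms: looks up several terms at once; a document matches if it contains any of them.


# term does not analyze the search text
GET lqz/_search
{
  "query": {
    "term": {
      "desc": "beautiful"
    }
  }
}
# terms does not analyze either; use it to look up several exact terms at once
GET lqz/_search
{
  "query": {
    "terms": {
      "title": ["beautiful", "so"]
    }
  }
}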

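5.3 match

5.3.1 match, match_all and match_phrase

match: analyzes the search text and, by default, returns documents that contain at least one of the resulting terms.

match_all: matches every document in the index.

match_phrase: phrase query. All terms of the phrase must appear in the field, adjacent to one another and in the same order as in the phrase.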
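
To make the match_phrase rule concrete, here is a minimal Python sketch of phrase matching with no gaps allowed between terms. The `analyze` and `match_phrase` helpers are assumptions for illustration and do not model scoring or the `slop` option; the titles are the ones used in the sample data of 5.1.2.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match_phrase(docs, field, phrase):
    """Return ids whose field contains the phrase tokens consecutively and in order."""
    wanted = analyze(phrase)
    hits = []
    for doc_id, doc in docs.items():
        tokens = analyze(doc[field])
        # Slide a window of len(wanted) over the document tokens.
        if any(tokens[i:i + len(wanted)] == wanted
               for i in range(len(tokens) - len(wanted) + 1)):
            hits.append(doc_id)
    return hits


docs = {
    1: {"title": "so beautiful zero"},
    2: {"title": "so beautiful one"},
    3: {"title": "so beautiful tow"},
}
print(match_phrase(docs, "title", "so beautiful"))   # [1, 2, 3]
print(match_phrase(docs, "title", "beautiful so"))   # []
print(match_phrase(docs, "title", "beautiful tow"))  # [3]
```

A plain match query for "beautiful tow" would return all three documents, because each of them contains "beautiful"; the phrase query keeps only the one where both terms appear side by side.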


# match analyzes the search text; match_all takes no search text and returns every document
GET lqz/_search
    {
      "query": {
        "match_all": {}
      }
    }
  
# The text is analyzed into "beautiful" and "tow"; a document containing either term matches
GET lqz/_search
    {
      "query": {
        "match": {
          "title": "beautiful tow"
        }
      }
    }

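5.4 Sorting

Not every field can be sorted on. Numeric, date and keyword fields can; analyzed text fields cannot be sorted on by default.


# Sorting
# 1. Plain query
GET lqz/_search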
{
  "query": {
    "match": {
      "addr": "北京天安门"
    }
  }
}
 
# 2. Descending
GET lqz/_search
{
  "query": {
    "match": {
      "addr": "北京天安门"
    }
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}
 
# 3. Ascending
GET lqz/_search
{
  "query": {
    "match": {
      "addr": "北京天安门"
    }
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ]
}
 
# 4. match_all + ascending
GET lqz/_search
{
  "query": {
    "match_all": {
    }
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ]
}

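5.5 Pagination

Every top-level key of the request body (query, sort, from, size) is optional and pluggable; separate them with commas.


# Pagination
# from is a zero-based offset: skip the first two hits, then return at most two
 
GET lqz/_search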
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ], 
  "from": 2,
  "size": 2
}
 
 
 
 
### Note: the same pagination works without sort, because the top-level keys are independent of each other and only separated by commas
GET lqz/_search
{
  "query": {
    "match_all": {}
  }, 
  "from": 2,
  "size": 2
}
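
The meaning of from and size maps directly onto a list slice. A minimal sketch, assuming a hypothetical `paginate` helper and a hit list that is already sorted; the defaults of 0 and 10 mirror the Elasticsearch defaults.

```python
def paginate(hits, from_=0, size=10):
    """Skip the first `from_` hits and return at most `size` of the rest."""
    return hits[from_:from_ + size]


# Prices of the three sample documents, already sorted in descending order.
prices = [698, 200, 100]

print(paginate(prices, from_=0, size=2))  # [698, 200]
print(paginate(prices, from_=2, size=2))  # [100]
print(paginate(prices, from_=3, size=2))  # []
```

With only three documents in the index, `"from": 2, "size": 2` therefore returns a single hit: the first two are skipped and only one is left.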

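5.6 Bool queries

must: AND; every clause must match, like AND in a relational database.
should: OR; like OR in a relational database.
must_not: NOT; like NOT in a relational database.
filter: clauses that must match but do not affect the relevance score.
range: restricts a field to a range of values.
gt: greater than, like > in a relational database.
gte: greater than or equal to, like >= in a relational database.
lt: less than, like < in a relational database.
lte: less than or equal to, like <= in a relational database.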
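
How these clauses combine can be sketched as plain boolean logic in Python. This is an illustrative model only: `bool_query` and `in_range` are hypothetical helpers, every clause is reduced to a predicate, and relevance scoring (the part where must and filter actually differ) is not modeled.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Evaluate a range clause; bounds left as None are ignored."""
    return ((gt is None or value > gt) and (gte is None or value >= gte)
            and (lt is None or value < lt) and (lte is None or value <= lte))


def bool_query(docs, must=(), should=(), must_not=(), filters=()):
    """Each clause is a predicate doc -> bool; returns the matching documents."""
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in (*must, *filters)):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        # With no must/filter clauses, at least one should clause has to match.
        if should and not (must or filters) and not any(c(doc) for c in should):
            continue
        hits.append(doc)
    return hits


docs = [
    {"id": 1, "desc": "beautiful cat", "price": 100},
    {"id": 2, "desc": "beautiful dog", "price": 200},
    {"id": 3, "desc": "dog", "price": 698},
]

cheap = bool_query(
    docs,
    must=[lambda d: "beautiful" in d["desc"].split()],
    filters=[lambda d: in_range(d["price"], lt=200)],
)
print([d["id"] for d in cheap])  # [1]

either = bool_query(
    docs,
    should=[lambda d: "cat" in d["desc"].split(), lambda d: d["price"] > 500],
)
print([d["id"] for d in either])  # [1, 3]
```

When should appears next to must or filter, its clauses become optional and in Elasticsearch only raise the score of documents that match them, which is why the sketch enforces should only when it stands alone.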


## Bool query with should (OR condition)
GET lqz/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "addr": "北京天安门"
          }
        },
        {
          "match": {
            "desc": "beautiful"
          }
        }
      ]
    }
  }
}
 
 
 
 
 
### must_not: exclude every document that matches any of the clauses
GET lqz/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "addr": "北京天安门"
          }
        },
        {
          "match": {
            "desc": "beautiful"
          }
        },
        {
          "match": {
            "price": 698
          }
        }
      ]
    }
  }
}
 
 
 
 
### filter with a range condition (gt, lt, gte, lte)
GET lqz/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "addr": "北京天安门"
          }
        }
      ],
      "filter": {
        "range": {
          "price": {
            "lt": 200
          }
        }
      }
    }
  }
}
 
 
### Range query with a lower and an upper bound
GET lqz/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "addr": "北京天安门"
          }
        }
      ],
      "filter": {
        "range": {
          "price": {
            "gte": 100,
            "lte": 150
          }
        }
      }
    }
  }
}

5.7 Filtering the returned fields


 
### Basic usage: return only the listed fields of each hit
GET lqz/_search
{
  "query": {
    "match_all": {
      }
  },
  "_source":["name","age"]
}
 
 
#### _source sits at the same level as query
 
GET lqz/_search
{
  "query": {
    "bool": {
      "must":{
        "match":{"from":"gu"}
      },
      
      "filter": {
        "range": {
          "age": {
            "lte": 25
          }
        }
      }
    }
  },
  "_source":["name","age"]
}

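5.8 Highlighting


# Only fields that the query actually matched are highlighted, so the query and
# the highlight block both target "from"; a query on price would return no
# highlights for "from". The tags wrap each matched term; <em> and </em> are
# also the defaults when pre_tags/post_tags are omitted.
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "highlight": {
    "pre_tags": "<em>",
    "post_tags": "</em>",
    "fields": {
      "from": {}
    }
  }
}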

5.9 Aggregations


 
# sum, avg, max, min
 
# Average age; SQL equivalent: select avg(age) as my_avg from <table> where from = 'gu';
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_avg": {
      "avg": {
        "field": "age"
      }
    }
  },
  "_source": ["name", "age"]
}
 
# Maximum age
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_max": {
      "max": {
        "field": "age"
      }
    }
  },
  "_source": ["name", "age"]
}
 
# Minimum age
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_min": {
      "min": {
        "field": "age"
      }
    }
  },
  "_source": ["name", "age"]
}
 
# Sum of ages
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_sum": {
      "sum": {
        "field": "age"
      }
    }
  },
  "_source": ["name", "age"]
}
 
 
 
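# Grouping (bucketing)
 
 
# Bucket everyone by age into 15-20, 20-25 and 25-30, and compute the average age of each bucket.
# Each range includes its "from" value and excludes its "to" value.
GET lqz/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_group": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 15,
            "to": 20
          },
          {
            "from": 20,
            "to": 25
          },
          {
            "from": 25,
            "to": 30
          }
        ]
      },
      "aggs": {
        "my_avg": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}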
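
What a range aggregation with a per-bucket average computes can be sketched in plain Python. The `range_avg` helper, the bucket key format and the sample ages are assumptions made for this example; the one rule taken from Elasticsearch is that each bucket includes its lower bound and excludes its upper bound.

```python
def range_avg(docs, field, ranges):
    """Bucket docs by [from, to) and return {bucket_key: (doc_count, average or None)}."""
    result = {}
    for lower, upper in ranges:
        values = [d[field] for d in docs if lower <= d[field] < upper]
        avg = sum(values) / len(values) if values else None
        result[f"{lower}-{upper}"] = (len(values), avg)
    return result


# Hypothetical ages; not taken from the article's data.
people = [{"age": 18}, {"age": 20}, {"age": 22}, {"age": 25}, {"age": 29}, {"age": 30}]

buckets = range_avg(people, "age", [(15, 20), (20, 25), (25, 30)])
for key, (count, avg) in buckets.items():
    print(key, count, avg)
```

Someone aged exactly 20 lands in the 20-25 bucket rather than 15-20, and someone aged 30 falls outside all three ranges.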

Original article: https://blog.csdn.net/qq_52385631/article/details/126374769