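- 1 Inverted index: each article is tokenized, and an index entry is built for every term.
- Indexing every term directly against full articles makes the index balloon, so the index can be layered: terms are indexed against titles, and titles are then indexed against articles, as follows:
-
- | Term (index key) | Postings: (article, <positions>, frequency) |
- | --- | --- |
- | 今天 (today) | (article 1, <2,10>, 2) (article 3, <8>, 1) |
- | 星期天 (Sunday) | (article 2, <12,25,100>, 3) |
- | 出去玩 (go out to play) | (article 5, <11,24,89>, 3) (article 1, <8,19>, 2) |
-
- Each posting records which article the term appears in, the positions where it appears, and how many times it appears. For example, 今天 appears in article 1 at positions 2 and 10, twice in total.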
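The posting structure described above can be sketched in a few lines of Python. This is only an illustration, not how Lucene stores its index: the whitespace tokenizer, the sample sentences, and the name `build_inverted_index` are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} (positions are zero-based)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Assumption: lowercase + whitespace split stands in for a real analyzer.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)


# Hypothetical sample documents.
docs = {
    1: "today we go out to play today",
    2: "sunday is a rest day",
    3: "what to do today",
}

index = build_inverted_index(docs)

# A posting answers: which document, at which positions, how many times.
for doc_id, positions in index["today"].items():
    print(doc_id, positions, len(positions))
```

Looking up a term is a single dictionary access, which is why searching by term does not require scanning every article.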
- # Create an index with 5 primary shards and 1 replica of each shard
- PUT ymq
- {
- "settings": {
- "index":{
- "number_of_shards":5,
- "number_of_replicas":1
- }
- }
- }
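- # View the settings of a single index
- GET ymq/_settings
- # View the settings of all indices
- GET _all/_settings
- # View the settings of specific indices
- GET ymq,ymq2/_settings
- # View the settings of all indices (same as _all)
- GET _settings
- # Change the replica count to 2. The number of primary shards must be decided when the index is created.
- # The replica count can be changed afterwards (the request may fail)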
- PUT ymq/_settings
- {
- "number_of_replicas": 2
- }
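Why the primary shard count is fixed while the replica count is not can be sketched as follows. In its simplest documented form, Elasticsearch picks a document's primary shard as a hash of the routing value (the document id by default) modulo the number of primary shards. In this sketch `zlib.crc32` is only a stand-in for the real hash function, and both function names are hypothetical.

```python
import zlib


def total_shard_copies(primaries, replicas):
    """Every primary has `replicas` copies, so the index holds primaries * (1 + replicas) shards."""
    return primaries * (1 + replicas)


def shard_for(routing, primaries):
    """Pick a primary shard as hash(routing) % primaries.

    Assumption: zlib.crc32 replaces the hash Elasticsearch actually uses.
    """
    return zlib.crc32(routing.encode("utf-8")) % primaries


print(total_shard_copies(5, 1))  # settings used when ymq was created
print(total_shard_copies(5, 2))  # after raising number_of_replicas to 2

# Changing the number of primaries changes where existing ids would be looked up.
moved = [doc_id for doc_id in map(str, range(100))
         if shard_for(doc_id, 5) != shard_for(doc_id, 6)]
print(len(moved), "of 100 ids would map to a different shard")
```

Replicas are whole copies of a primary, so adding or removing them never changes which shard a document belongs to.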
-
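- # Clear the read_only_allow_delete block on all indices. Elasticsearch sets this block when disk usage
- # crosses the flood-stage watermark, and writes to the index fail while it is set.
- PUT _all/_settings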
- {
- "index": {
- "blocks": {
- "read_only_allow_delete": false
- }
- }
- }
- # Delete the index
- DELETE ymq
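Indices created in Elasticsearch 6.0.0 or later can contain only one mapping type.
Indices created in 5.x with multiple mapping types continue to work as before in Elasticsearch 6.x. Mapping types are deprecated in Elasticsearch 7.x, where the APIs become typeless (the mappings below have no type name and documents are addressed through `_doc`), and they are removed completely in 8.0.
## If an index has not been created, indexing a document into it creates the index automatically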
- # Create the books index with an explicit mapping
- PUT books
- {
- "mappings": {
- "properties":{
- "title":{
- "type":"text"
- },
- "price":{
- "type":"integer"
- },
- "addr":{
- "type":"keyword"
- },
- "company":{
- "properties":{
- "name":{"type":"text"},
- "company_addr":{"type":"text"},
- "employee_count":{"type":"integer"}
- }
- },
- "publish_date":{"type":"date","format":"yyyy-MM-dd"}
-
- }
-
- }
- }
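- # View the mapping of one index, then of all indices
- GET books/_mapping
- GET _all/_mapping
- # If ymq2 does not exist yet, indexing this document creates it with a dynamically generated mapping
- PUT ymq2/_doc/1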
- {
- "title":"白雪公主和十个小矮人",
- "price":"99",
- "addr":"黑暗森里",
- "publish_date":"2018-05-19",
- "name":"ymq"
- }
- PUT books/_doc/1
- {
- "title":"大头儿子小偷爸爸",
- "price":100,
- "addr":"北京天安门",
- "company":{
- "name":"我爱北京天安门",
- "company_addr":"我的家在东北松花江傻姑娘",
- "employee_count":10
- },
- "publish_date":"2019-08-19"
- }
-
- PUT books/_doc/2
- {
- "title":"白雪公主和十个小矮人",
- "price":"99",
- "addr":"黑暗森里",
- "publish_date":"2018-05-19"
- }
-
- PUT books/_doc/3
- {
- "title":"白雪公主和十个小矮人",
- "price":"99",
- "addr":"黑暗森里",
- "publish_date":"2018-05-19",
- "name":"lqz"
- }
-
-
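- # Format: index name/type name (_doc by default)/document id
- GET books/_doc/1
- # Index a document with id 1 (this also creates the lqz index if it does not exist)
- PUT lqz/_doc/1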
- {
- "name":"顾老二",
- "age":30,
- "from": "gu",
- "desc": "皮肤黑、武器长、性格直",
- "tags": ["黑", "长", "直"]
- }
- # Partial update: only the fields listed under doc are changed (typeless endpoint, 7.x and later)
- POST lqz/_update/1
- {
- "doc": {
- "desc": "皮肤很safasdfsda黄,武器很长,性格很直",
- "tags": ["很黄","很长", "很直"]
- }
- }
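- # Delete a document by id
- DELETE lqz/_doc/4
term: an exact match. The search text is not analyzed before searching, so it must be identical to one of the terms produced when the document was indexed.
match: the search text is analyzed first, and each resulting term is then matched against the index. Where term is an exact lookup, match is an analyzed full-text search.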
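The difference between the two can be sketched in Python. The `analyze` function below is a very rough stand-in for an analyzer (lowercase, split on non-word characters), and `term_query` and `match_query` are hypothetical names for this illustration, not Elasticsearch APIs.

```python
import re


def analyze(text):
    """Assumed analyzer: lowercase the text and split it on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def term_query(indexed_terms, value):
    # The value is used as-is: it must equal one indexed term exactly.
    return value in indexed_terms


def match_query(indexed_terms, text):
    # The text is analyzed first; any resulting term may match (OR semantics).
    return any(token in indexed_terms for token in analyze(text))


# Terms stored for a document whose field value is "So Beautiful Cat".
indexed_terms = set(analyze("So Beautiful Cat"))

print(term_query(indexed_terms, "beautiful"))      # exact indexed term
print(term_query(indexed_terms, "Beautiful"))      # wrong case, not analyzed
print(term_query(indexed_terms, "beautiful cat"))  # two words, not a single term
print(match_query(indexed_terms, "Beautiful dog")) # analyzed to beautiful, dog
```

This is why a term query against an analyzed text field should use the lowercase single-word form that the analyzer produced.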
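- # Create the index together with its mapping
- # (PUT fails if lqz already exists from the earlier examples; run DELETE lqz first in that case)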
- PUT lqz
- {
- "settings": {
- "number_of_shards": 5,
- "number_of_replicas": 2
- },
- "mappings": {
- "properties":{
- "title":{
- "type":"text"
- },
- "desc":{
- "type":"text"
- },
- "price":{
- "type":"integer"
- },
- "addr":{
- "type":"keyword"
- },
- "company":{
- "properties":{
- "name":{"type":"text"},
- "company_addr":{"type":"text"},
- "employee_count":{"type":"integer"}
- }
- },
- "publish_date":{"type":"date","format":"yyyy-MM-dd"}
-
- }
-
- }
- }
-
- # Index some documents
-
- PUT lqz/_doc/1
- {
- "title":"so beautiful zero",
- "price":100,
- "addr":"北京天安门",
- "desc":"beautiful cat",
- "company":{
- "name":"我爱北京天安门",
- "company_addr":"我的家在东北松花江傻姑娘",
- "employee_count":10
- },
- "publish_date":"2019-08-19"
- }
-
- PUT lqz/_doc/2
- {
- "title":"so beautiful one",
- "price":200,
- "addr":"北京天安门",
- "desc":"beautiful dog",
- "company":{
- "name":"我爱北京天安门",
- "company_addr":"我的家在东北松花江傻姑娘",
- "employee_count":10
- },
- "publish_date":"2019-08-19"
- }
-
-
- PUT lqz/_doc/3
- {
- "title":"so beautiful tow",
- "price":698,
- "addr":"北京天安门",
- "desc":"dog",
- "company":{
- "name":"我爱北京天安门",
- "company_addr":"我的家在东北松花江傻姑娘",
- "employee_count":10
- },
- "publish_date":"2019-08-19"
- }
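term: the search value is not analyzed; documents are matched against exactly that term.
terms: like term, but matches any one of several exact terms.
- # term: the search value is not analyzed
- GET lqz/_search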
- {
- "query": {
- "term": {
- "desc": "beautiful"
- }
- }
- }
- # terms: because the values are not analyzed, list several exact terms to match any one of them
- GET lqz/_search
- {
- "query": {
- "terms": {
- "title": ["beautiful", "so"]
- }
- }
- }
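match: a loose full-text match. The search text is analyzed, and a document matches if it contains any one of the resulting terms.
match_all: matches every document in the index.
match_phrase: phrase match. Every term must be present, and the terms must appear in the same order as in the given phrase.
- # match_all: return every document
- GET lqz/_search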
- {
- "query": {
- "match_all": {}
- }
- }
-
- # match: the search text is analyzed, so this finds documents whose title contains beautiful or tow
- GET lqz/_search
- {
- "query": {
- "match": {
- "title": "beautiful tow"
- }
- }
- }
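The requests above cover match_all and match but not match_phrase. Its rule (all terms present, adjacent, in order) can be sketched in Python as below. The whitespace `analyze` helper and the function name `match_phrase` are assumptions for the illustration, and the sketch models only the default behaviour with no slop.

```python
def analyze(text):
    """Assumed analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match_phrase(doc_text, phrase):
    """True when the analyzed phrase occurs as a contiguous, ordered run in the document."""
    doc_tokens = analyze(doc_text)
    phrase_tokens = analyze(phrase)
    width = len(phrase_tokens)
    if width == 0:
        return False
    return any(
        doc_tokens[start:start + width] == phrase_tokens
        for start in range(len(doc_tokens) - width + 1)
    )


title = "so beautiful tow"
print(match_phrase(title, "beautiful tow"))  # adjacent and in order
print(match_phrase(title, "tow beautiful"))  # right terms, wrong order
print(match_phrase(title, "so tow"))         # right order, not adjacent
```

A match query for the same text would accept all three inputs, because it only needs one of the terms to be present.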
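Not every field can be sorted on. Numeric, date, and keyword fields can; text fields cannot by default.
- # Sorting
- # 1. A plain query without sorting
- GET lqz/_search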
- {
- "query": {
- "match": {
- "addr": "北京天安门"
- }
- }
- }
-
- # 2. Descending by price
- GET lqz/_search
- {
- "query": {
- "match": {
- "addr": "北京天安门"
- }
- },
- "sort": [
- {
- "price": {
- "order": "desc"
- }
- }
- ]
- }
-
- # 3. Ascending by price
- GET lqz/_search
- {
- "query": {
- "match": {
- "addr": "北京天安门"
- }
- },
- "sort": [
- {
- "price": {
- "order": "asc"
- }
- }
- ]
- }
-
- # 4. match_all with ascending sort
- GET lqz/_search
- {
- "query": {
- "match_all": {
- }
- },
- "sort": [
- {
- "price": {
- "order": "asc"
- }
- }
- ]
- }
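All parts of a search request (query, sort, from, size, and so on) are pluggable: each one is optional, and they sit side by side in the request body, separated by commas.
- # Pagination
- # from is a zero-based offset and size is the page size: skip the first 2 hits and return the next 2
-
- GET lqz/_search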
- {
- "query": {
- "match_all": {}
- },
- "sort": [
- {
- "price": {
- "order": "desc"
- }
- }
- ],
- "from": 2,
- "size": 2
- }
-
-
-
-
- ### Note: in `elasticsearch` every clause is pluggable, and clauses are separated by commas. The same pagination without sort:
- GET lqz/_search
- {
- "query": {
- "match_all": {}
- },
- "from": 2,
- "size": 2
- }
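The meaning of from and size can be sketched as list slicing in Python. The helper names `paginate` and `page_to_from` are hypothetical; the defaults of 0 and 10 mirror the Elasticsearch defaults for from and size.

```python
def paginate(hits, from_=0, size=10):
    """Skip `from_` hits, then return at most `size` hits."""
    return hits[from_:from_ + size]


def page_to_from(page, size):
    """Convert a 1-based page number into a zero-based from offset."""
    return (page - 1) * size


# Prices of the three sample documents, already sorted in descending order.
prices_desc = [698, 200, 100]

print(paginate(prices_desc, from_=0, size=2))  # first page of two
print(paginate(prices_desc, from_=2, size=2))  # only one hit is left
print(page_to_from(3, 10))                     # offset of page 3 with 10 per page
```

With only three documents, from 2 and size 2 returns a single hit, because the offset skips the first two.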
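must: AND. Every clause must match, like and in a relational database.
should: OR. Like or in a relational database.
must_not: NOT. No clause may match, like not in a relational database.
filter: a filter condition. Clauses must match but do not affect the relevance score.
range: restricts a field to a range of values.
gt: greater than, like > in a relational database.
gte: greater than or equal to, like >= in a relational database.
lt: less than, like < in a relational database.
lte: less than or equal to, like <= in a relational database.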
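How these clauses combine can be sketched as plain boolean logic in Python. This is a simplification that ignores relevance scoring; `bool_query`, `equals`, and `in_range` are hypothetical helpers, and the three documents are reduced copies of the sample data.

```python
import operator

# The four range operators and their comparison functions.
RANGE_OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}


def in_range(field, **bounds):
    """Predicate for a range clause, e.g. in_range("price", gte=100, lte=150)."""
    def predicate(doc):
        return all(RANGE_OPS[name](doc[field], limit) for name, limit in bounds.items())
    return predicate


def equals(field, value):
    """Predicate for an exact match on one field."""
    return lambda doc: doc.get(field) == value


def bool_query(doc, must=(), should=(), must_not=(), filter_=()):
    if not all(clause(doc) for clause in must):      # AND
        return False
    if not all(clause(doc) for clause in filter_):   # AND, without scoring
        return False
    if any(clause(doc) for clause in must_not):      # NOT
        return False
    # should is only required when there is no must or filter clause.
    if should and not must and not filter_:
        return any(clause(doc) for clause in should)  # OR
    return True


docs = [
    {"id": 1, "addr": "北京天安门", "price": 100},
    {"id": 2, "addr": "北京天安门", "price": 200},
    {"id": 3, "addr": "北京天安门", "price": 698},
]

addr = equals("addr", "北京天安门")
cheap = [d["id"] for d in docs if bool_query(d, must=[addr], filter_=[in_range("price", lt=200)])]
bounded = [d["id"] for d in docs if bool_query(d, must=[addr], filter_=[in_range("price", gte=100, lte=200)])]
not_698 = [d["id"] for d in docs if bool_query(d, must_not=[equals("price", 698)])]
print(cheap, bounded, not_698)
```

The strict bound lt 200 excludes the document priced exactly 200, while lte 200 keeps it.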
- ## Bool query: should (OR)
- GET lqz/_search
- {
- "query": {
- "bool": {
- "should": [
- {
- "match": {
- "addr": "北京天安门"
- }
- },
- {
- "match": {
- "desc": "beautiful"
- }
- }
- ]
- }
- }
- }
-
-
-
-
-
- ### must_not: none of the clauses may match
- GET lqz/_search
- {
- "query": {
- "bool": {
- "must_not": [
- {
- "match": {
- "addr": "北京天安门"
- }
- },
- {
- "match": {
- "desc": "beautiful"
- }
- },
- {
- "match": {
- "price": 698
- }
- }
- ]
- }
- }
- }
-
-
-
-
- ### filter with comparison conditions: gt, lt, gte, lte
- GET lqz/_search
- {
- "query": {
- "bool": {
- "must": [
- {
- "match": {
- "addr": "北京天安门"
- }
- }
- ],
- "filter": {
- "range": {
- "price": {
- "lt": 200
- }
- }
- }
- }
- }
- }
-
-
- ### Range query with a lower and an upper bound
- GET lqz/_search
- {
- "query": {
- "bool": {
- "must": [
- {
- "match": {
- "addr": "北京天安门"
- }
- }
- ],
- "filter": {
- "range": {
- "price": {
- "gte": 100,
- "lte": 150
- }
- }
- }
- }
- }
- }
-
- ### Basic usage: _source selects which fields are returned
- GET lqz/_search
- {
- "query": {
- "match_all": {
- }
- },
- "_source":["name","age"]
- }
-
-
- #### _source sits at the same level as query
-
- GET lqz/_search
- {
- "query": {
- "bool": {
- "must":{
- "match":{"from":"gu"}
- },
-
- "filter": {
- "range": {
- "age": {
- "lte": 25
- }
- }
- }
- }
- },
- "_source":["name","age"]
- }
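The effect of listing field names in _source can be sketched as a projection over the stored document. `filter_source` is a hypothetical helper that handles only plain top-level field names, not wildcard patterns or nested paths.

```python
def filter_source(source, includes):
    """Keep only the listed top-level fields; names missing from the document are skipped."""
    return {field: source[field] for field in includes if field in source}


# The document indexed into lqz earlier in these notes.
doc = {
    "name": "顾老二",
    "age": 30,
    "from": "gu",
    "desc": "皮肤黑、武器长、性格直",
    "tags": ["黑", "长", "直"],
}

print(filter_source(doc, ["name", "age"]))
```

The query decides which documents are hits, and _source only trims what is returned for each hit.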
-
-
-
-
-
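- # Highlighting: matched terms in the listed fields are wrapped in pre_tags and post_tags.
- # By default only fields that the query matched are highlighted, so the query targets the highlighted field.
- GET lqz/_search
- {
- "query": {
- "match": {
- "from": "gu"
- }
- },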
- "highlight": {
- "pre_tags": "",
- "post_tags": "",
- "fields": {
- "from": {}
- }
- }
- }
-
- # Aggregations: sum, avg, max, min
-
- # Average age. In SQL: select avg(age) as my_avg from table where from = 'gu';
- GET lqz/_search
- {
- "query": {
- "match": {
- "from": "gu"
- }
- },
- "aggs": {
- "my_avg": {
- "avg": {
- "field": "age"
- }
- }
- },
- "_source": ["name", "age"]
- }
-
- # Maximum age
- GET lqz/_search
- {
- "query": {
- "match": {
- "from": "gu"
- }
- },
- "aggs": {
- "my_max": {
- "max": {
- "field": "age"
- }
- }
- },
- "_source": ["name", "age"]
- }
-
- # Minimum age
- GET lqz/_search
- {
- "query": {
- "match": {
- "from": "gu"
- }
- },
- "aggs": {
- "my_min": {
- "min": {
- "field": "age"
- }
- }
- },
- "_source": ["name", "age"]
- }
-
- # Sum of ages
- GET lqz/_search
- {
- "query": {
- "match": {
- "from": "gu"
- }
- },
- "aggs": {
- "my_sum": {
- "sum": {
- "field": "age"
- }
- }
- },
- "_source": ["name", "age"]
- }
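The four metric aggregations above follow the same pattern: the query selects the hits, then one number is computed over a field of those hits. A minimal Python sketch, with hypothetical sample people and a hypothetical `metric_aggs` helper:

```python
def metric_aggs(docs, field):
    """Compute sum, avg, max, and min of `field` over the given documents."""
    values = [doc[field] for doc in docs if field in doc]
    if not values:
        # Assumption for this sketch: report None when there is nothing to aggregate.
        return {"sum": None, "avg": None, "max": None, "min": None}
    return {
        "sum": sum(values),
        "avg": sum(values) / len(values),
        "max": max(values),
        "min": min(values),
    }


# Hypothetical data; only the field names follow the notes.
people = [
    {"name": "a", "age": 30, "from": "gu"},
    {"name": "b", "age": 22, "from": "gu"},
    {"name": "c", "age": 18, "from": "sheng"},
]

# Equivalent of the match on from: only these hits are aggregated.
hits = [person for person in people if person["from"] == "gu"]
result = metric_aggs(hits, "age")
print(result)
```

The person whose from is not gu is excluded before aggregating, which is the role of the where clause in the SQL comparison.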
-
-
-
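- # Grouping
-
-
- # Bucket everyone by age range (`15~20,20~25,25~30`) and compute the average age of each bucket.
- # Each range includes from and excludes to; the nested aggs computes the average per bucket.
- GET lqz/_search
- {
- "size": 0,
- "query": {
- "match_all": {}
- },
- "aggs": {
- "age_group": {
- "range": {
- "field": "age",
- "ranges": [
- {
- "from": 15,
- "to": 20
- },
- {
- "from": 20,
- "to": 25
- },
- {
- "from": 25,
- "to": 30
- }
- ]
- },
- "aggs": {
- "my_avg": {
- "avg": {
- "field": "age"
- }
- }
- }
- }
- }
- }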
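The bucketing rule of the range aggregation can be sketched in Python. Each range includes its from value and excludes its to value. The ages are hypothetical, and both the `range_buckets` helper and its key format are choices made for this sketch, not the response format of Elasticsearch.

```python
def range_buckets(values, ranges):
    """Group values into [low, high) buckets and report count and average per bucket."""
    buckets = {}
    for low, high in ranges:
        members = [value for value in values if low <= value < high]
        average = sum(members) / len(members) if members else None
        buckets[f"{low}-{high}"] = {"doc_count": len(members), "avg": average}
    return buckets


# Hypothetical ages; 30 sits exactly on the upper bound of the last range.
ages = [18, 22, 25, 29, 30]
buckets = range_buckets(ages, [(15, 20), (20, 25), (25, 30)])

for key, bucket in buckets.items():
    print(key, bucket)
```

Age 25 lands in the 25-30 bucket rather than 20-25, and age 30 lands in no bucket at all, so a further range starting at 30 is needed to count it.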