• Notes on using Elasticsearch 7.17.6 from Python


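    Operating system: Windows 10

    I started with the latest release, 8.4.1, but Kibana would not start, so I switched to the older 7.17.6 release.

    Downloads:

    Elasticsearch 7.17.6 download page

    Kibana 7.17.6 download page

    Unzip both to a suitable location, then edit elasticsearch.yml

    and add the following settings: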

    cluster.name: robin-es
    node.name: node-1
    network.host: 0.0.0.0
    http.port: 9200
    cluster.initial_master_nodes: ["node-1"]

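    Then edit kibana.yml:

    i18n.locale: "zh-CN"

    Start the two services by running the .bat file in each bin directory (elasticsearch.bat and kibana.bat),

    then open http://localhost:9200 in a browser.

    If JSON like the following comes back, Elasticsearch is running: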

    {
      "name" : "node-1",
      "cluster_name" : "robin-es",
      "cluster_uuid" : "pAvuRyRESuCHtbTnfdWrvA",
      "version" : {
        "number" : "7.17.6",
        "build_flavor" : "default",
        "build_type" : "zip",
        "build_hash" : "f65e9d338dc1d07b642e14a27f338990148ee5b6",
        "build_date" : "2022-08-23T11:08:48.893373482Z",
        "build_snapshot" : false,
        "lucene_version" : "8.11.1",
        "minimum_wire_compatibility_version" : "6.8.0",
        "minimum_index_compatibility_version" : "6.0.0-beta1"
      },
      "tagline" : "You Know, for Search"
    }
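
    The same check can be scripted instead of done in a browser. Below is a minimal sketch using only the standard library. The helper names parse_cluster_info and fetch_cluster_info are made up for illustration, and the sketch assumes the node is reachable at http://localhost:9200 without authentication.

```python
import json
from urllib.request import urlopen


def parse_cluster_info(text):
    """Pull the fields of interest out of the root-endpoint JSON."""
    info = json.loads(text)
    return {
        "name": info["name"],
        "cluster_name": info["cluster_name"],
        "version": info["version"]["number"],
    }


def fetch_cluster_info(url="http://localhost:9200", timeout=5):
    """GET the root endpoint and parse the reply (needs a running node)."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_cluster_info(resp.read().decode("utf-8"))
```

    Against the setup described here, fetch_cluster_info() should report the node name, cluster name and version number seen in the JSON above.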

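    To use Elasticsearch from Python, install the client library. I install from a PyPI mirror in China (Tsinghua) and go through a local proxy.

    Note: use the client version that matches the server, otherwise the two may be incompatible.

    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ elasticsearch==7.17.6 --proxy="http://127.0.0.1:1081"
    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ "elasticsearch[async]==7.17.6" --proxy="http://127.0.0.1:1081"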
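
    To catch a version mismatch early, the client version can be compared with the version the server reports. The sketch below is only an illustration: parse_version and same_major are hypothetical helpers, and treating "same major version" as the compatibility rule is a simplifying assumption, not an official guarantee.

```python
def parse_version(text):
    """Turn '7.17.6' into (7, 17, 6); a suffix such as '-SNAPSHOT' is ignored."""
    core = text.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def same_major(client_version, server_version):
    """True when client and server share the same major version number."""
    return parse_version(client_version)[0] == parse_version(server_version)[0]
```

    The server version is the "number" field of the JSON returned by http://localhost:9200; the client version string should be available as elasticsearch.__versionstr__ once the package is installed.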

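    Connect to Elasticsearch:

    from elasticsearch import Elasticsearch

    indexName = "student"
    client = Elasticsearch(
        ['127.0.0.1:9200'],
        # Sniff the cluster nodes before doing anything else
        # sniff_on_start=True,
        # When a node stops responding, refresh the node list and reconnect
        sniff_on_connection_fail=True,
        # Refresh the node list every 60 seconds
        sniffer_timeout=60
    )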
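
    With sniff_on_start left commented out, constructing the client does not by itself show that the node is reachable. A small sketch of a readiness check built on the client's ping() method, which returns True or False; wait_until_up is a hypothetical helper name, and the retry count and delay are arbitrary.

```python
import time


def wait_until_up(client, retries=5, delay=1.0):
    """Call client.ping() until it returns True or the retries run out."""
    for attempt in range(retries):
        if client.ping():
            return True
        # Sleep between attempts, but not after the last one
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

    Calling wait_until_up(client) right after creating the client gives an early, explicit failure instead of an exception in the first real request.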

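    Next, write a few functions for creating, deleting, updating and querying.

    Note: from client version 7.15 onward the calling style changed. Request fields are passed as keyword arguments (for example mappings=, document=, query=), and the old style of passing everything in body= is deprecated.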

    # Uses the official elasticsearch client; mind the client/server version
    from elasticsearch import Elasticsearch
    import json


    # es 7.17
    # Check whether an index exists
    def checkIndexByName(client, indexName):
        try:
            return bool(client.indices.exists(index=indexName))
        except Exception as ex:
            print(ex)
            return False


    # Create an index from a dict with "mappings" and optional "settings"
    def createIndex(client, name, doc):
        try:
            client.indices.create(
                index=name,
                settings=doc.get("settings"),
                mappings=doc["mappings"],
            )
            return True
        except Exception as ex:
            print(ex)
            return False


    # Delete an index
    def dropIndex(client, name):
        try:
            client.indices.delete(index=name)
            return True
        except Exception as ex:
            print(ex)
            return False


    # Add a document; adding the same id again overwrites the old document
    def addDoc(client, index, doc, id):
        try:
            resp = client.index(index=index, document=doc, id=id)
            print(resp['result'])
            return True
        except Exception as e:
            print("index document error: %s" % e)
            return False


    # Delete a document by id; returns '1' on success and '0' on failure
    def delDocFromIndex(client, index, id):
        try:
            res = client.delete(index=index, id=id)
            print(res['_shards']['successful'])
            return '1'
        except Exception as e:
            print(e)
            return '0'


    # Fetch a document by id; returns 'nil' if it cannot be read
    def findDocById(client, index, id):
        try:
            res = client.get(index=index, id=id)
            return res['_source']
        except Exception as e:
            print(e)
            return 'nil'
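
    The functions above replace a whole document when it is indexed again under the same id. For changing only some fields, a partial update is an alternative. The sketch below follows the same style; updateDoc is a hypothetical addition, and it assumes the 7.15+ client accepts the doc= keyword on update().

```python
def updateDoc(client, index, id, fields):
    """Merge `fields` into an existing document; returns True on success."""
    try:
        # Only the given fields are changed; other fields are kept as they are
        resp = client.update(index=index, id=id, doc=fields)
        print(resp['result'])
        return True
    except Exception as e:
        print(e)
        return False
```

    For example, updateDoc(client, "student", some_id, {"age": 23}) would change only the age field of that document.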

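    The parameters used when creating an index can be kept in an external configuration file.

    For example, to create an index for students, put the settings and mappings in a file named student.json: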

    {
      "settings": {
        "index": {
          "number_of_shards": 1,
          "number_of_replicas": 0
        }
      },
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "name": {
            "type": "keyword"
          },
          "age": {
            "type": "long"
          },
          "birthday": {
            "type": "date"
          },
          "tags": {
            "type": "keyword"
          }
        }
      }
    }
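
    With "dynamic": "strict", Elasticsearch rejects a document that contains a field not declared under "properties". A rough local pre-check can make such mistakes visible before indexing. This is only a sketch: unmapped_fields is a hypothetical helper, it looks at top-level fields only, and the example mapping and the "nickname" field are made up for illustration.

```python
def unmapped_fields(mappings, doc):
    """Return the top-level document fields missing from mappings['properties']."""
    known = mappings.get("properties", {})
    return sorted(field for field in doc if field not in known)


# Illustrative data, not taken from a real index
example_mappings = {
    "dynamic": "strict",
    "properties": {
        "name": {"type": "keyword"},
        "age": {"type": "long"},
        "birthday": {"type": "date"},
    },
}
example_doc = {"name": "灰太狼", "age": 22, "nickname": "x"}
```

    Here unmapped_fields(example_mappings, example_doc) reports "nickname", the one field that a strict mapping would refuse.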

    After that, the index is created like this:

    def load_json(filePath):
        # Read the JSON config file and parse it into a dict
        with open(filePath, 'r', encoding='utf-8') as f:
            return json.load(f)


    docMapping = load_json("./student.json")
    # print(docMapping)
    # dropIndex(client, indexName)
    ret = checkIndexByName(client, indexName)
    if not ret:
        print("\nindex exists = %s" % ret)
        createIndex(client, indexName, docMapping)

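    That is, the index is created only if it does not exist yet.

    The result can be checked in Kibana's Dev Tools.

    Next, add two records (documents) to try it out: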

    doc = {
        "name": "灰太狼",
        "age": 22,
        "birthday": "2000-02-02",
        "tags": ["男"]
    }
    res = addDoc(client, indexName, doc, 13810501001)
    # print(res)
    doc = {
        "name": "美羊羊",
        "age": 21,
        "birthday": "2000-01-01",
        "tags": ["女"]
    }
    res = addDoc(client, indexName, doc, 13810501002)
    # print(res)
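
    Adding documents one call at a time is fine for two records. For larger batches, the client's bulk helper is the usual route. The sketch below only builds the action list in the format that helper expects; build_bulk_actions is a hypothetical name.

```python
def build_bulk_actions(index, docs_by_id):
    """Build one action dict per document for elasticsearch.helpers.bulk."""
    return [
        {"_index": index, "_id": doc_id, "_source": doc}
        for doc_id, doc in docs_by_id.items()
    ]
```

    With the client from earlier, the list would be sent with "from elasticsearch import helpers" followed by helpers.bulk(client, build_bulk_actions(indexName, docs)), where docs is a dict mapping ids to documents.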

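    The documents can now be seen in Kibana.

    So far the basic create, delete, update and query operations are in place, but more complex queries are still needed: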

    bodyQueryAll = {
        "query": {
            "match_all": {}
        }
    }
    res = client.search(index=indexName, query=bodyQueryAll["query"])
    print("Found %d" % res['hits']['total']['value'])
    items = res["hits"]["hits"]
    # print(items)
    for item in items:
        print("index=%s, id=%s doc=%s" %
              (item['_index'], item['_id'], item['_source']))

    Output:

    Found 2
    index=student, id=13810501001 doc={'name': '灰太狼', 'age': 22, 'birthday': '2000-02-02', 'tags': ['男']}
    index=student, id=13810501002 doc={'name': '美羊羊', 'age': 21, 'birthday': '2000-01-01', 'tags': ['女']}

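    The same query can be run in Kibana.

    Now that the structure of the search response is known, the data we want can be extracted from it.

    Add two more query functions: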

    # Return (total hit count, list of hits) for every document in the index
    def queryAll(client, indexName):
        bodyQueryAll = {
            "query": {
                "match_all": {}
            }
        }
        res = client.search(index=indexName, query=bodyQueryAll["query"])
        n = res['hits']['total']['value']
        items = res["hits"]["hits"]
        return (n, items)


    # Return (total hit count, list of hits) for an arbitrary query dict
    def queryByDoc(client, indexName, query):
        res = client.search(index=indexName, query=query)
        n = res['hits']['total']['value']
        items = res["hits"]["hits"]
        return (n, items)
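
    Since each hit carries the stored document under '_source' and its id under '_id', the hits can be flattened into plain dicts for further processing. A minimal sketch; hits_to_docs is a hypothetical helper that takes the raw response of client.search().

```python
def hits_to_docs(res):
    """Flatten a search response into a list of documents with their ids."""
    docs = []
    for hit in res["hits"]["hits"]:
        doc = dict(hit["_source"])  # copy so the response is left untouched
        doc["_id"] = hit["_id"]
        docs.append(doc)
    return docs
```

    Note that a search returns at most 10 hits by default, so the list can be shorter than the total count unless a larger size is requested.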

    Test code:

    print("Query all:")
    res = queryAll(client, indexName)
    n = res[0]
    items = res[1]
    # print(items)
    for item in items:
        print("index=%s, id=%s doc=%s" %
              (item['_index'], item['_id'], item['_source']))

    # Match documents whose name is either of the two values
    queryNames = {
        "bool": {
            "should": [
                {"match": {"name": "美羊羊"}},
                {"match": {"name": "喜羊羊"}}
            ]
        }
    }
    print("Query by name:")
    res = queryByDoc(client, indexName, queryNames)
    n = res[0]
    items = res[1]
    # print(items)
    for item in items:
        print("index=%s, id=%s doc=%s" %
              (item['_index'], item['_id'], item['_source']))

    Output:

    Query all:
    index=student, id=13810501001 doc={'name': '灰太狼', 'age': 22, 'birthday': '2000-02-02', 'tags': ['男']}
    index=student, id=13810501002 doc={'name': '美羊羊', 'age': 21, 'birthday': '2000-01-01', 'tags': ['女']}
    Query by name:
    index=student, id=13810501002 doc={'name': '美羊羊', 'age': 21, 'birthday': '2000-01-01', 'tags': ['女']}
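
    The bool/should query used in the test can also be generated from a list of names rather than written out by hand. A small sketch; names_query is a hypothetical helper, and its result is meant to be passed as the query argument of queryByDoc.

```python
def names_query(names):
    """Build a bool/should query that matches any of the given names."""
    return {
        "bool": {
            "should": [{"match": {"name": name}} for name in names]
        }
    }
```

    For instance, queryByDoc(client, indexName, names_query(["美羊羊", "喜羊羊"])) sends the same query as the hand-written dict in the test code.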

    The same query can be checked in Kibana.

     


  • Original article: https://blog.csdn.net/robinfoxnan/article/details/126768161