• Vector storage and similarity search in Elasticsearch 8 from .NET with Elastic.Clients.Elasticsearch


    I. Test environment

    Elastic.Clients.Elasticsearch version: 8.13.0
    Elasticsearch version: 8.13.0

    II. Code

    1. Create an index that contains a DenseVector field
    public static bool InitIndex()
    {
        // Field mappings. The names are camelCase so that they match the field
        // names written by the client's default source serializer and the names
        // used by the kNN query in step 3 ("embedding", "fileName").
        var faceVectorproperties = new Properties
            {
                { "id" ,new KeywordProperty()},
                { "fileID" ,new KeywordProperty()},
                { "fileGUID" ,new KeywordProperty()},
                { "resourceID" ,new KeywordProperty()},
                { "fileName" ,new TextProperty()},
                { "embedding" ,new DenseVectorProperty{Dims = 3 } }
            };
        // Index settings and mappings
        var indexConfig = new IndexState
        {
            Settings = new IndexSettings
            {
                NumberOfShards = 1, // number of primary shards
                NumberOfReplicas = 1 // number of replicas
            },
            Mappings = new TypeMapping
            {
                Properties = faceVectorproperties
            }
        };
        // Check whether the index already exists.
        // ElasticIndexEnum.FaceVector is assumed to be an all-lowercase string
        // constant (for example "facevector"); Elasticsearch rejects index names
        // that contain uppercase letters, so a literal "FaceVector" would fail.
        var existFaceVectorIndexResponse = _client.Indices.ExistsAsync(ElasticIndexEnum.FaceVector).Result;
        if (!existFaceVectorIndexResponse.Exists)
        {
            // Build the create-index request
            var createIndexRequest = new CreateIndexRequest(ElasticIndexEnum.FaceVector)
            {
                Settings = indexConfig.Settings,
                Mappings = indexConfig.Mappings
            };
            var createFaceVectorIndexResponse = _client.Indices.CreateAsync(createIndexRequest).Result;
            if (createFaceVectorIndexResponse.Acknowledged)
            {
                // Add one test document
                ES_FaceVector temp = new ES_FaceVector
                {
                    FileID = 0,
                    FileGUID = Guid.NewGuid(),
                    ResourceID = 0,
                    FileName = "test",
                    Embedding = new float[] {1.2f,1.1f,1.3f }
                };
                var addDocResult = AddDoc<ES_FaceVector>(temp, ElasticIndexEnum.FaceVector);
            }
            else
            {
                return false;
            }
        }
        return true;
    }
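The article never shows the `ES_FaceVector` class, so here is a minimal sketch of what it might look like. The property types are assumptions inferred from the test document above, and `FieldNameDemo` is a hypothetical helper. It serializes a document with System.Text.Json's camelCase naming policy, which is, as far as I know, what the client's default source serializer applies. The output shows why the mapping keys and the query field names are written as `fileName` and `embedding` rather than `FileName` and `Embedding`.

```csharp
using System;
using System.Text.Json;

// Hypothetical document type: the original post does not include its definition.
public class ES_FaceVector
{
    public string Id { get; set; } = Guid.NewGuid().ToString("N");
    public long FileID { get; set; }
    public Guid FileGUID { get; set; }
    public long ResourceID { get; set; }
    public string FileName { get; set; } = "";
    public float[] Embedding { get; set; } = Array.Empty<float>();
}

public static class FieldNameDemo
{
    // Serializes a document with the camelCase naming policy so the resulting
    // JSON keys can be compared with the keys used in the index mapping.
    public static string ToCamelCaseJson(ES_FaceVector doc)
    {
        var options = new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase
        };
        return JsonSerializer.Serialize(doc, options);
    }
}
```

Calling `FieldNameDemo.ToCamelCaseJson(...)` on a document produces keys such as `fileID`, `fileName` and `embedding`. If the mapping declared `Embedding` instead, the indexed `embedding` field would not pick up the `dense_vector` mapping and the kNN query in step 3 could not use it.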
    
    
    2. Index documents
    // Index a batch of documents with the bulk API
    public static bool AddDocs<T>(List<T> data, string indexName) where T : class
    {
        var bulkIndexResponse = _client.BulkAsync(b => b
            .Index(indexName)
            .IndexMany(data)
        ).Result;
        return bulkIndexResponse.IsValidResponse;
    }
    // Index a single document
    public static bool AddDoc<T>(T data, string indexName) where T : class
    {
        var response = _client.IndexAsync(data, indexName).Result;
        return response.IsValidResponse;
    }
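Elasticsearch rejects a document whose vector length differs from the `dims` declared in the mapping (3 in this article), so it can be worth filtering the data before calling `AddDocs`. The sketch below is one way to do that. `EmbeddingGuard` and `KeepValid` are hypothetical names, not part of the client library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmbeddingGuard
{
    // Returns only the items whose vector is non-null, has exactly `dims`
    // components, and contains no NaN or infinite values.
    public static List<T> KeepValid<T>(IEnumerable<T> items, Func<T, float[]> getVector, int dims)
    {
        return items.Where(item =>
        {
            var v = getVector(item);
            return v != null
                && v.Length == dims
                && v.All(x => !float.IsNaN(x) && !float.IsInfinity(x));
        }).ToList();
    }
}
```

With the document type used in this article, a call would look like `AddDocs(EmbeddingGuard.KeepValid(docs, d => d.Embedding, 3), indexName)`.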
    
    
    3. Approximate kNN search on the vector field
    public static void SearchKnn()
    {
        // Build the kNN query. The query vector must have the same number of
        // dimensions as the mapped field (3 here).
        var doubleArr = new[] { -0.04604065, 0.054946236, 0.057453074};
        var knnQuery = new KnnQuery()
        {
            k = 2, // number of nearest neighbours to return
            NumCandidates = 1000, // candidates examined on each shard
            Field = "embedding",
            QueryVector= doubleArr.Select(s=>(float)s).ToArray() // needs using System.Linq
        };
        // Build the search request
        var searchRequest = new SearchRequest<ES_FaceVector>(ElasticIndexEnum.FaceVector)
        {
            Knn = new KnnQuery[] { knnQuery },
            MinScore = 0.90, // drop hits whose _score is below 0.90
            SourceIncludes = new [] { "fileName", "embedding" }
        };
    
        var searchResponse = _client.Search<ES_FaceVector>(searchRequest);
        if (searchResponse.IsValidResponse)
        {
            foreach (var hit in searchResponse.Hits)
            {
                // Process each hit
                var fileNameTemp = hit.Source.FileName;
                var embeddingTemp = hit.Source.Embedding;
                
            }
        }
        else
        {
            Console.WriteLine($"Error: {searchResponse.DebugInformation}");
        }
    }
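A note on what `MinScore = 0.90` means here. The mapping above does not set `similarity`, so the field should use cosine similarity, for which Elasticsearch documents the kNN `_score` as `(1 + cosine) / 2`. The sketch below reproduces that arithmetic locally so a threshold can be sanity-checked before running a search. `KnnScore` is a hypothetical helper, and the formula only applies to the cosine setting.

```csharp
using System;

public static class KnnScore
{
    // Plain cosine similarity between two vectors of equal length.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        if (normA == 0 || normB == 0)
            throw new ArgumentException("Cosine similarity is undefined for a zero vector.");

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Maps a cosine similarity in [-1, 1] to the kNN _score in [0, 1].
    public static double CosineToScore(double cosine) => (1.0 + cosine) / 2.0;

    // Inverse mapping: the cosine similarity that corresponds to a given _score.
    public static double ScoreToCosine(double score) => 2.0 * score - 1.0;
}
```

Under this formula a `MinScore` of 0.90 corresponds to a cosine similarity of about 0.80, so only fairly close neighbours survive the filter. Orthogonal vectors score 0.5 and identical directions score 1.0.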
    
    

    III. References

    Connecting to Elasticsearch 8 from .NET using Elastic.Clients.Elasticsearch

    https://www.elastic.co/guide/en/elasticsearch/client/net-api/8.13/connecting.html

    https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

    https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html


  • Original article: https://blog.csdn.net/willingtolove/article/details/138075817