• 基于Microsoft SemanticKernel和GPT4实现一个智能翻译服务


    今年.NET Conf China 2023技术大会,我给大家分享了 .NET应用国际化-AIGC智能翻译+代码生成的议题

    .NET Conf China 2023分享-.NET应用国际化-AIGC智能翻译+代码生成

    今天将详细的代码实现和大家分享一下。

    一、前提准备

    1. 新建一个Console类的Project

    2. 引用SK的Nuget包,SK的最新Nuget包

    dotnet add package Microsoft.SemanticKernel --version 1.4.0
    
        "Microsoft.SemanticKernel" Version="1.4.0" />
        "Newtonsoft.Json" Version="13.0.3" />
      

    3. 在Azure OpenAI Service中创建一个GPT4的服务,这个可能大家没有账号,那就先看代码如何实现吧

    部署好GPT4模型后,可以拿到以下三个重要的值

    Azure OpenAI Deployment Name
    Azure OpenAI Endpoint
    Azure OpenAI Key
    二、编写翻译使用的Prompt
    复制代码
     {{$input}}
    请将上面的输入翻译为英文,不要返回任何解释说明,
    请扮演一个美国电动汽车充电服务运营商(精通中文和英文),用户的输入数据是JSON格式,例如{"1":"充电站", "2":"充电桩"}, 
    如果不是JSON格式,请返回无效的输入。
    请使用以下专业术语进行翻译
     {
        "充电站":"Charging station",
        "电站":"Charging station",
        "场站":"Charging station",
        "充电桩":"Charging point",
        "充电终端":"Charging point",
        "终端":"Charging point",
        "电动汽车":"Electric Vehicle",
        "直流快充":"DC Fast Charger",
        "超级充电站":"Supercharger",
        "智能充电":"Smart Charging",
        "交流慢充":"AC Slow Charging"
    }
    翻译结果请以JSON格式返回,例如 {"1":"Charging station", "2":"Charging point"}
    复制代码

    类似的还有葡萄牙下的翻译Prompt

    复制代码
    {{$input}}
    请将上面的输入翻译为葡萄牙语,不要返回任何解释说明,请扮演一个巴西的电动汽车充电服务运营商(精通葡萄牙语、中文和英文)
    用户的输入数据是JSON格式,例如{"1":"充电站", "2":"充电桩"}, 如果不是JSON格式,请返回无效的输入
    请使用以下专业术语进行翻译
     {
      "充电站": "Estação de carregamento",
      "电站": "Estação de carregamento",
      "场站": "Estação de carregamento",
      "充电桩": "Ponto de carregamento",
      "充电终端": "Ponto de carregamento",
      "终端": "Ponto de carregamento",
      "电动汽车": "Veículo Elétrico",
      "直流快充": "Carregador Rápido DC",
      "超级充电站": "Supercharger",
      "智能充电": "Carregamento Inteligente",
      "交流慢充": "Carregamento AC Lento"
    }
    请以JSON格式返回,例如 {"1":"Estação de carregamento", "2":"Ponto de carregamento"}
    复制代码

    在项目工程下新建Plugins目录和TranslatePlugin子目录,同时新建Translator_en和Translator_pt等多个子目录

     config.json文件下的内容如下:

    复制代码
    {
        "schema": 1,
        "type": "completion",
        "description": "Translate.",
        "completion": {
             "max_tokens": 2000,
             "temperature": 0.5,
             "top_p": 0.0,
             "presence_penalty": 0.0,
             "frequency_penalty": 0.0
        },
        "input": {
             "parameters": [
                  {
                       "name": "input",
                       "description": "The user's input.",
                       "defaultValue": ""
                  }
             ]
        }
    }
    复制代码

    三、Translator翻译类,实现文本多语言翻译

    这个类主要实现将用户输入的文本(系统处理为JSON格式),翻译为指定的语言

    复制代码
    using System.Runtime.InteropServices;
    using Microsoft.SemanticKernel;
    using Newtonsoft.Json;

    namespace LLM_SK;
    public class Translator
    {
        Kernel kernel;
        public Translator(Kernel kernel)
        {
            this.kernel = kernel;
        }

        public IDictionary Translate(IDictionary textList, string language)
        {
            var pluginDirectory = Path.Combine(System.IO.Directory.GetCurrentDirectory(), "Plugins/TranslatePlugin");
            var plugin = kernel.CreatePluginFromPromptDirectory(pluginDirectory, "Translator_" + language + "");        

            var json = JsonConvert.SerializeObject(textList);      

            if (!string.IsNullOrEmpty(json))
            {
                var output = kernel.InvokeAsync(plugin["Translator_" + language + ""], new() { ["input"] = json }).Result.ToString();
                if (!string.IsNullOrWhiteSpace(output))
                {
                    Console.WriteLine(output);
                    return JsonConvert.DeserializeObject>(output);
                }
            }

            return new Dictionary();
        }
    }
    复制代码

    这个类中构造函数中接收传入的Kernel对象,这个Kernel对象是指

    Microsoft.SemanticKernel.Kernel  
    复制代码
    //
    // Summary:
    //     Provides state for use throughout a Semantic Kernel workload.
    //
    // Remarks:
    //     An instance of Microsoft.SemanticKernel.Kernel is passed through to every function
    //     invocation and service call throughout the system, providing to each the ability
    //     to access shared state and services.
    public sealed class Kernel
    复制代码

    暂且理解为调用各类大模型的Kernel核心类,基于这个Kernel实例对象完成大模型的调用和交互

    另外,上述代码中有个Prompt模板文件读取的操作。

            var pluginDirectory = Path.Combine(System.IO.Directory.GetCurrentDirectory(), "Plugins/TranslatePlugin");
            var plugin = kernel.CreatePluginFromPromptDirectory(pluginDirectory, "Translator_" + language + "");    

     从Plugins/TranslatePlugin目录下读取指定的KernelPlugin,例如Translator_en英语翻译插件和Translator_pt 葡萄牙翻译插件

     var output = kernel.InvokeAsync(plugin["Translator_" + language + ""], new() { ["input"] = json }).Result.ToString();

     调用KernelFunction方式实现GPT4大模型调用

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    //
       // Summary:
       //     Invokes the Microsoft.SemanticKernel.KernelFunction.
       //
       // Parameters:
       //   function:
       //     The Microsoft.SemanticKernel.KernelFunction to invoke.
       //
       //   arguments:
       //     The arguments to pass to the function's invocation, including any Microsoft.SemanticKernel.PromptExecutionSettings.
       //
       //
       //   cancellationToken:
       //     The System.Threading.CancellationToken to monitor for cancellation requests.
       //     The default is System.Threading.CancellationToken.None.
       //
       // Returns:
       //     The result of the function's execution.
       //
       // Exceptions:
       //   T:System.ArgumentNullException:
       //     function is null.
       //
       //   T:Microsoft.SemanticKernel.KernelFunctionCanceledException:
       //     The Microsoft.SemanticKernel.KernelFunction's invocation was canceled.
       //
       // Remarks:
       //     This behaves identically to invoking the specified function with this Microsoft.SemanticKernel.Kernel
       //     as its Microsoft.SemanticKernel.Kernel argument.
       public Task InvokeAsync(KernelFunction function, KernelArguments? arguments = null, CancellationToken cancellationToken = default(CancellationToken))
       {
           Verify.NotNull(function, "function");
           return function.InvokeAsync(this, arguments, cancellationToken);
       }

     继续封装GPT4TranslateService,构造Microsoft.SemanticKernel.Kernel 类实例。

    复制代码
    using System.Globalization;
    using Microsoft.SemanticKernel;
    
    namespace LLM_SK;
    public class GPT4TranslateService
    {    
        public IDictionary<int,string> Translate(IDictionary<int, string> texts, CultureInfo cultureInfo)
        {
            var kernel = BuildKernel();
            var translator = new Translator(kernel);
            return translator.Translate(texts, cultureInfo.TwoLetterISOLanguageName );
        }
    
        //私有方法,构造IKernel
        private Kernel BuildKernel()
        {
            var builder = Kernel.CreateBuilder();
            builder.AddAzureOpenAIChatCompletion(
                     "xxxxgpt4",                  // Azure OpenAI Deployment Name
                     "https://****.openai.azure.com/", // Azure OpenAI Endpoint
                     "***************");      // Azure OpenAI Key
    
            return  builder.Build();
       }
    }
    复制代码

    四、测试调用

    这里我们设计了2种语言,英语和葡萄牙的文本翻译

    复制代码
    var culture = new CultureInfo("en-US");
    var translator = new GPT4TranslateService();
    translator.Translate(new Dictionary<int, string>(){{ 1,"电站"}, {2,"终端不可用"},{3,"充电桩不可用"} ,
    {4,"场站"},{5,"充电站暂未运营" }},culture);
    
    culture = new CultureInfo("pt-BR");
    translator.Translate(new Dictionary<int, string>(){{ 1,"电站"}, {2,"终端不可用"},{3,"充电桩不可用"} ,
    {4,"场站"},{5,"充电站暂未运营" }},culture);
    复制代码

    输出的结果

    {"1":"Charging station","2":"Charging point unavailable","3":"Charging station unavailable","4":"Charging station","5":"Charging station not in operation yet"}
    {"1":"Estação de carregamento","2":"Ponto de carregamento não está disponível","3":"Ponto de carregamento não está disponível","4":"Estação de carregamento","5":"A estação de carregamento ainda não está em operação"}

    五、总结

    以上是基于SemanticKernel和GPT4实现一个智能翻译服务的Demo和框架,大家可以基于这个示例继续完善,增加更多动态的数据和API调用,例如将JSON数据写入数据库

    同时还可以记录翻译不稳定的异常,手工处理或者继续完善Prompt。

     

    周国庆

    2024/2/17

  • 相关阅读:
    漏电继电器 LLJ-630F φ100 导轨安装 分体式结构 LLJ-630H(S) AC
    [数据库]阿里云postgres数据库备份恢复
    2023最新SSM计算机毕业设计选题大全(附源码+LW)之java医院疫情管理系统4f9a9
    Day 47 MySQL Navcat、PyMYsql
    FlinkSQL -- joins----flink-1.13.6
    0/1背包问题
    MATLAB算法实战应用案例精讲-奇异值分解(SVD)(附matlab、python、Java、C++、R语言和C语言代码)
    keepalived的通信原理
    【C++】泛型算法(五)泛型算法的使用与设计
    open cv快速入门系列---数字图像基础
  • 原文地址:https://www.cnblogs.com/tianqing/p/18018050