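Inference refers to tasks in which the model takes text as input and performs some kind of analysis on it, such as extracting labels, extracting names, or determining the sentiment of the text.
For example, to classify a piece of text as positive or negative in a traditional machine learning workflow, you would need to collect a labeled dataset, train a model, work out how to deploy it in the cloud, and then run inference. That is a lot of work, and every task, such as sentiment classification or name extraction, needs its own separately trained and deployed model.
With a large language model, you only need to write a prompt for such tasks and you can start generating results right away. You can also use a single model and a single API for many different tasks, rather than figuring out how to train and deploy many different models.
As in (① Guidelines), an environment needs to be set up first.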
- import openai
- import os
-
- from dotenv import load_dotenv, find_dotenv
- _ = load_dotenv(find_dotenv()) # read local .env file
-
- openai.api_key = os.getenv('OPENAI_API_KEY')
- def get_completion(prompt, model="gpt-3.5-turbo"):
-     messages = [{"role": "user", "content": prompt}]
-     response = openai.ChatCompletion.create(
-         model=model,
-         messages=messages,
-         temperature=0, # this is the degree of randomness of the model's output
-     )
-     return response.choices[0].message["content"]
Take a review of a lamp as an example.
- lamp_review = """
- Needed a nice lamp for my bedroom, and this one had \
- additional storage and not too high of a price point. \
- Got it fast. The string to our lamp broke during the \
- transit and the company happily sent over a new one. \
- Came within a few days as well. It was easy to put \
- together. I had a missing part, so I contacted their \
- support and they very quickly got me the missing piece! \
- Lumina seems to me to be a great company that cares \
- about their customers and products!!
- """
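Let's write a prompt to classify the sentiment of this review. It only needs to ask "What is the sentiment of the following product review", followed by the usual delimiter and the review text. Then run it.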
- prompt = f"""
- What is the sentiment of the following product review,
- which is delimited with triple quotes?
- Review text: '''{lamp_review}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- The sentiment of the product review is positive.
- """
The answer can be restricted to a single word, either positive or negative.
- prompt = f"""
- What is the sentiment of the following product review,
- which is delimited with triple quotes?
- Give your answer as a single word, either "positive" \
- or "negative".
- Review text: '''{lamp_review}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- positive
- """
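Because the reply is now constrained to one word, it can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper named `normalize_sentiment` (not part of the original notes) and hard-coded sample replies in place of a live API call:

```python
def normalize_sentiment(response):
    """Map a raw model reply to 'positive', 'negative', or 'unknown'.

    Hypothetical helper for illustration; not part of the original notes.
    """
    # Models sometimes add capitalization, whitespace, or a trailing period.
    cleaned = response.strip().strip(".\"'").lower()
    return cleaned if cleaned in {"positive", "negative"} else "unknown"


# Sample replies are hard-coded; in practice they would come from get_completion.
print(normalize_sentiment("positive"))
print(normalize_sentiment(" Positive.\n"))
print(normalize_sentiment("The sentiment is mixed."))
```

Falling back to "unknown" keeps an unexpected reply from being silently counted as either class.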
It can also be asked to identify a list of the emotions the review's author is expressing, with no more than five items.
- prompt = f"""
- Identify a list of emotions that the writer of the \
- following review is expressing. Include no more than \
- five items in the list. Format your answer as a list of \
- lower-case words separated by commas.
- Review text: '''{lamp_review}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- happy, satisfied, grateful, impressed, content
- """
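Asking for a comma-separated, lower-case list makes the reply easy to turn into a Python list. A minimal sketch, assuming a hypothetical helper `parse_emotions` and using the reply shown above as a hard-coded sample:

```python
def parse_emotions(response, limit=5):
    """Split a comma-separated reply into a clean list of at most `limit` items.

    Hypothetical helper for illustration; not part of the original notes.
    """
    items = [item.strip().lower() for item in response.split(",")]
    # Drop empty entries (for example from a trailing comma) and enforce the cap.
    return [item for item in items if item][:limit]


sample = "happy, satisfied, grateful, impressed, content"  # reply shown above
print(parse_emotions(sample))
```

The `limit` argument simply re-applies the "no more than five items" rule on the client side in case the model returns more.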
We can also ask whether the customer is expressing one particular emotion, such as anger.
- prompt = f"""
- Is the writer of the following review expressing anger? \
- The review is delimited with triple quotes. \
- Give your answer as either yes or no.
- Review text: '''{lamp_review}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- No
- """
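A yes/no reply maps naturally onto a boolean that downstream code can branch on. A minimal sketch, assuming a hypothetical helper `parse_yes_no`; the sample replies are hard-coded rather than fetched from the API:

```python
def parse_yes_no(response):
    """Return True for 'yes', False for 'no', and None for anything else.

    Hypothetical helper for illustration; not part of the original notes.
    """
    cleaned = response.strip().strip(".").lower()
    return {"yes": True, "no": False}.get(cleaned)


print(parse_yes_no("No"))     # the reply shown above
print(parse_yes_no("Yes."))
print(parse_yes_no("Possibly"))
```

Returning `None` for anything other than yes or no lets the caller decide how to handle an ambiguous reply.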
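Information extraction is the part of NLP (natural language processing) concerned with pulling the specific information you want out of a piece of text.
In the following example, the model is asked to identify two things: the item purchased (Item) and the brand (Brand).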
- prompt = f"""
- Identify the following items from the review text:
- Item purchased by reviewer
- Company that made the item
- The review is delimited with triple quotes. \
- Format your response as a JSON object with \
- "Item" and "Brand" as the keys.
- If the information isn't present, use "unknown" \
- as the value.
- Make your response as short as possible.
- Review text: '''{lamp_review}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- {
- "Item": "lamp",
- "Brand": "Lumina"
- }
- """
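Since the prompt asks for a JSON object, the reply can be loaded with Python's standard `json` module. A minimal sketch, assuming a hypothetical helper `parse_review_json` and the reply shown above as a hard-coded sample:

```python
import json


def parse_review_json(response):
    """Parse the model reply as a JSON object; return None if that fails.

    Hypothetical helper for illustration; not part of the original notes.
    """
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        # The model is not guaranteed to return valid JSON.
        return None
    return data if isinstance(data, dict) else None


# Hard-coded copy of the reply shown above.
sample = """
{
  "Item": "lamp",
  "Brand": "Lumina"
}
"""
info = parse_review_json(sample)
print(info["Item"], info["Brand"])
```

Catching `json.JSONDecodeError` matters because the model may occasionally wrap the object in extra text.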
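Multi-task extraction pulls all of these fields out in a single call. The prompt below identifies the sentiment, determines whether the reviewer is angry, and then extracts the item and the brand.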
- prompt = f"""
- Identify the following items from the review text:
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item
- The review is delimited with triple quotes. \
- Format your response as a JSON object with \
- "Sentiment", "Anger", "Item" and "Brand" as the keys.
- If the information isn't present, use "unknown" \
- as the value.
- Make your response as short as possible.
- Format the Anger value as a boolean.
- Review text: '''{lamp_review}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- {
- "Sentiment": "positive",
- "Anger": false,
- "Item": "lamp with additional storage",
- "Brand": "Lumina"
- }
- """
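With several fields in one reply, it can help to check that every expected key is present and that `Anger` really arrived as a boolean. A minimal sketch, assuming a hypothetical helper `validate_review_fields` and a hard-coded copy of the reply shown above:

```python
import json

# Keys requested in the prompt above.
EXPECTED_KEYS = {"Sentiment", "Anger", "Item", "Brand"}


def validate_review_fields(data):
    """Return a list of problems found in the parsed reply (empty if none).

    Hypothetical helper for illustration; not part of the original notes.
    """
    problems = []
    missing = EXPECTED_KEYS - set(data)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    if "Anger" in data and not isinstance(data["Anger"], bool):
        problems.append("Anger is not a boolean")
    return problems


# Hard-coded copy of the reply shown above.
sample = (
    '{"Sentiment": "positive", "Anger": false, '
    '"Item": "lamp with additional storage", "Brand": "Lumina"}'
)
data = json.loads(sample)
print(validate_review_fields(data))
print(data["Anger"] is False)
```

This is where the "Format the Anger value as a boolean" instruction pays off: JSON `false` becomes Python `False`, so it can be used directly in an `if` statement.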
The next example uses a longer text: a fictional newspaper article about how government employees feel about the agencies they work for.
- story = """
- In a recent survey conducted by the government,
- public sector employees were asked to rate their level
- of satisfaction with the department they work at.
- The results revealed that NASA was the most popular
- department with a satisfaction rating of 95%.
- One NASA employee, John Smith, commented on the findings,
- stating, "I'm not surprised that NASA came out on top.
- It's a great place to work with amazing people and
- incredible opportunities. I'm proud to be a part of
- such an innovative organization."
- The results were also welcomed by NASA's management team,
- with Director Tom Johnson stating, "We are thrilled to
- hear that our employees are satisfied with their work at NASA.
- We have a talented and dedicated team who work tirelessly
- to achieve our goals, and it's fantastic to see that their
- hard work is paying off."
-
- The survey also revealed that the
- Social Security Administration had the lowest satisfaction
- rating, with only 45% of employees indicating they were
- satisfied with their job. The government has pledged to
- address the concerns raised by employees in the survey and
- work towards improving job satisfaction across all departments.
- """
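The prompt asks the model to determine five topics being discussed in the text, to make each item one or two words long, and to format the response as a comma-separated list. Running it yields a list of topics.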
- prompt = f"""
- Determine five topics that are being discussed in the \
- following text, which is delimited by triple quotes.
- Make each item one or two words long.
- Format your response as a list of items separated by commas.
- Text sample: '''{story}'''
- """
- response = get_completion(prompt)
- print(response.split(sep=','))
- """
- ['government survey',
- ' job satisfaction',
- ' NASA',
- ' Social Security Administration',
- ' employee concerns']
- """
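Splitting on a bare comma leaves a leading space on every item after the first, as the output above shows. A minimal sketch of a cleaner split, assuming a hypothetical helper `split_topics` and a hard-coded reply reconstructed from that output:

```python
def split_topics(response):
    """Split a comma-separated reply and strip whitespace around each topic.

    Hypothetical helper for illustration; not part of the original notes.
    """
    return [topic.strip() for topic in response.split(",") if topic.strip()]


# Hard-coded reply reconstructed from the output shown above.
sample = (
    "government survey, job satisfaction, NASA, "
    "Social Security Administration, employee concerns"
)
print(split_topics(sample))
```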
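Given an article, the model can also check whether it covers certain topics. Here the topics being tracked are NASA, local government, engineering, employee satisfaction, and federal government, and the model is asked for a 0 or 1 answer for each topic.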
- topic_list = [
-     "nasa", "local government", "engineering",
-     "employee satisfaction", "federal government"
- ]
- prompt = f"""
- Determine whether each item in the following list of \
- topics is a topic in the text below, which
- is delimited with triple quotes.
- Give your answer as list with 0 or 1 for each topic.
- List of topics: {", ".join(topic_list)}
- Text sample: '''{story}'''
- """
- response = get_completion(prompt)
- print(response)
- """
- nasa: 1
- local government: 0
- engineering: 0
- employee satisfaction: 1
- federal government: 1
- """
If the article covers NASA, we can print an alert such as "New NASA story".
- topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
- if topic_dict['nasa'] == 1:
-     print("ALERT: New NASA story!")
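The one-line dictionary comprehension above raises an error if the reply contains a blank line or any line without `": "`. A more tolerant parser might look like the following minimal sketch, which assumes a hypothetical helper `parse_topic_flags` and a hard-coded copy of the reply shown earlier:

```python
def parse_topic_flags(response):
    """Parse lines of the form 'topic: 0' or 'topic: 1' into a dict.

    Hypothetical helper for illustration; lines that do not match are skipped.
    """
    flags = {}
    for line in response.splitlines():
        if ":" not in line:
            continue  # blank line or stray commentary from the model
        topic, _, value = line.rpartition(":")
        value = value.strip()
        if value in {"0", "1"}:
            flags[topic.strip().lower()] = int(value)
    return flags


# Hard-coded copy of the reply shown earlier.
sample = """nasa: 1
local government: 0
engineering: 0
employee satisfaction: 1
federal government: 1"""

topic_flags = parse_topic_flags(sample)
if topic_flags.get("nasa") == 1:
    print("ALERT: New NASA story!")
```

Using `.get("nasa")` instead of indexing avoids a `KeyError` when the model omits a topic.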