
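Having covered summarization, let's turn to information extraction. Besides reasoning (scenario 3), I think this is the second scenario worth digging into. It has many interesting use cases, for example: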

The second one may be harder to grasp, so here is an example from OpenAI. The prompt looks like this (to leave enough room, I quote only part of the text rather than the full article):

Extract the important entities mentioned in the article below. First extract all company names, then extract all people names, then extract specific topics which fit the content and finally extract general overarching themes
Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-

Text: """Powering Next Generation
Applications with OpenAI Codex
Codex is now powering 70 different applications across a variety of use cases through the OpenAI API.

May 24, 2022
4 minute read
OpenAI Codex, a natural language-to-code system based on GPT-3, helps turn simple English instructions into over a dozen popular coding languages. Codex was released last August through our API and is the principal building block of GitHub Copilot.

Warp is a Rust-based terminal, reimagined from the ground up to help both individuals and teams be more productive in the command-line.

Terminal commands are typically difficult to remember, find and construct. Users often have to leave the terminal and search the web for answers and even then the results might not give them the right command to execute. Warp uses Codex to allow users to run a natural language command to search directly from within the terminal and get a result they can immediately use.

“Codex allows Warp to make the terminal more accessible and powerful. Developers search for entire commands using natural language rather than trying to remember them or assemble them piecemeal. Codex-powered command search has become one of our game changing features.”

—Zach Lloyd, Founder, Warp

Machinet helps professional Java developers write quality code by using Codex to generate intelligent unit test templates.

Machinet was able to accelerate their development several-fold by switching from building their own machine learning systems to using Codex. The flexibility of Codex allows for the ability to easily add new features and capabilities saving their users time and helping them be more productive.

“Codex is an amazing tool in our arsenal. Not only does it allow us to generate more meaningful code, but it has also helped us find a new design of product architecture and got us out of a local maximum.”

—Vladislav Yanchenko, Founder, Machinet"""

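The prompt is a bit long, so let me explain: it asks the AI to pull the key entities out of the article and output them in a specific format. The companies, people, topics, and themes mentioned in the article must each be given as a comma-separated list (items separated by a comma and a space).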

The final output looks like this:

Company names: OpenAI, Microsoft, GitHub, Pygma, Replit, Warp, Machinet
People names: Emile Paffard-Wray, Amjad Masad, Zach Lloyd, Vladislav Yanchenko
Specific topics: GPT-3, OpenAI API, Azure OpenAI Service, GitHub Copilot, Pygma, Replit, Warp, Machinet
General themes: Natural language-to-code, Productivity, Problem solving, Creativity, Learning
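
Because the prompt pins down a fixed output format, the result is easy to post-process. Below is a minimal sketch of that step, assuming the model really does return one `Label: item, item` line per category as above; the helper name `parse_entities` is my own and is not part of the OpenAI example.

```python
def parse_entities(output: str) -> dict:
    """Turn 'Label: a, b, c' lines into {label: [a, b, c]}."""
    result = {}
    for line in output.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        label, _, values = line.partition(":")
        # Split on commas and drop surrounding whitespace and empty items
        result[label.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return result


# The output quoted above, pasted in as a plain string
sample_output = """Company names: OpenAI, Microsoft, GitHub, Pygma, Replit, Warp, Machinet
People names: Emile Paffard-Wray, Amjad Masad, Zach Lloyd, Vladislav Yanchenko
Specific topics: GPT-3, OpenAI API, Azure OpenAI Service, GitHub Copilot, Pygma, Replit, Warp, Machinet
General themes: Natural language-to-code, Productivity, Problem solving, Creativity, Learning"""

entities = parse_entities(sample_output)
print(entities["People names"])
```

A real response may drift from the requested format, so production code would need to handle missing labels or extra commentary from the model.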



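This chapter covers summarization. Summarization is fairly simple: in most cases, adding "summarize" to the prompt is enough. If you want a particular effect, though, try combining it with the techniques introduced earlier, for example: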

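Use technique 2: add a sample summary so the AI produces summaries that match your needs
Use technique 4: add a role so the AI's summary takes on a particular style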
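
As a rough illustration of combining the two techniques, here is a small sketch that assembles a summarization prompt from an optional role and an optional sample summary. The function name `build_summary_prompt` and the exact wording of each prompt part are my own assumptions, not a fixed recipe.

```python
def build_summary_prompt(text, role=None, example=None):
    """Assemble a summarization prompt; role and example are optional."""
    parts = []
    if role:
        # Technique 4: a role nudges the style of the summary
        parts.append(f"You are {role}.")
    parts.append("Summarize the text below.")
    if example:
        # Technique 2: a sample summary shows the shape you want
        parts.append("Write the summary in the style of this example:\n" + example)
    parts.append('Text: """' + text + '"""')
    return "\n\n".join(parts)


prompt = build_summary_prompt(
    "Codex is now powering 70 different applications across a variety of use cases through the OpenAI API.",
    role="a technology journalist",
    example="In one sentence: <who> did <what>, which matters because <why>.",
)
print(prompt)
```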




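Explaining code: if you come across a piece of Python code you cannot follow, you can ask the AI to explain what it does.
Explaining papers: if you cannot follow a paper, or one passage in it, you can also ask the AI to explain it.
Taking code explanation as the example, the prompt looks like this (this case comes from OpenAI's official examples):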

Explanation of what the code does:

# Python 3
def remove_common_prefix(x, prefix, ws_prefix):
    x["completion"] = x["completion"].str[len(prefix) :]
    if ws_prefix:
        # keep the single whitespace as prefix
        x["completion"] = " " + x["completion"]
    return x

Below is ChatGPT's output. It describes what the code is for and explains what each parameter means:

This is a Python function called remove_common_prefix that takes in three parameters: x, prefix, and ws_prefix.

The x parameter is assumed to be a Pandas DataFrame or Series object with a column named "completion", which contains strings that may or may not start with the prefix string.

The prefix parameter is a string that represents the common prefix that needs to be removed from the beginning of each string in the "completion" column of the DataFrame.

The ws_prefix parameter is a Boolean value that indicates whether or not to keep a single whitespace character as a prefix after the common prefix has been removed.

The function modifies the "completion" column of the DataFrame by removing the prefix string from the beginning of each string in that column using the Pandas str method and string slicing. The len(prefix) : syntax is used to slice the string starting from the character that follows the prefix.

If ws_prefix is True, then the function adds a single whitespace character as a prefix to each string in the "completion" column after removing the prefix.

Finally, the modified DataFrame is returned by the function.
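
To check the explanation against actual behavior, here is a plain-Python analogue of the function. It is only a sketch: it assumes a list of strings instead of a Pandas column, and the name `remove_common_prefix_plain` and the sample rows are mine. It makes one detail visible that the explanation glosses over: the slice drops the first `len(prefix)` characters unconditionally, without checking that the string really starts with the prefix.

```python
def remove_common_prefix_plain(completions, prefix, ws_prefix):
    """List-of-strings analogue of the Pandas function in the prompt."""
    # Drop the first len(prefix) characters of every string
    stripped = [c[len(prefix):] for c in completions]
    if ws_prefix:
        # keep a single whitespace as prefix
        stripped = [" " + c for c in stripped]
    return stripped


rows = ["->yes", "->no"]  # hypothetical sample data
print(remove_common_prefix_plain(rows, "->", ws_prefix=True))   # [' yes', ' no']
print(remove_common_prefix_plain(rows, "->", ws_prefix=False))  # ['yes', 'no']
```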


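Besides answering questions, another important scenario is having the AI complete content-generation tasks to your specification. Based on how the outputs differ, I group them into the following main scenarios: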

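In this chapter, let's start with the "something from nothing" scenario. As the name suggests, the AI generates content for you: you describe what you want written, and the AI writes it to your requirements. For example: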

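For tasks such as writing a job posting, stating your goal clearly is enough to get a reasonably satisfying answer. In this chapter I want to talk about writing code. Below is an example of asking ChatGPT to write code.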


Create a MySQL query for all students in the Computer Science Department:
Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]


SELECT students.*
FROM students
INNER JOIN departments
ON students.DepartmentId = departments.DepartmentId
WHERE departments.DepartmentName = 'Computer Science';
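
Generated code is worth running before you trust it. The sketch below tries the query against a throwaway in-memory database. Two assumptions: it uses Python's built-in `sqlite3` as a stand-in for MySQL (this particular query uses only syntax both accept), and the table rows are invented sample data.

```python
import sqlite3

# In-memory database with the two tables described in the prompt
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (DepartmentId INTEGER, DepartmentName TEXT);
CREATE TABLE students (DepartmentId INTEGER, StudentId INTEGER, StudentName TEXT);
INSERT INTO departments VALUES (1, 'Computer Science'), (2, 'Mathematics');
INSERT INTO students VALUES (1, 101, 'Ada'), (2, 102, 'Carl'), (1, 103, 'Grace');
""")

# The query exactly as generated by ChatGPT
query = """
SELECT students.*
FROM students
INNER JOIN departments
ON students.DepartmentId = departments.DepartmentId
WHERE departments.DepartmentName = 'Computer Science';
"""

rows = conn.execute(query).fetchall()
conn.close()
print(rows)  # only the Computer Science students should appear
```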

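Technique 3: Use leading words to steer the model toward a specific output
There is a small trick for code generation. The prompt in the example above can be improved further: at the end of the prompt, add a leading word for the code, which tells the AI that the conditions have been fully described and it can start writing code.

Adding SELECT at the end of the prompt is a good cue that the AI should start writing SQL. Better prompt: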

Create a MySQL query for all students in the Computer Science Department:
Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]
SELECT

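By the same logic, if you want the AI to write Python code, import is a good cue. Note, however, that this only tells the AI it can start writing code, not what kind of code to write; you still need to state in the prompt which language the code should be in.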
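
If you build prompts in code, the leading word can be appended automatically. This is a minimal sketch: the `LEADING_WORDS` mapping and the helper `with_leading_word` are hypothetical names of my own, and the mapping holds only the two cues mentioned in the text.

```python
# Leading words from the text: SELECT for SQL, import for Python
LEADING_WORDS = {"sql": "SELECT", "python": "import"}


def with_leading_word(prompt, language):
    """Append the leading word for `language` on its own final line."""
    word = LEADING_WORDS[language.lower()]
    return f"{prompt.rstrip()}\n{word}"


base_prompt = (
    "Create a MySQL query for all students in the Computer Science Department:\n"
    "Table departments, columns = [DepartmentId, DepartmentName]\n"
    "Table students, columns = [DepartmentId, StudentId, StudentName]"
)
better_prompt = with_leading_word(base_prompt, "sql")
print(better_prompt)
```

Note that the base prompt still names the language ("MySQL query"); the leading word only signals that it is time to start writing code.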

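Andrew Ng's ChatGPT Prompt Engineering course also mentions this technique, except that there the leading word is not placed at the end; instead, the prompt states directly that the AI should produce JSON-formatted content. The example from the course looks like this (note that this is Python code):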

prompt = f"""
Generate a list of three made-up book titles along \
with their authors and genres.
Provide them in JSON format with the following keys:
book_id, title, author, genre.
"""

In short, the key is to state in the prompt that the AI must output its content in JSON format.
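
The payoff of asking for JSON is that the reply can be loaded straight into a program. Here is a minimal sketch of that step. The `sample_response` string is a made-up stand-in for what the model might return (the course output is not reproduced here), and `parse_books` is a hypothetical helper that checks for the four requested keys.

```python
import json

# Keys requested in the prompt
REQUIRED_KEYS = {"book_id", "title", "author", "genre"}


def parse_books(response):
    """Load the model's JSON reply and verify every book has the requested keys."""
    books = json.loads(response)
    for book in books:
        missing = REQUIRED_KEYS - set(book)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
    return books


# Hypothetical model reply, invented for illustration
sample_response = """
[
  {"book_id": 1, "title": "The Glass Orchard", "author": "Mina Vale", "genre": "Fantasy"},
  {"book_id": 2, "title": "Signals at Dusk", "author": "Owen Hart", "genre": "Thriller"},
  {"book_id": 3, "title": "Paper Rivers", "author": "Lena Cho", "genre": "Literary Fiction"}
]
"""

books = parse_books(sample_response)
print([book["title"] for book in books])
```

In practice the model may wrap the JSON in extra prose, in which case `json.loads` raises an error and the reply needs cleaning or a retry.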


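Within the broad question-answering scenario, there is also a sub-scenario: reasoning. It is a very interesting scenario and well worth digging into, and the prompt plays a very large role in it.

If you want to build a small application with the ChatGPT API, I suggest starting with this scenario; it is less of a red ocean than the others.

As a fairly simple example, I fed ChatGPT a math question from the GMAT, the admissions test for US MBA programs: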

If a real estate agent received a commission of 6 percent of the selling price of a certain house, what was the selling price of the house?
(1) The selling price minus the real estate agent’s commission was $84,600.
(2) The selling price was 250 percent of the original purchase price of $36,000.

(A) Statement (1) ALONE is sufficient, but statement (2) alone is not sufficient.
(B) Statement (2) ALONE is sufficient, but statement (1) alone is not sufficient.
(C) BOTH statements TOGETHER are sufficient, but NEITHER statement ALONE is sufficient.
(D) EACH statement ALONE is sufficient.
(E) Statements (1) and (2) TOGETHER are NOT sufficient.




Its output is as follows; it reasons and calculates according to what the question asks:

Let's assume the selling price of the house is S.

According to the given information, the real estate agent's commission is 6% of S, which is 0.06S.

So, the amount received by the seller after paying the commission would be (S - 0.06S) = 0.94S.

Statement (1) tells us that 0.94S = $84,600. Solving for S, we get S = $90,000.

Therefore, statement (1) alone is sufficient to determine the selling price of the house.

Statement (2) tells us that the selling price is 250% of the original purchase price, which means S = 2.5*$36,000 = $90,000.

Therefore, statement (2) alone is also sufficient to determine the selling price of the house.

Since both statements individually provide the same answer, we can conclude that together they are also sufficient to determine the selling price of the house.

Hence, the answer is (D) EACH statement ALONE is sufficient.
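
Model arithmetic is worth verifying independently. The sketch below recomputes both statements with exact fractions, to avoid floating-point noise; the variable names are my own.

```python
from fractions import Fraction

commission_rate = Fraction(6, 100)  # 6 percent commission

# Statement (1): selling price minus commission is $84,600, so 0.94 * S = 84,600
price_from_1 = Fraction(84_600) / (1 - commission_rate)

# Statement (2): selling price is 250 percent of the $36,000 purchase price
price_from_2 = Fraction(250, 100) * 36_000

print(price_from_1, price_from_2)  # 90000 90000
```

Each statement determines a single selling price on its own, which matches answer (D).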








This scenario is fairly simple, so I will not give more examples here; try feeding all kinds of questions into ChatGPT yourself.