AI Is Dangerous, but Not for the Reasons You Think | Sasha Luccioni | TED

1,211,342 views ・ 2023-11-06

TED



Translator: Yip Yan Yeung. Reviewer: suya f.
00:04
So I've been an AI researcher for over a decade.
00:07
And a couple of months ago, I got the weirdest email of my career.
00:11
A random stranger wrote to me
00:13
saying that my work in AI is going to end humanity.
00:18
Now I get it, AI, it's so hot right now.
00:22
(Laughter)
00:24
It's in the headlines pretty much every day,
00:26
sometimes because of really cool things
00:28
like discovering new molecules for medicine
00:30
or that dope Pope in the white puffer coat.
00:33
But other times the headlines have been really dark,
00:35
like that chatbot telling that guy that he should divorce his wife
00:39
or that AI meal planner app proposing a crowd-pleasing recipe
00:43
featuring chlorine gas.
00:46
And in the background,
00:47
we've heard a lot of talk about doomsday scenarios,
00:50
existential risk and the singularity,
00:52
with letters being written and events being organized
00:54
to make sure that doesn't happen.
00:57
Now I'm a researcher who studies AI's impacts on society,
01:02
and I don't know what's going to happen in 10 or 20 years,
01:05
and nobody really does.
01:07
But what I do know is that there's some pretty nasty things going on right now,
01:12
because AI doesn't exist in a vacuum.
01:15
It is part of society, and it has impacts on people and the planet.
01:20
AI models can contribute to climate change.
01:22
Their training data uses art and books created by artists
01:26
and authors without their consent.
01:27
And its deployment can discriminate against entire communities.
01:32
But we need to start tracking its impacts.
01:34
We need to start being transparent and disclosing them and creating tools
01:38
so that people understand AI better,
01:41
so that hopefully future generations of AI models
01:43
are going to be more trustworthy, sustainable,
01:46
maybe less likely to kill us, if that's what you're into.
01:50
But let's start with sustainability,
01:51
because that cloud that AI models live on is actually made out of metal, plastic,
01:57
and powered by vast amounts of energy.
02:00
And each time you query an AI model, it comes with a cost to the planet.
02:05
Last year, I was part of the BigScience initiative,
02:08
which brought together a thousand researchers
02:10
from all over the world to create Bloom,
02:13
the first open large language model, like ChatGPT,
02:17
but with an emphasis on ethics, transparency and consent.
02:21
And the study I led that looked at Bloom's environmental impacts
02:25
found that just training it used as much energy
02:28
as 30 homes in a whole year
02:30
and emitted 25 tons of carbon dioxide,
02:33
which is like driving your car five times around the planet
02:36
just so somebody can use this model to tell a knock-knock joke.
02:39
And this might not seem like a lot,
02:41
but other similar large language models,
02:44
like GPT-3,
02:45
emit 20 times more carbon.
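The figures quoted in this passage can be sanity-checked with a little arithmetic. The sketch below is only an illustration, not the study's method: it assumes metric tons, an equatorial circumference of roughly 40,075 km, and that "20 times more" means a plain 20x multiplier. None of those assumptions is stated in the talk, and all names are made up for the example.

```python
# Rough check of the figures quoted in the talk (illustrative sketch only).
EARTH_CIRCUMFERENCE_KM = 40_075   # assumption: equatorial circumference
BLOOM_TRAINING_TONNES_CO2 = 25    # quoted in the talk
LAPS_AROUND_PLANET = 5            # quoted in the talk
GPT3_MULTIPLIER = 20              # quoted in the talk, read here as a plain 20x


def implied_car_emissions_g_per_km(tonnes_co2: float, laps: float) -> float:
    """Grams of CO2 per km a car would have to emit for the comparison to hold."""
    distance_km = laps * EARTH_CIRCUMFERENCE_KM
    return tonnes_co2 * 1_000_000 / distance_km  # 1 tonne = 1,000,000 g


def scaled_emissions_tonnes(base_tonnes: float, multiplier: float) -> float:
    """Emissions of a model that emits `multiplier` times the baseline."""
    return base_tonnes * multiplier


per_km = implied_car_emissions_g_per_km(BLOOM_TRAINING_TONNES_CO2, LAPS_AROUND_PLANET)
gpt3 = scaled_emissions_tonnes(BLOOM_TRAINING_TONNES_CO2, GPT3_MULTIPLIER)
print(f"Implied car emissions: {per_km:.0f} g CO2 per km")
print(f"Implied GPT-3 training emissions: {gpt3:.0f} tonnes CO2")
```

Under these assumptions the car comparison works out to a little under 125 g of CO2 per kilometre, and the GPT-3 figure to about 500 tonnes.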
02:47
But the thing is, tech companies aren't measuring this stuff.
02:50
They're not disclosing it.
02:52
And so this is probably only the tip of the iceberg,
02:54
even if it is a melting one.
02:56
And in recent years we've seen AI models balloon in size
03:00
because the current trend in AI is "bigger is better."
03:04
But please don't get me started on why that's the case.
03:07
In any case, we've seen large language models in particular
03:10
grow 2,000 times in size over the last five years.
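To get a feel for what a 2,000-fold increase over five years means, it can be converted into a yearly multiplier. The sketch below assumes the growth was steady from year to year, which is a simplification the talk does not claim; the function name is illustrative.

```python
def annual_growth_factor(total_growth: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `total_growth` over `years`."""
    if total_growth <= 0 or years <= 0:
        raise ValueError("total_growth and years must be positive")
    return total_growth ** (1.0 / years)


factor = annual_growth_factor(2000, 5)
print(f"Equivalent steady growth: about {factor:.2f}x per year")
```

Read this way, the quoted figure corresponds to models growing by a factor of between four and five every year.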
03:13
And of course, their environmental costs are rising as well.
03:16
The most recent work I led found that switching out a smaller,
03:20
more efficient model for a larger language model
03:23
emits 14 times more carbon for the same task.
03:27
Like telling that knock-knock joke.
03:29
And as we're putting in these models into cell phones and search engines
03:32
and smart fridges and speakers,
03:35
the environmental costs are really piling up quickly.
03:38
So instead of focusing on some future existential risks,
03:42
let's talk about current tangible impacts
03:45
and tools we can create to measure and mitigate these impacts.
03:49
I helped create CodeCarbon,
03:51
a tool that runs in parallel to AI training code
03:54
that estimates the amount of energy it consumes
03:56
and the amount of carbon it emits.
03:58
And using a tool like this can help us make informed choices,
04:01
like choosing one model over the other because it's more sustainable,
04:04
or deploying AI models on renewable energy,
04:07
which can drastically reduce their emissions.
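The idea behind a tool like this can be sketched in a few lines: time the code, turn elapsed time into energy using a power draw, then multiply by the carbon intensity of the electricity. The sketch below is not CodeCarbon's API. The `track_emissions` helper, the 300 W power draw and both grid intensities are hypothetical placeholders chosen for illustration; a real tool would measure hardware power and look up the local grid.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class EmissionsEstimate:
    seconds: float = 0.0
    energy_kwh: float = 0.0
    co2_kg: float = 0.0


def estimate_co2_kg(energy_kwh: float, grid_kg_co2_per_kwh: float) -> float:
    """Emissions = energy used x carbon intensity of the electricity."""
    return energy_kwh * grid_kg_co2_per_kwh


@contextmanager
def track_emissions(power_watts: float, grid_kg_co2_per_kwh: float):
    """Time the wrapped block and fill in an estimate when it finishes.

    Both arguments are caller-supplied assumptions, not measurements.
    """
    estimate = EmissionsEstimate()
    start = time.perf_counter()
    try:
        yield estimate
    finally:
        estimate.seconds = time.perf_counter() - start
        # watts * seconds = joules; 3,600,000 joules = 1 kWh
        estimate.energy_kwh = power_watts * estimate.seconds / 3_600_000
        estimate.co2_kg = estimate_co2_kg(estimate.energy_kwh, grid_kg_co2_per_kwh)


# Hypothetical values: a 300 W device on a grid emitting 0.4 kg CO2 per kWh.
with track_emissions(power_watts=300.0, grid_kg_co2_per_kwh=0.4) as run:
    sum(i * i for i in range(100_000))  # stand-in for a training step

print(f"{run.seconds:.4f} s, {run.energy_kwh:.2e} kWh, {run.co2_kg:.2e} kg CO2")

# Same hypothetical 1,000 kWh job on a carbon-heavy grid and a low-carbon grid.
print(estimate_co2_kg(1000, 0.4), "kg vs", estimate_co2_kg(1000, 0.05), "kg")
```

Because emissions scale linearly with both energy and grid intensity in this model, a more efficient model and a cleaner grid each reduce the total independently, which is the kind of informed choice described here.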
04:10
But let's talk about other things
04:12
because there's other impacts of AI apart from sustainability.
04:15
For example, it's been really hard for artists and authors
04:18
to prove that their life's work has been used for training AI models
04:23
without their consent.
04:24
And if you want to sue someone, you tend to need proof, right?
04:27
So Spawning.ai, an organization that was founded by artists,
04:31
created this really cool tool called “Have I Been Trained?”
04:35
And it lets you search these massive data sets
04:37
to see what they have on you.
04:39
Now, I admit it, I was curious.
04:41
I searched LAION-5B,
04:43
which is this huge data set of images and text,
04:45
to see if any images of me were in there.
04:49
Now those two first images,
04:50
that's me from events I've spoken at.
04:53
But the rest of the images, none of those are me.
04:55
They're probably of other women named Sasha
04:57
who put photographs of themselves up on the internet.
05:01
And this can probably explain why,
05:02
when I query an image generation model
05:04
to generate a photograph of a woman named Sasha,
05:06
more often than not I get images of bikini models.
05:09
Sometimes they have two arms,
05:11
sometimes they have three arms,
05:13
but they rarely have any clothes on.
05:16
And while it can be interesting for people like you and me
05:19
to search these data sets,
05:21
for artists like Karla Ortiz,
05:23
this provides crucial evidence that her life's work, her artwork,
05:27
was used for training AI models without her consent,
05:30
and she and two artists used this as evidence
05:32
to file a class action lawsuit against AI companies
05:35
for copyright infringement.
05:37
And most recently --
05:38
(Applause)
05:42
And most recently Spawning.ai partnered up with Hugging Face,
05:45
the company where I work at,
05:46
to create opt-in and opt-out mechanisms for creating these data sets.
05:52
Because artwork created by humans shouldn’t be an all-you-can-eat buffet
05:55
for training AI language models.
05:58
(Applause)
06:02
The very last thing I want to talk about is bias.
06:04
You probably hear about this a lot.
06:07
Formally speaking, it's when AI models encode patterns and beliefs
06:10
that can represent stereotypes or racism and sexism.
06:14
One of my heroes, Dr. Joy Buolamwini, experienced this firsthand
06:17
when she realized that AI systems wouldn't even detect her face
06:20
unless she was wearing a white-colored mask.
06:22
Digging deeper, she found that common facial recognition systems
06:26
were vastly worse for women of color compared to white men.
06:30
And when biased models like this are deployed in law enforcement settings,
06:35
this can result in false accusations, even wrongful imprisonment,
06:40
which we've seen happen to multiple people in recent months.
06:44
For example, Porcha Woodruff was wrongfully accused of carjacking
06:47
at eight months pregnant
06:48
because an AI system wrongfully identified her.
06:52
But sadly, these systems are black boxes,
06:54
and even their creators can't say exactly why they work the way they do.
07:00
And for example, for image generation systems,
07:04
if they're used in contexts like generating a forensic sketch
07:08
based on a description of a perpetrator,
07:11
they take all those biases and they spit them back out
07:14
for terms like dangerous criminal, terrorists or gang member,
07:18
which of course is super dangerous
07:20
when these tools are deployed in society.
07:25
And so in order to understand these tools better,
07:27
I created this tool called the Stable Bias Explorer,
07:31
which lets you explore the bias of image generation models
07:34
through the lens of professions.
07:37
So try to picture a scientist in your mind.
07:40
Don't look at me.
07:41
What do you see?
07:43
A lot of the same thing, right?
07:45
Men in glasses and lab coats.
07:47
And none of them look like me.
07:50
And the thing is,
07:51
is that we looked at all these different image generation models
07:54
and found a lot of the same thing:
07:56
significant representation of whiteness and masculinity
07:59
across all 150 professions that we looked at,
08:01
even if compared to the real world,
08:03
the US Labor Bureau of Statistics.
08:05
These models show lawyers as men,
08:08
and CEOs as men, almost 100 percent of the time,
08:11
even though we all know not all of them are white and male.
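The comparison described here, generated images versus real-world labor statistics, comes down to comparing two sets of shares per profession. The sketch below is a minimal illustration and not the Stable Bias Explorer's actual method. The labels and the reference shares are hypothetical placeholders, not figures from the study or from any statistics agency.

```python
from collections import Counter


def group_shares(labels: list[str]) -> dict[str, float]:
    """Fraction of items carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_gap(generated: dict[str, float],
                       reference: dict[str, float]) -> dict[str, float]:
    """Generated share minus reference share; positive means over-represented."""
    groups = set(generated) | set(reference)
    return {g: generated.get(g, 0.0) - reference.get(g, 0.0) for g in groups}


# Hypothetical placeholder data: perceived-gender labels for ten generated
# images of one profession, and a made-up real-world reference share.
generated_labels = ["man"] * 9 + ["woman"] * 1
reference_shares = {"man": 0.6, "woman": 0.4}

gaps = representation_gap(group_shares(generated_labels), reference_shares)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.0%} relative to the reference share")
```

Repeating the same calculation for each profession and each model would give one simple way to make the over-representation described here measurable.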
08:14
And sadly, my tool hasn't been used to write legislation yet.
08:19
But I recently presented it at a UN event about gender bias
08:23
as an example of how we can make tools for people from all walks of life,
08:27
even those who don't know how to code,
08:29
to engage with and better understand AI because we use professions,
08:32
but you can use any terms that are of interest to you.
08:36
And as these models are being deployed,
08:39
are being woven into the very fabric of our societies,
08:42
our cell phones, our social media feeds,
08:44
even our justice systems and our economies have AI in them.
08:47
And it's really important that AI stays accessible
08:51
so that we know both how it works and when it doesn't work.
08:56
And there's no single solution for really complex things like bias
09:01
or copyright or climate change.
09:03
But by creating tools to measure AI's impact,
09:06
we can start getting an idea of how bad they are
09:09
and start addressing them as we go.
09:12
Start creating guardrails to protect society and the planet.
09:16
And once we have this information,
09:18
companies can use it in order to say,
09:20
OK, we're going to choose this model because it's more sustainable,
09:23
this model because it respects copyright.
09:25
Legislators who really need information to write laws
09:28
can use these tools to develop new regulation mechanisms
09:32
or governance for AI as it gets deployed into society.
09:36
And users like you and me can use this information
09:38
to choose AI models that we can trust,
09:41
not to misrepresent us and not to misuse our data.
09:45
But what did I reply to that email
09:47
that said that my work is going to destroy humanity?
09:50
I said that focusing on AI's future existential risks
09:54
is a distraction from its current,
09:56
very tangible impacts
09:58
and the work we should be doing right now, or even yesterday,
10:02
for reducing these impacts.
10:04
Because yes, AI is moving quickly, but it's not a done deal.
10:08
We're building the road as we walk it,
10:11
and we can collectively decide what direction we want to go in together.
10:15
Thank you.
10:16
(Applause)