How AI Is Decoding Ancient Scrolls | Julian Schilliger and Youssef Nader | TED
90,017 views ・ 2025-01-30
Translator: Lening Xu
Reviewer: Bruce Wang
00:04
We always think about the potential of AI changing the future.
00:08
But what about the potential of AI changing the past?
00:12
My name is Youssef Nader.
00:14
I'm an Egyptian AI researcher
00:16
and a PhD student at the Free University in Berlin,
00:19
and last year, I led the Vesuvius Grand Prize winning team
00:24
on exploring this very question.
00:27
You see, the story starts almost 2,000 years ago.
00:32
A Greek philosopher that we believe was Philodemus of Gadara
00:36
sat in one of the many rooms of the Villa dei Papiri.
00:39
He talked about music, he talked about pleasure,
00:43
he talked about what makes things enjoyable,
00:45
questions that still plague us until today.
00:49
One of his scribes wrote down his thoughts on sheets of papyrus.
00:54
The sheets were rolled and stowed away for later generations.
01:00
Fast-forward 150 years, ...
01:04
Mount Vesuvius erupts,
01:06
burying Herculaneum, the villa and the words of the philosopher
01:11
under a sea of hot mud and ashes.
01:15
Now fast-forward again, to the 17th century.
01:18
People are excavating around the area.
01:21
They found beautiful statues, breathtaking frescoes
01:25
and some weird-looking pieces of charcoal,
01:29
like you see in this picture.
01:31
This is when the first scrolls were discovered,
01:34
and people were racing to excavate more of these.
01:38
What knowledge is included that is not known to us now?
01:42
What things should we know about these scrolls?
01:48
My name is Julian, and I am a digital archaeologist.
01:55
When the pyroclastic flow hit the scrolls, it had a destructive effect.
02:03
It tore into them, shredded off pieces, and it charred them badly.
02:09
Even the deformation that you can see happened at that point.
02:14
People, 250-something years ago,
02:18
were curious what's lying inside those scrolls,
02:22
hidden and not accessible anymore.
02:26
Because of a lack of technology,
02:27
they had to resort to physically unrolling
02:31
and thereby destroying most of the scrolls.
02:36
To this day,
02:37
only the most damaged and deformed scrolls
02:40
remain in their initial, rolled-up configuration.
02:45
Fast-forwarding a little bit, the computer age arrives.
02:50
Youssef and I are born.
02:53
We are going on and getting our education --
02:55
(Laughter)
02:57
and at the same time, Brent Seales, a researcher and professor,
03:03
had the idea to use CT scan technology to actually digitize the scrolls,
03:10
with the hope of, one day, digitally unrolling them.
03:16
Behind me, you can see a video of such a CT scan,
03:19
and it goes through the CT scan 3D volume,
03:23
layer by layer.
03:25
The papyrus is visible as a spiral,
03:29
and you can see it's tightly wound-up,
03:31
sometimes touching each other, flaying off.
03:33
It's a difficult question, how to unroll this digitally.
03:38
Nat Friedman, a Silicon Valley investor,
03:42
also saw this research, and he wanted to help.
03:46
That was in 2022.
03:48
He reached out, and together with Brent Seales,
03:51
they created the Vesuvius Challenge,
03:54
with the goal to motivate nerds all over the world
03:58
to solve this problem.
04:00
(Laughter)
04:01
They created a grand prize,
04:03
promising eternal glory and monetary incentives
04:07
to anyone who could do that.
04:09
(Laughter)
04:10
I myself saw that on the internet
04:13
while writing my master's thesis at ETH Zurich, in robotics,
04:17
and I was instantly happy to solve it --
04:20
or at least try, why not, you know?
04:22
And I went on, joined the Discord community
04:26
where all the people that were also contestants
04:29
and playing with the scroll data
04:31
were exchanging ideas,
04:33
and I joined there and started working on it.
04:37
Also there, on Discord, I met Youssef and Luke [Farritor],
04:41
who would become my teammates,
04:42
and with whom I would actually win the grand prize.
04:45
Surprisingly, it went on, and made global headline news.
04:50
It even got into the British tabloids.
04:52
(Laughter)
04:55
So when we started,
04:57
there were two main problems still remaining.
05:00
One, you had to unroll the scroll.
05:03
And two, you then had to make the ink visible.
05:07
Youssef will tell you more about that part.
05:10
For me, the most exciting thing
05:12
was the computer-vision problem of unrolling those scrolls virtually.
05:17
I decided to iterate on a tool
05:19
that was created by the Kentucky researchers,
05:22
and make it faster, less prone to errors
05:25
and just iterate on it and make it better.
05:29
The Vesuvius Challenge team saw that
05:31
and also implemented a team of 10 people that would use my tool.
05:35
They would annotate scroll data, like you see in this video,
05:39
where they created a red line where the surface would lie.
05:43
The algorithm then would take it into 3D space,
05:46
creating a three-dimensional representation of the surface.
05:50
Computer algorithms
05:52
would then flatten it and create a segment.
05:55
This all would be called “segmentation”
05:57
in the space of the scrolling and unrolling community.
06:04
(Laughter)
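The annotate-then-flatten step described above can be illustrated with a toy sketch. It rests on several assumptions that are not in the talk: the CT volume is a nested list indexed as `volume[z][y][x]`, each slice carries one annotated line (the "red line") given as a list of `(y, x)` points, and "flattening" is reduced to reading the voxel under each annotated point, so every slice becomes one row of the flat image. The actual tooling builds and flattens a 3D surface; the name `flatten_segment` is hypothetical.

```python
from typing import List, Sequence, Tuple

Volume = List[List[List[float]]]          # volume[z][y][x], one 2D slice per z
Polyline = Sequence[Tuple[int, int]]      # annotated (y, x) points on one slice


def flatten_segment(volume: Volume, lines: Sequence[Polyline]) -> List[List[float]]:
    """Read the voxel under every annotated point.

    Row i of the result is slice i; column j is the j-th point along
    that slice's annotated line. This is a simplification: it ignores
    the distortion that a real surface-flattening step would correct.
    """
    if len(lines) != len(volume):
        raise ValueError("need exactly one annotated line per slice")
    flat = []
    for z, line in enumerate(lines):
        flat.append([volume[z][y][x] for (y, x) in line])
    return flat


# Toy 2-slice volume with a curved "sheet" of bright voxels (value 9.0).
volume = [
    [[0.0, 9.0, 0.0],
     [9.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 9.0],
     [0.0, 9.0, 0.0],
     [9.0, 0.0, 0.0]],
]
lines = [
    [(1, 0), (0, 1)],           # slice 0: the sheet passes through these voxels
    [(2, 0), (1, 1), (0, 2)],   # slice 1
]
segment = flatten_segment(volume, lines)
```

In this toy case every sampled voxel lies on the sheet, so `segment` contains only the bright value; the curved sheet has been laid out as flat rows.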
06:10
So I created open-source commits to this tool
06:15
and implemented new algorithms from my studies, like Optical Flow,
06:19
to better track the sheets through the volume,
06:23
and we end up with something like what you see behind me.
06:27
First off, those were really small segments,
06:30
and I added improvement,
06:31
made the code faster
06:33
and had lots of feedback from the community.
06:36
They were really happy, and I was happy getting lots of feedback.
06:39
It was a really positive environment.
06:43
So in the end,
06:45
I could track the performance of the algorithms,
06:47
how the segmentation team performed,
06:50
and I could see that my improvements, from start to finish,
06:53
would be around a 10,000-fold improvement over the initial version.
07:00
This algorithm was then also used to unroll all the area
07:04
that you can see in our submission.
07:07
All the sheets were generated with these methods.
07:11
In December, I was looking for teammates.
07:14
I made a blog post,
07:16
and I showcased my newest algorithms,
07:19
reaching out to anyone that was willing to team up.
07:23
Youssef and Luke got into contact with me.
07:27
They were happy to team up, and I was happy as well.
07:31
(Laughter)
07:33
So after the virtual unwrapping, the words still are not visible.
07:39
The main problem is that the ink that was used at the time
07:42
was a carbon-based ink,
07:43
and carbon-based ink on carbon-based papyrus in a CT scan
07:47
isn’t visible, or at least [not] to the naked eye.
07:50
So the same team at the University of Kentucky
07:53
decided to test whether the ink was present at all
07:55
in the CT scans.
07:57
For this, they took some of the pieces that people broke off the scrolls,
08:02
and they fed them into the same pipeline of the X-ray CT scanning,
08:07
and this gives us the 3D data that we were working with.
08:10
Because you can see the ink and it’s an exposed surface,
08:14
you can even improve it with infrared imaging.
08:16
And this gives you a ground truth
08:18
of what letters you're actually trying to find.
08:21
And then from there,
08:22
you can train a machine-learning model to try to find these letters.
08:27
The way this works
08:28
is that the model looks at very small cubes at a single time
08:32
and tries to decide whether there is ink present in this area or not.
08:36
And then, when you keep moving this cube all around,
08:39
the model gets to see different data samples
08:42
and then tries to understand what ink actually is.
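The sliding-cube idea can be sketched in a few lines. This is only an illustration under stated assumptions: the volume is a nested list indexed as `volume[z][y][x]`, the cube always spans the first `size` layers in depth, and the trained model is replaced by a stand-in (`mean_intensity`) that simply averages voxel values. The names `extract_cube`, `ink_map` and `mean_intensity` are hypothetical and do not come from the team's code.

```python
from typing import Callable, List

Volume = List[List[List[float]]]   # volume[z][y][x]


def extract_cube(volume: Volume, z: int, y: int, x: int, size: int) -> Volume:
    """Return the size**3 sub-volume whose corner is at (z, y, x)."""
    return [
        [row[x:x + size] for row in plane[y:y + size]]
        for plane in volume[z:z + size]
    ]


def ink_map(volume: Volume, size: int, stride: int,
            predict: Callable[[Volume], float]) -> List[List[float]]:
    """Slide a cube over the (y, x) plane and score each position.

    The cube spans the first `size` layers in z; each output cell holds
    the model's ink score for one cube position.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if size > min(depth, height, width):
        raise ValueError("cube does not fit inside the volume")
    scores = []
    for y in range(0, height - size + 1, stride):
        row = []
        for x in range(0, width - size + 1, stride):
            row.append(predict(extract_cube(volume, 0, y, x, size)))
        scores.append(row)
    return scores


def mean_intensity(cube: Volume) -> float:
    """Stand-in for a trained model: the cube's average voxel value."""
    values = [v for plane in cube for row in plane for v in row]
    return sum(values) / len(values)


# Toy volume: 2 layers of 2x4 voxels, with "ink" (1.0) only in the right half.
volume = [[[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]] for _ in range(2)]
scores = ink_map(volume, size=2, stride=2, predict=mean_intensity)
```

With a stride equal to the cube size the positions do not overlap; a smaller stride would give a denser map at the cost of more model calls.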
08:47
So this is how it looks while the model is training.
08:49
It's not perfect, but you can see that, especially around the middle,
08:54
the model is starting to see the letters perfectly.
08:57
So the data is there.
08:59
The ink is there.
09:00
But it’s just very hard to find and see.
09:04
Looking at the CT scan raw data on the left here,
09:09
you can see the fibers,
09:11
you can see the structure of the papyrus,
09:14
but the letters are very, very faint.
09:17
The letters from the right image are very, very faint in the CT scans.
09:21
And they're actually, in this special case,
09:24
characterized by a difference of contrast
09:27
and some speckles, freckles, features that are very hard to see.
09:32
So what happens if we try to take a look
09:35
at the segment that Julian was just showing?
09:40
So this is the data that we were working with.
09:43
And I'm going to give you 10 seconds to try to find the letters yourself.
09:47
(Laughter)
09:48
And as a hint,
09:50
I'll tell you that there are three letters in this image.
09:52
Believe me.
09:55
Try to find some pattern, some crackle patterns,
10:01
some cracks in there.
10:03
If you were able to identify this pattern of these three letters --
10:07
(Laughter)
10:10
then congratulations.
10:11
One year ago, you may have won 40,000 dollars.
10:14
(Laughter)
10:16
However, if you're like me, and you couldn't make sense of this,
10:19
there's a different way that you can find this ink --
10:22
one that actually scales very, very well.
10:26
So this is where my journey begins with the Vesuvius Challenge.
10:30
There is this neat idea in computer-vision literature
10:33
where if you don't actually have labels,
10:35
if you don't have the goal that you want your AI model to reach,
10:39
you can pick an intermediary goal along the way.
10:42
So, looking at these two pairs of images,
10:47
our eyes can identify that these are the same images,
10:50
just flipped.
10:52
And we can do that because we understand
10:54
the structures that are present in the images.
10:57
We can see this little triangle, and it's flipped,
11:00
so we know this is the same triangle.
11:02
Our eyes already have this feature,
11:05
but neural networks don't.
11:07
When they see these images,
11:09
they can't [tell] that these are the same image.
11:11
So one idea, just to let it know about the structures
11:14
and familiarize it with the data,
11:16
is to show it different views of the same image
11:20
and tell it that these are the same images.
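The "different views of the same image" objective can be made concrete with a small sketch. It assumes an image is a nested list of floats, uses a horizontal flip as the only view transformation, and measures disagreement as the squared distance between the two views' embeddings. The embedding functions here are hand-written stand-ins rather than neural networks, and all names (`make_pair`, `agreement_loss`, `first_column`, `row_sums`) are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]


def flip_horizontal(image: Image) -> Image:
    """Mirror every row, giving a second view of the same content."""
    return [list(reversed(row)) for row in image]


def make_pair(image: Image) -> Tuple[Image, Image]:
    """Return two views that should be treated as 'the same image'."""
    return image, flip_horizontal(image)


def agreement_loss(embed: Callable[[Image], List[float]], image: Image) -> float:
    """Squared distance between the embeddings of the two views.

    Training would adjust `embed` to push this toward zero (real methods
    also need a term that stops all images collapsing to one embedding).
    """
    a, b = (embed(view) for view in make_pair(image))
    return sum((p - q) ** 2 for p, q in zip(a, b))


def first_column(image: Image) -> List[float]:
    """An embedding that depends on left/right position."""
    return [row[0] for row in image]


def row_sums(image: Image) -> List[float]:
    """An embedding that is unchanged by a horizontal flip."""
    return [sum(row) for row in image]


image = [[1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0]]
loss_position_sensitive = agreement_loss(first_column, image)
loss_flip_invariant = agreement_loss(row_sums, image)
```

The position-sensitive embedding disagrees with itself on the flipped view, while the flip-invariant one does not; pretraining in this style nudges a network toward the second kind of representation before it ever sees an ink label.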
11:22
And after that, you take this model
11:25
and you train it like the previous models that the University of Kentucky did.
11:30
And while the approach doesn't fully work, it also doesn't fully not work.
11:34
And this was the first image that was produced by the model.
11:38
And there was some very faint signal in there.
11:41
It seemed like the model was catching on something,
11:44
but it wasn't clear, exactly, what the model was catching on.
11:48
So I decided to take these predictions and create a new ground truth,
11:54
asking the model, "Hey, I think these might be letters.
11:57
I think there's something in there. Try to find more of this."
12:00
And my ground truth, actually,
12:02
has four correct letters and four other delusions.
12:06
But that was OK.
12:07
So training a new model with this data,
12:10
the model started to find more ink, find more letters,
12:12
and the lines even looked complete.
12:16
So I thought, "What are the chances that if I do this again,
12:20
the models keep improving?"
12:22
And this was the core behind our grand prize-winning solution.
12:27
Repeating this process over and over, the models kept improving.
12:31
The main trick was you needed to prevent the models from memorizing
12:36
what the previous models have learned.
12:38
You're essentially asking the model to learn
12:40
what the other model has learned.
12:42
So overfitting was a serious problem that required a lot of experiments.
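The loop of predicting, turning predictions into a new ground truth, and training again can be sketched as follows. This is a toy illustration, not the team's recipe: samples are single numbers, the "model" labels a sample as ink when it lies within `radius` of any currently labeled ink sample, and a fresh model is fitted every round (an assumption made here for simplicity). The names `train` and `self_train` are hypothetical.

```python
from typing import Callable, List, Set

Model = Callable[[float], bool]


def train(features: List[float], ink_idx: Set[int], radius: float) -> Model:
    """Fit a fresh toy model: 'ink' means close to any labeled ink sample."""
    anchors = [features[i] for i in ink_idx]
    return lambda f: any(abs(f - a) <= radius for a in anchors)


def self_train(features: List[float], seed_idx: Set[int],
               radius: float, rounds: int) -> List[Set[int]]:
    """Repeat: train on current labels, predict, reuse predictions as labels.

    Returns the label set after every round so its growth can be inspected.
    """
    labels = set(seed_idx)
    history = []
    for _ in range(rounds):
        model = train(features, labels, radius)
        labels = {i for i, f in enumerate(features) if model(f)}
        history.append(labels)
    return history


# Toy data: five ink-like samples with slowly drifting values, two background.
features = [1.0, 1.4, 1.8, 2.2, 2.6, 9.0, 9.5]
history = self_train(features, seed_idx={0}, radius=0.5, rounds=3)
```

Each round the label set grows by one neighbouring sample while the background samples stay unlabeled. The toy data contains no wrong seed labels; with real predictions, mistaken labels can be reinforced from round to round, which is one reason the overfitting mentioned above needs care.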
12:47
But in the end, getting the recipe right,
12:50
we were able to predict all of these letters
12:53
without the models ever seeing them.
12:55
These were the first 10 letters.
12:57
There are, like, 20 in there,
12:59
but this was the first coherent word read from an unopened papyrus sheet.
13:05
From there, scaling the process,
13:07
within weeks, we had, now, columns of text,
13:11
even special characters
13:12
that papyrologists found very interesting that the model was able to find.
13:17
The approach was open-sourced,
13:19
and the data and the code were out there,
13:21
and the race for the grand prize was on.
13:24
Recovering four paragraphs at an 85-percent clarity.
13:29
And the key to our success was perfecting the data and the model
13:32
with so many iterations and so many experiments.
13:36
In the end, we were able to recover
13:38
more than 14 columns of text, and 2,000 letters.
13:44
(Applause)
13:51
2,000 characters safely stored away two millennia ago.
13:57
In just nine months, we discovered them again.
14:02
AI helped us, in large portions,
14:05
writing better code and even being part in our algorithms.
14:10
It opened a window into the past.
14:12
What's next?
14:14
Let's open this window more.
14:16
AI will help us access information that was so far safely locked away.
14:24
In the words of the author,
14:27
"We do not refrain from questioning nor understanding,
14:35
and may it be evident to say true things as they appear."
14:43
(Applause)