How we teach computers to understand pictures | Fei-Fei Li

1,159,394 views ・ 2015-03-23

TED


Translator: Sailin Lu Reviewer: angie chen
00:14
Let me show you something.
0
14366
3738
容我為各位呈現一些照片
00:18
(Video) Girl: Okay, that's a cat sitting in a bed.
1
18104
4156
(影片)女孩:嗯,這是一隻貓,坐在床上。
00:22
The boy is petting the elephant.
2
22260
4040
這男孩在拍撫一隻象。
00:26
Those are people that are going on an airplane.
3
26300
4354
這些人要去搭飛機。
00:30
That's a big airplane.
4
30654
2810
好大的飛機。
主講人:這是由一位三歲的小孩
00:33
Fei-Fei Li: This is a three-year-old child
5
33464
2206
00:35
describing what she sees in a series of photos.
6
35670
3679
所描述她看到的一系列照片
00:39
She might still have a lot to learn about this world,
7
39349
2845
雖然對於這世界她還有更多要學習的地方,
00:42
but she's already an expert at one very important task:
8
42194
4549
但是她已經是其中一項重要技能的專家--
00:46
to make sense of what she sees.
9
46743
2846
為所見之聞賦予意義。
科技在我們的社會已進展到前所未有的程度:
00:50
Our society is more technologically advanced than ever.
10
50229
4226
00:54
We send people to the moon, we make phones that talk to us
11
54455
3629
我們把人送上月球、發明可以與人交談的電話,
00:58
or customize radio stations that can play only music we like.
12
58084
4946
或是客製一個電台,只播放個人喜歡的音樂。
01:03
Yet, our most advanced machines and computers
13
63030
4055
然而這台無比聰明的機器和電腦
01:07
still struggle at this task.
14
67085
2903
仍然無法發展這項技能,
01:09
So I'm here today to give you a progress report
15
69988
3459
因此今天我來到這裡向各位報告
01:13
on the latest advances in our research in computer vision,
16
73447
4047
我們在電腦視覺的最新研究進展,
01:17
one of the most frontier and potentially revolutionary
17
77494
4161
這是現階段在資訊業領域中,
最先進、最具潛力的革命性技術。
01:21
technologies in computer science.
18
81655
3206
01:24
Yes, we have prototyped cars that can drive by themselves,
19
84861
4551
是的,目前我們已經有自動駕駛的原型車,
01:29
but without smart vision, they cannot really tell the difference
20
89412
3853
但若不具備視覺辨識技術, 它將無法分辨同樣出現在馬路中,
01:33
between a crumpled paper bag on the road, which can be run over,
21
93265
3970
一團它其實輾過也無妨的破紙袋,
01:37
and a rock that size, which should be avoided.
22
97235
3340
以及一個大到它必須閃避的石塊, 兩者有何不同。
我們製造出畫素極高的相機,
01:41
We have made fabulous megapixel cameras,
23
101415
3390
01:44
but we have not delivered sight to the blind.
24
104805
3135
但我們卻無法賦予盲人視覺;
無人機可以翻山越嶺,
01:48
Drones can fly over massive land,
25
108420
3305
01:51
but don't have enough vision technology
26
111725
2134
卻沒有足夠的視覺技術可以
01:53
to help us to track the changes of the rainforests.
27
113859
3461
讓我們追蹤雨林的變化;
01:57
Security cameras are everywhere,
28
117320
2950
監視器滿佈在各個角落,
02:00
but they do not alert us when a child is drowning in a swimming pool.
29
120270
5067
卻無法在看到一個孩子將溺斃在泳池之際, 對我們發出警訊。
靜態及動態影像已逐漸與全世界的生活密不可分,
02:06
Photos and videos are becoming an integral part of global life.
30
126167
5595
02:11
They're being generated at a pace that's far beyond what any human,
31
131762
4087
它們發展的步伐已經遠遠超越人類
02:15
or teams of humans, could hope to view,
32
135849
2783
及其群體所相信的,
02:18
and you and I are contributing to that at this TED.
33
138632
3921
在座各位以及我自己 都是TED這個活動裡頭的推手。
02:22
Yet our most advanced software is still struggling at understanding
34
142553
5232
然而,目前最先進的軟體卻仍在其中苦苦掙扎,
無法理解與應用這龐大的資料體。
02:27
and managing this enormous content.
35
147785
3876
02:31
So in other words, collectively as a society,
36
151661
5272
換而言之,在這整個社會裡,
02:36
we're very much blind,
37
156933
1746
大家都有如盲人在運作,
02:38
because our smartest machines are still blind.
38
158679
3387
因為連我們最聰明的機器都還看不見。
02:43
"Why is this so hard?" you may ask.
39
163526
2926
或許有人會問:這到底有什麼困難?
02:46
Cameras can take pictures like this one
40
166452
2693
任何相機都可以產生像這樣的照片,
02:49
by converting light into a two-dimensional array of numbers
41
169145
3994
它是藉由將有色光轉換成2D的數字陣列,
02:53
known as pixels,
42
173139
1650
也就是大家熟知的像素。
02:54
but these are just lifeless numbers.
43
174789
2251
但這些數字是死的,
02:57
They do not carry meaning in themselves.
44
177040
3111
並沒有被賦予意義。
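To make the "lifeless numbers" point concrete, here is a minimal sketch in Python. The 4x4 grayscale grid and the helper name `brightness_stats` are illustrative assumptions, not something shown in the talk.

```python
# A tiny 4x4 grayscale "photo": each pixel is an intensity from 0 (black) to 255 (white).
# The values are made up for illustration.
image = [
    [0,   0,   0,   0],
    [0, 255, 255,   0],
    [0, 255, 255,   0],
    [0,   0,   0,   0],
]

def brightness_stats(pixels):
    """Return (height, width, mean intensity) of a 2D list of pixel values."""
    height = len(pixels)
    width = len(pixels[0])
    total = sum(sum(row) for row in pixels)
    return height, width, total / (height * width)

height, width, mean = brightness_stats(image)
print(f"{height}x{width} image, mean intensity {mean}")
```

The computer can summarize these numbers easily, but nothing in the array says "a bright square on a dark background". That gap between numbers and meaning is what the talk is about.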
03:00
Just like to hear is not the same as to listen,
45
180151
4343
就好像有「聽」,不代表有「到」。
03:04
to take pictures is not the same as to see,
46
184494
4040
同樣地,攝取到影像不等於看見,
03:08
and by seeing, we really mean understanding.
47
188534
3829
我們所認知的看到,應包含著了解其中的意義。
03:13
In fact, it took Mother Nature 540 million years of hard work
48
193293
6177
事實上,這樣的成果, 是大自然花了五億四千萬年的光陰
03:19
to do this task,
49
199470
1973
才得到的。
03:21
and much of that effort
50
201443
1881
這其中的努力,
03:23
went into developing the visual processing apparatus of our brains,
51
203324
5271
泰半是耗費在發展腦部的視覺處理這個區塊,
03:28
not the eyes themselves.
52
208595
2647
而不是眼睛的部分。
03:31
So vision begins with the eyes,
53
211242
2747
也就是說,視覺始於眼睛,
03:33
but it truly takes place in the brain.
54
213989
3518
但真正使它有用的,卻是大腦。
03:38
So for 15 years now, starting from my Ph.D. at Caltech
55
218287
5060
十五年來,從在加州理工學院攻讀博士開始,
03:43
and then leading Stanford's Vision Lab,
56
223347
2926
到領導史丹佛的視覺實驗室,
03:46
I've been working with my mentors, collaborators and students
57
226273
4396
我和指導教授、同事及學生們,
03:50
to teach computers to see.
58
230669
2889
試圖讓電腦擁有智能之眼,
03:54
Our research field is called computer vision and machine learning.
59
234658
3294
我們研究的領域稱之為電腦視覺與機器學習,
03:57
It's part of the general field of artificial intelligence.
60
237952
3878
這是人工智慧其中一環。
04:03
So ultimately, we want to teach the machines to see just like we do:
61
243000
5493
我們的終極目標就是教導機器能夠像人一樣理解所見之物,
04:08
naming objects, identifying people, inferring 3D geometry of things,
62
248493
5387
像是識別物品、辨認人臉、 推論物體的幾何形態,
04:13
understanding relations, emotions, actions and intentions.
63
253880
5688
進而理解其中的關聯、情緒、動作及意圖。
04:19
You and I weave together entire stories of people, places and things
64
259568
6153
在座每一位和我,都可以在匆匆一瞥的瞬間,
理解到人事、地、物所交織而成的網絡,
04:25
the moment we lay our gaze on them.
65
265721
2164
04:28
The first step towards this goal is to teach a computer to see objects,
66
268955
5583
要電腦達成這個目標的第一步,就是教導它辨別物品,
04:34
the building block of the visual world.
67
274538
3368
這是視覺的基石。
04:37
In its simplest terms, imagine this teaching process
68
277906
4434
簡單來說,我們教導的方法就是
04:42
as showing the computers some training images
69
282340
2995
給電腦看一些特定物體的影像,
04:45
of a particular object, let's say cats,
70
285335
3321
例如貓咪。
04:48
and designing a model that learns from these training images.
71
288656
4737
我們設計了一個程式讓電腦利用這些影像來學習
04:53
How hard can this be?
72
293393
2044
這有啥困難?
04:55
After all, a cat is just a collection of shapes and colors,
73
295437
4052
貓咪不就是由一些幾何圖形和顏色所組成的嘛,
04:59
and this is what we did in the early days of object modeling.
74
299489
4086
這就是我們初期所做的物體模型。
05:03
We'd tell the computer algorithm in a mathematical language
75
303575
3622
我們用數學語言來告知電腦演繹方法,
05:07
that a cat has a round face, a chubby body,
76
307197
3343
貓就是有圓圓的臉、胖胖的身體,
05:10
two pointy ears, and a long tail,
77
310540
2299
兩個尖尖的耳朵和一條長尾巴。
05:12
and that looked all fine.
78
312839
1410
看起來很好啊,
05:14
But what about this cat?
79
314859
2113
但如果貓咪長這樣呢?
05:16
(Laughter)
80
316972
1091
(觀眾笑)
05:18
It's all curled up.
81
318063
1626
全身都捲起來了。
05:19
Now you have to add another shape and viewpoint to the object model.
82
319689
4719
這下子我們又得在原來的模型 加上新的形狀和不同的視野角度。
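The hand-written object model described above can be sketched roughly as follows. This is only an illustration in Python: the feature names are hypothetical, and a real system would first have to extract such features from pixels.

```python
# Hand-written "object model" for a cat, in the spirit of the early approach in the talk.
# Feature names are hypothetical placeholders.
CAT_RULES = {
    "round_face": True,
    "chubby_body": True,
    "pointy_ears": 2,
    "long_tail": True,
}

def looks_like_cat(features):
    """Return True only if every hand-written rule is satisfied."""
    return all(features.get(name) == expected for name, expected in CAT_RULES.items())

sitting_cat = {"round_face": True, "chubby_body": True, "pointy_ears": 2, "long_tail": True}
# A curled-up cat: the tail is tucked away and only one ear is visible.
curled_up_cat = {"round_face": True, "chubby_body": True, "pointy_ears": 1, "long_tail": False}

print(looks_like_cat(sitting_cat))    # True
print(looks_like_cat(curled_up_cat))  # False
```

The curled-up cat fails the rules, so the model needs yet another hand-written shape and viewpoint, and the same again for every new pose.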
05:24
But what if cats are hidden?
83
324408
1715
又,如果貓咪是躲著的呢?
05:27
What about these silly cats?
84
327143
2219
像這群傻貓?
05:31
Now you get my point.
85
331112
2417
這樣各位了解我的意思嗎?
05:33
Even something as simple as a household pet
86
333529
3367
即使簡單如貓這樣的家庭寵物,
05:36
can present an infinite number of variations to the object model,
87
336896
4504
也會有相對於原型以外,無數的其他形態表徵,
05:41
and that's just one object.
88
341400
2233
而這只是其中一樣。
05:44
So about eight years ago,
89
344573
2492
因此八年前,
05:47
a very simple and profound observation changed my thinking.
90
347065
5030
一項極其簡單和深刻的觀察,改變了我的想法,
05:53
No one tells a child how to see,
91
353425
2685
沒有人教導孩子如何去「看」,
05:56
especially in the early years.
92
356110
2261
特別是在早期發育階段,
05:58
They learn this through real-world experiences and examples.
93
358371
5000
他們是從真實世界的經驗中學習。
06:03
If you consider a child's eyes
94
363371
2740
如果你把孩童的眼睛
06:06
as a pair of biological cameras,
95
366111
2554
當成生物相機的概念,
06:08
they take one picture about every 200 milliseconds,
96
368665
4180
就如同每200毫秒就拍一張照片一樣,
06:12
the average time an eye movement is made.
97
372845
3134
這是眼球移動的平均時間。
06:15
So by age three, a child would have seen hundreds of millions of pictures
98
375979
5550
年紀到了三歲時, 孩子們已經看過了真實世界中
數以億計的照片,
06:21
of the real world.
99
381529
1834
06:23
That's a lot of training examples.
100
383363
2280
這樣的訓練範例是很大量的。
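The "hundreds of millions" figure can be checked with a quick back-of-the-envelope calculation. The 200-millisecond interval comes from the talk; the 12 waking hours per day is an assumption added here for illustration.

```python
# Rough check of "hundreds of millions of pictures by age three".
MS_PER_PICTURE = 200          # from the talk: one "picture" about every 200 ms
WAKING_HOURS_PER_DAY = 12     # assumed, not stated in the talk
DAYS_PER_YEAR = 365
YEARS = 3

pictures_per_second = 1000 // MS_PER_PICTURE
pictures_per_day = pictures_per_second * 3600 * WAKING_HOURS_PER_DAY
total_pictures = pictures_per_day * DAYS_PER_YEAR * YEARS

print(f"{total_pictures:,} pictures")  # 236,520,000 pictures
```

Even with the child awake only half the day, the total lands well inside the "hundreds of millions" range.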
06:26
So instead of focusing solely on better and better algorithms,
101
386383
5989
因此,我的直覺告訴我 應該以孩童的學習經驗法則,
06:32
my insight was to give the algorithms the kind of training data
102
392372
5272
並兼以質與量,
提供訓練的資料給電腦,
06:37
that a child was given through experiences
103
397644
3319
06:40
in both quantity and quality.
104
400963
3878
而非一昧追求更好的程式演算。
06:44
Once we knew this,
105
404841
1858
有了上述的洞見,
06:46
we knew we needed to collect a data set
106
406699
2971
我們接下來必須要收集
06:49
that has far more images than we have ever had before,
107
409670
4459
前所未有的大量資料群,
06:54
perhaps thousands of times more,
108
414129
2577
甚至於是千倍以上的。
06:56
and together with Professor Kai Li at Princeton University,
109
416706
4111
於是我與普林斯頓大學的李凱教授
07:00
we launched the ImageNet project in 2007.
110
420817
4752
共同於2007年開始了 我們稱之為 ImageNet 的專案。
07:05
Luckily, we didn't have to mount a camera on our head
111
425569
3838
很幸運地,我們不必在頭上綁一個相機,
07:09
and wait for many years.
112
429407
1764
然後花費數年收集影像,
07:11
We went to the Internet,
113
431171
1463
而是轉而由網際網路,
07:12
the biggest treasure trove of pictures that humans have ever created.
114
432634
4436
這個由人類所創造出來 龐大的影像寶窟,
07:17
We downloaded nearly a billion images
115
437070
3041
我們下載了將近十億幅影像,
07:20
and used crowdsourcing technology like the Amazon Mechanical Turk platform
116
440111
5880
並且使用如Amazon Mechanical Turk 這樣的群眾外包平台,
07:25
to help us to label these images.
117
445991
2339
來協助我們處理及分類這些照片。
07:28
At its peak, ImageNet was one of the biggest employers
118
448330
4900
在高峰期,ImageNet 甚至是整個亞馬遜平台
07:33
of the Amazon Mechanical Turk workers:
119
453230
2996
最大的雇主之一,
07:36
together, almost 50,000 workers
120
456226
3854
我們一共聘請了來自167個國家,
07:40
from 167 countries around the world
121
460080
4040
約5萬個工作者,
07:44
helped us to clean, sort and label
122
464120
3947
來協助我們分類處理並標示
07:48
nearly a billion candidate images.
123
468067
3575
將近10億幅影像,
07:52
That was how much effort it took
124
472612
2653
花費了這麼多的資源,
07:55
to capture even a fraction of the imagery
125
475265
3900
就是為了捕捉那一絲絲
07:59
a child's mind takes in in the early developmental years.
126
479165
4171
孩童在早期心智發展的浮光掠影。
08:04
In hindsight, this idea of using big data
127
484148
3902
用現在眼光看來,使用大量的資料
08:08
to train computer algorithms may seem obvious now,
128
488050
4550
來訓練電腦演算是明顯合理的,
08:12
but back in 2007, it was not so obvious.
129
492600
4110
然而在2007年的世界卻非如此。
08:16
We were fairly alone on this journey for quite a while.
130
496710
3878
有好長一段時間, 我們在這個旅途中孤獨地踽踽而行,
08:20
Some very friendly colleagues advised me to do something more useful for my tenure,
131
500588
5003
有些同事好心地建議我, 與其苦苦掙扎於研究經費的募集,
08:25
and we were constantly struggling for research funding.
132
505591
4342
還不如轉而先做些比較好拿到終身聘的研究,
08:29
Once, I even joked to my graduate students
133
509933
2485
我還曾跟我的研究生開玩笑說
08:32
that I would just reopen my dry cleaner's shop to fund ImageNet.
134
512418
4063
我乾脆再開一間乾洗店來資助ImageNet 好了,
08:36
After all, that's how I funded my college years.
135
516481
4761
畢竟那就是我用以支付大學學費的方法。
08:41
So we carried on.
136
521242
1856
就這樣我們還是繼續往前走,
08:43
In 2009, the ImageNet project delivered
137
523098
3715
2009年起,ImageNet 已經是個擁有
08:46
a database of 15 million images
138
526813
4042
涵蓋了兩萬兩千種不同類別,
08:50
across 22,000 classes of objects and things
139
530855
4805
多達1500萬幅圖像的資料庫,
08:55
organized by everyday English words.
140
535660
3320
並組織以英語日常生活用字為主,
08:58
In both quantity and quality,
141
538980
2926
這樣的規模,不論是「質」或「量」
09:01
this was an unprecedented scale.
142
541906
2972
都是史無前例的。
09:04
As an example, in the case of cats,
143
544878
3461
用貓來舉個例子說明,
09:08
we have more than 62,000 cats
144
548339
2809
我們有超過六萬兩千種
09:11
of all kinds of looks and poses
145
551148
4110
不同外觀和姿勢的貓咪,
09:15
and across all species of domestic and wild cats.
146
555258
5223
橫跨不同的種類,有家貓,也有野貓。
09:20
We were thrilled to have put together ImageNet,
147
560481
3344
ImageNet 的成果讓我們非常激動,
09:23
and we wanted the whole research world to benefit from it,
148
563825
3738
我們希望它有助於全世界的研究,
09:27
so in the TED fashion, we opened up the entire data set
149
567563
4041
就如同 TED 的貢獻,我們免費提供整個資料庫
09:31
to the worldwide research community for free.
150
571604
3592
給全世界的研究單位。
(觀眾鼓掌)
09:36
(Applause)
151
576636
4000
09:41
Now that we have the data to nourish our computer brain,
152
581416
4538
有了這些資料,我們可以教育我們的電腦,
09:45
we're ready to come back to the algorithms themselves.
153
585954
3737
下一步就是回到程式演算的部分了。
09:49
As it turned out, the wealth of information provided by ImageNet
154
589691
5178
結果我們發現,ImageNet 所提供的豐富資訊
09:54
was a perfect match to a particular class of machine learning algorithms
155
594869
4806
恰巧與機器學習演算的其中一門特定領域 不謀而合,
09:59
called convolutional neural networks,
156
599675
2415
我們稱它為「卷積神經網絡」,
10:02
pioneered by Kunihiko Fukushima, Geoff Hinton, and Yann LeCun
157
602090
5248
在七零及八零年代,福島邦彥、Geoff Hinton
10:07
back in the 1970s and '80s.
158
607338
3645
和 Yann LeCun 等學者為該領域的先驅。
10:10
Just like the brain consists of billions of highly connected neurons,
159
610983
5619
正如同大腦是由無數個緊密連結的神經元所組成,
10:16
a basic operating unit in a neural network
160
616602
3854
神經網絡的基本運作單位
10:20
is a neuron-like node.
161
620456
2415
也是一個類神經元的節點。
10:22
It takes input from other nodes
162
622871
2554
它的運作方式是從別的節點得到資料,
10:25
and sends output to others.
163
625425
2718
然後再傳給其他的節點。
10:28
Moreover, these hundreds of thousands or even millions of nodes
164
628143
4713
而且這些數不清的節點
10:32
are organized in hierarchical layers,
165
632856
3227
擁有層層的組織架構,
10:36
also similar to the brain.
166
636083
2554
就好像我們的大腦一樣。
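As a rough illustration of a neuron-like node and of nodes stacked in layers, consider the Python sketch below. The sigmoid activation and all weights are assumptions chosen for the example; the talk does not specify them, and this is not a convolutional network.

```python
import math

def node(inputs, weights, bias):
    """One neuron-like node: weighted sum of its inputs, squashed by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is a list of nodes that all read the same inputs."""
    return [node(inputs, weights, bias) for weights, bias in zip(weight_rows, biases)]

# Two stacked layers: the outputs of the first become the inputs of the second.
# All weights and biases are arbitrary illustrative values, not trained ones.
pixels = [0.0, 1.0, 1.0]
hidden = layer(pixels, [[0.5, -0.5, 1.0], [1.0, 1.0, -1.0]], [0.0, 0.0])
output = layer(hidden, [[1.0, -1.0]], [0.0])
print(hidden, output)
```

Training consists of adjusting the weights and biases so that the final output matches the labels in the training images. The model described in the talk does this with millions of nodes rather than three.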
10:38
In a typical neural network we use to train our object recognition model,
167
638637
4783
在一般的神經網絡中, 我們用作訓練的物品辨識模型
10:43
it has 24 million nodes,
168
643420
3181
就有兩千四百萬個節點、
10:46
140 million parameters,
169
646601
3297
一億四千萬個參數,
10:49
and 15 billion connections.
170
649898
2763
以及一百五十億個連結。
10:52
That's an enormous model.
171
652661
2415
這是一個大的不得了的模型。
10:55
Powered by the massive data from ImageNet
172
655076
3901
由ImageNet 提供巨大的資料群、
10:58
and the modern CPUs and GPUs to train such a humongous model,
173
658977
5433
並使用先進的核心處理器及圖型處理器來訓練 這個龐然大物,
11:04
the convolutional neural network
174
664410
2369
卷積神經網絡就在眾人的意料外
11:06
blossomed in a way that no one expected.
175
666779
3436
開花結果了。
11:10
It became the winning architecture
176
670215
2508
在物品辨識領域中,這樣的架構
11:12
to generate exciting new results in object recognition.
177
672723
5340
以令人興奮的嶄新成果,傲視群雄。
11:18
This is a computer telling us
178
678063
2810
電腦告訴我們
11:20
this picture contains a cat
179
680873
2300
這張圖中有隻貓,
11:23
and where the cat is.
180
683173
1903
還告訴我們貓在哪裡。
11:25
Of course there are more things than cats,
181
685076
2112
當然,這世界不會只有貓,
11:27
so here's a computer algorithm telling us
182
687188
2438
電腦的演算告訴我們
11:29
the picture contains a boy and a teddy bear;
183
689626
3274
這張圖中有一個男孩和一隻泰迪熊;
11:32
a dog, a person, and a small kite in the background;
184
692900
4366
有狗,一個人,以及背景中的一支小風箏;
11:37
or a picture of very busy things
185
697266
3135
或這一張令人眼花撩亂的圖,
11:40
like a man, a skateboard, railings, a lamppost, and so on.
186
700401
4644
有人、滑板、欄杆、路燈,等等。
11:45
Sometimes, when the computer is not so confident about what it sees,
187
705045
5293
有時候,如果電腦不確定自己所見到的東西時,
11:51
we have taught it to be smart enough
188
711498
2276
我們已經將它教到可以聰明地
11:53
to give us a safe answer instead of committing too much,
189
713774
3878
給一個安全的答案,而非莽撞地回答,
11:57
just like we would do,
190
717652
2811
就像一般人會做的。
12:00
but other times our computer algorithm is remarkable at telling us
191
720463
4666
更有些時候,電腦的運算竟能夠
12:05
what exactly the objects are,
192
725129
2253
精準地辨別物體品項
12:07
like the make, model, year of the cars.
193
727382
3436
例如製造商、型號、車子的年份。
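One possible way to produce a "safe answer" is to back off from a specific label to a more general one when confidence is low. The Python sketch below is an assumption about how that could work, not the method used in the talk; the label hierarchy, the scores and the 0.8 threshold are all hypothetical.

```python
# Hypothetical label hierarchy: each label points to a more general parent.
PARENT = {
    "golden retriever": "dog",
    "labrador": "dog",
    "dog": "animal",
    "sedan": "car",
    "car": "vehicle",
}

def safe_answer(scores, threshold=0.8):
    """Pick the most likely label, backing off to a more general one
    until the accumulated confidence reaches the threshold."""
    # Add each label's score to all of its ancestors.
    totals = dict(scores)
    for label, score in scores.items():
        parent = PARENT.get(label)
        while parent is not None:
            totals[parent] = totals.get(parent, 0.0) + score
            parent = PARENT.get(parent)
    # Walk up from the best specific guess until we are confident enough.
    answer = max(scores, key=scores.get)
    while totals[answer] < threshold and answer in PARENT:
        answer = PARENT[answer]
    return answer

unsure = {"golden retriever": 0.55, "labrador": 0.35, "sedan": 0.10}
sure = {"golden retriever": 0.95, "labrador": 0.03, "sedan": 0.02}
print(safe_answer(unsure))  # dog
print(safe_answer(sure))    # golden retriever
```

When the model cannot decide between two breeds, it answers "dog" instead of guessing. When it is confident, it commits to the specific label.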
12:10
We applied this algorithm to millions of Google Street View images
194
730818
5386
我們將這個演算程式廣泛地運用在 Google 的
12:16
across hundreds of American cities,
195
736204
3135
數百個美國城市的街景裡,
12:19
and we have learned something really interesting:
196
739339
2926
也因此我們從中得到了一些有趣的概念。
12:22
first, it confirmed our common wisdom
197
742265
3320
首先,它證實了一項廣為人知的說法,
12:25
that car prices correlate very well
198
745585
3290
也就是汽車價格和家庭收入
12:28
with household incomes.
199
748875
2345
是息息相關的。
12:31
But surprisingly, car prices also correlate well
200
751220
4527
然而令人驚訝的是,汽車價格也和
12:35
with crime rates in cities,
201
755747
2300
城市中的犯罪率
12:39
or voting patterns by zip codes.
202
759007
3963
以及區域選舉模式,有相當的關係。
12:44
So wait a minute. Is that it?
203
764060
2206
等等,難道說我今天
12:46
Has the computer already matched or even surpassed human capabilities?
204
766266
5153
就是來告訴各位電腦已經趕上 甚至超越人類了嗎?
12:51
Not so fast.
205
771419
2138
還早得很呢。
12:53
So far, we have just taught the computer to see objects.
206
773557
4923
到目前為止,我們只是教導電腦識別物品,
12:58
This is like a small child learning to utter a few nouns.
207
778480
4644
就像小孩子牙牙學語一樣,
13:03
It's an incredible accomplishment,
208
783124
2670
雖然這是個傲人的進展,
13:05
but it's only the first step.
209
785794
2460
但它不過是第一步而已,
13:08
Soon, another developmental milestone will be hit,
210
788254
3762
很快地,下一波具指標性的後浪就會打上來了,
13:12
and children begin to communicate in sentences.
211
792016
3461
小孩子開始進展到用句子來溝通。
13:15
So instead of saying this is a cat in the picture,
212
795477
4224
因此,他已經不會用「這是貓」 來描述圖片,
13:19
you already heard the little girl telling us this is a cat lying on a bed.
213
799701
5202
而是會聽到這個小女孩說「這是躺在床上的貓」。
13:24
So to teach a computer to see a picture and generate sentences,
214
804903
5595
因此,要教導電腦看到圖並說出句子,
13:30
the marriage between big data and machine learning algorithms
215
810498
3948
必須進一步地仰賴龐大資料群
13:34
has to take another step.
216
814446
2275
以及機器的學習演算。
13:36
Now, the computer has to learn from both pictures
217
816721
4156
現在,電腦不僅要學習圖片識別,
13:40
as well as natural language sentences
218
820877
2856
還要學習人類自然的
13:43
generated by humans.
219
823733
3322
說話方式。
13:47
Just like the brain integrates vision and language,
220
827055
3853
就如同大腦要結合視覺和語言一樣,
13:50
we developed a model that connects parts of visual things
221
830908
5201
我們做出了一個模型, 它可以連結不同的可視物體,
13:56
like visual snippets
222
836109
1904
就像視覺片段一樣,
13:58
with words and phrases in sentences.
223
838013
4203
並附上句子用的字詞和片語。
14:02
About four months ago,
224
842216
2763
約四個月前,
14:04
we finally tied all this together
225
844979
2647
我們終於把所有的元素全部兜起來了,
14:07
and produced one of the first computer vision models
226
847626
3784
做出了第一個電腦版的模型,
14:11
that is capable of generating a human-like sentence
227
851410
3994
它有辦法在初次看到照片時
14:15
when it sees a picture for the first time.
228
855404
3506
說出像人類般自然的句子,
14:18
Now, I'm ready to show you what the computer says
229
858910
4644
好,現在我要給各位看看電腦
14:23
when it sees the picture
230
863554
1975
對於演講一開頭
14:25
that the little girl saw at the beginning of this talk.
231
865529
3830
那位小女孩所看到的影像, 它又是如何理解的。
14:31
(Video) Computer: A man is standing next to an elephant.
232
871519
3344
(電腦) 有個人站在大象旁邊。
14:36
A large airplane sitting on top of an airport runway.
233
876393
3634
一架大飛機停在機場跑道上。
14:41
FFL: Of course, we're still working hard to improve our algorithms,
234
881057
4212
(主講人) 當然,我們仍戮力於改善這電腦程式,
14:45
and it still has a lot to learn.
235
885269
2596
它還有很多要學。
14:47
(Applause)
236
887865
2291
(觀眾鼓掌)
14:51
And the computer still makes mistakes.
237
891556
3321
電腦還是會犯錯。
14:54
(Video) Computer: A cat lying on a bed in a blanket.
238
894877
3391
(電腦) 一隻貓包著毯子躺在床上。
14:58
FFL: So of course, when it sees too many cats,
239
898268
2553
(主講人) 因為它看了太多貓了,
15:00
it thinks everything might look like a cat.
240
900821
2926
以至於它見到了什麼都像貓咪。
15:05
(Video) Computer: A young boy is holding a baseball bat.
241
905317
2864
(電腦) 一位小男孩握著一支球棒。
15:08
(Laughter)
242
908181
1765
(觀眾笑)
15:09
FFL: Or, if it hasn't seen a toothbrush, it confuses it with a baseball bat.
243
909946
4583
(主講人) 或者,如果電腦是第一次看到牙刷, 會把它與球棒混淆。
15:15
(Video) Computer: A man riding a horse down a street next to a building.
244
915309
3434
(電腦) 一個人在建築物旁的街道上騎馬。
15:18
(Laughter)
245
918743
2023
(觀眾笑)
15:20
FFL: We haven't taught Art 101 to the computers.
246
920766
3552
(主講人) 我們還沒讓電腦上基礎美術課。
15:25
(Video) Computer: A zebra standing in a field of grass.
247
925768
2884
(電腦) 一匹斑馬站在原野中。
15:28
FFL: And it hasn't learned to appreciate the stunning beauty of nature
248
928652
3367
(主講人) 電腦還沒辦法像人類一樣,
15:32
like you and I do.
249
932019
2438
學會欣賞大自然的美景。
15:34
So it has been a long journey.
250
934457
2832
這是條漫漫長路,
15:37
To get from age zero to three was hard.
251
937289
4226
要從零歲發展到三歲是很難的,
15:41
The real challenge is to go from three to 13 and far beyond.
252
941515
5596
更艱深的挑戰在於從三歲發展到十三歲, 甚至到更遠的階段。
15:47
Let me remind you with this picture of the boy and the cake again.
253
947111
4365
讓我用這張男孩與蛋糕的圖片來進一步說明,
15:51
So far, we have taught the computer to see objects
254
951476
4064
直到今日,我們已經教會了電腦識別物品,
15:55
or even tell us a simple story when seeing a picture.
255
955540
4458
甚至於在看到一張圖後,可以簡單地敘述。
15:59
(Video) Computer: A person sitting at a table with a cake.
256
959998
3576
(電腦) 一個人和蛋糕坐在桌旁。
16:03
FFL: But there's so much more to this picture
257
963574
2630
(主講人) 這張照片其實蘊涵著更多的東西,
16:06
than just a person and a cake.
258
966204
2270
不僅只有人和蛋糕。
16:08
What the computer doesn't see is that this is a special Italian cake
259
968474
4467
電腦看不出這是種特別的義式蛋糕,
16:12
that's only served during Easter time.
260
972941
3217
人們只有在復活節時才會做。
16:16
The boy is wearing his favorite t-shirt
261
976158
3205
這個男孩穿著他最心愛的T恤,
16:19
given to him as a gift by his father after a trip to Sydney,
262
979363
3970
是去雪梨玩的時候,他的父親送的,
16:23
and you and I can all tell how happy he is
263
983333
3808
各位和我都可以看得出他有多快樂,
16:27
and what's exactly on his mind at that moment.
264
987141
3203
以及當時他的心裡在想什麼。
16:31
This is my son Leo.
265
991214
3125
這是我兒子,李奧。
16:34
On my quest for visual intelligence,
266
994339
2624
在探索智能視覺的旅途上,
16:36
I think of Leo constantly
267
996963
2391
我不斷地想到他,
16:39
and the future world he will live in.
268
999354
2903
以及他在將來生活的世界,
16:42
When machines can see,
269
1002257
2021
當未來,機器有了視覺,
16:44
doctors and nurses will have extra pairs of tireless eyes
270
1004278
4712
醫生和護士就多了雙永不倦怠的眼睛,
16:48
to help them to diagnose and take care of patients.
271
1008990
4092
幫助他們診斷及照顧病人;
16:53
Cars will run smarter and safer on the road.
272
1013082
4383
行駛在路上的車子可以更聰明、更安全;
16:57
Robots, not just humans,
273
1017465
2694
人類與機器人能一起
17:00
will help us to brave the disaster zones to save the trapped and wounded.
274
1020159
4849
共同投入災區的救援工作,拯救受困人員及傷者;
17:05
We will discover new species, better materials,
275
1025798
3796
我們還可以發現新品種 與更好的材料,
17:09
and explore unseen frontiers with the help of the machines.
276
1029594
4509
探索未知的疆界, 這一切都可仰賴機器的協助。
17:15
Little by little, we're giving sight to the machines.
277
1035113
4167
一步一步地,我們賦予機器視覺,
17:19
First, we teach them to see.
278
1039280
2798
先教他們識別物品,
17:22
Then, they help us to see better.
279
1042078
2763
然後它們也讓我們看得更清楚,
17:24
For the first time, human eyes won't be the only ones
280
1044841
4165
這是第一次人類的眼睛不是唯一
17:29
pondering and exploring our world.
281
1049006
2934
可以用來思考和探索世界的工具,
17:31
We will not only use the machines for their intelligence,
282
1051940
3460
我們不僅可以利用機器的智能,
17:35
we will also collaborate with them in ways that we cannot even imagine.
283
1055400
6179
更可以運用更多你想像不到的方式攜手合作。
17:41
This is my quest:
284
1061579
2161
這是我想追求的目標:
17:43
to give computers visual intelligence
285
1063740
2712
給予機器智慧之眼,
17:46
and to create a better future for Leo and for the world.
286
1066452
5131
為李奧和整個世界創造更美好的未來。
17:51
Thank you.
287
1071583
1811
謝謝各位。
17:53
(Applause)
288
1073394
3785
(觀眾鼓掌)