The wonderful and terrifying implications of computers that can learn | Jeremy Howard
597,885 views ・ 2014-12-16
Translator: Yuchen Shen
Reviewer: Li Li
00:12
It used to be that if you wanted to get a computer to do something new,
00:16
you would have to program it.
00:18
Now, programming, for those of you here that haven't done it yourself,
00:21
requires laying out in excruciating detail
00:25
every single step that you want the computer to do
00:28
in order to achieve your goal.
00:31
Now, if you want to do something that you don't know how to do yourself,
00:34
then this is going to be a great challenge.
00:36
So this was the challenge faced by this man, Arthur Samuel.
00:40
In 1956, he wanted to get this computer
00:44
to be able to beat him at checkers.
00:46
How can you write a program,
00:48
lay out in excruciating detail, how to be better than you at checkers?
00:52
So he came up with an idea:
00:54
he had the computer play against itself thousands of times
00:57
and learn how to play checkers.
01:00
And indeed it worked, and in fact, by 1962,
01:03
this computer had beaten the Connecticut state champion.
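The self-play idea can be sketched on a much smaller game than checkers. This is not Samuel's program: it is a minimal illustration, assuming a toy game of Nim (players alternately take 1 to 3 stones, and whoever takes the last stone wins) and a simple table of move values that is nudged after every move. All function names and parameters here are made up for the example.

```python
import random


def legal_moves(stones):
    """A player may take 1, 2, or 3 stones, but never more than remain."""
    return range(1, min(3, stones) + 1)


def train_by_self_play(episodes=20000, max_stones=12, alpha=0.5,
                       epsilon=0.3, seed=0):
    """Let one program play both sides of Nim and learn a value per move.

    q[(stones, move)] estimates how good `move` is for the player about
    to move: close to +1 means a likely win, close to -1 a likely loss.
    """
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)  # random starting position
        while stones > 0:
            moves = list(legal_moves(stones))
            if rng.random() < epsilon:
                move = rng.choice(moves)  # occasionally try something new
            else:
                move = max(moves, key=lambda m: q.get((stones, m), 0.0))
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so their best value is our loss.
                target = -max(q.get((remaining, m), 0.0)
                              for m in legal_moves(remaining))
            old = q.get((stones, move), 0.0)
            q[(stones, move)] = old + alpha * (target - old)
            stones = remaining  # the other side now moves, same table
    return q


def best_move(q, stones):
    """Pick the move the learned table currently rates highest."""
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))
```

Nobody tells the program the known winning rule for this game (leave your opponent a multiple of four stones). In this sketch it emerges from the table after enough games against itself, which is the point of the anecdote: the strategy is learned from play rather than written out step by step.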
01:07
So Arthur Samuel was the father of machine learning,
01:10
and I have a great debt to him,
01:12
because I am a machine learning practitioner.
01:15
I was the president of Kaggle,
01:16
a community of over 200,000 machine learning practitioners.
01:19
Kaggle puts up competitions
01:21
to try and get them to solve previously unsolved problems,
01:25
and it's been successful hundreds of times.
01:29
So from this vantage point, I was able to find out
01:31
a lot about what machine learning can do in the past, can do today,
01:35
and what it could do in the future.
01:38
Perhaps the first big success of machine learning commercially was Google.
01:42
Google showed that it is possible to find information
01:45
by using a computer algorithm,
01:47
and this algorithm is based on machine learning.
01:50
Since that time, there have been many commercial successes of machine learning.
01:54
Companies like Amazon and Netflix
01:56
use machine learning to suggest products that you might like to buy,
01:59
movies that you might like to watch.
02:01
Sometimes, it's almost creepy.
02:03
Companies like LinkedIn and Facebook
02:05
sometimes will tell you about who your friends might be
02:08
and you have no idea how it did it,
02:10
and this is because it's using the power of machine learning.
02:13
These are algorithms that have learned how to do this from data
02:16
rather than being programmed by hand.
02:19
This is also how IBM was successful
02:21
in getting Watson to beat the two world champions at "Jeopardy,"
02:25
answering incredibly subtle and complex questions like this one.
02:28
["The ancient 'Lion of Nimrud' went missing from this city's national museum in 2003 (along with a lot of other stuff)"]
02:31
This is also why we are now able to see the first self-driving cars.
02:35
If you want to be able to tell the difference between, say,
02:37
a tree and a pedestrian, well, that's pretty important.
02:40
We don't know how to write those programs by hand,
02:43
but with machine learning, this is now possible.
02:46
And in fact, this car has driven over a million miles
02:48
without any accidents on regular roads.
02:52
So we now know that computers can learn,
02:56
and computers can learn to do things
02:58
that we actually sometimes don't know how to do ourselves,
03:00
or maybe can do them better than us.
03:03
One of the most amazing examples I've seen of machine learning
03:07
happened on a project that I ran at Kaggle
03:10
where a team run by a guy called Geoffrey Hinton
03:13
from the University of Toronto
03:15
won a competition for automatic drug discovery.
03:18
Now, what was extraordinary here is not just that they beat
03:20
all of the algorithms developed by Merck or the international academic community,
03:25
but nobody on the team had any background in chemistry or biology or life sciences,
03:30
and they did it in two weeks.
03:32
How did they do this?
03:34
They used an extraordinary algorithm called deep learning.
03:37
So important was this that in fact the success was covered
03:40
in The New York Times in a front page article a few weeks later.
03:43
This is Geoffrey Hinton here on the left-hand side.
03:46
Deep learning is an algorithm inspired by how the human brain works,
03:50
and as a result it's an algorithm
03:52
which has no theoretical limitations on what it can do.
03:56
The more data you give it and the more computation time you give it,
03:58
the better it gets.
04:00
The New York Times also showed in this article
04:02
another extraordinary result of deep learning
04:04
which I'm going to show you now.
04:07
It shows that computers can listen and understand.
04:12
(Video) Richard Rashid: Now, the last step
04:15
that I want to be able to take in this process
04:18
is to actually speak to you in Chinese.
04:22
Now the key thing there is,
04:25
we've been able to take a large amount of information from many Chinese speakers
04:30
and produce a text-to-speech system
04:33
that takes Chinese text and converts it into Chinese language,
04:37
and then we've taken an hour or so of my own voice
04:41
and we've used that to modulate
04:43
the standard text-to-speech system so that it would sound like me.
04:48
Again, the result's not perfect.
04:50
There are in fact quite a few errors.
04:53
(In Chinese)
04:56
(Applause)
05:01
There's much work to be done in this area.
05:05
(In Chinese)
05:08
(Applause)
05:13
Jeremy Howard: Well, that was at a machine learning conference in China.
05:16
It's not often, actually, at academic conferences
05:19
that you do hear spontaneous applause,
05:21
although of course sometimes at TEDx conferences, feel free.
05:24
Everything you saw there was happening with deep learning.
05:27
(Applause) Thank you.
05:29
The transcription in English was deep learning.
05:31
The translation to Chinese and the text in the top right, deep learning,
05:34
and the construction of the voice was deep learning as well.
05:38
So deep learning is this extraordinary thing.
05:41
It's a single algorithm that can seem to do almost anything,
05:44
and I discovered that a year earlier, it had also learned to see.
05:47
In this obscure competition from Germany
05:49
called the German Traffic Sign Recognition Benchmark,
05:52
deep learning had learned to recognize traffic signs like this one.
05:55
Not only could it recognize the traffic signs
05:57
better than any other algorithm,
05:59
the leaderboard actually showed it was better than people,
06:02
about twice as good as people.
06:04
So by 2011, we had the first example
06:06
of computers that can see better than people.
06:09
Since that time, a lot has happened.
06:11
In 2012, Google announced that they had a deep learning algorithm
06:15
watch YouTube videos
06:16
and crunched the data on 16,000 computers for a month,
06:19
and the computer independently learned about concepts such as people and cats
06:24
just by watching the videos.
06:26
This is much like the way that humans learn.
06:28
Humans don't learn by being told what they see,
06:31
but by learning for themselves what these things are.
06:34
Also in 2012, Geoffrey Hinton, who we saw earlier,
06:37
won the very popular ImageNet competition,
06:40
looking to try to figure out from one and a half million images
06:44
what they're pictures of.
06:46
As of 2014, we're now down to a six percent error rate
06:49
in image recognition.
06:51
This is better than people, again.
06:53
So machines really are doing an extraordinarily good job of this,
06:57
and it is now being used in industry.
06:59
For example, Google announced last year
07:02
that they had mapped every single location in France in two hours,
07:06
and the way they did it was that they fed street view images
07:10
into a deep learning algorithm to recognize and read street numbers.
07:14
Imagine how long it would have taken before:
07:16
dozens of people, many years.
07:20
This is also happening in China.
07:22
Baidu is kind of the Chinese Google, I guess,
07:26
and what you see here in the top left
07:28
is an example of a picture that I uploaded to Baidu's deep learning system,
07:32
and underneath you can see that the system has understood what that picture is
07:36
and found similar images.
07:38
The similar images actually have similar backgrounds,
07:41
similar directions of the faces,
07:42
even some with their tongue out.
07:44
This is not clearly looking at the text of a web page.
07:47
All I uploaded was an image.
07:49
So we now have computers which really understand what they see
07:53
and can therefore search databases
07:54
of hundreds of millions of images in real time.
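One common way to search by image content rather than by surrounding text is to turn each image into a list of numbers and then look for the stored lists that point in the most similar direction. The sketch below assumes that step has already happened: the three-number vectors and file names are invented for illustration, and nothing here is taken from Baidu's system, which the talk does not describe in detail.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Return the names of the k indexed images most similar to the query."""
    ranked = sorted(index,
                    key=lambda name: cosine_similarity(query, index[name]),
                    reverse=True)
    return ranked[:k]


# Hypothetical feature vectors; a real system would compute much longer
# vectors from the pixels with a trained network.
INDEX = {
    "dog_tongue_out.jpg": [0.9, 0.8, 0.1],
    "dog_on_grass.jpg": [0.8, 0.1, 0.7],
    "city_street.jpg": [0.1, 0.0, 0.9],
}

uploaded_photo = [1.0, 0.9, 0.0]  # features of the picture being searched
results = most_similar(uploaded_photo, INDEX, k=2)
```

Scanning every stored vector like this would be far too slow for hundreds of millions of images; searching at that scale in real time would need some kind of approximate index, which this sketch leaves out.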
07:58
So what does it mean now that computers can see?
08:01
Well, it's not just that computers can see.
08:03
In fact, deep learning has done more than that.
08:05
Complex, nuanced sentences like this one
08:08
are now understandable with deep learning algorithms.
08:11
As you can see here,
08:12
this Stanford-based system showing the red dot at the top
08:15
has figured out that this sentence is expressing negative sentiment.
08:19
Deep learning now in fact is near human performance
08:22
at understanding what sentences are about and what it is saying about those things.
08:27
Also, deep learning has been used to read Chinese,
08:30
again at about native Chinese speaker level.
08:33
This algorithm developed out of Switzerland
08:35
by people, none of whom speak or understand any Chinese.
08:39
As I say, using deep learning
08:41
is about the best system in the world for this,
08:43
even compared to native human understanding.
08:48
This is a system that we put together at my company
08:51
which shows putting all this stuff together.
08:53
These are pictures which have no text attached,
08:56
and as I'm typing in here sentences,
08:58
in real time it's understanding these pictures
09:01
and figuring out what they're about
09:03
and finding pictures that are similar to the text that I'm writing.
09:06
So you can see, it's actually understanding my sentences
09:09
and actually understanding these pictures.
09:11
I know that you've seen something like this on Google,
09:13
where you can type in things and it will show you pictures,
09:16
but actually what it's doing is it's searching the webpage for the text.
09:20
This is very different from actually understanding the images.
09:23
This is something that computers have only been able to do
09:25
for the first time in the last few months.
09:29
So we can see now that computers can not only see but they can also read,
09:33
and, of course, we've shown that they can understand what they hear.
09:36
Perhaps not surprising now that I'm going to tell you they can write.
09:40
Here is some text that I generated using a deep learning algorithm yesterday.
09:45
And here is some text that an algorithm out of Stanford generated.
09:49
Each of these sentences was generated
09:50
by a deep learning algorithm to describe each of those pictures.
09:55
This algorithm before has never seen a man in a black shirt playing a guitar.
09:59
It's seen a man before, it's seen black before,
10:01
it's seen a guitar before,
10:03
but it has independently generated this novel description of this picture.
10:07
We're still not quite at human performance here, but we're close.
10:11
In tests, humans prefer the computer-generated caption
10:15
one out of four times.
10:16
Now this system is now only two weeks old,
10:18
so probably within the next year,
10:20
the computer algorithm will be well past human performance
10:23
at the rate things are going.
10:25
So computers can also write.
10:28
So we put all this together and it leads to very exciting opportunities.
10:31
For example, in medicine,
10:33
a team in Boston announced that they had discovered
10:35
dozens of new clinically relevant features
10:38
of tumors which help doctors make a prognosis of a cancer.
10:44
Very similarly, in Stanford,
10:46
a group there announced that, looking at tissues under magnification,
10:50
they've developed a machine learning-based system
10:52
which in fact is better than human pathologists
10:55
at predicting survival rates for cancer sufferers.
10:59
In both of these cases, not only were the predictions more accurate,
11:02
but they generated new insightful science.
11:05
In the radiology case,
11:06
they were new clinical indicators that humans can understand.
11:09
In this pathology case,
11:11
the computer system actually discovered that the cells around the cancer
11:16
are as important as the cancer cells themselves
11:19
in making a diagnosis.
11:21
This is the opposite of what pathologists had been taught for decades.
11:26
In each of those two cases, they were systems developed
11:29
by a combination of medical experts and machine learning experts,
11:33
but as of last year, we're now beyond that too.
11:36
This is an example of identifying cancerous areas
11:39
of human tissue under a microscope.
11:42
The system being shown here can identify those areas more accurately,
11:46
or about as accurately, as human pathologists,
11:49
but was built entirely with deep learning using no medical expertise
11:53
by people who have no background in the field.
11:56
Similarly, here, this neuron segmentation.
11:59
We can now segment neurons about as accurately as humans can,
12:02
but this system was developed with deep learning
12:05
using people with no previous background in medicine.
12:08
So myself, as somebody with no previous background in medicine,
12:12
I seem to be entirely well qualified to start a new medical company,
12:15
which I did.
12:18
I was kind of terrified of doing it,
12:19
but the theory seemed to suggest that it ought to be possible
12:22
to do very useful medicine using just these data analytic techniques.
12:28
And thankfully, the feedback has been fantastic,
12:30
not just from the media but from the medical community,
12:32
who have been very supportive.
12:35
The theory is that we can take the middle part of the medical process
12:39
and turn that into data analysis as much as possible,
12:42
leaving doctors to do what they're best at.
12:45
I want to give you an example.
12:47
It now takes us about 15 minutes to generate a new medical diagnostic test
12:51
and I'll show you that in real time now,
12:53
but I've compressed it down to three minutes by cutting some pieces out.
12:57
Rather than showing you creating a medical diagnostic test,
13:00
I'm going to show you a diagnostic test of car images,
13:03
because that's something we can all understand.
13:06
So here we're starting with about 1.5 million car images,
13:09
and I want to create something that can split them into the angle
13:12
of the photo that's being taken.
13:14
So these images are entirely unlabeled, so I have to start from scratch.
13:18
With our deep learning algorithm,
13:20
it can automatically identify areas of structure in these images.
13:24
So the nice thing is that the human and the computer can now work together.
13:27
So the human, as you can see here,
13:29
is telling the computer about areas of interest
13:32
which it wants the computer then to try and use to improve its algorithm.
13:37
Now, these deep learning systems actually are in 16,000-dimensional space,
13:41
so you can see here the computer rotating this through that space,
13:45
trying to find new areas of structure.
13:47
And when it does so successfully,
13:48
the human who is driving it can then point out the areas that are interesting.
13:52
So here, the computer has successfully found areas,
13:55
for example, angles.
13:57
So as we go through this process,
13:59
we're gradually telling the computer more and more
14:01
about the kinds of structures we're looking for.
14:04
You can imagine in a diagnostic test
14:05
this would be a pathologist identifying areas of pathosis, for example,
14:09
or a radiologist indicating potentially troublesome nodules.
14:14
And sometimes it can be difficult for the algorithm.
14:16
In this case, it got kind of confused.
14:18
The fronts and the backs of the cars are all mixed up.
14:21
So here we have to be a bit more careful,
14:23
manually selecting these fronts as opposed to the backs,
14:26
then telling the computer that this is a type of group
14:32
that we're interested in.
14:33
So we do that for a while, we skip over a little bit,
14:36
and then we train the machine learning algorithm
14:38
based on these couple of hundred things,
14:40
and we hope that it's gotten a lot better.
14:42
You can see, it's now started to fade some of these pictures out,
14:45
showing us that it already is recognizing how to understand some of these itself.
14:50
We can then use this concept of similar images,
14:53
and using similar images, you can now see,
14:55
the computer at this point is able to entirely find just the fronts of cars.
14:59
So at this point, the human can tell the computer,
15:02
okay, yes, you've done a good job of that.
15:05
Sometimes, of course, even at this point
15:07
it's still difficult to separate out groups.
15:11
In this case, even after we let the computer try to rotate this for a while,
15:15
we still find that the left sides and the right sides pictures
15:18
are all mixed up together.
15:20
So we can again give the computer some hints,
15:22
and we say, okay, try and find a projection that separates out
15:25
the left sides and the right sides as much as possible
15:27
using this deep learning algorithm.
15:30
And giving it that hint -- ah, okay, it's been successful.
15:33
It's managed to find a way of thinking about these objects
15:35
that's separated out these together.
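The talk does not say how the system finds a projection that pulls the two groups apart. As a rough illustration of the idea only, the sketch below uses a simple stand-in: take the few examples the human has marked as hints, draw a line from the average of one group to the average of the other, and sort every other item by where it falls along that line. The function names, the "left"/"right" labels in code, and the tiny three-number vectors are all assumptions made for the example.

```python
def mean(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def separating_projection(group_a, group_b):
    """Direction from group_b's mean to group_a's mean, plus a threshold.

    Projecting a vector onto `direction` gives one number; values above
    `threshold` fall on group_a's side of the midpoint between the means.
    """
    mean_a, mean_b = mean(group_a), mean(group_b)
    direction = [a - b for a, b in zip(mean_a, mean_b)]
    midpoint = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]
    threshold = sum(d * m for d, m in zip(direction, midpoint))
    return direction, threshold


def project(vector, direction):
    """Collapse a high-dimensional vector to a single coordinate."""
    return sum(x * d for x, d in zip(vector, direction))


def classify(vector, direction, threshold):
    return "left" if project(vector, direction) > threshold else "right"


# Hypothetical hints: a handful of items the human has marked by hand.
left_hints = [[2.0, 1.0, 0.0], [3.0, 0.0, 1.0]]
right_hints = [[-2.0, 1.0, 0.0], [-3.0, 0.0, 1.0]]
direction, threshold = separating_projection(left_hints, right_hints)

# Unmarked items are then sorted by the learned projection.
label_1 = classify([1.5, 4.0, -2.0], direction, threshold)
label_2 = classify([-0.5, 9.0, 9.0], direction, threshold)
```

In this toy data only the first number differs between the groups, so the projection ignores the other two however large they are. That is the same effect the demo shows at a much larger scale: a good viewing direction makes a previously mixed-up cloud fall into two clean groups.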
15:38
So you get the idea here.
15:40
This is a case not where the human is being replaced by a computer,
15:48
but where they're working together.
15:51
What we're doing here is we're replacing something that used to take a team
15:55
of five or six people about seven years
15:57
and replacing it with something that takes 15 minutes
15:59
for one person acting alone.
16:02
So this process takes about four or five iterations.
16:06
You can see we now have 62 percent
16:08
of our 1.5 million images classified correctly.
16:10
And at this point, we can start to quite quickly
16:13
grab whole big sections,
16:14
check through them to make sure that there's no mistakes.
16:17
Where there are mistakes, we can let the computer know about them.
16:21
And using this kind of process for each of the different groups,
16:24
we are now up to an 80 percent success rate
16:27
in classifying the 1.5 million images.
16:29
And at this point, it's just a case
16:31
of finding the small number that aren't classified correctly,
16:35
and trying to understand why.
16:38
And using that approach,
16:39
by 15 minutes we get to 97 percent classification rates.
16:43
So this kind of technique could allow us to fix a major problem,
16:48
which is that there's a lack of medical expertise in the world.
16:51
The World Economic Forum says that there's between a 10x and a 20x
16:55
shortage of physicians in the developing world,
16:57
and it would take about 300 years
16:59
to train enough people to fix that problem.
17:02
So imagine if we can help enhance their efficiency
17:05
using these deep learning approaches?
17:08
So I'm very excited about the opportunities.
17:10
I'm also concerned about the problems.
17:13
The problem here is that every area in blue on this map
17:16
is somewhere where services are over 80 percent of employment.
17:20
What are services?
17:21
These are services.
17:23
These are also the exact things that computers have just learned how to do.
17:27
So 80 percent of the world's employment in the developed world
17:31
is stuff that computers have just learned how to do.
17:33
What does that mean?
17:35
Well, it'll be fine. They'll be replaced by other jobs.
17:37
For example, there will be more jobs for data scientists.
17:40
Well, not really.
17:41
It doesn't take data scientists very long to build these things.
17:44
For example, these four algorithms were all built by the same guy.
17:47
So if you think, oh, it's all happened before,
17:50
we've seen the results in the past of when new things come along
17:54
and they get replaced by new jobs,
17:56
what are these new jobs going to be?
17:58
It's very hard for us to estimate this,
18:00
because human performance grows at this gradual rate,
18:03
but we now have a system, deep learning,
18:05
that we know actually grows in capability exponentially.
18:08
And we're here.
18:10
So currently, we see the things around us
18:12
and we say, "Oh, computers are still pretty dumb." Right?
18:15
But in five years' time, computers will be off this chart.
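The shape of that argument, a slowly rising line being overtaken by a curve that multiplies each year, can be checked with a few lines of arithmetic. The numbers below are purely illustrative assumptions (a capability that starts at 100 and gains 1 per year, against one that starts at 1 and doubles each year); the talk gives no figures, and this sketch makes no claim about real growth rates.

```python
def first_year_exponential_overtakes(linear_start, linear_step,
                                     exp_start, exp_factor, max_years=100):
    """Return the first whole year the exponential curve exceeds the line.

    Returns None if that does not happen within max_years.
    """
    for year in range(max_years + 1):
        linear = linear_start + linear_step * year
        exponential = exp_start * exp_factor ** year
        if exponential > linear:
            return year
    return None


# Hypothetical inputs chosen only to show the shape of the two curves.
crossover = first_year_exponential_overtakes(
    linear_start=100, linear_step=1, exp_start=1, exp_factor=2)
```

With these made-up inputs the doubling curve is still at 64 against 106 in year 6, then passes the line in year 7 (128 against 107) and is roughly ten times higher by year 10. A curve that looks unimpressive for years can leave the chart shortly afterwards.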
18:18
So we need to be starting to think about this capability right now.
18:22
We have seen this once before, of course.
18:24
In the Industrial Revolution,
18:25
we saw a step change in capability thanks to engines.
18:29
The thing is, though, that after a while, things flattened out.
18:32
There was social disruption,
18:34
but once engines were used to generate power in all the situations,
18:37
things really settled down.
18:40
The Machine Learning Revolution
18:41
is going to be very different from the Industrial Revolution,
18:44
because the Machine Learning Revolution, it never settles down.
18:47
The better computers get at intellectual activities,
18:50
the more they can build better computers to be better at intellectual capabilities,
18:54
so this is going to be a kind of change
18:56
that the world has actually never experienced before,
18:59
so your previous understanding of what's possible is different.
19:02
This is already impacting us.
19:04
In the last 25 years, as capital productivity has increased,
19:08
labor productivity has been flat, in fact even a little bit down.
19:13
So I want us to start having this discussion now.
19:16
I know that when I often tell people about this situation,
19:19
people can be quite dismissive.
19:20
Well, computers can't really think,
19:22
they don't emote, they don't understand poetry,
19:25
we don't really understand how they work.
19:27
So what?
19:29
Computers right now can do the things
19:31
that humans spend most of their time being paid to do,
19:33
so now's the time to start thinking
19:35
about how we're going to adjust our social structures and economic structures
19:40
to be aware of this new reality.
19:41
Thank you.
19:43
(Applause)