Kenneth Cukier: Big data is better data

541,522 views ・ 2014-09-23

TED



Translator: Simon Cai. Reviewer: Amy Yang.
00:12
America's favorite pie is?
0
12787
3845
美国人最爱的馅饼是什么?
00:16
Audience: Apple. Kenneth Cukier: Apple. Of course it is.
1
16632
3506
观众:苹果派 Kenneth Cukier:苹果派 毋庸置疑
00:20
How do we know it?
2
20138
1231
我们是怎么知道的?
00:21
Because of data.
3
21369
2753
因为数据
00:24
You look at supermarket sales.
4
24122
2066
当你观察超市的销售数据
00:26
You look at supermarket sales of 30-centimeter pies
5
26188
2866
会发现超市销售的30厘米冷冻馅饼中
00:29
that are frozen, and apple wins, no contest.
6
29054
4075
苹果派胜出, 毫无悬念
00:33
The majority of the sales are apple.
7
33129
5180
绝大多数的销售份额就是来自苹果派
00:38
But then supermarkets started selling
8
38309
2964
但是之后超市开始销售
00:41
smaller, 11-centimeter pies,
9
41273
2583
比较小的11厘米的馅饼
00:43
and suddenly, apple fell to fourth or fifth place.
10
43856
4174
突然间苹果派的销量下降到了第4或第5名
00:48
Why? What happened?
11
48030
2875
为什么?怎么了?
00:50
Okay, think about it.
12
50905
2818
好, 想象一下
00:53
When you buy a 30-centimeter pie,
13
53723
3848
当你准备买一个30厘米的馅饼时
00:57
the whole family has to agree,
14
57571
2261
全家都不得不同意(选择苹果派馅饼)
00:59
and apple is everyone's second favorite.
15
59832
3791
虽然苹果派只是每个人的次选项
01:03
(Laughter)
16
63623
1935
(笑声)
01:05
But when you buy an individual 11-centimeter pie,
17
65558
3615
但当你给自己选一个11厘米馅饼时
01:09
you can buy the one that you want.
18
69173
3745
你可以买你最爱吃的口味
01:12
You can get your first choice.
19
72918
4015
你会选你的首选项
01:16
You have more data.
20
76933
1641
你有了更多数据
01:18
You can see something
21
78574
1554
你可以知道些事情
01:20
that you couldn't see
22
80128
1132
这些事情在你只有少量数据时
01:21
when you only had smaller amounts of it.
23
81260
3953
你是无法知道的
01:25
Now, the point here is that more data
24
85213
2475
这里, 关键的是更多的数据
01:27
doesn't just let us see more,
25
87688
2283
不单单让我们知道更多
01:29
more of the same thing we were looking at.
26
89971
1854
知道更多我们正在关注的同样事物
01:31
More data allows us to see new.
27
91825
3613
更多的数据使我们能了解新的事情
01:35
It allows us to see better.
28
95438
3094
让我们更好地了解
01:38
It allows us to see different.
29
98532
3656
让我们有不同的视角
01:42
In this case, it allows us to see
30
102188
3173
在这个例子里 更多的数据让我们知道
01:45
what America's favorite pie is:
31
105361
2913
美国人最喜欢的馅饼
01:48
not apple.
32
108274
2542
不是苹果派
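The pie argument can be made concrete with a small sketch. The households and their rankings below are invented for illustration, and the family's compromise is modelled as a Borda count (lowest total rank wins), which is an assumption rather than anything stated in the talk. It shows how a flavor that is nobody's first choice can still win every shared purchase.

```python
from collections import Counter

# Hypothetical data: each household member ranks four flavors, best first.
# Apple is everyone's second choice, never anyone's first.
households = [
    [["cherry", "apple", "pecan", "pumpkin"],
     ["pecan", "apple", "pumpkin", "cherry"],
     ["pumpkin", "apple", "cherry", "pecan"]],
    [["cherry", "apple", "pumpkin", "pecan"],
     ["pumpkin", "apple", "pecan", "cherry"],
     ["pecan", "apple", "pumpkin", "cherry"]],
    [["pecan", "apple", "cherry", "pumpkin"],
     ["cherry", "apple", "pumpkin", "pecan"],
     ["pumpkin", "apple", "pecan", "cherry"]],
]

def family_pick(members):
    """Shared 30 cm pie: the flavor with the lowest summed rank position wins."""
    scores = Counter()
    for ranking in members:
        for position, flavor in enumerate(ranking):
            scores[flavor] += position
    return min(scores, key=lambda flavor: (scores[flavor], flavor))

# One large pie per household versus one small pie per person.
family_picks = [family_pick(members) for members in households]
first_choices = Counter(ranking[0] for members in households for ranking in members)

print("Large pies:", Counter(family_picks))
print("Small pies:", first_choices)
```

With only the large-pie sales, apple looks like the clear favorite. The individual purchases are the extra data that reveal it is the compromise.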
01:50
Now, you probably all have heard the term big data.
33
110816
3614
你或许听说过大数据这个词
01:54
In fact, you're probably sick of hearing the term
34
114430
2057
事实上, 你可能对这个词
01:56
big data.
35
116487
1630
已经心生厌恶
01:58
It is true that there is a lot of hype around the term,
36
118117
3330
确实, 大数据受到了空前的宣传炒作
02:01
and that is very unfortunate,
37
121447
2332
这很不应该
02:03
because big data is an extremely important tool
38
123779
3046
因为大数据是一个非常重要的工具
02:06
by which society is going to advance.
39
126825
3734
社会将由此而不断进步
02:10
In the past, we used to look at small data
40
130559
3561
过去我们习惯于处理小数据
02:14
and think about what it would mean
41
134120
1704
思考这些小数据的意义
02:15
to try to understand the world,
42
135824
1496
并以此来了解世界
02:17
and now we have a lot more of it,
43
137320
1991
现在我们有很多很多的数据
02:19
more than we ever could before.
44
139311
2722
数据量前所未有的巨大
02:22
What we find is that when we have
45
142033
1877
当我们掌握海量数据时
02:23
a large body of data, we can fundamentally do things
46
143910
2724
我们可以做一些事
02:26
that we couldn't do when we only had smaller amounts.
47
146634
3276
一些在只有较少数据时不可能办到的事
02:29
Big data is important, and big data is new,
48
149910
2641
大数据很重要, 它也是一个新兴事物
02:32
and when you think about it,
49
152551
1777
想象一下
02:34
the only way this planet is going to deal
50
154328
2216
能够帮助我们应对
02:36
with its global challenges —
51
156544
1789
世界性难题
02:38
to feed people, supply them with medical care,
52
158333
3537
像食物短缺 医疗短缺
02:41
supply them with energy, electricity,
53
161870
2810
能源短缺 电力短缺
02:44
and to make sure they're not burnt to a crisp
54
164680
1789
还有确保人类家园
02:46
because of global warming —
55
166469
1238
不会因为全球变暖而生灵涂炭
02:47
is because of the effective use of data.
56
167707
4195
的唯一办法是有效利用大数据
02:51
So what is new about big data? What is the big deal?
57
171902
3870
那么大数据新在何处, 重在何处呢?
02:55
Well, to answer that question, let's think about
58
175772
2517
为了回答这个问题, 让我们看一下
02:58
what information looked like,
59
178289
1896
信息看上去是什么样的
03:00
physically looked like in the past.
60
180185
3034
信息在以前是什么样的
03:03
In 1908, on the island of Crete,
61
183219
3611
1908年在克里特岛上 (注:位于地中海 为希腊第一大岛)
03:06
archaeologists discovered a clay disc.
62
186830
4735
考古学家发现了一个粘土做的盘子
03:11
They dated it from 2000 B.C., so it's 4,000 years old.
63
191565
4059
这是个公元前2000年的盘子 距今约有4000年的历史
03:15
Now, there's inscriptions on this disc,
64
195624
2004
盘子上有铭文
03:17
but we actually don't know what it means.
65
197628
1327
但是我们不知道它们是什么意思
03:18
It's a complete mystery, but the point is that
66
198955
2098
这完全是个谜团
03:21
this is what information used to look like
67
201053
1928
但这就是4000年前
03:22
4,000 years ago.
68
202981
2089
信息的样子
03:25
This is how society stored
69
205070
2548
这就是当时社会
03:27
and transmitted information.
70
207618
3524
存储和传递信息的方式
03:31
Now, society hasn't advanced all that much.
71
211142
4160
现代社会也没有什么很大的进步
03:35
We still store information on discs,
72
215302
3474
我们还是把数据存储在盘中 (注:指磁盘)
03:38
but now we can store a lot more information,
73
218776
3184
但我们可以存储更多的信息
03:41
more than ever before.
74
221960
1260
远远超过以前的信息容量
03:43
Searching it is easier. Copying it is easier.
75
223220
3093
这些信息搜索和复制起来更简单
03:46
Sharing it is easier. Processing it is easier.
76
226313
3500
分享和处理起来也更便捷
03:49
And what we can do is we can reuse this information
77
229813
2766
我们也可以重新利用这些数据
03:52
for uses that we never even imagined
78
232579
1834
一些我们当初收集的时候
03:54
when we first collected the data.
79
234413
3195
从来没有料想过的用途
03:57
In this respect, the data has gone
80
237608
2252
从这个方面来说
03:59
from a stock to a flow,
81
239860
3532
数据已经从储存状态到了流动状态
04:03
from something that is stationary and static
82
243392
3938
从静态的统计性的数据
04:07
to something that is fluid and dynamic.
83
247330
3609
变成动态的数据流
04:10
There is, if you will, a liquidity to information.
84
250939
4023
这就是信息的流动性
04:14
The disc that was discovered off of Crete
85
254962
3474
克里特岛发现的粘土盘
04:18
that's 4,000 years old, is heavy,
86
258436
3764
有4000年的历史, 非常笨重
04:22
it doesn't store a lot of information,
87
262200
1962
但它不能记录太多的信息
04:24
and that information is unchangeable.
88
264162
3116
并且它所记录的信息是不能更改的
04:27
By contrast, all of the files
89
267278
4011
与此相反
04:31
that Edward Snowden took
90
271289
1861
爱德华·斯诺登从美国国家安全局
04:33
from the National Security Agency in the United States
91
273150
2621
所获得的文件
04:35
fits on a memory stick
92
275771
2419
可以放在一个
04:38
the size of a fingernail,
93
278190
3010
仅有指甲大小的存储盘里
04:41
and it can be shared at the speed of light.
94
281200
4745
并且可以以光速进行数据共享
04:45
More data. More.
95
285945
5255
更多数据 更多
04:51
Now, one reason why we have so much data in the world today
96
291200
1974
今天我们有这么多数据的一个原因是
04:53
is we are collecting things
97
293174
1432
我们一直在收集信息
04:54
that we've always collected information on,
98
294606
3280
就像我们一直在做的一样
04:57
but another reason why is we're taking things
99
297886
2656
另一个原因是我们记录了
05:00
that have always been informational
100
300542
2812
许多蕴含丰富信息的事物
05:03
but have never been rendered into a data format
101
303354
2486
但是从没把信息转换成数据形式
05:05
and we are putting it into data.
102
305840
2419
现在我们正在把信息转变成数据
05:08
Think, for example, the question of location.
103
308259
3308
举个例子, 定位问题
05:11
Take, for example, Martin Luther.
104
311567
2249
比如说马丁·路德
05:13
If we wanted to know in the 1500s
105
313816
1597
在16世纪 如果我们想知道
05:15
where Martin Luther was,
106
315413
2667
马丁·路德在哪里
05:18
we would have to follow him at all times,
107
318080
2092
我们必须一直跟着他
05:20
maybe with a feathery quill and an inkwell,
108
320172
2137
或许用羽毛笔和墨水
05:22
and record it,
109
322309
1676
把这些情况记录下来
05:23
but now think about what it looks like today.
110
323985
2183
那现今是什么样的情形呢?
05:26
You know that somewhere,
111
326168
2122
在某些地方
05:28
probably in a telecommunications carrier's database,
112
328290
2446
可能在电信运营商的数据库里
05:30
there is a spreadsheet or at least a database entry
113
330736
3036
有个电子数据表或者至少一个数据目录
05:33
that records your information
114
333772
2088
记录着所有关于你
05:35
of where you've been at all times.
115
335860
2063
任何时候在什么地点的信息
05:37
If you have a cell phone,
116
337923
1360
如果你有个手机
05:39
and that cell phone has GPS, but even if it doesn't have GPS,
117
339283
2847
这个手机有GPS, 或者即使没有GPS
05:42
it can record your information.
118
342130
2385
它还是可以记录你的信息
05:44
In this respect, location has been datafied.
119
344515
4084
从这方面来说, 位置信息被数据化了
05:48
Now think, for example, of the issue of posture,
120
348599
4601
再举个例子, 关于姿势
05:53
the way that you are all sitting right now,
121
353200
1285
你们现在坐着的姿势
05:54
the way that you sit,
122
354485
2030
你坐着的姿势
05:56
the way that you sit, the way that you sit.
123
356515
2771
你坐着的姿势 你坐着的姿势
05:59
It's all different, and it's a function of your leg length
124
359286
2077
这些都不一样 这是一个关于腿长
06:01
and your back and the contours of your back,
125
361363
2093
你的背部和背部轮廓的函数
06:03
and if I were to put sensors, maybe 100 sensors
126
363456
2531
如果我现在放一些传感器 或许100个
06:05
into all of your chairs right now,
127
365987
1766
在你的椅子里
06:07
I could create an index that's fairly unique to you,
128
367753
3600
我可以算出你的独一无二的参数
06:11
sort of like a fingerprint, but it's not your finger.
129
371353
4409
就像你的指纹 但不是针对你的手指
06:15
So what could we do with this?
130
375762
2969
那我们能用它来干什么呢?
06:18
Researchers in Tokyo are using it
131
378731
2397
东京的研究者把它
06:21
as a potential anti-theft device in cars.
132
381128
4388
运用在一个汽车防盗设施的雏形上
06:25
The idea is that the carjacker sits behind the wheel,
133
385516
2924
它的设想是盗贼坐在驾驶座上
06:28
tries to stream off, but the car recognizes
134
388440
2104
企图把车开走 但是汽车识别出
06:30
that a non-approved driver is behind the wheel,
135
390544
2362
驾驶座上的是个未授权驾驶人
06:32
and maybe the engine just stops, unless you
136
392906
2164
那汽车可能就会熄火
06:35
type in a password into the dashboard
137
395070
3177
除非你在仪表盘上输入密码
06:38
to say, "Hey, I have authorization to drive." Great.
138
398247
4658
来表明“我已获得授权”
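One way to picture the seat "fingerprint" is as a vector of pressure readings compared against an enrolled profile. The sketch below is only an illustration: the synthetic readings, the Euclidean-distance comparison, the threshold value and the function names `enroll` and `is_authorized` are all assumptions, not details of the Tokyo researchers' system.

```python
import math

NUM_SENSORS = 100   # the talk's "maybe 100 sensors"
THRESHOLD = 10.0    # hypothetical tolerance; a real system would tune this

def enroll(readings):
    """Average several seat-pressure readings into one reference profile."""
    count = len(readings)
    return [sum(values) / count for values in zip(*readings)]

def is_authorized(profile, reading, threshold=THRESHOLD):
    """Accept the driver if the new reading is close to the enrolled profile."""
    return math.dist(profile, reading) <= threshold

# Synthetic pressure patterns standing in for real sensor data.
owner_base = [50 + 10 * math.sin(i / 5) for i in range(NUM_SENSORS)]
profile = enroll([
    [value + 0.5 for value in owner_base],
    [value - 0.5 for value in owner_base],
])

owner_today = [value + 0.3 for value in owner_base]                    # small day-to-day drift
stranger = [60 + 10 * math.cos(i / 7) for i in range(NUM_SENSORS)]     # a different body

print("Owner accepted:", is_authorized(profile, owner_today))
print("Stranger accepted:", is_authorized(profile, stranger))
```

If the check fails, the fallback described in the talk is a password typed into the dashboard.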
06:42
What if every single car in Europe
139
402905
2553
如果欧洲的每辆汽车
06:45
had this technology in it?
140
405458
1457
都装备了这项技术会是怎样的情形?
06:46
What could we do then?
141
406915
3165
我们还能做些什么呢?
06:50
Maybe, if we aggregated the data,
142
410080
2240
或许如果我们整合数据
06:52
maybe we could identify telltale signs
143
412320
3814
我们可以识别示警信号
06:56
that best predict that a car accident
144
416134
2709
对于在下一个五秒钟内
06:58
is going to take place in the next five seconds.
145
418843
5893
可能发生的意外做出最佳预判
07:04
And then what we will have datafied
146
424736
2557
我们也可以进行数据化的是
07:07
is driver fatigue,
147
427293
1783
司机的疲劳度
07:09
and the service would be when the car senses
148
429076
2334
当汽车侦测到司机的坐姿
07:11
that the person slumps into that position,
149
431410
3437
倒成某一特定姿势时
07:14
automatically knows, hey, set an internal alarm
150
434847
3994
这个设备感知到并发出车内警告
07:18
that would vibrate the steering wheel, honk inside
151
438841
2025
可能是震动方向盘或语音提示
07:20
to say, "Hey, wake up,
152
440866
1721
“嗨,醒醒
07:22
pay more attention to the road."
153
442587
1904
集中精神在路况上”
07:24
These are the sorts of things we can do
154
444491
1853
这就是生活的更多方面数据化后
07:26
when we datafy more aspects of our lives.
155
446344
2821
我们能做的事情
07:29
So what is the value of big data?
156
449165
3675
那么大数据的价值在哪里?
07:32
Well, think about it.
157
452840
2190
好 思考一下
07:35
You have more information.
158
455030
2412
你有了更多的信息
07:37
You can do things that you couldn't do before.
159
457442
3341
你可以做你以前不能做的事
07:40
One of the most impressive areas
160
460783
1676
在运用这个概念的领域里
07:42
where this concept is taking place
161
462459
1729
让人印象最为深刻的
07:44
is in the area of machine learning.
162
464188
3307
是机器学习
07:47
Machine learning is a branch of artificial intelligence,
163
467495
3077
机器学习是人工智能的一个分支
07:50
which itself is a branch of computer science.
164
470572
3378
人工智能又是计算机科学的一个分支
07:53
The general idea is that instead of
165
473950
1543
它的基本理念是
07:55
instructing a computer what to do,
166
475493
2117
把关于某个问题的一堆数据扔给电脑
07:57
we are going to simply throw data at the problem
167
477610
2620
让电脑自己找出解决方案
08:00
and tell the computer to figure it out for itself.
168
480230
3206
而不是教电脑应该做什么
08:03
And it will help you understand it
169
483436
1777
通过机器学习的原型
08:05
by seeing its origins.
170
485213
3552
可以帮助你来理解这个理念
08:08
In the 1950s, a computer scientist
171
488765
2388
20世纪50年代IBM的计算机科学家
08:11
at IBM named Arthur Samuel liked to play checkers,
172
491153
3592
亚瑟·塞缪尔想玩跳棋
08:14
so he wrote a computer program
173
494745
1402
所以他写了个程序
08:16
so he could play against the computer.
174
496147
2813
这样他就可以和电脑来玩
08:18
He played. He won.
175
498960
2711
开始他下一盘 赢一盘
08:21
He played. He won.
176
501671
2103
下一盘 赢一盘
08:23
He played. He won,
177
503774
3015
下一盘 赢一盘
08:26
because the computer only knew
178
506789
1778
因为电脑只知道
08:28
what a legal move was.
179
508567
2227
规则允许怎样走
08:30
Arthur Samuel knew something else.
180
510794
2087
亚瑟·塞缪尔还知道其他东西
08:32
Arthur Samuel knew strategy.
181
512881
4629
他知道下棋的策略
08:37
So he wrote a small sub-program alongside it
182
517510
2396
所以他又写了一个附加程序
08:39
operating in the background, and all it did
183
519906
1974
这个程序在后台运行
08:41
was score the probability
184
521880
1817
它的功能只是计算概率
08:43
that a given board configuration would likely lead
185
523697
2563
在一个给定的棋局里
08:46
to a winning board versus a losing board
186
526260
2910
每走一步后
08:49
after every move.
187
529170
2508
会获胜或者失败的概率
08:51
He plays the computer. He wins.
188
531678
3150
再和电脑下棋 还是下一盘 赢一盘
08:54
He plays the computer. He wins.
189
534828
2508
下一盘 赢一盘
08:57
He plays the computer. He wins.
190
537336
3731
下一盘 赢一盘
09:01
And then Arthur Samuel leaves the computer
191
541067
2277
后来亚瑟让电脑
09:03
to play itself.
192
543344
2227
自己和自己下棋
09:05
It plays itself. It collects more data.
193
545571
3509
电脑自己玩的时候收集了更多的数据
09:09
It collects more data. It increases the accuracy of its prediction.
194
549080
4309
收集的数据越多, 预测的准确率就越高
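The self-play idea can be sketched in a few lines. This is not Samuel's checkers program: the game is swapped for a much simpler stone-taking game (take 1 to 3 stones, whoever takes the last stone wins), and the "score" is a plain lookup table of observed win rates. The function names, the starting pile and the number of games are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

MOVES = (1, 2, 3)  # a player may take 1, 2 or 3 stones; taking the last stone wins

def self_play(start, games, rng):
    """Play random games against itself and record, for each pile size,
    how often the player about to move from that pile went on to win."""
    wins = defaultdict(int)
    visits = defaultdict(int)
    for _ in range(games):
        pile, player, history = start, 0, []
        while pile > 0:
            history.append((pile, player))
            pile -= rng.choice([m for m in MOVES if m <= pile])
            winner = player        # the most recent mover; final value took the last stone
            player = 1 - player
        for state, mover in history:
            visits[state] += 1
            wins[state] += (mover == winner)
    return {state: wins[state] / visits[state] for state in visits}

def best_move(pile, win_rate):
    """Choose the move that leaves the opponent the pile with the lowest win rate."""
    legal = [m for m in MOVES if m <= pile]
    return min(legal, key=lambda m: win_rate.get(pile - m, 0.0))

# The program only knows the legal moves; the scores come from its own games.
win_rate = self_play(start=7, games=20000, rng=random.Random(0))

for pile in (5, 6, 7):
    print(f"pile {pile}: take {best_move(pile, win_rate)}")
```

Nobody tells the program that leaving four stones is the strong move. It learns that a pile of four scores badly for whoever must move from it, and more games make that estimate more reliable.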
09:13
And then Arthur Samuel goes back to the computer
195
553389
2104
然后亚瑟又继续和电脑下棋
09:15
and he plays it, and he loses,
196
555493
2318
这次他下一盘 输一盘
09:17
and he plays it, and he loses,
197
557811
2069
下一盘 输一盘
09:19
and he plays it, and he loses,
198
559880
2047
下一盘 输一盘
09:21
and Arthur Samuel has created a machine
199
561927
2599
亚瑟创造了一个机器
09:24
that surpasses his ability in a task that he taught it.
200
564526
6288
它的能力超越了亚瑟开始时所教给它的
09:30
And this idea of machine learning
201
570814
2498
机器学习的理念
09:33
is going everywhere.
202
573312
3927
现在已经随处可见
09:37
How do you think we have self-driving cars?
203
577239
3149
你们觉得无人驾驶汽车(关键的技术)是什么?
09:40
Are we any better off as a society
204
580388
2137
是不是把所有交通规则输入软件
09:42
enshrining all the rules of the road into software?
205
582525
3285
就万事大吉了?不是
09:45
No. Memory is cheaper. No.
206
585810
2598
内存很便宜?不是
09:48
Algorithms are faster. No. Processors are better. No.
207
588408
3994
算法更快了 不是 处理器更强大了 不是
09:52
All of those things matter, but that's not why.
208
592402
2772
这些都有影响, 但不是真正的原因
09:55
It's because we changed the nature of the problem.
209
595174
3141
真正的原因是我们改变了问题的本质
09:58
We changed the nature of the problem from one
210
598315
1530
我们把问题的本质从
09:59
in which we tried to overtly and explicitly
211
599845
2245
试图明确无误地
10:02
explain to the computer how to drive
212
602090
2581
教会电脑怎样驾驶
10:04
to one in which we say,
213
604671
1316
变成我们对电脑说
10:05
"Here's a lot of data around the vehicle.
214
605987
1876
“这里有许多关于汽车的数据
10:07
You figure it out.
215
607863
1533
你自己搞定它
10:09
You figure it out that that is a traffic light,
216
609396
1867
你知道那是交通信号灯
10:11
that that traffic light is red and not green,
217
611263
2081
那是红灯不是绿灯
10:13
that that means that you need to stop
218
613344
2014
遇到红灯你必须停下来
10:15
and not go forward."
219
615358
3083
不能往前走”
10:18
Machine learning is at the basis
220
618441
1518
机器学习是许多
10:19
of many of the things that we do online:
221
619959
1991
网上在线应用的基础
10:21
search engines,
222
621950
1857
搜索引擎
10:23
Amazon's personalization algorithm,
223
623807
3801
亚马逊的个性化算法
10:27
computer translation,
224
627608
2212
电脑智能翻译
10:29
voice recognition systems.
225
629820
4290
语音识别系统
10:34
Researchers recently have looked at
226
634110
2835
研究者最近在研究
10:36
the question of biopsies,
227
636945
3195
关于活组织检查的问题
10:40
cancerous biopsies,
228
640140
2767
关于肿瘤活组织检查
10:42
and they've asked the computer to identify
229
642907
2315
他们让电脑
10:45
by looking at the data and survival rates
230
645222
2471
通过 (历史) 数据和存活率
10:47
to determine whether cells are actually
231
647693
4667
来判断这些细胞
10:52
cancerous or not,
232
652360
2544
是否是癌症细胞
10:54
and sure enough, when you throw the data at it,
233
654904
1778
果不其然 当你把数据交给电脑
10:56
through a machine-learning algorithm,
234
656682
2047
电脑通过自主学习
10:58
the machine was able to identify
235
658729
1877
可以寻找出
11:00
the 12 telltale signs that best predict
236
660606
2262
12个最佳的鉴别特征用来预测
11:02
that this biopsy of the breast cancer cells
237
662868
3299
乳腺癌细胞的活检切片
11:06
are indeed cancerous.
238
666167
3218
确实是癌症细胞
11:09
The problem: The medical literature
239
669385
2498
问题是医学文献
11:11
only knew nine of them.
240
671883
2789
只知道其中的九个鉴别特征
11:14
Three of the traits were ones
241
674672
1800
其他三个
11:16
that people didn't need to look for,
242
676472
2975
人们不会去寻找
11:19
but that the machine spotted.
243
679447
5531
但是电脑把它们找了出来
11:24
Now, there are dark sides to big data as well.
244
684978
5925
大数据也有黑暗的一面
11:30
It will improve our lives, but there are problems
245
690903
2074
它可以改善我们的生活
11:32
that we need to be conscious of,
246
692977
2640
但也会带来一些我们需要注意的问题
11:35
and the first one is the idea
247
695617
2623
首先就是
11:38
that we may be punished for predictions,
248
698240
2686
我们可能因为预测的结果而受到惩罚
11:40
that the police may use big data for their purposes,
249
700926
3870
警察可能会用大数据来实现目标
11:44
a little bit like "Minority Report."
250
704796
2351
有点像“少数派报告”
11:47
Now, it's a term called predictive policing,
251
707147
2441
现在有个词叫做预见性监管
11:49
or algorithmic criminology,
252
709588
2363
或者叫算法犯罪学
11:51
and the idea is that if we take a lot of data,
253
711951
2036
这个想法是如果我们掌握了大量数据
11:53
for example where past crimes have been,
254
713987
2159
比如以往犯罪发生的地点
11:56
we know where to send the patrols.
255
716146
2543
我们就可以知道把警力派到哪里
11:58
That makes sense, but the problem, of course,
256
718689
2115
这很合理 但问题是
12:00
is that it's not simply going to stop on location data,
257
720804
4544
数据分析不会仅限于地点数据
12:05
it's going to go down to the level of the individual.
258
725348
2959
它会进一步深入到个人层面
12:08
Why don't we use data about the person's
259
728307
2250
为什么我们不去分析
12:10
high school transcript?
260
730557
2228
某人的中学成绩单
12:12
Maybe we should use the fact that
261
732785
1561
或者我们可以了解
12:14
they're unemployed or not, their credit score,
262
734346
2028
他们的就职情况、信用记录
12:16
their web-surfing behavior,
263
736374
1552
他们的上网行为
12:17
whether they're up late at night.
264
737926
1878
他们是否熬夜
12:19
Their Fitbit, when it's able to identify biochemistries,
265
739804
3161
当可以通过健康腕带读取生化数据时
12:22
will show that they have aggressive thoughts.
266
742965
4236
就可以知道他们是否有激进的想法
12:27
We may have algorithms that are likely to predict
267
747201
2221
我们可以用算法来预测
12:29
what we are about to do,
268
749422
1633
我们将要做什么
12:31
and we may be held accountable
269
751055
1244
可能有些事情还没做
12:32
before we've actually acted.
270
752299
2590
我们就要承担责任
12:34
Privacy was the central challenge
271
754889
1732
个人隐私在小数据时代
12:36
in a small data era.
272
756621
2880
是主要挑战
12:39
In the big data age,
273
759501
2149
在大数据时代
12:41
the challenge will be safeguarding free will,
274
761650
4523
这个挑战将会成为保卫自由意愿
12:46
moral choice, human volition,
275
766173
3779
道德选择 、人类意志
12:49
human agency.
276
769952
3068
人类的能动性
12:54
There is another problem:
277
774540
2225
还有另一个问题
12:56
Big data is going to steal our jobs.
278
776765
3556
大数据会偷走我们的工作
13:00
Big data and algorithms are going to challenge
279
780321
3512
在21世纪
13:03
white collar, professional knowledge work
280
783833
3061
大数据和算法会威胁到
13:06
in the 21st century
281
786894
1653
白领和需要专业知识的工作
13:08
in the same way that factory automation
282
788547
2434
就像在20世纪工厂自动化
13:10
and the assembly line
283
790981
2189
和装配生产线的应用
13:13
challenged blue collar labor in the 20th century.
284
793170
3026
威胁到了蓝领们的工作岗位
13:16
Think about a lab technician
285
796196
2092
想象一下一个研究室技术员
13:18
who is looking through a microscope
286
798288
1409
他的工作就是通过一个显微镜
13:19
at a cancer biopsy
287
799697
1624
观察一个癌症活检组织
13:21
and determining whether it's cancerous or not.
288
801321
2637
来判定它是不是癌症的
13:23
The person went to university.
289
803958
1972
这个人上大学
13:25
The person buys property.
290
805930
1430
买房子
13:27
He or she votes.
291
807360
1741
他/她投票选举
13:29
He or she is a stakeholder in society.
292
809101
3666
他/她是这个社会的一份子
13:32
And that person's job,
293
812767
1394
然后这个人的工作
13:34
as well as an entire fleet
294
814161
1609
还有其他
13:35
of professionals like that person,
295
815770
1969
像他一样的专业人员
13:37
is going to find that their jobs are radically changed
296
817739
3150
将会发现他们的工作被彻底改变了
13:40
or actually completely eliminated.
297
820889
2357
或者彻底废除了
13:43
Now, we like to think
298
823246
1284
我们一直以为
13:44
that technology creates jobs over a period of time
299
824530
3187
在短时或者暂时的就业调整期后
13:47
after a short, temporary period of dislocation,
300
827717
3465
一段时间内科技会创造就业机会
13:51
and that is true for the frame of reference
301
831182
1941
这对于我们所处的参考系
13:53
with which we all live, the Industrial Revolution,
302
833123
2142
工业革命来说就是这样
13:55
because that's precisely what happened.
303
835265
2328
因为在工业革命时期事情就是这样的
13:57
But we forget something in that analysis:
304
837593
2333
但是我们忘记了一件事情
13:59
There are some categories of jobs
305
839926
1830
有些类型的职业
14:01
that simply get eliminated and never come back.
306
841756
3420
已经彻底消失了并且再也不会回来
14:05
The Industrial Revolution wasn't very good
307
845176
2004
如果你是一匹马
14:07
if you were a horse.
308
847180
4002
工业革命不是一件好事
14:11
So we're going to need to be careful
309
851182
2055
所以我们必须非常小心
14:13
and take big data and adjust it for our needs,
310
853237
3514
根据我们的需求和整个人类的需求
14:16
our very human needs.
311
856751
3185
来利用和适应大数据
14:19
We have to be the master of this technology,
312
859936
1954
我们必须是技术的主人
14:21
not its servant.
313
861890
1656
而不是技术的仆人
14:23
We are just at the outset of the big data era,
314
863546
2958
我们正在步入大数据时代
14:26
and honestly, we are not very good
315
866504
3150
老实说, 我们并不能很好地
14:29
at handling all the data that we can now collect.
316
869654
4207
处理所有我们现在能够收集到的数据
14:33
It's not just a problem for the National Security Agency.
317
873861
3330
这不仅仅是国家安全局的问题
14:37
Businesses collect lots of data, and they misuse it too,
318
877191
3038
许多企业也搜集并不恰当地使用数据
14:40
and we need to get better at this, and this will take time.
319
880229
3667
我们需要时间来纠正这个问题
14:43
It's a little bit like the challenge that was faced
320
883896
1822
这有点像原始人类面对火时
14:45
by primitive man and fire.
321
885718
2407
所面临的挑战
14:48
This is a tool, but this is a tool that,
322
888125
1885
火是一种工具
14:50
unless we're careful, will burn us.
323
890010
3559
但是如果使用不当就会引火烧身
14:56
Big data is going to transform how we live,
324
896008
3120
大数据即将改变我们的生活方式
14:59
how we work and how we think.
325
899128
2801
我们的工作方式和思考方式
15:01
It is going to help us manage our careers
326
901929
1889
它可以帮助我们管理事业
15:03
and lead lives of satisfaction and hope
327
903818
3634
帮助我们过想要的满足、充满希望
15:07
and happiness and health,
328
907452
2992
幸福和健康的生活
15:10
but in the past, we've often looked at information technology
329
910444
3306
但是在过去, 对于信息技术(IT)
15:13
and our eyes have only seen the T,
330
913750
2208
我们经常只看到了T
15:15
the technology, the hardware,
331
915958
1686
就是技术、硬件
15:17
because that's what was physical.
332
917644
2262
因为这是切实可见的东西
15:19
We now need to recast our gaze at the I,
333
919906
2924
现在我们需要把目光放在 I 上
15:22
the information,
334
922830
1380
信息
15:24
which is less apparent,
335
924210
1373
它不是那么切实可见
15:25
but in some ways a lot more important.
336
925583
4109
但某种程度上却更加重要
15:29
Humanity can finally learn from the information
337
929692
3465
在人类永无止境的探索过程中
15:33
that it can collect,
338
933157
2418
我们可以从我们能收集的信息中
15:35
as part of our timeless quest
339
935575
2115
来了解这个世界
15:37
to understand the world and our place in it,
340
937690
3159
以及人类在这个世界中所处的地位
15:40
and that's why big data is a big deal.
341
940849
5631
这就是为什么大数据非常重要
15:46
(Applause)
342
946480
3568
(掌声)