Kenneth Cukier: Big data is better data

541,522 views ・ 2014-09-23

TED


00:12
America's favorite pie is? Audience: Apple. Kenneth Cukier: Apple. Of course it is. How do we know it? Because of data. You look at supermarket sales. You look at supermarket sales of 30-centimeter pies that are frozen, and apple wins, no contest. The majority of the sales are apple.

00:38
But then supermarkets started selling smaller, 11-centimeter pies, and suddenly, apple fell to fourth or fifth place. Why? What happened? Okay, think about it. When you buy a 30-centimeter pie, the whole family has to agree, and apple is everyone's second favorite. (Laughter) But when you buy an individual 11-centimeter pie, you can buy the one that you want. You can get your first choice.

01:16
You have more data. You can see something that you couldn't see when you only had smaller amounts of it. Now, the point here is that more data doesn't just let us see more, more of the same thing we were looking at. More data allows us to see new. It allows us to see better. It allows us to see different. In this case, it allows us to see what America's favorite pie is: not apple.

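To make that concrete, here is a minimal sketch in Python of the kind of tally the speaker describes, with invented flavors and sales counts standing in for the real supermarket data:

```python
# Toy version of the pie example: aggregate supermarket sales by
# flavor, first for 30 cm pies, then for 11 cm pies.
# All numbers are invented for illustration.
from collections import Counter

sales = [
    # (flavor, diameter_cm, units_sold)
    ("apple",      30, 5200), ("cherry",  30, 3100),
    ("pumpkin",    30, 2900), ("pecan",   30, 1800),
    ("apple",      11, 2100), ("cherry",  11, 2600),
    ("pumpkin",    11, 2500), ("pecan",   11, 2400),
    ("strawberry", 11, 2300),
]

for size in (30, 11):
    totals = Counter()
    for flavor, diameter, units in sales:
        if diameter == size:
            totals[flavor] += units
    print(f"{size} cm pies:", totals.most_common())
```

Grouped over the 30-centimeter pies, apple dominates; grouped over the 11-centimeter pies, first choices spread across flavors and apple slips down the ranking.
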
01:50
Now, you probably all have heard the term big data. In fact, you're probably sick of hearing the term big data. It is true that there is a lot of hype around the term, and that is very unfortunate, because big data is an extremely important tool by which society is going to advance.

02:10
In the past, we used to look at small data and think about what it would mean to try to understand the world, and now we have a lot more of it, more than we ever could before. What we find is that when we have a large body of data, we can fundamentally do things that we couldn't do when we only had smaller amounts.

02:29
Big data is important, and big data is new, and when you think about it, the only way this planet is going to deal with its global challenges — to feed people, supply them with medical care, supply them with energy, electricity, and to make sure they're not burnt to a crisp because of global warming — is through the effective use of data.

02:51
So what is new about big data? What is the big deal? Well, to answer that question, let's think about what information looked like, physically looked like, in the past. In 1908, on the island of Crete, archaeologists discovered a clay disc. They dated it from 2000 B.C., so it's 4,000 years old. Now, there are inscriptions on this disc, but we actually don't know what it means. It's a complete mystery, but the point is that this is what information used to look like 4,000 years ago. This is how society stored and transmitted information.

03:31
Now, society hasn't advanced all that much. We still store information on discs, but now we can store a lot more information, more than ever before. Searching it is easier. Copying it is easier. Sharing it is easier. Processing it is easier. And what we can do is we can reuse this information for uses that we never even imagined when we first collected the data. In this respect, the data has gone from a stock to a flow, from something that is stationary and static to something that is fluid and dynamic. There is, if you will, a liquidity to information.

04:14
The disc that was discovered off of Crete that's 4,000 years old is heavy, it doesn't store a lot of information, and that information is unchangeable. By contrast, all of the files that Edward Snowden took from the National Security Agency in the United States fit on a memory stick the size of a fingernail, and it can be shared at the speed of light. More data. More.

04:51
Now, one reason why we have so much data in the world today is we are collecting things that we've always collected information on, but another reason is we're taking things that have always been informational but have never been rendered into a data format, and we are putting it into data. Think, for example, of the question of location. Take, for example, Martin Luther. If we wanted to know in the 1500s where Martin Luther was, we would have to follow him at all times, maybe with a feathery quill and an inkwell, and record it, but now think about what it looks like today. You know that somewhere, probably in a telecommunications carrier's database, there is a spreadsheet or at least a database entry that records your information of where you've been at all times. If you have a cell phone, and that cell phone has GPS, but even if it doesn't have GPS, it can record your information. In this respect, location has been datafied.

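A sketch of what such a "datafied" location entry might look like; the field names and values are hypothetical, not an actual carrier schema:

```python
# A hypothetical "datafied location" record, roughly the kind of entry
# a carrier's database might hold. Fields are illustrative only.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class LocationFix:
    subscriber_id: str   # who
    timestamp: datetime  # when
    lat: float           # where, in degrees
    lon: float
    source: str          # "gps" or "cell-tower": even without GPS,
                         # tower handoffs locate the phone coarsely

fix = LocationFix("subscriber-42",
                  datetime(2014, 9, 23, 12, 0, tzinfo=timezone.utc),
                  52.52, 13.405, "cell-tower")
print(fix)
```
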
05:48
Now think, for example, of the issue of posture, the way that you are all sitting right now: the way that you sit, the way that you sit, the way that you sit. It's all different, and it's a function of your leg length and your back and the contours of your back, and if I were to put sensors, maybe 100 sensors, into all of your chairs right now, I could create an index that's fairly unique to you, sort of like a fingerprint, but it's not your finger.

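A minimal sketch of that chair-fingerprint idea, with simulated readings standing in for real hardware: normalize the 100 pressure values into a profile, then match a sitter to the closest enrolled profile:

```python
# Sketch: read ~100 chair pressure sensors, normalize into a feature
# vector, and identify the sitter by nearest neighbor against enrolled
# profiles. Sensor data is simulated, not real.
import math
import random

NUM_SENSORS = 100

def read_posture(seed: int) -> list[float]:
    """Stand-in for reading 100 chair pressure sensors."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(NUM_SENSORS)]
    total = sum(raw)
    return [v / total for v in raw]  # normalize out body weight

def distance(a: list[float], b: list[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

profiles = {"alice": read_posture(1), "bob": read_posture(2)}
sample = read_posture(1)  # "alice" sits down again
who = min(profiles, key=lambda name: distance(profiles[name], sample))
print("best match:", who)
```
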
06:15
So what could we do with this? Researchers in Tokyo are using it as a potential anti-theft device in cars. The idea is that the carjacker sits behind the wheel, tries to drive off, but the car recognizes that a non-approved driver is behind the wheel, and maybe the engine just stops, unless you type in a password into the dashboard to say, "Hey, I have authorization to drive." Great.

06:42
What if every single car in Europe had this technology in it? What could we do then? Maybe, if we aggregated the data, maybe we could identify telltale signs that best predict that a car accident is going to take place in the next five seconds. And then what we will have datafied is driver fatigue, and the service would be: when the car senses that the person slumps into that position, it automatically knows, hey, set an internal alarm that would vibrate the steering wheel, honk inside to say, "Hey, wake up, pay more attention to the road." These are the sorts of things we can do when we datafy more aspects of our lives.

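Continuing the sketch above, fatigue detection could reduce to watching for a large drift from the driver's normal posture profile; the threshold and alarm hooks below are invented for illustration:

```python
# Sketch of the fatigue-alert idea: if posture drifts too far from the
# driver's baseline profile (a slump), trigger the in-car alarm.
# Thresholds and sensor values are made up for illustration.
def posture_drift(baseline: list[float], current: list[float]) -> float:
    return sum(abs(b - c) for b, c in zip(baseline, current)) / len(baseline)

SLUMP_THRESHOLD = 0.15  # a real system would tune this from data

def vibrate_steering_wheel():
    print("steering wheel vibrating")

def honk_inside(message: str):
    print("horn + message:", message)

def check_driver(baseline, current):
    if posture_drift(baseline, current) > SLUMP_THRESHOLD:
        vibrate_steering_wheel()
        honk_inside("Hey, wake up, pay more attention to the road.")

check_driver([0.5, 0.5, 0.0], [0.1, 0.2, 0.7])  # slumped -> alarm fires
```
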
07:29
So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive areas where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the problem and tell the computer to figure it out for itself. And it will help you understand it by seeing its origins.

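In miniature, and assuming scikit-learn as a stand-in for whatever learner one prefers, the shift looks like this: no hand-written rules, just labeled examples:

```python
# The "throw data at the problem" idea in miniature: hand the computer
# labeled examples and let it infer the rule itself.
# The data and labels below are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each example: [hue, brightness]; hue ~0.0 is red, ~0.33 is green.
X = [[0.00, 0.9], [0.02, 0.8], [0.33, 0.9], [0.30, 0.7]]
y = ["stop", "stop", "go", "go"]

model = DecisionTreeClassifier().fit(X, y)  # no explicit rules given
print(model.predict([[0.01, 0.85]]))        # -> ['stop']
```
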
08:08
In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play checkers, so he wrote a computer program so he could play against the computer. He played. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the background, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move.

08:51
He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It collects more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his ability in a task that he taught it.

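A toy, Samuel-flavored sketch of that loop: a linear evaluation function over board features, nudged toward observed outcomes from simulated self-play. It is not a checkers engine, just the shape of the learning loop, with the "game" replaced by random feature vectors and a hidden true value:

```python
# Samuel-style learning in miniature: score positions with a weighted
# sum of features, and improve the weights from game outcomes.
import random

rng = random.Random(0)
NUM_FEATURES = 4
TRUE_WEIGHTS = [0.8, -0.5, 0.3, -0.2]   # hidden "ground truth"
weights = [0.0] * NUM_FEATURES          # the program's evaluation fn

def random_position():
    return [rng.uniform(-1, 1) for _ in range(NUM_FEATURES)]

def evaluate(pos):
    return sum(w * f for w, f in zip(weights, pos))

def outcome(pos):
    """Stand-in for playing the game out: +1 win, -1 loss."""
    return 1.0 if sum(w * f for w, f in zip(TRUE_WEIGHTS, pos)) > 0 else -1.0

# Self-play loop: evaluate, observe the result, nudge the weights.
LEARNING_RATE = 0.05
for _ in range(5000):
    pos = random_position()
    error = outcome(pos) - evaluate(pos)
    for i in range(NUM_FEATURES):
        weights[i] += LEARNING_RATE * error * pos[i]

print("learned weights:", [round(w, 2) for w in weights])
```

The more positions it scores, the more data it collects, and the better its prediction of which boards lead to wins, which is the whole trick.
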
09:30
And this idea of machine learning is going everywhere. How do you think we have self-driving cars? Are we any better off as a society enshrining all the rules of the road into software? No. Memory is cheaper? No. Algorithms are faster? No. Processors are better? No. All of those things matter, but that's not why. It's because we changed the nature of the problem. We changed the nature of the problem from one in which we tried to overtly and explicitly explain to the computer how to drive to one in which we say, "Here's a lot of data around the vehicle. You figure it out. You figure it out that that is a traffic light, that that traffic light is red and not green, that that means that you need to stop and not go forward."

10:18
Machine learning is at the basis of many of the things that we do online: search engines, Amazon's personalization algorithm, computer translation, voice recognition systems.

10:34
Researchers recently have looked at the question of biopsies, cancerous biopsies, and they've asked the computer, by looking at the data and survival rates, to determine whether cells are actually cancerous or not, and sure enough, when you throw the data at it, through a machine-learning algorithm, the machine was able to identify the 12 telltale signs that best predict that this biopsy of the breast cancer cells is indeed cancerous. The problem: the medical literature only knew nine of them. Three of the traits were ones that people didn't need to look for, but that the machine spotted.

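The shape of that study, sketched with randomly generated stand-in data rather than real biopsy measurements: fit a model, then rank which traits carry the most predictive weight:

```python
# Sketch: fit a classifier on labeled samples, then rank which of the
# candidate measurements most strongly predict malignancy.
# The data is synthetic, not real biopsy data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = [f"trait_{i}" for i in range(12)]   # 12 candidate signs
X = rng.normal(size=(500, 12))
# Make three traits actually informative; the rest are noise.
y = (X[:, 0] + 0.8 * X[:, 3] - 0.6 * X[:, 7] > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranking = sorted(zip(model.feature_importances_, feature_names), reverse=True)
for importance, name in ranking[:5]:
    print(f"{name}: {importance:.3f}")
```
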
11:24
Now, there are dark sides to big data as well. It will improve our lives, but there are problems that we need to be conscious of, and the first one is the idea that we may be punished for predictions, that the police may use big data for their purposes, a little bit like "Minority Report." Now, there's a term for this, called predictive policing, or algorithmic criminology, and the idea is that if we take a lot of data, for example where past crimes have been, we know where to send the patrols.

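In its simplest form, that patrol-allocation idea is just a density count; a minimal sketch with invented incident coordinates:

```python
# Sketch: bucket past incident coordinates into a grid and send
# patrols to the densest cells. Locations are invented.
from collections import Counter

incidents = [(3.2, 1.1), (3.4, 1.3), (3.3, 1.2), (7.8, 9.1), (0.5, 0.4)]
CELL = 1.0  # grid cell size in arbitrary map units

cells = Counter((int(x / CELL), int(y / CELL)) for x, y in incidents)
patrol_targets = [cell for cell, _ in cells.most_common(2)]
print("send patrols to cells:", patrol_targets)
```
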
11:58
That makes sense, but the problem, of course, is that it's not simply going to stop at location data; it's going to go down to the level of the individual. Why don't we use data about the person's high school transcript? Maybe we should use the fact that they're unemployed or not, their credit score, their web-surfing behavior, whether they're up late at night. Their Fitbit, when it's able to identify biochemistries, will show that they have aggressive thoughts.

12:27
We may have algorithms that are likely to predict what we are about to do, and we may be held accountable before we've actually acted. Privacy was the central challenge in a small data era. In the big data age, the challenge will be safeguarding free will, moral choice, human volition, human agency.

12:54
There is another problem: Big data is going to steal our jobs. Big data and algorithms are going to challenge white collar, professional knowledge work in the 21st century in the same way that factory automation and the assembly line challenged blue collar labor in the 20th century. Think about a lab technician who is looking through a microscope at a cancer biopsy and determining whether it's cancerous or not. The person went to university. The person buys property. He or she votes. He or she is a stakeholder in society. And that person's job, as well as an entire fleet of professionals like that person, is going to find that their jobs are radically changed or actually completely eliminated.

13:43
Now, we like to think that technology creates jobs over a period of time, after a short, temporary period of dislocation, and that is true for the frame of reference with which we all live, the Industrial Revolution, because that's precisely what happened. But we forget something in that analysis: there are some categories of jobs that simply get eliminated and never come back. The Industrial Revolution wasn't very good if you were a horse.

14:11
So we're going to need to be careful and take big data and adjust it for our needs, our very human needs. We have to be the master of this technology, not its servant. We are just at the outset of the big data era, and honestly, we are not very good at handling all the data that we can now collect. It's not just a problem for the National Security Agency. Businesses collect lots of data, and they misuse it too, and we need to get better at this, and this will take time. It's a little bit like the challenge that was faced by primitive man and fire. This is a tool, but this is a tool that, unless we're careful, will burn us.

14:56
Big data is going to transform how we live, how we work and how we think. It is going to help us manage our careers and lead lives of satisfaction and hope and happiness and health, but in the past, we've often looked at information technology and our eyes have only seen the T, the technology, the hardware, because that's what was physical. We now need to recast our gaze at the I, the information, which is less apparent, but in some ways a lot more important. Humanity can finally learn from the information that it can collect, as part of our timeless quest to understand the world and our place in it, and that's why big data is a big deal.

15:46
(Applause)