Kenneth Cukier: Big data is better data

532,959 views · 2014-09-23

TED

Translator: Yubal Masalker Reviewer: Sigal Tifferet
00:12
America's favorite pie is?
00:16
Audience: Apple. Kenneth Cukier: Apple. Of course it is.
00:20
How do we know it?
00:21
Because of data.
00:24
You look at supermarket sales.
00:26
You look at supermarket sales of 30-centimeter pies
00:29
that are frozen, and apple wins, no contest.
00:33
The majority of the sales are apple.
00:38
But then supermarkets started selling
00:41
smaller, 11-centimeter pies,
00:43
and suddenly, apple fell to fourth or fifth place.
00:48
Why? What happened?
00:50
Okay, think about it.
00:53
When you buy a 30-centimeter pie,
00:57
the whole family has to agree,
00:59
and apple is everyone's second favorite.
01:03
(Laughter)
01:05
But when you buy an individual 11-centimeter pie,
01:09
you can buy the one that you want.
01:12
You can get your first choice.
01:16
You have more data.
01:18
You can see something
01:20
that you couldn't see
01:21
when you only had smaller amounts of it.
01:25
Now, the point here is that more data
01:27
doesn't just let us see more,
01:29
more of the same thing we were looking at.
01:31
More data allows us to see new.
01:35
It allows us to see better.
01:38
It allows us to see different.
01:42
In this case, it allows us to see
01:45
what America's favorite pie is:
01:48
not apple.
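Editor's sketch: the pie reversal can be reproduced with a toy model. The households and rankings below are invented for illustration and are not sales data from the talk, and `family_pie` and `count_sales` are hypothetical names. A shared pie is modeled as the flavor with the best combined rank across the family; an individual pie is simply each person's first choice.

```python
from collections import Counter

# Hypothetical households: each person ranks four flavors, favorite first.
# These rankings are made up for illustration; they are not real sales data.
HOUSEHOLDS = [
    [["cherry", "apple", "pecan", "pumpkin"],
     ["pecan", "apple", "cherry", "pumpkin"],
     ["pumpkin", "apple", "cherry", "pecan"]],
    [["pecan", "apple", "pumpkin", "cherry"],
     ["cherry", "apple", "pumpkin", "pecan"]],
    [["pumpkin", "apple", "pecan", "cherry"],
     ["pecan", "apple", "cherry", "pumpkin"],
     ["cherry", "apple", "pumpkin", "pecan"]],
]

def family_pie(household):
    """Pick the flavor with the lowest summed rank position (a compromise)."""
    flavors = household[0]
    return min(flavors, key=lambda f: sum(ranking.index(f) for ranking in household))

def count_sales(households):
    """Return (shared-pie counts, individual-pie counts) as Counters."""
    shared = Counter(family_pie(h) for h in households)
    individual = Counter(person[0] for h in households for person in h)
    return shared, individual

shared, individual = count_sales(HOUSEHOLDS)
print("shared pies:", shared)
print("individual pies:", individual)
```

With these made-up rankings, apple wins every shared purchase yet is nobody's first choice, so it does not appear at all in the individual counts.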
01:50
Now, you probably all have heard the term big data.
01:54
In fact, you're probably sick of hearing the term
01:56
big data.
01:58
It is true that there is a lot of hype around the term,
02:01
and that is very unfortunate,
02:03
because big data is an extremely important tool
02:06
by which society is going to advance.
02:10
In the past, we used to look at small data
02:14
and think about what it would mean
02:15
to try to understand the world,
02:17
and now we have a lot more of it,
02:19
more than we ever could before.
02:22
What we find is that when we have
02:23
a large body of data, we can fundamentally do things
02:26
that we couldn't do when we only had smaller amounts.
02:29
Big data is important, and big data is new,
02:32
and when you think about it,
02:34
the only way this planet is going to deal
02:36
with its global challenges,
02:38
to feed people, supply them with medical care,
02:41
supply them with energy, electricity,
02:44
and to make sure they're not burnt to a crisp
02:46
because of global warming,
02:47
is because of the effective use of data.
02:51
So what is new about big data? What is the big deal?
02:55
Well, to answer that question, let's think about
02:58
what information looked like,
03:00
physically looked like in the past.
03:03
In 1908, on the island of Crete,
03:06
archaeologists discovered a clay disc.
03:11
They dated it from 2000 B.C., so it's 4,000 years old.
03:15
Now, there's inscriptions on this disc,
03:17
but we actually don't know what it means.
03:18
It's a complete mystery, but the point is that
03:21
this is what information used to look like
03:22
4,000 years ago.
03:25
This is how society stored
03:27
and transmitted information.
03:31
Now, society hasn't advanced all that much.
03:35
We still store information on discs,
03:38
but now we can store a lot more information,
03:41
more than ever before.
03:43
Searching it is easier. Copying it easier.
03:46
Sharing it is easier. Processing it is easier.
03:49
And what we can do is we can reuse this information
03:52
for uses that we never even imagined
03:54
when we first collected the data.
03:57
In this respect, the data has gone
03:59
from a stock to a flow,
04:03
from something that is stationary and static
04:07
to something that is fluid and dynamic.
04:10
There is, if you will, a liquidity to information.
04:14
The disc that was discovered off of Crete
04:18
that's 4,000 years old, is heavy,
04:22
it doesn't store a lot of information,
04:24
and that information is unchangeable.
04:27
By contrast, all of the files
04:31
that Edward Snowden took
04:33
from the National Security Agency in the United States
04:35
fits on a memory stick
04:38
the size of a fingernail,
04:41
and it can be shared at the speed of light.
04:45
More data. More.
04:51
Now, one reason why we have so much data in the world today
04:53
is we are collecting things
04:54
that we've always collected information on,
04:57
but another reason why is we're taking things
05:00
that have always been informational
05:03
but have never been rendered into a data format
05:05
and we are putting it into data.
05:08
Think, for example, the question of location.
05:11
Take, for example, Martin Luther.
05:13
If we wanted to know in the 1500s
05:15
where Martin Luther was,
05:18
we would have to follow him at all times,
05:20
maybe with a feathery quill and an inkwell,
05:22
and record it,
05:23
but now think about what it looks like today.
05:26
You know that somewhere,
05:28
probably in a telecommunications carrier's database,
05:30
there is a spreadsheet or at least a database entry
05:33
that records your information
05:35
of where you've been at all times.
05:37
If you have a cell phone,
05:39
and that cell phone has GPS, but even if it doesn't have GPS,
05:42
it can record your information.
05:44
In this respect, location has been datafied.
05:48
Now think, for example, of the issue of posture,
05:53
the way that you are all sitting right now,
05:54
the way that you sit,
05:56
the way that you sit, the way that you sit.
05:59
It's all different, and it's a function of your leg length
06:01
and your back and the contours of your back,
06:03
and if I were to put sensors, maybe 100 sensors
06:05
into all of your chairs right now,
06:07
I could create an index that's fairly unique to you,
06:11
sort of like a fingerprint, but it's not your finger.
06:15
So what could we do with this?
06:18
Researchers in Tokyo are using it
06:21
as a potential anti-theft device in cars.
06:25
The idea is that the carjacker sits behind the wheel,
06:28
tries to stream off, but the car recognizes
06:30
that a non-approved driver is behind the wheel,
06:32
and maybe the engine just stops, unless you
06:35
type in a password into the dashboard
06:38
to say, "Hey, I have authorization to drive." Great.
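Editor's sketch: one simple way to picture the seat "fingerprint" is as a vector of sensor readings compared against an enrolled profile. Everything below is an assumption made for illustration (the readings, the five-sensor layout, the threshold, and the names `distance` and `may_drive`); the talk does not describe how the Tokyo researchers actually match drivers.

```python
import math

# Hypothetical enrolled profile: average pressure readings from seat sensors.
# The talk imagines about 100 sensors; five are used here to keep it short.
ENROLLED_PROFILE = [0.82, 0.40, 0.65, 0.30, 0.91]
MATCH_THRESHOLD = 0.15  # assumed tolerance; a real system would need tuning

def distance(a, b):
    """Euclidean distance between two equal-length sensor readings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def may_drive(reading, password_ok=False):
    """Allow driving if the posture matches the profile or a password was given."""
    return password_ok or distance(reading, ENROLLED_PROFILE) <= MATCH_THRESHOLD

owner = [0.80, 0.42, 0.66, 0.28, 0.90]      # sits much like the enrolled driver
stranger = [0.55, 0.70, 0.35, 0.60, 0.50]   # sits quite differently

print(may_drive(owner))
print(may_drive(stranger))
print(may_drive(stranger, password_ok=True))
```

The owner's reading falls within the tolerance, the stranger's does not, and the password acts as the override mentioned in the talk.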
06:42
What if every single car in Europe
06:45
had this technology in it?
06:46
What could we do then?
06:50
Maybe, if we aggregated the data,
06:52
maybe we could identify telltale signs
06:56
that best predict that a car accident
06:58
is going to take place in the next five seconds.
07:04
And then what we will have datafied
07:07
is driver fatigue,
07:09
and the service would be when the car senses
07:11
that the person slumps into that position,
07:14
automatically knows, hey, set an internal alarm
07:18
that would vibrate the steering wheel, honk inside
07:20
to say, "Hey, wake up,
07:22
pay more attention to the road."
07:24
These are the sorts of things we can do
07:26
when we datafy more aspects of our lives.
07:29
So what is the value of big data?
07:32
Well, think about it.
07:35
You have more information.
07:37
You can do things that you couldn't do before.
07:40
One of the most impressive areas
07:42
where this concept is taking place
07:44
is in the area of machine learning.
07:47
Machine learning is a branch of artificial intelligence,
07:50
which itself is a branch of computer science.
07:53
The general idea is that instead of
07:55
instructing a computer what to do,
07:57
we are going to simply throw data at the problem
08:00
and tell the computer to figure it out for itself.
08:03
And it will help you understand it
08:05
by seeing its origins.
08:08
In the 1950s, a computer scientist
08:11
at IBM named Arthur Samuel liked to play checkers,
08:14
so he wrote a computer program
08:16
so he could play against the computer.
08:18
He played. He won.
08:21
He played. He won.
08:23
He played. He won,
08:26
because the computer only knew
08:28
what a legal move was.
08:30
Arthur Samuel knew something else.
08:32
Arthur Samuel knew strategy.
08:37
So he wrote a small sub-program alongside it
08:39
operating in the background, and all it did
08:41
was score the probability
08:43
that a given board configuration would likely lead
08:46
to a winning board versus a losing board
08:49
after every move.
08:51
He plays the computer. He wins.
08:54
He plays the computer. He wins.
08:57
He plays the computer. He wins.
09:01
And then Arthur Samuel leaves the computer
09:03
to play itself.
09:05
It plays itself. It collects more data.
09:09
It collects more data. It increases the accuracy of its prediction.
09:13
And then Arthur Samuel goes back to the computer
09:15
and he plays it, and he loses,
09:17
and he plays it, and he loses,
09:19
and he plays it, and he loses,
09:21
and Arthur Samuel has created a machine
09:24
that surpasses his ability in a task that he taught it.
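Editor's sketch: the loop described here (score how often a position leads to a win, play against yourself, collect more data, sharpen the estimates) can be shown on a far smaller game than checkers. This is not Samuel's program. The game (take 1 to 3 stones from a pile, whoever takes the last stone wins), the function names, and the parameters such as `explore` are all assumptions chosen to keep the example short.

```python
import random

def win_rate(stats, pile):
    """Estimated chance that the player about to move at `pile` wins."""
    if pile == 0:
        return 0.0  # no stones left: the previous player took the last one
    wins, visits = stats.get(pile, (0, 0))
    return wins / visits if visits else 0.5  # unseen position: no opinion yet

def best_move(stats, pile):
    """Take the number of stones that leaves the opponent worst off."""
    moves = range(1, min(3, pile) + 1)
    return min(moves, key=lambda take: win_rate(stats, pile - take))

def self_play(games=5000, start_pile=10, explore=0.2, seed=0):
    """Play the program against itself and tally wins per position."""
    rng = random.Random(seed)
    stats = {}  # pile size -> [wins, visits] for the player to move
    for _ in range(games):
        pile, player, history = start_pile, 0, []
        while pile > 0:
            history.append((player, pile))
            if rng.random() < explore:
                take = rng.randint(1, min(3, pile))  # occasionally try another move
            else:
                take = best_move(stats, pile)
            pile -= take
            player = 1 - player
        winner = 1 - player  # the player who made the last move
        for mover, seen_pile in history:
            record = stats.setdefault(seen_pile, [0, 0])
            record[0] += int(mover == winner)
            record[1] += 1
    return stats

stats = self_play()
for pile in (4, 5, 6, 7):
    print(pile, round(win_rate(stats, pile), 2), best_move(stats, pile))
```

Nobody tells the program that leaving the opponent with four stones is strong play. The preference comes out of the win tallies that self-play accumulates.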
09:30
And this idea of machine learning
09:33
is going everywhere.
09:37
How do you think we have self-driving cars?
09:40
Are we any better off as a society
09:42
enshrining all the rules of the road into software?
09:45
No. Memory is cheaper. No.
09:48
Algorithms are faster. No. Processors are better. No.
09:52
All of those things matter, but that's not why.
09:55
It's because we changed the nature of the problem.
09:58
We changed the nature of the problem from one
09:59
in which we tried to overtly and explicitly
10:02
explain to the computer how to drive
10:04
to one in which we say,
10:05
"Here's a lot of data around the vehicle.
10:07
You figure it out.
10:09
You figure it out that that is a traffic light,
10:11
that that traffic light is red and not green,
10:13
that that means that you need to stop
10:15
and not go forward."
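Editor's sketch: the contrast between writing the rule and handing over data can be made concrete with a deliberately tiny learner. The color values, the `learn` and `decide` names, and the nearest-centroid method are assumptions for illustration only; real driving systems are far more elaborate and the talk does not specify one.

```python
# Hypothetical training data: average (R, G, B) color of the lit lamp,
# paired with what a human driver did. Values are invented for illustration.
EXAMPLES = [
    ((220, 30, 40), "stop"), ((200, 50, 35), "stop"), ((240, 20, 60), "stop"),
    ((30, 210, 80), "go"), ((50, 190, 60), "go"), ((20, 230, 90), "go"),
]

def learn(examples):
    """Average the colors seen for each action (a nearest-centroid model)."""
    grouped = {}
    for color, action in examples:
        grouped.setdefault(action, []).append(color)
    return {
        action: tuple(sum(channel) / len(colors) for channel in zip(*colors))
        for action, colors in grouped.items()
    }

def decide(color, model):
    """Choose the action whose learned average color is closest."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda action: squared_distance(color, model[action]))

model = learn(EXAMPLES)
print(decide((210, 40, 50), model))  # a reddish lamp not seen in training
print(decide((40, 200, 70), model))  # a greenish lamp not seen in training
```

No line of the code says "red means stop". The association is recovered from the labeled examples, which is the shift in the nature of the problem that the talk describes.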
10:18
Machine learning is at the basis
10:19
of many of the things that we do online:
10:21
search engines,
10:23
Amazon's personalization algorithm,
10:27
computer translation,
10:29
voice recognition systems.
10:34
Researchers recently have looked at
10:36
the question of biopsies,
10:40
cancerous biopsies,
10:42
and they've asked the computer to identify
10:45
by looking at the data and survival rates
10:47
to determine whether cells are actually
10:52
cancerous or not,
10:54
and sure enough, when you throw the data at it,
10:56
through a machine-learning algorithm,
10:58
the machine was able to identify
11:00
the 12 telltale signs that best predict
11:02
that this biopsy of the breast cancer cells
11:06
are indeed cancerous.
11:09
The problem: The medical literature
11:11
only knew nine of them.
11:14
Three of the traits were ones
11:16
that people didn't need to look for,
11:19
but that the machine spotted.
11:24
Now, there are dark sides to big data as well.
11:30
It will improve our lives, but there are problems
11:32
that we need to be conscious of,
11:35
and the first one is the idea
11:38
that we may be punished for predictions,
11:40
that the police may use big data for their purposes,
11:44
a little bit like "Minority Report."
11:47
Now, it's a term called predictive policing,
11:49
or algorithmic criminology,
11:51
and the idea is that if we take a lot of data,
11:53
for example where past crimes have been,
11:56
we know where to send the patrols.
11:58
That makes sense, but the problem, of course,
12:00
is that it's not simply going to stop on location data,
12:05
it's going to go down to the level of the individual.
12:08
Why don't we use data about the person's
12:10
high school transcript?
12:12
Maybe we should use the fact that
12:14
they're unemployed or not, their credit score,
12:16
their web-surfing behavior,
12:17
whether they're up late at night.
12:19
Their Fitbit, when it's able to identify biochemistries,
12:22
will show that they have aggressive thoughts.
12:27
We may have algorithms that are likely to predict
12:29
what we are about to do,
12:31
and we may be held accountable
12:32
before we've actually acted.
12:34
Privacy was the central challenge
12:36
in a small data era.
12:39
In the big data age,
12:41
the challenge will be safeguarding free will,
12:46
moral choice, human volition,
12:49
human agency.
12:54
There is another problem:
12:56
Big data is going to steal our jobs.
13:00
Big data and algorithms are going to challenge
13:03
white collar, professional knowledge work
13:06
in the 21st century
13:08
in the same way that factory automation
13:10
and the assembly line
13:13
challenged blue collar labor in the 20th century.
13:16
Think about a lab technician
13:18
who is looking through a microscope
13:19
at a cancer biopsy
13:21
and determining whether it's cancerous or not.
13:23
The person went to university.
13:25
The person buys property.
13:27
He or she votes.
13:29
He or she is a stakeholder in society.
13:32
And that person's job,
13:34
as well as an entire fleet
13:35
of professionals like that person,
13:37
is going to find that their jobs are radically changed
13:40
or actually completely eliminated.
13:43
Now, we like to think
13:44
that technology creates jobs over a period of time
13:47
after a short, temporary period of dislocation,
13:51
and that is true for the frame of reference
13:53
with which we all live, the Industrial Revolution,
13:55
because that's precisely what happened.
13:57
But we forget something in that analysis:
13:59
There are some categories of jobs
14:01
that simply get eliminated and never come back.
14:05
The Industrial Revolution wasn't very good
14:07
if you were a horse.
14:11
So we're going to need to be careful
309
851182
2055
ืœื›ืŸ ื™ื”ื™ื” ืขืœื™ื ื• ืœื”ื™ื–ื”ืจ
14:13
and take big data and adjust it for our needs,
310
853237
3514
ื•ืœื”ืชืื™ื ืืช ื‘ื™ื’ ื“ืื˜ื” ืœืฆืจื›ื™ื ื•,
14:16
our very human needs.
311
856751
3185
ืฆืจื›ื™ื ื• ื”ืื ื•ืฉื™ื™ื ื‘ื™ื•ืชืจ.
14:19
We have to be the master of this technology,
312
859936
1954
ื™ื”ื™ื” ืขืœื™ื ื• ืœื”ื™ื•ืช ืื“ื•ื ื™ื” ืฉืœ ื˜ื›ื ื•ืœื•ื’ื™ื” ื–ื•,
14:21
not its servant.
313
861890
1656
ืœื ืžืฉืจืชื™ื”.
14:23
We are just at the outset of the big data era,
314
863546
2958
ืื ื• ื ืžืฆืื™ื ืจืง ื‘ืชื—ื™ืœืชื• ืฉืœ ืขื™ื“ืŸ ื‘ื™ื’ ื“ืื˜ื”,
14:26
and honestly, we are not very good
315
866504
3150
ื•ื”ืืžืช ื”ื™ื ืฉืื ื• ืœื ืžืฆื˜ื™ื™ื ื™ื
14:29
at handling all the data that we can now collect.
316
869654
4207
ื‘ื˜ื™ืคื•ืœ ื‘ื›ืœ ื”ื ืชื•ื ื™ื ืฉืื ื• ืžืกื•ื’ืœื™ื ืœืืกื•ืฃ ื›ื™ื•ื.
14:33
It's not just a problem for the National Security Agency.
317
873861
3330
ื–ื• ืœื ืจืง ื‘ืขื™ื” ื”ื ื•ื’ืขืช ืœืกื•ื›ื ื•ืช ืœื‘ื™ื˜ื—ื•ืŸ ืœืื•ืžื™.
14:37
Businesses collect lots of data, and they misuse it too,
318
877191
3038
ื—ื‘ืจื•ืช ืื•ืกืคื•ืช ื”ืžื•ืŸ ื ืชื•ื ื™ื ื•ื”ืŸ ื’ื ืžืฉืชืžืฉื•ืช ื‘ื• ืœืจืขื”,
14:40
and we need to get better at this, and this will take time.
319
880229
3667
ื•ืขืœื™ื ื• ืœื”ืฉืชืคืจ ื‘ืชื—ื•ื ื–ื”, ื•ื–ื” ื™ืงื— ื–ืžืŸ.
14:43
It's a little bit like the challenge that was faced
320
883896
1822
ื–ื” ืงืฆืช ื›ืžื• ื”ืืชื’ืจ ืฉื ื™ืฆื‘
14:45
by primitive man and fire.
321
885718
2407
ื‘ืคื ื™ ื”ืื“ื ื”ืงื“ืžื•ืŸ ืขื ื”ืืฉ.
14:48
This is a tool, but this is a tool that,
322
888125
1885
ื–ื”ื• ื›ืœื™, ืื‘ืœ ื›ืœื™ ืฉืื
14:50
unless we're careful, will burn us.
323
890010
3559
ืœื ื ื”ื™ื” ื–ื”ื™ืจื™ื ืื™ืชื•, ื”ื•ื ื™ืฉืจื•ืฃ ืื•ืชื ื•.
14:56
Big data is going to transform how we live,
324
896008
3120
ื‘ื™ื’ ื“ืื˜ื” ืขื•ืžื“ ืœืฉื ื•ืช ืืช ื“ืจืš ื—ื™ื™ื ื•,
14:59
how we work and how we think.
325
899128
2801
ืืช ื“ืจืš ืขื‘ื•ื“ืชื ื• ื•ื—ืฉื™ื‘ืชื ื•.
15:01
It is going to help us manage our careers
326
901929
1889
ื”ื•ื ื™ืกื™ื™ืข ืœื ื• ืœื ื”ืœ ืืช ื”ืงืจื™ื™ืจื•ืช ืฉืœื ื•
15:03
and lead lives of satisfaction and hope
327
903818
3634
ื•ืœื ื”ืœ ื—ื™ื™ื ืฉืœ ืกื™ืคื•ืง, ืชืงื•ื•ื”
15:07
and happiness and health,
328
907452
2992
ืื•ืฉืจ ื•ื‘ืจื™ืื•ืช.
15:10
but in the past, we've often looked at information technology
329
910444
3306
ืื‘ืœ ื‘ืขื‘ืจ, ื”ืกืชื›ืœื ื• ืขืœ "ื˜ื›ื ื•ืœื•ื’ื™ื™ืช ืžื™ื“ืข"
15:13
and our eyes have only seen the T,
330
913750
2208
ื•ืจืื™ื ื• ืจืง ืืช ื”-"ื˜",
15:15
the technology, the hardware,
331
915958
1686
ืืช ื”ื˜ื›ื ื•ืœื•ื’ื™ื”, ื”ื—ื•ืžืจื”,
15:17
because that's what was physical.
332
917644
2262
ื›ื™ ื”ื ื”ื™ื• ื”ื“ื‘ืจื™ื ื”ืคื™ื–ื™ืงืœื™ื™ื.
15:19
We now need to recast our gaze at the I,
333
919906
2924
ื›ืขืช ืื ื• ืฆืจื™ื›ื™ื ืœื”ืกืชื›ืœ ืขืœ ื”-"ืž",
15:22
the information,
334
922830
1380
ื”ืžื™ื“ืข,
15:24
which is less apparent,
335
924210
1373
ื”ื‘ื•ืœื˜ ืคื—ื•ืช ืœืขื™ืŸ,
15:25
but in some ways a lot more important.
336
925583
4109
ืื‘ืœ ื‘ืžื•ื‘ื ื™ื ืžืกื•ื™ื™ืžื™ื ื”ื•ื ื”ืจื‘ื” ื™ื•ืชืจ ื—ืฉื•ื‘.
15:29
Humanity can finally learn from the information
337
929692
3465
ื”ืื ื•ืฉื•ืช ื™ื›ื•ืœื” ืกื•ืฃ-ืกื•ืฃ ืœืœืžื•ื“ ืžื”ืžื™ื“ืข
15:33
that it can collect,
338
933157
2418
ืฉื”ื™ื ืžืกื•ื’ืœืช ืœืืกื•ืฃ,
15:35
as part of our timeless quest
339
935575
2115
ื›ื—ืœืง ืžืžืกืขื™ื ื• ื”ื ืฆื—ื™
15:37
to understand the world and our place in it,
340
937690
3159
ืœื”ื‘ื ืช ื”ืขื•ืœื ื•ืžืงื•ืžื ื• ื‘ืชื•ื›ื•,
15:40
and that's why big data is a big deal.
341
940849
5631
ื•ื–ื• ื”ืกื™ื‘ื” ืžื“ื•ืข ื‘ื™ื’ ื“ืื˜ื” ื”ื•ื ืขื ื™ื™ืŸ ื›ื” ื—ืฉื•ื‘.
15:46
(Applause)
342
946480
3568
(ืžื—ื™ืื•ืช ื›ืคื™ื™ื)