Big Data - Tim Smith

577,235 views ・ 2013-05-03

TED-Ed



00:00
Translator: Andrea McDonough Reviewer: Jessica Ruby
0
0
7000
Translator: Andrea McDonough Reviewer: Daban Q. Jaff
00:31
Big data is an elusive concept.
1
31085
2762
00:35
It represents an amount of digital information,
2
35987
2688
00:38
which is uncomfortable to store,
3
38675
2170
00:40
transport,
4
40845
1128
00:41
or analyze.
5
41973
1878
00:43
Big data is so voluminous
6
43851
1915
00:45
that it overwhelms the technologies of the day
7
45766
2708
00:48
and challenges us to create the next generation
8
48474
2425
00:50
of data storage tools and techniques.
9
50899
3105
00:59
So, big data isn't new.
10
59557
1779
01:01
In fact, physicists at CERN have been wrangling
11
61336
2358
01:03
with the challenge of their ever-expanding big data for decades.
12
63694
4399
01:09
Fifty years ago, CERN's data could be stored
13
69431
2323
01:11
in a single computer.
14
71754
1752
01:13
OK, so it wasn't your usual computer,
15
73506
2154
01:15
this was a mainframe computer
16
75660
1417
01:17
that filled an entire building.
17
77077
2310
01:21
To analyze the data,
18
81494
1169
01:22
physicists from around the world traveled to CERN
19
82663
2948
01:25
to connect to the enormous machine.
20
85611
3026
01:31
In the 1970s, our ever-growing big data
21
91075
2853
01:33
was distributed across different sets of computers,
22
93928
2750
01:36
which mushroomed at CERN.
23
96678
2030
01:38
Each set was joined together
24
98708
1442
01:40
in dedicated, homegrown networks.
25
100150
2528
01:42
But physicists collaborated without regard
26
102678
1786
01:44
for the boundaries between sets,
27
104464
1949
01:46
hence needed to access data on all of these.
28
106413
2889
01:49
So, we bridged the independent networks together
29
109302
1985
01:51
in our own CERNET.
30
111287
3092
01:54
In the 1980s, islands of similar networks
31
114379
2848
01:57
speaking different dialects
32
117227
1544
01:58
sprang up all over Europe and the States,
33
118771
2540
02:01
making remote access possible but torturous.
34
121311
3091
02:04
To make it easy for our physicists across the world
35
124402
2144
02:06
to access the ever-expanding big data
36
126546
2405
02:08
stored at CERN without traveling,
37
128951
1793
02:10
the networks needed to be talking
38
130744
1299
02:12
with the same language.
39
132043
1370
02:13
We adopted the fledgling internetworking standard from the States,
40
133413
3795
02:17
followed by the rest of Europe,
41
137208
1376
02:18
and we established the principal link at CERN
42
138584
2168
02:20
between Europe and the States in 1989,
43
140752
2503
02:23
and the truly global internet took off!
44
143255
2786
02:28
Physicists could easily then access
45
148580
1791
02:30
the terabytes of big data
46
150371
1812
02:32
remotely from around the world,
47
152183
1663
02:33
generate results,
48
153846
1379
02:35
and write papers in their home institutes.
49
155225
2295
02:37
Then, they wanted to share their findings
50
157520
1501
02:39
with all their colleagues.
51
159021
1792
02:40
To make this information sharing easy,
52
160813
1603
02:42
we created the web in the early 1990s.
53
162416
2942
02:45
Physicists no longer needed to know
54
165358
1838
02:47
where the information was stored
55
167196
1637
02:48
in order to find it and access it on the web,
56
168833
2569
02:51
an idea which caught on across the world
57
171402
2134
02:53
and has transformed the way we communicate
58
173536
2376
02:55
in our daily lives.
59
175912
1668
03:00
During the early 2000s,
60
180226
1407
03:01
the continued growth of our big data
61
181633
1990
03:03
outstripped our capability to analyze it at CERN,
62
183623
3291
03:06
despite having buildings full of computers.
63
186914
3585
03:10
We had to start distributing the petabytes of data
64
190499
2306
03:12
to our collaborating partners
65
192805
1582
03:14
in order to employ local computing and storage
66
194387
2752
03:17
at hundreds of different institutes.
67
197139
2835
03:19
In order to orchestrate these interconnected resources
68
199974
2295
03:22
with their diverse technologies,
69
202269
2044
03:24
we developed a computing grid,
70
204313
1751
03:26
enabling the seamless sharing
71
206064
1576
03:27
of computing resources around the globe.
72
207640
2428
03:30
This relies on trust relationships and mutual exchange.
73
210068
4391
03:34
But this grid model could not be transferred
74
214459
2293
03:36
out of our community so easily,
75
216752
2284
03:39
where not everyone has resources to share
76
219036
2294
03:41
nor could companies be expected
77
221330
1876
03:43
to have the same level of trust.
78
223206
2753
03:45
Instead, an alternative, more business-like approach
79
225959
2295
03:48
for accessing on-demand resources
80
228254
1836
03:50
has been flourishing recently,
81
230090
1708
03:51
called cloud computing,
82
231798
1668
03:53
which other communities are now exploiting
83
233466
1876
03:55
to analyze their big data.
84
235342
2000
03:57
It might seem paradoxical for a place like CERN,
85
237342
2987
04:00
a lab focused on the study
86
240329
1571
04:01
of the unimaginably small building blocks of matter,
87
241900
3171
04:05
to be the source of something as big as big data.
88
245071
3377
04:08
But the way we study the fundamental particles,
89
248448
2082
04:10
as well as the forces by which they interact,
90
250530
2613
04:13
involves creating them fleetingly,
91
253143
2103
04:15
colliding protons in our accelerators
92
255246
2368
04:17
and capturing a trace of them
93
257614
1427
04:19
as they zoom off near light speed.
94
259041
2273
04:21
To see those traces,
95
261314
994
04:22
our detector, with 150 million sensors,
96
262308
3448
04:25
acts like a really massive 3-D camera,
97
265756
2475
04:28
taking a picture of each collision event -
98
268231
2110
04:30
that's up to 14 million times per second.
99
270341
2550
04:32
That makes a lot of data.
100
272891
2533
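
To get a feel for "a lot of data", the two figures quoted above can be multiplied out. The following is only a rough sketch: it assumes every sensor is read out in every picture, and the one-byte-per-reading size is a made-up value for illustration, not a figure from the talk. The function names are hypothetical as well.

```python
# Figures quoted in the talk.
SENSORS = 150_000_000             # sensors in the detector
PICTURES_PER_SECOND = 14_000_000  # peak rate of collision "pictures"

# Assumption for illustration only: each sensor reading takes one byte.
BYTES_PER_READING = 1


def readings_per_second(sensors: int, pictures_per_second: int) -> int:
    """Upper bound on sensor readings per second, if every sensor
    contributes one reading to every picture."""
    return sensors * pictures_per_second


def raw_bytes_per_second(sensors: int, pictures_per_second: int,
                         bytes_per_reading: int) -> int:
    """Raw data rate in bytes per second under the same assumption."""
    return readings_per_second(sensors, pictures_per_second) * bytes_per_reading


if __name__ == "__main__":
    readings = readings_per_second(SENSORS, PICTURES_PER_SECOND)
    rate = raw_bytes_per_second(SENSORS, PICTURES_PER_SECOND, BYTES_PER_READING)
    print(f"{readings:.1e} sensor readings per second")
    print(f"{rate / 1e15:.1f} petabytes per second at {BYTES_PER_READING} byte per reading")
```

Under these assumptions the product is 2.1 x 10^15 readings per second. It is an upper bound on the raw rate, not a measurement of what is actually stored.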
04:37
But if big data has been around for so long,
101
277194
2159
04:39
why do we suddenly keep hearing about it now?
102
279353
2627
04:41
Well, as the old metaphor explains,
103
281980
1711
04:43
the whole is greater than the sum of its parts,
104
283691
2788
04:46
and it is no longer just science that is exploiting this.
105
286479
3777
04:50
The fact that we can derive more knowledge
106
290256
1604
04:51
by joining related information together
107
291860
2330
04:54
and spotting correlations
108
294190
1551
04:55
can inform and enrich numerous aspects of everyday life,
109
295741
3391
04:59
either in real time,
110
299132
1028
05:00
such as traffic or financial conditions,
111
300160
2291
05:02
in short-term evolutions,
112
302451
1755
05:04
such as medical or meteorological,
113
304206
2127
05:06
or in predictive situations,
114
306333
1725
05:08
such as business, crime, or disease trends.
115
308058
3020
05:13
Virtually every field is turning to gathering big data,
116
313369
3063
05:16
with mobile sensor networks spanning the globe,
117
316432
2337
05:18
cameras on the ground and in the air,
118
318769
2287
05:21
archives storing information published on the web,
119
321056
3011
05:24
and loggers capturing the activities
120
324067
2129
05:26
of Internet citizens the world over.
121
326196
2699
05:28
The challenge is on to invent new tools and techniques
122
328895
2591
05:31
to mine these vast stores,
123
331486
1953
05:33
to inform decision making,
124
333439
1801
05:35
to improve medical diagnosis,
125
335240
2256
05:37
and otherwise to answer needs and desires
126
337496
2210
05:39
of tomorrow's society in ways that are unimagined today.
127
339706
3957
