Tim Berners-Lee: The next Web of open, linked data

435,103 views ・ 2009-03-13

TED



00:18
Time flies.
00:20
It's actually almost 20 years ago
00:22
when I wanted to reframe the way we use information,
00:26
the way we work together: I invented the World Wide Web.
00:29
Now, 20 years on, at TED,
00:32
I want to ask your help in a new reframing.
00:37
So going back to 1989,
00:41
I wrote a memo suggesting the global hypertext system.
00:44
Nobody really did anything with it, pretty much.
00:47
But 18 months later -- this is how innovation happens --
00:51
18 months later, my boss said I could do it on the side,
00:55
as a sort of a play project,
00:57
kick the tires of a new computer we'd got.
00:59
And so he gave me the time to code it up.
01:02
So I basically roughed out what HTML should look like:
01:07
hypertext protocol, HTTP;
01:10
the idea of URLs, these names for things
01:13
which started with HTTP.
01:15
I wrote the code and put it out there.
01:17
Why did I do it?
01:19
Well, it was basically frustration.
01:21
I was frustrated -- I was working as a software engineer
01:25
in this huge, very exciting lab,
01:27
lots of people coming from all over the world.
01:29
They brought all sorts of different computers with them.
01:32
They had all sorts of different data formats,
01:35
all sorts, all kinds of documentation systems.
01:37
So that, in all that diversity,
01:40
if I wanted to figure out how to build something
01:42
out of a bit of this and a bit of this,
01:44
everything I looked into, I had to connect to some new machine,
01:48
I had to learn to run some new program,
01:50
I would find the information I wanted in some new data format.
01:55
And these were all incompatible.
01:57
It was just very frustrating.
01:59
The frustration was all this unlocked potential.
02:01
In fact, on all these discs there were documents.
02:04
So if you just imagined them all
02:07
being part of some big, virtual documentation system in the sky,
02:12
say on the Internet,
02:14
then life would be so much easier.
02:16
Well, once you've had an idea like that it kind of gets under your skin
02:20
and even if people don't read your memo --
02:22
actually he did, it was found after he died, his copy.
02:25
He had written, "Vague, but exciting," in pencil, in the corner.
02:28
(Laughter)
02:30
But in general it was difficult -- it was really difficult to explain
02:34
what the web was like.
02:36
It's difficult to explain to people now that it was difficult then.
02:38
But then -- OK, when TED started, there was no web
02:41
so things like "click" didn't have the same meaning.
02:44
I can show somebody a piece of hypertext,
02:46
a page which has got links,
02:48
and we click on the link and bing -- there'll be another hypertext page.
02:52
Not impressive.
02:54
You know, we've seen that -- we've got things on hypertext on CD-ROMs.
02:57
What was difficult was to get them to imagine:
03:00
so, imagine that that link could have gone
03:04
to virtually any document you could imagine.
03:07
Alright, that is the leap that was very difficult for people to make.
03:11
Well, some people did.
03:13
So yeah, it was difficult to explain, but there was a grassroots movement.
03:17
And that is what has made it most fun.
03:21
That has been the most exciting thing,
03:23
not the technology, not the things people have done with it,
03:25
but actually the community, the spirit of all these people
03:27
getting together, sending the emails.
03:29
That's what it was like then.
03:31
Do you know what? It's funny, but right now it's kind of like that again.
03:34
I asked everybody, more or less, to put their documents --
03:36
I said, "Could you put your documents on this web thing?"
03:39
And you did.
03:42
Thanks.
03:43
It's been a blast, hasn't it?
03:45
I mean, it has been quite interesting
03:47
because we've found out that the things that happen with the web
03:49
really sort of blow us away.
03:51
They're much more than we'd originally imagined
03:53
when we put together the little, initial website
03:55
that we started off with.
03:57
Now, I want you to put your data on the web.
04:00
Turns out that there is still huge unlocked potential.
04:04
There is still a huge frustration
04:06
that people have because we haven't got data on the web as data.
04:10
What do you mean, "data"? What's the difference -- documents, data?
04:12
Well, documents you read, OK?
04:15
More or less, you read them, you can follow links from them, and that's it.
04:18
Data -- you can do all kinds of stuff with a computer.
04:20
Who was here or has otherwise seen Hans Rosling's talk?
04:26
One of the great -- yes a lot of people have seen it --
04:30
one of the great TED Talks.
04:32
Hans put up this presentation
04:34
in which he showed, for various different countries, in various different colors --
04:39
he showed income levels on one axis
04:42
and he showed infant mortality, and he shot this thing animated through time.
04:45
So, he'd taken this data and made a presentation
04:49
which just shattered a lot of myths that people had
04:52
about the economics in the developing world.
04:56
He put up a slide a little bit like this.
04:58
It had underground all the data.
05:00
OK, data is brown and boxy and boring,
05:03
and that's how we think of it, isn't it?
05:05
Because data you can't naturally use by itself.
05:08
But in fact, data drives a huge amount of what happens in our lives
05:12
and it happens because somebody takes that data and does something with it.
05:15
In this case, Hans had put the data together
05:17
he had found from all kinds of United Nations websites and things.
05:22
He had put it together,
05:24
combined it into something more interesting than the original pieces
05:27
and then he'd put it into this software,
05:32
which I think his son developed, originally,
05:34
and produces this wonderful presentation.
05:37
And Hans made a point
05:39
of saying, "Look, it's really important to have a lot of data."
05:43
And I was happy to see that at the party last night
05:46
that he was still saying, very forcibly, "It's really important to have a lot of data."
05:50
So I want us now to think about
05:52
not just two pieces of data being connected, or six like he did,
05:56
but I want to think about a world where everybody has put data on the web
06:01
and so virtually everything you can imagine is on the web
06:03
and then calling that linked data.
06:05
The technology is linked data, and it's extremely simple.
06:07
If you want to put something on the web there are three rules:
06:11
first thing is that those HTTP names --
06:14
those things that start with "http:" --
06:16
we're using them not just for documents now,
06:20
we're using them for things that the documents are about.
06:22
We're using them for people, we're using them for places,
06:24
we're using them for your products, we're using them for events.
06:28
All kinds of conceptual things, they have names now that start with HTTP.
06:32
Second rule, if I take one of these HTTP names and I look it up
06:37
and I do the web thing with it and I fetch the data
06:39
using the HTTP protocol from the web,
06:41
I will get back some data in a standard format
06:44
which is kind of useful data that somebody might like to know
06:49
about that thing, about that event.
06:51
Who's at the event? Whatever it is about that person,
06:53
where they were born, things like that.
06:55
So the second rule is I get important information back.
06:57
Third rule is that when I get back that information
07:01
it's not just got somebody's height and weight and when they were born,
07:04
it's got relationships.
07:06
Data is relationships.
07:08
Interestingly, data is relationships.
07:10
This person was born in Berlin; Berlin is in Germany.
07:14
And when it has relationships, whenever it expresses a relationship
07:17
then the other thing that it's related to
07:20
is given one of those names that starts HTTP.
07:24
So, I can go ahead and look that thing up.
07:26
So I look up a person -- I can look up then the city where they were born; then
07:29
I can look up the region it's in, and the town it's in,
07:32
and the population of it, and so on.
07:35
So I can browse this stuff.
07:37
So that's it, really.
07:39
That is linked data.
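The three rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `WEB` dictionary stands in for real HTTP lookups, and every name under http://example.org/, along with the `bornIn` and `locatedIn` relationships, is hypothetical rather than taken from the talk or from any real vocabulary.

```python
# Rule one: things (people, places), not just documents, get HTTP names.
# Hypothetical in-memory "web": each HTTP name maps to the data that would
# come back when that name is looked up. All names and values are made up.
WEB = {
    "http://example.org/people/alice": {
        "type": "Person",
        "bornIn": "http://example.org/places/berlin",
    },
    "http://example.org/places/berlin": {
        "type": "City",
        "name": "Berlin",
        "locatedIn": "http://example.org/places/germany",
    },
    "http://example.org/places/germany": {
        "type": "Country",
        "name": "Germany",
    },
}


def look_up(name):
    """Rule two: looking up an HTTP name returns useful data about the thing."""
    return WEB[name]


def follow(name, relationships):
    """Rule three: relationships point at other HTTP names, so we can hop along them."""
    for relationship in relationships:
        name = look_up(name)[relationship]
    return name


# Start from a person, hop to the city of birth, then to its country.
country = follow("http://example.org/people/alice", ["bornIn", "locatedIn"])
print(look_up(country)["name"])  # Germany
```

Because every related thing is itself identified by an HTTP name, the same `look_up` step works at each hop, which is what makes the data browsable.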
07:41
I wrote an article entitled "Linked Data" a couple of years ago
07:44
and soon after that, things started to happen.
07:48
The idea of linked data is that we get lots and lots and lots
07:52
of these boxes that Hans had,
07:54
and we get lots and lots and lots of things sprouting.
07:56
It's not just a whole lot of other plants.
07:59
It's not just a root supplying a plant,
08:01
but for each of those plants, whatever it is --
08:04
a presentation, an analysis, somebody's looking for patterns in the data --
08:07
they get to look at all the data
08:10
and they get it connected together,
08:12
and the really important thing about data
08:14
is the more things you have to connect together, the more powerful it is.
08:16
So, linked data.
08:18
The meme went out there.
08:20
And, pretty soon Chris Bizer at the Freie Universität in Berlin
08:24
who was one of the first people to put interesting things up,
08:26
he noticed that Wikipedia --
08:28
you know Wikipedia, the online encyclopedia
08:31
with lots and lots of interesting documents in it.
08:33
Well, in those documents, there are little squares, little boxes.
08:37
And in most information boxes, there's data.
08:40
So he wrote a program to take the data, extract it from Wikipedia,
08:44
and put it into a blob of linked data
08:46
on the web, which he called DBpedia.
08:49
DBpedia is represented by the blue blob in the middle of this slide
08:53
and if you actually go and look up Berlin,
08:55
you'll find that there are other blobs of data
08:57
which also have stuff about Berlin, and they're linked together.
09:00
So if you pull the data from DBpedia about Berlin,
09:03
you'll end up pulling up these other things as well.
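The step of turning an information box into linked data can be sketched roughly as follows. This is not the actual extraction program mentioned in the talk, only a minimal illustration of the idea: the function `infobox_to_triples`, the http://example.org/ names, and the sample box contents are all assumptions made for the example.

```python
def infobox_to_triples(subject, infobox, property_base="http://example.org/property/"):
    """Turn one information box (field -> value) into (subject, property, value) triples.

    Hypothetical helper: the subject and each property get an HTTP name, and a
    value may itself be an HTTP name pointing at another thing.
    """
    triples = []
    for field, value in infobox.items():
        triples.append((subject, property_base + field, value))
    return triples


# A made-up information box for a page about Berlin.
berlin_box = {
    "name": "Berlin",
    "country": "http://example.org/resource/Germany",
}

triples = infobox_to_triples("http://example.org/resource/Berlin", berlin_box)
for triple in triples:
    print(triple)

# Values that are themselves HTTP names are the links to follow next.
links = [value for _, _, value in triples if value.startswith("http://")]
print(links)  # ['http://example.org/resource/Germany']
```

The second print shows why pulling the data about one thing leads on to others: any value that is an HTTP name can be looked up in turn.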
09:05
And the exciting thing is it's starting to grow.
09:08
This is just the grassroots stuff again, OK?
09:10
Let's think about data for a bit.
09:13
Data comes in fact in lots and lots of different forms.
09:16
Think of the diversity of the web. It's a really important thing
09:19
that the web allows you to put all kinds of data up there.
09:22
So it is with data. I could talk about all kinds of data.
09:25
We could talk about government data, enterprise data is really important,
09:29
there's scientific data, there's personal data,
09:32
there's weather data, there's data about events,
09:34
there's data about talks, and there's news and there's all kinds of stuff.
09:38
I'm just going to mention a few of them
09:41
so that you get the idea of the diversity of it,
09:43
so that you also see how much unlocked potential.
09:47
Let's start with government data.
09:49
Barack Obama said in a speech,
09:51
that he -- American government data would be available on the Internet
09:56
in accessible formats.
09:58
And I hope that they will put it up as linked data.
10:00
That's important. Why is it important?
10:02
Not just for transparency, yeah transparency in government is important,
10:05
but that data -- this is the data from all the government departments.
10:08
Think about how much of that data is about how life is lived in America.
10:13
It's actually useful. It's got value.
10:15
I can use it in my company.
10:17
I could use it as a kid to do my homework.
10:19
So we're talking about making the place, making the world run better
10:22
by making this data available.
10:24
In fact if you're responsible -- if you know about some data
10:28
in a government department, often you find that
10:30
these people, they're very tempted to keep it --
10:33
Hans calls it database hugging.
10:36
You hug your database, you don't want to let it go
10:38
until you've made a beautiful website for it.
10:40
Well, I'd like to suggest that rather --
10:42
yes, make a beautiful website,
10:44
who am I to say don't make a beautiful website?
10:46
Make a beautiful website, but first
10:49
give us the unadulterated data,
10:52
we want the data.
10:54
We want unadulterated data.
10:56
OK, we have to ask for raw data now.
10:59
And I'm going to ask you to practice that, OK?
11:01
Can you say "raw"?
11:02
Audience: Raw.
11:03
Tim Berners-Lee: Can you say "data"?
11:04
Audience: Data.
11:05
TBL: Can you say "now"?
11:06
Audience: Now!
11:07
TBL: Alright, "raw data now"!
11:09
Audience: Raw data now!
11:11
Practice that. It's important because you have no idea the number of excuses
11:15
people come up with to hang onto their data
11:17
and not give it to you, even though you've paid for it as a taxpayer.
11:21
And it's not just America. It's all over the world.
11:23
And it's not just governments, of course -- it's enterprises as well.
11:26
So I'm just going to mention a few other thoughts on data.
11:29
Here we are at TED, and all the time we are very conscious
11:34
of the huge challenges that human society has right now --
11:39
curing cancer, understanding the brain for Alzheimer's,
11:42
understanding the economy to make it a little bit more stable,
11:45
understanding how the world works.
11:47
The people who are going to solve those -- the scientists --
11:49
they have half-formed ideas in their head,
11:51
they try to communicate those over the web.
11:54
But a lot of the state of knowledge of the human race at the moment
11:57
is on databases, often sitting in their computers,
12:00
and actually, currently not shared.
12:03
In fact, I'll just go into one area --
12:06
if you're looking at Alzheimer's, for example,
12:08
drug discovery -- there is a whole lot of linked data which is just coming out
12:11
because scientists in that field realize
12:13
this is a great way of getting out of those silos,
12:16
because they had their genomics data in one database
12:20
in one building, and they had their protein data in another.
12:23
Now, they are sticking it onto -- linked data --
12:26
and now they can ask the sort of question, that you probably wouldn't ask,
12:29
I wouldn't ask -- they would.
12:31
What proteins are involved in signal transduction
12:33
and also related to pyramidal neurons?
12:35
Well, you take that mouthful and you put it into Google.
12:38
Of course, there's no page on the web which has answered that question
12:41
because nobody has asked that question before.
12:43
You get 223,000 hits --
12:45
no results you can use.
12:47
You ask the linked data -- which they've now put together --
12:50
32 hits, each of which is a protein which has those properties
12:54
and you can look at.
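A question that bridges two formerly separate databases, like the one about proteins above, amounts to matching on shared names. The sketch below assumes each database has been published as simple (subject, property, value) triples; the protein names, property names, and contents are invented for illustration and do not reflect real biological data or the scientists' actual system.

```python
# Made-up triples standing in for two databases that used to sit in separate silos.
PROTEIN_DATA = [
    ("http://example.org/protein/P1", "involvedIn", "signal transduction"),
    ("http://example.org/protein/P2", "involvedIn", "signal transduction"),
    ("http://example.org/protein/P3", "involvedIn", "DNA repair"),
]
NEURON_DATA = [
    ("http://example.org/protein/P2", "relatedTo", "pyramidal neuron"),
    ("http://example.org/protein/P3", "relatedTo", "pyramidal neuron"),
]


def subjects(triples, prop, value):
    """Return the set of subjects that have the given property and value."""
    return {s for s, p, v in triples if p == prop and v == value}


# Because both datasets use the same HTTP name for the same protein,
# answering the cross-discipline question is a set intersection.
hits = subjects(PROTEIN_DATA, "involvedIn", "signal transduction") & subjects(
    NEURON_DATA, "relatedTo", "pyramidal neuron"
)
print(sorted(hits))  # ['http://example.org/protein/P2']
```

The answer is a short list of things rather than a long list of pages, which is the contrast with keyword search drawn in the talk.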
12:56
The power of being able to ask those questions, as a scientist --
12:59
questions which actually bridge across different disciplines --
13:01
is really a complete sea change.
13:04
It's very, very important.
13:06
Scientists are totally stymied at the moment --
13:08
the power of the data that other scientists have collected is locked up
13:13
and we need to get it unlocked so we can tackle those huge problems.
13:16
Now if I go on like this, you'll think that all the data comes from huge institutions
13:20
and has nothing to do with you.
13:23
But, that's not true.
13:25
In fact, data is about our lives.
13:27
You just -- you log on to your social networking site,
13:30
your favorite one, you say, "This is my friend."
13:32
Bing! Relationship. Data.
13:35
You say, "This photograph, it's about -- it depicts this person."
13:38
Bing! That's data. Data, data, data.
13:41
Every time you do things on the social networking site,
13:43
the social networking site is taking data and using it -- re-purposing it --
13:47
and using it to make other people's lives more interesting on the site.
13:51
But, when you go to another linked data site --
13:53
and let's say this is one about travel,
13:56
and you say, "I want to send this photo to all the people in that group,"
13:59
you can't get over the walls.
14:01
The Economist wrote an article about it, and lots of people have blogged about it --
14:03
tremendous frustration.
14:04
The way to break down the silos is to get interoperability
14:06
between social networking sites.
14:08
We need to do that with linked data.
14:10
One last type of data I'll talk about, maybe it's the most exciting.
14:13
Before I came down here, I looked it up on OpenStreetMap.
14:16
The OpenStreetMap's a map, but it's also a Wiki.
14:18
Zoom in and that square thing is a theater -- which we're in right now --
14:21
The Terrace Theater. It didn't have a name on it.
14:23
So I could go into edit mode, I could select the theater,
14:25
I could add down at the bottom the name, and I could save it back.
14:30
And now if you go back to OpenStreetMap.org,
14:33
and you find this place, you will find that The Terrace Theater has got a name.
14:36
I did that. Me!
14:38
I did that to the map. I just did that!
14:40
I put that up on there. Hey, you know what?
14:42
If I -- that street map is all about everybody doing their bit
14:45
and it creates an incredible resource
14:48
because everybody else does theirs.
14:51
And that is what linked data is all about.
14:54
It's about people doing their bit
14:57
to produce a little bit, and it all connecting.
15:00
That's how linked data works.
15:03
You do your bit. Everybody else does theirs.
15:07
You may not have lots of data which you have yourself to put on there
15:11
but you know to demand it.
15:14
And we've practiced that.
15:16
So, linked data -- it's huge.
15:20
I've only told you a very small number of things.
15:23
There are data in every aspect of our lives,
15:25
every aspect of work and pleasure,
15:28
and it's not just about the number of places where data comes,
15:31
it's about connecting it together.
15:34
And when you connect data together, you get power
15:37
in a way that doesn't happen just with the web, with documents.
15:40
You get this really huge power out of it.
15:44
So, we're at the stage now
15:47
where we have to do this -- the people who think it's a great idea.
15:51
And all the people -- and I think there's a lot of people at TED who do things because --
15:54
even though there's not an immediate return on the investment
15:56
because it will only really pay off when everybody else has done it --
15:59
they'll do it because they're the sort of person who just does things
16:03
which would be good if everybody else did them.
16:06
OK, so it's called linked data.
16:08
I want you to make it.
16:10
I want you to demand it.
16:12
And I think it's an idea worth spreading.
16:14
Thanks.
16:15
(Applause)