Visualizing the world's Twitter data - Jer Thorp

TED-Ed ・ 2013-02-21

Transcriber: Andrea McDonough | Reviewer: Bedirhan Cinar

A couple of years ago I started using Twitter, and one of the things that really charmed me about Twitter is that people would wake up in the morning and they would say, "Good morning!" Which I thought, well, I'm a Canadian, so I liked that politeness.

And so, I'm also a giant nerd, and so I wrote a computer program that would record 24 hours of everybody on Twitter saying, "Good morning!" And then I asked myself my favorite question: "What would that look like?"

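The talk doesn't show the program itself, but a collector along these lines might look roughly like the following minimal Python sketch. Here stream_tweets is a hypothetical stand-in for whatever streaming API is available, and each tweet is assumed to arrive as a dict with "text", "created_at", and "coordinates" fields.

    import json
    import re
    import time

    GREETING = re.compile(r"\bgood morning\b", re.IGNORECASE)

    def record_good_mornings(stream_tweets, hours=24, out_path="mornings.jsonl"):
        # Listen for a fixed window and keep every geotagged greeting.
        deadline = time.time() + hours * 3600
        with open(out_path, "w") as out:
            for tweet in stream_tweets():  # hypothetical streaming generator
                if time.time() > deadline:
                    break
                if GREETING.search(tweet["text"]) and tweet.get("coordinates"):
                    out.write(json.dumps({
                        "time": tweet["created_at"],
                        "lon": tweet["coordinates"][0],
                        "lat": tweet["coordinates"][1],
                    }) + "\n")

Each saved record can then be drawn on a world map at its coordinates, colored by local time, which is essentially the animation described next.
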
Well, as it turns out, I think it would look something like this. Right, so we'd see this wave of people saying, "Good morning!" across the world as they wake up. Now the green people, these are people that wake up at around 8 o'clock in the morning. Who wakes up at 8 o'clock, or says, "Good morning!" at 8? And the orange people, they say, "Good morning!" around 9. And the red people, they say, "Good morning!" around 10. Yeah, more 10s than 8s.

And actually, if you look at this map, we can learn a little bit about how people wake up in different parts of the world. People on the West Coast, for example, wake up a little bit later than people on the East Coast.

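The coloring rule amounts to bucketing each greeting's local hour; a tiny sketch, where the hour-to-band mapping is from the talk but the exact boundaries are assumptions:

    def hour_to_color(local_hour):
        # 8 a.m. -> green, 9 -> orange, 10 (or later) -> red.
        if local_hour <= 8:
            return "green"
        if local_hour == 9:
            return "orange"
        return "red"
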
But that's not all that people say on Twitter, right? We also get these really important tweets, like, "I just landed in Orlando!! [plane sign, plane sign]" Or, "I just landed in Texas [exclamation point]!" Or, "I just landed in Honduras!" These lists, they go on and on and on, all these people, right?

So, on the outside, these people are just telling us something about how they're traveling. But we know the truth, don't we? These people are show-offs! They are showing off that they're in Cape Town and I'm not.

So I thought, how can we take this vanity and turn it into utility? Using a similar approach to the one I took with "Good morning!", I mapped all those people's trips, because I know where they're landing, they just told me, and I know where they live, because they share that information on their Twitter profile. So what I'm able to do with 36 hours of Twitter is create a model of how people are traveling around the world during that 36 hours.

And this is kind of a prototype, because I think if we listened to everybody on Twitter and Facebook and the rest of our social media, we'd actually get a pretty clear picture of how people are traveling from one place to the other, which actually turns out to be a very useful thing for scientists, particularly those who are studying how disease spreads.

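A rough sketch of that trip extraction: the destination is parsed out of the tweet text, and the origin comes from the free-text location on the user's profile. geocode is a hypothetical place-name-to-coordinates function, and the tweet fields are assumptions, not a real API's schema.

    import re

    LANDED = re.compile(r"I just landed in ([A-Za-z .,'-]+)", re.IGNORECASE)

    def extract_trip(tweet, geocode):
        """Return (origin, destination) coordinate pairs, or None."""
        match = LANDED.search(tweet["text"])
        home = tweet.get("user_location")  # free-text profile field
        if not match or not home:
            return None
        origin = geocode(home)             # hypothetical geocoder
        destination = geocode(match.group(1).strip("! .,"))
        if origin and destination:
            return (origin, destination)
        return None

Collect 36 hours of these pairs and you have the raw arcs for the travel model.
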
So, I work upstairs at The New York Times, and for the last two years we've been working on a project called "Cascade," which in some ways is kind of similar to this one. But instead of modeling how people move, we're modeling how people talk. We're looking at what a discussion looks like.

Well, here's an example. This is a discussion around an article called "The Island Where People Forget to Die." It's about an island in Greece where people live a really, really, really, really, really, really long time.

And what we're seeing here is a conversation that's stemming from that first tweet down in the bottom left-hand corner. So we get to see the scope of this conversation over about nine hours right now; we're going to creep up to 12 hours here in a second. But we can also see what that conversation looks like in three dimensions, and that three-dimensional view is actually much more useful for us. As humans, we're really used to things that are structured in three dimensions. So we can look at those little offshoots of conversation, and we can find out what exactly happened.

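Structurally, a discussion like this is a tree: every reply or share points back at the tweet that prompted it, rooted at the first tweet. A toy Python version of that structure, with illustrative field names rather than the Times' actual schema:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        tweet_id: str
        author: str
        t: float                  # seconds since the root tweet
        children: list = field(default_factory=list)

    def build_tree(tweets):
        """tweets: list of dicts with 'id', 'author', 't', 'parent_id'."""
        nodes = {tw["id"]: Node(tw["id"], tw["author"], tw["t"]) for tw in tweets}
        root = None
        for tw in tweets:
            parent = nodes.get(tw.get("parent_id"))
            if parent:
                parent.children.append(nodes[tw["id"]])
            else:
                root = nodes[tw["id"]]  # the first tweet has no parent
        return root

Laid out over time, the branches of this tree are the offshoots of conversation that the three-dimensional view makes easy to explore.
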
And this is an interactive, exploratory tool, so we can go through every step in the conversation. We can look at who the people were, what they said, how old they are, where they live, who follows them, and so on, and so on, and so on.

So, the Times creates about 6,500 pieces of content every month, and we can model every single one of the conversations that happen around them. And they look somewhat different. Depending on the story, and depending on how fast people are talking about it and how far the conversation spreads, these structures, which I call conversational architectures, end up looking different.

So, these projects that I've shown you, I think they all involve the same thing: we can take small pieces of data, and by putting them together, we can generate more value, we can do more exciting things with them. But so far we've only talked about Twitter, right? And Twitter isn't all the data. We learned a moment ago that there is tons and tons, tons more data out there.

And specifically, I want you to think about one type of data, because all of you guys, everybody in this audience, we, me as well, are data-making machines. We are producing data all the time. Every single one of us, we're producing data. Somebody else, though, is storing that data. Usually we put our trust in companies to store that data, but what I want to suggest here is that rather than putting our trust in companies to store that data, we should put the trust in ourselves, because we actually own that data. Right, that is something we should remember: everything that someone else measures about you, you actually own.

So, it's my hope, maybe because I'm a Canadian, that all of us can come together with this really valuable data that we've been storing, and we can collectively launch that data toward some of the world's most difficult problems, because big data can solve big problems, but I think it can do it best if it's all of us who are in control.

Thank you.