Nicholas Christakis: How social networks predict epidemics

93,295 views ・ 2010-09-16

TED


Translator: Ming Wu  Reviewer: Xiaoqiao Xie

00:15
For the last 10 years, I've been spending my time trying to figure out how and why human beings assemble themselves into social networks. And the kind of social network I'm talking about is not the recent online variety, but rather, the kind of social networks that human beings have been assembling for hundreds of thousands of years, ever since we emerged from the African savannah. So, I form friendships and co-worker and sibling and relative relationships with other people who in turn have similar relationships with other people. And this spreads on out endlessly into a distance. And you get a network that looks like this. Every dot is a person. Every line between them is a relationship between two people -- different kinds of relationships. And you can get this kind of vast fabric of humanity, in which we're all embedded.

00:58
And my colleague, James Fowler, and I have been studying for quite some time what are the mathematical, social, biological and psychological rules that govern how these networks are assembled and what are the similar rules that govern how they operate, how they affect our lives. But recently, we've been wondering whether it might be possible to take advantage of this insight, to actually find ways to improve the world, to do something better, to actually fix things, not just understand things.

01:25
So one of the first things we thought we would tackle would be how we go about predicting epidemics. And the current state of the art in predicting an epidemic -- if you're the CDC or some other national body -- is to sit in the middle where you are and collect data from physicians and laboratories in the field that report the prevalence or the incidence of certain conditions. So, so and so patients have been diagnosed with something, or other patients have been diagnosed, and all these data are fed into a central repository, with some delay. And if everything goes smoothly, one to two weeks from now you'll know where the epidemic was today.

02:00
And actually, about a year or so ago, there was this promulgation of the idea of Google Flu Trends, with respect to the flu, where by looking at people's searching behavior today, we could know where the flu -- what the status of the epidemic was today, what's the prevalence of the epidemic today. But what I'd like to show you today is a means by which we might get not just rapid warning about an epidemic, but also actually early detection of an epidemic. And, in fact, this idea can be used not just to predict epidemics of germs, but also to predict epidemics of all sorts of kinds.

02:37
For example, anything that spreads by a form of social contagion could be understood in this way, from abstract ideas on the left like patriotism, or altruism, or religion to practices like dieting behavior, or book purchasing, or drinking, or bicycle-helmet [and] other safety practices, or products that people might buy, purchases of electronic goods, anything in which there's kind of an interpersonal spread. A kind of a diffusion of innovation could be understood and predicted by the mechanism I'm going to show you now.

03:08
So, as all of you probably know, the classic way of thinking about this is the diffusion-of-innovation, or the adoption curve. So here on the Y-axis, we have the percent of the people affected, and on the X-axis, we have time. And at the very beginning, not too many people are affected, and you get this classic sigmoidal, or S-shaped, curve. And the reason for this shape is that at the very beginning, let's say one or two people are infected, or affected by the thing and then they affect, or infect, two people, who in turn affect four, eight, 16 and so forth, and you get the epidemic growth phase of the curve. And eventually, you saturate the population. There are fewer and fewer people who are still available that you might infect, and then you get the plateau of the curve, and you get this classic sigmoidal curve. And this holds for germs, ideas, product adoption, behaviors, and the like.

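The doubling-then-saturating argument for the S-shaped curve can be sketched in a few lines of Python. This is a minimal illustration, not a model from the talk: it assumes a well-mixed population, and the function name `adoption_curve` and its parameters (`population`, `initial`, `rate`, `steps`) are hypothetical choices made for the example.

```python
def adoption_curve(population=1000, initial=1, rate=1.0, steps=30):
    """Cumulative number of people affected at each time step.

    Assumption: a well-mixed population in which each affected person
    passes the thing on at `rate` per step, scaled by the share of
    people who are still unaffected.
    """
    affected = float(initial)
    curve = [affected]
    for _ in range(steps):
        still_available = (population - affected) / population
        affected += rate * affected * still_available
        curve.append(affected)
    return curve


curve = adoption_curve()
# New cases per step: small at first, largest mid-epidemic, then shrinking.
new_per_step = [later - earlier for earlier, later in zip(curve, curve[1:])]
peak_step = new_per_step.index(max(new_per_step))

print("first steps:", [round(x) for x in curve[:6]])
print("new cases peak at step", peak_step)
print("final level:", round(curve[-1]))
```

With `rate=1.0` the early steps roughly double (one, two, four, eight), and growth slows as fewer people remain available, which is the plateau described above.
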
03:58
But things don't just diffuse in human populations at random. They actually diffuse through networks. Because, as I said, we live our lives in networks, and these networks have a particular kind of a structure. Now if you look at a network like this -- this is 105 people. And the lines represent -- the dots are the people, and the lines represent friendship relationships. You might see that people occupy different locations within the network. And there are different kinds of relationships between the people. You could have friendship relationships, sibling relationships, spousal relationships, co-worker relationships, neighbor relationships and the like. And different sorts of things spread across different sorts of ties. For instance, sexually transmitted diseases will spread across sexual ties. Or, for instance, people's smoking behavior might be influenced by their friends. Or their altruistic or their charitable giving behavior might be influenced by their coworkers, or by their neighbors.

04:50
But not all positions in the network are the same. So if you look at this, you might immediately grasp that different people have different numbers of connections. Some people have one connection, some have two, some have six, some have 10 connections. And this is called the "degree" of a node, or the number of connections that a node has. But in addition, there's something else. So, if you look at nodes A and B, they both have six connections. But if you can see this image [of the network] from a bird's eye view, you can appreciate that there's something very different about nodes A and B. So, let me ask you this -- I can cultivate this intuition by asking a question -- who would you rather be if a deadly germ was spreading through the network, A or B? (Audience: B.) Nicholas Christakis: B, it's obvious. B is located on the edge of the network. Now, who would you rather be if a juicy piece of gossip were spreading through the network? A. And you have an immediate appreciation that A is going to be more likely to get the thing that's spreading and to get it sooner by virtue of their structural location within the network. A, in fact, is more central, and this can be formalized mathematically.

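One way to make "same degree, different centrality" concrete is closeness centrality, sketched below. The talk does not say which centrality measure is meant, so closeness is an assumption here, and the toy network is hypothetical (A and B each have three connections rather than the six on the slide).

```python
from collections import deque

# Hypothetical toy network (not the one on the slide): A and B both
# have three connections, but A sits in the middle and B near the edge.
EDGES = [
    ("A", "c1"), ("A", "c2"), ("A", "c3"),
    ("c1", "d1"), ("c2", "d2"), ("c3", "B"),
    ("B", "l1"), ("B", "l2"),
]


def build_neighbors(edges):
    """Map each node to the set of nodes it is directly connected to."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def degree(neighbors, node):
    """Number of connections a node has."""
    return len(neighbors[node])


def closeness(neighbors, start):
    """(n - 1) divided by the summed shortest-path distances from start.

    Assumes a connected network; distances come from breadth-first search.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for friend in neighbors[node]:
            if friend not in distance:
                distance[friend] = distance[node] + 1
                queue.append(friend)
    return (len(distance) - 1) / sum(distance.values())


network = build_neighbors(EDGES)
for name in ("A", "B"):
    print(name, degree(network, name), round(closeness(network, name), 3))
```

Both nodes report the same degree, but A's closeness is higher because it is fewer steps away from everyone else, which is why whatever is spreading tends to reach A sooner.
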
05:53
So, if we want to track something that was spreading through a network, what we ideally would like to do is to set up sensors on the central individuals within the network, including node A, monitor those people that are right there in the middle of the network, and somehow get an early detection of whatever it is that is spreading through the network. So if you saw them contract a germ or a piece of information, you would know that, soon enough, everybody was about to contract this germ or this piece of information. And this would be much better than monitoring six randomly chosen people, without reference to the structure of the population.

06:28
And in fact, if you could do that, what you would see is something like this. On the left-hand panel, again, we have the S-shaped curve of adoption. In the dotted red line, we show what the adoption would be in the random people, and in the left-hand line, shifted to the left, we show what the adoption would be in the central individuals within the network. On the Y-axis is the cumulative instances of contagion, and on the X-axis is the time. And on the right-hand side, we show the same data, but here with daily incidence. And what we show here is -- like, here -- very few people are affected, more and more and more and up to here, and here's the peak of the epidemic. But shifted to the left is what's occurring in the central individuals. And this difference in time between the two is the early detection, the early warning we can get, about an impending epidemic in the human population.

07:12
The problem, however, is that mapping human social networks is not always possible. It can be expensive, not feasible, unethical, or, frankly, just not possible to do such a thing. So, how can we figure out who the central people are in a network without actually mapping the network? What we came up with was an idea to exploit an old fact, or a known fact, about social networks, which goes like this: Do you know that your friends have more friends than you do? Your friends have more friends than you do, and this is known as the friendship paradox.

07:50
Imagine a very popular person in the social network -- like a party host who has hundreds of friends -- and a misanthrope who has just one friend, and you pick someone at random from the population; they were much more likely to know the party host. And if they nominate the party host as their friend, that party host has a hundred friends, therefore, has more friends than they do. And this, in essence, is what's known as the friendship paradox. The friends of randomly chosen people have higher degree, and are more central than the random people themselves.

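The friendship paradox can be checked by hand on a tiny network. The sketch below uses a hypothetical seven-person network (a host with five friends and a misanthrope with one), chosen only to keep the arithmetic visible; it compares the average number of friends per person with the average number of friends that people's friends have.

```python
# Hypothetical toy network: a party host with five friends, one of whom
# (e) is also the only friend of a misanthrope.
EDGES = [
    ("host", "a"), ("host", "b"), ("host", "c"),
    ("host", "d"), ("host", "e"), ("e", "misanthrope"),
]

neighbors = {}
for x, y in EDGES:
    neighbors.setdefault(x, set()).add(y)
    neighbors.setdefault(y, set()).add(x)

degree = {person: len(friends) for person, friends in neighbors.items()}

# Average number of friends a person has.
mean_degree = sum(degree.values()) / len(degree)

# For each person, the average number of friends their friends have,
# then averaged over everyone.
mean_friend_degree = sum(
    sum(degree[f] for f in friends) / len(friends)
    for friends in neighbors.values()
) / len(neighbors)

print("average friends per person:", round(mean_degree, 2))
print("average friends of a person's friends:", round(mean_friend_degree, 2))
```

The second number comes out larger because the host appears in almost everyone's friend list, while the low-degree people appear in very few.
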
08:19
And you can get an intuitive appreciation for this if you imagine just the people at the perimeter of the network. If you pick this person, the only friend they have to nominate is this person, who, by construction, must have at least two and typically more friends. And that happens at every peripheral node. And in fact, it happens throughout the network as you move in, everyone you pick, when they nominate a random -- when a random person nominates a friend of theirs, you move closer to the center of the network. So, we thought we would exploit this idea in order to study whether we could predict phenomena within networks. Because now, with this idea we can take a random sample of people, have them nominate their friends, those friends would be more central, and we could do this without having to map the network.

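The sampling step described here (pick random people, have each nominate a friend) can be sketched as follows. Everything in the block is an assumption made for illustration: the hub-and-spoke network, the sample size of 30, and the helper names `hub_and_spoke`, `nominate_friends` and `mean_degree`. The nomination step only ever looks at the sampled person's own friend list; the full network exists here solely so the result can be checked.

```python
import random


def hub_and_spoke(hubs=6, leaves_per_hub=15):
    """Hypothetical network: hubs joined in a ring, each with many leaves."""
    neighbors = {}

    def connect(a, b):
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    for h in range(hubs):
        connect(f"hub{h}", f"hub{(h + 1) % hubs}")
        for leaf in range(leaves_per_hub):
            connect(f"hub{h}", f"leaf{h}_{leaf}")
    return neighbors


def nominate_friends(neighbors, sample, rng):
    """Each sampled person names one friend, chosen at random.

    Only the sampled people's own answers are used, so the full
    network never has to be mapped.
    """
    return [rng.choice(sorted(neighbors[person])) for person in sample]


def mean_degree(neighbors, people):
    """Average number of connections across the given people."""
    return sum(len(neighbors[p]) for p in people) / len(people)


rng = random.Random(0)  # fixed seed so the run is repeatable
network = hub_and_spoke()
random_group = rng.sample(sorted(network), 30)
friend_group = nominate_friends(network, random_group, rng)

print("random group, mean degree:", round(mean_degree(network, random_group), 2))
print("friend group, mean degree:", round(mean_degree(network, friend_group), 2))
```

In this deliberately lopsided network the friend group is dominated by hubs, so its mean degree is far higher than the random group's. Real networks are less extreme, and the size of the gap will differ.
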
09:03
And we tested this idea with an outbreak of H1N1 flu at Harvard College in the fall and winter of 2009, just a few months ago. We took 1,300 randomly selected undergraduates, we had them nominate their friends, and we followed both the random students and their friends daily in time to see whether or not they had the flu epidemic. And we did this passively by looking at whether or not they'd gone to university health services. And also, we had them [actively] email us a couple of times a week. Exactly what we predicted happened. So the random group is in the red line. The epidemic in the friends group has shifted to the left, over here. And the difference in the two is 16 days. By monitoring the friends group, we could get 16 days advance warning of an impending epidemic in this human population.

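A shift between two epidemic curves can be measured in several ways, and the talk does not say how the 16-day figure was computed. The sketch below shows one simple possibility, assumed for illustration: compare the day on which each group's cumulative cases reach half of that group's total. The daily counts are made up and the names `day_reaching` and `lead_time` are hypothetical.

```python
from itertools import accumulate


def day_reaching(daily_cases, fraction=0.5):
    """First day on which cumulative cases reach `fraction` of the total."""
    cumulative = list(accumulate(daily_cases))
    target = fraction * cumulative[-1]
    return next(day for day, total in enumerate(cumulative) if total >= target)


def lead_time(friend_daily, random_daily, fraction=0.5):
    """Days by which the friend group's curve runs ahead of the random group's."""
    return day_reaching(random_daily, fraction) - day_reaching(friend_daily, fraction)


# Made-up daily case counts, for illustration only (not the study's data).
friends = [0, 1, 3, 5, 3, 1, 0, 0, 0]
randoms = [0, 0, 0, 1, 3, 5, 3, 1, 0]
print("lead time in days:", lead_time(friends, randoms))
```

A positive result means the friend group reached the halfway point earlier, which is the advance warning being described.
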
09:48
Now, in addition to that, if you were an analyst who was trying to study an epidemic or to predict the adoption of a product, for example, what you could do is you could pick a random sample of the population, also have them nominate their friends and follow the friends and follow both the randoms and the friends. Among the friends, the first evidence you saw of a blip above zero in adoption of the innovation, for example, would be evidence of an impending epidemic. Or you could see the first time the two curves diverged, as shown on the left. When did the randoms -- when did the friends take off and leave the randoms, and [when did] their curve start shifting? And that, as indicated by the white line, occurred 46 days before the peak of the epidemic. So this would be a technique whereby we could get more than a month-and-a-half warning about a flu epidemic in a particular population.

10:38
I should say that how far advanced a notice one might get about something depends on a host of factors. It could depend on the nature of the pathogen -- different pathogens, using this technique, you'd get different warning -- or other phenomena that are spreading, or frankly, on the structure of the human network.

10:55
Now in our case, although it wasn't necessary, we could also actually map the network of the students. So, this is a map of 714 students and their friendship ties. And in a minute now, I'm going to put this map into motion. We're going to take daily cuts through the network for 120 days. The red dots are going to be cases of the flu, and the yellow dots are going to be friends of the people with the flu. And the size of the dots is going to be proportional to how many of their friends have the flu. So bigger dots mean more of your friends have the flu. And if you look at this image -- here we are now in September the 13th -- you're going to see a few cases light up. You're going to see kind of blooming of the flu in the middle. Here we are on October the 19th. The slope of the epidemic curve is approaching now, in November. Bang, bang, bang, bang, bang -- you're going to see lots of blooming in the middle, and then you're going to see a sort of leveling off, fewer and fewer cases towards the end of December. And this type of a visualization can show that epidemics like this take root and affect central individuals first, before they affect others.

11:51
Now, as I've been suggesting, this method is not restricted to germs, but actually to anything that spreads in populations. Information spreads in populations, norms can spread in populations, behaviors can spread in populations. And by behaviors, I can mean things like criminal behavior, or voting behavior, or health care behavior, like smoking, or vaccination, or product adoption, or other kinds of behaviors that relate to interpersonal influence. If I'm likely to do something that affects others around me, this technique can get early warning or early detection about the adoption within the population. The key thing is that for it to work, there has to be interpersonal influence. It cannot be because of some broadcast mechanism affecting everyone uniformly.

12:35
Now the same insights can also be exploited -- with respect to networks -- can also be exploited in other ways, for example, in the use of targeting specific people for interventions. So, for example, most of you are probably familiar with the notion of herd immunity. So, if we have a population of a thousand people, and we want to make the population immune to a pathogen, we don't have to immunize every single person. If we immunize 960 of them, it's as if we had immunized a hundred [percent] of them. Because even if one or two of the non-immune people gets infected, there's no one for them to infect. They are surrounded by immunized people. So 96 percent is as good as 100 percent. Well, some other scientists have estimated what would happen if you took a 30 percent random sample of these 1000 people, 300 people and immunized them. Would you get any population-level immunity? And the answer is no. But if you took this 30 percent, these 300 people and had them nominate their friends and took the same number of vaccine doses and vaccinated the friends of the 300 -- the 300 friends -- you can get the same level of herd immunity as if you had vaccinated 96 percent of the population at a much greater efficiency, with a strict budget constraint.

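The intuition behind vaccinating nominated friends rather than random people can be illustrated with a rough structural proxy: after removing the immunized people, how large is the biggest connected group of unprotected people? This is a sketch under stated assumptions, not the estimate cited in the talk. The hub-and-spoke network, the 12 doses, and the names `hub_and_spoke` and `largest_unprotected_cluster` are all hypothetical, and cluster size is only a crude stand-in for herd immunity.

```python
import random


def hub_and_spoke(hubs=6, leaves_per_hub=15):
    """Hypothetical network: hubs joined in a ring, each with many leaves."""
    neighbors = {}

    def connect(a, b):
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    for h in range(hubs):
        connect(f"hub{h}", f"hub{(h + 1) % hubs}")
        for leaf in range(leaves_per_hub):
            connect(f"hub{h}", f"leaf{h}_{leaf}")
    return neighbors


def largest_unprotected_cluster(neighbors, immune):
    """Size of the biggest connected group of non-immune people.

    A germ entering that group could, at worst, reach all of it.
    """
    seen = set(immune)
    largest = 0
    for person in neighbors:
        if person in seen:
            continue
        seen.add(person)
        stack, size = [person], 0
        while stack:  # depth-first walk over unprotected people
            current = stack.pop()
            size += 1
            for friend in neighbors[current]:
                if friend not in seen:
                    seen.add(friend)
                    stack.append(friend)
        largest = max(largest, size)
    return largest


rng = random.Random(0)  # fixed seed so the run is repeatable
network = hub_and_spoke()
doses = 12
randoms = rng.sample(sorted(network), doses)
# Each random person nominates one friend; duplicates collapse, so this
# strategy never uses more doses than the random one.
friends = {rng.choice(sorted(network[p])) for p in randoms}

print("immunize the random people:", largest_unprotected_cluster(network, set(randoms)))
print("immunize their nominated friends:", largest_unprotected_cluster(network, friends))
```

In a network like this one, nominated friends are mostly hubs, so immunizing them tends to break the population into small disconnected groups, while immunizing random people (mostly leaves) leaves the bulk of the network connected.
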
13:45
And similar ideas can be used, for instance, to target distribution of things like bed nets in the developing world. If we could understand the structure of networks in villages, we could target to whom to give the interventions to foster these kinds of spreads. Or, frankly, for advertising with all kinds of products. If we could understand how to target, it could affect the efficiency of what we're trying to achieve.

14:07
And in fact, we can use data from all kinds of sources nowadays [to do this]. This is a map of eight million phone users in a European country. Every dot is a person, and every line represents a volume of calls between the people. And we can use such data, that's being passively obtained, to map these whole countries and understand who is located where within the network. Without actually having to query them at all, we can get this kind of a structural insight. And other sources of information, as you're no doubt aware, are available about such features, from email interactions, online interactions, online social networks and so forth.

14:42
And in fact, we are in the era of what I would call "massive-passive" data collection efforts. There are all kinds of ways we can use massively collected data to create sensor networks to follow the population, understand what's happening in the population, and intervene in the population for the better. Because these new technologies tell us not just who is talking to whom, but where everyone is, and what they're thinking based on what they're uploading on the Internet, and what they're buying based on their purchases. And all this administrative data can be pulled together and processed to understand human behavior in a way we never could before.

15:19
So, for example, we could use truckers' purchases of fuel. So the truckers are just going about their business, and they're buying fuel. And we see a blip up in the truckers' purchases of fuel, and we know that a recession is about to end. Or we can monitor the velocity with which people are moving with their phones on a highway, and the phone company can see, as the velocity is slowing down, that there's a traffic jam. And they can feed that information back to their subscribers, but only to their subscribers on the same highway located behind the traffic jam! Or we can monitor doctors' prescribing behaviors, passively, and see how the diffusion of innovation with pharmaceuticals occurs within [networks of] doctors. Or again, we can monitor purchasing behavior in people and watch how these types of phenomena can diffuse within human populations.

16:04
And there are three ways, I think, that these massive-passive data can be used. One is fully passive, like I just described -- as in, for instance, the trucker example, where we don't actually intervene in the population in any way. One is quasi-active, like the flu example I gave, where we get some people to nominate their friends and then passively monitor their friends -- do they have the flu, or not? -- and then get warning. Or another example would be, if you're a phone company, you figure out who's central in the network and you ask those people, "Look, will you just text us your fever every day? Just text us your temperature." And collect vast amounts of information about people's temperature, but from centrally located individuals. And be able, on a large scale, to monitor an impending epidemic with very minimal input from people. Or, finally, it can be more fully active -- as I know subsequent speakers will also talk about today -- where people might globally participate in wikis, or photographing, or monitoring elections, and upload information in a way that allows us to pool information in order to understand social processes and social phenomena.

17:03
In fact, the availability of these data, I think, heralds a kind of new era of what I and others would like to call "computational social science." It's sort of like when Galileo invented -- or, didn't invent -- came to use a telescope and could see the heavens in a new way, or Leeuwenhoek became aware of the microscope -- or actually invented -- and could see biology in a new way. But now we have access to these kinds of data that allow us to understand social processes and social phenomena in an entirely new way that was never before possible. And with this science, we can understand how exactly the whole comes to be greater than the sum of its parts. And actually, we can use these insights to improve society and improve human well-being. Thank you.