Nicholas Christakis: How social networks predict epidemics

94,105 views ・ 2010-09-16

TED



Translator: Ming Wu  Reviewer: Xiaoqiao Xie
00:15
For the last 10 years, I've been spending my time trying to figure out how and why human beings assemble themselves into social networks.
00:23
And the kind of social network I'm talking about is not the recent online variety, but rather, the kind of social networks that human beings have been assembling for hundreds of thousands of years, ever since we emerged from the African savannah.
00:35
So, I form friendships and co-worker and sibling and relative relationships with other people who in turn have similar relationships with other people.
00:42
And this spreads on out endlessly into a distance.
00:45
And you get a network that looks like this.
00:47
Every dot is a person.
00:49
Every line between them is a relationship between two people -- different kinds of relationships.
00:53
And you can get this kind of vast fabric of humanity, in which we're all embedded.
00:58
And my colleague James Fowler and I have been studying for quite some time what are the mathematical, social, biological and psychological rules that govern how these networks are assembled and what are the similar rules that govern how they operate, how they affect our lives.
01:13
But recently, we've been wondering whether it might be possible to take advantage of this insight, to actually find ways to improve the world, to do something better, to actually fix things, not just understand things.
01:25
So one of the first things we thought we would tackle would be how we go about predicting epidemics.
01:31
And the current state of the art in predicting an epidemic -- if you're the CDC or some other national body -- is to sit in the middle where you are and collect data from physicians and laboratories in the field that report the prevalence or the incidence of certain conditions.
01:45
So, so-and-so patients have been diagnosed with something, or other patients have been diagnosed, and all these data are fed into a central repository, with some delay.
01:53
And if everything goes smoothly, one to two weeks from now you'll know where the epidemic was today.
02:00
And actually, about a year or so ago, there was this promulgation of the idea of Google Flu Trends, with respect to the flu, where by looking at people's searching behavior today, we could know where the flu -- what the status of the epidemic was today, what's the prevalence of the epidemic today.
02:17
But what I'd like to show you today is a means by which we might get not just rapid warning about an epidemic, but also actually early detection of an epidemic.
02:28
And, in fact, this idea can be used not just to predict epidemics of germs, but also to predict epidemics of all sorts of kinds.
02:37
For example, anything that spreads by a form of social contagion could be understood in this way, from abstract ideas on the left like patriotism, or altruism, or religion to practices like dieting behavior, or book purchasing, or drinking, or bicycle-helmet [and] other safety practices, or products that people might buy, purchases of electronic goods, anything in which there's kind of an interpersonal spread.
03:01
A kind of a diffusion of innovation could be understood and predicted by the mechanism I'm going to show you now.
03:08
So, as all of you probably know, the classic way of thinking about this is the diffusion-of-innovation, or the adoption curve.
03:16
So here on the Y-axis, we have the percent of the people affected, and on the X-axis, we have time.
03:20
And at the very beginning, not too many people are affected, and you get this classic sigmoidal, or S-shaped, curve.
03:27
And the reason for this shape is that at the very beginning, let's say one or two people are infected, or affected by the thing and then they affect, or infect, two people, who in turn affect four, eight, 16 and so forth, and you get the epidemic growth phase of the curve.
03:41
And eventually, you saturate the population.
03:43
There are fewer and fewer people who are still available that you might infect, and then you get the plateau of the curve, and you get this classic sigmoidal curve.
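The doubling-then-saturation argument for the S-shaped curve can be sketched numerically. This is a minimal illustration that assumes a simple discrete logistic model; the function name `adoption_curve`, the population of 1000, and the growth rate are hypothetical choices, not values from the talk.

```python
def adoption_curve(population, initial, rate, steps):
    """Cumulative number of adopters at each time step (discrete logistic growth)."""
    adopters = float(initial)
    curve = [adopters]
    for _ in range(steps):
        # New adoptions grow with the number already affected, but shrink
        # as fewer unaffected people remain available.
        remaining_share = 1.0 - adopters / population
        adopters += rate * adopters * remaining_share
        curve.append(adopters)
    return curve


# Assumed toy values: 1000 people, 1 initial adopter, rate 1.0 (early doubling).
curve = adoption_curve(population=1000, initial=1, rate=1.0, steps=25)
new_per_step = [b - a for a, b in zip(curve, curve[1:])]
peak_step = new_per_step.index(max(new_per_step))
print(f"final adopters: {curve[-1]:.1f}, fastest growth at step {peak_step}")
```

Early steps roughly double the count, the per-step increase peaks partway through, and the curve then flattens as it approaches the population size.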
03:52
And this holds for germs, ideas, product adoption, behaviors, and the like.
03:58
But things don't just diffuse in human populations at random.
04:01
They actually diffuse through networks.
04:03
Because, as I said, we live our lives in networks, and these networks have a particular kind of a structure.
04:09
Now if you look at a network like this -- this is 105 people.
04:13
And the lines represent -- the dots are the people, and the lines represent friendship relationships.
04:17
You might see that people occupy different locations within the network.
04:21
And there are different kinds of relationships between the people.
04:23
You could have friendship relationships, sibling relationships, spousal relationships, co-worker relationships, neighbor relationships and the like.
04:32
And different sorts of things spread across different sorts of ties.
04:36
For instance, sexually transmitted diseases will spread across sexual ties.
04:40
Or, for instance, people's smoking behavior might be influenced by their friends.
04:44
Or their altruistic or their charitable giving behavior might be influenced by their coworkers, or by their neighbors.
04:50
But not all positions in the network are the same.
04:53
So if you look at this, you might immediately grasp that different people have different numbers of connections.
04:58
Some people have one connection, some have two, some have six, some have 10 connections.
05:03
And this is called the "degree" of a node, or the number of connections that a node has.
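The degree of a node can be counted directly from a list of ties. The sketch below assumes an undirected friendship network given as pairs of names; the names and ties are made up for illustration.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many ties each node has in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical toy friendship ties.
ties = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
degree = node_degrees(ties)
print(dict(degree))
```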
05:07
But in addition, there's something else.
05:09
So, if you look at nodes A and B, they both have six connections.
05:13
But if you can see this image [of the network] from a bird's eye view, you can appreciate that there's something very different about nodes A and B.
05:20
So, let me ask you this -- I can cultivate this intuition by asking a question -- who would you rather be if a deadly germ was spreading through the network, A or B?
05:28
(Audience: B.) Nicholas Christakis: B, it's obvious.
05:30
B is located on the edge of the network.
05:32
Now, who would you rather be if a juicy piece of gossip were spreading through the network?
05:37
A. And you have an immediate appreciation that A is going to be more likely to get the thing that's spreading and to get it sooner by virtue of their structural location within the network.
05:48
A, in fact, is more central, and this can be formalized mathematically.
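One common way to formalize "more central" is closeness centrality: the number of other nodes divided by the sum of shortest-path distances to them. The talk does not say which measure was used, so treat this as one possible formalization. The toy network below is hypothetical (A and B have three ties each here, not the six shown on the slide) and is assumed to be connected.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def closeness(adj, source):
    """Closeness centrality of `source` in a connected, unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest-path distances
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return (len(adj) - 1) / sum(dist.values())


# Hypothetical network: A sits in the middle, B near the edge.
edges = [("A", "x"), ("A", "y"), ("A", "z"), ("x", "p"), ("y", "q"),
         ("z", "B"), ("B", "m"), ("B", "n")]
adj = build_adjacency(edges)
print(len(adj["A"]), len(adj["B"]))            # same degree
print(closeness(adj, "A") > closeness(adj, "B"))  # but A is closer to everyone
```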
05:53
So, if we want to track something that was spreading through a network, what we ideally would like to do is to set up sensors on the central individuals within the network, including node A, monitor those people that are right there in the middle of the network, and somehow get an early detection of whatever it is that is spreading through the network.
06:12
So if you saw them contract a germ or a piece of information, you would know that, soon enough, everybody was about to contract this germ or this piece of information.
06:21
And this would be much better than monitoring six randomly chosen people, without reference to the structure of the population.
06:28
And in fact, if you could do that, what you would see is something like this.
06:32
On the left-hand panel, again, we have the S-shaped curve of adoption.
06:35
In the dotted red line, we show what the adoption would be in the random people, and in the left-hand line, shifted to the left, we show what the adoption would be in the central individuals within the network.
06:46
On the Y-axis is the cumulative instances of contagion, and on the X-axis is the time.
06:50
And on the right-hand side, we show the same data, but here with daily incidence.
06:54
And what we show here is -- like, here -- very few people are affected, more and more and more and up to here, and here's the peak of the epidemic.
07:00
But shifted to the left is what's occurring in the central individuals.
07:02
And this difference in time between the two is the early detection, the early warning we can get, about an impending epidemic in the human population.
07:12
The problem, however, is that mapping human social networks is not always possible.
07:18
It can be expensive, not feasible, unethical, or, frankly, just not possible to do such a thing.
07:25
So, how can we figure out who the central people are in a network without actually mapping the network?
07:32
What we came up with was an idea to exploit an old fact, or a known fact, about social networks, which goes like this:
07:40
Do you know that your friends have more friends than you do?
07:45
Your friends have more friends than you do, and this is known as the friendship paradox.
07:50
Imagine a very popular person in the social network -- like a party host who has hundreds of friends -- and a misanthrope who has just one friend, and you pick someone at random from the population; they were much more likely to know the party host.
08:02
And if they nominate the party host as their friend, that party host has a hundred friends, therefore, has more friends than they do.
08:09
And this, in essence, is what's known as the friendship paradox.
08:12
The friends of randomly chosen people have higher degree, and are more central than the random people themselves.
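The friendship paradox can be checked by hand on a small network. The sketch below compares the average degree of a person with the average degree of a nominated friend, where the second average is taken over every possible nomination (each tie counted in both directions). The host-and-guests network is a made-up example.

```python
def friendship_paradox(edges):
    """Return (mean degree of a person, mean degree of a nominated friend)."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    mean_degree = sum(degree.values()) / len(degree)
    # Each tie lets a nominate b and b nominate a, so a person with d ties
    # shows up d times among all possible nominations.
    nominated = [degree[b] for a, b in edges] + [degree[a] for a, b in edges]
    mean_friend_degree = sum(nominated) / len(nominated)
    return mean_degree, mean_friend_degree


# Hypothetical network: a host with five guests, one of whom also knows a loner.
ties = [("host", g) for g in ("g1", "g2", "g3", "g4", "g5")] + [("g1", "loner")]
mean_degree, mean_friend_degree = friendship_paradox(ties)
print(round(mean_degree, 2), round(mean_friend_degree, 2))
```

Popular people are counted once per tie among the nominations, which is why the second average comes out higher whenever degrees differ.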
08:19
And you can get an intuitive appreciation for this if you imagine just the people at the perimeter of the network.
08:24
If you pick this person, the only friend they have to nominate is this person, who, by construction, must have at least two and typically more friends.
08:33
And that happens at every peripheral node.
08:35
And in fact, it happens throughout the network as you move in, everyone you pick, when they nominate a random -- when a random person nominates a friend of theirs, you move closer to the center of the network.
08:46
So, we thought we would exploit this idea in order to study whether we could predict phenomena within networks.
08:52
Because now, with this idea we can take a random sample of people, have them nominate their friends, those friends would be more central, and we could do this without having to map the network.
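The sampling procedure described here (draw a random sample, have each person nominate a friend, monitor the friends) might look like the following sketch. The adjacency dictionary stands in for what people would report when asked and is used only to check the result; the hub-and-spoke network, the sample size, and the helper names are all hypothetical.

```python
import random


def nominate_friends(adj, sample, rng):
    """Each sampled person names one of their friends uniformly at random."""
    return [rng.choice(sorted(adj[person])) for person in sample]


def mean_degree(adj, people):
    """Average number of ties among the given people."""
    return sum(len(adj[p]) for p in people) / len(people)


# Hypothetical toy network: one hub tied to ten otherwise isolated people.
adj = {"hub": {f"p{i}" for i in range(10)}}
for i in range(10):
    adj[f"p{i}"] = {"hub"}

rng = random.Random(0)                 # fixed seed so the run is repeatable
sample = rng.sample(sorted(adj), 4)    # the "random group"
friends = nominate_friends(adj, sample, rng)  # the "friends group"
print(mean_degree(adj, sample), mean_degree(adj, friends))
```

A star network is an extreme case chosen to make the effect obvious; in a real network the gap between the two groups would be smaller and would vary from sample to sample.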
09:03
And we tested this idea with an outbreak of H1N1 flu at Harvard College in the fall and winter of 2009, just a few months ago.
09:11
We took 1,300 randomly selected undergraduates, we had them nominate their friends, and we followed both the random students and their friends daily in time to see whether or not they had the flu epidemic.
09:23
And we did this passively by looking at whether or not they'd gone to university health services.
09:26
And also, we had them [actively] email us a couple of times a week.
09:29
Exactly what we predicted happened.
09:32
So the random group is in the red line.
09:35
The epidemic in the friends group has shifted to the left, over here.
09:38
And the difference in the two is 16 days.
09:41
By monitoring the friends group, we could get 16 days advance warning of an impending epidemic in this human population.
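One simple way to put a number on the shift between two epidemic curves is to compare the first day on which each cumulative curve reaches the same level. This is only an illustrative sketch, not the estimation method used in the study, and the case counts below are made up.

```python
def first_day_reaching(cumulative, level):
    """Index of the first day the cumulative count reaches `level`, or None."""
    for day, count in enumerate(cumulative):
        if count >= level:
            return day
    return None


def lead_time(friends_curve, random_curve, level):
    """Days by which the friends curve reaches `level` before the random curve."""
    friends_day = first_day_reaching(friends_curve, level)
    random_day = first_day_reaching(random_curve, level)
    if friends_day is None or random_day is None:
        return None
    return random_day - friends_day


# Made-up cumulative case counts, not the study's data.
random_curve = [0, 0, 0, 1, 2, 4, 7, 11, 14, 16, 17, 17]
friends_curve = random_curve[3:] + [17, 17, 17]  # same shape, three days earlier
print(lead_time(friends_curve, random_curve, level=7))
```

The two groups are assumed to be the same size here; with groups of different sizes the curves would first need to be expressed as fractions of each group.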
09:48
Now, in addition to that, if you were an analyst who was trying to study an epidemic or to predict the adoption of a product, for example, what you could do is you could pick a random sample of the population, also have them nominate their friends and follow the friends and follow both the randoms and the friends.
10:05
Among the friends, the first evidence you saw of a blip above zero in adoption of the innovation, for example, would be evidence of an impending epidemic.
10:13
Or you could see the first time the two curves diverged, as shown on the left.
10:18
When did the randoms -- when did the friends take off and leave the randoms, and [when did] their curve start shifting?
10:25
And that, as indicated by the white line, occurred 46 days before the peak of the epidemic.
10:31
So this would be a technique whereby we could get more than a month-and-a-half warning about a flu epidemic in a particular population.
10:38
I should say that how far advanced a notice one might get about something depends on a host of factors.
10:44
It could depend on the nature of the pathogen -- different pathogens, using this technique, you'd get different warning -- or other phenomena that are spreading, or frankly, on the structure of the human network.
10:55
Now in our case, although it wasn't necessary, we could also actually map the network of the students.
11:00
So, this is a map of 714 students and their friendship ties.
11:04
And in a minute now, I'm going to put this map into motion.
11:06
We're going to take daily cuts through the network for 120 days.
11:10
The red dots are going to be cases of the flu, and the yellow dots are going to be friends of the people with the flu.
11:16
And the size of the dots is going to be proportional to how many of their friends have the flu.
11:20
So bigger dots mean more of your friends have the flu.
11:23
And if you look at this image -- here we are now in September the 13th -- you're going to see a few cases light up.
11:28
You're going to see kind of blooming of the flu in the middle.
11:30
Here we are on October the 19th.
11:33
The slope of the epidemic curve is approaching now, in November.
11:35
Bang, bang, bang, bang, bang -- you're going to see lots of blooming in the middle, and then you're going to see a sort of leveling off, fewer and fewer cases towards the end of December.
11:43
And this type of a visualization can show that epidemics like this take root and affect central individuals first, before they affect others.
11:51
Now, as I've been suggesting, this method is not restricted to germs, but actually to anything that spreads in populations.
11:58
Information spreads in populations, norms can spread in populations, behaviors can spread in populations.
12:04
And by behaviors, I can mean things like criminal behavior, or voting behavior, or health care behavior, like smoking, or vaccination, or product adoption, or other kinds of behaviors that relate to interpersonal influence.
12:16
If I'm likely to do something that affects others around me, this technique can get early warning or early detection about the adoption within the population.
12:25
The key thing is that for it to work, there has to be interpersonal influence.
12:29
It cannot be because of some broadcast mechanism affecting everyone uniformly.
12:35
Now the same insights can also be exploited -- with respect to networks -- can also be exploited in other ways, for example, in the use of targeting specific people for interventions.
12:47
So, for example, most of you are probably familiar with the notion of herd immunity.
12:51
So, if we have a population of a thousand people, and we want to make the population immune to a pathogen, we don't have to immunize every single person.
12:59
If we immunize 960 of them, it's as if we had immunized a hundred [percent] of them.
13:04
Because even if one or two of the non-immune people gets infected, there's no one for them to infect.
13:09
They are surrounded by immunized people.
13:11
So 96 percent is as good as 100 percent.
13:14
Well, some other scientists have estimated what would happen if you took a 30 percent random sample of these 1000 people, 300 people and immunized them.
13:21
Would you get any population-level immunity?
13:23
And the answer is no.
13:26
But if you took this 30 percent, these 300 people and had them nominate their friends and took the same number of vaccine doses and vaccinated the friends of the 300 -- the 300 friends -- you can get the same level of herd immunity as if you had vaccinated 96 percent of the population at a much greater efficiency, with a strict budget constraint.
13:45
And similar ideas can be used, for instance, to target distribution of things like bed nets in the developing world.
13:51
If we could understand the structure of networks in villages, we could target to whom to give the interventions to foster these kinds of spreads.
13:58
Or, frankly, for advertising with all kinds of products.
14:01
If we could understand how to target, it could affect the efficiency of what we're trying to achieve.
14:07
And in fact, we can use data from all kinds of sources nowadays [to do this].
14:11
This is a map of eight million phone users in a European country.
14:15
Every dot is a person, and every line represents a volume of calls between the people.
14:19
And we can use such data, that's being passively obtained, to map these whole countries and understand who is located where within the network.
14:27
Without actually having to query them at all, we can get this kind of a structural insight.
14:31
And other sources of information, as you're no doubt aware, are available about such features, from email interactions, online interactions, online social networks and so forth.
14:42
And in fact, we are in the era of what I would call "massive-passive" data collection efforts.
14:47
There are all kinds of ways we can use massively collected data to create sensor networks to follow the population, understand what's happening in the population, and intervene in the population for the better.
15:00
Because these new technologies tell us not just who is talking to whom, but where everyone is, and what they're thinking based on what they're uploading on the Internet, and what they're buying based on their purchases.
15:11
And all this administrative data can be pulled together and processed to understand human behavior in a way we never could before.
15:19
So, for example, we could use truckers' purchases of fuel.
15:22
So the truckers are just going about their business, and they're buying fuel.
15:26
And we see a blip up in the truckers' purchases of fuel, and we know that a recession is about to end.
15:31
Or we can monitor the velocity with which people are moving with their phones on a highway, and the phone company can see, as the velocity is slowing down, that there's a traffic jam.
15:42
And they can feed that information back to their subscribers, but only to their subscribers on the same highway located behind the traffic jam!
15:49
Or we can monitor doctors' prescribing behaviors, passively, and see how the diffusion of innovation with pharmaceuticals occurs within [networks of] doctors.
15:57
Or again, we can monitor purchasing behavior in people and watch how these types of phenomena can diffuse within human populations.
16:04
And there are three ways, I think, that these massive-passive data can be used.
16:08
One is fully passive, like I just described -- as in, for instance, the trucker example, where we don't actually intervene in the population in any way.
16:16
One is quasi-active, like the flu example I gave, where we get some people to nominate their friends and then passively monitor their friends -- do they have the flu, or not? -- and then get warning.
16:27
Or another example would be, if you're a phone company, you figure out who's central in the network and you ask those people, "Look, will you just text us your fever every day? Just text us your temperature."
16:37
And collect vast amounts of information about people's temperature, but from centrally located individuals.
16:42
And be able, on a large scale, to monitor an impending epidemic with very minimal input from people.
16:48
Or, finally, it can be more fully active -- as I know subsequent speakers will also talk about today -- where people might globally participate in wikis, or photographing, or monitoring elections, and upload information in a way that allows us to pool information in order to understand social processes and social phenomena.
17:03
In fact, the availability of these data, I think, heralds a kind of new era of what I and others would like to call "computational social science."
17:11
It's sort of like when Galileo invented -- or, didn't invent -- came to use a telescope and could see the heavens in a new way, or Leeuwenhoek became aware of the microscope -- or actually invented -- and could see biology in a new way.
17:24
But now we have access to these kinds of data that allow us to understand social processes and social phenomena in an entirely new way that was never before possible.
17:33
And with this science, we can understand how exactly the whole comes to be greater than the sum of its parts.
17:41
And actually, we can use these insights to improve society and improve human well-being.
17:46
Thank you.