How we're building the world's largest family tree | Yaniv Erlich

41,746 views ・ 2019-10-18

TED



Translator: Lilian Chiu Reviewer: 易帆 余
00:12
People use the internet for various reasons.
人們因各種原因使用網際網路。
00:17
It turns out that one of the most popular categories of website
結果發現,最熱門的網站類型之一
00:21
is something that people typically consume in private.
是人們私下瀏覽的東西。
00:25
It involves curiosity,
它和好奇心有關,
00:28
non-insignificant levels of self-indulgence
和看不太出來但又明顯的 放蕩不羈程度有關,
00:31
and is centered around recording the reproductive activities
整天沉浸在記錄別人 繁殖活動的圈圈裡。
00:35
of other people.
00:36
(Laughter)
(笑聲)
00:37
Of course, I'm talking about genealogy --
當然,我在說的是家譜學——
00:39
(Laughter)
(笑聲)
00:41
the study of family history.
家族史的研究。
00:43
When it comes to detailing family history,
說到詳細的家族歷史,
00:45
in every family, we have this person that is obsessed with genealogy.
在每個家庭中,都會 有一個人特別迷戀家譜。
00:49
Let's call him Uncle Bernie.
咱們就稱他為柏尼叔叔吧。
00:51
Uncle Bernie is exactly the last person you want to sit next to
感恩節晚餐時,你最不希望
坐到的位子 就是柏尼叔叔的旁邊,
00:54
in Thanksgiving dinner,
00:56
because he will bore you to death with peculiar details
因為他會一直講某個 古老親戚的獨特細節,
00:59
about some ancient relatives.
講到讓你無聊死。
01:02
But as you know,
但,各位都知道,
01:03
there is a scientific side for everything,
凡事都有科學的一面,
01:06
and we found that Uncle Bernie's stories
而我們發現,柏尼叔叔的故事
01:09
have immense potential for biomedical research.
非常有潛力可以用在 生物醫學研究上。
01:13
We let Uncle Bernie and his fellow genealogists
我們讓柏尼叔叔 和他的家譜學者夥伴們
01:16
document their family trees through a genealogy website called geni.com.
透過家譜網站 geni.com 來記錄他們的家譜。
01:21
When users upload their trees to the website,
當使用者將他們的家譜 上傳到該網站,
01:23
it scans their relatives,
網站會掃描他們的親戚,
01:25
and if it finds matches to existing trees,
如果發現和既有的家譜樹有吻合,
01:27
it merges the existing and the new tree together.
就會把既有的家譜 和那新的家譜合併起來。
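The merge step described here can be sketched in a few lines of Python. This is only a minimal illustration, not geni.com's actual algorithm: identifying a person by (full name, birth year), the `Tree` type, and the functions `people` and `merge_trees` are all assumptions made for the example, and the names in the sample data are invented.

```python
from typing import Dict, Set, Tuple

# Assumption for this sketch: a person is identified by (full name, birth year).
# The talk does not say how the website decides that two profiles match.
Person = Tuple[str, int]
# A tree maps each person to the set of their known parents.
Tree = Dict[Person, Set[Person]]


def people(tree: Tree) -> Set[Person]:
    """Collect everyone who appears in a tree, as a child or as a parent."""
    found = set(tree)
    for parents in tree.values():
        found |= parents
    return found


def merge_trees(existing: Tree, uploaded: Tree) -> Tree:
    """Return a merged tree if the two trees share a person, else a copy of `existing`."""
    merged = {child: set(parents) for child, parents in existing.items()}
    if people(existing).isdisjoint(people(uploaded)):
        return merged  # no matching relatives, so nothing is merged
    for child, parents in uploaded.items():
        # Union the parent links so that neither tree's information is lost.
        merged.setdefault(child, set()).update(parents)
    return merged


# Invented sample data: two small trees that overlap in two people.
existing = {("Bernie Cohen", 1950): {("Ruth Cohen", 1925)}}
uploaded = {
    ("Ruth Cohen", 1925): {("Sarah Levi", 1900)},
    ("Dana Cohen", 1980): {("Bernie Cohen", 1950)},
}
merged = merge_trees(existing, uploaded)
```

After the merge, the combined tree spans four generations, which neither contributor had documented alone.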
01:31
The result is that large family trees are created,
結果就是建造出了很大的家譜,
01:34
beyond the individual level of each genealogist.
超越了家譜學者 個人能做到的程度。
01:38
Now, by repeating this process with millions of people
如今,針對全世界數百萬人
01:42
all over the world,
重覆這個流程,
01:44
we can crowdsource the construction of a family tree of all humankind.
我們就能將全人類的家譜 外包給群眾來做。
01:51
Using this website,
我們用這個網站,
01:52
we were able to connect 125 million people
將一億兩千五百萬人連結起來,
01:57
into a single family tree.
成為單一家譜樹。
02:00
I cannot draw the tree on the screens over here
我無法在這裡的螢幕上 畫出這個家譜樹,
02:03
because they have less pixels
因為這個家譜樹中的人數
02:05
than the number of people in this tree.
比螢幕的畫素還要多。
02:08
But here is an example of a subset of 6,000 individuals.
但,可以取其中一部分 六千人的家譜給各位看。
02:14
Each green node is a person.
每個綠色節點代表一個人。
02:17
The red nodes represent marriages,
紅色節點代表婚姻關係,
02:19
and the connections represent parenthood.
連線則代表親子關係。
02:22
In the middle of this tree, you see the ancestors.
在家譜的中間可以看到祖先。
02:24
And as we go to the periphery, you see the descendants.
在外圍則是後代。
02:27
This tree has seven generations, approximately.
這個家譜樹大約涵蓋了七個世代。
02:31
Now, this is what happens when we increase the number of individuals
當我們把人數增加到七萬人時,
02:34
to 70,000 people --
就會變成這樣——
02:36
still a tiny subset of all the data that we have.
相對我們所有的資料, 這仍然只是冰山一角。
02:41
Despite that, you can already see the formation of gigantic family trees
儘管如此,各位已經可以 看出有巨大的家譜樹形成了,
02:46
with many very distant relatives.
當中有許多遠親。
02:49
Thanks to the hard work of our genealogists,
仰賴家譜學者的努力,
02:52
we can go back in time hundreds of years ago.
我們可以回到數百年前。
02:56
For example, here is Alexander Hamilton,
比如,這是亞歷山大 · 漢密爾頓,
02:59
who was born in 1755.
生於 1755 年。
03:02
Alexander was the first US Secretary of the Treasury,
亞歷山大是美國第一位財政部長,
03:06
but mostly known today due to a popular Broadway musical.
但主要由於一部流行的 百老匯音樂劇而廣為人知。
03:11
We found that Alexander has deeper connections in the showbiz industry.
我們發現亞歷山大 在娛樂圈有更深厚的人脈。
03:16
In fact, he's a blood relative of ...
事實上,他是……
03:18
Kevin Bacon!
凱文貝肯的血親!
03:20
(Laughter)
(笑聲)
03:22
Both of them are descendants of a lady from Scotland
他們兩人都是十三世紀
03:24
who lived in the 13th century.
一位蘇格蘭女子的後裔。
03:27
So you can say that Alexander Hamilton
所以,可以說亞歷山大漢密爾頓
03:30
is 35 degrees of Kevin Bacon genealogy.
是凱文貝肯的三十五度宗譜。 (改自「六度分離」)
03:33
(Laughter)
(笑聲)
03:34
And our tree has millions of stories like that.
我們的家譜樹有數百萬個 像這樣的故事。
03:40
We invested significant efforts to validate the quality of our data.
我們投入許多心力 去驗證我們資料的品質。
03:45
Using DNA, we found that .3 percent of the mother-child connections in our data
利用 DNA,我們發現,
我們的資料中有 0.3% 的 母子關係是錯的,
03:50
are wrong,
03:51
which could match the adoption rate in the US pre-Second World War.
這很符合在二次大戰之前 美國的領養率。
03:56
For the father's side,
就父系的這一面來說,
03:58
the news is not as good:
狀況就沒這麼好了:
04:02
1.9 percent of the father-child connections in our data are wrong.
我們的資料中,1.9% 的 父子關係是錯的。
04:07
And I see some people smirk over here.
我看到這邊有些人在笑。
04:10
It is what you think --
就如各位所想的——
04:11
there are many milkmen out there.
世界上有很多師奶殺手級的男人。
04:13
(Laughter)
(笑聲)
04:14
However, this 1.9 percent error rate in patrilineal connections
然而,這 1.9% 的父子關係錯誤率
不是我們數據獨有的 。
04:18
is not unique to our data.
04:20
Previous studies found a similar error rate
過去用臨床等級家譜所做的研究,
04:23
using clinical-grade pedigrees.
也有發現近似的錯誤率。
04:26
So the quality of our data is good,
所以我們的資料品質算不錯,
04:28
and that should not be a surprise.
那並不讓人意外。
04:30
Our genealogists have a profound, vested interest
我們的系譜學家對正確記錄
04:34
in correctly documenting their family history.
他們的家族史有著濃厚的興趣。
04:40
We can leverage this data to learn quantitative information about humanity,
我們可以善用這些資料, 來了解人類的量化資訊,
04:45
for example, questions about demography.
比如,人口統計相關的問題。
04:47
Here is a look at all our profiles on the map of the world.
這是我們的資料在世界地圖上的樣子。
04:52
Each pixel is a person that lived at some point.
每一個畫素就是 活在某個時點的一個人。
04:56
And since we have so much data,
因為我們有非常多資料,
04:58
you can see the contours of many countries,
各位可以看見許多國家的輪廓,
05:01
especially in the Western world.
特別是西方世界的國家。
05:03
In this clip, we stratified the map that I've showed you
在這段影片中,我們根據 1400~1900 年間出生的人,
05:06
based on the year of births of individuals from 1400 to 1900,
將剛才那張地圖做分層,
05:12
and we compared it to known migration events.
再將結果和已知的 移民事件做比對。
05:15
The clip is going to show you that the deepest lineages in our data
這支影片會讓各位看到, 我們資料中最深遠的連結,
05:18
go all the way back to the UK,
會一路連到記錄 保存得比較好的英國,
05:20
where they had better record keeping,
05:22
and then they spread along the routes of Western colonialism.
接著再隨西方殖民路線散播出去。
05:25
Let's watch this.
咱們來看看影片。
05:27
(Music)
(音樂)
05:28
[Year of birth: ]
〔出生年:〕
05:31
[1492 - Columbus sails the ocean blue]
〔1492 年:哥倫布藍色海洋航行時期〕
05:35
[1620 - Mayflower lands in Massachusetts]
〔1620 年:五月花號在麻州靠岸〕
05:38
[1652 - Dutch settle in South Africa]
〔1652 年:荷蘭人在南非定居〕
05:44
[1788 - Great Britain penal transportation to Australia starts]
〔1788 年:英國開始將受刑者運往澳洲〕
05:47
[1836 - First migrants use Oregon Trail]
〔1836 年:奧勒岡小徑 初次被移民使用〕
05:50
[all activity]
〔所有活動〕
05:55
I love this movie.
我很愛這支影片。
05:57
Now, since these migration events are giving the context of families,
既然有這些移民事件 作為家族的脈絡,
06:02
we can ask questions such as:
我們就能問像這樣的問題:
06:04
What is the typical distance between the birth locations
先生和太太的出生地,
06:08
of husbands and wives?
通常距離多遠?
06:11
This distance plays a pivotal role in demography,
在人口統計學上, 這距離扮演很關鍵的角色,
06:14
because the patterns in which people migrate to form families
因為人們遷移建構家庭的模式
06:18
determine how genes spread in geographical areas.
會決定基因在地理 區域上如何散播。
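One plausible way to measure the distance between spouses' birthplaces is the great-circle (haversine) distance between their coordinates. The sketch below assumes that approach; the talk does not say which distance measure or summary statistic was used, and the function name `birthplace_distance_km` and the sample coordinates are invented for illustration, not taken from the study's data.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def birthplace_distance_km(lat1: float, lon1: float,
                           lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Invented example couples: (husband's birthplace, wife's birthplace).
couples = [
    ((51.5074, -0.1278), (48.8566, 2.3522)),     # London and Paris
    ((40.7128, -74.0060), (40.7357, -74.1724)),  # New York and Newark
    ((52.5200, 13.4050), (52.5200, 13.4050)),    # same town
]

# The median is used here as one reasonable reading of "typical distance".
typical_km = median(birthplace_distance_km(*h, *w) for h, w in couples)
```

Grouping couples by year of birth before taking the median would give the kind of trend over time that the talk goes on to describe.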
06:22
We analyzed this distance using our data,
我們用我們的資料 來分析這個距離,
06:25
and we found that in the old days,
我們發現,在過去 用的方式並不困難。
06:27
people had it easy.
06:28
They just married someone in the village nearby.
他們只會和鄰近村落的人結婚。
06:31
But the Industrial Revolution really complicated our love life.
但,工業革命讓我們的 愛情生活變複雜了。
06:35
And today, with affordable flights and online social media,
現今,機票大家可以付擔得起, 再加上線上社群媒體,
06:40
people typically migrate more than 100 kilometers from their place of birth
人們通常從出生地遷移一百多公里,
06:45
to find their soul mate.
去尋找靈魂伴侶。
06:48
So now you might ask:
現在各位可能會問: 好吧,從一個地方遷移到另一個地方
06:49
OK, but who does the hard work of migrating from places to places
去建構家庭的苦差事是誰在做呢?
06:54
to form families?
06:55
Are these the males or the females?
是男方或女方?
06:59
We used our data to address this question,
我們用我們的資料來解這個問題,
07:01
and at least in the last 300 years,
至少,在過去三百年間,
07:04
we found that the ladies do the hard work
我們發現從一地移民到另一地
去組成家庭的苦差事是女方在做。
07:08
of migrating from places to places to form families.
07:11
Now, these results are statistically significant,
這些結果在統計上是顯著的,
07:14
so you can take it as scientific fact that males are lazy.
所以男人比較懶是有科學根據的。
07:18
(Laughter)
(笑聲)
07:21
We can move from questions about demography
我們可以從人口統計相關的問題
07:23
and ask questions about human health.
換到詢問人類健康相關的問題。
07:26
For example, we can ask
比如,我們可以問
07:28
to what extent genetic variations account for differences in life span
人與人之間的壽命差異,
受到遺傳變異的影響有多大?
07:33
between individuals.
07:34
Previous studies analyzed the correlation of longevity between twins
過去有研究分析 雙胞胎壽命的相關性
07:39
to address this question.
來解答這個問題。
07:41
They estimated that the genetic variations account for
他們估計,人與人 之間的壽命差異,
07:44
about a quarter of the differences in life span between individuals.
有四分之一是來自遺傳變異。
07:48
But twins can be correlated due to so many reasons,
但,雙胞胎之間的關聯性 有許多可能成因,
07:51
including various environmental effects
包括各種環境的影響,
07:53
or a shared household.
或共同的家庭。
07:56
Large family trees give us the opportunity to analyze both close relatives,
大型家譜樹讓我們有機會 分析這些近親,
08:00
such as twins,
比如雙胞胎,
08:01
all the way to distant relatives, even fourth cousins.
到遠房親戚,甚至第四代表親。
08:04
This way we can build robust models
這樣我們就能建立穩健的模型,
08:07
that can tease apart the contribution of genetic variations
從環境因素中
分離出遺傳變異的貢獻來。
08:11
from environmental factors.
08:13
We conducted this analysis using our data,
我們用我們的資料進行這項分析,
08:16
and we found that genetic variations explain only 15 percent
我們發現,遺傳變異只解釋了
15% 的個體壽命差異 。
08:22
of the differences in life span between individuals.
08:26
That is five years, on average.
平均而言,就是五年之差。
08:30
So genes matter less than what we thought before to life span.
所以,基因對壽命的影響 沒有我們以前想的那麼大。
08:35
And I find it great news,
我認為這是大好消息,
08:38
because it means that our actions can matter more.
因為那就表示我們的行為 與壽命有較大的關係。
08:42
Smoking, for example, determines 10 years of our life expectancy --
比如,抽菸就能影響 十年的壽命——
08:46
twice as much as what genetics determines.
是基因影響的兩倍。
08:50
We can even have more surprising findings
我們還有更驚人的發現,
08:52
as we move from family trees
就在我們從做家譜樹到
08:54
and we let our genealogists document and crowdsource DNA information.
請家譜學家幫我們整理 DNA 資訊 並做眾包後發現的。
08:58
And the results can be amazing.
結果很驚人。
09:01
It might be hard to imagine, but Uncle Bernie and his friends
可能很難想像, 但柏尼叔叔和他的朋友
09:05
can create DNA forensic capabilities
所創造出來的 DNA 法醫鑑定能力
09:07
that even exceed what the FBI currently has.
甚至比目前的聯邦調查局還要強。
09:12
When you place the DNA on a large family tree,
當你把 DNA 放入大型的家譜樹中,
09:15
you effectively create a beacon
就能有效地創造出 如燈塔般的光束,
09:17
that illuminates the hundreds of distant relatives
從 DNA 的源頭者放射出與
09:20
that are all connected to the person that originated the DNA.
數百名遠親的連結光束。
09:24
By placing multiple beacons on a large family tree,
在家譜中放入數個燈塔,
09:27
you can now triangulate the DNA of an unknown person,
就能針對一個未知的人 做 DNA 三角定位,
09:31
the same way that the GPS system uses multiple satellites
原理和 GPS 使用多個衛星
來定位一個地點的方法相同。
09:35
to find a location.
09:37
The prime example of the power of this technique
有個主要的例子可以說明 這項技術有多強大,
09:40
is capturing the Golden State Killer,
那就是追捕金州(加州)殺手,
09:44
one of the most notorious criminals in the history of the US.
他是美國史上 最惡名昭彰的罪犯之一。
09:49
The FBI had been searching for this person for over 40 years.
聯邦調查局尋找這個人
已經超過四十年。
09:55
They had his DNA,
他們有他的 DNA,
09:57
but he never showed up in any police database.
但他從來沒有出現在 任何警方資料庫中。
10:01
About a year ago, the FBI consulted a genetic genealogist,
大約一年前,聯邦調查局 去諮詢了一位基因系譜學家,
10:06
and she suggested that they submit his DNA to a genealogy service
她建議他們將他的 DNA 上傳到一項家譜服務中,
10:10
that can locate distant relatives.
這項服務能找出遠親。
10:13
They did that,
他們照做了,
10:14
and they found a third cousin of the Golden State Killer.
找到了金州殺手的第三代表親。
10:18
They built a large family tree,
他們建立了一個很大的家譜樹,
10:20
scanned the different branches of that tree,
掃描樹狀圖上的不同分支,
10:22
until they found a profile that exactly matched
直到他們找到完美匹配
金州殺手資訊的檔案。
10:25
what they knew about the Golden State Killer.
10:27
They obtained DNA from this person and found a perfect match
他們從這個人身上取得 DNA
並發現跟他們手上的 DNA 相匹配。
10:31
to the DNA they had in hand.
10:33
They arrested him and brought him to justice
他們逮捕了這個人, 這麼多年後終於將他繩之以法。
10:35
after all these years.
10:38
Since then, genetic genealogists have started working with
從那之後,基因系譜學家就開始
和美國執法單位合作,
10:41
local US law enforcement agencies
10:44
to use this technique in order to capture criminals.
使用這項技術來抓罪犯。
10:47
And only in the past six months,
光是在過去六個月,
10:50
they were able to solve over 20 cold cases with this technique.
他們就用這項技術破了 超過二十件長年未破的案件。
10:56
Luckily, we have people like Uncle Bernie and his fellow genealogists.
很幸運,我們有柏尼叔叔 和他的家譜學家夥伴們。
11:01
These are not amateurs with a self-serving hobby.
這些人不只是業餘愛好者。
11:04
These are citizen scientists with a deep passion to tell us who we are.
他們還是滿懷熱情
能說「我們是誰」的公民科學家,
11:11
And they know that the past can hold a key to the future.
他們知道過去是通向未來的鑰匙。
11:16
Thank you very much.
感謝各位。
11:17
(Applause)
(掌聲)