The method that can "prove" almost anything - James A. Smith

848,766 views ・ 2021-08-05

TED-Ed



Translator: Lilian Chiu Reviewer: Amanda Zhu
00:06
In 2011, a group of researchers conducted a scientific study
00:10
to find an impossible result:
00:12
that listening to certain songs can make you younger.
00:16
Their study involved real people, truthfully reported data,
00:20
and commonplace statistical analyses.
00:23
So how did they do it?
00:24
The answer lies in a statistical method scientists often use
00:28
to try to figure out whether their results mean something or if they’re random noise.
00:33
In fact, the whole point of the music study
00:36
was to point out ways this method can be misused.
00:40
A famous thought experiment explains the method:
00:43
there are eight cups of tea,
00:45
four with the milk added first, and four with the tea added first.
00:50
A participant must determine which are which according to taste.
00:53
There are 70 different ways the cups can be sorted into two groups of four,
00:58
and only one is correct.
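The count of 70 can be checked directly. Choosing which four of the eight cups to call "milk first" fixes the other four, so the number of sortings is the binomial coefficient C(8, 4). The following is a minimal sketch; the cup numbering and variable names are illustrative assumptions, not part of the talk.

```python
from itertools import combinations
from math import comb

# Label the eight cups 0..7 (hypothetical numbering for illustration).
cups = range(8)

# Every way to pick the four cups the taster calls "milk first";
# the remaining four are implicitly "tea first".
sortings = list(combinations(cups, 4))

print(len(sortings))  # number of distinct sortings, by enumeration
print(comb(8, 4))     # the same count from the closed-form formula
```

Enumerating the sortings and using the formula should agree, which is a useful sanity check before computing any probabilities from them.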
01:00
So, can she taste the difference?
01:03
That’s our research question.
01:04
To analyze her choices, we define what’s called a null hypothesis:
01:09
that she can’t distinguish the teas.
01:11
If she can’t distinguish the teas,
01:13
she’ll still get the right answer 1 in 70 times by chance.
01:19
1 in 70 is roughly .014.
01:22
That single number is called a p-value.
01:26
In many fields, a p-value of .05 or below is considered statistically significant,
01:32
meaning there’s enough evidence to reject the null hypothesis.
01:36
Based on a p-value of .014,
01:40
they’d rule out the null hypothesis that she can’t distinguish the teas.
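The p-value for the tea experiment can be reproduced by enumerating every possible guess, as in the sketch below. It counts the guesses that do at least as well as the observed one, which is the usual one-sided convention. The function name `p_value` and the "three of four" case are assumptions added for illustration; the talk only discusses a perfect sorting.

```python
from fractions import Fraction
from itertools import combinations

def p_value(correct_needed, cups=8, milk_first=4):
    """P(at least `correct_needed` milk-first cups identified | pure guessing)."""
    truth = set(range(milk_first))  # assumed true milk-first cups
    guesses = list(combinations(range(cups), milk_first))
    hits = sum(1 for g in guesses if len(truth & set(g)) >= correct_needed)
    return Fraction(hits, len(guesses))

ALPHA = Fraction(5, 100)  # the conventional .05 threshold from the talk

perfect = p_value(4)   # all four milk-first cups identified
one_miss = p_value(3)  # at least three of the four identified

print(float(perfect), perfect <= ALPHA)
print(float(one_miss), one_miss <= ALPHA)
```

Under these assumptions a perfect sorting falls below the .05 threshold, while a single swapped pair would not, because many more of the 70 guesses do that well by chance.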
01:44
Though p-values are commonly used by both researchers and journals
01:48
to evaluate scientific results,
01:50
they’re really confusing, even for many scientists.
01:54
That’s partly because all a p-value actually tells us
01:58
is the probability of getting a certain result,
02:01
assuming the null hypothesis is true.
02:04
So if she correctly sorts the teas,
02:07
the p-value is the probability of her doing so
02:10
assuming she can’t tell the difference.
02:13
But the reverse isn’t true:
02:15
the p-value doesn’t tell us the probability
02:18
that she can taste the difference,
02:19
which is what we’re trying to find out.
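One way to see why the reverse does not hold is Bayes' rule: the probability that she cannot taste the difference, given a perfect sorting, depends on inputs the p-value never uses. The sketch below is only an illustration. The prior probabilities and the `ASSUMED_SKILL` figure are hypothetical numbers chosen for the example; none of them come from the talk.

```python
from fractions import Fraction

def prob_cannot_taste_given_perfect(prior_can_taste, p_perfect_if_can_taste):
    """Posterior P(cannot taste | perfect sorting) via Bayes' rule."""
    p_perfect_if_guessing = Fraction(1, 70)  # the p-value from the talk
    prior_cannot = 1 - prior_can_taste
    evidence = (p_perfect_if_guessing * prior_cannot
                + p_perfect_if_can_taste * prior_can_taste)
    return p_perfect_if_guessing * prior_cannot / evidence

# Hypothetical input: chance that a genuine taster sorts all cups correctly.
ASSUMED_SKILL = Fraction(8, 10)

# Two hypothetical priors: a coin flip, and a strongly skeptical prior.
for prior in (Fraction(1, 2), Fraction(1, 100)):
    posterior = prob_cannot_taste_given_perfect(prior, ASSUMED_SKILL)
    print(f"prior P(can taste)={float(prior):.2f} -> "
          f"P(cannot taste | perfect)={float(posterior):.3f}")
```

The same p-value of 1 in 70 leads to very different posteriors depending on the assumed prior, which is why the p-value alone cannot be read as the probability that the null hypothesis is true.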
02:22
So if a p-value doesn’t answer the research question,
02:25
why does the scientific community use it?
02:28
Well, because even though a p-value doesn’t directly state the probability
02:33
that the results are due to random chance,
02:35
it usually gives a pretty reliable indication.
02:39
At least, it does when used correctly.
02:41
And that’s where many researchers, and even whole fields,
02:45
have run into trouble.
02:47
Most real studies are more complex than the tea experiment.
02:51
Scientists can test their research question in multiple ways,
02:54
and some of these tests might produce a statistically significant result,
02:59
while others don’t.
03:00
It might seem like a good idea to test every possibility.
03:03
But it’s not, because with each additional test,
03:07
the chance of a false positive increases.
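How quickly the chance of a false positive grows can be sketched with a short calculation. It assumes every tested null hypothesis is true and that the tests are independent, a simplification the talk does not state; real analyses of the same data are usually correlated, so the exact figures would differ. The function name is hypothetical.

```python
def chance_of_any_false_positive(num_tests, alpha=0.05):
    """P(at least one p <= alpha) across independent tests of true nulls."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them must avoid it for the whole study to stay clean.
    return 1 - (1 - alpha) ** num_tests

for k in (1, 3, 10, 20):
    print(k, round(chance_of_any_false_positive(k), 3))
```

Under this assumption, one test carries the nominal 5% risk, while a researcher who tries many analyses is far more likely than not to see at least one "significant" result by chance alone.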
03:10
Searching for a low p-value, and then presenting only that analysis,
03:15
is often called p-hacking.
03:18
It’s like throwing darts until you hit a bullseye
03:20
and then saying you only threw the dart that hit the bullseye.
03:24
This is exactly what the music researchers did.
03:28
They played three groups of participants each a different song
03:31
and collected lots of information about them.
03:34
The analysis they published included only two out of the three groups.
03:38
Of all the information they collected,
03:40
their analysis only used participants’ fathers’ age—
03:44
to “control for variation in baseline age across participants.”
03:49
They also paused their experiment after every ten participants,
03:53
and continued if the p-value was above .05,
03:57
but stopped when it dipped below .05.
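The effect of this stopping rule can be simulated. The sketch below is not the researchers' actual analysis: it assumes two groups drawn from the same normal distribution (so any "effect" is a false positive), adds ten participants per group before each peek, and uses a simple z-test with known variance. All function names, the batch size per group, and the number of peeks are illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean

def two_sided_p(group_a, group_b):
    """z-test for a difference in means, assuming known unit variance."""
    n = len(group_a)  # both groups always have the same size here
    z = (fmean(group_a) - fmean(group_b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def run_study(rng, batch=10, max_batches=10, alpha=0.05):
    """Return (significant at any peek, significant at the final look)."""
    a, b = [], []
    any_peek = False
    p = 1.0
    for _ in range(max_batches):
        # No real effect: both groups come from the same distribution.
        a += [rng.gauss(0, 1) for _ in range(batch)]
        b += [rng.gauss(0, 1) for _ in range(batch)]
        p = two_sided_p(a, b)
        any_peek = any_peek or p < alpha
    return any_peek, p < alpha

rng = random.Random(0)  # fixed seed so the run is repeatable
results = [run_study(rng) for _ in range(2000)]
peeking_rate = fmean(hit for hit, _ in results)
single_look_rate = fmean(final for _, final in results)
print(f"stop at first p < .05: {peeking_rate:.3f}")
print(f"one planned analysis:  {single_look_rate:.3f}")
```

A researcher who stops at the first peek with p below .05 reports a "finding" whenever any peek crosses the threshold, so the first rate is the one that matters for that strategy. In this setup it comes out well above the rate for a single analysis at a planned sample size.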
04:01
They found that participants who heard one song were 1.5 years younger
04:06
than those who heard the other song, with a p-value of .04.
04:12
Usually it’s much tougher to spot p-hacking,
04:14
because we don’t know the results are impossible:
04:17
the whole point of doing experiments is to learn something new.
04:21
Fortunately, there’s a simple way to make p-values more reliable:
04:25
pre-registering a detailed plan for the experiment and analysis
04:30
beforehand that others can check,
04:33
so researchers can’t keep trying different analyses
04:36
until they find a significant result.
04:38
And, in the true spirit of scientific inquiry,
04:41
there’s even a new field that’s basically science doing science on itself:
04:46
studying scientific practices in order to improve them.