Translator: Crystal Yip
00:09
The first standardized tests
that we know of
我們所知的第一個標準測驗
00:12
were administered in China
over 2,000 years ago
是在 2000 年前
時值漢朝的中國
00:16
during the Han dynasty.
00:18
Chinese officials used them to determine
aptitude for various government posts.
中國官員透過測驗決定應試者
是否勝任各類政府職務
00:23
The subject matter included philosophy,
考試範圍包括哲學
00:25
farming,
農耕
00:26
and even military tactics.
甚至軍事謀略
00:28
Standardized tests continued to be used
around the world for the next two millennia,
往後二千年,
標準測驗在世界各地繼續沿用
00:33
and today, they're used for everything
今天,測驗用於各種事情
00:35
from evaluating stair climbs
for firefighters in France
由法國評估消防員爬樓梯的能力
00:39
to language examinations
for diplomats in Canada
以至加拿大外交官的語言考試
00:43
to students in schools.
乃至學校學生
00:45
Some standardized tests measure scores
有些標準測驗按其他人的成績來評量分數
00:48
only in relation to the results
of other test takers.
00:51
Others measure performances on how well
test takers meet predetermined criteria.
另一些按預設標準來評量表現
00:57
So the stair climb for the firefighter
因此消防員爬樓梯的能力
00:59
could be measured by comparing
the time of the climb
能按其他消防員
爬樓梯需要的時間來評量
01:02
to that of all other firefighters.
01:05
This might be expressed in what
many call a bell curve.
這可用鐘形曲線來表示
01:09
Or it could be evaluated with reference
to set criteria,
或按預設標準來評量
01:13
such as carrying a certain amount
of weight a certain distance
例如攜帶相當重量行走特定距離
01:17
up a certain number of stairs.
並爬上特定數量的梯級
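The two ways of scoring the stair climb can be sketched in a few lines of Python. This is only an illustration and not something shown in the video: the cohort times, the weight, flight, and time thresholds, and the function names `z_score` and `meets_criteria` are all assumptions invented for the example.

```python
from statistics import mean, pstdev


def z_score(time_s, cohort_times):
    """Norm-referenced view: distance from the cohort mean, in standard deviations.

    A negative value means a faster-than-average climb.
    """
    return (time_s - mean(cohort_times)) / pstdev(cohort_times)


def meets_criteria(weight_kg, flights, time_s,
                   min_weight_kg=20.0, min_flights=6, max_time_s=120.0):
    """Criterion-referenced view: pass or fail against fixed, made-up thresholds."""
    return (weight_kg >= min_weight_kg
            and flights >= min_flights
            and time_s <= max_time_s)


# Hypothetical climb times (seconds) for a group of firefighters.
cohort = [95.0, 100.0, 105.0, 110.0, 115.0]

print(z_score(95.0, cohort))            # compared with the other firefighters
print(meets_criteria(20.0, 6, 110.0))   # compared with the fixed standard
```

The first function changes its answer whenever the cohort changes; the second gives the same answer no matter who else takes the test.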
01:19
Similarly, the diplomat might be measured
against other test-taking diplomats,
同理,外交官可按
其他應試外交官的表現來評量
01:24
or against a set of fixed criteria,
或按預設標準
01:27
which demonstrate different levels
of language proficiency.
來顯示應試者精通語言的程度
01:31
And all of these results can be expressed
using something called a percentile.
而這些結果可用百分位數來表達
01:35
If a diplomat is in the 70th percentile,
70% of test takers scored below her.
若外交官在第 70 百分位數,
70% 應試者的分數低於她
01:41
If she scored in the 30th percentile,
70% of test takers scored above her.
若得分在第 30 百分位數,
70% 應試者的分數高於她
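The percentile idea can be made concrete with a small calculation. The sketch below uses the definition given in the transcript (the share of test takers who scored below a given score). The score list and the function name `percentile_rank` are hypothetical, and real testing programs may handle ties or the test taker's own score differently.

```python
def percentile_rank(score, all_scores):
    """Percentage of test takers whose score is strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


# Ten made-up language exam scores, one per diplomat.
scores = [52, 58, 61, 64, 67, 70, 73, 77, 82, 90]

print(percentile_rank(77, scores))  # 70.0: seven of the ten scored below 77
print(percentile_rank(64, scores))  # 30.0: three of the ten scored below 64
```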
01:47
Although standardized tests
are sometimes controversial,
雖然標準測驗有時備受爭議
01:50
they're simply a tool.
但它們其實只是工具
01:52
As a thought experiment,
think of a standardized test as a ruler.
試作思想實驗:標準測驗是把直尺
01:56
A ruler's usefulness
depends on two things.
直尺是否有用視乎兩件事
01:59
First, the job we ask it to do.
第一,我們將它應用在甚麼工作
02:02
Our ruler can't measure
the temperature outside
我們的直尺不能量度室外温度
02:04
or how loud someone is singing.
或某人唱歌的音量大小
02:07
Second, the ruler's usefulness depends
on its design.
第二,直尺是否合用視乎其設計
02:10
Say you need to measure the circumference
of an orange.
譬如你需要量度一個柳橙的圓周
02:14
Our ruler measures length,
which is the right quantity,
雖然圓周是長度,
而我們的直尺能量度長度
02:17
but it hasn't been designed with the
flexibility required for the task at hand.
但它的設計未能有彈性量度曲線
02:22
So, if standardized tests are given
the wrong job,
所以,如果標準測驗錯配工作
02:25
or aren't designed properly,
或設計不善
02:27
they may end up measuring
the wrong things.
最後可能會量度錯誤
02:31
In the case of schools,
以學校為例
02:32
students with test anxiety may have
trouble performing their best
對測驗感到焦慮的學生
或在測驗中難有最佳表現
02:36
on a standardized test,
02:38
not because they don't know the answers,
不是因為他們不懂得回答問題
02:40
but because they're feeling too nervous
to share what they've learned.
而是因為太緊張以致無法呈現成果
02:43
Students with reading challenges
有閱讀困難的學生
02:45
may struggle with the wording
of a math problem,
也許難於明白數學題的文句
02:48
so their test results may better reflect
their literacy
因此他們的測驗成績
或較能反映他們閱讀文字的能力
02:50
rather than numeracy skills.
而非算術能力
02:53
And students who were confused by examples
一些學生礙於文化隔閡,
未能明白測驗中的例子
02:55
on tests that contain
unfamiliar cultural references
02:59
may do poorly,
可能表現欠佳
03:00
telling us more about the test taker's
cultural familiarity
這些測驗較能得知
應試者的文化熟悉度
03:03
than their academic learning.
而非他們的學術知識
03:05
In these cases, the tests may need
to be designed differently.
這些例子中,測驗或需要更改設計
03:11
Standardized tests can also
have a hard time
標準測驗也難於
03:13
measuring abstract
characteristics or skills,
量度抽象的性格或技能
03:16
such as creativity, critical thinking,
and collaboration.
例如創意、批判思考和合作能力
03:20
If we design a test poorly,
如果測驗設計不良
03:22
or ask it to do the wrong job,
或用之不當
03:24
or a job it's not very good at,
或用之不善
03:26
the results may not be reliable or valid.
結果可能會不可靠或無效
03:29
Reliability and validity
are two critical ideas
可靠性和有效性是兩個重要概念
03:32
for understanding standardized tests.
來理解標準測驗
03:35
To understand the difference between them,
要理解兩者的差異
03:37
we can use the metaphor
of two broken thermometers.
我們能夠用兩支壞的温度計作比喻
03:40
An unreliable thermometer
一支不可靠的温度計
03:42
gives you a different reading
each time you take your temperature,
每次你測量自己體温時,
都得到不同的讀數
03:45
and the reliable but invalid thermometer
is consistently ten degrees too hot.
另一支是可靠但不準確的温度計
總是比正確温度高出十度
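The two broken thermometers can be simulated to show how unreliability (scatter) differs from invalidity (a consistent offset). This is a rough sketch under stated assumptions: the true temperature of 37.0, the noise levels, the seed, and the names `take_readings` and `summarize` are invented for illustration. Only the ten-degree offset comes from the transcript.

```python
import random
from statistics import mean, stdev

TRUE_TEMP = 37.0  # assumed true body temperature for the simulation


def take_readings(bias, noise_sd, n=1000, seed=0):
    """Simulate n readings with a constant offset plus random noise."""
    rng = random.Random(seed)
    return [TRUE_TEMP + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


def summarize(readings):
    """Return (average error, spread of the readings)."""
    return mean(readings) - TRUE_TEMP, stdev(readings)


# Unreliable: right on average, but each reading differs a lot.
unreliable_error, unreliable_spread = summarize(take_readings(bias=0.0, noise_sd=3.0))

# Reliable but invalid: readings agree closely, yet all sit about ten degrees high.
invalid_error, invalid_spread = summarize(take_readings(bias=10.0, noise_sd=0.1))

print(unreliable_error, unreliable_spread)
print(invalid_error, invalid_spread)
```

A large spread signals a reliability problem; a small spread with a large average error signals a validity problem.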
03:51
Validity also depends on accurate
interpretations of results.
有效性也在於恰當解讀結果
03:55
If people say results of a test
mean something they don't,
如果分析不符合測驗結果
03:58
that test may have a validity problem.
該測驗的有效性就會成疑
04:01
Just as we wouldn't expect a ruler
to tell us how much an elephant weighs,
正如我們不會
以直尺量度大象的重量
04:06
or what it had for breakfast,
或問直尺大象吃了甚麼早餐
04:08
we can't expect standardized tests alone
to reliably tell us how smart someone is,
我們不能認為單靠標準測驗
便可知某人有多聰明
04:14
how diplomats will handle
a tough situation,
外交官有多能應對困難情況
04:16
or how brave a firefighter
might turn out to be.
或消防員將會有多勇敢
04:20
So standardized tests may help us learn
a little about a lot of people
因此標準測驗或能助我們
短時間內簡略了解很多人
04:25
in a short time,
04:26
but they usually can't tell us a lot
about a single person.
但我們通常不能
從中詳細知道一個人
04:31
Many social scientists worry about
test scores resulting in sweeping
很多社會科學家擔心測驗分數會帶來大幅改變
04:35
and often negative changes
for test takers,
並經常為應試者帶來負面影響
04:38
sometimes with long-term
life consequences.
有時影響一生
04:42
We can't blame the tests, though.
但是,我們不能錯怪測驗
04:44
It's up to us to use the right tests
for the right jobs,
而是在於我們是否用得其所
04:48
and to interpret results appropriately.
並合理分析結果