Should we get rid of standardized testing? - Arlo Kempf

1,241,187 views ・ 2017-09-19

TED-Ed



Translator: Crystal Yip
00:09
The first standardized tests that we know of
00:12
were administered in China over 2,000 years ago
00:16
during the Han dynasty.
00:18
Chinese officials used them to determine aptitude for various government posts.
00:23
The subject matter included philosophy,
00:25
farming,
00:26
and even military tactics.
00:28
Standardized tests continued to be used around the world for the next two millennia,
00:33
and today, they're used for everything
00:35
from evaluating stair climbs for firefighters in France
00:39
to language examinations for diplomats in Canada
00:43
to students in schools.
00:45
Some standardized tests measure scores
00:48
only in relation to the results of other test takers.
00:51
Others measure performances on how well test takers meet predetermined criteria.
00:57
So the stair climb for the firefighter
00:59
could be measured by comparing the time of the climb
01:02
to that of all other firefighters.
01:05
This might be expressed in what many call a bell curve.
01:09
Or it could be evaluated with reference to set criteria,
01:13
such as carrying a certain amount of weight a certain distance
01:17
up a certain number of stairs.
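The two ways of scoring the stair climb described above can be sketched in a few lines of Python. This is only an illustration: the function names, the climb times, the 120-second cutoff, and the 20 kg load are all hypothetical values chosen for the example, not figures from the talk.

```python
def norm_referenced_score(time_s, group_times_s):
    """Percentage of the group that climbed more slowly than this candidate."""
    slower = sum(1 for t in group_times_s if t > time_s)
    return 100 * slower / len(group_times_s)


def criterion_referenced_pass(time_s, weight_kg, cutoff_s=120.0, required_kg=20.0):
    """True if the candidate carried the required weight within the time limit.

    The cutoff and the required weight are assumed values for illustration.
    """
    return weight_kg >= required_kg and time_s <= cutoff_s


# The same 110-second climb, compared against two hypothetical groups.
average_group = [95, 100, 110, 118, 125, 130, 140, 150]
fast_group = [80, 85, 90, 95, 100, 105, 110, 115]

score_in_average_group = norm_referenced_score(110, average_group)
score_in_fast_group = norm_referenced_score(110, fast_group)
passed = criterion_referenced_pass(110, weight_kg=20)

print(score_in_average_group)  # 62.5
print(score_in_fast_group)     # 12.5
print(passed)                  # True
```

The norm-referenced score changes with the comparison group even though the climb itself is identical, while the criterion-referenced result depends only on the fixed standard.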
01:19
Similarly, the diplomat might be measured against other test-taking diplomats,
01:24
or against a set of fixed criteria,
01:27
which demonstrate different levels of language proficiency.
01:31
And all of these results can be expressed using something called a percentile.
01:35
If a diplomat is in the 70th percentile, 70% of test takers scored below her.
01:41
If she scored in the 30th percentile, 70% of test takers scored above her.
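As a rough illustration of the percentile idea, the short Python sketch below counts how many test takers scored below a given score. The function name and the ten scores are hypothetical, and "percentage of scores strictly below" is only one of several conventions for computing a percentile rank.

```python
def percentile_rank(scores, score):
    """Percentage of scores that fall strictly below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


# Ten made-up language exam scores for a group of diplomats.
scores = [52, 58, 61, 64, 67, 70, 73, 77, 81, 90]

high = percentile_rank(scores, 77)  # seven of the ten scores are lower
low = percentile_rank(scores, 64)   # three of the ten scores are lower

print(high)  # 70.0
print(low)   # 30.0
```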
01:47
Although standardized tests are sometimes controversial,
01:50
they're simply a tool.
01:52
As a thought experiment, think of a standardized test as a ruler.
01:56
A ruler's usefulness depends on two things.
01:59
First, the job we ask it to do.
02:02
Our ruler can't measure the temperature outside
02:04
or how loud someone is singing.
02:07
Second, the ruler's usefulness depends on its design.
02:10
Say you need to measure the circumference of an orange.
02:14
Our ruler measures length, which is the right quantity,
02:17
but it hasn't been designed with the flexibility required for the task at hand.
02:22
So, if standardized tests are given the wrong job,
02:25
or aren't designed properly,
02:27
they may end up measuring the wrong things.
02:31
In the case of schools,
02:32
students with test anxiety may have trouble performing their best
02:36
on a standardized test,
02:38
not because they don't know the answers,
02:40
but because they're feeling too nervous to share what they've learned.
02:43
Students with reading challenges
02:45
may struggle with the wording of a math problem,
02:48
so their test results may better reflect their literacy
02:50
rather than numeracy skills.
02:53
And students who were confused by examples
02:55
on tests that contain unfamiliar cultural references
02:59
may do poorly,
03:00
telling us more about the test taker's cultural familiarity
03:03
than their academic learning.
03:05
In these cases, the tests may need to be designed differently.
03:11
Standardized tests can also have a hard time
03:13
measuring abstract characteristics or skills,
03:16
such as creativity, critical thinking, and collaboration.
03:20
If we design a test poorly,
03:22
or ask it to do the wrong job,
03:24
or a job it's not very good at,
03:26
the results may not be reliable or valid.
03:29
Reliability and validity are two critical ideas
03:32
for understanding standardized tests.
03:35
To understand the difference between them,
03:37
we can use the metaphor of two broken thermometers.
03:40
An unreliable thermometer
03:42
gives you a different reading each time you take your temperature,
03:45
and the reliable but invalid thermometer is consistently ten degrees too hot.
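The two broken thermometers can be simulated to make the distinction concrete. The sketch below is a simplified model under stated assumptions: the true temperature of 37.0 degrees, the random noise with a standard deviation of 3 degrees, and the function names are all hypothetical choices, and modelling the unreliable thermometer as noise centred on the true value is just one way to read "a different reading each time".

```python
import random
import statistics

TRUE_TEMP = 37.0  # assumed true temperature, for illustration only


def unreliable_reading(rng):
    """Scatters widely around the true value: a different reading every time."""
    return TRUE_TEMP + rng.gauss(0, 3.0)


def reliable_but_invalid_reading():
    """Always gives the same reading, but it is ten degrees too hot."""
    return TRUE_TEMP + 10.0


def summarize(readings):
    """Return (spread, bias): how inconsistent and how far off the readings are."""
    spread = statistics.pstdev(readings)
    bias = statistics.mean(readings) - TRUE_TEMP
    return spread, bias


rng = random.Random(0)  # fixed seed so the run is repeatable
unreliable = [unreliable_reading(rng) for _ in range(1000)]
invalid = [reliable_but_invalid_reading() for _ in range(1000)]

unreliable_spread, unreliable_bias = summarize(unreliable)
invalid_spread, invalid_bias = summarize(invalid)

print(f"unreliable: spread={unreliable_spread:.2f}, bias={unreliable_bias:.2f}")
print(f"reliable but invalid: spread={invalid_spread:.2f}, bias={invalid_bias:.2f}")
```

In this model a large spread signals a reliability problem, while a large bias with no spread signals a validity problem.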
03:51
Validity also depends on accurate interpretations of results.
03:55
If people say results of a test mean something they don't,
03:58
that test may have a validity problem.
04:01
Just as we wouldn't expect a ruler to tell us how much an elephant weighs,
04:06
or what it had for breakfast,
04:08
we can't expect standardized tests alone to reliably tell us how smart someone is,
04:14
how diplomats will handle a tough situation,
04:16
or how brave a firefighter might turn out to be.
04:20
So standardized tests may help us learn a little about a lot of people
04:25
in a short time,
04:26
but they usually can't tell us a lot about a single person.
04:31
Many social scientists worry about test scores resulting in sweeping
04:35
and often negative changes for test takers,
04:38
sometimes with long-term life consequences.
04:42
We can't blame the tests, though.
04:44
It's up to us to use the right tests for the right jobs,
04:48
and to interpret results appropriately.