Should we get rid of standardized testing? - Arlo Kempf

1,241,187 views ・ 2017-09-19

TED-Ed



Translator: Jiawen Wei. Reviewer: Jessica Lee.
00:09
The first standardized tests that we know of
00:12
were administered in China over 2,000 years ago
00:16
during the Han dynasty.
00:18
Chinese officials used them to determine aptitude for various government posts.
00:23
The subject matter included philosophy,
00:25
farming,
00:26
and even military tactics.
00:28
Standardized tests continued to be used around the world for the next two millennia,
00:33
and today, they're used for everything
00:35
from evaluating stair climbs for firefighters in France
00:39
to language examinations for diplomats in Canada
00:43
to students in schools.
00:45
Some standardized tests measure scores
00:48
only in relation to the results of other test takers.
00:51
Others measure performances on how well test takers meet predetermined criteria.
00:57
So the stair climb for the firefighter
00:59
could be measured by comparing the time of the climb
01:02
to that of all other firefighters.
01:05
This might be expressed in what many call a bell curve.
01:09
Or it could be evaluated with reference to set criteria,
01:13
such as carrying a certain amount of weight a certain distance
01:17
up a certain number of stairs.
01:19
Similarly, the diplomat might be measured against other test-taking diplomats,
01:24
or against a set of fixed criteria,
01:27
which demonstrate different levels of language proficiency.
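
The two scoring approaches described above can be sketched in a few lines of Python. This is only an illustration: the names, the stair-climb times, and the 105-second cutoff are all invented for the example and do not come from the video.

```python
# Hypothetical stair-climb times in seconds (lower is better).
# All names and numbers here are invented for illustration.
climb_times = {"Ana": 95.0, "Ben": 110.0, "Chloe": 102.0, "Dev": 130.0, "Eli": 88.0}


def norm_referenced_rank(times, name):
    """Score relative to other test takers: 1 means the fastest climb."""
    ordered = sorted(times, key=times.get)
    return ordered.index(name) + 1


def criterion_referenced_pass(time_seconds, cutoff_seconds=105.0):
    """Score against a fixed standard: pass if at or under an assumed cutoff."""
    return time_seconds <= cutoff_seconds


for name, seconds in climb_times.items():
    rank = norm_referenced_rank(climb_times, name)
    passed = criterion_referenced_pass(seconds)
    print(f"{name}: rank {rank} of {len(climb_times)}, meets standard: {passed}")
```

Note that the rank changes whenever the group of test takers changes, while the pass or fail result depends only on the fixed cutoff.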
01:31
And all of these results can be expressed using something called a percentile.
01:35
If a diplomat is in the 70th percentile, 70% of test takers scored below her.
01:41
If she scored in the 30th percentile, 70% of test takers scored above her.
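
As a rough sketch of the percentile idea, the function below computes the share of test takers who scored strictly below a given score. This is one common definition; others treat ties and the test taker's own score differently. The score list is hypothetical.

```python
def percentile_rank(scores, score):
    """Percentage of scores strictly below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


# Hypothetical language-exam scores for ten diplomats.
scores = [52, 58, 61, 64, 67, 70, 73, 77, 82, 90]

print(percentile_rank(scores, 77))  # 7 of 10 scored below: 70.0
print(percentile_rank(scores, 64))  # 3 of 10 scored below: 30.0
```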
01:47
Although standardized tests are sometimes controversial,
01:50
they're simply a tool.
01:52
As a thought experiment, think of a standardized test as a ruler.
01:56
A ruler's usefulness depends on two things.
01:59
First, the job we ask it to do.
02:02
Our ruler can't measure the temperature outside
02:04
or how loud someone is singing.
02:07
Second, the ruler's usefulness depends on its design.
02:10
Say you need to measure the circumference of an orange.
02:14
Our ruler measures length, which is the right quantity,
02:17
but it hasn't been designed with the flexibility required for the task at hand.
02:22
So, if standardized tests are given the wrong job,
02:25
or aren't designed properly,
02:27
they may end up measuring the wrong things.
02:31
In the case of schools,
02:32
students with test anxiety may have trouble performing their best
02:36
on a standardized test,
02:38
not because they don't know the answers,
02:40
but because they're feeling too nervous to share what they've learned.
02:43
Students with reading challenges
02:45
may struggle with the wording of a math problem,
02:48
so their test results may better reflect their literacy
02:50
rather than numeracy skills.
02:53
And students who were confused by examples
02:55
on tests that contain unfamiliar cultural references
02:59
may do poorly,
03:00
telling us more about the test taker's cultural familiarity
03:03
than their academic learning.
03:05
In these cases, the tests may need to be designed differently.
03:11
Standardized tests can also have a hard time
03:13
measuring abstract characteristics or skills,
03:16
such as creativity, critical thinking, and collaboration.
03:20
If we design a test poorly,
03:22
or ask it to do the wrong job,
03:24
or a job it's not very good at,
03:26
the results may not be reliable or valid.
03:29
Reliability and validity are two critical ideas
03:32
for understanding standardized tests.
03:35
To understand the difference between them,
03:37
we can use the metaphor of two broken thermometers.
03:40
An unreliable thermometer
03:42
gives you a different reading each time you take your temperature,
03:45
and the reliable but invalid thermometer is consistently ten degrees too hot.
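
The thermometer metaphor can be illustrated with a small simulation. The sketch below is built on assumptions that are not in the video: a true temperature of 37.0 degrees, an unreliable thermometer whose readings scatter evenly within 5 degrees of the truth, and a reliable but invalid one that always reads exactly 10 degrees high.

```python
import random
import statistics

TRUE_TEMP = 37.0  # assumed true temperature, chosen for illustration


def unreliable_reading(rng):
    """Centered on the truth, but a different value on every reading."""
    return TRUE_TEMP + rng.uniform(-5.0, 5.0)


def reliable_but_invalid_reading():
    """The same value on every reading, but always ten degrees too hot."""
    return TRUE_TEMP + 10.0


def summarize(readings):
    """Return (average error, spread of the readings)."""
    bias = statistics.mean(readings) - TRUE_TEMP
    spread = statistics.pstdev(readings)
    return bias, spread


rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = [unreliable_reading(rng) for _ in range(1000)]
biased = [reliable_but_invalid_reading() for _ in range(1000)]

print("unreliable:           bias %.2f, spread %.2f" % summarize(noisy))
print("reliable but invalid: bias %.2f, spread %.2f" % summarize(biased))
```

In this sketch, a large spread stands in for low reliability and a large average error stands in for low validity. A single reading from either thermometer would be misleading, for different reasons.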
03:51
Validity also depends on accurate interpretations of results.
03:55
If people say results of a test mean something they don't,
03:58
that test may have a validity problem.
04:01
Just as we wouldn't expect a ruler to tell us how much an elephant weighs,
04:06
or what it had for breakfast,
04:08
we can't expect standardized tests alone to reliably tell us how smart someone is,
04:14
how diplomats will handle a tough situation,
04:16
or how brave a firefighter might turn out to be.
04:20
So standardized tests may help us learn a little about a lot of people
04:25
in a short time,
04:26
but they usually can't tell us a lot about a single person.
04:31
Many social scientists worry about test scores resulting in sweeping
04:35
and often negative changes for test takers,
04:38
sometimes with long-term life consequences.
04:42
We can't blame the tests, though.
04:44
It's up to us to use the right tests for the right jobs,
04:48
and to interpret results appropriately.