Erin McKean: The joy of lexicography

72,730 views ・ 2007-08-30

TED



00:25
Now, have any of y'all ever looked up this word?
00:29
You know, in a dictionary? (Laughter) Yeah, that's what I thought.
00:33
How about this word?
00:35
Here, I'll show it to you.
00:36
Lexicography: the practice of compiling dictionaries.
00:39
Notice -- we're very specific -- that word "compile."
00:42
The dictionary is not carved out of a piece of granite,
00:45
out of a lump of rock. It's made up of lots of little bits.
00:48
It's little discrete --
00:49
that's spelled D-I-S-C-R-E-T-E -- bits.
00:53
And those bits are words.
00:55
Now one of the perks of being a lexicographer --
00:59
besides getting to come to TED -- is that you get to say really fun words,
01:02
like lexicographical.
01:05
Lexicographical has this great pattern:
01:07
it's called a double dactyl. And just by saying double dactyl,
01:09
I've sent the geek needle all the way into the red. (Laughter) (Applause)
01:12
But "lexicographical" is the same pattern as "higgledy-piggledy."
01:16
Right? It's a fun word to say,
01:18
and I get to say it a lot.
01:21
Now, one of the non-perks of being a lexicographer
01:24
is that people don't usually have a kind of warm, fuzzy, snuggly image of the dictionary.
01:29
Right? Nobody hugs their dictionaries.
01:32
But what people really often think about the dictionary is, they think more like this.
01:39
Just to let you know, I do not have a lexicographical whistle.
01:42
But people think that my job is to let the good words
01:44
make that difficult left-hand turn into the dictionary,
01:47
and keep the bad words out.
01:49
But the thing is, I don't want to be a traffic cop.
01:52
For one thing, I just do not do uniforms.
01:56
And for another, deciding what words are good
02:00
and what words are bad is actually not very easy.
02:02
And it's not very fun. And when parts of your job are not easy or fun,
02:06
you kind of look for an excuse not to do them.
02:09
So if I had to think of some kind of occupation
02:14
as a metaphor for my work, I would much rather be a fisherman.
02:20
I want to throw my big net into the deep, blue ocean of English
02:23
and see what marvelous creatures I can drag up from the bottom.
02:27
But why do people want me to direct traffic, when I would much rather go fishing?
02:32
Well, I blame the Queen.
02:34
Why do I blame the Queen?
02:36
Well, first of all, I blame the Queen because it's funny.
02:38
But secondly, I blame the Queen because
02:41
dictionaries have really not changed.
02:43
Our idea of what a dictionary is has not changed since her reign.
02:45
The only thing that Queen Victoria would not be amused by in modern dictionaries
02:51
is our inclusion of the F-word, which has happened
02:54
in American dictionaries since 1965.
02:56
So, there's this guy, right? Victorian era.
02:59
James Murray, first editor of the Oxford English Dictionary.
03:01
I do not have that hat. I wish I had that hat.
03:04
So he's really responsible for a lot of
03:08
what we consider modern in dictionaries today.
03:10
When a guy who looks like that, in that hat,
03:13
is the face of modernity, you have a problem.
03:20
And so, James Murray could get a job on any dictionary today.
03:22
There'd be virtually no learning curve.
03:25
And of course, a few of us are saying: okay, computers!
03:27
Computers! What about computers?
03:29
The thing about computers is, I love computers.
03:31
I mean, I'm a huge geek, I love computers.
03:33
I would go on a hunger strike before I let them take away Google Book Search from me.
03:37
But computers don't do much else other than
03:39
speed up the process of compiling dictionaries.
03:43
They don't change the end result.
03:47
Because what a dictionary is,
03:50
is it's Victorian design merged with a little bit of modern propulsion.
03:53
It's steampunk. What we have is an electric velocipede.
03:59
You know, we have Victorian design with an engine on it. That's all!
04:02
The design has not changed.
04:05
And OK, what about online dictionaries, right?
04:07
Online dictionaries must be different.
04:10
This is the Oxford English Dictionary Online, one of the best online dictionaries.
04:12
This is my favorite word, by the way.
04:13
Erinaceous: pertaining to the hedgehog family; of the nature of a hedgehog.
04:18
Very useful word. So, look at that.
04:24
Online dictionaries right now are paper thrown up on a screen.
04:26
This is flat. Look how many links there are in the actual entry: two!
04:31
Right? Those little buttons,
04:33
I had them all expanded except for the date chart.
04:36
So there's not very much going on here.
04:38
There's not a lot of clickiness.
04:40
And in fact, online dictionaries replicate
04:43
almost all the problems of print, except for searchability.
04:46
And when you improve searchability,
04:48
you actually take away the one advantage of print, which is serendipity.
04:51
Serendipity is when you find things you weren't looking for,
04:54
because finding what you are looking for is so damned difficult.
04:57
So -- (Laughter) (Applause) -- now, when you think about this,
05:06
what we have here is a ham butt problem.
05:09
Does everyone know the ham butt problem?
05:11
Woman's making a ham for a big family dinner.
05:13
She goes to cut the butt off the ham and throw it away,
05:15
and she looks at this piece of ham and she's like,
05:16
"This is a perfectly good piece of ham. Why am I throwing this away?"
05:18
She thought, "Well, my mom always did this."
05:20
So she calls up mom, and she says,
05:21
"Mom, why'd you cut the butt off the ham, when you're making a ham?"
05:23
She says, "I don't know, my mom always did it!"
05:26
So they call grandma, and grandma says,
05:28
"My pan was too small!" (Laughter)
05:32
So, it's not that we have good words and bad words.
05:36
We have a pan that's too small!
05:39
You know, that ham butt is delicious! There's no reason to throw it away.
05:41
The bad words -- see, when people think about a place
05:44
and they don't find a place on the map,
05:46
they think, "This map sucks!"
05:48
When they find a nightspot or a bar, and it's not in the guidebook,
05:50
they're like, "Ooh, this place must be cool! It's not in the guidebook."
05:53
When they find a word that's not in the dictionary, they think,
05:56
"This must be a bad word." Why? It's more likely to be a bad dictionary.
06:01
Why are you blaming the ham for being too big for the pan?
06:06
So, you can't get a smaller ham.
06:09
The English language is as big as it is.
06:12
So, if you have a ham butt problem,
06:14
and you're thinking about the ham butt problem,
06:16
the conclusion that it leads you to is inexorable and counterintuitive:
06:21
paper is the enemy of words.
06:24
How can this be? I mean, I love books. I really love books.
06:28
Some of my best friends are books.
06:30
But the book is not the best shape for the dictionary.
06:35
Now they're going to think "Oh, boy.
06:37
People are going to take away my beautiful, paper dictionaries?"
06:40
No. There will still be paper dictionaries.
06:42
When we had cars -- when cars became the dominant mode of transportation,
06:46
we didn't round up all the horses and shoot them.
06:49
You know, there're still going to be paper dictionaries,
06:51
but it's not going to be the dominant dictionary.
06:54
The book-shaped dictionary is not going to be the only shape
06:57
dictionaries come in. And it's not going to be
06:59
the prototype for the shapes dictionaries come in.
07:03
So, think about it this way: if you've got an artificial constraint,
07:07
artificial constraints lead to
07:11
arbitrary distinctions and a skewed worldview.
07:15
What if biologists could only study animals
07:18
that made people go, "Aww." Right?
07:20
What if we made aesthetic judgments about animals,
07:22
and only the ones we thought were cute were the ones that we could study?
07:27
We'd know a whole lot about charismatic megafauna,
07:31
and not very much about much else.
07:33
And I think this is a problem.
07:35
I think we should study all the words,
07:37
because when you think about words, you can make beautiful expressions
07:42
from very humble parts.
07:46
Lexicography is really more about material science.
07:50
We are studying the tolerances of the materials
07:53
that you use to build the structure of your expression:
07:56
your speeches and your writing. And then, often people say to me,
08:03
"Well, OK, how do I know that this word is real?"
08:08
They think, "OK, if we think words are the tools
08:15
that we use to build the expressions of our thoughts,
08:17
how can you say that screwdrivers are better than hammers?
08:20
How can you say that a sledgehammer is better than a ball-peen hammer?"
08:23
They're just the right tools for the job.
08:26
And so people say to me, "How do I know if a word is real?"
08:29
You know, anybody who's read a children's book
08:32
knows that love makes things real.
08:36
If you love a word, use it. That makes it real.
08:41
Being in the dictionary is an artificial distinction.
08:44
It doesn't make a word any more real than any other way.
08:47
If you love a word, it becomes real.
08:51
So if we're not worrying about directing traffic,
08:54
if we've transcended paper, if we are worrying less
08:59
about control and more about description,
09:03
then we can think of the English language
09:05
as being this beautiful mobile.
09:08
And any time one of those little parts of the mobile changes,
09:10
is touched, any time you touch a word,
09:13
you use it in a new context, you give it a new connotation,
09:15
you verb it, you make the mobile move.
09:18
You didn't break it. It's just in a new position,
09:22
and that new position can be just as beautiful.
09:25
Now, if you're no longer a traffic cop --
09:29
the problem with being a traffic cop is
09:31
there can only be so many traffic cops in any one intersection,
09:34
or the cars get confused. Right?
09:37
But if your goal is no longer to direct the traffic,
09:40
but maybe to count the cars that go by, then more eyeballs are better.
09:44
You can ask for help!
09:46
If you ask for help, you get more done. And we really need help.
09:50
Library of Congress: 17 million books,
09:53
of which half are in English.
09:56
If only one out of every 10 of those books
10:00
had a word that's not in the dictionary in it,
10:02
that would be equivalent to more than two unabridged dictionaries.
10:05
And I find an un-dictionaried word --
10:08
a word like "un-dictionaried," for example --
10:10
in almost every book I read. What about newspapers?
10:15
Newspaper archive goes back to 1759,
10:20
58.1 million newspaper pages. If only one in 100
10:25
of those pages had an un-dictionaried word on it,
10:28
it would be an entire other OED.
10:31
That's 500,000 more words. So that's a lot.
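
[Editorial note] The two estimates in this passage are simple back-of-envelope arithmetic, and a minimal sketch of them follows. All input figures are the ones quoted in the talk. The variable and function names are made up for illustration, and the sketch assumes each qualifying book or page contributes exactly one distinct word found nowhere else, which is a simplification the talk does not state.

```python
# Back-of-envelope check of the figures quoted in the talk.
# Names below are illustrative; the numbers come from the transcript.

LOC_BOOKS = 17_000_000        # Library of Congress books, per the talk
ENGLISH_ONE_IN = 2            # "of which half are in English"
BOOK_ONE_IN = 10              # one book in 10 has an un-dictionaried word

NEWSPAPER_PAGES = 58_100_000  # newspaper archive pages, per the talk
PAGE_ONE_IN = 100             # one page in 100 has an un-dictionaried word

OED_WORDS = 500_000           # the talk's round figure for "an entire other OED"


def estimated_new_words(items: int, one_in: int) -> int:
    """Assume one distinct new word per qualifying item (a simplification)."""
    return items // one_in


english_books = LOC_BOOKS // ENGLISH_ONE_IN
words_from_books = estimated_new_words(english_books, BOOK_ONE_IN)
words_from_pages = estimated_new_words(NEWSPAPER_PAGES, PAGE_ONE_IN)

print(f"English-language books:     {english_books:,}")
print(f"New words from books:       {words_from_books:,}")
print(f"New words from newspapers:  {words_from_pages:,}")
print(f"Newspaper words vs. an OED: {words_from_pages / OED_WORDS:.2f}x")
```

Under these assumptions the books yield 850,000 words and the newspaper pages 581,000, which is a little more than the 500,000 quoted for the OED. The claim of "more than two unabridged dictionaries" holds if an unabridged dictionary has fewer than 425,000 entries; that threshold is implied by the arithmetic and is not a figure given in the talk.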
10:36
And I'm not even talking about magazines. I'm not talking about blogs --
10:39
and I find more new words on BoingBoing in a given week
10:41
than I do Newsweek or Time.
10:43
There's a lot going on there.
10:45
And I'm not even talking about polysemy,
10:47
which is the greedy habit some words have of taking
10:50
more than one meaning for themselves.
10:55
So if you think of the word "set," a set can be a badger's burrow,
10:59
a set can be one of the pleats in an Elizabethan ruff,
11:02
and there's one numbered definition in the OED.
11:04
The OED has 33 different numbered definitions for set.
11:07
Tiny, little word, 33 numbered definitions.
11:10
One of them is just labeled "miscellaneous technical senses."
11:15
Do you know what that says to me?
11:16
That says to me, it was Friday afternoon and somebody wanted to go down the pub. (Laughter)
11:21
That's a lexicographical cop-out,
11:23
to say, "miscellaneous technical senses."
11:25
So, we have all these words, and we really need help!
11:29
And the thing is, we could ask for help --
11:32
asking for help's not that hard.
11:33
I mean, lexicography is not rocket science.
11:36
See, I just gave you a lot of words and a lot of numbers,
11:39
and this is more of a visual explanation.
11:41
If we think of the dictionary as being the map of the English language,
11:44
these bright spots are what we know about,
11:46
and the dark spots are where we are in the dark.
11:49
If that was the map of all the words in American English, we don't know very much.
11:54
And we don't even know the shape of the language.
11:57
If this was the dictionary -- if this was the map of American English --
12:00
look, we have a kind of lumpy idea of Florida,
12:03
but there's no California!
12:06
We're missing California from American English.
12:09
We just don't know enough, and we don't even know that we're missing California.
12:14
We don't even see that there's a gap on the map.
12:16
So again, lexicography is not rocket science.
12:19
But even if it were, rocket science is being done
12:22
by dedicated amateurs these days. You know?
12:26
It can't be that hard to find some words!
12:30
So, enough scientists in other disciplines
12:33
are really asking people to help, and they're doing a good job of it.
12:36
For instance, there's eBird, where amateur birdwatchers
12:38
can upload information about their bird sightings.
12:40
And then, ornithologists can go
12:42
and help track populations, migrations, etc.
12:45
And there's this guy, Mike Oates. Mike Oates lives in the U.K.
12:48
He's a director of an electroplating company.
12:52
He's found more than 140 comets.
12:55
He's found so many comets, they named a comet after him.
12:58
It's kind of out past Mars. It's a hike.
12:59
I don't think he's getting his picture taken there anytime soon.
13:01
But he found 140 comets without a telescope.
13:05
He downloaded data from the NASA SOHO satellite,
13:08
and that's how he found them.
13:10
If we can find comets without a telescope,
13:14
shouldn't we be able to find words?
13:16
Now, y'all know where I'm going with this.
13:18
Because I'm going to the Internet, which is where everybody goes.
13:21
And the Internet is great for collecting words,
13:23
because the Internet's full of collectors.
13:24
And this is a little-known technological fact about the Internet,
13:27
but the Internet is actually made up of words and enthusiasm.
13:30
And words and enthusiasm actually happen to be
13:35
the recipe for lexicography. Isn't that great?
13:38
So there are a lot of really good word-collecting sites out there right now,
13:42
but the problem with some of them is that they're not scientific enough.
13:44
They show the word, but they don't show any context.
13:47
Where did it come from? Who said it?
13:49
What newspaper was it in? What book?
13:51
Because a word is like an archaeological artifact.
13:55
If you don't know the provenance or the source of the artifact,
13:58
it's not science, it's a pretty thing to look at.
14:01
So a word without its source is like a cut flower.
14:04
You know, it's pretty to look at for a while, but then it dies.
14:08
It dies too fast.
14:09
So, this whole time I've been saying,
14:13
"The dictionary, the dictionary, the dictionary, the dictionary."
14:15
Not "a dictionary," or "dictionaries." And that's because,
14:18
well, people use the dictionary to stand for the whole language.
14:21
They use it synecdochically.
14:24
And one of the problems of knowing a word like "synecdochically"
14:27
is that you really want an excuse to say "synecdochically."
14:30
This whole talk has just been an excuse to get me to the point
14:32
where I could say "synecdochically" to all of you.
14:34
So I'm really sorry. But when you use a part of something --
14:37
like the dictionary is a part of the language,
14:39
or a flag stands for the United States, it's a symbol of the country --
14:44
then you're using it synecdochically.
14:48
But the thing is, we could make the dictionary the whole language.
14:52
If we get a bigger pan, then we can put all the words in.
14:56
We can put in all the meanings.
15:00
Doesn't everyone want more meaning in their lives?
15:04
And we can make the dictionary not just be a symbol of the language --
15:08
we can make it be the whole language.
15:11
You see, what I'm really hoping for is that my son,
15:13
who turns seven this month -- I want him to barely remember
15:16
that this is the form factor that dictionaries used to come in.
15:21
This is what dictionaries used to look like.
15:23
I want him to think of this kind of dictionary as an eight-track tape.
15:25
It's a format that died because it wasn't useful enough.
15:29
It wasn't really what people needed.
15:32
And the thing is, if we can put in all the words,
15:35
no longer have that artificial distinction between good and bad,
15:39
we can really describe the language like scientists.
15:42
We can leave the aesthetic judgments to the writers and the speakers.
15:44
If we can do that, then I can spend all my time fishing,
15:48
and I don't have to be a traffic cop anymore.
15:53
Thank you very much for your kind attention.