How computers translate human language - Ioannis Papachimonas

2015-10-26

TED-Ed



Chinese translation: Ivy Wang. Reviewer: Gentian Pan
00:06
How is it that so many intergalactic species in movies and TV
00:11
just happen to speak perfect English?
00:14
The short answer is that no one wants to watch a starship crew
00:17
spend years compiling an alien dictionary.
00:21
But to keep things consistent,
00:23
the creators of Star Trek and other science-fiction worlds
00:26
have introduced the concept of a universal translator,
00:30
a portable device that can instantly translate between any languages.
00:35
So is a universal translator possible in real life?
00:38
We already have many programs that claim to do just that,
00:42
taking a word, sentence, or entire book in one language
00:45
and translating it into almost any other,
00:49
whether it's modern English or Ancient Sanskrit.
00:52
And if translation were just a matter of looking up words in a dictionary,
00:55
these programs would run circles around humans.
00:59
The reality, however, is a bit more complicated.
01:03
A rule-based translation program uses a lexical database,
01:07
which includes all the words you'd find in a dictionary
01:10
and all grammatical forms they can take,
01:13
and a set of rules to recognize the basic linguistic elements in the input language.
01:18
For a seemingly simple sentence like, "The children eat the muffins,"
01:22
the program first parses its syntax, or grammatical structure,
01:27
by identifying the children as the subject,
01:29
and the rest of the sentence as the predicate
01:32
consisting of a verb "eat,"
01:34
and a direct object "the muffins."
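The parsing step described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular translation program works: the `LEXICON` table, the tag names, and the single "everything before the verb is the subject" rule are all assumptions made for this example.

```python
# Toy lexical database: every entry and tag name is an illustrative assumption.
LEXICON = {
    "the": "DET",
    "children": "NOUN",
    "muffins": "NOUN",
    "eat": "VERB",
}


def parse(sentence):
    """Split a simple subject-verb-object sentence into its parts."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON[word] for word in words]
    # Single hand-written rule: everything before the first verb is the
    # subject; the verb and whatever follows it form the predicate.
    verb_at = tags.index("VERB")
    return {
        "subject": words[:verb_at],
        "verb": words[verb_at],
        "direct_object": words[verb_at + 1:],
    }


print(parse("The children eat the muffins."))
```

A real rule-based system would need many such rules, plus a way to handle words missing from the lexicon, which this sketch simply rejects with a `KeyError`.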
01:37
It then needs to recognize English morphology,
01:40
or how the language can be broken down into its smallest meaningful units,
01:44
such as the word "muffin"
01:46
and the suffix "s," used to indicate plural.
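As a rough sketch of that morphological step, the snippet below splits a word into a stem and a plural marker. The stem list, the `IRREGULAR` table, and the feature labels are hypothetical and cover only the words in the example sentence.

```python
# Hypothetical noun stems; a real lexical database would be far larger.
NOUN_STEMS = {"muffin", "child"}
# Irregular plurals cannot be found by stripping a suffix, so list them.
IRREGULAR = {"children": ("child", "PLURAL")}


def analyze(word):
    """Return (stem, feature) for a single English noun."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    # Regular rule: stem + "s" marks the plural.
    if word.endswith("s") and word[:-1] in NOUN_STEMS:
        return (word[:-1], "PLURAL")
    if word in NOUN_STEMS:
        return (word, "SINGULAR")
    return (word, "UNKNOWN")


print(analyze("muffins"))   # regular plural: stem plus the suffix "s"
print(analyze("children"))  # irregular plural, looked up directly
```

Note that "children" already needs a special case, which hints at why exceptions and irregularities are hard for rule-based programs.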
01:49
Finally, it needs to understand the semantics,
01:52
what the different parts of the sentence actually mean.
01:56
To translate this sentence properly,
01:58
the program would refer to a different set of vocabulary and rules
02:01
for each element of the target language.
02:05
But this is where it gets tricky.
02:07
The syntax of some languages allows words to be arranged in any order,
02:11
while in others, doing so could make the muffin eat the child.
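The word-order problem can be illustrated with two toy rules. The first assigns roles purely by position, as a fixed-order language would. The second reads each role from an explicit marker, which is one common way free-order languages signal who does what. The marker names (`NOM`, `ACC`, `VERB`) and both functions are assumptions made for this sketch.

```python
def roles_by_position(words):
    """Fixed-order rule: the three words are subject, verb, object."""
    subject, verb, obj = words
    return {"subject": subject, "verb": verb, "object": obj}


def roles_by_marker(tagged_words):
    """Free-order rule: read each role from its marker, ignoring position."""
    marker_to_role = {"NOM": "subject", "VERB": "verb", "ACC": "object"}
    return {marker_to_role[marker]: word for word, marker in tagged_words}


# Position decides the meaning, so reordering changes who eats whom.
print(roles_by_position(["children", "eat", "muffins"]))
print(roles_by_position(["muffins", "eat", "children"]))

# Markers decide the meaning, so the same reordering is harmless.
print(roles_by_marker([("muffins", "ACC"), ("eat", "VERB"), ("children", "NOM")]))
```

A translator that carries word order over unchanged between the two kinds of language would get the second case wrong.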
02:16
Morphology can also pose a problem.
02:19
Slovene distinguishes between two children and three or more
02:23
using a dual suffix absent in many other languages,
02:27
while Russian's lack of definite articles might leave you wondering
02:30
whether the children are eating some particular muffins,
02:33
or just eat muffins in general.
02:36
Finally, even when the semantics are technically correct,
02:39
the program might miss their finer points,
02:42
such as whether the children "mangiano" the muffins,
02:45
or "divorano" them.
02:47
Another method is statistical machine translation,
02:51
which analyzes a database of books, articles, and documents
02:55
that have already been translated by humans.
02:59
By finding matches between source and translated text
03:02
that are unlikely to occur by chance,
03:05
the program can identify corresponding phrases and patterns,
03:09
and use them for future translations.
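The idea of finding matches that are unlikely to occur by chance can be sketched with simple co-occurrence counts over a tiny parallel corpus. Everything here is an assumption made for illustration: the four English/Italian sentence pairs are invented, and the Dice coefficient is a deliberately simple stand-in for the more elaborate probabilistic models that real statistical systems use.

```python
from collections import Counter
from itertools import product

# Invented English/Italian sentence pairs standing in for a database of
# human-translated text.
PARALLEL = [
    ("the children eat", "i bambini mangiano"),
    ("the children sleep", "i bambini dormono"),
    ("the dogs eat", "i cani mangiano"),
    ("the dogs sleep", "i cani dormono"),
]


def learn_word_pairs(corpus):
    """Pair each source word with the target word it co-occurs with most."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))

    best = {}
    for s in src_count:
        # Dice coefficient: high only when the two words appear together
        # about as often as each of them appears at all.
        def dice(t, s=s):
            return 2 * pair_count[(s, t)] / (src_count[s] + tgt_count[t])

        best[s] = max(tgt_count, key=dice)
    return best


table = learn_word_pairs(PARALLEL)
print(table["eat"])       # mangiano
print(table["children"])  # bambini
```

With only four sentence pairs the matches are easy to find. With too few examples for a language or style of writing, the counts become unreliable, which is the dependence on database size that the video goes on to describe.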
03:12
However, the quality of this type of translation
03:14
depends on the size of the initial database
03:17
and the availability of samples for certain languages
03:21
or styles of writing.
03:23
The difficulty that computers have with the exceptions, irregularities
03:27
and shades of meaning that seem to come instinctively to humans
03:30
has led some researchers to believe that our understanding of language
03:35
is a unique product of our biological brain structure.
03:39
In fact, one of the most famous fictional universal translators,
03:43
the Babel fish from "The Hitchhiker's Guide to the Galaxy",
03:46
is not a machine at all but a small creature
03:49
that translates the brain waves and nerve signals of sentient species
03:54
through a form of telepathy.
03:57
For now, learning a language the old-fashioned way
03:59
will still give you better results than any currently available computer program.
04:05
But this is no easy task,
04:06
and the sheer number of languages in the world,
04:09
as well as the increasing interaction between the people who speak them,
04:12
will only continue to spur greater advances in automatic translation.
04:18
Perhaps by the time we encounter intergalactic life forms,
04:21
we'll be able to communicate with them through a tiny gizmo,
04:24
or we might have to start compiling that dictionary, after all.