How to get better at video games, according to babies - Brian Christian

559,494 views ・ 2021-11-02

TED-Ed

Japanese translation: sola watanabe; review: Tomoyuki Suzuki

00:08

In 2013, a group of researchers at DeepMind in London

00:13

had set their sights on a grand challenge.

00:15

They wanted to create an AI system that could beat,

00:19

not just a single Atari game, but every Atari game.

00:24

They developed a system they called Deep Q Networks, or DQN,

00:29

and less than two years later, it was superhuman.

00:33

DQN was getting scores 13 times better

00:38

than professional human games testers at “Breakout,”

00:41

17 times better at “Boxing,” and 25 times better at “Video Pinball.”

00:48

But there was one notable, and glaring, exception.

00:52

When playing “Montezuma’s Revenge” DQN couldn’t score a single point,

00:58

even after playing for weeks.

01:01

What was it that made this particular game so vexingly difficult for AI?

01:07

And what would it take to solve it?

01:10

Spoiler alert: babies.

01:13

We’ll come back to that in a minute.

01:16

Playing Atari games with AI involves what’s called reinforcement learning,

01:21

where the system is designed to maximize some kind of numerical rewards.

01:26

In this case, those rewards were simply the game's points.

01:30

This underlying goal drives the system to learn which buttons to press

01:35

and when to press them to get the most points.

01:38

Some systems use model-based approaches, where they have a model of the environment

01:43

that they can use to predict what will happen next

01:46

once they take a certain action.

01:49

DQN, however, is model free.

01:52

Instead of explicitly modeling its environment,

01:55

it just learns to predict, based on the images on screen,

01:58

how many future points it can expect to earn by pressing different buttons.

02:03

For instance, “if the ball is here and I move left, more points,

02:08

but if I move right, no more points.”
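Editor's sketch: the "if the ball is here and I move left, more points" idea can be illustrated with a tabular Q-learning update. This is a simplification under stated assumptions, not DeepMind's implementation: DQN estimates these values with a neural network reading screen images, whereas this sketch uses a plain lookup table. The state names, the two-button action set, and the values of `alpha` and `gamma` are all hypothetical.

```python
from collections import defaultdict

ACTIONS = ["left", "right"]  # hypothetical button set, for illustration only


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Nudge q[(state, action)] toward reward + discounted best future value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Unseen (state, action) pairs start at 0 expected future points.
q = defaultdict(float)

# "If the ball is here and I move left, more points..."
q_update(q, "ball_here", "left", reward=1.0, next_state="ball_gone")
# "...but if I move right, no more points."
q_update(q, "ball_here", "right", reward=0.0, next_state="ball_gone")

# The learned preference: the button with the highest predicted points.
best = max(ACTIONS, key=lambda a: q[("ball_here", a)])
```

After these two updates the table prefers `left` in the `ball_here` state, purely from observed points and without any model of how the game works.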

02:12

But learning these connections requires a lot of trial and error.

02:16

The DQN system would start by mashing buttons randomly,

02:20

and then slowly piece together which buttons to mash when

02:24

in order to maximize its score.
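Editor's sketch: "start by mashing buttons randomly, then slowly piece together which buttons to mash" is commonly implemented as epsilon-greedy exploration with a decaying epsilon. The function names, the linear schedule, and the specific numbers here are illustrative assumptions rather than details from the talk.

```python
import random


def choose_button(q_values, epsilon, rng=random):
    """Pick a random button with probability epsilon, else the best-known one.

    q_values maps each button name to its current predicted future points.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)


def epsilon_at(step, start=1.0, end=0.1, decay_steps=1000):
    """Linearly anneal epsilon from start to end over decay_steps steps."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


predicted = {"left": 0.2, "right": 0.7}      # hypothetical predictions
early = choose_button(predicted, epsilon_at(0))   # epsilon = 1.0: pure mashing
greedy = choose_button(predicted, epsilon=0.0)    # no exploration: best button
```

With epsilon at 1.0 every press is random. As epsilon shrinks, the agent leans more and more on what it has learned. In a game like "Montezuma's Revenge", where random presses almost never reach the first reward, there is nothing for the learned values to latch onto.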

02:26

But in playing “Montezuma’s Revenge,”

02:29

this approach of random button-mashing fell flat on its face.

02:34

A player would have to perform this entire sequence

02:37

just to score their first points at the very end.

02:40

A mistake? Game over.

02:43

So how could DQN even know it was on the right track?

02:47

This is where babies come in.

02:50

In studies, infants consistently look longer at pictures

02:54

they haven’t seen before than ones they have.

02:57

There just seems to be something intrinsically rewarding about novelty.

03:02

This behavior has been essential in understanding the infant mind.

03:06

It also turned out to be the secret to beating “Montezuma’s Revenge.”

03:12

The DeepMind researchers worked out an ingenious way

03:15

to plug this preference for novelty into reinforcement learning.

03:20

They made it so that unusual or new images appearing on the screen

03:25

were every bit as rewarding as real in-game points.
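Editor's sketch: one simple way to make new images "rewarding" is a count-based novelty bonus added to the game's points. The talk does not spell out the formula the researchers used, so everything here is an assumption for illustration: the class name `NoveltyBonus`, the `beta / sqrt(count)` form, and the idea of identifying each screen by a hashable key such as `"room_1"`. Real screens would need some way to judge similarity between images rather than exact matches.

```python
import math
from collections import Counter


class NoveltyBonus:
    """Adds a bonus that shrinks each time the same screen is seen again."""

    def __init__(self, beta=1.0):
        self.beta = beta          # how much novelty is worth relative to points
        self.counts = Counter()   # visits per screen key

    def reward(self, screen_key, game_points):
        """Return game points plus a novelty bonus for this screen."""
        self.counts[screen_key] += 1
        bonus = self.beta / math.sqrt(self.counts[screen_key])
        return game_points + bonus


novelty = NoveltyBonus()
first_visit = novelty.reward("room_1", game_points=0)    # brand-new screen
second_visit = novelty.reward("room_1", game_points=0)   # already seen once
new_room = novelty.reward("room_2", game_points=0)       # new again
```

Even with zero game points, a first visit to a new room yields a full bonus, so the agent has a reason to reach the key and the door long before the game itself pays out.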

03:29

Suddenly, DQN was behaving totally differently from before.

03:34

It wanted to explore the room it was in,

03:36

to grab the key and escape through the locked door—

03:39

not because it was worth 100 points,

03:42

but for the same reason we would: to see what was on the other side.

03:48

With this new drive, DQN not only managed to grab that first key—

03:53

it explored all the way through 15 of the temple’s 24 chambers.

03:58

But emphasizing novelty-based rewards can sometimes create more problems

04:02

than it solves.

04:03

A novelty-seeking system that’s played a game too long

04:07

will eventually lose motivation.

04:09

If it’s seen it all before, why go anywhere?

04:13

Alternately, if it encounters, say, a television, it will freeze.

04:18

The constant novel images are essentially paralyzing.
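Editor's sketch: both failure modes can be reproduced with a toy count-based bonus. The bonus formula, the function name `total_bonus`, and the observation labels are hypothetical and chosen only to illustrate the point; they are not taken from the talk.

```python
import math
from collections import Counter


def total_bonus(observations, beta=1.0):
    """Sum a count-based novelty bonus (beta / sqrt(visits)) over a sequence."""
    counts = Counter()
    total = 0.0
    for obs in observations:
        counts[obs] += 1
        total += beta / math.sqrt(counts[obs])
    return total


# Seen it all before: the same room 100 times, so each visit pays less.
familiar_room = ["room_1"] * 100
# A "television": every frame is different, so every frame pays the full bonus.
television = [f"tv_frame_{i}" for i in range(100)]

room_total = total_bonus(familiar_room)
tv_total = total_bonus(television)
```

Over 100 steps the television pays several times what the familiar room does. The room's bonus keeps shrinking toward zero, which is the loss of motivation, while a purely novelty-driven agent has every reason to stay in front of the screen.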

04:23

The ideas and inspiration here go in both directions.

04:27

AI researchers stuck on a practical problem,

04:30

like how to get DQN to beat a difficult game,

04:33

are turning increasingly to experts in human intelligence for ideas.

04:38

At the same time,

04:39

AI is giving us new insights into the ways we get stuck and unstuck:

04:45

into boredom, depression, and addiction,

04:48

along with curiosity, creativity, and play.
