How to get better at video games, according to babies - Brian Christian

559,494 views ・ 2021-11-02

TED-Ed

Translator: avesta masoud Reviewer: Daban Q. Jaff

00:08

In 2013, a group of researchers at DeepMind in London

00:13

had set their sights on a grand challenge.

00:15

They wanted to create an AI system that could beat,

00:19

not just a single Atari game, but every Atari game.

00:24

They developed a system they called Deep Q Networks, or DQN,

00:29

and less than two years later, it was superhuman.

00:33

DQN was getting scores 13 times better

00:38

than professional human games testers at “Breakout,”

00:41

17 times better at “Boxing,” and 25 times better at “Video Pinball.”

00:48

But there was one notable, and glaring, exception.

00:52

When playing “Montezuma’s Revenge” DQN couldn’t score a single point,

00:58

even after playing for weeks.

01:01

What was it that made this particular game so vexingly difficult for AI?

01:07

And what would it take to solve it?

01:10

Spoiler alert: babies.

01:13

We’ll come back to that in a minute.

01:16

Playing Atari games with AI involves what’s called reinforcement learning,

01:21

where the system is designed to maximize some kind of numerical rewards.

01:26

In this case, those rewards were simply the game's points.

01:30

This underlying goal drives the system to learn which buttons to press

01:35

and when to press them to get the most points.

01:38

Some systems use model-based approaches, where they have a model of the environment

01:43

that they can use to predict what will happen next

01:46

once they take a certain action.

01:49

DQN, however, is model free.

01:52

Instead of explicitly modeling its environment,

01:55

it just learns to predict, based on the images on screen,

01:58

how many future points it can expect to earn by pressing different buttons.

02:03

For instance, “if the ball is here and I move left, more points,

02:08

but if I move right, no more points.”
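
As a rough illustration of the idea above, the prediction of "future points per button" can be sketched with a small lookup table instead of the neural network DQN uses. Everything here is an assumption made for illustration: the state names, the two-button action set, and the learning-rate and discount values are hypothetical, and the update shown is textbook tabular Q-learning, not DeepMind's code.

```python
from collections import defaultdict

ACTIONS = ["left", "right"]  # hypothetical button set


def greedy_action(q, state):
    """Pick the button with the highest predicted future points."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: move the estimate toward
    reward + discounted value of the best next action."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)  # every unseen (state, action) pair starts at 0.0
# "If the ball is here and I move left, more points..."
q_update(q, "ball_here", "left", reward=1.0, next_state="ball_gone")
# "...but if I move right, no more points."
q_update(q, "ball_here", "right", reward=0.0, next_state="ball_gone")

print(greedy_action(q, "ball_here"))
```

After these two experiences the table rates "left" above "right" for that state, so the greedy choice is "left". DQN replaces the table with a network that reads the screen image, but the quantity being learned is the same kind of estimate.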

02:12

But learning these connections requires a lot of trial and error.

02:16

The DQN system would start by mashing buttons randomly,

02:20

and then slowly piece together which buttons to mash when

02:24

in order to maximize its score.
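
The shift from random button-mashing to deliberate play is commonly implemented with an epsilon-greedy rule: press a random button with probability epsilon, otherwise press the button currently predicted to be best, and shrink epsilon over time. The sketch below assumes that scheme; the function names and the schedule values (1.0 down to 0.1 over 1,000 steps) are hypothetical and are not taken from the talk.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon press a random button,
    otherwise press the one with the highest estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_at(step, start=1.0, end=0.1, decay_steps=1000):
    """Linearly anneal epsilon from start to end over decay_steps,
    then hold it at end."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


rng = random.Random(0)
estimates = [0.1, 0.7, 0.3]  # hypothetical predicted points for 3 buttons
early = epsilon_greedy(estimates, epsilon_at(0), rng)      # pure mashing
late = epsilon_greedy(estimates, 0.0, rng)                 # pure exploitation
print(early, late)
```

With epsilon at 1.0 every press is random; with epsilon at 0.0 the agent always picks the button with the highest estimate. If no random press ever reaches a reward, as in "Montezuma's Revenge," the estimates never change and this scheme has nothing to exploit.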

02:26

But in playing “Montezuma’s Revenge,”

02:29

this approach of random button-mashing fell flat on its face.

02:34

A player would have to perform this entire sequence

02:37

just to score their first points at the very end.

02:40

A mistake? Game over.

02:43

So how could DQN even know it was on the right track?

02:47

This is where babies come in.

02:50

In studies, infants consistently look longer at pictures

02:54

they haven’t seen before than ones they have.

02:57

There just seems to be something intrinsically rewarding about novelty.

03:02

This behavior has been essential in understanding the infant mind.

03:06

It also turned out to be the secret to beating “Montezuma’s Revenge.”

03:12

The DeepMind researchers worked out an ingenious way

03:15

to plug this preference for novelty into reinforcement learning.

03:20

They made it so that unusual or new images appearing on the screen

03:25

were every bit as rewarding as real in-game points.
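
One simple way to picture "new images are as rewarding as points" is a count-based bonus: keep a tally of how often each screen has been seen and add a bonus that is largest on the first sighting. The talk does not describe the actual mechanism, so the sketch below is a simplified stand-in. It assumes each screen can be reduced to a hashable key (here a made-up room label), and the class name, the `beta` scale, and the 1/sqrt(count) formula are illustrative choices.

```python
import math
from collections import Counter


class NoveltyBonus:
    """Count-based novelty bonus: beta / sqrt(times this observation was seen)."""

    def __init__(self, beta=1.0):
        self.beta = beta          # hypothetical scale of the bonus
        self.counts = Counter()   # visits per observation key

    def bonus(self, observation):
        self.counts[observation] += 1
        return self.beta / math.sqrt(self.counts[observation])


def shaped_reward(game_points, observation, novelty):
    """Reward the learner sees: in-game points plus the novelty bonus."""
    return game_points + novelty.bonus(observation)


novelty = NoveltyBonus(beta=1.0)
first_visit = shaped_reward(0, "room_1", novelty)     # no points, new screen
second_visit = shaped_reward(0, "room_1", novelty)    # same screen again
key_pickup = shaped_reward(100, "room_2", novelty)    # points plus a new screen
print(first_visit, second_visit, key_pickup)
```

A screen that scores no points still pays out on the first visit, which gives the learner a signal long before the first in-game points arrive. Real Atari frames almost never repeat exactly, so a practical system needs some notion of similarity between screens rather than exact counts.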

03:29

Suddenly, DQN was behaving totally differently from before.

03:34

It wanted to explore the room it was in,

03:36

to grab the key and escape through the locked door—

03:39

not because it was worth 100 points,

03:42

but for the same reason we would: to see what was on the other side.

03:48

With this new drive, DQN not only managed to grab that first key—

03:53

it explored all the way through 15 of the temple’s 24 chambers.

03:58

But emphasizing novelty-based rewards can sometimes create more problems

04:02

than it solves.

04:03

A novelty-seeking system that’s played a game too long

04:07

will eventually lose motivation.

04:09

If it’s seen it all before, why go anywhere?

04:13

Alternately, if it encounters, say, a television, it will freeze.

04:18

The constant novel images are essentially paralyzing.
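
Both failure modes can be reproduced with a toy count-based bonus of 1/sqrt(visits), which is an assumed formula for illustration and not something stated in the talk. A room seen over and over pays less each time, while a "television" that shows a fresh random image on every step never stops paying. The television is modeled here, hypothetically, as a stream of random 64-bit numbers.

```python
import math
import random
from collections import Counter


def total_bonus(observations):
    """Sum a 1/sqrt(visit count) novelty bonus over a stream of observations."""
    counts = Counter()
    total = 0.0
    for obs in observations:
        counts[obs] += 1
        total += 1.0 / math.sqrt(counts[obs])
    return total


rng = random.Random(0)
steps = 1000

# Seen it all before: the same room on every step, so the bonus keeps shrinking.
familiar_room = ["room"] * steps

# The television: a different random image on every step, so every step
# looks brand new and earns close to the full bonus.
television = [rng.getrandbits(64) for _ in range(steps)]

print(total_bonus(familiar_room), total_bonus(television))
```

Watching the television earns far more bonus than staying in the familiar room, so a learner driven only by this signal has every reason to sit in front of the screen, and little reason to move once everything else has been counted many times.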

04:23

The ideas and inspiration here go in both directions.

04:27

AI researchers stuck on a practical problem,

04:30

like how to get DQN to beat a difficult game,

04:33

are turning increasingly to experts in human intelligence for ideas.

04:38

At the same time,

04:39

AI is giving us new insights into the ways we get stuck and unstuck:

04:45

into boredom, depression, and addiction,

04:48

along with curiosity, creativity, and play.
