How to get better at video games, according to babies - Brian Christian

559,494 views ・ 2021-11-02

TED-Ed

Translator: Sanntint Tint Reviewer: Myo Aung

00:08

In 2013, a group of researchers at DeepMind in London

၂၀၁၃ ခုနှစ်မှာ လန်ဒန်မှာရှိတဲ့ DeepMind က သုတေသီတစ်စုဟာ

00:13

had set their sights on a grand challenge.

စိန်ခေါ်မှုကြီးတစ်ခုမှာ မရမနေ အားထုတ်ခဲ့တယ်။

00:15

They wanted to create an AI system that could beat,

သူတို့ဟာ Atari ဂိမ်းတစ်ခုတည်းကို သာမက Atari ဂိမ်းတိုင်းကို

00:19

not just a single Atari game, but every Atari game.

အနိုင်တိုက်နိုင်တဲ့ AI စနစ်တစ်ခုကို ဖန်တီးချင်ခဲ့တယ်။

00:24

They developed a system they called Deep Q Networks, or DQN,

Deep Q Networks (သို့) DQN လို့ ခေါ်တဲ့ စနစ်တစ်ခုကို တီထွင်ခဲ့ပြီး

00:29

and less than two years later, it was superhuman.

နောက် နှစ်နှစ် မရှိတရှိအချိန်မှာ ဒါက မဟာလူသားဖြစ်ခဲ့တယ်။

00:33

DQN was getting scores 13 times better

DQN က ကျွမ်းကျင် လူသား ဂိမ်း စမ်းသပ်သူတွေ ထက် Breakout မှာ ၁၃ ဆ၊

00:38

than professional human games testers at “Breakout,”

“Boxing”မှာ ၁၇ ဆ၊

00:41

17 times better at “Boxing,” and 25 times better at “Video Pinball.”

“Video Pinball”မှာ ၂၅ ဆ ပိုပြီး အမှတ်ကောင်းလာနေတယ်။

00:48

But there was one notable, and glaring, exception.

ဒါပေမဲ့ သတိပြုမိလောက်ပြီး ထင်ရှားနေတဲ့ ခြွင်းချက်တစ်ခုတော့ ရှိတယ်။

00:52

When playing “Montezuma’s Revenge” DQN couldn’t score a single point,

“Montezuma’s Revenge” ကို ကစားတဲ့အခါ DQN ဟာ အမှတ်တစ်မှတ်တောင် မရနိုင်ခဲ့ဘူး။

00:58

even after playing for weeks.

သီတင်းပတ်ချီကာ ကစားပြီးနောက်မှာတောင်ပါ။

01:01

What was it that made this particular game so vexingly difficult for AI?

AI အတွက် ဒီဂိမ်းကို ဦးနှောက်ခြောက်စရာ ခက်ခဲအောင် ဘာက ဖြစ်စေခဲ့တာလဲ။

01:07

And what would it take to solve it?

ဒါကို ဖြေရှင်းဖို့ လိုအပ်တာက ဘာဖြစ်မလဲ။

01:10

Spoiler alert: babies.

ကြိုတင် သတိပေးချက်။ ကလေးငယ်တွေပါ။

01:13

We’ll come back to that in a minute.

ဒီအကြောင်းကို ခဏနေရင် ပြန်လာမှာပါ။

01:16

Playing Atari games with AI involves what’s called reinforcement learning,

Atari ဂိမ်းတွေကို AI နဲ့အတူ ကစားခြင်းမှာ အားပေး သင်ယူခြင်းလို့ခေါ်တာပါဝင်တယ်။

01:21

where the system is designed to maximize some kind of numerical rewards.

ကိန်းဂဏန်းဆိုင်ရာ ဆုလာဘ်မျိုးတွေကို များ နိုင်သမျှ များအောင် ပုံစံထုတ်ထားတဲ့စနစ်ပါ။

01:26

In this case, those rewards were simply the game's points.

ဒီဖြစ်ရပ်မှာတော့ ဒီဆုလာဘ်တွေက ဂိမ်း ရမှတ်တွေပါ။

01:30

This underlying goal drives the system to learn which buttons to press

ဒီငုပ်နေတဲ့ ရည်ရွယ်ချက်က ဘယ်ခလုတ်တွေကို နှိပ်ပြီး အများဆုံး အမှတ်တွေရဖို့

01:35

and when to press them to get the most points.

ဘယ်အချိန် ဆိုတာကို သင်ယူဖို့ စနစ်ကို မောင်းနှင်တာပါ။

01:38

Some systems use model-based approaches, where they have a model of the environment

တချို့စနစ်တွေက ပုံစံအခြေပြု နည်းလမ်းတွေကို အသုံးပြုပြီး လုပ်ဆောင်မှုတစ်ခု

01:43

that they can use to predict what will happen next

လုပ်ပြီဆိုတာနဲ့ နောက်ဘာဖြစ်မယ်ဆိုတာကို ကြိုခန့်မှန်းဖို့

01:46

once they take a certain action.

သုံးနိုင်တဲ့ ပတ်ဝန်းကျင် ပုံစံတစ်ခုရှိတယ်။

01:49

DQN, however, is model-free.

DQN ကတော့ ပုံစံ ကင်းတယ်။

01:52

Instead of explicitly modeling its environment,

၎င်းရဲ့ဝန်းကျင်ကို ပြတ်သားစွာ ပုံစံချတာအစား

01:55

it just learns to predict, based on the images on screen,

စခရင်ပေါ်က ရုပ်ပုံတွေကို အခြေခံကာ မတူတဲ့ ခလုတ်တွေကို နှိပ်ခြင်းအားဖြင့်

01:58

how many future points it can expect to earn by pressing different buttons.

ရရှိဖို့ အနာဂတ် အမှတ် ဘယ်နှမှတ် မျှော်လင့် နိုင်တာကို ခန့်မှန်းဖို့ သင်ယူရုံပါ။

02:03

For instance, “if the ball is here and I move left, more points,

ဥပမာ “ဘောလုံးက ဒီမှာရှိပြီး ဘယ်ဘက်ကို ရွှေ့ရင် အမှတ်ပိုများပေမဲ့

02:08

but if I move right, no more points.”

ညာဘက်ကို ရွှေ့လိုက်ရင် အမှတ်ပိုမများတော့ဘူး။”
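
The "move left, more points" idea above can be sketched as a table of expected future points per (situation, button) pair. This is only a toy illustration: DQN itself learns these estimates with a neural network from screen images, while the lookup table, the state names, and the `alpha` and `gamma` values here are assumptions made up for the example.

```python
from collections import defaultdict

ACTIONS = ["left", "right"]  # hypothetical two-button game


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: nudge the estimate toward reward + discounted best future."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Every estimate starts at zero: the system knows nothing about the game yet.
q = defaultdict(float)

# "If the ball is here and I move left, more points..."
q_update(q, "ball_here", "left", reward=1.0, next_state="ball_there")
# "...but if I move right, no more points."
q_update(q, "ball_here", "right", reward=0.0, next_state="ball_there")

# After these two experiences the table prefers "left" in this situation.
best = max(ACTIONS, key=lambda a: q[("ball_here", a)])
```

Note that the table never predicts what the next screen will look like; it only scores buttons, which is what "model-free" means here.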

02:12

But learning these connections requires a lot of trial and error.

ဒါပေမဲ့ ဒီဆက်သွယ်မှုတွေကို သင်ယူခြင်းက စမ်းလိုက်၊ပြုပြင်လိုက် အများကြီးလိုအပ်တယ်။

02:16

The DQN system would start by mashing buttons randomly,

DQN စနစ်က ခလုတ်တွေကို ကျပန်း ဖိညှစ်ရင်း စတင်မှာဖြစ်ပြီး

02:20

and then slowly piece together which buttons to mash when

ဒီနောက် ၎င်းရဲ့အမှတ်ကို များနိုင်သမျှ များဖို့ ဘယ်အချိန် ဘယ်ခလုတ်ကို

02:24

in order to maximize its score.

ဖိညှစ်ဖို့ ဖြည်းဖြည်းချင်း အတူတူ ဆက်စပ်မှာပါ။
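
One common way to express "start by mashing buttons randomly, then slowly rely on what was learned" is an epsilon-greedy rule with a shrinking epsilon. The sketch below assumes that rule; the function names, the linear schedule, and the numbers are illustrative and not taken from the talk.

```python
import random


def epsilon_at(step, start=1.0, end=0.1, decay_steps=1000):
    """Linearly shrink the chance of a random button press as training goes on."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


def choose_action(q_values, epsilon, rng):
    """With probability epsilon press a random button, otherwise the best-scoring one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


rng = random.Random(0)
estimates = [0.0, 2.0, 1.0]  # hypothetical expected points for three buttons

early = choose_action(estimates, epsilon_at(0), rng)   # epsilon is 1.0: pure button-mashing
greedy = choose_action(estimates, 0.0, rng)            # epsilon is 0.0: always the best estimate
```

Random presses only help if some of them stumble onto points, which is why this kind of exploration struggles when the first reward sits at the end of a long, unforgiving sequence.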

02:26

But in playing “Montezuma’s Revenge,”

ဒါပေမဲ့ “Montezuma’s Revenge” ကစားရာမှာတော့

02:29

this approach of random button-mashing fell flat on its face.

ဒီကျပန်း ခလုတ် ဖိညှစ်တဲ့ နည်းလမ်းက လုံးဝကို ကျရှုံးသွားတယ်။

02:34

A player would have to perform this entire sequence

ကစားသမားတစ်ဦးဟာ အဆုံးနားက ပထမ အမှတ်တွေကို ရဖို့ကိုပဲ

02:37

just to score their first points at the very end.

အစဉ်တစ်ခုလုံးကို လုပ်ဆောင်ဖို့ လိုလိမ့်မယ်။

02:40

A mistake? Game over.

အမှားတစ်ခုလား။ ပွဲက ပြီးသွားပြီ။

02:43

So how could DQN even know it was on the right track?

ဒီတော့ လမ်းကြောင်းမှန် ရှိနေတယ်ဆိုတာကို DQN က ဘယ်လို သိနိုင်တာလဲ။

02:47

This is where babies come in.

ဒါက ကလေးတွေ ပါဝင်လာတဲ့ နေရာပါ။

02:50

In studies, infants consistently look longer at pictures

လေ့လာမှုတွေမှာ မွေးကင်းစ ကလေးတွေဟာ သူတို့ မြင်ဖူးတဲ့ ပုံတွေထက် အရင်က

02:54

they haven’t seen before than ones they have.

မမြင်ဖူးတာတွေကို တစ်သမတ်တည်း ပိုကြာကြာ ကြည့်ကြတယ်။

02:57

There just seems to be something intrinsically rewarding about novelty.

သစ်ဆန်းမှုနဲ့ ပတ်သက်ပြီး ပင်ကိုအားဖြင့် အကျိုးပြုတာ တစ်ခုခုရှိပုံရတယ်။

03:02

This behavior has been essential in understanding the infant mind.

ဒီအပြုအမူက ကလေး စိတ်ကို နားလည်ရာမှာ မုချလိုအပ်ပါတယ်။

03:06

It also turned out to be the secret to beating “Montezuma’s Revenge.”

ဒါက “Montezuma’s Revenge” ကို နိုင်တာမှာ လျှို့ဝှက်ချက်တစ်ခုလည်းဖြစ်သွားတယ်။

03:12

The DeepMind researchers worked out an ingenious way

DeepMind သုတေသီတွေက ဆန်းသစ်မှုကို လိုလားမှုကို အားပေးတဲ့ သင်ယူခြင်းထဲမှာ

03:15

to plug this preference for novelty into reinforcement learning.

ထည့်ဖို့ ထွင်ဉာဏ်ရှိတဲ့ နည်းလမ်းတစ်ခုနဲ့ တွက်ထုတ်တယ်။

03:20

They made it so that unusual or new images appearing on the screen

ဒါတွေက စခရင်မှာ ပေါ်နေတဲ့ ထူးခြားတဲ့ (သို့)ပုံရိပ်သစ်တွေဟာ ဂိမ်း အမှတ်တွေလိုပဲ

03:25

were every bit as rewarding as real in-game points.

ဆုလာဘ်လိုဖြစ်အောင် အပိုင်းအစတိုင်းကို လုပ်ပေးတယ်။
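
A rough way to picture "new images are as rewarding as points" is a bonus that is large the first time a screen is seen and shrinks with each repeat. The sketch below is a simplified stand-in, not the researchers' actual method: it assumes screens can be reduced to a hashable key and counted exactly, and the `NoveltyBonus` class, its `beta` weight, and the 1/sqrt(count) formula are assumptions chosen for illustration.

```python
import math
from collections import Counter


class NoveltyBonus:
    """Adds a bonus to the game's points that fades as a screen becomes familiar."""

    def __init__(self, beta=1.0):
        self.beta = beta          # how much novelty counts relative to game points
        self.counts = Counter()   # how many times each screen key has been seen

    def reward(self, screen_key, game_points):
        self.counts[screen_key] += 1
        bonus = self.beta / math.sqrt(self.counts[screen_key])
        return game_points + bonus


novelty = NoveltyBonus(beta=1.0)
first_visit = novelty.reward("room_1", game_points=0.0)   # unseen screen: full bonus
second_visit = novelty.reward("room_1", game_points=0.0)  # same screen again: smaller bonus
new_room = novelty.reward("room_2", game_points=0.0)      # a different screen: full bonus again
```

With a signal like this, reaching an unseen room pays off even when the game itself awards nothing, so the learner gets feedback long before the first real points.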

03:29

Suddenly, DQN was behaving totally differently from before.

ရုတ်တရက် DQN အရင်တုန်းကနဲ့ ခြားနားစွာ ပြုမူနေခဲ့တယ်။

03:34

It wanted to explore the room it was in,

၎င်းရှိခဲ့တဲ့ နေရာကို စူးစမ်းချင်ခဲ့တယ်။

03:36

to grab the key and escape through the locked door—

သော့ကို ဆုပ်ကိုင်ပြီး ပိတ်ထားတဲ့ တံခါးကနေ လွတ်ချင်တာက

03:39

not because it was worth 100 points,

အမှတ် ၁၀၀ တန်တာကြောင့်မဟုတ်ဘဲ

03:42

but for the same reason we would: to see what was on the other side.

အခြားတစ်ဘက်မှာရှိတာကို မြင်ချင်တဲ့ လူတွေ မှာရှိမယ့် တူညီတဲ့အကြောင်းပြချက်ကြောင့်ပါ။

03:48

With this new drive, DQN not only managed to grab that first key—

ဒီတွန်းအား အသစ်နဲ့အတူ DQN ဟာ ပထမဆုံး သော့ကို ဆုပ်ကိုင်နိုင်ရုံသာမက

03:53

it explored all the way through 15 of the temple’s 24 chambers.

ကျောင်းတော်ရဲ့ ခန်းမဆောင် ၂၄ ခုအနက် ၁၅ ခုကို တောက်လျှောက် စူးစမ်းခဲ့တယ်။

03:58

But emphasizing novelty-based rewards can sometimes create more problems

ဒါပေမဲ့ ဆန်းသစ်မှု အခြေခံ ဆုလာဘ်တွေကို အသားပေးတာက ပြဿနာတွေ ဖြေရှင်းတာထက် တစ်ခါတလေ

04:02

than it solves.

ပိုဖန်တီးနိုင်တယ်။

04:03

A novelty-seeking system that’s played a game too long

ဂိမ်းတစ်ခုကို အကြာကြီး ကစားတဲ့ ဆန်းသစ်မှု ရှာဖွေတဲ့ စနစ်တစ်ခုဟာ

04:07

will eventually lose motivation.

နောက်ဆုံးမှာ စိတ်ပါဝင်စားမှု ပျောက်ဆုံးမှာပါ။

04:09

If it’s seen it all before, why go anywhere?

အားလုံးကို မြင်ဖူးပြီဆိုရင် ဘာကြောင့် တစ်နေရာရာကို သွားတာလဲ။

04:13

Alternately, if it encounters, say, a television, it will freeze.

အပြောင်းအလဲအနေနဲ့ ဆိုပါတော့ ရုပ်သံတစ်ခုက ဒါကို ကြုံတွေ့ရရင် ရပ်သွားမယ်။

04:18

The constant novel images are essentially paralyzing.

တရစပ် ပုံရိပ်သစ်တွေဟာ အခြေခံအားဖြင့် တုံ့ဆိုင်းနေတာပါ။

04:23

The ideas and inspiration here go in both directions.

စိတ်ကူးတွေနဲ့ စေ့ဆော်မှု နှစ်ခုစလုံး ဒီမှာ ဦးတည်ရာ နှစ်ခုစလုံးကို သွားတယ်။

04:27

AI researchers stuck on a practical problem,

DQN ကို ခက်ခဲတဲ့ ဂိမ်းတစ်ခုကို အနိုင်တိုက်ခိုင်းတာမျိုးလို

04:30

like how to get DQN to beat a difficult game,

AI သုတေသီတွေဟာ လက်တွေ့ ပြဿနာတစ်ခုမှာ တစ်နေတယ်

04:33

are turning increasingly to experts in human intelligence for ideas.

စိတ်ကူးတွေတွက် လူသား ဉာဏ်ရည်မှာ ကျွမ်းကျင်သူတွေကို တိုးကာ ပြောင်းနေတာပါ။

04:38

At the same time,

တစ်ချိန်တည်းမှာ

04:39

AI is giving us new insights into the ways we get stuck and unstuck:

AI က ကျွန်တော်တို့ တစ်နေပြီး လွတ်သွားပုံ တွေမှာ ထိုးထွင်းအမြင် အသစ်တွေကို ပေးနေတယ်။

04:45

into boredom, depression, and addiction,

သိလိုမှု၊ဖန်တီးမှု၊ ကစားမှုနဲ့အတူပါတဲ့

04:48

along with curiosity, creativity, and play.

ငြီးငွေ့မှု၊စိတ်ဓာတ်ကျမှု၊ စွဲလန်းမှုပါ။

Original video on YouTube.com

How to get better at video games, according to babies - Brian Christian - YouTube
