How to get better at video games, according to babies - Brian Christian

559,494 views ・ 2021-11-02

TED-Ed


Translator: Margarida Ferreira Reviewer: Ana Sofia Ferreira

00:08

In 2013, a group of researchers at DeepMind in London


00:13

had set their sights on a grand challenge.


00:15

They wanted to create an AI system that could beat,


00:19

not just a single Atari game, but every Atari game.


00:24

They developed a system they called Deep Q Networks, or DQN,


00:29

and less than two years later, it was superhuman.


00:33

DQN was getting scores 13 times better


00:38

than professional human games testers at “Breakout,”


00:41

17 times better at “Boxing,” and 25 times better at “Video Pinball.”


00:48

But there was one notable, and glaring, exception.


00:52

When playing “Montezuma’s Revenge” DQN couldn’t score a single point,


00:58

even after playing for weeks.


01:01

What was it that made this particular game so vexingly difficult for AI?


01:07

And what would it take to solve it?


01:10

Spoiler alert: babies.


01:13

We’ll come back to that in a minute.


01:16

Playing Atari games with AI involves what’s called reinforcement learning,


01:21

where the system is designed to maximize some kind of numerical rewards.


01:26

In this case, those rewards were simply the game's points.


01:30

This underlying goal drives the system to learn which buttons to press


01:35

and when to press them to get the most points.


01:38

Some systems use model-based approaches, where they have a model of the environment


01:43

that they can use to predict what will happen next


01:46

once they take a certain action.


01:49

DQN, however, is model free.


01:52

Instead of explicitly modeling its environment,


01:55

it just learns to predict, based on the images on screen,


01:58

how many future points it can expect to earn by pressing different buttons.


02:03

For instance, “if the ball is here and I move left, more points,


02:08

but if I move right, no more points.”
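
A rough way to picture this kind of prediction is a table of expected future points, with one entry per situation and button. The sketch here is not DQN itself, which reads screen images with a neural network; it is tabular Q-learning on a made-up five-square corridor where walking right to the last square earns one point. The function name `learn_q_values`, the corridor, and all parameter values are illustrative assumptions.

```python
import random


def learn_q_values(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    Squares are numbered 0..n_states-1. Button 0 moves left, button 1
    moves right. Reaching the last square pays 1 point and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][button] = predicted future points for pressing that button.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            button = rng.randrange(2)  # press buttons at random while learning
            step = 1 if button == 1 else -1
            next_state = min(max(state + step, 0), goal)
            done = next_state == goal
            reward = 1.0 if done else 0.0
            # Best predicted future points from the next square (0 at the end).
            future = 0.0 if done else max(q[next_state])
            # Nudge the prediction toward "points now + discounted points later".
            q[state][button] += alpha * (reward + gamma * future - q[state][button])
            if done:
                break
            state = next_state
    return q


for state, (left, right) in enumerate(learn_q_values()):
    print(f"square {state}: left={left:.2f} right={right:.2f}")
```

In this toy setting the "right" estimate for each non-final square ends up higher than the "left" estimate, which is the table's version of "if I move this way, more points."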


02:12

But learning these connections requires a lot of trial and error.


02:16

The DQN system would start by mashing buttons randomly,


02:20

and then slowly piece together which buttons to mash when


02:24

in order to maximize its score.
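
One common way to move from random button-mashing to deliberate play is an epsilon-greedy rule: press a random button with probability epsilon, otherwise press the button with the highest predicted score, and shrink epsilon over time. This is a minimal sketch; the function names and the schedule values (1.0 down to 0.1 over 1,000 steps) are assumptions chosen for illustration.

```python
import random


def epsilon_schedule(step, start=1.0, end=0.1, decay_steps=1000):
    """Linearly lower the exploration rate from `start` to `end`."""
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)


def choose_button(predicted_points, epsilon, rng):
    """Pick a random button with probability epsilon, else the best-looking one."""
    if rng.random() < epsilon:
        return rng.randrange(len(predicted_points))
    return max(range(len(predicted_points)), key=lambda b: predicted_points[b])


rng = random.Random(0)
predicted_points = [0.1, 0.9, 0.3]  # hypothetical predictions for three buttons
for step in (0, 500, 1000):
    eps = epsilon_schedule(step)
    button = choose_button(predicted_points, eps, rng)
    print(f"step {step}: epsilon={eps:.2f}, pressed button {button}")
```

Early on nearly every press is random; by the end of the schedule most presses follow the learned predictions, with occasional random presses left in so the system keeps trying alternatives.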


02:26

But in playing “Montezuma’s Revenge,”


02:29

this approach of random button-mashing fell flat on its face.


02:34

A player would have to perform this entire sequence


02:37

just to score their first points at the very end.


02:40

A mistake? Game over.


02:43

So how could DQN even know it was on the right track?
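
Some back-of-the-envelope arithmetic suggests why random play gets nowhere when points only arrive at the end of a long sequence. Assume, purely for illustration, that exactly one button is correct at each step and that buttons are pressed uniformly at random. The 18-button count and the sequence lengths are hypothetical, not measured from the game.

```python
def chance_of_random_success(n_buttons, sequence_length):
    """Probability that uniform random presses reproduce one exact sequence."""
    return (1.0 / n_buttons) ** sequence_length


for length in (5, 10, 20):
    p = chance_of_random_success(n_buttons=18, sequence_length=length)
    print(f"{length:2d} correct presses in a row: about 1 in {1 / p:.3g} attempts")
```

Under these assumptions the odds shrink exponentially with the length of the sequence, so a system that only learns from points almost never sees a point to learn from.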


02:47

This is where babies come in.


02:50

In studies, infants consistently look longer at pictures


02:54

they haven’t seen before than ones they have.


02:57

There just seems to be something intrinsically rewarding about novelty.


03:02

This behavior has been essential in understanding the infant mind.


03:06

It also turned out to be the secret to beating “Montezuma’s Revenge.”


03:12

The DeepMind researchers worked out an ingenious way


03:15

to plug this preference for novelty into reinforcement learning.


03:20

They made it so that unusual or new images appearing on the screen


03:25

were every bit as rewarding as real in-game points.
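
One simple way to make novelty rewarding is to add a bonus that is large the first time something is seen and shrinks with each repeat. The sketch here is a stand-in, not the researchers' actual mechanism: it counts exact repeats of a label, whereas a real system would need some way to judge how unusual a whole screen image is. The class name `NoveltyBonus`, the `scale` parameter, and the 1/sqrt(count) formula are assumptions for illustration.

```python
import math
from collections import Counter


class NoveltyBonus:
    """Adds a count-based novelty bonus on top of the game's own points."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = Counter()  # how many times each observation was seen

    def reward(self, observation, game_points):
        self.visits[observation] += 1
        # First sighting earns the full bonus; repeats earn less and less.
        bonus = self.scale / math.sqrt(self.visits[observation])
        return game_points + bonus


novelty = NoveltyBonus()
for observation, points in [("room_1", 0), ("room_1", 0), ("room_2", 0), ("room_2", 100)]:
    total = novelty.reward(observation, points)
    print(f"{observation}: game points={points}, total reward={total:.2f}")
```

With a bonus like this, stepping into an unseen room pays off immediately, even when the game itself awards nothing for it.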


03:29

Suddenly, DQN was behaving totally differently from before.


03:34

It wanted to explore the room it was in,


03:36

to grab the key and escape through the locked door—


03:39

not because it was worth 100 points,


03:42

but for the same reason we would: to see what was on the other side.


03:48

With this new drive, DQN not only managed to grab that first key—


03:53

it explored all the way through 15 of the temple’s 24 chambers.


03:58

But emphasizing novelty-based rewards can sometimes create more problems


04:02

than it solves.


04:03

A novelty-seeking system that’s played a game too long


04:07

will eventually lose motivation.


04:09

If it’s seen it all before, why go anywhere?


04:13

Alternately, if it encounters, say, a television, it will freeze.


04:18

The constant novel images are essentially paralyzing.
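
Both failure modes can be seen in a toy calculation. Assume, as an illustration only, a novelty bonus of 1/sqrt(times seen): staring at one familiar room earns less and less, while a "television" that shows a never-repeating frame earns the full bonus on every step. The function name `total_novelty` and the bonus formula are assumptions, and random 64-bit numbers stand in for TV frames.

```python
import math
import random
from collections import Counter


def total_novelty(observations):
    """Sum a 1/sqrt(times seen) novelty bonus over a stream of observations."""
    visits = Counter()
    total = 0.0
    for obs in observations:
        visits[obs] += 1
        total += 1.0 / math.sqrt(visits[obs])
    return total


rng = random.Random(0)
steps = 1000
familiar_room = ["room_1"] * steps                        # nothing new, ever
television = [rng.getrandbits(64) for _ in range(steps)]  # a new frame each step

print(f"familiar room: {total_novelty(familiar_room):.1f} novelty over {steps} steps")
print(f"television:    {total_novelty(television):.1f} novelty over {steps} steps")
```

The familiar room's payoff per step fades toward zero, which is the "why go anywhere?" problem, while the television keeps paying in full, so a purely novelty-driven system has no reason to look away.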


04:23

The ideas and inspiration here go in both directions.


04:27

AI researchers stuck on a practical problem,


04:30

like how to get DQN to beat a difficult game,


04:33

are turning increasingly to experts in human intelligence for ideas.


04:38

At the same time,


04:39

AI is giving us new insights into the ways we get stuck and unstuck:


04:45

into boredom, depression, and addiction,


04:48

along with curiosity, creativity, and play.

