How to get better at video games, according to babies - Brian Christian

559,494 views ・ 2021-11-02

TED-Ed

Translator: Tomás Sosa. Reviewer: Sebastian Betti.

00:08

In 2013, a group of researchers at DeepMind in London

8871

4292

En 2013, un grupo de investigadores en DeepMind en Londres

00:13

had set their sights on a grand challenge.

13163

2666

había puesto su mirada en un gran desafío.

00:15

They wanted to create an AI system that could beat,

15996

3292

Quería crear un sistema de IA que pudiera vencer,

00:19

not just a single Atari game, but every Atari game.

19288

4833

no solo a un juego Atari, sino a todos.

00:24

They developed a system they called Deep Q Networks, or DQN,

24663

5166

Desarrollaron un sistema que llamaron Deep Q Networks o DQN,

00:29

and less than two years later, it was superhuman.

29829

3667

y menos de dos años después, ya era un superhumano.

00:33

DQN was getting scores 13 times better

33954

4167

DQN obtenía puntajes 13 veces mejor

00:38

than professional human games testers at “Breakout,”

38121

3541

que evaluadores profesionales de juegos en “Breakout”,

00:41

17 times better at “Boxing,” and 25 times better at “Video Pinball.”

41662

6334

17 veces mejor en “Boxing”, y 25 veces mejor en “Video Pinball”.

00:48

But there was one notable, and glaring, exception.

48162

3834

Pero había una excepción destacada y evidente.

00:52

When playing “Montezuma’s Revenge” DQN couldn’t score a single point,

52496

5791

Cuando jugaba a “Montezuma’s Revenge” DQN no podía anotar ni un solo punto,

00:58

even after playing for weeks.

58537

2625

incluso después de jugar durante semanas.

01:01

What was it that made this particular game so vexingly difficult for AI?

61412

5459

¿Qué fue lo que hizo que este juego fuera tan difícil para la IA?

01:07

And what would it take to solve it?

67204

2459

Y ¿qué haría falta para resolverlo?

01:10

Spoiler alert: babies.

70538

2833

Alerta de spoiler: bebés.

01:13

We’ll come back to that in a minute.

73746

2000

Volveremos a eso en un minuto.

01:16

Playing Atari games with AI involves what’s called reinforcement learning,

76163

5541

Jugar juegos Atari con IA involucra lo que se llama aprendizaje por refuerzo,

01:21

where the system is designed to maximize some kind of numerical rewards.

81871

4917

donde el sistema es diseñado para aumentar algún tipo de recompensa numérica.

01:26

In this case, those rewards were simply the game's points.

86788

3833

En este caso, esas recompensas eran simplemente los puntos del juego.

01:30

This underlying goal drives the system to learn which buttons to press

90746

4333

Este objetivo básico guía al sistema para aprender qué botones presionar

01:35

and when to press them to get the most points.

95079

3000

y cuándo apretarlos para obtener la mayor cantidad de puntos.

01:38

Some systems use model-based approaches, where they have a model of the environment

98079

5542

Algunos sistemas utilizan enfoques basados en modelos,

donde tienen un modelo del entorno

01:43

that they can use to predict what will happen next

103621

3125

que pueden utilizar para predecir lo que sucederá a continuación

01:46

once they take a certain action.

106746

2000

cuando realicen una acción determinada.

01:49

DQN, however, is model free.

109288

3041

Sin embargo, DQN es un modelo libre.

01:52

Instead of explicitly modeling its environment,

112704

2584

En lugar de modelar explícitamente su entorno,

01:55

it just learns to predict, based on the images on screen,

115288

3458

solo aprende a predecir, basado en las imágenes en pantalla,

01:58

how many future points it can expect to earn by pressing different buttons.

118746

4958

cuántos puntos puede esperar ganar pulsando diferentes botones.

02:03

For instance, “if the ball is here and I move left, more points,

123871

4792

Por ejemplo, “si la bola está aquí y se mueve hacia la izquierda, más puntos,

02:08

but if I move right, no more points.”

128663

2833

pero si se mueve hacia la derecha, no más puntos”.

02:12

But learning these connections requires a lot of trial and error.

132038

4500

Pero aprender estas conexiones requiere mucho de prueba y error.

02:16

The DQN system would start by mashing buttons randomly,

136704

3834

El sistema DQN empezó pulsando botones al azar,

02:20

and then slowly piece together which buttons to mash when

140538

3541

y luego lentamente fue resolviendo qué botones pulsar y cuándo

02:24

in order to maximize its score.

144079

2125

para aumentar su puntuación.
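
The trial-and-error loop described above can be sketched in a few lines. This is a minimal illustration, not DeepMind's implementation: DQN estimates future points with a deep neural network reading screen images, whereas this sketch swaps in a plain lookup table (tabular Q-learning) on a made-up five-cell corridor. The function name `train_q_table`, the corridor, and all parameter values are assumptions chosen for illustration.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    The agent starts in cell 0 and earns 1 point only on reaching the
    last cell. Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    # q[s][a] = estimated future points for taking action a in cell s.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the episode length
            # "Mash buttons randomly" some of the time, or when the
            # estimates give no reason to prefer either action.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == n_states - 1
            reward = 1.0 if done else 0.0
            # Target: points now plus discounted best estimate afterwards.
            target = reward if done else reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


q_table = train_q_table()
# Best-looking action in each non-terminal cell after training.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
print(greedy_policy)
```

Note that the agent only learns anything once random play happens to reach the rewarding cell. In a short corridor that happens quickly; the longer the required sequence, the less likely random play is to ever find it.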

02:26

But in playing “Montezuma’s Revenge,”

146704

2375

Pero jugando “Montezuma’s Revenge”,

02:29

this approach of random button-mashing fell flat on its face.

149079

4334

este enfoque de presionar botones al azar quedó en la nada.

02:34

A player would have to perform this entire sequence

154121

3000

Un jugador tuvo que realizar toda esta secuencia

02:37

just to score their first points at the very end.

157121

3375

solo para anotar sus primeros puntos al final.

02:40

A mistake? Game over.

160871

2208

¿Un error? Se acabó el juego.

02:43

So how could DQN even know it was on the right track?

163538

3708

Entonces, ¿cómo pudo saber DQN que estaba en el camino correcto?

02:47

This is where babies come in.

167746

2458

Aquí es donde aparecen los bebés.

02:50

In studies, infants consistently look longer at pictures

170746

3875

En estudios, los niños miran

sistemáticamente durante más tiempo las imágenes

02:54

they haven’t seen before than ones they have.

174621

2667

que no han visto antes que las que sí han visto.

02:57

There just seems to be something intrinsically rewarding about novelty.

177579

4000

Parece que hay algo intrínsecamente gratificante en la novedad.

03:02

This behavior has been essential in understanding the infant mind.

182121

4125

Este comportamiento ha sido esencial en la comprensión de la mente infantil.

03:06

It also turned out to be the secret to beating “Montezuma’s Revenge.”

186496

4792

También resultó ser el secreto para derrotar a “Montezuma’s Revenge”.

03:12

The DeepMind researchers worked out an ingenious way

192121

3708

Los investigadores de DeepMind elaboraron una ingeniosa manera

03:15

to plug this preference for novelty into reinforcement learning.

195829

4500

de incorporar esta preferencia por la novedad en el aprendizaje por refuerzo.

03:20

They made it so that unusual or new images appearing on the screen

200704

4542

Hicieron que las imágenes inusuales o nuevas que aparecían en pantalla

03:25

were every bit as rewarding as real in-game points.

205246

4208

fueran tan gratificantes como los puntos reales del juego.
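
One simple way to picture "new images are as rewarding as points" is a bonus that shrinks each time the same screen is seen again. The sketch below is an assumption for illustration only: the talk does not say how novelty was measured, and a system working from raw images cannot simply count exact repeats as done here. The names `NoveltyBonus` and `shaped_reward` and the 1/sqrt(count) formula are hypothetical choices.

```python
from collections import Counter


class NoveltyBonus:
    """Count-based novelty bonus (illustrative assumption).

    Each observation earns scale / sqrt(times seen so far), so a
    first visit earns the full bonus and repeats earn less and less.
    """

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = Counter()

    def __call__(self, observation):
        self.visits[observation] += 1
        return self.scale / self.visits[observation] ** 0.5


def shaped_reward(game_points, observation, bonus):
    """Reward the learner sees: real game points plus novelty bonus."""
    return game_points + bonus(observation)


bonus = NoveltyBonus()
first_visit = shaped_reward(0, "room_1", bonus)   # no points, new screen
second_visit = shaped_reward(0, "room_1", bonus)  # same screen again
new_room = shaped_reward(0, "room_2", bonus)      # a different screen
print(first_visit, second_visit, new_room)
```

With this shaping, the agent receives a learning signal for reaching a new room even when the game itself awards no points along the way.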

03:29

Suddenly, DQN was behaving totally differently from before.

209704

4709

De repente, DQN se estaba comportando totalmente diferente que antes.

03:34

It wanted to explore the room it was in,

214579

2334

Quería explorar la habitación en la que estaba,

03:36

to grab the key and escape through the locked door—

216913

2708

para agarrar la llave y escapar a través de la puerta cerrada...

03:39

not because it was worth 100 points,

219621

2708

no porque valiera 100 puntos,

03:42

but for the same reason we would: to see what was on the other side.

222329

4667

sino por la misma razón por la que lo haríamos nosotros:

para ver qué hay del otro lado.

03:48

With this new drive, DQN not only managed to grab that first key—

228163

5250

Con este nuevo accionar, DQN no solo lograba agarrar la primera llave...

03:53

it explored all the way through 15 of the temple’s 24 chambers.

233413

4833

también exploraba todo el camino a través de 15 de las 24 cámaras del templo.

03:58

But emphasizing novelty-based rewards can sometimes create more problems

238454

4209

Pero enfatizar en las recompensas con base en la novedad

a veces puede crear más problemas de los que resuelve.

04:02

than it solves.

242663

1166

04:03

A novelty-seeking system that’s played a game too long

243913

3208

Un sistema que busca la novedad que ha jugado un juego demasiado largo

04:07

will eventually lose motivation.

247121

2500

acabará perdiendo la motivación.

04:09

If it’s seen it all before, why go anywhere?

249996

3042

Si lo ha visto todo antes, ¿por qué ir a cualquier parte?

04:13

Alternately, if it encounters, say, a television, it will freeze.

253621

5167

En cambio, si se encuentra, por ejemplo, con un televisor, se congelará.

04:18

The constant novel images are essentially paralyzing.

258954

3750

Las imágenes constantes y novedosas son esencialmente paralizantes.
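
Both failure modes can be reproduced with a toy count-based novelty bonus. This is a sketch under stated assumptions, not the researchers' setup: the helper `total_bonus`, the three-room loop, and the "television" modeled as a stream of fresh random numbers are all hypothetical.

```python
import random
from collections import Counter


def total_bonus(frames):
    """Sum a 1/sqrt(visit count) novelty bonus over a sequence of frames."""
    visits = Counter()
    total = 0.0
    for frame in frames:
        visits[frame] += 1
        total += 1.0 / visits[frame] ** 0.5
    return total


rng = random.Random(0)

# Case 1: an agent pacing through the same three rooms, 300 steps.
familiar_rooms = ["room_%d" % (i % 3) for i in range(300)]

# Case 2: an agent sitting in front of a "television" that shows a
# fresh random frame at every step, 300 steps.
television = [rng.getrandbits(64) for _ in range(300)]

familiar_total = total_bonus(familiar_rooms)
television_total = total_bonus(television)
print(familiar_total, television_total)
```

The familiar rooms pay less and less per step, which is the "seen it all before" loss of motivation. The television never stops paying the full bonus, so a purely novelty-driven agent has no reason to leave it.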

04:23

The ideas and inspiration here go in both directions.

263204

3625

Las ideas e inspiraciones aquí van en ambas direcciones.

04:27

AI researchers stuck on a practical problem,

267079

3125

Los investigadores de IA atascados en un problema práctico,

04:30

like how to get DQN to beat a difficult game,

270204

3334

como hacer que DQN derrote a un juego complicado,

04:33

are turning increasingly to experts in human intelligence for ideas.

273538

5000

recurren cada vez más a expertos en inteligencia humana en busca de ideas.

04:38

At the same time,

278788

1125

Al mismo tiempo,

04:39

AI is giving us new insights into the ways we get stuck and unstuck:

279913

5416

la IA nos ofrece nuevos datos sobre las formas en que nos atascamos y desatascamos

04:45

into boredom, depression, and addiction,

285329

2792

en el aburrimiento, la depresión y la adicción,

04:48

along with curiosity, creativity, and play.

288121

3667

junto con la curiosidad, la creatividad y el juego.


Original video on YouTube.com
