How computers are learning to be creative | Blaise Agüera y Arcas

457,771 views ・ 2016-07-22

TED

00:12

So, I lead a team at Google that works on machine intelligence;

12800

3124

00:15

in other words, the engineering discipline of making computers and devices

15948

4650

00:20

able to do some of the things that brains do.

20622

2419

00:23

And this makes us interested in real brains

23439

3099

00:26

and neuroscience as well,

26562

1289

00:27

and especially interested in the things that our brains do

27875

4172

00:32

that are still far superior to the performance of computers.

32071

4042

00:37

Historically, one of those areas has been perception,

37209

3609

00:40

the process by which things out there in the world --

40842

3039

00:43

sounds and images --

43905

1584

00:45

can turn into concepts in the mind.

45513

2178

00:48

This is essential for our own brains,

48235

2517

00:50

and it's also pretty useful on a computer.

50776

2464

00:53

The machine perception algorithms, for example, that our team makes,

53636

3350

00:57

are what enable your pictures on Google Photos to become searchable,

57010

3874

01:00

based on what's in them.

60908

1397

01:03

The flip side of perception is creativity:

63594

3493

01:07

turning a concept into something out there in the world.

67111

3038

01:10

So over the past year, our work on machine perception

70173

3555

01:13

has also unexpectedly connected with the world of machine creativity

73752

4859

01:18

and machine art.

78635

1160

01:20

I think Michelangelo had a penetrating insight

80556

3284

01:23

into this dual relationship between perception and creativity.

83864

3656

01:28

This is a famous quote of his:

88023

2006

01:30

"Every block of stone has a statue inside of it,

90053

3323

01:34

and the job of the sculptor is to discover it."

94036

3002

01:38

So I think that what Michelangelo was getting at

98029

3216

01:41

is that we create by perceiving,

101269

3180

01:44

and that perception itself is an act of imagination

104473

3023

01:47

and is the stuff of creativity.

107520

2461

01:50

The organ that does all the thinking and perceiving and imagining,

110691

3925

01:54

of course, is the brain.

114640

1588

01:57

And I'd like to begin with a brief bit of history

117089

2545

01:59

about what we know about brains.

119658

2302

02:02

Because unlike, say, the heart or the intestines,

122496

2446

02:04

you really can't say very much about a brain by just looking at it,

124966

3144

02:08

at least with the naked eye.

128134

1412

02:09

The early anatomists who looked at brains

129983

2416

02:12

gave the superficial structures of this thing all kinds of fanciful names,

132423

3807

02:16

like hippocampus, meaning "little shrimp."

136254

2433

02:18

But of course that sort of thing doesn't tell us very much

138711

2764

02:21

about what's actually going on inside.

141499

2318

02:24

The first person who, I think, really developed some kind of insight

144780

3613

02:28

into what was going on in the brain

148417

1930

02:30

was the great Spanish neuroanatomist, Santiago Ramón y Cajal,

150371

3920

02:34

in the 19th century,

154315

1544

02:35

who used microscopy and special stains

155883

3755

02:39

that could selectively fill in or render in very high contrast

159662

4170

02:43

the individual cells in the brain,

163856

2008

02:45

in order to start to understand their morphologies.

165888

3154

02:49

And these are the kinds of drawings that he made of neurons

169972

2891

02:52

in the 19th century.

172887

1209

02:54

This is from a bird brain.

174120

1884

02:56

And you see this incredible variety of different sorts of cells --

176028

3057

02:59

even the cellular theory itself was quite new at this point.

179109

3435

03:02

And these structures,

182568

1278

03:03

these cells that have these arborizations,

183870

2259

03:06

these branches that can go very, very long distances --

186153

2608

03:08

this was very novel at the time.

188785

1616

03:10

They're reminiscent, of course, of wires.

190779

2903

03:13

That might have been obvious to some people in the 19th century;

193706

3457

03:17

the revolutions of wiring and electricity were just getting underway.

197187

4314

03:21

But in many ways,

201964

1178

03:23

these microanatomical drawings of Ramón y Cajal's, like this one,

203166

3313

03:26

they're still in some ways unsurpassed.

206503

2332

03:28

We're still, more than a century later,

208859

1854

03:30

trying to finish the job that Ramón y Cajal started.

210737

2825

03:33

These are raw data from our collaborators

213586

3134

03:36

at the Max Planck Institute of Neuroscience.

216744

2881

03:39

And what our collaborators have done

219649

1790

03:41

is to image little pieces of brain tissue.

221463

5001

03:46

The entire sample here is about one cubic millimeter in size,

226488

3326

03:49

and I'm showing you a very, very small piece of it here.

229838

2621

03:52

That bar on the left is about one micron.

232483

2346

03:54

The structures you see are mitochondria

234853

2409

03:57

that are the size of bacteria.

237286

2044

03:59

And these are consecutive slices

239354

1551

04:00

through this very, very tiny block of tissue.

240929

3148

04:04

Just for comparison's sake,

244101

2403

04:06

the diameter of an average strand of hair is about 100 microns.

246528

3792

04:10

So we're looking at something much, much smaller

250344

2274

04:12

than a single strand of hair.

252642

1398

04:14

And from these kinds of serial electron microscopy slices,

254064

4031

04:18

one can start to make reconstructions in 3D of neurons that look like these.

258119

5008

04:23

So these are sort of in the same style as Ramón y Cajal.

263151

3157

04:26

Only a few neurons lit up,

266332

1492

04:27

because otherwise we wouldn't be able to see anything here.

267848

2781

04:30

It would be so crowded,

270653

1312

04:31

so full of structure,

271989

1330

04:33

of wiring all connecting one neuron to another.

273343

2724

04:37

So Ramón y Cajal was a little bit ahead of his time,

277293

2804

04:40

and progress on understanding the brain

280121

2555

04:42

proceeded slowly over the next few decades.

282700

2271

04:45

But we knew that neurons used electricity,

285455

2853

04:48

and by World War II, our technology was advanced enough

288332

2936

04:51

to start doing real electrical experiments on live neurons

291292

2806

04:54

to better understand how they worked.

294122

2106

04:56

This was the very same time when computers were being invented,

296631

4356

05:01

very much based on the idea of modeling the brain --

301011

3100

05:04

of "intelligent machinery," as Alan Turing called it,

304135

3085

05:07

one of the fathers of computer science.

100

307244

1991

05:09

Warren McCulloch and Walter Pitts looked at Ramón y Cajal's drawing

101

309923

4632

05:14

of visual cortex,

102

314579

1317

05:15

which I'm showing here.

103

315920

1562

05:17

This is the cortex that processes imagery that comes from the eye.

104

317506

4442

05:22

And for them, this looked like a circuit diagram.

105

322424

3508

05:26

So there are a lot of details in McCulloch and Pitts's circuit diagram

106

326353

3835

05:30

that are not quite right.

107

330212

1352

05:31

But this basic idea

108

331588

1235

05:32

that visual cortex works like a series of computational elements

109

332847

3992

05:36

that pass information one to the next in a cascade,

110

336863

2746

05:39

is essentially correct.

111

339633

1602

05:41

Let's talk for a moment

112

341259

2350

05:43

about what a model for processing visual information would need to do.

113

343633

4032

05:48

The basic task of perception

114

348228

2741

05:50

is to take an image like this one and say,

115

350993

4194

05:55

"That's a bird,"

116

355211

1176

05:56

which is a very simple thing for us to do with our brains.

117

356411

2874

05:59

But you should all understand that for a computer,

118

359309

3421

06:02

this was pretty much impossible just a few years ago.

119

362754

3087

06:05

The classical computing paradigm

120

365865

1916

06:07

is not one in which this task is easy to do.

121

367805

2507

06:11

So what's going on between the pixels,

122

371366

2552

06:13

between the image of the bird and the word "bird,"

123

373942

4028

06:17

is essentially a set of neurons connected to each other

124

377994

2814

06:20

in a neural network,

125

380832

1155

06:22

as I'm diagramming here.

126

382011

1223

06:23

This neural network could be biological, inside our visual cortices,

127

383258

3272

06:26

or, nowadays, we start to have the capability

128

386554

2162

06:28

to model such neural networks on the computer.

129

388740

2454

06:31

And I'll show you what that actually looks like.

130

391834

2353

06:34

So the pixels you can think about as a first layer of neurons,

131

394211

3416

06:37

and that's, in fact, how it works in the eye --

132

397651

2239

06:39

that's the neurons in the retina.

133

399914

1663

06:41

And those feed forward

134

401601

1500

06:43

into one layer after another layer, after another layer of neurons,

135

403125

3403

06:46

all connected by synapses of different weights.

136

406552

3033

06:49

The behavior of this network

137

409609

1335

06:50

is characterized by the strengths of all of those synapses.

138

410968

3284

06:54

Those characterize the computational properties of this network.

139

414276

3288

06:57

And at the end of the day,

140

417588

1470

06:59

you have a neuron or a small group of neurons

141

419082

2447

07:01

that light up, saying, "bird."

142

421553

1647
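
The layered picture described above (pixels feeding forward through one layer of neurons after another, connected by weighted synapses) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the layer sizes, the weight values, the `tanh` nonlinearity, and the names `layer`, `forward`, `pixels` and `network` are all hypothetical and are not taken from the talk.

```python
import math

def layer(inputs, weights):
    """One layer: each neuron takes a weighted sum of all inputs, then a nonlinearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def forward(pixels, network):
    """Feed the input forward through every layer in turn."""
    activations = pixels
    for weights in network:
        activations = layer(activations, weights)
    return activations

# Made-up toy values: 4 "pixels", a hidden layer of 3 neurons, 1 output neuron.
pixels = [0.0, 0.5, 1.0, 0.25]
network = [
    # synapse weights into the hidden layer (one row per neuron)
    [[0.2, -0.4, 0.1, 0.7], [0.5, 0.3, -0.6, 0.0], [-0.1, 0.8, 0.2, -0.3]],
    # synapse weights into the single output neuron
    [[1.0, -1.0, 0.5]],
]

output = forward(pixels, network)
print(output)  # one number; a large value would stand for "bird"
```

Changing any number in `network` changes the output, which is the sense in which the synapse strengths characterize what the network computes.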

07:03

Now I'm going to represent those three things --

143

423824

3132

07:06

the input pixels and the synapses in the neural network,

144

426980

4696

07:11

and bird, the output --

145

431700

1585

07:13

by three variables: x, w and y.

146

433309

3057

07:16

There are maybe a million or so x's --

147

436853

1811

07:18

a million pixels in that image.

148

438688

1953

07:20

There are billions or trillions of w's,

149

440665

2446

07:23

which represent the weights of all these synapses in the neural network.

150

443135

3421

07:26

And there's a very small number of y's,

151

446580

1875

07:28

of outputs that that network has.

152

448479

1858

07:30

"Bird" is only four letters, right?

153

450361

1749

07:33

So let's pretend that this is just a simple formula,

154

453088

3426

07:36

x "x" w = y.

155

456538

2163

07:38

I'm putting the times in scare quotes

156

458725

2036

07:40

because what's really going on there, of course,

157

460785

2280

07:43

is a very complicated series of mathematical operations.

158

463089

3046

07:47

That's one equation.

159

467172

1221

07:48

There are three variables.

160

468417

1672

07:50

And we all know that if you have one equation,

161

470113

2726

07:52

you can solve one variable by knowing the other two things.

162

472863

3642

07:57

So the problem of inference,

163

477158

3380

08:00

that is, figuring out that the picture of a bird is a bird,

164

480562

2873

08:03

is this one:

165

483459

1274

08:04

it's where y is the unknown and w and x are known.

166

484757

3459

08:08

You know the neural network, you know the pixels.

167

488240

2459

08:10

As you can see, that's actually a relatively straightforward problem.

168

490723

3327

08:14

You multiply two times three and you're done.

169

494074

2186

08:16

I'll show you an artificial neural network

170

496862

2123

08:19

that we've built recently, doing exactly that.

171

499009

2296

08:21

This is running in real time on a mobile phone,

172

501634

2860

08:24

and that's, of course, amazing in its own right,

173

504518

3313

08:27

that mobile phones can do so many billions and trillions of operations

174

507855

3468

08:31

per second.

175

511347

1248

08:32

What you're looking at is a phone

176

512619

1615

08:34

looking at one after another picture of a bird,

177

514258

3547

08:37

and actually not only saying, "Yes, it's a bird,"

178

517829

2715

08:40

but identifying the species of bird with a network of this sort.

179

520568

3411

08:44

So in that picture,

180

524890

1826

08:46

the x and the w are known, and the y is the unknown.

181

526740

3802

08:50

I'm glossing over the very difficult part, of course,

182

530566

2508

08:53

which is how on earth do we figure out the w,

183

533098

3861

08:56

the brain that can do such a thing?

184

536983

2187

08:59

How would we ever learn such a model?

185

539194

1834

09:01

So this process of learning, of solving for w,

186

541418

3233

09:04

if we were doing this with the simple equation

187

544675

2647

09:07

in which we think about these as numbers,

188

547346

2000

09:09

we know exactly how to do that: 6 = 2 x w,

189

549370

2687

09:12

well, we divide by two and we're done.

190

552081

3312

09:16

The problem is with this operator.

191

556001

2220

09:18

So, division --

192

558823

1151

09:19

we've used division because it's the inverse to multiplication,

193

559998

3121

09:23

but as I've just said,

194

563143

1440

09:24

the multiplication is a bit of a lie here.

195

564607

2449

09:27

This is a very, very complicated, very non-linear operation;

196

567080

3326

09:30

it has no inverse.

197

570430

1704

09:32

So we have to figure out a way to solve the equation

198

572158

3150

09:35

without a division operator.

199

575332

2024

09:37

And the way to do that is fairly straightforward.

200

577380

2343

09:39

You just say, let's play a little algebra trick,

201

579747

2671

09:42

and move the six over to the right-hand side of the equation.

202

582442

2906

09:45

Now, we're still using multiplication.

203

585372

1826

09:47

And that zero -- let's think about it as an error.

204

587675

3580

09:51

In other words, if we've solved for w the right way,

205

591279

2515

09:53

then the error will be zero.

206

593818

1656

09:55

And if we haven't gotten it quite right,

207

595498

1938

09:57

the error will be greater than zero.

208

597460

1749

09:59

So now we can just take guesses to minimize the error,

209

599233

3366

10:02

and that's the sort of thing computers are very good at.

210

602623

2687

10:05

So you've taken an initial guess:

211

605334

1593

10:06

what if w = 0?

212

606951

1156

10:08

Well, then the error is 6.

213

608131

1240

10:09

What if w = 1? The error is 4.

214

609395

1446

10:10

And then the computer can sort of play Marco Polo,

215

610865

2367

10:13

and drive down the error close to zero.

216

613256

2367

10:15

As it does that, it's getting successive approximations to w.

217

615647

3374

10:19

Typically, it never quite gets there, but after about a dozen steps,

218

619045

3656

10:22

we're up to w = 2.999, which is close enough.

219

622725

4624

10:28

And this is the learning process.

220

628302

1814
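
The guess-and-improve loop described above can be sketched in Python. This is a minimal illustration, not the procedure used by the speaker's team. It assumes the toy equation 6 = 2 x w from the talk, minimizes the squared gap (an assumption made here so that the error has a smooth slope to follow), and uses a hypothetical step size of 0.1. The names `gap`, `gradient` and `solve_for_w` are invented for this sketch.

```python
def gap(w, x=2.0, y=6.0):
    """The error as quoted in the talk: how far x * w is from y."""
    return abs(y - x * w)

def gradient(w, x=2.0, y=6.0):
    """Slope of the squared gap (x * w - y) ** 2 with respect to w."""
    return 2.0 * x * (x * w - y)

def solve_for_w(steps=12, learning_rate=0.1):
    """Start from the guess w = 0 and repeatedly step downhill on the error."""
    w = 0.0
    history = [w]
    for _ in range(steps):
        w -= learning_rate * gradient(w)
        history.append(w)
    return w, history

w, history = solve_for_w()
for step, guess in enumerate(history):
    print(step, guess, gap(guess))
```

Each pass moves the guess closer to 3 without ever landing on it exactly, which matches the "never quite gets there" behavior described in the talk.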

10:30

So remember that what's been going on here

221

630140

2730

10:32

is that we've been taking a lot of known x's and known y's

222

632894

4378

10:37

and solving for the w in the middle through an iterative process.

223

637296

3454

10:40

It's exactly the same way that we do our own learning.

224

640774

3556

10:44

We have many, many images as babies

225

644354

2230

10:46

and we get told, "This is a bird; this is not a bird."

226

646608

2633

10:49

And over time, through iteration,

227

649714

2098

10:51

we solve for w, we solve for those neural connections.

228

651836

2928

10:55

So now, we've held x and w fixed to solve for y;

229

655460

4086

10:59

that's everyday, fast perception.

230

659570

1847

11:01

We figure out how we can solve for w,

231

661441

1763

11:03

that's learning, which is a lot harder,

232

663228

1903

11:05

because we need to do error minimization,

233

665155

1985

11:07

using a lot of training examples.

234

667164

1687

11:08

And about a year ago, Alex Mordvintsev, on our team,

235

668875

3187

11:12

decided to experiment with what happens if we try solving for x,

236

672086

3550

11:15

given a known w and a known y.

237

675660

2037

11:18

In other words,

238

678124

1151

11:19

you know that it's a bird,

239

679299

1352

11:20

and you already have your neural network that you've trained on birds,

240

680675

3303

11:24

but what is the picture of a bird?

241

684002

2344

11:27

It turns out that by using exactly the same error-minimization procedure,

242

687034

5024

11:32

one can do that with the network trained to recognize birds,

243

692082

3430

11:35

and the result turns out to be ...

244

695536

3388

11:42

a picture of birds.

245

702400

1305

11:44

So this is a picture of birds generated entirely by a neural network

246

704814

3737

11:48

that was trained to recognize birds,

247

708575

1826

11:50

just by solving for x rather than solving for y,

248

710425

3538

11:53

and doing that iteratively.

249

713987

1288
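
Solving for x instead of w can be sketched with the same error-minimization loop. The code below is a toy under stated assumptions, not the team's actual image-generation code: the "network" is a single `tanh` neuron with three made-up fixed weights, the "image" is a list of three numbers, and the names `W`, `predict` and `solve_for_x` are hypothetical.

```python
import math

# Fixed weights standing in for a network that has already been trained (made up).
W = [0.5, -1.0, 2.0]

def predict(x):
    """Inference: x and w are known, y is computed."""
    return math.tanh(sum(wi * xi for wi, xi in zip(W, x)))

def solve_for_x(target_y, x_start, steps=500, learning_rate=0.1):
    """Hold w and y fixed and adjust x to shrink (predict(x) - target_y) ** 2."""
    x = list(x_start)
    for _ in range(steps):
        out = predict(x)
        # Chain rule: d(error)/dx_i = 2 * (out - y) * (1 - out^2) * w_i
        scale = 2.0 * (out - target_y) * (1.0 - out * out)
        x = [xi - learning_rate * scale * wi for xi, wi in zip(x, W)]
    return x

# Start from a blank "canvas" and ask for an input the network scores as 0.8.
x = solve_for_x(target_y=0.8, x_start=[0.0, 0.0, 0.0])
print(x, predict(x))
```

The weights are never touched; only the input changes until the fixed network reports the requested output.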

11:55

Here's another fun example.

250

715732

1847

11:57

This was a work made by Mike Tyka in our group,

251

717603

3437

12:01

which he calls "Animal Parade."

252

721064

2308

12:03

It reminds me a little bit of William Kentridge's artworks,

253

723396

2876

12:06

in which he makes sketches, rubs them out,

254

726296

2489

12:08

makes sketches, rubs them out,

255

728809

1460

12:10

and creates a movie this way.

256

730293

1398

12:11

In this case,

257

731715

1151

12:12

what Mike is doing is varying y over the space of different animals,

258

732890

3277

12:16

in a network designed to recognize and distinguish

259

736191

2382

12:18

different animals from each other.

260

738597

1810

12:20

And you get this strange, Escher-like morph from one animal to another.

261

740431

3751

12:26

Here he and Alex together have tried reducing

262

746221

4614

12:30

the y's to a space of only two dimensions,

263

750859

2759

12:33

thereby making a map out of the space of all things

264

753642

3438

12:37

recognized by this network.

265

757104

1719

12:38

Doing this kind of synthesis

266

758847

2023

12:40

or generation of imagery over that entire surface,

267

760894

2382

12:43

varying y over the surface, you make a kind of map --

268

763300

2846

12:46

a visual map of all the things the network knows how to recognize.

269

766170

3141

12:49

The animals are all here; "armadillo" is right in that spot.

270

769335

2865

12:52

You can do this with other kinds of networks as well.

271

772919

2479

12:55

This is a network designed to recognize faces,

272

775422

2874

12:58

to distinguish one face from another.

273

778320

2000

13:00

And here, we're putting in a y that says, "me,"

274

780344

3249

13:03

my own face parameters.

275

783617

1575

13:05

And when this thing solves for x,

276

785216

1706

13:06

it generates this rather crazy,

277

786946

2618

13:09

kind of cubist, surreal, psychedelic picture of me

278

789588

4428

13:14

from multiple points of view at once.

279

794040

1806

13:15

The reason it looks like multiple points of view at once

280

795870

2734

13:18

is because that network is designed to get rid of the ambiguity

281

798628

3687

13:22

of a face being in one pose or another pose,

282

802339

2476

13:24

being looked at with one kind of lighting, another kind of lighting.

283

804839

3376

13:28

So when you do this sort of reconstruction,

284

808239

2085

13:30

if you don't use some sort of guide image

285

810348

2304

13:32

or guide statistics,

286

812676

1211

13:33

then you'll get a sort of confusion of different points of view,

287

813911

3765

13:37

because it's ambiguous.

288

817700

1368

13:39

This is what happens if Alex uses his own face as a guide image

289

819786

4223

13:44

during that optimization process to reconstruct my own face.

290

824033

3321

13:48

So you can see it's not perfect.

291

828284

2328

13:50

There's still quite a lot of work to do

292

830636

1874

13:52

on how we optimize that optimization process.

293

832534

2453

13:55

But you start to get something more like a coherent face,

294

835011

2827

13:57

rendered using my own face as a guide.

295

837862

2014

14:00

You don't have to start with a blank canvas

296

840892

2501

14:03

or with white noise.

297

843417

1156

14:04

When you're solving for x,

298

844597

1304

14:05

you can begin with an x that is itself already some other image.

299

845925

3889

14:09

That's what this little demonstration is.

300

849838

2556
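
The effect of the starting point can be illustrated with a small Python sketch. It is an assumption-laden toy rather than the demonstration shown in the talk: a single `tanh` neuron with made-up weights stands in for the network, three numbers stand in for an image, and the names `W`, `predict`, `solve_for_x`, `blank` and `clouds` are hypothetical.

```python
import math

# Made-up fixed weights standing in for an already trained network.
W = [0.5, -1.0, 2.0]

def predict(x):
    """Run the fixed network on input x."""
    return math.tanh(sum(wi * xi for wi, xi in zip(W, x)))

def solve_for_x(target_y, x_start, steps=500, learning_rate=0.1):
    """Adjust x, beginning from x_start, until the network outputs target_y."""
    x = list(x_start)
    for _ in range(steps):
        out = predict(x)
        scale = 2.0 * (out - target_y) * (1.0 - out * out)
        x = [xi - learning_rate * scale * wi for xi, wi in zip(x, W)]
    return x

blank = [0.0, 0.0, 0.0]    # a blank canvas
clouds = [0.9, 0.1, 0.4]   # a made-up stand-in for an existing picture

from_blank = solve_for_x(-0.5, blank)
from_clouds = solve_for_x(-0.5, clouds)
print(from_blank, predict(from_blank))
print(from_clouds, predict(from_clouds))
```

Both runs satisfy the same requested output, but they end at different inputs, because many inputs map to one output and the optimization only nudges whatever it was given.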

14:12

This is a network that is designed to categorize

301

852418

4122

14:16

all sorts of different objects -- man-made structures, animals ...

302

856564

3119

14:19

Here we're starting with just a picture of clouds,

303

859707

2593

14:22

and as we optimize,

304

862324

1671

14:24

basically, this network is figuring out what it sees in the clouds.

305

864019

4486

14:28

And the more time you spend looking at this,

306

868931

2320

14:31

the more things you also will see in the clouds.

307

871275

2753

14:35

You could also use the face network to hallucinate into this,

308

875004

3375

14:38

and you get some pretty crazy stuff.

309

878403

1812

14:40

(Laughter)

310

880239

1150

14:42

Or, Mike has done some other experiments

311

882401

2744

14:45

in which he takes that cloud image,

312

885169

3905

14:49

hallucinates, zooms, hallucinates, zooms, hallucinates, zooms.

313

889098

3507

14:52

And in this way,

314

892629

1151

14:53

you can get a sort of fugue state of the network, I suppose,

315

893804

3675

14:57

or a sort of free association,

316

897503

3680

15:01

in which the network is eating its own tail.

317

901207

2227

15:03

So every image is now the basis for,

318

903458

3421

15:06

"What do I think I see next?

319

906903

1421

15:08

What do I think I see next? What do I think I see next?"

320

908348

2803

15:11

I showed this for the first time in public

321

911487

2936

15:14

to a group at a lecture in Seattle called "Higher Education" --

322

914447

5437

15:19

this was right after marijuana was legalized.

323

919908

2437

15:22

(Laughter)

324

922369

2415

15:26

So I'd like to finish up quickly

325

926627

2104

15:28

by just noting that this technology is not constrained.

326

928755

4255

15:33

I've shown you purely visual examples because they're really fun to look at.

327

933034

3665

15:36

It's not a purely visual technology.

328

936723

2451

15:39

Our artist collaborator, Ross Goodwin,

329

939198

1993

15:41

has done experiments involving a camera that takes a picture,

330

941215

3671

15:44

and then a computer in his backpack writes a poem using neural networks,

331

944910

4234

15:49

based on the contents of the image.

332

949168

1944

15:51

And that poetry neural network has been trained

333

951136

2947

15:54

on a large corpus of 20th-century poetry.

334

954107

2234

15:56

And the poetry is, you know,

335

956365

1499

15:57

I think, kind of not bad, actually.

336

957888

1914

15:59

(Laughter)

337

959826

1384

16:01

In closing,

338

961234

1159

16:02

I think that per Michelangelo,

339

962417

2132

16:04

I think he was right;

340

964573

1234

16:05

perception and creativity are very intimately connected.

341

965831

3436

16:09

What we've just seen are neural networks

342

969611

2634

16:12

that are entirely trained to discriminate,

343

972269

2303

16:14

or to recognize different things in the world,

344

974596

2242

16:16

able to be run in reverse, to generate.

345

976862

3161

16:20

One of the things that suggests to me

346

980047

1783

16:21

is not only that Michelangelo really did see

347

981854

2398

16:24

the sculpture in the blocks of stone,

348

984276

2452

16:26

but that any creature, any being, any alien

349

986752

3638

16:30

that is able to do perceptual acts of that sort

350

990414

3657

16:34

is also able to create

351

994095

1375

16:35

because it's exactly the same machinery that's used in both cases.

352

995494

3224

16:38

Also, I think that perception and creativity are by no means

353

998742

4532

16:43

uniquely human.

354

1003298

1210

16:44

We start to have computer models that can do exactly these sorts of things.

355

1004532

3708

16:48

And that ought to be unsurprising; the brain is computational.

356

1008264

3328

16:51

And finally,

357

1011616

1657

16:53

computing began as an exercise in designing intelligent machinery.

358

1013297

4668

16:57

It was very much modeled after the idea

359

1017989

2462

17:00

of how could we make machines intelligent.

360

1020475

3013

17:03

And we finally are starting to fulfill now

361

1023512

2162

17:05

some of the promises of those early pioneers,

362

1025698

2406

17:08

of Turing and von Neumann

363

1028128

1713

17:09

and McCulloch and Pitts.

364

1029865

2265

17:12

And I think that computing is not just about accounting

365

1032154

4098

17:16

or playing Candy Crush or something.

366

1036276

2147

17:18

From the beginning, we modeled them after our minds.

367

1038447

2578

17:21

And they give us both the ability to understand our own minds better

368

1041049

3269

17:24

and to extend them.

369

1044342

1529

17:26

Thank you very much.

370

1046627

1167

17:27

(Applause)

371

1047818

5939
