How computers are learning to be creative | Blaise Agüera y Arcas

452,941 views ・ 2016-07-22

TED



00:12
So, I lead a team at Google that works on machine intelligence;
0
12800
3124
00:15
in other words, the engineering discipline of making computers and devices
1
15948
4650
00:20
able to do some of the things that brains do.
2
20622
2419
00:23
And this makes us interested in real brains
3
23439
3099
00:26
and neuroscience as well,
4
26562
1289
00:27
and especially interested in the things that our brains do
5
27875
4172
00:32
that are still far superior to the performance of computers.
6
32071
4042
00:37
Historically, one of those areas has been perception,
7
37209
3609
00:40
the process by which things out there in the world --
8
40842
3039
00:43
sounds and images --
9
43905
1584
00:45
can turn into concepts in the mind.
10
45513
2178
00:48
This is essential for our own brains,
11
48235
2517
00:50
and it's also pretty useful on a computer.
12
50776
2464
00:53
The machine perception algorithms, for example, that our team makes,
13
53636
3350
00:57
are what enable your pictures on Google Photos to become searchable,
14
57010
3874
01:00
based on what's in them.
15
60908
1397
01:03
The flip side of perception is creativity:
16
63594
3493
01:07
turning a concept into something out there in the world.
17
67111
3038
01:10
So over the past year, our work on machine perception
18
70173
3555
01:13
has also unexpectedly connected with the world of machine creativity
19
73752
4859
01:18
and machine art.
20
78635
1160
01:20
I think Michelangelo had a penetrating insight
21
80556
3284
01:23
into this dual relationship between perception and creativity.
22
83864
3656
01:28
This is a famous quote of his:
23
88023
2006
01:30
"Every block of stone has a statue inside of it,
24
90053
3323
01:34
and the job of the sculptor is to discover it."
25
94036
3002
01:38
So I think that what Michelangelo was getting at
26
98029
3216
01:41
is that we create by perceiving,
27
101269
3180
01:44
and that perception itself is an act of imagination
28
104473
3023
01:47
and is the stuff of creativity.
29
107520
2461
01:50
The organ that does all the thinking and perceiving and imagining,
30
110691
3925
01:54
of course, is the brain.
31
114640
1588
01:57
And I'd like to begin with a brief bit of history
32
117089
2545
01:59
about what we know about brains.
33
119658
2302
02:02
Because unlike, say, the heart or the intestines,
34
122496
2446
02:04
you really can't say very much about a brain by just looking at it,
35
124966
3144
02:08
at least with the naked eye.
36
128134
1412
02:09
The early anatomists who looked at brains
37
129983
2416
02:12
gave the superficial structures of this thing all kinds of fanciful names,
38
132423
3807
02:16
like hippocampus, meaning "little shrimp."
39
136254
2433
02:18
But of course that sort of thing doesn't tell us very much
40
138711
2764
02:21
about what's actually going on inside.
41
141499
2318
02:24
The first person who, I think, really developed some kind of insight
42
144780
3613
02:28
into what was going on in the brain
43
148417
1930
02:30
was the great Spanish neuroanatomist, Santiago Ramón y Cajal,
44
150371
3920
02:34
in the 19th century,
45
154315
1544
02:35
who used microscopy and special stains
46
155883
3755
02:39
that could selectively fill in or render in very high contrast
47
159662
4170
02:43
the individual cells in the brain,
48
163856
2008
02:45
in order to start to understand their morphologies.
49
165888
3154
02:49
And these are the kinds of drawings that he made of neurons
50
169972
2891
02:52
in the 19th century.
51
172887
1209
02:54
This is from a bird brain.
52
174120
1884
02:56
And you see this incredible variety of different sorts of cells;
53
176028
3057
02:59
even the cellular theory itself was quite new at this point.
54
179109
3435
03:02
And these structures,
55
182568
1278
03:03
these cells that have these arborizations,
56
183870
2259
03:06
these branches that can go very, very long distances --
57
186153
2608
03:08
this was very novel at the time.
58
188785
1616
03:10
They're reminiscent, of course, of wires.
59
190779
2903
03:13
That might have been obvious to some people in the 19th century;
60
193706
3457
03:17
the revolutions of wiring and electricity were just getting underway.
61
197187
4314
03:21
But in many ways,
62
201964
1178
03:23
these microanatomical drawings of Ramón y Cajal's, like this one,
63
203166
3313
03:26
they're still in some ways unsurpassed.
64
206503
2332
03:28
We're still, more than a century later,
65
208859
1854
03:30
trying to finish the job that Ramón y Cajal started.
66
210737
2825
03:33
These are raw data from our collaborators
67
213586
3134
03:36
at the Max Planck Institute of Neuroscience.
68
216744
2881
03:39
And what our collaborators have done
69
219649
1790
03:41
is to image little pieces of brain tissue.
70
221463
5001
03:46
The entire sample here is about one cubic millimeter in size,
71
226488
3326
03:49
and I'm showing you a very, very small piece of it here.
72
229838
2621
03:52
That bar on the left is about one micron.
73
232483
2346
03:54
The structures you see are mitochondria
74
234853
2409
03:57
that are the size of bacteria.
75
237286
2044
03:59
And these are consecutive slices
76
239354
1551
04:00
through this very, very tiny block of tissue.
77
240929
3148
04:04
Just for comparison's sake,
78
244101
2403
04:06
the diameter of an average strand of hair is about 100 microns.
79
246528
3792
04:10
So we're looking at something much, much smaller
80
250344
2274
04:12
than a single strand of hair.
81
252642
1398
04:14
And from these kinds of serial electron microscopy slices,
82
254064
4031
04:18
one can start to make reconstructions in 3D of neurons that look like these.
83
258119
5008
04:23
So these are sort of in the same style as Ramón y Cajal.
84
263151
3157
04:26
Only a few neurons lit up,
85
266332
1492
04:27
because otherwise we wouldn't be able to see anything here.
86
267848
2781
04:30
It would be so crowded,
87
270653
1312
04:31
so full of structure,
88
271989
1330
04:33
of wiring all connecting one neuron to another.
89
273343
2724
04:37
So Ramón y Cajal was a little bit ahead of his time,
90
277293
2804
04:40
and progress on understanding the brain
91
280121
2555
04:42
proceeded slowly over the next few decades.
92
282700
2271
04:45
But we knew that neurons used electricity,
93
285455
2853
04:48
and by World War II, our technology was advanced enough
94
288332
2936
04:51
to start doing real electrical experiments on live neurons
95
291292
2806
04:54
to better understand how they worked.
96
294122
2106
04:56
This was the very same time when computers were being invented,
97
296631
4356
05:01
very much based on the idea of modeling the brain --
98
301011
3100
05:04
of "intelligent machinery," as Alan Turing called it,
99
304135
3085
05:07
one of the fathers of computer science.
100
307244
1991
05:09
Warren McCulloch and Walter Pitts looked at Ramón y Cajal's drawing
101
309923
4632
05:14
of visual cortex,
102
314579
1317
05:15
which I'm showing here.
103
315920
1562
05:17
This is the cortex that processes imagery that comes from the eye.
104
317506
4442
05:22
And for them, this looked like a circuit diagram.
105
322424
3508
05:26
So there are a lot of details in McCulloch and Pitts's circuit diagram
106
326353
3835
05:30
that are not quite right.
107
330212
1352
05:31
But this basic idea
108
331588
1235
05:32
that visual cortex works like a series of computational elements
109
332847
3992
05:36
that pass information one to the next in a cascade,
110
336863
2746
05:39
is essentially correct.
111
339633
1602
05:41
Let's talk for a moment
112
341259
2350
05:43
about what a model for processing visual information would need to do.
113
343633
4032
05:48
The basic task of perception
114
348228
2741
05:50
is to take an image like this one and say,
115
350993
4194
05:55
"That's a bird,"
116
355211
1176
05:56
which is a very simple thing for us to do with our brains.
117
356411
2874
05:59
But you should all understand that for a computer,
118
359309
3421
06:02
this was pretty much impossible just a few years ago.
119
362754
3087
06:05
The classical computing paradigm
120
365865
1916
06:07
is not one in which this task is easy to do.
121
367805
2507
06:11
So what's going on between the pixels,
122
371366
2552
06:13
between the image of the bird and the word "bird,"
123
373942
4028
06:17
is essentially a set of neurons connected to each other
124
377994
2814
06:20
in a neural network,
125
380832
1155
06:22
as I'm diagramming here.
126
382011
1223
06:23
This neural network could be biological, inside our visual cortices,
127
383258
3272
06:26
or, nowadays, we start to have the capability
128
386554
2162
06:28
to model such neural networks on the computer.
129
388740
2454
06:31
And I'll show you what that actually looks like.
130
391834
2353
06:34
So the pixels you can think about as a first layer of neurons,
131
394211
3416
06:37
and that's, in fact, how it works in the eye --
132
397651
2239
06:39
that's the neurons in the retina.
133
399914
1663
06:41
And those feed forward
134
401601
1500
06:43
into one layer after another layer, after another layer of neurons,
135
403125
3403
06:46
all connected by synapses of different weights.
136
406552
3033
06:49
The behavior of this network
137
409609
1335
06:50
is characterized by the strengths of all of those synapses.
138
410968
3284
06:54
Those characterize the computational properties of this network.
139
414276
3288
06:57
And at the end of the day,
140
417588
1470
06:59
you have a neuron or a small group of neurons
141
419082
2447
07:01
that light up, saying, "bird."
142
421553
1647
07:03
Now I'm going to represent those three things --
143
423824
3132
07:06
the input pixels and the synapses in the neural network,
144
426980
4696
07:11
and bird, the output --
145
431700
1585
07:13
by three variables: x, w and y.
146
433309
3057
07:16
There are maybe a million or so x's --
147
436853
1811
07:18
a million pixels in that image.
148
438688
1953
07:20
There are billions or trillions of w's,
149
440665
2446
07:23
which represent the weights of all these synapses in the neural network.
150
443135
3421
07:26
And there's a very small number of y's,
151
446580
1875
07:28
of outputs that that network has.
152
448479
1858
07:30
"Bird" is only four letters, right?
153
450361
1749
07:33
So let's pretend that this is just a simple formula,
154
453088
3426
07:36
x "×" w = y.
155
456538
2163
07:38
I'm putting the times in scare quotes
156
458725
2036
07:40
because what's really going on there, of course,
157
460785
2280
07:43
is a very complicated series of mathematical operations.
158
463089
3046
07:47
That's one equation.
159
467172
1221
07:48
There are three variables.
160
468417
1672
07:50
And we all know that if you have one equation,
161
470113
2726
07:52
you can solve one variable by knowing the other two things.
162
472863
3642
07:57
So the problem of inference,
163
477158
3380
08:00
that is, figuring out that the picture of a bird is a bird,
164
480562
2873
08:03
is this one:
165
483459
1274
08:04
it's where y is the unknown and w and x are known.
166
484757
3459
08:08
You know the neural network, you know the pixels.
167
488240
2459
08:10
As you can see, that's actually a relatively straightforward problem.
168
490723
3327
08:14
You multiply two times three and you're done.
169
494074
2186
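The inference step described above (known x, known w, unknown y) can be sketched as a forward pass. This is a minimal illustration, not the network shown in the talk: the layer sizes, the weight values, the tanh nonlinearity, and the function name `forward` are all assumptions chosen to keep the example small.

```python
import math

def forward(x, layers):
    """Feed input x through each layer in turn: weighted sums followed by tanh."""
    activations = x
    for weights in layers:  # weights[j][i] is the synapse from input i to neuron j
        activations = [
            math.tanh(sum(w_ji * a_i for w_ji, a_i in zip(row, activations)))
            for row in weights
        ]
    return activations

# Hypothetical, hand-picked weights: 4 "pixels" -> 3 hidden neurons -> 1 output neuron.
w = [
    [[0.5, -0.2, 0.1, 0.7], [0.3, 0.8, -0.5, 0.0], [-0.6, 0.1, 0.9, 0.2]],
    [[1.0, -1.0, 0.5]],
]
x = [0.9, 0.1, 0.4, 0.7]   # the known pixels
y = forward(x, w)          # the unknown being solved for
print(y)
```

Nothing is searched for or guessed here: with x and w fixed, y falls out of one pass through the layers, which is why inference is the cheap direction.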
08:16
I'll show you an artificial neural network
170
496862
2123
08:19
that we've built recently, doing exactly that.
171
499009
2296
08:21
This is running in real time on a mobile phone,
172
501634
2860
08:24
and that's, of course, amazing in its own right,
173
504518
3313
08:27
that mobile phones can do so many billions and trillions of operations
174
507855
3468
08:31
per second.
175
511347
1248
08:32
What you're looking at is a phone
176
512619
1615
08:34
looking at one after another picture of a bird,
177
514258
3547
08:37
and actually not only saying, "Yes, it's a bird,"
178
517829
2715
08:40
but identifying the species of bird with a network of this sort.
179
520568
3411
08:44
So in that picture,
180
524890
1826
08:46
the x and the w are known, and the y is the unknown.
181
526740
3802
08:50
I'm glossing over the very difficult part, of course,
182
530566
2508
08:53
which is how on earth do we figure out the w,
183
533098
3861
08:56
the brain that can do such a thing?
184
536983
2187
08:59
How would we ever learn such a model?
185
539194
1834
09:01
So this process of learning, of solving for w,
186
541418
3233
09:04
if we were doing this with the simple equation
187
544675
2647
09:07
in which we think about these as numbers,
188
547346
2000
09:09
we know exactly how to do that: 6 = 2 × w,
189
549370
2687
09:12
well, we divide by two and we're done.
190
552081
3312
09:16
The problem is with this operator.
191
556001
2220
09:18
So, division --
192
558823
1151
09:19
we've used division because it's the inverse to multiplication,
193
559998
3121
09:23
but as I've just said,
194
563143
1440
09:24
the multiplication is a bit of a lie here.
195
564607
2449
09:27
This is a very, very complicated, very non-linear operation;
196
567080
3326
09:30
it has no inverse.
197
570430
1704
09:32
So we have to figure out a way to solve the equation
198
572158
3150
09:35
without a division operator.
199
575332
2024
09:37
And the way to do that is fairly straightforward.
200
577380
2343
09:39
You just say, let's play a little algebra trick,
201
579747
2671
09:42
and move the six over to the right-hand side of the equation.
202
582442
2906
09:45
Now, we're still using multiplication.
203
585372
1826
09:47
And that zero -- let's think about it as an error.
204
587675
3580
09:51
In other words, if we've solved for w the right way,
205
591279
2515
09:53
then the error will be zero.
206
593818
1656
09:55
And if we haven't gotten it quite right,
207
595498
1938
09:57
the error will be greater than zero.
208
597460
1749
09:59
So now we can just take guesses to minimize the error,
209
599233
3366
10:02
and that's the sort of thing computers are very good at.
210
602623
2687
10:05
So you've taken an initial guess:
211
605334
1593
10:06
what if w = 0?
212
606951
1156
10:08
Well, then the error is 6.
213
608131
1240
10:09
What if w = 1? The error is 4.
214
609395
1446
10:10
And then the computer can sort of play Marco Polo,
215
610865
2367
10:13
and drive down the error close to zero.
216
613256
2367
10:15
As it does that, it's getting successive approximations to w.
217
615647
3374
10:19
Typically, it never quite gets there, but after about a dozen steps,
218
619045
3656
10:22
we're up to w = 2.999, which is close enough.
219
622725
4624
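The guess-and-improve loop for 6 = 2 × w can be sketched as follows. The talk does not say which update rule the computer uses; this sketch assumes plain gradient descent on the squared error, and the step size of 0.0625 and the names `error` and `learn` are illustrative choices.

```python
def error(w):
    """How far 2 * w is from 6; zero only when w is exactly right."""
    return abs(6 - 2 * w)

def learn(w=0.0, rate=0.0625, steps=12):
    """Gradient descent on the squared error (2 * w - 6) ** 2."""
    history = [w]
    for _ in range(steps):
        gradient = 4 * (2 * w - 6)   # derivative of (2 * w - 6) ** 2 with respect to w
        w -= rate * gradient         # take a small step downhill
        history.append(w)
    return w, history

w, history = learn()
print(error(0), error(1))   # the talk's first two guesses give errors 6 and 4
print(round(w, 3))          # 2.999 after twelve steps
```

With this step size each update halves the remaining distance to 3, so the guess approaches the answer without ever landing on it exactly, as described above.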
10:28
And this is the learning process.
220
628302
1814
10:30
So remember that what's been going on here
221
630140
2730
10:32
is that we've been taking a lot of known x's and known y's
222
632894
4378
10:37
and solving for the w in the middle through an iterative process.
223
637296
3454
10:40
It's exactly the same way that we do our own learning.
224
640774
3556
10:44
We have many, many images as babies
225
644354
2230
10:46
and we get told, "This is a bird; this is not a bird."
226
646608
2633
10:49
And over time, through iteration,
227
649714
2098
10:51
we solve for w, we solve for those neural connections.
228
651836
2928
10:55
So now, we've held x and w fixed to solve for y;
229
655460
4086
10:59
that's everyday, fast perception.
230
659570
1847
11:01
We figure out how we can solve for w,
231
661441
1763
11:03
that's learning, which is a lot harder,
232
663228
1903
11:05
because we need to do error minimization,
233
665155
1985
11:07
using a lot of training examples.
234
667164
1687
11:08
And about a year ago, Alex Mordvintsev, on our team,
235
668875
3187
11:12
decided to experiment with what happens if we try solving for x,
236
672086
3550
11:15
given a known w and a known y.
237
675660
2037
11:18
In other words,
238
678124
1151
11:19
you know that it's a bird,
239
679299
1352
11:20
and you already have your neural network that you've trained on birds,
240
680675
3303
11:24
but what is the picture of a bird?
241
684002
2344
11:27
It turns out that by using exactly the same error-minimization procedure,
242
687034
5024
11:32
one can do that with the network trained to recognize birds,
243
692082
3430
11:35
and the result turns out to be ...
244
695536
3388
11:42
a picture of birds.
245
702400
1305
11:44
So this is a picture of birds generated entirely by a neural network
246
704814
3737
11:48
that was trained to recognize birds,
247
708575
1826
11:50
just by solving for x rather than solving for y,
248
710425
3538
11:53
and doing that iteratively.
249
713987
1288
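Solving for x uses the same error minimization, except that the weights stay frozen and the input is what gets adjusted. The sketch below is a toy stand-in, not the method used for the images in the talk: it assumes a single sigmoid neuron as the "network", three made-up weights, and a target score of 0.9.

```python
import math

def score(x, w):
    """The network's confidence (between 0 and 1) that input x shows the target class."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def solve_for_x(w, y_target, x, rate=0.5, steps=300):
    """Hold w fixed and nudge the input x until score(x, w) approaches y_target."""
    x = list(x)
    for _ in range(steps):
        p = score(x, w)
        # Gradient of (p - y_target) ** 2 with respect to each x_i, by the chain rule.
        common = 2.0 * (p - y_target) * p * (1.0 - p)
        x = [x_i - rate * common * w_i for x_i, w_i in zip(x, w)]
    return x

w = [0.5, -1.0, 2.0]       # hypothetical trained weights, held fixed
blank = [0.0, 0.0, 0.0]    # start from a blank "canvas"
x = solve_for_x(w, y_target=0.9, x=blank)
print(score(blank, w), score(x, w))
```

The resulting x is whatever input the frozen network scores highly, which need not resemble any training example; that gap is one way to read the dreamlike look of the generated pictures.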
11:55
Here's another fun example.
250
715732
1847
11:57
This was a work made by Mike Tyka in our group,
251
717603
3437
12:01
which he calls "Animal Parade."
252
721064
2308
12:03
It reminds me a little bit of William Kentridge's artworks,
253
723396
2876
12:06
in which he makes sketches, rubs them out,
254
726296
2489
12:08
makes sketches, rubs them out,
255
728809
1460
12:10
and creates a movie this way.
256
730293
1398
12:11
In this case,
257
731715
1151
12:12
what Mike is doing is varying y over the space of different animals,
258
732890
3277
12:16
in a network designed to recognize and distinguish
259
736191
2382
12:18
different animals from each other.
260
738597
1810
12:20
And you get this strange, Escher-like morph from one animal to another.
261
740431
3751
12:26
Here he and Alex together have tried reducing
262
746221
4614
12:30
the y's to a space of only two dimensions,
263
750859
2759
12:33
thereby making a map out of the space of all things
264
753642
3438
12:37
recognized by this network.
265
757104
1719
12:38
Doing this kind of synthesis
266
758847
2023
12:40
or generation of imagery over that entire surface,
267
760894
2382
12:43
varying y over the surface, you make a kind of map --
268
763300
2846
12:46
a visual map of all the things the network knows how to recognize.
269
766170
3141
12:49
The animals are all here; "armadillo" is right in that spot.
270
769335
2865
12:52
You can do this with other kinds of networks as well.
271
772919
2479
12:55
This is a network designed to recognize faces,
272
775422
2874
12:58
to distinguish one face from another.
273
778320
2000
13:00
And here, we're putting in a y that says, "me,"
274
780344
3249
13:03
my own face parameters.
275
783617
1575
13:05
And when this thing solves for x,
276
785216
1706
13:06
it generates this rather crazy,
277
786946
2618
13:09
kind of cubist, surreal, psychedelic picture of me
278
789588
4428
13:14
from multiple points of view at once.
279
794040
1806
13:15
The reason it looks like multiple points of view at once
280
795870
2734
13:18
is because that network is designed to get rid of the ambiguity
281
798628
3687
13:22
of a face being in one pose or another pose,
282
802339
2476
13:24
being looked at with one kind of lighting, another kind of lighting.
283
804839
3376
13:28
So when you do this sort of reconstruction,
284
808239
2085
13:30
if you don't use some sort of guide image
285
810348
2304
13:32
or guide statistics,
286
812676
1211
13:33
then you'll get a sort of confusion of different points of view,
287
813911
3765
13:37
because it's ambiguous.
288
817700
1368
13:39
This is what happens if Alex uses his own face as a guide image
289
819786
4223
13:44
during that optimization process to reconstruct my own face.
290
824033
3321
13:48
So you can see it's not perfect.
291
828284
2328
13:50
There's still quite a lot of work to do
292
830636
1874
13:52
on how we optimize that optimization process.
293
832534
2453
13:55
But you start to get something more like a coherent face,
294
835011
2827
13:57
rendered using my own face as a guide.
295
837862
2014
14:00
You don't have to start with a blank canvas
296
840892
2501
14:03
or with white noise.
297
843417
1156
14:04
When you're solving for x,
298
844597
1304
14:05
you can begin with an x that is itself already some other image.
299
845925
3889
14:09
That's what this little demonstration is.
300
849838
2556
14:12
This is a network that is designed to categorize
301
852418
4122
14:16
all sorts of different objects -- man-made structures, animals ...
302
856564
3119
14:19
Here we're starting with just a picture of clouds,
303
859707
2593
14:22
and as we optimize,
304
862324
1671
14:24
basically, this network is figuring out what it sees in the clouds.
305
864019
4486
14:28
And the more time you spend looking at this,
306
868931
2320
14:31
the more things you also will see in the clouds.
307
871275
2753
14:35
You could also use the face network to hallucinate into this,
308
875004
3375
14:38
and you get some pretty crazy stuff.
309
878403
1812
14:40
(Laughter)
310
880239
1150
14:42
Or, Mike has done some other experiments
311
882401
2744
14:45
in which he takes that cloud image,
312
885169
3905
14:49
hallucinates, zooms, hallucinates, zooms, hallucinates, zooms.
313
889098
3507
14:52
And in this way,
314
892629
1151
14:53
you can get a sort of fugue state of the network, I suppose,
315
893804
3675
14:57
or a sort of free association,
316
897503
3680
15:01
in which the network is eating its own tail.
317
901207
2227
15:03
So every image is now the basis for,
318
903458
3421
15:06
"What do I think I see next?
319
906903
1421
15:08
What do I think I see next? What do I think I see next?"
320
908348
2803
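The hallucinate-and-zoom loop can be sketched as a feedback loop in which each output image becomes the next input. Everything here is a simplifying assumption: the image is a one-dimensional list of brightness values, the "network" is a single linear detector whose response is the dot product with a made-up `pattern`, and `hallucinate`, `zoom` and `dream` are hypothetical names.

```python
def response(image, pattern):
    """How strongly the toy detector fires on the image."""
    return sum(p * v for p, v in zip(pattern, image))

def hallucinate(image, pattern, rate=0.2):
    """One gradient-ascent step on the input. For a linear detector the
    gradient of the response with respect to the image is the pattern itself."""
    return [min(1.0, max(0.0, v + rate * p)) for v, p in zip(image, pattern)]

def zoom(image, factor=1.25):
    """Enlarge the centre of a 1-D image by linear interpolation, keeping its length."""
    n = len(image)
    centre = (n - 1) / 2.0
    out = []
    for i in range(n):
        src = centre + (i - centre) / factor   # sample position in the original
        lo = int(src)                          # src is never negative here
        hi = min(lo + 1, n - 1)
        frac = src - lo
        out.append(image[lo] * (1 - frac) + image[hi] * frac)
    return out

def dream(image, pattern, rounds=5):
    """Feed each hallucinated, zoomed frame back in as the next starting image."""
    frames = [image]
    for _ in range(rounds):
        image = zoom(hallucinate(image, pattern))
        frames.append(image)
    return frames

clouds = [0.5, 0.45, 0.55, 0.5, 0.6, 0.4, 0.5, 0.5]   # stand-in for the cloud photo
pattern = [1, -1, 1, -1, 1, -1, 1, -1]                # what the toy detector likes
frames = dream(clouds, pattern)
print(len(frames), frames[-1])
```

Starting from `clouds` rather than from zeros or noise is the point made earlier about not needing a blank canvas: the optimization amplifies whatever the detector already half-sees in the starting image.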
15:11
I showed this for the first time in public
321
911487
2936
15:14
to a group at a lecture in Seattle called "Higher Education" --
322
914447
5437
15:19
this was right after marijuana was legalized.
323
919908
2437
15:22
(Laughter)
324
922369
2415
15:26
So I'd like to finish up quickly
325
926627
2104
15:28
by just noting that this technology is not constrained.
326
928755
4255
15:33
I've shown you purely visual examples because they're really fun to look at.
327
933034
3665
15:36
It's not a purely visual technology.
328
936723
2451
15:39
Our artist collaborator, Ross Goodwin,
329
939198
1993
15:41
has done experiments involving a camera that takes a picture,
330
941215
3671
15:44
and then a computer in his backpack writes a poem using neural networks,
331
944910
4234
15:49
based on the contents of the image.
332
949168
1944
15:51
And that poetry neural network has been trained
333
951136
2947
15:54
on a large corpus of 20th-century poetry.
334
954107
2234
15:56
And the poetry is, you know,
335
956365
1499
15:57
I think, kind of not bad, actually.
336
957888
1914
15:59
(Laughter)
337
959826
1384
16:01
In closing,
338
961234
1159
16:02
I think that per Michelangelo,
339
962417
2132
16:04
I think he was right;
340
964573
1234
16:05
perception and creativity are very intimately connected.
341
965831
3436
16:09
What we've just seen are neural networks
342
969611
2634
16:12
that are entirely trained to discriminate,
343
972269
2303
16:14
or to recognize different things in the world,
344
974596
2242
16:16
able to be run in reverse, to generate.
345
976862
3161
16:20
One of the things that suggests to me
346
980047
1783
16:21
is not only that Michelangelo really did see
347
981854
2398
16:24
the sculpture in the blocks of stone,
348
984276
2452
16:26
but that any creature, any being, any alien
349
986752
3638
16:30
that is able to do perceptual acts of that sort
350
990414
3657
16:34
is also able to create
351
994095
1375
16:35
because it's exactly the same machinery that's used in both cases.
352
995494
3224
16:38
Also, I think that perception and creativity are by no means
353
998742
4532
16:43
uniquely human.
354
1003298
1210
16:44
We start to have computer models that can do exactly these sorts of things.
355
1004532
3708
16:48
And that ought to be unsurprising; the brain is computational.
356
1008264
3328
16:51
And finally,
357
1011616
1657
16:53
computing began as an exercise in designing intelligent machinery.
358
1013297
4668
16:57
It was very much modeled after the idea
359
1017989
2462
17:00
of how could we make machines intelligent.
360
1020475
3013
17:03
And we finally are starting to fulfill now
361
1023512
2162
17:05
some of the promises of those early pioneers,
362
1025698
2406
17:08
of Turing and von Neumann
363
1028128
1713
17:09
and McCulloch and Pitts.
364
1029865
2265
17:12
And I think that computing is not just about accounting
365
1032154
4098
17:16
or playing Candy Crush or something.
366
1036276
2147
17:18
From the beginning, we modeled them after our minds.
367
1038447
2578
17:21
And they give us both the ability to understand our own minds better
368
1041049
3269
17:24
and to extend them.
369
1044342
1529
17:26
Thank you very much.
370
1046627
1167
17:27
(Applause)
371
1047818
5939