Abe Davis: New video technology that reveals an object's hidden properties

204,157 views ・ 2015-05-05

TED


00:13
Most of us think of motion as a very visual thing.
00:17
If I walk across this stage or gesture with my hands while I speak,
00:22
that motion is something that you can see.
00:26
But there's a world of important motion that's too subtle for the human eye,
00:31
and over the past few years,
00:33
we've started to find that cameras
00:35
can often see this motion even when humans can't.
00:40
So let me show you what I mean.
00:42
On the left here, you see video of a person's wrist,
00:46
and on the right, you see video of a sleeping infant,
00:49
but if I didn't tell you that these were videos,
00:52
you might assume that you were looking at two regular images,
00:56
because in both cases,
00:58
these videos appear to be almost completely still.
01:02
But there's actually a lot of subtle motion going on here,
01:06
and if you were to touch the wrist on the left,
01:08
you would feel a pulse,
01:10
and if you were to hold the infant on the right,
01:12
you would feel the rise and fall of her chest
01:15
as she took each breath.
01:17
And these motions carry a lot of significance,
01:21
but they're usually too subtle for us to see,
01:24
so instead, we have to observe them
01:26
through direct contact, through touch.
01:30
But a few years ago,
01:32
my colleagues at MIT developed what they call a motion microscope,
01:36
which is software that finds these subtle motions in video
01:41
and amplifies them so that they become large enough for us to see.
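
A rough sketch of the amplification idea just described, not the MIT software itself: treat one pixel's brightness over time as a signal, measure how far each sample sits from the time-average, and scale that deviation by a gain. The function name `amplify_variation`, the gain of 50, and the sample values are illustrative assumptions; the talk does not describe how the motion microscope works internally.

```python
def amplify_variation(samples, gain):
    """Exaggerate small changes around the mean of a per-pixel time series."""
    baseline = sum(samples) / len(samples)
    return [baseline + gain * (s - baseline) for s in samples]

# Hypothetical brightness of one pixel across six frames (barely changing).
pixel = [100.0, 100.1, 100.0, 99.9, 100.0, 100.1]
amplified = amplify_variation(pixel, gain=50)
print([round(v, 2) for v in amplified])
```

The average brightness is unchanged, while the frame-to-frame swing grows by the gain factor, which is the sense in which a variation too small to notice becomes large enough to see.
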
01:45
And so, if we use their software on the left video,
01:48
it lets us see the pulse in this wrist,
01:52
and if we were to count that pulse,
01:53
we could even figure out this person's heart rate.
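
Counting the pulse to get a heart rate can be sketched as follows. This is a minimal illustration that assumes the amplified video has already been reduced to a single number per frame; the synthetic 1.2 Hz sine wave, the 30 frames-per-second rate, and the function names are assumptions, not details from the talk.

```python
import math

def count_peaks(signal):
    """Count local maxima that rise above the signal's mean."""
    mean = sum(signal) / len(signal)
    peaks = 0
    for prev, cur, nxt in zip(signal, signal[1:], signal[2:]):
        if cur > mean and cur > prev and cur >= nxt:
            peaks += 1
    return peaks

def beats_per_minute(signal, frames_per_second):
    """Convert a peak count over the clip's duration into beats per minute."""
    duration_s = len(signal) / frames_per_second
    return count_peaks(signal) * 60.0 / duration_s

# Synthetic stand-in for an amplified wrist signal: 1.2 Hz pulse, 30 fps, 10 s.
fps = 30
pulse = [math.sin(2 * math.pi * 1.2 * (n / fps)) for n in range(fps * 10)]
print(beats_per_minute(pulse, fps))
```

A real recording would be noisy, so a practical version would likely smooth the signal before counting peaks.
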
01:57
And if we used the same software on the right video,
02:00
it lets us see each breath that this infant takes,
02:03
and we can use this as a contact-free way to monitor her breathing.
02:08
And so this technology is really powerful because it takes these phenomena
02:14
that we normally have to experience through touch
02:16
and it lets us capture them visually and non-invasively.
02:21
So a couple years ago, I started working with the folks that created that software,
02:25
and we decided to pursue a crazy idea.
02:28
We thought, it's cool that we can use software
02:31
to visualize tiny motions like this,
02:34
and you can almost think of it as a way to extend our sense of touch.
02:39
But what if we could do the same thing with our ability to hear?
02:44
What if we could use video to capture the vibrations of sound,
02:49
which are just another kind of motion,
02:52
and turn everything that we see into a microphone?
02:56
Now, this is a bit of a strange idea,
02:58
so let me try to put it in perspective for you.
03:01
Traditional microphones work by converting the motion
03:05
of an internal diaphragm into an electrical signal,
03:08
and that diaphragm is designed to move readily with sound
03:12
so that its motion can be recorded and interpreted as audio.
03:17
But sound causes all objects to vibrate.
03:21
Those vibrations are just usually too subtle and too fast for us to see.
03:26
So what if we record them with a high-speed camera
03:30
and then use software to extract tiny motions
03:34
from our high-speed video,
03:36
and analyze those motions to figure out what sounds created them?
03:41
This would let us turn visible objects into visual microphones from a distance.
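
The last step of that pipeline, turning extracted motion into something playable, can be sketched under one simplifying assumption: that the software has already produced one displacement value per video frame. The sketch centers and scales those values into 16-bit audio samples. The function names and the sample displacements are hypothetical, and the motion extraction itself is not shown.

```python
import struct

def displacements_to_pcm16(displacements):
    """Center and scale a motion trace into 16-bit PCM audio samples."""
    mean = sum(displacements) / len(displacements)
    centered = [d - mean for d in displacements]
    peak = max(abs(c) for c in centered) or 1.0  # avoid dividing by zero
    return [int(round(32767 * c / peak)) for c in centered]

def pack_pcm16(samples):
    """Serialize samples as little-endian signed 16-bit integers."""
    return struct.pack("<%dh" % len(samples), *samples)

# Hypothetical per-frame displacements in pixels (one value per video frame).
trace = [0.0010, 0.0014, 0.0010, 0.0006, 0.0010]
samples = displacements_to_pcm16(trace)
print(samples)
```

In this framing the camera's frame rate plays the role of the audio sample rate, which is one reason a high-speed camera helps. The packed bytes could be written to a file with Python's standard `wave` module.
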
03:49
And so we tried this out,
03:51
and here's one of our experiments,
03:53
where we took this potted plant that you see on the right
03:56
and we filmed it with a high-speed camera
03:58
while a nearby loudspeaker played this sound.
04:02
(Music: "Mary Had a Little Lamb")
04:11
And so here's the video that we recorded,
04:14
and we recorded it at thousands of frames per second,
04:18
but even if you look very closely,
04:20
all you'll see are some leaves
04:22
that are pretty much just sitting there doing nothing,
04:25
because our sound only moved those leaves by about a micrometer.
04:31
That's one ten-thousandth of a centimeter,
04:35
which spans somewhere between a hundredth and a thousandth
04:39
of a pixel in this image.
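
The unit arithmetic in that passage can be checked with a few lines. The implied scene width covered by one pixel is an inference from the stated figures, not a number given in the talk, and the helper name is made up for this sketch.

```python
MICROMETER_IN_CM = 1e-6 * 100  # 1 micrometer = 1e-6 m, and 1 m = 100 cm

def scene_width_per_pixel_um(motion_um, motion_in_pixels):
    """Scene distance covered by one pixel, given a motion and its pixel span."""
    return motion_um / motion_in_pixels

print(MICROMETER_IN_CM)                         # one ten-thousandth of a centimeter
print(scene_width_per_pixel_um(1.0, 1 / 100))   # pixel covers about 100 micrometers
print(scene_width_per_pixel_um(1.0, 1 / 1000))  # pixel covers about 1 millimeter
```

So if a one-micrometer motion spans a hundredth to a thousandth of a pixel, each pixel in that image covers very roughly a tenth of a millimeter to a millimeter of the plant.
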
04:41
So you can squint all you want,
04:44
but motion that small is pretty much perceptually invisible.
04:49
But it turns out that something can be perceptually invisible
04:53
and still be numerically significant,
04:56
because with the right algorithms,
04:58
we can take this silent, seemingly still video
05:02
and we can recover this sound.
05:04
(Music: "Mary Had a Little Lamb")
05:12
(Applause)
05:22
So how is this possible?
05:23
How can we get so much information out of so little motion?
05:28
Well, let's say that those leaves move by just a single micrometer,
05:33
and let's say that that shifts our image by just a thousandth of a pixel.
05:39
That may not seem like much,
05:41
but a single frame of video
05:43
may have hundreds of thousands of pixels in it,
05:47
and so if we combine all of the tiny motions that we see
05:50
from across that entire image,
05:52
then suddenly a thousandth of a pixel
05:55
can start to add up to something pretty significant.
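
One way to see why combining many pixels helps is a small simulation. It assumes, purely for illustration, that every pixel gives an independent estimate of the same shift, corrupted by Gaussian noise fifty times larger than the shift itself. The talk does not say how the estimates are actually combined, so plain averaging stands in for that step here; the noise level, pixel count, and function name are assumptions.

```python
import random
import statistics

def average_shift(true_shift, noise_sigma, pixel_count, seed=0):
    """Average many noisy per-pixel shift estimates into one measurement."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_shift, noise_sigma) for _ in range(pixel_count)]
    return statistics.fmean(estimates)

true_shift = 0.001   # a thousandth of a pixel, as in the talk
noise_sigma = 0.05   # assumed per-pixel measurement noise, in pixels
recovered = average_shift(true_shift, noise_sigma, pixel_count=200_000)
print(f"recovered shift: {recovered:.5f} px")
```

Under these assumptions the error of the average shrinks with the square root of the number of pixels, so 200,000 estimates reduce the noise by a factor of roughly 450, enough for a thousandth-of-a-pixel shift to stand out.
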
05:58
On a personal note, we were pretty psyched when we figured this out.
06:02
(Laughter)
06:04
But even with the right algorithm,
06:08
we were still missing a pretty important piece of the puzzle.
06:11
You see, there are a lot of factors that affect when and how well
06:15
this technique will work.
06:17
There's the object and how far away it is;
06:20
there's the camera and the lens that you use;
06:22
how much light is shining on the object and how loud your sound is.
06:27
And even with the right algorithm,
06:31
we had to be very careful with our early experiments,
06:34
because if we got any of these factors wrong,
06:37
there was no way to tell what the problem was.
06:39
We would just get noise back.
06:42
And so a lot of our early experiments looked like this.
06:45
And so here I am,
06:47
and on the bottom left, you can kind of see our high-speed camera,
06:51
which is pointed at a bag of chips,
06:53
and the whole thing is lit by these bright lamps.
06:56
And like I said, we had to be very careful in these early experiments,
07:01
so this is how it went down.
07:03
(Video) Abe Davis: Three, two, one, go.
07:07
Mary had a little lamb! Little lamb! Little lamb!
07:12
(Laughter)
07:17
AD: So this experiment looks completely ridiculous.
07:20
(Laughter)
07:21
I mean, I'm screaming at a bag of chips --
07:24
(Laughter) --
07:25
and we're blasting it with so much light,
07:27
we literally melted the first bag we tried this on. (Laughter)
07:32
But ridiculous as this experiment looks,
07:35
it was actually really important,
07:37
because we were able to recover this sound.
07:40
(Audio) Mary had a little lamb! Little lamb! Little lamb!
07:45
(Applause)
07:49
AD: And this was really significant,
07:51
because it was the first time we recovered intelligible human speech
07:55
from silent video of an object.
07:57
And so it gave us this point of reference,
08:00
and gradually we could start to modify the experiment,
08:04
using different objects or moving the object further away,
08:07
using less light or quieter sounds.
08:11
And we analyzed all of these experiments
08:14
until we really understood the limits of our technique,
08:18
because once we understood those limits,
08:20
we could figure out how to push them.
08:22
And that led to experiments like this one,
08:25
where again, I'm going to speak to a bag of chips,
08:28
but this time we've moved our camera about 15 feet away,
08:33
outside, behind a soundproof window,
08:36
and the whole thing is lit by only natural sunlight.
08:40
And so here's the video that we captured.
08:44
And this is what things sounded like from inside, next to the bag of chips.
08:49
(Audio) Mary had a little lamb whose fleece was white as snow,
08:54
and everywhere that Mary went, that lamb was sure to go.
08:59
AD: And here's what we were able to recover from our silent video
09:03
captured outside behind that window.
09:06
(Audio) Mary had a little lamb whose fleece was white as snow,
09:10
and everywhere that Mary went, that lamb was sure to go.
09:15
(Applause)
09:22
AD: And there are other ways that we can push these limits as well.
09:25
So here's a quieter experiment
09:27
where we filmed some earphones plugged into a laptop computer,
09:31
and in this case, our goal was to recover the music that was playing on that laptop
09:35
from just silent video
09:38
of these two little plastic earphones,
09:40
and we were able to do this so well
09:42
that I could even Shazam our results.
09:45
(Laughter)
09:49
(Music: "Under Pressure" by Queen)
10:01
(Applause)
10:06
And we can also push things by changing the hardware that we use.
10:11
Because the experiments I've shown you so far
10:13
were done with a camera, a high-speed camera,
10:15
that can record video about 100 times faster
10:18
than most cell phones,
10:20
but we've also found a way to use this technique
10:23
with more regular cameras,
10:25
and we do that by taking advantage of what's called a rolling shutter.
10:29
You see, most cameras record images one row at a time,
10:34
and so if an object moves during the recording of a single image,
10:40
there's a slight time delay between each row,
10:43
and this causes slight artifacts
10:46
that get coded into each frame of a video.
10:49
And so what we found is that by analyzing these artifacts,
10:53
we can actually recover sound using a modified version of our algorithm.
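
The row-by-row timing described above can be sketched as follows. This is a simplified model that assumes the rows of a frame are read out evenly across the whole frame period; real sensors may pause between frames, and the 720-row, 60 frames-per-second figures are hypothetical. It illustrates only the timing, not the modified algorithm mentioned in the talk.

```python
def row_capture_times(frame_index, rows_per_frame, frames_per_second):
    """Approximate capture time in seconds for each row of one frame."""
    frame_period = 1.0 / frames_per_second
    row_delay = frame_period / rows_per_frame  # assumes rows fill the whole period
    frame_start = frame_index * frame_period
    return [frame_start + r * row_delay for r in range(rows_per_frame)]

def effective_row_rate(rows_per_frame, frames_per_second):
    """Rows sampled per second under the same assumption."""
    return rows_per_frame * frames_per_second

times = row_capture_times(frame_index=0, rows_per_frame=720, frames_per_second=60)
print(len(times), times[1] - times[0])
print(effective_row_rate(720, 60))
```

Because each row is a slightly later snapshot, a camera that records only 60 frames per second still samples the scene tens of thousands of times per second when counted row by row, which is why the artifacts can carry information about fast vibrations.
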
10:58
So here's an experiment we did
11:00
where we filmed a bag of candy
11:01
while a nearby loudspeaker played
11:03
the same "Mary Had a Little Lamb" music from before,
11:06
but this time, we used just a regular store-bought camera,
11:10
and so in a second, I'll play for you the sound that we recovered,
11:13
and it's going to sound distorted this time,
11:15
but listen and see if you can still recognize the music.
11:19
(Audio: "Mary Had a Little Lamb")
11:37
And so, again, that sounds distorted,
11:40
but what's really amazing here is that we were able to do this
11:45
with something that you could literally run out
11:48
and pick up at a Best Buy.
11:51
So at this point,
11:52
a lot of people see this work,
11:54
and they immediately think about surveillance.
11:57
And to be fair,
12:00
it's not hard to imagine how you might use this technology to spy on someone.
12:04
But keep in mind that there's already a lot of very mature technology
12:08
out there for surveillance.
12:09
In fact, people have been using lasers
12:12
to eavesdrop on objects from a distance for decades.
12:15
But what's really new here,
12:18
what's really different,
12:19
is that now we have a way to picture the vibrations of an object,
12:23
which gives us a new lens through which to look at the world,
12:27
and we can use that lens
12:28
to learn not just about forces like sound that cause an object to vibrate,
12:33
but also about the object itself.
12:36
And so I want to take a step back
12:38
and think about how that might change the ways that we use video,
12:42
because we usually use video to look at things,
12:46
and I've just shown you how we can use it
12:48
to listen to things.
12:50
But there's another important way that we learn about the world:
12:54
that's by interacting with it.
12:56
We push and pull and poke and prod things.
13:00
We shake things and see what happens.
13:03
And that's something that video still won't let us do,
13:07
at least not traditionally.
13:09
So I want to show you some new work,
13:11
and this is based on an idea I had just a few months ago,
13:14
so this is actually the first time I've shown it to a public audience.
13:17
And the basic idea is that we're going to use the vibrations in a video
13:22
to capture objects in a way that will let us interact with them
13:27
and see how they react to us.
13:31
So here's an object,
13:32
and in this case, it's a wire figure in the shape of a human,
13:36
and we're going to film that object with just a regular camera.
13:39
So there's nothing special about this camera.
13:41
In fact, I've actually done this with my cell phone before.
13:44
But we do want to see the object vibrate,
13:47
so to make that happen,
13:48
we're just going to bang a little bit on the surface where it's resting
13:51
while we record this video.
13:59
So that's it: just five seconds of regular video,
14:03
while we bang on this surface,
14:05
and we're going to use the vibrations in that video
14:08
to learn about the structural and material properties of our object,
14:13
and we're going to use that information to create something new and interactive.
14:24
And so here's what we've created.
14:27
And it looks like a regular image,
14:29
but this isn't an image, and it's not a video,
14:32
because now I can take my mouse
14:35
and I can start interacting with the object.
14:44
And so what you see here
14:47
is a simulation of how this object
14:49
would respond to new forces that we've never seen before,
14:54
and we created it from just five seconds of regular video.
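
The talk does not say how the simulation is built, only that the vibrations reveal structural and material properties. As a loose illustration of predicting a response to a new force, the sketch below assumes an object can be described by a single damped vibration mode whose frequency and damping have somehow been measured, then simulates what happens after a fresh push. The 3 Hz frequency, the damping ratio, and the function name are all hypothetical.

```python
import math

def simulate_mode(frequency_hz, damping_ratio, impulse, dt, steps):
    """Integrate one damped vibration mode after an impulsive push."""
    omega = 2 * math.pi * frequency_hz
    position, velocity = 0.0, impulse  # unit mass: impulse sets initial velocity
    trajectory = []
    for _ in range(steps):
        acceleration = -2 * damping_ratio * omega * velocity - omega ** 2 * position
        velocity += acceleration * dt  # semi-implicit Euler: update velocity first,
        position += velocity * dt      # then position using the new velocity
        trajectory.append(position)
    return trajectory

# Hypothetical mode: 3 Hz, lightly damped, simulated for 5 s at 1 ms steps.
path = simulate_mode(frequency_hz=3.0, damping_ratio=0.05, impulse=1.0,
                     dt=0.001, steps=5000)
print(max(abs(p) for p in path[:500]), max(abs(p) for p in path[-500:]))
```

The object swings after the push and then settles, with the rate of settling set by the damping. A real object would have many such modes, each with its own shape across the image.
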
14:59
(Applause)
15:09
And so this is a really powerful way to look at the world,
15:12
because it lets us predict how objects will respond
15:15
to new situations,
15:17
and you could imagine, for instance, looking at an old bridge
15:20
and wondering what would happen, how would that bridge hold up
15:24
if I were to drive my car across it.
15:27
And that's a question that you probably want to answer
15:30
before you start driving across that bridge.
15:33
And of course, there are going to be limitations to this technique,
15:37
just like there were with the visual microphone,
15:39
but we found that it works in a lot of situations
15:42
that you might not expect,
15:44
especially if you give it longer videos.
15:47
So for example, here's a video that I captured
15:50
of a bush outside of my apartment,
15:52
and I didn't do anything to this bush,
15:55
but by capturing a minute-long video,
15:58
a gentle breeze caused enough vibrations
16:01
that we could learn enough about this bush to create this simulation.
16:07
(Applause)
16:13
And so you could imagine giving this to a film director,
16:16
and letting him control, say,
16:18
the strength and direction of wind in a shot after it's been recorded.
16:24
Or, in this case, we pointed our camera at a hanging curtain,
16:29
and you can't even see any motion in this video,
16:33
but by recording a two-minute-long video,
16:36
natural air currents in this room
16:38
created enough subtle, imperceptible motions and vibrations
16:43
that we could learn enough to create this simulation.
16:48
And ironically,
16:50
we're kind of used to having this kind of interactivity
16:53
when it comes to virtual objects,
16:56
when it comes to video games and 3D models,
16:59
but to be able to capture this information from real objects in the real world
17:04
using just simple, regular video,
17:06
is something new that has a lot of potential.
17:10
So here are the amazing people who worked with me on these projects.
17:16
(Applause)
17:24
And what I've shown you today is only the beginning.
17:27
We've just started to scratch the surface
17:29
of what you can do with this kind of imaging,
17:32
because it gives us a new way
17:35
to capture our surroundings with common, accessible technology.
17:40
And so looking to the future,
17:41
it's going to be really exciting to explore
17:44
what this can tell us about the world.
17:46
Thank you.
17:47
(Applause)