Blaise Aguera y Arcas: Jaw-dropping Photosynth demo

46,128 views ・ 2007-06-26

TED


Translator: Geng Luo Reviewer: dahong zhang
00:25
What I'm going to show you first, as quickly as I can,
00:27
is some foundational work, some new technology
00:31
that we brought to Microsoft as part of an acquisition
00:34
almost exactly a year ago.
00:35
This is Seadragon, and it's an environment
00:38
in which you can either locally or remotely interact
00:40
with vast amounts of visual data.
00:43
We're looking at many, many gigabytes of digital photos here
00:46
and kind of seamlessly and continuously zooming in,
00:49
panning through it, rearranging it in any way we want.
00:52
And it doesn't matter how much information we're looking at,
00:56
how big these collections are or how big the images are.
00:59
Most of them are ordinary digital camera photos,
01:01
but this one, for example, is a scan from the Library of Congress,
01:04
and it's in the 300 megapixel range.
01:07
It doesn't make any difference
01:09
because the only thing that ought to limit the performance of a system like this one
01:13
is the number of pixels on your screen at any given moment.
01:15
It's also very flexible architecture.
01:17
This is an entire book, so this is an example of non-image data.
01:21
This is "Bleak House" by Dickens.
01:24
Every column is a chapter.
01:27
To prove to you that it's really text, and not an image,
01:31
we can do something like so, to really show
01:33
that this is a real representation of the text; it's not a picture.
01:36
Maybe this is an artificial way to read an e-book.
01:38
I wouldn't recommend it.
01:40
This is a more realistic case, an issue of The Guardian.
01:43
Every large image is the beginning of a section.
01:45
And this really gives you the joy and the good experience
01:48
of reading the real paper version of a magazine or a newspaper,
01:53
which is an inherently multi-scale kind of medium.
01:55
We've done something
01:57
with the corner of this particular issue of The Guardian.
02:00
We've made up a fake ad that's very high resolution --
02:03
much higher than in an ordinary ad --
02:05
and we've embedded extra content.
02:07
If you want to see the features of this car, you can see it here.
02:10
Or other models, or even technical specifications.
02:14
And this really gets at some of these ideas
02:17
about really doing away with those limits on screen real estate.
02:22
We hope that this means no more pop-ups
02:24
and other rubbish like that -- shouldn't be necessary.
02:27
Of course, mapping is one of those obvious applications
02:29
for a technology like this.
02:31
And this one I really won't spend any time on,
02:33
except to say that we have things to contribute to this field as well.
02:37
But those are all the roads in the U.S.
02:39
superimposed on top of a NASA geospatial image.
02:44
So let's pull up, now, something else.
02:46
This is actually live on the Web now; you can go check it out.
02:49
This is a project called Photosynth, which marries two different technologies.
02:52
One of them is Seadragon
02:54
and the other is some very beautiful computer-vision research
02:56
done by Noah Snavely, a graduate student at the University of Washington,
03:00
co-advised by Steve Seitz at U.W.
03:02
and Rick Szeliski at Microsoft Research.
03:04
A very nice collaboration.
03:06
And so this is live on the Web. It's powered by Seadragon.
03:09
You can see that when we do these sorts of views,
03:12
where we can dive through images
03:13
and have this kind of multi-resolution experience.
03:16
But the spatial arrangement of the images here is actually meaningful.
03:20
The computer vision algorithms have registered these images together
03:23
so that they correspond to the real space in which these shots --
03:27
all taken near Grassi Lakes in the Canadian Rockies --
03:30
all these shots were taken.
03:32
So you see elements here
03:33
of stabilized slide-show or panoramic imaging,
03:39
and these things have all been related spatially.
03:42
I'm not sure if I have time to show you any other environments.
03:45
Some are much more spatial.
03:46
I would like to jump straight to one of Noah's original data-sets --
03:50
this is from an early prototype that we first got working this summer --
03:54
to show you what I think
03:55
is really the punch line behind the Photosynth technology.
03:59
It's not necessarily so apparent
04:01
from looking at the environments we've put up on the website.
04:04
We had to worry about the lawyers and so on.
04:06
This is a reconstruction of Notre Dame Cathedral
04:08
that was done entirely computationally from images scraped from Flickr.
04:12
You just type Notre Dame into Flickr,
04:14
and you get some pictures of guys in T-shirts, and of the campus and so on.
04:18
And each of these orange cones represents an image
04:21
that was discovered to belong to this model.
04:26
And so these are all Flickr images,
04:28
and they've all been related spatially in this way.
04:31
We can just navigate in this very simple way.
04:35
(Applause)
04:42
(Applause ends)
04:43
You know, I never thought that I'd end up working at Microsoft.
04:46
It's very gratifying to have this kind of reception here.
04:49
(Laughter)
04:53
I guess you can see this is lots of different types of cameras:
04:58
it's everything from cell-phone cameras to professional SLRs,
05:01
quite a large number of them, stitched together in this environment.
05:04
If I can find some of the sort of weird ones --
05:08
So many of them are occluded by faces, and so on.
05:12
Somewhere in here there is actually a series of photographs -- here we go.
05:16
This is actually a poster of Notre Dame that registered correctly.
05:20
We can dive in from the poster
05:23
to a physical view of this environment.
05:31
What the point here really is
05:33
is that we can do things with the social environment.
05:35
This is now taking data from everybody --
05:38
from the entire collective memory, visually, of what the Earth looks like --
05:42
and link all of that together.
05:44
Those photos become linked, and they make something emergent
05:47
that's greater than the sum of the parts.
05:49
You have a model that emerges of the entire Earth.
05:51
Think of this as the long tail to Stephen Lawler's Virtual Earth work.
05:55
And this is something that grows in complexity as people use it,
05:59
and whose benefits become greater to the users as they use it.
06:03
Their own photos are getting tagged with meta-data that somebody else entered.
06:06
If somebody bothered to tag all of these saints
06:10
and say who they all are, then my photo of Notre Dame Cathedral
06:13
suddenly gets enriched with all of that data,
06:15
and I can use it as an entry point to dive into that space,
06:18
into that meta-verse, using everybody else's photos,
06:20
and do a kind of a cross-modal
06:24
and cross-user social experience that way.
06:27
And of course, a by-product of all of that is immensely rich virtual models
06:32
of every interesting part of the Earth,
06:33
collected not just from overhead flights and from satellite images
06:38
and so on, but from the collective memory.
06:40
Thank you so much.
06:41
(Applause)
06:51
(Applause ends)
06:52
Chris Anderson: Do I understand this right?
06:55
What your software is going to allow,
06:57
is that at some point, really within the next few years,
07:01
all the pictures that are shared by anyone across the world
07:05
are going to link together?
07:07
BAA: Yes. What this is really doing is discovering,
07:09
creating hyperlinks, if you will, between images.
07:12
It's doing that based on the content inside the images.
07:14
And that gets really exciting when you think about the richness
07:17
of the semantic information a lot of images have.
07:19
Like when you do a web search for images,
07:21
you type in phrases,
07:23
and the text on the web page is carrying a lot of information
07:26
about what that picture is of.
07:27
What if that picture links to all of your pictures?
07:30
The amount of semantic interconnection and richness
07:32
that comes out of that is really huge.
07:34
It's a classic network effect.
07:35
CA: Truly incredible. Congratulations.
BAA: Thank you all very much!