How AI Is Saving Billions of Years of Human Research Time | Max Jaderberg | TED

54,745 views ・ 2024-12-06

TED



00:04
So a while ago now, I did a PhD,
00:08
and I actually thought it would be quite easy to do research.
00:11
Turns out it was really hard.
00:14
My PhD was spent coding up neural network layers
00:18
and writing CUDA kernels,
00:20
very much computer-based science.
00:23
And at that time,
00:25
I had a friend who worked in a lab doing real messy science.
00:30
He was trying to work out the structure of proteins experimentally.
00:35
And this is a really difficult thing to do.
00:39
It can take a whole PhD's worth of work
00:42
just to work out the structure of a single new protein system.
00:47
And then 10 years later, the field that I was in,
00:50
machine learning,
00:51
revolutionized his world of protein structure.
00:55
A neural network called AlphaFold was created by DeepMind
01:00
that can very accurately predict the structure of proteins
01:04
and solved this 50-year challenge of trying to do protein folding.
01:10
And just two weeks ago,
01:12
this won the Nobel Prize in chemistry.
01:15
And it's estimated that since the release of this model,
01:19
we've saved over a billion years of research time.
01:24
(Applause)
01:26
A billion years.
01:27
(Applause)
01:30
A whole PhD's worth of work is now approximated
01:34
by a couple of seconds of neural network time.
01:37
And to my friend, this might sound a bit depressing,
01:39
and I'm sorry about that,
01:41
but to me, this is just really an incredible thing.
01:44
The sheer scale of new knowledge about our protein universe
01:47
that we now have access to,
01:50
due to an AI model that's able to replace the need
01:53
for real-world experimental lab work.
01:55
And that frees up our precious human time
01:58
to begin probing the next frontiers of science.
02:02
Now some people say that this is a one-time-only event,
02:06
that we can't expect to see these sort of breakthroughs in science
02:09
with AI to be repeated.
02:13
And I disagree.
02:15
We will continue to see breakthroughs in understanding our real messy world
02:20
with AI.
02:22
Why?
02:23
Because we now have the neural network architectures
02:26
that can eat up any data modality that you throw at them.
02:31
And we have tried and tested recipes
02:33
of incorporating any possible signal in the world
02:36
into these learning algorithms.
02:38
And then we have the engineering and infrastructure
02:41
to scale these models to whatever size is needed
02:44
to take advantage of the massive amounts of compute power that we can create.
02:50
And finally, we're always creating new ways to record and measure
02:54
every detail of our real messy world
02:57
that then creates even bigger data sets
02:59
that help us train even richer models.
03:03
And so this is a new paradigm in front of us,
03:07
that of creating AI analogs of our real messy world.
03:13
This new AI paradigm takes our real, messy, natural world
03:17
and learns to recreate the elements of it with neural networks.
03:22
And why these AI analogs are so powerful is that it's not just about understanding,
03:27
approximating or simulating the world for the sake of understanding,
03:32
but this actually gives us a little virtual world
03:35
that we can experiment in at scale
03:37
to ultimately create new knowledge.
03:43
And you can imagine that this experimentation against our AI analogs,
03:49
this can also happen in silico, in a computer with other agents,
03:54
in a loop of in silico, open-ended discovery,
03:58
ultimately to create new knowledge that we can take back out
04:02
and change the world around us.
04:05
And this isn't science fiction.
04:08
Right now, we have thousands of graphics cards burning,
04:13
training foundational models of our own micro-biological world,
04:17
and then agents that are probing these AI analogs
04:20
to design new molecules that could be potential new drugs.
04:25
And I want to show you exactly how this process works for us,
04:30
because I believe it can serve as a blueprint
04:33
to bring about a whole new wave
04:35
of the future of AI-driven scientific
04:38
and technological progress.
04:42
Now drug design is such an important area to focus on
04:46
because it's actually becoming harder and harder to design new drugs.
04:50
This is a graph of the number of new drugs created
04:53
per billion dollars of R&D spent over time.
04:58
And what you can see is that the number of new drugs
05:01
is exponentially decreasing.
05:03
It's becoming more and more expensive to create a new drug.
05:07
Now during this same time period,
05:09
we've had a huge amount of advancement in the capabilities of AI,
05:13
driven by a whole host of algorithmic breakthroughs.
05:16
But one of the secret sauces of this advancement in AI
05:20
has also been that of Moore's Law,
05:22
that the amount of computing power
05:24
has just been exponentially increasing over time.
05:27
And these days, it perhaps isn't Moore's Law
05:29
that we should care about, but Jensen's law.
05:32
Jensen Huang, being the CEO of Nvidia,
05:34
for the exponential increase in GPU FLOPS
05:38
that are now powering our neural networks.
05:42
So really the question is,
05:43
how do we bring this world of AI and machine learning
05:47
to that of drug design?
05:50
Can we think about using our AI analogs to reverse this curse of Eroom’s law
05:55
and jump on this exponential wave of GPU FLOPS powering our neural networks?
06:00
Actually bringing these worlds together and driving this change
06:03
is the day-to-day responsibility that I feel.
06:07
So how can we go about modeling biology?
06:11
Well if we were in the world of physics, for example, modeling the universe,
06:15
then we can actually write down a lot of the theory by hand with maths
06:20
and very accurately predict, for example, the unfolding of the universe,
06:24
even millions of light years away.
06:27
But we can't do that for the incredibly complex dynamics within ourselves.
06:33
We can't just write down some equations for ourselves.
06:36
We can perhaps write down the theory of how atoms interact.
06:40
That's physics.
06:42
But then simulating these interactions
06:44
on the scale of trillions of atoms within our cells
06:48
is just completely unfeasible.
06:51
And then we haven't worked out how to describe these complex dynamics
06:54
in coarser and simpler terms that we could write down with maths.
07:00
It’s just crazy to think that we can model the universe so far away
07:05
but not the cells at our fingertips.
07:09
But AI and machine learning can be the perfect abstraction
07:12
for a biological world.
07:15
Using the snippets of data that we can record from our cells,
07:18
we can then learn the equations and theories and abstractions implicitly
07:23
within the activations of our neural networks.
07:27
In fact, our company is called Isomorphic Labs.
07:31
Isomorphic because we believe there is an isomorphism,
07:35
a fundamental symmetry, that we can create
07:38
between the biological world and the world of information science,
07:42
machine learning and AI.
07:46
So to see how we are using these AI analogs today,
07:49
I want to dive into the body and have a look into cells
07:53
and think about proteins.
07:56
Now proteins are one of the fundamental building blocks of life.
08:00
And these proteins carry different functions in the body.
08:05
And if we can modulate the function of a protein,
08:07
then we are well on our way to creating a new drug.
08:11
Proteins are made up of a sequence of amino acids,
08:14
and there are about 20 different amino acids,
08:16
each one here depicted by a different letter.
08:20
An amino acid is a collection of atoms, a molecule,
08:26
and these molecules are joined together into a linear sequence.
08:31
And the function of a protein is not just due to the sequence of these proteins,
08:36
but also due to the three-dimensional shape that these proteins fold up into.
08:42
And there are thousands of proteins inside of us,
08:45
each with their own unique sequences and their own unique 3D shape.
08:50
And remember,
08:51
trying to work out experimentally that 3D shape
08:54
can take months or even years of lab work.
09:00
But with the breakthrough of AlphaFold and AlphaFold 2 in 2020,
09:05
we now have a model that can take the sequence of amino acids as input
09:10
and then very accurately predict the 3D structure of a protein
09:14
as the output.
09:16
And this allows us to actually fill in the gaps
09:20
of our known protein universe.
09:22
It's our AI analog of proteins.
09:27
So proteins carry their function.
09:29
But these proteins, they don't actually act in isolation.
09:33
They're part of bigger molecular machines
09:36
with these proteins interacting with other proteins
09:39
as well as other biomolecules like DNA,
09:42
RNA and small molecules.
09:45
For example,
09:46
let's zoom in and have a look at this protein.
09:48
This is a protein that repairs DNA,
09:50
and it interacts with DNA
09:53
clamping down on it,
09:55
helping facilitate repair
09:56
and then the repaired DNA is released back out to the cell.
10:00
Now in drug design what we want to do
10:03
is either make molecular machines work better
10:06
or actually stop them from working.
10:09
And in this case, for cancer,
10:11
we actually want to stop this particular DNA repair protein from working,
10:15
because in cancerous cells
10:17
there is no backup DNA repair mechanism.
10:21
And so if we stop this one working, then cancerous cells will die,
10:24
leaving just healthy cells remaining.
10:27
So what would a drug actually look like for this protein?
10:31
Well a drug is something that comes in and modulates a molecular machine.
10:36
And this could be a drug molecule that goes into the body,
10:40
goes into the cell and then sticks to this protein just over here.
10:46
And this drug molecule actually glues the DNA repair protein's clamp shut,
10:51
so it can't do effective DNA repair, causing cancerous cells to die
10:55
and leaving just healthy cells remaining.
10:59
Now to design such an amazing drug molecule completely rationally,
11:04
we'd have to understand
11:05
how all of these biomolecular elements come together.
11:09
We would need an AI analog of all and any biomolecular systems.
11:16
Earlier this year, we had a breakthrough.
11:19
We developed a new version of AlphaFold, called AlphaFold 3,
11:23
that can model the structure
11:25
of almost all biomolecules coming together
11:28
with unprecedented accuracy.
11:32
This model takes as input the protein sequence,
11:34
the DNA sequence
11:36
and the molecule atoms.
11:38
And these inputs are fed to a neural network
11:40
that has a large processing trunk based on transformers.
11:45
Now unlike a large language model
11:47
that operates on one-dimensional sequences,
11:50
instead, our model uses what’s called a “pairformer”
11:54
and operates on a 2D interaction grid of the input sequence.
12:00
And this allows our model to explicitly reason about every pairwise interaction
12:05
that could occur in this biomolecular system.
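The idea of a 2D interaction grid over an input sequence can be illustrated with a minimal sketch. This is not AlphaFold 3's code: the `Token` and `PairFeature` types, the `tokenize` and `build_pair_grid` functions, and the two toy features (same-chain flag and sequence separation) are all hypothetical, chosen only to show how N input tokens give rise to an N x N grid with one cell per pair.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    chain: str   # hypothetical chain label, e.g. "protein" or "dna"
    index: int   # position of the token within its own chain
    symbol: str  # residue or base letter


class PairFeature(NamedTuple):
    same_chain: bool
    separation: int  # |i - j| within one chain, -1 across chains


def tokenize(chains: List[Tuple[str, str]]) -> List[Token]:
    """Flatten several named sequences into one 1D list of tokens."""
    return [Token(name, i, s) for name, seq in chains for i, s in enumerate(seq)]


def build_pair_grid(tokens: List[Token]) -> List[List[PairFeature]]:
    """Build an N x N grid holding one toy feature per pair of tokens."""
    grid = []
    for a in tokens:
        row = []
        for b in tokens:
            same = a.chain == b.chain
            sep = abs(a.index - b.index) if same else -1
            row.append(PairFeature(same, sep))
        grid.append(row)
    return grid


# Toy input: a 3-residue protein fragment and a 2-base DNA fragment.
tokens = tokenize([("protein", "MKT"), ("dna", "AC")])
grid = build_pair_grid(tokens)
```

Here five input tokens produce a 5 x 5 grid, so every pair, including protein-to-DNA pairs, has its own cell. In a real model each cell would hold a learned feature vector rather than two hand-written values.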
12:09
And so we can use the features of this processing trunk
12:12
to condition a diffusion model.
12:15
Now you might know diffusion models
12:17
as these amazing image generative models.
12:21
Now just like diffusing the pixels in an image,
12:24
instead, our diffusion model diffuses
12:26
the 3D atom coordinates of our biomolecular system.
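Diffusing 3D atom coordinates can be sketched in a few lines, under heavy assumptions. This is not the AlphaFold 3 sampler: `add_noise`, `denoise_step`, `sample` and the noise schedule are illustrative names and choices, and the "denoiser" is an oracle that already knows the answer, standing in for a trained network conditioned on trunk features. The update rule is a simple deterministic interpolation between the noisy coordinates and the denoiser's prediction.

```python
import random
from typing import Callable, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def add_noise(coords: List[Coord], sigma: float, rng: random.Random) -> List[Coord]:
    """Forward process: perturb every atom coordinate with Gaussian noise."""
    return [tuple(c + sigma * rng.gauss(0.0, 1.0) for c in atom) for atom in coords]


def denoise_step(noisy: List[Coord], predicted_clean: List[Coord],
                 sigma_from: float, sigma_to: float) -> List[Coord]:
    """Move from noise level sigma_from to sigma_to toward the prediction."""
    ratio = sigma_to / sigma_from
    return [tuple(p + ratio * (n - p) for n, p in zip(n_atom, p_atom))
            for n_atom, p_atom in zip(noisy, predicted_clean)]


def sample(denoiser: Callable[[List[Coord], float], List[Coord]],
           n_atoms: int, sigmas: Sequence[float], rng: random.Random) -> List[Coord]:
    """Start from pure noise and walk down a decreasing noise schedule."""
    coords = [tuple(sigmas[0] * rng.gauss(0.0, 1.0) for _ in range(3))
              for _ in range(n_atoms)]
    for s_from, s_to in zip(sigmas, sigmas[1:]):
        coords = denoise_step(coords, denoiser(coords, s_from), s_from, s_to)
    return coords


# Hypothetical three-atom "structure" and an oracle denoiser that returns it.
TARGET: List[Coord] = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]


def oracle(coords: List[Coord], sigma: float) -> List[Coord]:
    return TARGET


result = sample(oracle, len(TARGET), [10.0, 5.0, 1.0, 0.0], random.Random(0))
```

Because the schedule ends at a noise level of zero and the oracle is perfect, the sampled coordinates land on `TARGET`. A learned denoiser would only approximate the clean structure at each step, which is why real samplers take many small steps.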
12:32
So now this gives us a completely malleable virtual biomolecular world.
12:38
It’s our AI analog that we can probe as if it’s the real world.
12:42
We can make changes to the inputs,
12:44
changes to the molecule designs
12:46
and see how that changes the output structure.
12:49
So let's use this model to design a new drug
12:52
for our DNA repair protein.
12:55
We can take a small molecule that's been recorded to stick to this protein
12:59
and make changes to its design.
13:02
We want to change the molecule design
13:04
so that this molecule makes more interactions with the protein,
13:07
and that will make it stick to this protein stronger.
13:11
And so you can imagine that this gives a human drug designer
13:15
a perfect game to play.
13:17
How do I change the design of this molecule
13:20
to create more interactions?
13:23
Now normally,
13:24
a drug designer would have to wait months
13:27
to get results back from a real lab at each step of this design game.
13:32
But for us, using this AI analog, this takes just seconds.
13:37
And this is the reality
13:39
of what our drug designers back in London are doing right now.
13:45
So we have this beautiful game that's being played by our drug designers,
13:48
who are using this AI analog of biomolecular systems
13:52
to rationally design potential new drug molecules.
13:56
But you can imagine
13:58
that we don't have to just limit this game
14:00
to human drug designers.
14:02
Earlier in my career,
14:04
I worked on training agents
14:06
to beat the top human professionals at the game of StarCraft.
14:10
And we created game-playing agents for the games of Go and Capture the Flag.
14:15
So why can't we create agents that instead play the game
14:19
that our human drug designers are playing?
14:22
So now our AI analog becomes the game environment,
14:26
and we can train agents against that.
14:29
And we already have some incredibly powerful agents
14:32
that are already doing this today.
14:36
Now in this setup,
14:37
all of the drug design is happening on a computer.
14:42
So what happens if we have access to many, many computers?
14:47
Well instead of having one human drug designer
14:49
working on some new molecule designs,
14:51
instead, we can have thousands of agents doing molecule design in parallel.
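The pattern of many agents playing the design game against a fast scoring model can be sketched as follows. Everything here is a toy stand-in: `toy_score` plays the role of the AI analog, a "molecule" is just a string over a made-up alphabet, and each agent is a simple hill climber rather than a trained agent. None of these names or choices come from the talk.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

ALPHABET = "CNOS"            # toy building blocks, not real chemistry
HIDDEN_OPTIMUM = "CCNOCSNC"  # stand-in for whatever the scoring model rewards


def toy_score(molecule: str) -> int:
    """Hypothetical analog: counts positions that match the hidden optimum."""
    return sum(a == b for a, b in zip(molecule, HIDDEN_OPTIMUM))


def design_agent(seed: int, steps: int = 200) -> Tuple[str, int]:
    """One agent: propose single-position edits, keep any that do not score worse."""
    rng = random.Random(seed)
    molecule = "".join(rng.choice(ALPHABET) for _ in HIDDEN_OPTIMUM)
    best = toy_score(molecule)
    for _ in range(steps):
        pos = rng.randrange(len(molecule))
        candidate = molecule[:pos] + rng.choice(ALPHABET) + molecule[pos + 1:]
        score = toy_score(candidate)
        if score >= best:
            molecule, best = candidate, score
    return molecule, best


def run_agents(n_agents: int) -> List[Tuple[str, int]]:
    """Run independent agents concurrently, one seed per agent."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(design_agent, range(n_agents)))


results = run_agents(16)
best_molecule, best_score = max(results, key=lambda r: r[1])
```

Each agent only needs the scoring function, so the agents are independent and the work parallelises trivially. Threads are used here only to keep the sketch short; a real setup would presumably spread agents across many machines and query a far more expensive model.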
14:59
Just imagine what impact that could have
15:02
on patients suffering from a rare type of cancer,
15:07
the speed that we could get to a potential new molecule
15:10
to address this medical need
15:13
or the ability to go after many diseases in parallel.
15:18
Cancer is often caused by mutations of proteins,
15:22
and even within the same type of cancer,
15:25
each patient can have different mutations.
15:30
And that means that one drug molecule won't work for all patients.
15:35
But what if we could go in
15:36
and measure each individual patient's protein mutations,
15:40
and then have a whole team of molecule-design agents
15:43
working on that individual's protein mutations?
15:47
Then we could create a molecule tailored for each individual patient.
15:53
I'm showing just this.
15:55
Here the protein is randomly mutating,
15:58
and each mutation in red
16:00
subtly changes the 3D shape of this protein.
16:04
And we're able to generate molecules that should stick to this protein
16:08
in response to these changes.
16:12
Now this is still far away from patients,
16:14
and there's a huge amount of complexity in drug design left to tackle,
16:18
but this really does give us a glimpse at the future that is to come.
16:24
So we've seen how this new AI paradigm is driving our progression in drug design.
16:29
And you can also see this paradigm being played out in material science,
16:33
in creating new forms of energy
16:35
and in chemistry.
16:37
The ability to take our real messy world
16:40
and then create our own AI analogs
16:43
to then on a computer do open-ended scientific discovery
16:48
to create new knowledge that we can take back out
16:50
and change the world around us.
16:53
This is an incredibly powerful paradigm,
16:55
and one that will bring about a whole new wave of scientific
16:58
and technological advancements.
17:01
And we’re going to need as many people as possible,
17:04
especially those working in machine learning,
17:07
AI and technology,
17:08
to help drive this new wave of progression.
17:11
Thank you.
17:13
(Applause)