AI Won’t Plateau — if We Give It Time To Think | Noam Brown | TED

162,603 views

2025-02-15 ・ TED



The incredible progress in AI over the past five years can be summarized in one word: scale. Yes, there have been algorithmic advances, but the frontier models of today are still based on the same transformer architecture that was introduced in 2017, and they are trained in a very similar way to the models that were trained in 2019. The main difference is the scale of the data and compute that goes into these models.
In 2019, GPT-2 cost about 5,000 dollars to train. Every year since then, for the past five years, the models have gotten bigger, trained for longer on more data. And every year they've gotten better. But today's frontier models can cost hundreds of millions of dollars to train, and there are reasonable concerns among some that AI will soon plateau or hit a wall. After all, are we really going to train models that cost hundreds of billions of dollars? What about trillions of dollars? At some point, the scaling paradigm breaks down.
This is, in my opinion, a reasonable concern, and in fact it's one that I used to share. But today I am more confident than ever that AI will not plateau. In fact, I believe that we will see AI progress accelerate in the coming months.
To explain why, I want to tell a story from my time as a PhD student. I started my PhD in 2012, and I was lucky to be able to work on the most exciting project I could imagine: developing AIs that could learn, on their own, how to play poker. I had played a lot of poker in high school and college, so for me, this was basically my childhood dream job. Contrary to its reputation, poker is not just a game of luck; it's also a game of deep strategy. You can kind of think of it like chess with a deck of cards.
When I started my PhD, there had already been several years of research on how to make AIs that play poker. And the general feeling among the research community was that we had figured out the paradigm, and now all we needed to do was scale it. So every year we would train larger poker AIs for longer on more data. And every year they would get better, just like today's frontier language models.
By 2015, they got so good that we thought they might be able to rival the top human experts. So we challenged four of the world's top poker players to an 80,000-hand poker competition with 120,000 dollars in prize money to incentivize them to play their best. And unfortunately, our bot lost by a wide margin. In fact, it was clear even on day one that our bot was outmatched.
But during this competition, I noticed something interesting. You see, leading up to the competition, our bot had played almost a trillion hands of poker over thousands of CPUs for about three months. But when it came time to actually play against these human experts, the bot acted instantly. It took about ten milliseconds to make a decision, no matter how difficult it was. Meanwhile, the human experts had only played maybe 10 million hands of poker in their lifetimes. But when they were faced with a difficult decision, they would take the time to think. If it was an easy decision, they might only think for a couple of seconds. If it was a difficult decision, they might think for a few minutes. But they would take advantage of the time that they had to think through their decisions.
In his book "Thinking, Fast and Slow," Daniel Kahneman describes this as the difference between System 1 thinking and System 2 thinking. System 1 thinking is the faster, more intuitive kind of thinking that you might use, for example, to recognize a friendly face or laugh at a funny joke. System 2 thinking is the slower, more methodical thinking that you might use for things like planning a vacation, writing an essay, or solving a hard math problem.
After this competition, I wondered whether this System 2 thinking might be what was missing from our bot, and might explain the difference in performance between our bot and the human experts. So I ran some experiments to see just how much of a difference this System 2 thinking makes in poker. And the results that I got blew me away. It turned out that having the bot think for just 20 seconds in a hand of poker got the same boost in performance as scaling up the model by 100,000x and training it for 100,000 times longer. Let me say that again: spending 20 seconds thinking in a hand of poker got the same boost in performance as scaling up the size of the model and the training by 100,000x.
When I got this result, I literally thought it was a bug. For the first three years of my PhD, I had managed to scale up these models by 100x. I was proud of that work. I had written multiple papers on how to do that scaling. But I knew pretty quickly that all of that would be a footnote compared to just scaling up System 2 thinking.
So based on these results, we redesigned the poker AI from the ground up. Now we were focused on scaling up System 2 thinking in addition to System 1. And in 2017, we again challenged four of the world's top poker pros, this time to a 120,000-hand poker competition with 200,000 dollars in prize money. And this time we beat all of them by a huge margin.
This was a huge surprise to everybody involved. It was a huge surprise to the poker community, it was a huge surprise to the AI community, and honestly, even a huge surprise to us. I literally did not think it was possible to win by the kind of margin that we won by. In fact, I think what really highlights just how surprising this result was is that when we announced the competition, the poker community decided to do what they do best and gamble on who would win.

(Laughter)

When we announced the competition, the betting odds were about four to one against us. After we had won for the first three days, the betting odds were still about fifty-fifty. But by the eighth day of the competition, you could no longer gamble on which side would win. You could only gamble on which human would lose the least by the end.
This pattern of AI benefiting from thinking for longer is not unique to poker; in fact, we've seen it in multiple other games as well. For example, in 1997, IBM created Deep Blue, an AI that plays chess. They challenged the world champion Garry Kasparov to a match and beat him in a landmark achievement for AI. But Deep Blue didn't act instantly. Deep Blue thought for a couple of minutes before making each move. Similarly, in 2016, DeepMind created AlphaGo, an AI that plays the game of Go, which is even more complicated than chess. They too challenged a world champion, Lee Sedol, and beat him in a landmark achievement for AI. But AlphaGo also didn't act instantly. AlphaGo took the time to think for a couple of minutes before making each move.
In fact, the authors of AlphaGo later published a paper in which they measured just how much of a difference this thinking time makes for the strongest version of AlphaGo. What they found is that when AlphaGo had the time to think for a couple of minutes, it would beat any human alive by a huge margin. But when it had to act instantly, it would do much worse than top humans.
In 2021, a paper was published that tried to measure, a bit more scientifically, just how much of a difference this thinking time made. In it, the authors found that in these games, scaling up thinking time by 10x was roughly the equivalent of scaling up the model size and training by 10x. So you have this very clear, clean relationship between scaling up System 2 thinking time and scaling up System 1 training.
Now, why does this matter? Well, remember, I mentioned at the start of this talk that today's frontier models cost hundreds of millions of dollars to train, but the cost of querying them, the cost of asking a question and getting an answer, is fractions of a penny. So this result says that if you want an even better model, there are two ways you could do it. One is to keep doing what we've been doing for the past five years and scale up System 1 training: go from spending hundreds of millions of dollars on a model to billions of dollars on a model. The other is to scale up System 2 thinking, and go from spending a penny per query to 10 cents per query. At a certain point, that trade-off becomes well worth it.
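That trade-off can be put in back-of-envelope terms. Here is a minimal sketch using only the illustrative figures from the talk (a hundred-million-dollar training run, roughly a penny per query, and the rough equivalence of 10x thinking time and 10x training); the break-even point it computes is purely illustrative, not a real cost model.

```python
# Back-of-envelope comparison of the two scaling strategies,
# using the illustrative figures from the talk.

TRAIN_COST_NOW = 100e6   # dollars: a frontier training run today
QUERY_COST_NOW = 0.01    # dollars: roughly a penny per query

# Option A: 10x the System 1 training (paid once, up front).
train_cost_10x = 10 * TRAIN_COST_NOW
extra_training_cost = train_cost_10x - TRAIN_COST_NOW   # $900M extra

# Option B: 10x the System 2 thinking (paid on every query).
query_cost_10x = 10 * QUERY_COST_NOW
extra_query_cost = query_cost_10x - QUERY_COST_NOW      # 9 cents extra per query

# Number of queries at which Option A's one-time cost
# equals Option B's accumulated per-query cost.
break_even_queries = extra_training_cost / extra_query_cost
print(f"break-even at {break_even_queries:,.0f} queries")  # 10,000,000,000
```

Below that query volume, paying more per query is the cheaper route to the same quality; above it, the bigger training run wins, which is why having both dimensions of scaling available matters.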
Now, of course, all of these results are in the domain of games, and there was a reasonable question about whether they could be extended to a more complicated setting, like language. But recently, my colleagues and I at OpenAI released o1, a new series of language models that think before responding. If it's an easy question, o1 might only think for a few seconds. If it's a difficult question, it might think for a few minutes. But just like the AIs for chess, Go, and poker, o1 benefits from being able to think for longer.
This opens up a completely new dimension for scaling. We're no longer constrained to just scaling up System 1 training. Now we can scale up System 2 thinking as well. And the beautiful thing about scaling up in this direction is that it's largely untapped. Remember, I mentioned that the frontier models of today cost less than a penny to query.
Now, when I mention this to people, a frequent response that I get is that people might not be willing to wait around for a few minutes to get a response from a model, or pay a few dollars to get an answer to their question. And it's true that o1 takes longer and costs more than other models that are out there. But I would argue that for some of the most important problems that we care about, that cost is well worth it.
So let's do an experiment and see. Raise your hand if you would be willing to pay more than a dollar for a new cancer treatment. All right, basically everybody in the audience. Keep your hand up. How about 1,000 dollars? How about a million dollars? What about for more efficient solar panels? Or for a proof of the Riemann hypothesis?
The common conception of AI today is chatbots, but it doesn't have to be that way. This isn't a revolution that's 10 years away, or even two years away. It's a revolution that's happening now. My colleagues and I have already released o1-preview, and I have had people come to me and say that it has saved them days' worth of work, including researchers at top universities. And that's just the preview.
I mentioned at the start of this talk that the history of AI progress over the past five years can be summarized in one word: scale. So far, that has meant scaling up the System 1 training of these models. Now we have a new paradigm, one where we can scale up System 2 thinking as well. And we are just at the very beginning of scaling up in this direction.
Now, I know that there are some people who will still say that AI is going to plateau or hit a wall. And to them I say: want to bet?

(Laughter)

Thank you.

(Applause)