That's a great question. I'd expect a bit of slowdown this year, though not necessarily much. e.g. I think there is a 10x or so possible for RL before RL-training-compute reaches the size of pre-training compute, and then we know they have enough to 10x again beyond that (since GPT-4.5 was already 10x more), so there are some gains still in the pipe there. And I wouldn't be surprised if METR timelines keep going up in part due to increased inference spend (i.e. my points about inference scaling not being that good are to do with costs exploding, so if a co...
Interesting ideas! A few quick responses:
Yeah, it isn't just a constant-factor slow-down, and it is fairly hard to describe in detail. Pre-training, RL, and inference all have their own dynamics, and we don't know whether there will be good new scaling ideas that breathe new life into them or create a new thing on which to scale. I'm not trying to say the speed at any future point will be half what it would have been, but rather that if you saw scaling as a big deal up to now, going forward it is a substantially smaller deal (maybe half as big a deal).
That's an interesting way to connect these. I suppose one way to view your model is as making clear that you can't cost-effectively use models on tasks that are much longer than their 50% horizons — even if you are willing to try multiple times — and that the trend of dramatic price improvements over time isn't enough to help with this. Instead you need the continuation of the METR trend of exponentially growing horizons. Moreover, you give a nice intuitive explanation of why that is.
One thing to watch out for is Gus Hamilton's recent study suggesti...
And one of the authors of the METR timelines paper has his own helpful critique/clarifications of their results.
Good points. I'm basically taking METR's results at face value, showing that people often implicitly treat costs (or cost per 'hour') as constant (especially when extrapolating them), and showing that these costs in fact appear to be growing substantially.
Re the quality / generalisability of the METR timelines, there is quite a powerful critique of it by Nathan Witkin. I wouldn't go as far as he does, but he's got some solid points.
Thanks Basil! That's an interesting idea. The constant hazard rate model is just comparing two uses of the same model over different task lengths, so if you use it to work out the 99% time horizon, the corresponding task should cost 1/70th as much ($1.43). Over time, I think these 99%-horizon tasks should rise in cost in roughly the same way as the 50%-horizon ones (as both are increasing in length in proportion). But estimating how that will change in practice is especially dicey as there is too little data.
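To make that arithmetic explicit, here is the constant-hazard calculation behind the 1/70 figure (a sketch, assuming cost scales with task length and taking the $1.43 to correspond to a $100 cost for the 50%-horizon task):

```latex
S(t) = e^{-\lambda t}, \qquad
e^{-\lambda T_{50}} = 0.5 \;\Rightarrow\; \lambda = \tfrac{\ln 2}{T_{50}}, \qquad
e^{-\lambda T_{99}} = 0.99 \;\Rightarrow\;
T_{99} = \frac{\ln(1/0.99)}{\ln 2}\,T_{50} \approx 0.0145\,T_{50} \approx \tfrac{1}{69}\,T_{50}.
```

So a 99%-horizon task is roughly 1/70th as long as a 50%-horizon task, and hence roughly 1/70th the cost under that assumption.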
Also, note that Gus Hamilton has written a great essay that takes the s...
Some great new analysis by Gus Hamilton shows that AI agents probably don't obey a constant hazard rate / half-life after all. Instead their hazard rates systematically decline as the task goes on.
This means that their success rates on tasks beyond their 50%-horizon are better than the simple model suggests, but those for tasks shorter than the 50% horizon are worse.
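To illustrate the contrast (a minimal sketch, not Gus's actual analysis; I'm just using a Weibull-style survival curve with an assumed shape parameter as a stand-in for a declining hazard, calibrated to the same 50% horizon as the constant-hazard model):

```python
import numpy as np

T50 = 1.0  # 50% time horizon; 1.0 is an arbitrary normalisation of task length

def success_constant_hazard(t):
    """Constant hazard rate: exponential survival, calibrated so S(T50) = 0.5."""
    lam = np.log(2) / T50
    return np.exp(-lam * t)

def success_declining_hazard(t, shape=0.5):
    """Weibull-style survival with shape < 1, i.e. the hazard falls as the task goes on.
    Also calibrated so S(T50) = 0.5. The shape value is illustrative, not fitted."""
    scale = T50 / np.log(2) ** (1 / shape)
    return np.exp(-(t / scale) ** shape)

for t in [0.25, 0.5, 1.0, 2.0, 4.0]:
    print(f"t = {t:4.2f} x T50:  constant {success_constant_hazard(t):.2f}, "
          f"declining {success_declining_hazard(t):.2f}")
# Beyond the 50% horizon the declining-hazard curve sits above the constant-hazard one;
# below it, the ordering reverses, matching the pattern described above.
```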
I had suggested a constant hazard rate was a good starting assumption for how their success rate at tasks decays with longer durations. It is the simplest model and fits the data OK. But Gus us...
That's a good summary and pretty much in line with my own thoughts on the overall upshots. I'd say that, absent new scaling approaches, the strong tailwind to AI progress from compute increases will soon weaken substantially. But it wouldn't completely disappear: there may be new scaling approaches, and there remains progress via AI research. Overall, I'd say it lengthens timelines somewhat, makes raw compute/finances less of an overwhelming advantage, and may require different approaches to compute governance.
A few points to clarify my overarching view:
That is quite a surprising graph — the annual tripling and the correlation between compute and revenue are much cleaner than I think anyone would have expected. Indeed, they are so clean that I'm a bit skeptical of what is going on.
One thing to note is that it isn't clear what the compute graph is of (e.g. is it inference + training compute, but not R&D?). Another thing to note is that it is year-end figures vs year total figures on the right, but given they are exponentials with the same doubling time and different units, that is...
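For what it's worth, if both series really are clean exponentials, that year-end vs year-total mismatch only amounts to a constant factor (a sketch):

```latex
C(t) = C_0 e^{kt} \;\Rightarrow\; \int_{t-1}^{t} C(s)\,ds = C(t)\,\frac{1 - e^{-k}}{k},
```

i.e. the year-total is a fixed multiple of the year-end value, so plotting one against the other just shifts a curve by a constant vertical offset on a log scale and leaves the growth rate and correlation unchanged.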
Thanks for the comments. The idea that pretraining has slowed/stalled is in the background in many posts in my series and it is unfortunate I didn't write one where I addressed it head-on. I don't disagree with Vladimir Nesov as much as you may think. Some of this is that the terms are slippery.
I think there are three things under discussion:
Thanks Paolo,
I was only able to get weak evidence of a noisy trend from the limited METR data, so it is hard to draw many conclusions from that. Moreover, METR's desire to measure the exponentially growing length of useful work tasks is potentially more exposed to an exponential rise in compute costs than more safety-related tasks. But overall, I'd think that the amount of useful compute you can apply to safety evaluations is probably growing faster year-on-year than one can sustainably grow the number of staff.
I'm not sure how the dynamics wil...
Yes, that is a big limitation. Even more limiting is that it is only based on a subset of METR's data on this. That's enough to raise the question and illustrate what an answer might look like in data like this, but not to really answer it.
I'm not aware of others exploring this question, but I haven't done much looking.
Hi Simon, I want to push back on your claims about markets a bit.
Markets are great, especially when there are minimal market failures. I love them. They are responsible for a lot of good things. But the first and second fundamental theorems of welfare economics don't conclude that markets maximise social welfare. That is a widely held misconception.
The first concludes that they reach a point on the Pareto frontier, but such a point could be really quite bad. e.g. a great outcome for one person but misery for 8 billion can be Pareto efficient. I'm not sure that extreme...
Whether it is true or not depends on the community, and the point I'm making is primarily aimed at EAs (and EA-adjacent people). It might also be true for the AI safety and governance communities. I don't think it is true in general though — i.e. most citizens and most politicians are not giving too little regard to long timelines. So I'm not sure the point can be made without this reference.
Also, I'm particularly focusing on the set of people who are trying to act rationally and altruistically in response to these dangers, and are doing so in a somewhat coordinated manner. e.g. a key aspect is that the portfolio is currently skewed towards the near-term.
The point I'm trying to make is that we should have a probability distribution over timelines with a chance of short, medium or long — then we need to act given this uncertainty, with a portfolio of work based around the different lengths. So even if our median is correct, I think we're failing to do enough work aimed at the 50% of cases that are longer than the median.
"EAs aren't giving enough weight to longer AI timelines"
(The timelines until transformative AI are very uncertain. We should, of course, hedge against it coming early when we are least prepared, but currently that is less of a hedge and more of a full-on bet. I think we are unduly neglecting many opportunities that would pay off only on longer timelines.)
I ran a timelines exercise in 2017 with many well-known FHI staff (though not including Nick) where the point was to elicit one's current beliefs about AGI by plotting CDFs. Looking at them now, I can tell you our median dates were: 2024, 2032, 2034, 2034, 2034, 2035, 2054, and 2079. So the median of our medians was (robustly) 2034 (i.e. 17 more years' time). I was one of the people who had that date, though people didn't see each others' CDFs during the exercise.
I think these have held up well.
So I don't think Eliezer's "Oxford EAs" point is correct.
At any rate, merely uncertain catastrophic risks do not have rerun risk, while chancy ones do.
This is a key point. For many existential risks, the risk is mainly epistemic (i.e. we should assign some probability p to it happening in the next time period), rather than it being objectively chancy. For one-shot decision-making sometimes this distinction doesn't matter, but here it does.
Complicating matters, what is really going on is not just that the probability is one of two types, but that we have a credence distribution over the different levels of ...
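A toy illustration of that rerun point, with made-up numbers: compare a risk that is objectively chancy at 10% per period with one where we merely have a 50/50 credence split between a 20% per-period hazard and no hazard at all. These assign the same 10% to catastrophe in a single period, but behave very differently when rerun.

```python
# Toy illustration of 'rerun risk': same one-period probability (10%),
# very different behaviour over repeated periods. Numbers are made up.

def survival_chancy(p_per_period, n_periods):
    """Objectively chancy risk: an independent chance p of catastrophe each period."""
    return (1 - p_per_period) ** n_periods

def survival_uncertain(credences, n_periods):
    """Merely uncertain risk: a credence distribution over possible fixed hazard levels.
    credences is a list of (credence, per-period hazard) pairs."""
    return sum(c * (1 - h) ** n_periods for c, h in credences)

n = 10  # number of periods ('reruns')
print(survival_chancy(0.10, n))                         # ~0.35
print(survival_uncertain([(0.5, 0.0), (0.5, 0.2)], n))  # ~0.55
# Both assign a 10% probability to catastrophe in the first period, but under epistemic
# uncertainty much of the credence is on worlds with no risk at all, so repeated 'reruns'
# are far less dangerous in expectation (and surviving lets you update towards safety).
```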
The value of saving philanthropic resources to deploy post-superintelligence is greater than it otherwise would be.
One way to think of this is that if there is a 10% existential risk from the superintelligence transition and we will attempt that transition, then the world is currently worth 0.90 V, where V is the expected value of the world after achieving that transition. So the future world is more valuable (in the appropriate long-term sense) and saving it is correspondingly more important. With these numbers the effect isn't huge, but would be importan...
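Spelling out the arithmetic with those illustrative numbers:

```latex
\frac{V}{(1 - r)\,V} = \frac{1}{0.90} \approx 1.11 \qquad (r = 0.10),
```

so, on one reading, resources whose impact scales with the value of the world are worth about 11% more if deployed after a successful transition than before it, which matches the point that the effect isn't huge at these numbers.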
It's very difficult to do this with benchmarks, because as the models improve, benchmarks come and go. Things that used to be so hard that models couldn't do better than chance quickly become saturated, and we look for the next thing, then the one after that, and so on. For me, the fact that GPT-4 -> GPT-4.5 seemed to involve climbing about half of one benchmark was slower progress than I expected (and the leaks from OpenAI suggest they had similar views to me). When GPT-3.5 was replaced by GPT-4, people were losing their minds about it — both internally and o...
I was going to say something about a lack of incentives, but I think it is also a lack of credible signals that the work is important, is deeply desired by others working in these fields, and would be used to inform deployments of AI. In my view, there isn't much desire for work like this from people in the field, and they probably wouldn't use it to inform deployment unless the author also puts in a lot of effort to meet the right people, convince them to spend the time to take it seriously, etc.
I don't know what to make of that. Obviously Vladimir knows a lot about state-of-the-art compute, but there are so many details there that aren't drawn together into a coherent point which really disagrees with you or me on this.
It does sound like he is making the argument that GPT-4.5 was actually fine and on trend. I don't really believe this, and I don't think OpenAI believed it either (there are various leaks suggesting they were disappointed with it, they barely announced it, and then they shelved it almost immediately).
I don't think the argument...
Re 99% of academic philosophers, they are doing their own thing and have not heard of these possibilities and wouldn't be likely to move away from their existing areas if they had. Getting someone to change their life's work is not easy and usually requires hours of engagement to have a chance. It is especially hard to change what people work on in a field when you are outside that field.
A different question is about the much smaller number of philosophers who engage with EA and/or AI safety (there are maybe 50 of these). Some of these are working on some ...
I appreciate you raising this Wei (and Yarrow's responses too). They both echoed a lot of my internal debate on this. I'm definitely not sure whether this is the best use of my time. At the moment, my research time is roughly evenly split between this thread of essays on AI scaling and more philosophical work connected to longtermism, existential risk and post-AGI governance. The former is much easier to demonstrate forward progress and there is more of a demand signal for it. The latter is harder to be sure it is on the right path and is in less demand. M...
Thanks. I'm also a bit surprised by the lack of reaction to this series given that:
my hope with this essay is simply to make a case that all might benefit from a widening of Longtermism’s methods and a greater boldness in proclaiming that it is a part of the greatness of being human to be heroically, even slightly irrationally, generous in our relationship with others, including future generations, out of our love for humanity itself.
This is a very interesting approach, and I don't think it is in conflict with the approach in the volume. I hope you develop it further.
Thanks so much for writing this Will, I especially like the ideas:
Interesting! So this is a kind of representation theorem (a bit like the VNM Theorem), but instead of saying that Archimedean preferences over gambles can be represented as a standard sum, it says that any aggregation method (even a non-Archimedean one) can be represented by a sum of a hyperreal utility function applied to each of its parts.
The short answer is that if you take the partial sums of the first n terms, you get the sequence 1, 1, 2, 3, 4, … which settles down to have its nth element being n − 1 and thus is a representative sequence for the number ω − 1. I think you'll be able to follow the maths in the paper quite well, especially if you try a few examples of things you'd like to sum or integrate for yourself on paper.
(There is some tricky stuff to do with ultrafilters, but that mainly comes up as a way of settling the matter for which of two sequences represents t...
One possibility is this: I don't value prospects by their classical expected utilities, but by the version calculated with the hyperreal sum or integral (which agrees to within an infinitesimal when the answer is finite, but can disagree when it is divergent). So I don't actually want the classical expected utility to be the measure of a prospect. It is possible that the continuity axiom is what gets you to the classical version, and that my modified (or dropped) version of it can allow caring about the hyperreal version of expectation.
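As a standard illustration of the divergent case (assuming the expectation is computed as the hyperreal sum of the probability-weighted utilities), consider a St. Petersburg-style prospect paying utility 2^n with probability 2^(-n):

```latex
\mathbb{E}[u] \;=\; \sum_{n=1}^{\infty} 2^{-n}\cdot 2^{n} \;=\; 1 + 1 + 1 + \cdots,
```

whose partial sums are (1, 2, 3, …), so the hyperreal-valued expectation is ω rather than merely 'divergent', while for prospects with finite classical expectations the two valuations agree to within an infinitesimal.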
Interesting question.
I think there is a version of VNM utility that survives and captures the core of what we wanted: i.e. a way of representing consistent orderings of prospects via cardinal values of individual outcomes — it is just that these values of outcomes can be hyperreals. I really do think the 'continuity' axiom (which is really an Archimedean axiom saying that nothing is infinitely valuable compared to something else) is obviously false in these settings, so it has to go (or be replaced by a version that allows infinitesimal probabilities...
Thanks!
On the hyperreal approach, 1 + 0 + 1 + 1 + … does actually equal ω − 1 as desired.
This is an example of the general fact that adding zeros can change the hyperreal valuation of an infinite sum, which is a property that is pretty common in variant summation methods.
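Spelling that out in terms of partial sums (on the reading of the method sketched above):

```latex
1 + 1 + 1 + \cdots \;\rightsquigarrow\; (1, 2, 3, \ldots) = \omega,
\qquad
1 + 0 + 1 + 1 + \cdots \;\rightsquigarrow\; (1, 1, 2, 3, 4, \ldots) = \omega - 1,
```

since the second sequence of partial sums agrees with (0, 1, 2, 3, …) everywhere past the first term, and a free ultrafilter ignores finitely many disagreements.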
My chapter, Shaping Humanity's Longterm Trajectory, aims to better understand how reducing existential risk compares with other ways of influencing the longterm future. Helping avert a catastrophe can have profound value due to the way that the short-run effects of our actions can have a systematic influence on the long-run future. But it isn't the only way that could happen.
For example, if we advanced human progress by a year, perhaps we should expect to see us reach each subsequent milestone a year earlier. And if things are generally becoming bett...
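One simple way to formalise that kind of effect (a sketch, not necessarily the chapter's exact setup): write the value of the future as the integral of a value trajectory v(t) up to some externally fixed end point T, and model an advancement by a as shifting the trajectory earlier:

```latex
V = \int_{0}^{T} v(t)\,dt, \qquad
V_{\text{advanced}} = \int_{0}^{T} v(t + a)\,dt
= V + \int_{T}^{T+a} v(t)\,dt - \int_{0}^{a} v(t)\,dt,
```

so advancing progress by a gains roughly the last a years of the trajectory at the cost of its first a years, which is a net gain if things are generally becoming better over time.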
This is a very interesting post. Here's how it fits into my thinking about existential risk and time and space.
We already know about several related risk effects over space and time:
Here is a nice simple model of the trade-off between redundancy and correlated risk. Assume that each time period, each planet has an independent and constant chance of destroying civilisation on its own planet and an independent and constant chance of destroying civilisation on all planets. Furthermore, assume that unless all planets fail in the same time period, they can be restored from those that survive.
e.g. assume the planetary destruction rate is 10% per century and the galaxy destruction rate is 1 in 1 million per century. Then with one plane...
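Here is a minimal sketch of that toy model in code, using the stated rates (10% per planet per century for the planet-level risk, and one in a million per planet per century for the galaxy-level risk):

```python
# Toy model: each century, each planet independently has a chance p of destroying
# civilisation on that planet alone, and a chance g of destroying it on all planets.
# Civilisation only ends for good if every planet fails in the same century
# (otherwise the survivors can restore the others).

def extinction_chance_per_century(n_planets, p=0.10, g=1e-6):
    prob_no_galaxy_event = (1 - g) ** n_planets
    prob_all_fail_locally = p ** n_planets
    # Extinction: either some planet triggers a galaxy-wide event, or no such event
    # occurs but every planet fails on its own in the same century.
    return (1 - prob_no_galaxy_event) + prob_no_galaxy_event * prob_all_fail_locally

for n in [1, 2, 3, 6, 30]:
    print(f"{n:2d} planet(s): {extinction_chance_per_century(n):.2e} per century")
# With these numbers, a second planet cuts the risk from ~10% to ~1% per century, but the
# gains flatten out: the risk is minimised at around six or seven planets (roughly 7 in a
# million per century) and then slowly rises as the correlated galaxy-level term dominates.
```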
And I'll add that RL training (and to a lesser degree inference scaling) is limited to a subset of capabilities (those with verifiable rewards that the AI industry cares enough about to run lots of training on). So progress on benchmarks is now less representative of how good models are at the things that aren't being benchmarked than it was in the non-reasoning-model era, and I think the problems of the new era are somewhat bigger than the effects that show up in benchmarks.