Toby_Ord
Thanks Owen. I also agree with your maths.

Re conditioning, I agree that this is the technically correct thing to do, and that it isn't clear what difference it makes to the simpler analysis. In some cases it is fairly easy to condition (e.g. if working on a late-stage topic, one can do the project imagining that lots of advanced AI advice doesn't arrive in time), while at the prioritisation stage it feels a bit harder to do. Oh, and I very much agree that it could be important to act to change whether such AI analysis happens (something that is, if anything, a bit easier to see on a view that treats whether this happens as uncertain).

Re maximal reasonable probabilities, I still genuinely feel like it is hard to get >90% credence that very large amounts of AI analysis on a key issue will happen prior to the issue coming to a head. I think one could get there for some things, but not that many. This is due to there being a variety of defeaters for such high amounts of AI analysis, such as external people like us not having access to the tools, needing the analysis earlier than expected (e.g. due to the need to socialise the ideas), or jaggedness in AI capabilities (e.g. where engineering abilities take off substantially before the more conceptual, philosophical abilities). I think you are onto something re what you are imagining as default vs what I am.

Thanks for this in-depth writeup of what is clearly a very important factor in prioritising our work aimed at the AI transition. Your piece has built the argument for such prioritisation clearly enough that it has allowed me to put some previously inchoate responses into a more crisp form:

If we could tell with certainty which topics would receive >100x as much work as we could put in prior to when that work is needed, then I think your argument goes through. But I have a lot of uncertainty about that and such uncertainty weakens the prioritisation effect substantially.

To see the effect easily, suppose for simplicity that for some piece of apparently late-stage strategy there is a 50% chance that >100x as much work gets done on it, obviating the need for us to work on it now, and a 50% chance that there is no appreciable extra work done (e.g. because the intelligence explosion is happening in a particular lab that doesn't do this work, or because the work requires aspects of cognition that are improving more slowly, or because it turns out it was needed earlier in the explosion than expected).

In this case, the expected value of marginal work on that late-stage strategy is roughly halved compared to a world without this later AI-driven work (a 50% chance of the naive estimate plus a 50% chance of <1% of that estimate). Given the fairly extreme distribution in the value of a particular person working on different topics, it isn't that rare for the best thing to work on in one category to be >2x as good as the best thing in another category, such that you shouldn't switch categories even after downgrading the EV.

That would mean early-stage vs late-stage would be an important factor in choosing what to work on, but not any kind of filter. As the chance of large amounts of AI work on that topic increases, the factor gets stronger. e.g. it reaches 10x at a 90% chance, which is quite strong (though I think it is hard to reach or exceed a 90% chance here).
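As a toy check on the arithmetic above (my own sketch, not from the thread; the 50%/90% probabilities and the <1% residual value are the illustrative numbers used here):

```python
# Expected-value multiplier for working on a late-stage topic now, given
# probability p that >100x as much AI-driven work later obviates ours
# (leaving, say, 1% of the naive value), and probability (1 - p) that no
# appreciable extra work happens.

def ev_multiplier(p_obviated: float, residual: float = 0.01) -> float:
    """Expected value as a fraction of the naive (no-later-AI-work) estimate."""
    return (1 - p_obviated) * 1.0 + p_obviated * residual

# 50% chance of obviation: EV roughly halved.
print(ev_multiplier(0.5))  # 0.505
# 90% chance of obviation: roughly a 10x discount.
print(ev_multiplier(0.9))  # 0.109
```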

So I think this can have a substantial effect on the choice of what to work on on the margins, but isn't a filter.

What about its effect on the portfolio of research work aimed at the AI transition?

Suppose that there are logarithmic returns to the research work (which means that the marginal value of extra work is inversely proportional to aggregate work so far, which is a common neglectedness assumption). In that case, we should do 50% as much total work on equally-important things that we estimate to have a 50% chance of being obviated later, and 10% as much total work on those we estimate to have a 90% chance of being obviated later.
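A minimal sketch of why the logarithmic-returns assumption gives those shares (my own illustration): with log returns, the marginal value of work on a topic is its (discounted) importance divided by the work done so far, and equalising marginal values across topics makes the work allocated proportional to discounted importance.

```python
# Under logarithmic returns, marginal value of topic i is importance_i / work_i.
# Equalising marginal values across a fixed budget gives
# work_i proportional to importance_i, so a topic whose value is discounted
# to 50% (or 10%) of an otherwise-equal topic gets 50% (or 10%) as much work.

def optimal_shares(discounted_importances):
    """Split a fixed work budget so marginal values are equal across topics."""
    total = sum(discounted_importances)
    return [v / total for v in discounted_importances]

# Two equally important topics; the second has a 50% chance of being
# obviated by later AI work (EV multiplier ~0.5):
print(optimal_shares([1.0, 0.5]))  # second topic gets half as much work
# With a 90% chance of obviation (multiplier ~0.1):
print(optimal_shares([1.0, 0.1]))  # second topic gets 10% as much work
```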

So that still puts quite a large share of our total work into late-stage things, even when we don't think they are intrinsically more important. In the piece you suggested that we do at least some work on these topics, to avoid being caught completely flat-footed if the anticipated AI work on them doesn't happen, and I think the maths above suggests a larger amount of work than that (especially on topics that appear more important or more tractable).

(Note that my simplifying assumption of no appreciable AI help vs an overwhelming amount might be doing some work here. I'm not sure what the best way to relax it is.)

I agree that these are vague and could come apart from each other. But I don't see any crisp, verifiable definitions that I could replace them with and serve the same purpose. I'm interested in forecasting transformative AI for the main purpose of forecasting when one has to have one's AI-related impact by. e.g. by when do we need to have solved alignment (or to have paused AI development)? 

If I instead used a verifiable definition here, such as the question "In what year would AI systems be able to replace 99% of current fully remote jobs?" that I cite in the essay, then you have to do further forecasting of how that time relates to the key things that matter (such as the deadline for AI alignment). Also, for crisp, concrete definitions, one tends to get hung up on estimating exactly how hard the final 1% of current fully remote jobs is, because that is central to the prediction. For example, is there 1% of current fully remote jobs that we would only let a human do, e.g. for reasons of legal responsibility or personal relationships? Maybe. But that isn't relevant to the central features we care about.

I'm sure my definition could be improved (the focus of my essay isn't on my prediction but on the wider points about everyone's timelines), but I hope this explains why being "measurable and uncontroversial" need not make for the best thing to forecast.

Thanks Angelina!

Annoying because: the core premise is so obvious, and yet I found spelling out the implications surprisingly clarifying!

This is actually the story of almost everything I've ever done:

  • You should give to charities that do more good with your money (and find out which ones these are)
  • Because you can save multiple lives with your donations, they are deeply morally serious and a key part of living a morally good life
  • It would be really bad if humanity's potential were destroyed, so we need to prioritise making sure that never happens

It's a byproduct of trying to find the most important, unarguably true, yet neglected things. I then work very hard on the deepest explanations until I find ways of presenting each claim that make it effortless to see.

I like to think of it as working at the border of the trivial and the profound.

Thanks Nick!

How to communicate this is a good question, and I don't yet know the best answer. I think admitting uncertainty is generally good — it is both honest and actually appreciated by many audiences. But there is still the question of how to do it. The scientific norm is usually to stay silent until the evidence for something (or some piece of the puzzle) is strong enough (e.g. reaching p = 0.05). I don't think that is the right norm here. We are in a very high-stakes situation and policy-makers need the partial evidence that we do have. But communicating it is hard.

I think your expression is pretty good, and could be made a little better. e.g. "Leading AI forecasters can't rule out it happening before 2030 and think it will probably happen before 2040."

Re AI 2027, there is a good explanation of how their views have changed here.

And I'll add that RL training (and to a lesser degree inference scaling) is limited to a subset of capabilities (those with verifiable rewards that the AI industry cares enough about to run lots of training on). So progress on benchmarks has become less representative of how good models are at things that aren't being benchmarked than it was in the non-reasoning-model era, and I think the problems of the new era are somewhat bigger than the effects that show up in benchmarks.

That's a great question. I'd expect a bit of a slowdown this year, though not necessarily much. e.g. I think there is roughly a 10x of scaling still possible for RL before RL-training compute reaches the size of pre-training compute, and we know they have enough compute to 10x again beyond that (since GPT-4.5 was already 10x more), so there are some gains still in the pipe there. And I wouldn't be surprised if METR time horizons keep going up, in part due to increased inference spend (i.e. my points about inference scaling not being that good are to do with costs exploding, so a cost-insensitive benchmark might not register the problem all that much). There is also room for more AI-research or engineering improvements to these things, and a lump of new compute coming in, making it a bit messy.

Overall, I'd say my predictions are more about appreciable slowing in 2027+ rather than 2026.

Good point about the METR curves not being Pareto frontiers.

Interesting ideas! A few quick responses:

  1. The data for the early 'linear' regime for these models actually appears to be even better than you suggest here. They have a roughly straight line (on a log-log plot), but at a slope that is better than 1. Eyeballing it, I think some are slope 5 or higher (i.e. increasing returns, with time horizon growing as the 5th power of compute). See my 3rd chart here. If anything, this would strengthen your case for talking about that regime separately from the poorly scaling high compute regime later on.
  2. I'd also suspected that when you apply extra RL to a model (e.g. o3 compared to o1) it would have a curve that dominated the earlier model's. But that doesn't seem to be the case. See the curves in the final chart here, where o1-preview is dominated, but the other OpenAI models' curves all cross each other (being cheaper for the same horizon at some horizons and more expensive at others).
  3. Even when they do dominate each other neatly like in your fake data, I noticed that the 'sweet spots' and the 'saturation points' can still be getting more expensive, both in terms of $ and in terms of $/hr. I'm not sure what to make of that though!
  4. I think you're on to something with the idea that there is a problematic kind of inference scaling and a fine kind, though I'm not sure if you've quite put your finger on how to distinguish them. I suppose we can definitely talk about the super-linear scaling regime and the sub-linear regime (which meet at what I call the sweet spot), but I'm not sure these are the two types you refer to in qualitative terms near the top.
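On point 1, a quick way to check a slope estimate like that on data (my own illustration with synthetic points; the slope-5 figure itself is from eyeballing the chart):

```python
import numpy as np

# Synthetic points in the early 'linear' regime of a log-log plot:
# time horizon growing as the 5th power of compute (slope 5 on log-log).
compute = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
horizon = compute ** 5  # increasing returns: horizon ~ compute^5

# Fit a straight line in log-log space; the fitted slope recovers the exponent.
slope, intercept = np.polyfit(np.log(compute), np.log(horizon), 1)
print(round(slope, 2))  # 5.0
```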

Yeah, it isn't just like a constant factor slow-down, but is fairly hard to describe in detail. Pre-training, RL, and inference all have their own dynamics, and we don't know if there will be new good scaling ideas that breathe new life into them or create a new thing on which to scale. I'm not trying to say the speed at any future point is half what it would have been, but that you might have seen scaling as a big deal, and going forward it is a substantially smaller deal (maybe half as big a deal).
