Thanks for this in-depth write-up of what is clearly a very important factor in prioritising our work aimed at the AI transition. Your piece has built the argument for such prioritisation clearly enough that it has allowed me to put some previously inchoate responses into crisper form:
If we could tell with certainty which topics would receive >100x as much work as we could put in prior to when that work is needed, then I think your argument goes through. But I have a lot of uncertainty about that, and such uncertainty substantially weakens the prioritisation effect.
To see the effect easily, suppose for simplicity that for some piece of apparently late-stage strategy there is a 50% chance that >100x as much work gets done on it, obviating the need for us to work on it now, and a 50% chance that there is no appreciable extra work done (e.g. because the intelligence explosion is happening in a particular lab that doesn't do this work, or because the work requires aspects of cognition that are improving more slowly, or because it turns out it was needed earlier in the explosion than expected).
In this case, the expected value of marginal work on that late-stage strategy is roughly halved compared to a world without this later AI-driven work (a 50% chance of the naive estimate + a 50% chance of <1% of that estimate). Given the fairly extreme distribution in the value of a particular person working on different topics, it isn't that rare for the best thing to work on in one category to be >2x as good as the best thing in another category, such that you shouldn't switch categories even after downgrading the EV.
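For concreteness, here is that calculation as a minimal sketch in code (the 50% obviation chance and the <1% residual value are just the illustrative assumptions above, not considered estimates):

```python
def ev_multiplier(p_obviated, residual=0.01):
    """Factor by which the naive expected value of marginal work is
    scaled, given a chance that later AI work obviates it and a small
    residual fraction of value if it does."""
    return (1 - p_obviated) * 1.0 + p_obviated * residual

print(ev_multiplier(0.5))  # ~0.5: the EV is roughly halved
print(ev_multiplier(0.9))  # ~0.1: roughly a 10x reduction
```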
That would make early-stage vs late-stage an important factor in choosing what to work on, but not any kind of filter. As the chance of large amounts of AI work on a topic increases, the factor gets stronger: e.g. it reaches 10x at a 90% chance, which is quite strong (though I think it is hard to reach or exceed 90% here).
So I think this can have a substantial effect on the choice of what to work on at the margin, but isn't a filter.
What about its effect on the portfolio of research work aimed at the AI transition?
Suppose that there are logarithmic returns to the research work (which means that the marginal value of extra work is inversely proportional to aggregate work so far, which is a common neglectedness assumption). In that case, we should do 50% as much total work on equally-important things that we estimate to have a 50% chance of being obviated later, and 10% as much total work on those we estimate to have a 90% chance of being obviated later.
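To spell out the allocation maths, here is a minimal sketch, assuming each topic's value is w·ln(x) for x units of work; the three equally-important topics and their obviation probabilities are just illustrative:

```python
def optimal_shares(weights, p_obviated):
    """Under logarithmic returns w * ln(x), marginal value is w / x,
    so equalising marginal value across topics makes the optimal
    amount of work x proportional to w. Discounting a topic's weight
    by its obviation probability scales its share by (1 - p)."""
    effective = [w * (1 - p) for w, p in zip(weights, p_obviated)]
    total = sum(effective)
    return [e / total for e in effective]

# Three equally-important topics: no obviation risk, 50%, and 90%.
print(optimal_shares([1, 1, 1], [0.0, 0.5, 0.9]))
# -> [0.625, 0.3125, 0.0625]: the 50% topic gets half as much work
#    as the safe topic, and the 90% topic a tenth as much.
```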
So late-stage topics still get quite a large share of our total work, even when we don't think they are intrinsically more important. In the piece you suggested that we do at least some work on these topics, to avoid the possibility of being caught completely flat-footed if the anticipated AI work on them doesn't happen, and I think the maths above suggests a larger amount of work than that (especially on topics that appear more important or more tractable).
(Note that my simplifying assumption of no appreciable AI help vs an overwhelming amount might be doing some work here. I'm not sure what the best way to relax it is.)
I agree that these are vague and could come apart from each other. But I don't see any crisp, verifiable definitions that I could replace them with that would serve the same purpose. I'm interested in forecasting transformative AI mainly in order to forecast when one has to have had one's AI-related impact. e.g. by when do we need to have solved alignment (or to have paused AI development)?
If I instead used a verifiable definition here, such as the “In what year would AI systems be able to replace 99% of current fully remote jobs?” that I cite in the essay, then one has to do further forecasting of how that time relates to the key things that matter (such as the deadline for AI alignment). Also, with crisp concrete definitions one then tends to get hung up on estimating exactly how hard the final 1% of current fully remote jobs are, because that is central to the prediction. For example, might 1% of current fully remote jobs be ones we only let a human do, e.g. for reasons of legal responsibility or personal relationships? Maybe? But that isn't relevant to the central features we care about.
I'm sure my definition could be improved (the focus of my essay isn't on my prediction but on the wider points about everyone's timelines), but I hope this explains why being "measurable and uncontroversial" need not make for the best thing to forecast.
Thanks Angelina!
Annoying because: the core premise is so obvious, and yet I found spelling out the implications surprisingly clarifying!
This is actually the story of almost everything I've ever done:
It's a byproduct of trying to find the most important, unarguably true, yet neglected things. I then work very hard on finding the deepest explanations, until I hit on ways of presenting each claim that make it effortless to see.
I like to think of it as working at the border of the trivial and the profound.
Thanks Nick!
How to communicate this is a good question, and I don't yet know the best answer. I think admitting uncertainty is generally good: it is both honest and actually appreciated by many audiences. But there is still the question of how to do it. The scientific norm is usually to stay silent until the evidence for something (or some piece of the puzzle) is strong enough (e.g. reaching p < 0.05). I don't think that is the right norm here. We are in a very high-stakes situation and policy-makers need the partial evidence that we do have. But communicating it is hard.
I think your expression is pretty good, and could be made a little better. e.g. "Leading AI forecasters can't rule out it happening before 2030 and think it will probably happen before 2040."
Re AI 2027, there is a good explanation of how their views have changed here.
And I'll add that RL training (and to a lesser degree inference scaling) is limited to a subset of capabilities (those with verifiable rewards and that the AI industry cares enough about to run lots of training on). So progress on benchmarks has become less representative of how good models are at things that aren't being benchmarked than it was in the non-reasoning-model era, and I think the problems of the new era are somewhat bigger than the effects that show up in benchmarks.
That's a great question. I'd expect a bit of a slowdown this year, though not necessarily much. e.g. I think there is a 10x or so available for RL before RL-training compute reaches the size of pre-training compute, and then we know they have enough to 10x again beyond that (since GPT-4.5 was already 10x more), so there are some gains still in the pipe there. And I wouldn't be surprised if METR time horizons keep going up, in part due to increased inference spend (i.e. my points about inference scaling not being that good are to do with costs exploding, so a benchmark that is insensitive to cost might not register much of a slowdown). There is also room for more AI-research or engineering improvements to these things, and a lump of new compute coming in, making it a bit messy.
Overall, I'd say my predictions are more about appreciable slowing in 2027+ rather than 2026.
Interesting ideas! A few quick responses:
Yeah, it isn't just a constant-factor slow-down, and it is fairly hard to describe in detail. Pre-training, RL, and inference all have their own dynamics, and we don't know if there will be good new scaling ideas that breathe new life into them or create a new thing on which to scale. I'm not trying to say the speed at any future point is half what it would have been, but that where you might have seen scaling as a big deal, going forward it is a substantially smaller deal (maybe half as big a deal).
Thanks Owen. I also agree with your maths.
Re conditioning, I agree that this is the technically correct thing to do, and that it isn't clear what difference it makes to the simpler analysis. In some cases it is fairly easy to condition (e.g. when working on a late-stage topic, one can do the project imagining that lots of advanced AI advice won't arrive in time), while at the prioritisation stage it feels a bit harder to do. Oh, and I very much agree that it could be important to act to change whether such AI analysis happens (something that is, if anything, a bit easier to see on a view that treats whether this happens as uncertain).
Re maximal reasonable probabilities, I still genuinely feel it is hard to get >90% credence that very large amounts of AI analysis on a key issue will happen before the issue comes to a head. I think one could get there for some things, but not that many. This is due to there being a variety of defeaters for such high amounts of AI analysis: external people like us not having access to the tools, needing the analysis earlier than expected (e.g. due to the need to socialise the ideas), or jaggedness in AI capabilities (e.g. where engineering abilities take off substantially before more conceptual, philosophical abilities). I think you are onto something re the difference between what you are imagining as the default and what I am.