Toby_Ord

That's a very nice and clear idea — I think you're right that working on making mission-critical, but illegible, problems legible is robustly high value.

How Well Does RL Scale?

Toby_Ord 1mo 9

It's very difficult to do this with benchmarks, because as the models improve benchmarks come and go. Things that used to be so hard that models couldn't do better than chance quickly become saturated and we look for the next thing, then the one after that, and so on. For me, the fact that GPT-4 -> GPT-4.5 seemed to involve climbing about half of one benchmark was slower progress than I expected (and the leaks from OpenAI suggest they had similar views to me). When GPT-3.5 was replaced by GPT-4, people were losing their minds about it, both internally and on launch day. Entirely new benchmarks were needed to deal with what it could do. I didn't see any of that for GPT-4.5.

I agree with you that the evidence is subjective and disputable. But I don't think it is a case where the burden of proof is disproportionately on those saying it was a smaller jump than previously.

(Also, note that this doesn't have much to do with the actual scaling laws, which are a measure of how much the prediction error of the next token goes down when you 10x the training compute. I don't have reason to think that has gone off trend. But I'm saying that the real-world gains from this (or the intuitive measure of intelligence) have diminished, compared to the previous few 10x jumps. These two claims are definitely compatible. e.g. if the model only trained on Wikipedia plus an unending supply of nursery rhymes, its prediction error would continue to drop as more training happened, but its real-world capabilities wouldn't improve by continued 10x jumps in the number of nursery rhymes added in. I think the real world is like this: GPT-4-level systems are already trained on most books ever written and much of the recorded knowledge of the last 10,000 years of civilisation, and it makes sense that adding more Reddit comments wouldn't move the needle much.)

How Well Does RL Scale?

Toby_Ord 1mo 5

I was going to say something about lack of incentives, but I think it is also a lack of credible signals that the work is important, is deeply desired by others working in these fields, and would be used to inform deployments of AI. In my view, there isn't much desire for work like this from people in the field and they probably wouldn't use it to inform deployment unless the author also puts in a lot of effort to meet the right people, convince them to spend the time to take it seriously etc.

How Well Does RL Scale?

Toby_Ord 1mo 5

I don't know what to make of that. Obviously Vladimir knows a lot about state of the art compute, but there are so many details there without them being drawn together into a coherent point that really disagrees with you or me on this.

It does sound like he is making the argument that GPT-4.5 was actually fine and on trend. I don't really believe this, and don't think OpenAI believed it either (there are various leaks suggesting they were disappointed with it, they barely announced it, and then they shelved it almost immediately).

I don't think the argument about the original GPT-4 really works. It improved because of post-training, but did they also apply that post-training to GPT-4.5? If so, then the 10x compute really does add little. If not, then why not? Why is OpenAI's revealed preference to not put much effort into enhancing their most expensive ever system if not because they didn't think it was that good?

There is a similar story re reasoning models. It is true that in many ways the advanced reasoning versions of GPT-4o (e.g. o3) are superior to GPT-4.5, but why not make it a reasoning model too? If that's because it would use too much compute or be too slow for users due to latency, then these are big flaws with scaling up larger models.

How Well Does RL Scale?

Toby_Ord 1mo 5

Re 99% of academic philosophers, they are doing their own thing and have not heard of these possibilities and wouldn't be likely to move away from their existing areas if they had. Getting someone to change their life's work is not easy and usually requires hours of engagement to have a chance. It is especially hard to change what people work on in a field when you are outside that field.

A different question is about the much smaller number of philosophers who engage with EA and/or AI safety (there are maybe 50 of these). Some of these are working on some of those topics you mention. e.g. Will MacAskill and Joe Carlsmith have worked on several of these. I think some have given up philosophy to work on other things such as AI alignment. I've done occasional bits of work related to a few of these (e.g. here on dealing with infinities arising in decision theory and ethics without discounting) and also to other key philosophical questions that aren't on your list.

For such philosophers, I think it is a mixture of not having seen your list and not being convinced these are the best things that they each could be working on.

How Well Does RL Scale?

Toby_Ord 1mo 23

I appreciate you raising this Wei (and Yarrow's responses too). They both echoed a lot of my internal debate on this. I'm definitely not sure whether this is the best use of my time. At the moment, my research time is roughly evenly split between this thread of essays on AI scaling and more philosophical work connected to longtermism, existential risk and post-AGI governance. With the former, it is much easier to demonstrate forward progress and there is more of a demand signal for it. With the latter, it is harder to be sure I am on the right path, and it is in less demand. My suspicion is that it is generally more important though, and that demand/appreciation doesn't track importance very well.

It is puzzling to me too that no-one else was doing this kind of work on understanding scaling. I think I must be adding some rare ingredient, but I can't think of anything rare enough to really explain why no-one else got these results first. (People at the labs probably worked out a large fraction of this, but I still don't understand why the people not at the labs didn't.)

In addition to the general questions about which strand is more important, there are a few more considerations:

No-one can tell ex ante how a piece of work or research stream will pan out, so everyone will always be wrong ex post sometimes in their prioritisation decisions
My day job is at Oxford University's AI Governance Initiative (a great place!) and I need to be producing some legible research that an appreciable number of other people are finding useful
I'm vastly more effective at work when I have an angle of attack and a drive to write up the results — recently this has been for these bite-size pieces of understanding AI scaling. The fact that there is a lot of response from others is helping with this as each piece receives some pushback that leads me to the next piece.

But I've often found your (Wei Dai's) comments over the last 15-or-so years to be interesting, unusual, and insightful. So I'll definitely take into account your expressed demand for more philosophical work and will look through those pages of philosophical questions you linked to.

How Well Does RL Scale?

Toby_Ord 1mo 45

Thanks. I'm also a bit surprised by the lack of reaction to this series given that:

compute scaling has been the biggest story of AI in the last few decades
it has dramatically changed
very few people are covering these changes
it is surprisingly easy to make major crisp contributions to our understanding of it just by analysing the few pieces of publicly available data
the changes have major consequences for AI companies, AI timelines, AI risk, and AI governance

You Should Get a Reusable Mask

Toby_Ord 2mo 8

Thanks Jeff, this was very helpful. I'd listened to Andrew Snyder-Beattie's excellent interview on the 80,000 Hours podcast and wanted to buy one of these, but hadn't known exactly what to buy until now.

Longtermism: An Impracticable Attempt to Reason Our Way into Becoming Irrationally Generous Heroes?

Toby_Ord 2mo 11

my hope with this essay is simply to make a case that all might benefit from a widening of Longtermism’s methods and a greater boldness in proclaiming that it is a part of the greatness of being human to be heroically, even slightly irrationally, generous in our relationship with others, including future generations, out of our love for humanity itself.

This is a very interesting approach, and I don't think it is in conflict with the approach in the volume. I hope you develop it further.

Effective altruism in the age of AGI

Toby_Ord 2mo 43

Thanks so much for writing this Will, I especially like the ideas:

It is much more clear now than it was 10 years ago that AI will be a major issue of our time, affecting many aspects of our world (and our future). So it isn't just relevant as a cause, but instead as something that affects how we pursue many causes, including things like global health, global development, pandemics, animal welfare etc.
Previously EA work on AI was tightly focused around technical safety work, but expansion of this to include governance work has been successful and we will need to further expand it, such that there are multiple distinct AI areas of focus within EA.

If you’ve got a very high probability of AI takeover (obligatory reference!), then my first two arguments, at least, might seem very weak because essentially the only thing that matters is reducing the risk of AI takeover.

I'm not even sure your arguments would be weak in that scenario.

e.g. if there were a 90% chance we fall at the first hurdle with an unaligned AI taking over, but also a 90% chance that even if we avoid this, we fall at the second hurdle with a post-AGI world that squanders most of the value of the future, then this would be symmetrical between the problems (it doesn't matter formally which one comes a little earlier). In this case we'd only have a 1% chance of a future that is close to the best we could achieve. Completely solving either problem would increase that to 10%. Halving the chance of the bad outcome for both of them would instead increase that to 30.25% (and would probably be easier than completely solving one). So in this case there would be active reason to work on both at once (even if work on each had a linear effect on its probability).
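The arithmetic in this two-hurdle example can be checked with a short sketch. The function and variable names are illustrative assumptions rather than anything from the post; the only modelling step is multiplying the chance of clearing the first hurdle by the chance of clearing the second given that the first was cleared.

```python
def p_near_best_future(p_takeover: float, p_squander: float) -> float:
    """Chance of clearing both hurdles.

    p_takeover: chance of falling at the first hurdle (unaligned AI takes over).
    p_squander: chance of falling at the second hurdle, given the first was cleared.
    """
    return (1 - p_takeover) * (1 - p_squander)


# Baseline: 90% chance of failure at each hurdle.
baseline = p_near_best_future(0.9, 0.9)

# Completely solve one problem (by symmetry it does not matter which).
solve_one = p_near_best_future(0.0, 0.9)

# Halve the chance of the bad outcome at both hurdles: 90% -> 45%.
halve_both = p_near_best_future(0.45, 0.45)

print(f"baseline:   {baseline:.2%}")
print(f"solve one:  {solve_one:.2%}")
print(f"halve both: {halve_both:.2%}")
```

Under these numbers the three cases come out at 1%, 10% and 30.25%, matching the figures in the comment: splitting effort across both hurdles beats fully solving one.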

One needs to add substantive additional assumptions on top of the very high probability of AI takeover to get it to argue against allocating some substantial effort to ensuring that even with aligned AI things go well. e.g. that if AI doesn't take over, it solves our other problems or that the chance of near-best futures is already very high, etc.

Toby_Ord

Posts 12

Comments 166