It's the first chapter in a new guide about how to help make AI go well (aimed at new audiences).
I think it's generally important for people who want to help to understand the strategic picture.
Plus in my experience the thing most likely to make people take AI risk more seriously is believing that powerful AI might happen soon.
I appreciate that talking about this could also wake more people up to AGI, but I expect the guide overall will boost the safety talent pool proportionally much more than the pool of people working to speed up AI.
(And long term I think it's also better to be open about my actual thinking rather than try to control the message to that degree; a big part of the case in favour, in my mind, is that it might happen soon.)
Yes I basically agree that's the biggest limiting factor at this point.
However, a better base model can improve agency via e.g. better perception (which is still weak).
And although reasoning models are good at science and math, they still make dumb mistakes when reasoning about other domains, and very high reliability is needed for agents. So I expect better reasoning models to also help quite a bit with agency.
I feel subtweeted :p As far as I can tell, most of the wider world isn't aware of the arguments for shorter timelines, and my pieces are aimed at them, rather than people already in the bubble.
That said, I do think there was a significant shortening of timelines from 2022 to 2024, and many people in EA should reassess whether their plans still make sense in light of that (e.g. general EA movement building looks less attractive relative to direct AI work compared to before).
Beyond that, I agree people shouldn't be making month-to-month adjustments to their plans based on timelines, and should try to look for robust interventions.
I also agree many people should be on paths that build their leverage into the 2030s, even if there's a chance it's 'too late'. It's possible to get ~10x more leverage by investing in career capital / org building / movement building, and that can easily offset the chance of arriving too late (rough illustration below). I'll try to get this message across in the new 80k AI guide.
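As a rough illustration of the trade-off (toy numbers of my own, not figures from the guide): suppose investing in career capital multiplies your eventual impact by ~10x, but with probability p the key decisions get made before you can cash that in. Then you're comparing roughly

\[
\underbrace{1}_{\text{direct work now}} \quad \text{vs.} \quad \underbrace{10\,(1-p)}_{\text{build leverage first}}
\]

so the leverage-building path comes out ahead in expectation unless p is above ~90%.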
Also agree for strategy it's usually better to discuss specific capabilities and specific transformative effects you're concerned about, rather than 'AGI' in general. (I wrote about AGI because it's the most commonly used term outside of EA and was aiming to reach new people.)
I wouldn't totally defer to them, but I wouldn't totally ignore them either. (And this is mostly beside the point, since overall I'm critical of using their forecasts and my argument doesn't rest on this.)
I only came across this paper in the last few days! (The post you link to is from 5th April; my article was first published 21st March.)
I want to see more commentary on the paper before deciding what to do about it. My current understanding:
o3-mini seems to be a lot worse than o3: it only got ~10% on FrontierMath, similar to o1. (Claude Sonnet 3.7 only gets ~3%.)
So the results actually seem consistent with the FrontierMath scores, except that they didn't test o3, which is significantly ahead of the other models.
The other factor seems to be that they evaluated the quality of the proofs rather than the ability to get a correct numerical answer.
I'm not sure data leakage is a big part of the difference.
So, OpenAI is telling the truth when it says AGI will come soon and lying when it says AGI will not come soon?
I don't especially trust OpenAI's statements on either front.
The framing of the piece is "the companies are making these claims, let's dig into the evidence for ourselves" not "let's believe the companies".
(I think the companies are most worth listening to when it comes to specific capabilities that will arrive in the next 2-3 years.)
Minor point, but I actually think DeepSeek was pretty on-trend for algorithmic efficiency (as explained in the post). The main surprise was that a Chinese company was near the forefront of algorithmic efficiency (though here, from several months earlier, I suggest that Chinese labs are close to the frontier there).