In recent months, the CEOs of leading AI companies have grown increasingly confident about rapid progress:
* OpenAI's Sam Altman: Shifted from saying in November "the rate of progress continues" to declaring in January "we are now confident we know how to build AGI"
* Anthropic's Dario Amodei: Stated in January "I'm more confident than I've ever been that we're close to powerful capabilities... in the next 2-3 years"
* Google DeepMind's Demis Hassabis: Changed from "as soon as 10 years" in autumn to "probably three to five years away" by January.
What explains the shift? Is it just hype? Or could we really have Artificial General Intelligence (AGI)[1] by 2028?
In this article, I look at what's driven recent progress, estimate how far those drivers can continue, and explain why they're likely to continue for at least four more years.
In particular, while in 2024 progress in LLM chatbots seemed to slow, a new approach started to work: teaching the models to reason using reinforcement learning.
In just a year, this let them surpass human PhDs at answering difficult scientific reasoning questions, and achieve expert-level performance on one-hour coding tasks.
We don't know how capable AGI will become, but extrapolating the recent rate of progress suggests that, by 2028, we could reach AI models with beyond-human reasoning abilities and expert-level knowledge in every domain that can autonomously complete multi-week projects. Progress would likely continue from there.
On this set of software engineering and computer-use tasks, AI in 2020 could only do tasks that would typically take a human expert a couple of seconds. By 2024, that had risen to almost an hour. If the trend continues, it will reach several weeks by 2028.
No longer mere chatbots, these 'agent' models might soon satisfy many people's definitions of AGI — roughly, AI systems that match human performance at most knowledge work (see definition in footnote).
This means that, while the compa
Thanks for this Johannes - nice to see the agility and thoughtfulness of the FP Climate Fund!
I've got a couple of questions about tracking the impact of the EEIST report via media uptake:
I'm curious in general about any other thoughts you have on quantifying impact via media mentions, as that is one of the main outputs of activist groups that I'll be researching. I assume it would be more straightforward in this case as there are no sentiment issues to deal with (I imagine?) in terms of negative coverage vs positive coverage, hence no backfire effects to include. The theory of change / path to impact still seems slightly opaque though, so any thoughts on that would be helpful.
Hi James,
thanks for your questions!
Re 1, the ToC is actually different -- the report was already produced, but -- we believe -- would not have been sufficiently amplified absent the grant, so it is more about the latter part of your chain.
Re 2, this is roughly what we would do if the sums justified it -- this was a small grant and we operate by the principle of keeping detail of analysis roughly proportional to money moved, so we accepted higher uncertainty here. Something we will be thinking more about going forward if we evaluate similar grants.
Re 3, this is recorded in the article -- actually we wrote those sections ("what we expected") before the grants had effects ("what we achieved") to allow for this comparison.
Re 4, we spent about 30k, so reaching >3 million people is about 100 people per USD. There's more media uptake trickling in still, so it could be significantly more once all is said and done.
Re 5, this is a tool that the PR agency uses; I don't know which tool it is specifically.
Happy to connect more on those issues, though I probably won't have time to dig deeper into this before December.
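
For anyone who wants to check or update the Re 4 figure as more uptake comes in, here is a minimal sketch of the arithmetic. It assumes the "30k" is in USD and treats the 3 million as a lower bound on reach; the function and variable names are purely illustrative and not part of any tool mentioned above.

```python
def people_reached_per_usd(people_reached: float, spend_usd: float) -> float:
    """Return the number of people reached per US dollar spent."""
    if spend_usd <= 0:
        raise ValueError("spend_usd must be positive")
    return people_reached / spend_usd


# Figures from the reply above (assumptions: spend is in USD, reach is a lower bound).
SPEND_USD = 30_000
PEOPLE_REACHED = 3_000_000  # ">3 million"; more uptake was still trickling in

ratio = people_reached_per_usd(PEOPLE_REACHED, SPEND_USD)
print(f"{ratio:.0f} people reached per USD")
```

Because the reach figure is a lower bound, the ratio is too: any further media uptake at the same spend pushes it above this value.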