
In What will GPT-2030 look like?, Jacob Steinhardt imagines what large pretrained ML systems might look like in 2030. He predicts that a hypothetical GPT2030 would:

  1. Be superhuman at specific tasks
  2. Work and think quickly
  3. Be run in many parallel copies
  4. Learn quickly in parallel
  5. Be trained on additional modalities

Thinking through the implications of any of these predictions is, I think, pretty interesting - but here I want to focus on getting a handle on 2-4.

As I understand it, 4 is basically a function of 2 and 3: systems will learn quickly in parallel in proportion to how fast they work/think/learn individually and to how many copies can be run in parallel. (Probably there are other technical details here; I’m going to ignore them for now.)

It makes intuitive sense to me that GPT2030 will be fast at thinking and fast at learning. But how fast is fast?

Jacob predicts that:

  • GPT2030 will think/work 5x as fast as a human
  • GPT2030 will be trained on enough compute to perform 1.8 million years of work at human speed
  • It will be feasible to run at least 1 million copies of GPT2030 in parallel

On the back of this, he estimates that (see the sketch after this list for the arithmetic):

  • GPT2030 will be able to perform 1.8 million years of work in 2.4 months[1]
  • GPT2030 will be able to do 2,500 years of learning in 1 day[2]
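
The arithmetic behind both figures is spelled out in footnotes 1 and 2. Here's a minimal sketch of it in Python, using Jacob's copy counts and speedup; the second figure comes out nearer 2,700, which I take to just be rounding.

```python
# Back-of-envelope check of the two headline estimates, using the
# assumptions from footnotes 1 and 2.

COPIES_CENTRAL = 1_800_000  # central estimate of parallel copies
COPIES_LOWER = 1_000_000    # conservative lower bound
SPEEDUP = 5                 # thinking/working speed relative to a human

# Estimate 1: 1.8M copies each do one year of work; at 5x speed that
# takes a fifth of a calendar year.
months_needed = 12 / SPEEDUP
print(f"{COPIES_CENTRAL:,} years of work in {months_needed:.1f} months")
# -> 1,800,000 years of work in 2.4 months

# Estimate 2: 1M copies each run for one day, with no speedup factored in.
years_per_day = COPIES_LOWER / 365
print(f"~{years_per_day:,.0f} years of learning per day")
# -> ~2,740 years per day, close to the ~2,500 quoted
```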

For now, let’s just assume that these forecasts are sensible. How much learning is 2,500 years? What would 1.8 million years of work in 2.4 months look like?

(An aside on why I’m interested in this: I want to have a sense of how ‘good’ AI might be in 2030/the future. 1.0e28 training FLOPs doesn’t viscerally mean anything to me. 1.8 million years of work starts to mean something - but I still don’t really grok what it would mean if you could fit that many years of work into 2.4 months.)

Assuming that people work 8 hours a day, 5 days a week, 50 weeks a year,[3] and taking the first estimates I found on the internet for how many people are in different groups, in a single day:

  • PhD students study for a bit more than 8000 years[4]
  • DOD employees work for close to 2000 years
  • Amazon employees work for around 900 years
  • Mathematicians work for just over 400 years
  • AI researchers work for nearly 200 years
  • Microsoft and Alphabet each work for a bit over 100 years
  • Google DeepMind works for a year and a half
  • OAI works for nearly 10 months
  • Anthropic works for nearly 4 months

So an AI system that could fit 2,500 years of learning/thinking/working into one day would be doing (arithmetic sketched below):

  • A third of the learning of all PhD students in the world
  • 10x the work of all AI researchers
  • Around a thousand times the work of OAI, GDM and Anthropic combined

 

Every day.
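
As a rough check, those ratios fall straight out of the per-day figures quoted above; here's a quick sketch using only the rounded numbers from the list.

```python
# Years of work per day for each group, taken from the rounded figures above.
phd_students = 8_000
ai_researchers = 200
google_deepmind = 1.5   # a year and a half
openai = 10 / 12        # nearly 10 months
anthropic = 4 / 12      # nearly 4 months

gpt2030_per_day = 2_500  # years of learning/thinking/working per day

print(gpt2030_per_day / phd_students)    # ~0.3: a third of all PhD students
print(gpt2030_per_day / ai_researchers)  # ~12: roughly 10x all AI researchers
print(gpt2030_per_day / (openai + google_deepmind + anthropic))
# ~940: around a thousand times the three labs combined
```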

 

What about doing 1.8 million years of work in 2.4 months?

Making the same assumptions about human working time, you’d need around 40 million humans to fit 1.8 million years of work into that time.[5]
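
Footnote 5 gives the route to the ~40 million figure but doesn't spell out the hours conversion. Here's one way to reproduce it, assuming a year of GPT2030 work means running around the clock (8,760 hours) while the average human's 2,000 working hours are spread over a calendar year:

```python
# Reproducing the ~40 million humans figure from footnote 5, under the
# assumption that GPT2030 works 24/7 while humans average 2,000 hours/year.

AI_YEARS_OF_WORK = 1_800_000
DAYS_AVAILABLE = 365 * 2.4 / 12        # 73 days in 2.4 months
HOURS_PER_AI_YEAR = 24 * 365           # 8,760: GPT2030 never clocks off
HUMAN_HOURS_PER_CAL_DAY = 2_000 / 365  # ~5.5 hours per calendar day

ai_years_per_day = AI_YEARS_OF_WORK / DAYS_AVAILABLE      # ~25,000
ai_hours_per_day = ai_years_per_day * HOURS_PER_AI_YEAR   # ~216 million
humans_needed = ai_hours_per_day / HUMAN_HOURS_PER_CAL_DAY

print(f"~{humans_needed / 1e6:.0f} million humans")  # -> ~39 million
```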

For scale:

  • There are around 30 million software engineers in the world
  • The entire labour force of Germany is around 44 million

So GPT2030 would be doing more work than all software engineers combined.

__________________________________________________

So, where does all of this get me to? Mostly, that fast is pretty darn fast.

On Jacob’s predictions, it’s not the case that GPT2030 could do more work than humanity combined or anything crazy like that - but GPT2030 could be doubling the amount of software engineering, or ten-x-ing the amount of AI research, or thousand-x-ing the amount of AGI research.

I think worlds like that could look pretty strange.

 

Thanks to Owen Cotton-Barratt, Oscar Delaney, and Will MacAskill for comments; and to Max Dalton for unblocking me on posting.

  1. ^

     Here I think Jacob is taking a central estimate of 1.8 million copies, rather than his lower bound of at least 1 million. So 1.8 million systems working 5x as fast as humans can do 1.8 million years of work in 1 year / 5x speedup = 2.4 months.

  2. ^

     Here I think Jacob is using his lower bound of 1 million copies, and also isn’t factoring in the 5x speedup, I presume to make an even more conservative lower bound. So 1 million copies working for 1 day each is 1 million days, which is around 2,500 years. (With the 5x speedup, it would be around 13,500 years.)

  3. ^

     Which seems to be roughly the global average right now, from https://ourworldindata.org/working-more-than-ever (8 × 5 × 50 = 2,000 hours per year)

  4. ^

     Assuming 222 million people in tertiary education, from https://www.worldbank.org/en/topic/tertiaryeducation, and then assuming that a) the percentage of the global population with completed tertiary education is 17%, from https://ourworldindata.org/grapher/share-of-the-population-with-completed-tertiary-education?tab=table&time=2020..2025, b) the percentage of the global population with a PhD is 1%, and c) therefore the proportion of those in tertiary education who are PhD students is roughly 1:17.

  5. ^

     There are 73 days in 2.4 months (365 days × 2.4/12). So that’s ~25,000 years of work in a day (1.8 million years / 73 days). Converting that into hours and dividing by the hours worked by the average human in an average day gets you to ~40 million humans.

Comments (3)



Interesting. People probably aren't at peak productivity, or even working at all, for some part of those hours, so you could probably cut the hours by 1/4. This narrows the gap between what GPT2030 can achieve in a day and what all humans can achieve together.

Assuming 9 billion people work 8 hours, that's ~8.22 million years of work in a day. But given slowdowns in productivity throughout the day, we might want to round that down to ~6 million years.

Additionally, GPT2030 might be more effective than even the best human workers at their peak hours. If it's 3x as good as a PhD student at learning, which it might be because of better retention and connections, it would be learning more than all PhD students in the world every day. The quality of its work might be 100x or 1000x better, which is difficult to compare abstractly. In some tasks like clearing rubble, more work time might easily translate into catching up on outcomes. 

With things like scientific breakthroughs, more time might not result in equivalent breakthroughs. From that perspective, GPT2030 might end up doing more work than all of humanity since huge breakthroughs are uncommon. 

 

I like the vividness of the comparisons!

A few points against this being nearly as crazy as the comparisons suggest:

  • GPT-2030 may learn much less sample efficiently, and much less compute efficiently, than humans. In fact, this is pretty likely. Ball-parking, humans do 1e24 FLOP before they're 30, which is ~20X less than GPT-4. And we learn languages/maths from way fewer data points. So the actual rate at which GPT-2030 itself gets smarter will be lower than the rates implied. 
    • This is a sense of "learn" as in "improves its own understanding". There's another sense which is "produces knowledge for the rest of the world to use, eg science papers" where I think your comparisons are right. 
  • Learning may be bottlenecked by serial thinking time past a certain point, after which adding more parallel copies won't help. This could make the conclusion much less extreme.
  • Learning may also be bottlenecked by experiments in the real world, which may not immediately get much faster.

Thanks, I think these points are good.

  • Learning may be bottlenecked by serial thinking time past a certain point, after which adding more parallel copies won't help. This could make the conclusion much less extreme.

Do you have any examples in mind of domains where we might expect this? I've heard people say things like 'some maths problems require serial thinking time', but I still feel pretty vague about this and don't have much intuition about how strongly to expect it to bite.


 
