The Prospect of an AI Winter

Erich_Grunewald 🔸

18 min read

Comments 13

Matthew_Barnett

Algorithmic progress on ImageNet seems to effectively halve compute requirements every 4 to 25 months (Erdil and Besiroglu 2022); assume that the doubling time is 50% longer for transformers.

I think it's important not to take the trend in algorithmic progress too literally. At the moment, we only really know the rate for computer vision, which might be very different than for other tasks. The confidence interval is also quite wide, as you mentioned (the 5th percentile is 4 months and 95th percentile is 25 months). And algorithmic progress is plausibly driven by increasing algorithmic experimentation over time, which might become bottlenecked after either the relevant pool of research talent is exhausted or we reach hardware constraints. For these reasons, I have wide uncertainty regarding the rate of general algorithmic progress in the future.

In my experience, fast algorithmic progress is often the component that yields short timelines in compute-centric models. And yet, both the rate of and the mechanism behind algorithmic progress are very poorly understood. Extrapolating this rate naively gives a false impression of high confidence in the future, in my opinion. Assuming that the rate is exogenous gives the arguably false impression that we can't do much to change it. I would be very careful before interpreting these results.
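To make the width of that interval concrete, here is a minimal sketch of what the quoted halving times would imply if extrapolated naively. The 60-month horizon is a hypothetical choice for illustration, and the 1.5x factor is just the "50% longer for transformers" assumption quoted above; nothing here is a forecast.

```python
def compute_reduction(months_elapsed: float, halving_months: float) -> float:
    """Factor by which compute requirements shrink if they halve every `halving_months`."""
    return 2.0 ** (months_elapsed / halving_months)


HORIZON_MONTHS = 60  # hypothetical horizon, chosen only for illustration
TRANSFORMER_SLOWDOWN = 1.5  # "50% longer" halving time, as assumed in the quoted passage

# 5th and 95th percentile halving times quoted for ImageNet (Erdil and Besiroglu 2022)
imagenet_halving_months = {"5th percentile": 4.0, "95th percentile": 25.0}

for label, halving in imagenet_halving_months.items():
    vision = compute_reduction(HORIZON_MONTHS, halving)
    transformers = compute_reduction(HORIZON_MONTHS, halving * TRANSFORMER_SLOWDOWN)
    print(f"{label}: vision x{vision:,.1f}, transformers (assumed) x{transformers:,.1f}")
```

Under these assumptions the two ends of the interval differ by several orders of magnitude over the same horizon, which is the sense in which a naive extrapolation overstates how much we know.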

aog

Nicely written, these make a lot of sense to me. My case for AI winter would focus on two tailwinds that will likely cease by the end of the decade: money and data.

Money can't continue scaling like this. Spending on training runs has gone up by about an order of magnitude every two years over the last decade. By 2032 this trend would put us at $100B training runs, which would be 3x Google's entire R&D budget. If TAI doesn't emerge before then, spending growth will need to slow down, likely slowing AI progress as well.
Maybe data can't either. Full analysis here, but basically high quality language data will run out well before 2030, possibly within the next year or two. But there are other kinds of data that could continue scaling. For example, the Epoch report discusses low quality language data like private texts and emails as one possibility. I would look more to the transition from language models to multimodal vision-and-language models as an important trend not only because vision is a useful modality, but because it would allow data scaling to continue.

I'd like to have a better view of questions about the continuation of Moore's Law. Without a full writeup, the claims about Moore's Law ending seem more credible now than in the past. I would be really interested in an aggregation of historical forecasts about Moore's Law to see whether the current doomsaying is any different from the long run trend.

NickLaing

On the data front, it seems like ChatGPT and other AIs don't have access to the mass of peer-reviewed journals yet. Obviously this isn't (relatively speaking) a huge quantity of data, but the quality would be orders of magnitude higher than what they are looking at now. Could access to these change things much at all?

aog

That’s a reasonable point, but I don’t think peer reviewed journals would make much difference.

The Pile (https://arxiv.org/pdf/2101.00027.pdf) is a large dataset used for training lots of models. It includes the academic datasets Arxiv, FreeLaw, and PubMed Central, which contain 50GB, 50GB, and 100GB of data respectively. Table 7 says each byte is ~0.2 tokens, so that’s about 40B tokens to represent a good chunk of the academic literature on several subjects. If we had a similarly-sized influx of peer reviewed journals, would that change the data picture?

Chinchilla, a state-of-the-art language model released by DeepMind one year ago, was trained on ~1.4T tokens. Only four years prior, BERT was a SOTA model trained on ~6B tokens. If we assume the Pile includes only 10% of existing academic literature, then peer-reviewed journals could represent a 400B token influx that would increase available data by roughly 29% over Chinchilla's training set. This would meaningfully expand the dataset, but not by the orders of magnitude necessary to sustain scaling for more than a few months or years.
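The arithmetic in this estimate can be written out as a short sketch. It assumes decimal gigabytes (1 GB = 1e9 bytes), which the comment does not specify, and it takes the 10% coverage figure purely as the hypothetical stated above.

```python
# Back-of-the-envelope check of the token estimate above.
GB = 1e9  # assumption: decimal gigabytes

pile_academic_bytes = (50 + 50 + 100) * GB  # Arxiv, FreeLaw, PubMed Central
TOKENS_PER_BYTE = 0.2                       # ratio quoted from the Pile paper
pile_academic_tokens = pile_academic_bytes * TOKENS_PER_BYTE

ASSUMED_PILE_SHARE = 0.10  # hypothetical: the Pile holds 10% of academic literature
journal_tokens = pile_academic_tokens / ASSUMED_PILE_SHARE

CHINCHILLA_TOKENS = 1.4e12
relative_increase = journal_tokens / CHINCHILLA_TOKENS

print(f"Academic tokens in the Pile: {pile_academic_tokens:.2e}")
print(f"Hypothetical journal influx: {journal_tokens:.2e}")
print(f"Increase relative to Chinchilla: {relative_increase:.1%}")
```

Even if the assumed share were several times too pessimistic or optimistic, the result stays within about one order of magnitude of Chinchilla's data budget.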

NickLaing

Wow, what a great answer, appreciate it!

aog

Money can't continue scaling like this.

Or can it? https://www.wsj.com/tech/ai/sam-altman-seeks-trillions-of-dollars-to-reshape-business-of-chips-and-ai-89ab3db0

titotal

First of all, great post, thanks for exploring this topic!

So I'm a little confused about the definition here:

AI winter is operationalised as a drawdown in annual global AI investment of ≥50%

I would guess that the burst of the dot-com bubble meets this definition? But I wouldn't exactly call 2002-2010 an "internet winter": usage kept growing and growing, just with a better understanding of what you can and can't profit from. I think there's a good chance (>30%) of this particular definition of "AI winter" occurring, but I also reckon if it happens, people will feel like it's unfair to characterize it as such.

I think a more likely outcome is a kind of "AI autumn": Investment keeps coming at a steady rate, and lots and lots of people are using AI for the things it's good at, but the number of advancements slows significantly, and certain problems prove intractable, and the hype dies down. I think we've already seen this process happen for Autonomous Vehicles. I think this scenario is very likely.

NickLaing

I would put a huge reduction in investment as way higher than 30% - investment cycles boom and bust as does the economy. Even a global recession or similar could massively reduce AI expenditure while AI development continued marching on at a similar or only slightly reduced rate.

On the other hand, the current crypto winter does match the OP's definition, with practical use of crypto falling along with investment.

In general though I agree with you that looking at investment figures isn't a robust way to define a "winter".

Erich_Grunewald 🔸

Thanks, that's a good observation -- you're right that this is a permissive operationalisation. I actually deliberately did that to be more "charitable" to Eden -- to say, "AI winter seems pretty unlikely even on these pretty conservative assumptions", but I should probably have flagged that more clearly. I agree that there are some scenarios where a 50% drawdown happens but there's no real winter worthy of the name.

Another way of putting this is, I thought I'd get pushback along the lines of "this is way too bullish on AI progress" (and I did get some of that, but not a lot), and instead got lots of pushback in the form of "this is way too bullish on AI winter". (Not talking about the EA forum here, but other places.)

I think a more likely outcome is a kind of "AI autumn": Investment keeps coming at a steady rate, and lots and lots of people are using AI for the things it's good at, but the number of advancements slows significantly, and certain problems prove intractable, and the hype dies down. I think we've already seen this process happen for Autonomous Vehicles. I think this scenario is very likely.

Agree that this is a live possibility. (But I also don't think there's been a 50% drawdown in autonomous driving investment, so I don't think my operationalisation fails there.)

MaxRa

Nice post, found this pretty well written and convincing (though I already shared the bottom line, just less firmly).

Random thoughts:

A severe extreme geopolitical tail event, such as a great power conflict between the US and China, may occur.

What type of great power conflict do you have in mind here? "Extreme tail event" makes it sound like you're thinking of a fairly large scale war, but great power conflict seems to refer to any military confrontation. E.g. I haven't at all wrapped my head around a military confrontation between China and the US over Taiwan yet, and Metaculus is at ~20% for

Will armed conflicts between the Republic of China (Taiwan) and the People's Republic of China (PRC) lead to at least 100 deaths before 2026?

Also, I wonder if you have considered any potential craziness that happens conditional on TAI being developed before 2030. E.g. say TAI is developed in 2027; maybe the plausible set of scenarios for 2028 and 2029 includes sufficiently many scenarios where we see a >50% decrease in AI funding that you might want to increase your bottom line forecast?

[anonymous]

(Uncertain) My guess would be that a global conflict would increase AI investment considerably, as (I think) R&D typically increases in war times. And AI may turn out to be particularly strategically relevant.

NickLaing

Agreed. Looking historically as well, there's every reason to think that war is more likely to accelerate technology development. In this case, alignment focus is also likely to disappear completely if there is a serious war.

Dem drones will be unleashed with the most advanced AI software, safety be damned.

Erich_Grunewald 🔸

What type of great power conflict do you have in mind here? "Extreme tail event" makes it sound like you're thinking of a fairly large scale war, but great power conflict seems to refer to any military confrontation. E.g. I haven't at all wrapped my head around a military confrontation between China and the US over Taiwan yet, and Metaculus is at ~20% for

Yeah, that's an interesting question. I guess what I had in mind here was the US and China basically destroying each other's fabs or something along those lines (a compute shortage would make investment in AI labs less profitable, perhaps). But even that could increase investment as they strive to rebuild capacities? Maybe the extreme tail event that'd cause this is perpetual world peace happening!

By comparison, there seems to have been a drawdown in corporate investment in AI from 2014 to 2015 of 49%, in solar energy from 2011 to 2013 of 24% and in venture/private investment in crypto companies from 2018 to 2019 of 48%. The share prices of railways in Britain declined by about 60% from 1845 to 1850 as the railway mania bubble burst (Odlyzko 2010), though the railway system of course left Britain forever changed nonetheless. ↩︎
Well, this depends a bit on how you view Moore's Law. Gordon Moore wrote: "The complexity for minimum component costs has increased at a rate of roughly a factor of two per year." Dennard scaling -- which says that as transistors shrink, their performance improves while power consumption per unit area remains constant -- failed around 2005. I think some traditionalists would say that Moore's Law ended then, but clearly the number of transistors on a chip keeps doubling (only by other means). ↩︎
William Eden actually only talks about artificial general intelligence (AGI), but I think the TAI frame is better when talking about winters, investment and profitability. ↩︎
It's interesting to note that the term AI winter was inspired by the notion of a nuclear winter. AI researchers in the 1980s used it to describe a calamity that would befall themselves, namely a lack of funding, and, true, both concepts involve stagnation and decline. But a nuclear winter happens after nuclear weapons are used. ↩︎
Apparently the collapse of the LISP machine market was also a contributing factor. LISP machines were expensive workstations tailored to the use of LISP, at the time the preferred programming language of AI researchers. As AI programs were ~always written in LISP, and required a lot of compute and memory for the time, the loss of LISP machines was a serious blow to AI research. It's a bit unclear to me how exactly the decline of LISP machines slowed AI progress beyond that, but perhaps it forced a shift to less compute- and/or memory-hungry approaches. ↩︎
The question is actually operationalised as: "Will the transistors used in the CPU of Apple's most modern available iPhone model on January 1st, 2030 be of the same generation as those used in the CPU of the most modern available iPhone on January 1st, 2025?" ↩︎
That said, MosaicBERT (2023) achieves similar performance to BERT-Base (2018) with lower costs but seemingly more compute. I estimate that BERT-Base needed ~1.2e18 FLOP in pre-training, and MosaicBERT needed ~1.6e18. I'm not sure if this is an outlier, but it could suggest that the algorithmic doubling time is even longer for text models. When I asked about this, one of the people who worked on MosaicBERT told me: "[W]e ablated each of the other changes and all of them helped. We also had the fastest training on iso hardware a few months ago (as measured by MLPerf), and MosaicBERT has gotten faster since then." ↩︎
$10B may seem like a lot now, but I'm thinking world-times where this is a possibility are world-times where companies have already spent $1B on GPT-6 or whatever and seen that it does amazing things, and is plausibly not that far from being transformative. And spending $10B to get TAI seems like an obviously profitable decision. Companies spend 10x-100x that amount on some mergers and acquisitions, yet they're trivial next to TAI or even almost-TAI. If governments get involved, $10B is half of a Manhattan-project-equivalent, a no-brainer. ↩︎
Example prompt: "Can you sort this list in ascending order? [0, 8, 6, 5, 1, 1, 1, 8, 3, 7]". ↩︎
FT (2022): "It has been an outrageously expensive endeavour, of course. McKinsey put the total invested at over $100bn since 2010. Last year alone, funding into autonomous vehicle companies exceeded $12bn, according to CB Insights." -- If those numbers are right, that at least suggests the amount of funding in 2021 was substantially higher than the average over the last decade, a picture which seems inconsistent with an AV winter. ↩︎
Well, there is the ethical concern. ↩︎
I'm not exactly sure whether this analysis is done on training performance alone, but I expect trends in training performance to be highly correlated with trends in inference performance. Theoretical peak performance isn't the only thing that matters -- e.g. interconnect speed matters too -- but it seems like the most important component.

I'm also guessing that demand for inference compute is rising rapidly relative to training compute, and that we may be seeing R&D on GPUs specialised on inference in future. I think so far that hasn't been the focus as training compute has been the main bottleneck. ↩︎
By true out-of-distribution generalisation, I mean to point at something like "AI systems are able to find ideas obviously drawn from outside familiar distributions". To make that more concrete, I mean the difference between (a) AIs generating entirely new Romantic-style compositions and (b) AIs ushering in novel kinds of music the way von Weber, Beethoven, Schubert and Berlioz developed Romanticism. ↩︎
I'm not confident that this would scale, though. A quick back-of-the-envelope calculation suggests OpenAI would get the equivalent of about 0.016% of the data used to train Chinchilla if it spent the equivalent of 10 well-paid engineers' salaries (in total ~$200K per month) for one year. That's not really a lot.

That also assumes:
1. A well-paid engineer is paid $200K to $300K annually.
2. A writer is paid $10 to $15 per hour (this article suggests OpenAI paid that amount for Kenyan labourers -- themselves earning only $1.32 to $2 an hour -- to provide feedback on data for ChatGPT's reinforcement learning step).
3. A writer generates 500 to 1,500 words per hour (that seems reasonable if they stick to writing about themselves or other things they already know well).
4. A writer works 9 hours per day (the same Kenyan labourers apparently worked 9-hour shifts), about 21 days per month (assumes a 5-day work week).
5. Chinchilla was trained on ~1.4T tokens which is the equivalent of ~1.05T words (compare with ~374B words for GPT-3 davinci and ~585B words for PaLM) (Sevilla et al. 2022). I use Chinchilla as a point of comparison since that paper, which came after GPT-3 and PaLM were trained, implied LLMs were being trained on too little data.
Those assumptions imply OpenAI would afford ~88 labourers (90% CI: 66 to 118) who'd generate ~173M words per year (90% CI: 94M to 321M), as mentioned the equivalent of 0.016% of the Chinchilla training data set (90% CI: 0.009% to 0.031%). And that implies you'd need 6,000 years (90% CI: 3,300 to 11,100) to double the size of the Chinchilla data set. ↩︎
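The point estimates in this footnote can be roughly reproduced with a short sketch. It uses the geometric mean of each stated range as a single point estimate, which is my assumption rather than the method used for the intervals above, and it does not attempt to reproduce the 90% CIs.

```python
import math


def geometric_mean(low: float, high: float) -> float:
    """Point estimate for a range; an assumption, not the footnote's own method."""
    return math.sqrt(low * high)


ENGINEERS = 10
engineer_salary = geometric_mean(200_000, 300_000)  # USD per year (assumption 1)
writer_wage = geometric_mean(10, 15)                # USD per hour (assumption 2)
words_per_hour = geometric_mean(500, 1_500)         # assumption 3
HOURS_PER_YEAR = 9 * 21 * 12                        # 9 h/day, 21 days/month (assumption 4)
CHINCHILLA_WORDS = 1.05e12                          # ~1.4T tokens (assumption 5)

annual_budget = ENGINEERS * engineer_salary
writers = annual_budget / (writer_wage * HOURS_PER_YEAR)
words_per_year = writers * HOURS_PER_YEAR * words_per_hour
share_of_chinchilla = words_per_year / CHINCHILLA_WORDS
years_to_double = CHINCHILLA_WORDS / words_per_year

print(f"Writers affordable: {writers:.0f}")
print(f"Words per year: {words_per_year:.3g}")
print(f"Share of Chinchilla data: {share_of_chinchilla:.4%}")
print(f"Years to double the data set: {years_to_double:,.0f}")
```

With these inputs the results land close to the ~88 labourers, ~173M words per year, ~0.016% and ~6,000 years quoted above.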
