What is Compute? - Transformative AI and Compute [1/4]

lennart

21 min read

Comments 5

MarkusAnderljung

I really appreciated the extension of "AI and Compute". Do you have a sense of how much of the difference between your doubling-time estimate and the one in "AI and Compute" stems from differences in selection criteria versus new data published since 2018? Have you analyzed what the trend looks like if you only include data points that fulfil their inclusion criteria?

For reference, it seems like their criteria are "... results that are relatively well known, used a lot of compute for their time, and gave enough information to estimate the compute used." Whereas yours are "important publication within the field of AI OR lots of citations OR performance record on common benchmark". "... used a lot of compute for their time" would probably do a whole lot of work to select data points that will show a faster doubling time.

lennart

I have been wondering the same. However, given that OpenAI's "AI and Compute" inclusion criteria are also a bit vague, I'm having a hard time telling which of our data points would fulfill their criteria.

In general, I would describe our dataset as matching the same criteria because:

"relatively well known" equals our "lots of citations".
"used a lot of compute for their time" equals our dataset if we exclude outliers from efficient ML models.
- There's a recent trend in efficient ML models that achieve similar performance by using less compute for inference and training (those models are then used for e.g., deployment on embedded systems or smartphones).
"gave enough information to estimate the compute": We also rely on estimates from us or the community based on the information available in the paper. For the source of each estimate, see the note on the corresponding cell in our dataset.
- We're working on gathering more compute data by directly asking researchers (next target n=100).

I'd be interested in discussing more precise inclusion criteria. As I say in the post:

Also, it is unclear on which models we should base this trend. The piece AI and Compute also quickly discusses this in the appendix. Given the recent trend of efficient ML models due to emerging fields such as Machine Learning on the Edge, I think it might be worthwhile discussing how to integrate and interpret such models in analyses like this — ignoring them cannot be the answer.

MarkusAnderljung

Thanks! What happens to your doubling times if you exclude the outliers from efficient ML models?

lennart

The described doubling time of 6.2 months is the result when the outliers are excluded. If one includes all our models, the doubling time is around 7 months. However, the number of efficient ML models was only one or two.
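To make the effect of excluding outliers concrete, here is a minimal sketch of how a doubling time can be estimated with a log-linear least-squares fit. The data points are synthetic and purely illustrative (they are not taken from the dataset discussed here), and the function names `doubling_time_months` and `annual_growth_factor` are hypothetical helpers, not part of any published analysis.

```python
import math


def doubling_time_months(points):
    """Fit log2(compute) = a + b * t by ordinary least squares; return 1 / b.

    points: list of (months_since_start, training_compute_in_flop) pairs.
    """
    ts = [t for t, _ in points]
    ys = [math.log2(c) for _, c in points]
    n = len(points)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    variance = sum((t - t_mean) ** 2 for t in ts)
    slope = covariance / variance  # doublings per month
    return 1.0 / slope             # months per doubling


def annual_growth_factor(doubling_months):
    """Convert a doubling time in months into a growth factor per year."""
    return 2.0 ** (12.0 / doubling_months)


# Synthetic assumption: compute doubles exactly every 6 months over 5 years.
large_models = [(t, 1e18 * 2.0 ** (t / 6.0)) for t in range(0, 61, 6)]
# Hypothetical efficient model: published late, trained with little compute.
efficient_model = (60, 1e20)

dt_excluded = doubling_time_months(large_models)
dt_included = doubling_time_months(large_models + [efficient_model])

print(f"outlier excluded: {dt_excluded:.2f} months")
print(f"outlier included: {dt_included:.2f} months")
print(f"6.2 months -> {annual_growth_factor(6.2):.2f}x per year")
print(f"7.0 months -> {annual_growth_factor(7.0):.2f}x per year")
```

In this toy setup a single late, low-compute model flattens the fitted slope and therefore lengthens the doubling time, which is the same direction as the 6.2 versus 7 month difference above. Expressed as yearly growth, a 6.2-month doubling time corresponds to roughly 3.8x per year and a 7-month doubling time to roughly 3.3x per year.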

SolenoidEntity

@lennart apologies if this is a silly question, but either there's an error in footnote 4, or I misunderstand something fundamental:

A petaflop/s is $10^{15}$ floating point operations per second for one day. A day has $86{,}400 \text{ seconds} \approx 10^{5} \text{ seconds}$. Therefore, $10^{20}$ floating point operations

Shouldn't this read something like (in verbatim spoken words)

"A petaflop per second is ten to the power of fifteen floating point operations per second. A day has [...] 10 to the power of five seconds. Therefore, a 'petaflop-per-second' DAY is 10 to the power of twenty floating point operations."

You've said a petaflop/s is x flop/s for one day, which seems like a typo maybe?

Would you say "petaflop-per-second" days if reading out loud?
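For anyone who wants to check the arithmetic behind footnote 4, here is a small sketch. It assumes the standard reading of the unit: one petaflop/s-day is a rate of $10^{15}$ floating point operations per second sustained for one day. The helper name `petaflop_s_days_to_flop` is made up for illustration.

```python
PETA = 10 ** 15                  # floating point operations per second in one petaflop/s
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def petaflop_s_days_to_flop(pfs_days):
    """Total floating point operations performed in `pfs_days` petaflop/s-days."""
    return pfs_days * PETA * SECONDS_PER_DAY


exact = petaflop_s_days_to_flop(1)   # 86,400 * 10^15 = 8.64 * 10^19
approx = 10 ** 15 * 10 ** 5          # footnote's rounding of a day to 10^5 seconds

print(f"exact:  {exact:.3e} FLOP")
print(f"approx: {approx:.3e} FLOP")
```

So the unit is a quantity of operations, not a rate: the exact value is $8.64 \times 10^{19}$, which the footnote rounds to $10^{20}$.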

One could argue the universe is a computer as well: pancomputationalism. ↩︎
You can read some thoughts on quantum computing in the series “Forecasting Quantum Computing” by Jaime Sevilla. ↩︎
In reinforcement learning, compute produces the training data by running the interactive environment. Therefore, more compute leads to more available training data. ↩︎
A petaflop/s equals $10^{15}$ floating point operations per second. A day has $86{,}400 \text{ seconds} \approx 10^{5} \text{ seconds}$. Therefore, a petaflop/s-day equals approximately $10^{20}$ floating point operations. ↩︎
Nonetheless, according to estimates, most compute overall is probably used for inference, that is, for running deployed AI systems. Although the training process is, as outlined, computationally more complex, inference is repeated so often once a system is deployed that it consumes more compute in total (compute per training run >> compute per inference, but number of inferences >> number of training runs) (Amodei and Hernandez 2018). In the future, those resources could be repurposed for training, if we do not see different hardware for training and inference (discussed in Section 4.2). ↩︎
The final training run refers to the last training of an AI system before stopping updating the learned weights and biases and deploying the network for inference. There are usually dozens to hundreds of training runs of AI systems to tweak the architecture and hyper-parameters optimally. While this metric is relevant for the development costs, it is not an optimal proxy for the systems’ capabilities. ↩︎
“We think it’d be a mistake to be confident this trend won’t continue in the short term.” (Amodei and Hernandez 2018). ↩︎
The data used in this section comes from a project by Jaime Sevilla, Pablo Villalobos, Matthew Burtell and Juan Felipe Cerón. We collaborated to add more compute estimates to the public database. I can recommend their first analysis: “Parameter counts in Machine Learning”. ↩︎
Transformative AI, as defined by Open Philanthropy in this blogpost: “Roughly and conceptually, transformative AI is AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution.” ↩︎
For more thoughts and a discussion on this, I can recommend “The Scaling Hypothesis” by Gwern (or the summary in the AI Alignment Newsletter #156). ↩︎
I would also describe the purple part as an open research question. How can we decompose this — differentiating between parallelization, an engineering effort, and spending, where it is easier to find upper limits? ↩︎
I would be interested in an update on this. However, I also did not spend time looking for an update on this in the recent AI experts surveys. ↩︎
I initially made the claim that there are reasons to believe that the available memory capacity of compute systems might match the human brain or at least be sufficient (at least the information we can consciously recall and access). However, while thinking more about this claim, I became uncertain. I started wondering if the brain also has something similar to a memory hierarchy as it is the default for compute systems (different levels of memory capacities which can be accessed at different speeds). I would be interested in research on this. ↩︎
In general, computational power is key to our modern society, and might also be the foundation of life in the future: digital minds. The future of humanity could be computed on digital computers — see “Digital People Would Be An Even Bigger Deal” by Holden Karnofsky or “Sharing the World with Digital Minds” by Bostrom. ↩︎

What is Compute? - Transformative AI and Compute [1/4]

Epistemic Status

1. Compute

1.1 Logic, Memory and Interconnect

1.2 Chips or Integrated Circuits

2. Compute in AI Systems

2.1 Computing in AI Systems

Training

Inference

2.2 Compute Trends: 2012 to 2018

2.3 Compute Trends: An Update[8]

3. Compute and AI Alignment

3.1 The Bitter Lesson

3.2 Scaling Hypothesis

3.3 AI and Efficiency

3.4 Qualitative Assessment

3.5 Compute Milestones

3.6 Conclusion

Next Post: Forecasting Compute

Acknowledgments

References
