This is a linkpost to a new Substack article from MIT FutureTech explaining our recent paper, *On the Origins of Algorithmic Progress in AI*.

We demonstrate that some algorithmic innovations have efficiency gains that grow larger as pre-training compute increases. These scale-dependent innovations account for the majority of pre-training efficiency gains over the last decade, which may imply that much of what looks like algorithmic progress is in fact driven by compute scaling rather than by many incremental innovations.
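
To make the distinction concrete, here is a minimal sketch with made-up numbers: the functional form, the exponent `alpha`, and the reference scale `C0` are assumptions for illustration, not estimates from the paper. A scale-invariant innovation multiplies effective compute by the same factor at every scale, while a scale-dependent one gives a compute-equivalent gain that widens as training compute grows.

```python
import numpy as np

# Toy sketch (assumed functional forms, not taken from the paper): a scale-invariant
# innovation gives a constant compute-equivalent multiplier at any scale, while a
# scale-dependent innovation gives a multiplier that grows with the pre-training
# compute C at which the comparison is made.

def scale_invariant_gain(C, m=3.0):
    """Hypothetical constant compute-equivalent gain, e.g. a fixed 3x at every scale."""
    return np.full_like(C, m, dtype=float)

def scale_dependent_gain(C, C0=1e18, alpha=0.3):
    """Hypothetical gain that grows as a power of compute beyond a reference scale C0."""
    return (C / C0) ** alpha

C = np.array([1e18, 1e20, 1e23])   # pre-training compute in FLOPs
print(scale_invariant_gain(C))     # [3. 3. 3.]
print(scale_dependent_gain(C))     # [1.0, ~4.0, ~31.6]: the gain widens with scale
```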

From the paper, our core contributions are:

  1. We find that most of the algorithmic innovations we experimentally evaluate yield small, scale-invariant efficiency improvements, amounting to less than a 10× compute efficiency gain combined and less than 10% of total improvements when extrapolated to the 2025 compute frontier (2 × 10²³ FLOPs). This suggests that scale-invariant algorithmic progress contributes only a minor share of overall efficiency improvements.
  2. We find two strongly scale-dependent algorithmic innovations: the shift from LSTMs to Transformers, and the re-balancing of model and data size from Kaplan to Chinchilla scaling laws. Together, these account for 91% of total efficiency gains when extrapolating to the 2025 compute frontier. This implies that algorithmic progress for small-scale models is several orders of magnitude smaller than previously thought.
  3. We show that in the presence of scale-dependent innovations, not only do efficiency gains require continued compute investment, but the measured rate of algorithmic progress depends strongly on the choice of reference algorithm. In other words, the rate of progress in successive models can appear exponential relative to one baseline algorithm, yet be zero relative to another (illustrated with toy numbers in the sketch after this list).
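
To see why the reference algorithm matters, the following toy calculation (again with assumed numbers: a scale-dependent multiplier of the form (C/C0)^alpha and roughly 4×/year growth in frontier training compute, neither taken from the paper) measures the same sequence of frontier models against two different baselines.

```python
import numpy as np

# Toy illustration (hypothetical numbers, not the paper's estimates): if the dominant
# gains are scale-dependent, the measured rate of "algorithmic progress" depends on the
# reference algorithm. Here the frontier recipe's only advantage over an old baseline
# is a scale-dependent multiplier (C / C0)**alpha, and training compute C grows ~4x/year.

alpha, C0 = 0.3, 1e18              # assumed scale-dependence exponent and reference scale
years = np.arange(2018, 2026)
C = 1e18 * 4.0 ** (years - 2018)   # assumed ~4x/year growth in frontier training compute

gain_vs_old_baseline = (C / C0) ** alpha  # compute-equivalent gain vs. the old recipe
gain_vs_new_baseline = np.ones_like(C)    # vs. the frontier recipe itself: no gain

for y, g_old, g_new in zip(years, gain_vs_old_baseline, gain_vs_new_baseline):
    print(f"{y}: {g_old:6.1f}x vs old baseline | {g_new:.0f}x vs current recipe")

# Against the old baseline the gain grows by ~4**alpha ≈ 1.5x per year (exponential
# progress); against the current recipe it is flat (zero progress), even though the
# sequence of models is identical in both comparisons.
```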

MIT FutureTech is an interdisciplinary lab at the intersection of computer science and economics, focused specifically on trends in AI and computing, and funded in part by Coefficient Giving.
