## Effective Altruism Forum

lennart

Lennart Heim | Zürich, Switzerland | Personal Website | LinkedIn | Twitter | len+EA@heim.xyz

# Wiki Contributions

Compute Governance and Conclusions - Transformative AI and Compute [3/4]

I still hold the view that (a) we will probably see a switch in funding distribution and (b) if this does not happen, those groups won't be able to compete with SOTA models.

we will and should see a switch in funding distribution at publicly funded AI research groups

I would change my mind if we find more evidence that algorithmic innovation is a stronger, or even the most significant, driver.

Recent developments toward more funding for infrastructure include the National AI Research Cloud, which is currently being investigated by the US government, and Compute Canada.

What is Compute? - Transformative AI and Compute [1/4]

The described doubling time of 6.2 months is the result when the outliers are excluded. If one includes all our models, the doubling time is ≈7 months. However, only one or two of the models were efficient ML models.
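To illustrate how excluding outliers changes a doubling time, here is a minimal sketch. It assumes the doubling time comes from a least-squares fit of log2(compute) against time, which may differ from the exact fitting procedure used in the post. The data points, the `is_efficient` flag, and the function name `doubling_time_months` are hypothetical and are not taken from our dataset.

```python
import math

def doubling_time_months(points):
    """Fit log2(compute) against time (in months) by least squares.

    Returns the number of months per doubling, i.e. 1 / slope.
    `points` is a list of (months_since_start, training_compute) pairs.
    """
    xs = [t for t, _ in points]
    ys = [math.log2(c) for _, c in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xx / s_xy  # inverse of the fitted slope

# Synthetic systems: (months since start, training compute, is_efficient).
# These values are made up for illustration only.
systems = [
    (0, 1e18, False),
    (12, 4e18, False),
    (24, 1.6e19, False),
    (36, 6.4e19, False),
    (30, 1e18, True),  # hypothetical efficient-ML outlier
]

all_points = [(t, c) for t, c, _ in systems]
without_outliers = [(t, c) for t, c, eff in systems if not eff]

dt_all = doubling_time_months(all_points)
dt_excl = doubling_time_months(without_outliers)
print(f"all systems: {dt_all:.1f} months, outliers excluded: {dt_excl:.1f} months")
```

In this toy data, a single low-compute system late in the time range flattens the fitted slope and so lengthens the doubling time, which is the same direction as the 6.2 vs. ≈7 months difference described above.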

Forecasting Compute - Transformative AI and Compute [2/4]

For "Semiconductor industry amortize their R&D cost due to slower improvements", the decreased price comes from the longer innovation cycles: the R&D investments are spread out over a longer time period. Competition should then drive the price down.

In contrast, "Sale price amortization when improvements are slower" describes the idea that the sale price can be amortized within the buying company over a longer time period, given that the hardware becomes obsolete later.

Those ideas stem from Cotra's appendices: "Room for improvements to silicon chips in the medium term".
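The two effects can be made concrete with a rough sketch. It assumes simple straight-line amortization, and all numbers and function names (`rd_cost_per_chip`, `annual_hardware_cost`) are hypothetical; they are not taken from Cotra's appendices.

```python
def rd_cost_per_chip(rd_cost, chips_per_year, cycle_years):
    """R&D cost carried by each chip when one chip generation is sold
    for `cycle_years` before the next one replaces it."""
    return rd_cost / (chips_per_year * cycle_years)

def annual_hardware_cost(purchase_price, useful_life_years):
    """Buyer-side cost per year when the purchase price is spread evenly
    over the years until the hardware is obsolete."""
    return purchase_price / useful_life_years

# Effect 1: a longer innovation cycle spreads the same R&D over more chips.
short_cycle = rd_cost_per_chip(1_000_000_000, 10_000_000, 2)
long_cycle = rd_cost_per_chip(1_000_000_000, 10_000_000, 4)

# Effect 2: later obsolescence spreads the same sale price over more years.
short_life = annual_hardware_cost(10_000, 2)
long_life = annual_hardware_cost(10_000, 4)

print(short_cycle, long_cycle)  # R&D cost per chip
print(short_life, long_life)    # hardware cost per year of use
```

In both cases, doubling the time period halves the amortized cost. Whether that saving reaches the buyer in the first case depends on competition, as noted above.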

Forecasting Compute - Transformative AI and Compute [2/4]

Thanks, Sammy. Indeed this is related and very interesting!

Transformative AI and Compute [Summary]

Thanks, Michael.

1. n counts the number of ML systems in the analysis at the time of writing (we have added more systems in the meantime). Examples of such systems are GPT-3 and AlphaFold; each is basically a row in our dataset.
2. Right, good point. I'll add the number of systems for the given time period.
3. That's hard to answer. I don't think OpenAI misinterpreted anything. For the moment, I think it's probably a mixture of:
• the inclusion criteria for the systems on which we base this trend
• actual slower doubling times, for reasons which we should figure out

Nonetheless, as outlined in Part 1 - Section 2.3, I have not interpreted those trends yet, but I'm interested in a discussion and will try to write up my thoughts on this in the future.

What is Compute? - Transformative AI and Compute [1/4]

I have been wondering the same. However, given that OpenAI's "AI and Compute" inclusion criteria are also a bit vague, I'm having a hard time telling which of our data points would fulfill their criteria.

In general, I would describe our dataset as matching the same criteria because:

1. "relatively well known" equals our "lots of citations".
2. "used a lot of compute for their time" equals our dataset if we exclude outliers from efficient ML models.
• There's a recent trend in efficient ML models that achieve similar performance while using less compute for inference and training (those models are then used, e.g., for deployment on embedded systems or smartphones).
3. "gave enough information to estimate the compute": We also rely on our own estimates or the community's, based on the information available in the paper. For the source of each estimate, see the note on the cell in our dataset.
• We're working on gathering more compute data by directly asking researchers (next target n=100).

I'd be interested in discussing more precise inclusion criteria. As I say in the post:

Also, it is unclear on which models we should base this trend. The piece AI and Compute also quickly discusses this in the appendix. Given the recent trend of efficient ML models due to emerging fields such as Machine Learning on the Edge, I think it might be worthwhile discussing how to integrate and interpret such models in analyses like this — ignoring them cannot be the answer.