Toby_Ord

4143 karma · Joined

Sequences (1): The Scaling Series

Comments (178)
Thanks Paolo,

I was only able to get weak evidence of a noisy trend from the limited METR data, so it is hard to draw many conclusions from that. Moreover, METR's aim of measuring the exponentially growing length of useful work tasks is potentially more exposed to an exponential rise in compute costs than more safety-related tasks would be. But overall, I'd think that the amount of useful compute you can apply to safety evaluations is probably growing faster year-on-year than one can sustainably grow the number of staff.

I'm not sure how the dynamics will shake out for safety evals over the next few years. e.g. a lot of recent capability gain has come from RL, which I don't think is sustainable, and I also think that growing capability via inference compute will strain both the labs' ability to serve the models and people's ability to afford them, so I suspect we'll see some return to eking out what they can from more pretraining. i.e. the reasoning era saw the labs shift to finding capabilities in new areas with comparatively low costs of scaling, but once they reach the optimal mix, we'll see a mixture of all three going forward. So the future might look a bit less like 2025 and more like a mix of that and 2022-24.

Unfortunately I don't have much insight on the question of ideal mixes of safety evals!

Yes, that is a big limitation. Even more limiting is that it is only based on a subset of METR's data on this. That's enough to raise the question and illustrate what an answer might look like in data like this, but not to really answer it.

I'm not aware of others exploring this question, but I haven't done much looking.

Thanks for the clarification, and apologies for missing that in your original comment.

Hi Simon, I want to push back on your claims about markets a bit. 

Markets are great, especially when there are minimal market failures. I love them. They are responsible for a lot of good things. But the first and second fundamental theorems of welfare economics don't conclude that markets maximise social welfare. That is a widely held misconception.

The first concludes that they reach a point on the Pareto frontier, but such a point could be really quite bad. e.g. a great outcome for one person but misery for 8 billion can be Pareto efficient. I'm not sure that extreme outcome is compatible with your specification (as without market failures, perhaps one can say everyone will also be at least as well off as now), but a world where billions are still living in poverty as they are today is definitely compatible with the 1st theorem plus no market failures, yet is definitely not maximising social welfare. 
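To make that concrete, here's a toy two-person version (purely illustrative numbers, nothing from the theorems themselves):

$$
\begin{aligned}
&\text{Endowment of 10 units, both agents with } u_i(x_i) = \sqrt{x_i}\\
&A = (10, 0):\quad u_1 + u_2 = \sqrt{10} \approx 3.2\\
&B = (5, 5):\quad u_1 + u_2 = 2\sqrt{5} \approx 4.5
\end{aligned}
$$

Allocation A is Pareto efficient (anything that helps agent 2 must take from agent 1), yet B has much higher total welfare. The 1st theorem only guarantees we land somewhere like A or B, not at the welfare-maximising point.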

And the 2nd theorem says that a massive transfer plus markets can reach any Pareto optimal outcome (including a genuinely socially optimal one). However, it is often the transfer that is doing most of the work there, so the conclusion would not be that markets maximise social welfare, but that in combination with a potentially radical redistribution they do.

Whether it is true or not depends on the community, and the point I'm making is primarily for EAs (and EA-adjacent people too). It might also be true for the AI safety and governance communities. I don't think it is true in general though; i.e. most citizens and most politicians are not giving too little regard to long timelines. So I'm not sure the point can be made if this reference is removed.

Also, I'm particularly focusing on the set of people who are trying to act rationally and altruistically in response to these dangers, and are doing so in a somewhat coordinated manner. e.g. a key aspect is that the portfolio is currently skewed towards the near-term.

The point I'm trying to make is that we should have a probability distribution over timelines with a chance of short, medium or long — then we need to act given this uncertainty, with a portfolio of work based around the different lengths. So even if our median is correct, I think we're failing to do enough work aimed at the 50% of cases that are longer than the median.
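As a sketch of what I mean (the probabilities and the diminishing-returns curve here are made up purely for illustration, not estimates):

```python
import math

# Purely illustrative: made-up probabilities and a made-up sqrt returns curve.
timeline_probs = {"short": 0.3, "medium": 0.4, "long": 0.3}

def expected_value(allocation):
    """Expected value of a portfolio of work, where effort aimed at a timeline
    bucket only pays off if that bucket obtains, with diminishing returns
    (sqrt) to the effort aimed at each bucket."""
    return sum(p * math.sqrt(allocation[bucket])
               for bucket, p in timeline_probs.items())

all_in_short = {"short": 1.0, "medium": 0.0, "long": 0.0}   # the 'full-on bet'
spread       = {"short": 0.3, "medium": 0.4, "long": 0.3}   # matches the distribution

print(round(expected_value(all_in_short), 3))  # 0.3
print(round(expected_value(spread), 3))        # ~0.582
```

The exact numbers don't matter; the point is just that once there are diminishing returns within each bucket, concentrating all the work on short timelines throws away much of the expected value from the longer half of the distribution.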

Answer by Toby_Ord

"EAs aren't giving enough weight to longer AI timelines"

(The timelines until transformative AI are very uncertain. We should, of course, hedge against it coming early when we are least prepared, but currently that is less of a hedge and more of a full-on bet. I think we are unduly neglecting many opportunities that would pay off only on longer timelines.)

I ran a timelines exercise in 2017 with many well-known FHI staff (though not including Nick), where the point was to elicit each person's current beliefs about AGI timelines by plotting CDFs. Looking at them now, I can tell you our median dates were: 2024, 2032, 2034, 2034, 2034, 2035, 2054, and 2079. So the median of our medians was (robustly) 2034, i.e. 17 years away at the time. I was one of the people who had that date, though people didn't see each other's CDFs during the exercise.
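For anyone checking the aggregation, it's just:

```python
from statistics import median

# The median AGI dates from the 2017 exercise, as listed above.
median_dates = [2024, 2032, 2034, 2034, 2034, 2035, 2054, 2079]

# With eight values, the median is the mean of the 4th and 5th (both 2034).
print(median(median_dates))         # 2034.0
print(median(median_dates) - 2017)  # 17.0 years after the exercise
```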

I think these have held up well.

So I don't think Eliezer's "Oxford EAs" point is correct.

I've often been frustrated by this assumption over the last 20 years, but don't remember any good pieces about it.

It may partly stem from Eliezer's first alignment approach being to create a superintelligent sovereign AI, where, if that goes right, other risks really would be dealt with.

Yeah, I mean 'more valuable to prevent', before taking into account the cost and difficulty.
