Former UMich EA club co-president. Currently doing research to prevent x-risk from AI.
I haven't read all of your posts that carefully yet, so I might be misunderstanding something, but generally, it seems to me like this approach makes some "upper bound" modeling assumptions that are then used as an all-things-considered distribution.
My biggest disagreement is that your distribution over FLOPs [1] required for TAI (pulled from here) seems too large to me.
My reading is that this was generated assuming that we would train TAI primarily via human imitation, which seems really inefficient. There are so many other strategies for training powerful AI systems, and I expect people to transition to using something better than imitation. For example, see the fairly obvious techniques discussed here.
GPT-4 required ~25 log(FLOP), and eyeballing it, the mode of this distribution seems to be about ~35 log(FLOP), so the mode sits roughly +10 OOMs above GPT-4. The gap between GPT-3 and GPT-4 is ~2 OOMs, so this would imply roughly 5 GPT-sized jumps until TAI. Personally, I think that 2 GPT-sized jumps / +4 OOMs is a pretty reasonable mode for TAI (e.g. the difference between GPT-2 and GPT-4).
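For reference, here's that back-of-the-envelope arithmetic spelled out (the 25 and 35 figures are eyeballed estimates, as above):

```python
# Rough arithmetic behind the "5 GPT-sized jumps" claim.
# The 25 and 35 log10(FLOP) figures are eyeballed, not exact values.
gpt4_log_flops = 25          # ~log10 of GPT-4 training compute
tai_mode_log_flops = 35      # eyeballed mode of the linked TAI distribution
gpt_jump_ooms = 2            # ~OOM gap between GPT-3 and GPT-4

gap_ooms = tai_mode_log_flops - gpt4_log_flops   # 10 OOMs
jumps = gap_ooms / gpt_jump_ooms                 # 5 GPT-sized jumps
print(f"{gap_ooms} OOMs over GPT-4 ≈ {jumps:.0f} GPT-sized jumps")
```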
In the 'against very short timelines' section, it seems like your argument mostly routes through it being computationally difficult to simulate the entire economy with human-level AIs, because of inference costs. I agree with this, but I think that AIs won't stay human-level for very long, because of AI-driven algorithmic improvements. In 4-year-timeline worlds, I don't expect the economy to be very significantly automated before the point of no return; instead, I expect it to look more like faster and faster algorithmic advances. Instead of deploying 10 million human-level workers dispersed over the entire economy, I think this would look more like deploying 10 million more AGI researchers, and then getting compounding returns on algorithmic progress from there.
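As a toy illustration of the compounding dynamic I have in mind (every number here is made up purely for illustration, not a forecast):

```python
# Toy model: algorithmic progress compounds when AI researchers improve
# the algorithms that make AI researchers more effective.
# base_rate and feedback are illustrative assumptions only.
base_rate = 0.4        # OOMs of effective compute gained per year initially
feedback = 0.5         # extra rate per OOM of accumulated progress (made up)
progress = 0.0

for year in range(1, 5):
    rate = base_rate + feedback * progress   # rate grows with past progress
    progress += rate
    print(f"year {year}: +{rate:.2f} OOM this year, {progress:.2f} OOM cumulative")
```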
But, as I have just argued above, a rapid general acceleration of technological progress from pre-superintelligent AI seems very unlikely in the next few years.
I don't see how this argument goes through. Automating the entire economy != automating ML research. It remains quite plausible to me that we reach superintelligence before the economy is 100% automated.
I'm assuming that this is something like 2023-effective FLOPs (i.e. baking in algorithmic progress; let me know if I'm wrong about this).
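In other words, I'm reading it as something like (my paraphrase of the effective-compute convention, not necessarily the post's exact definition):

$$\text{effective FLOP}_{2023} \;=\; \text{physical FLOP} \times \frac{\text{algorithmic efficiency at training time}}{\text{algorithmic efficiency in 2023}}$$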
(quick thoughts, may be missing something obvious)
Relative to the scale of the long-term future, the number of AIs deployed in the near term is very small, so to me it seems like there's pretty limited upside to improving things for them. In the long term, it seems like we can have AIs figure out the nature of consciousness for us.
Maybe I'm missing the case that lock-in is plausible; it currently seems pretty unlikely to me, because the singularity seems likely to transform the ways AIs are run. So in my mind it's mostly what happens after the singularity that matters.
I'm also not sure about the tractability, but the scale is my major crux.
I do think understanding AI consciousness might be valuable for alignment; I'm just arguing against work on near-term AI suffering.
I appreciate Josh Clymer for living up to his reflectively endorsed values so strongly. Josh is extremely willing to do the thing that he thinks is most impactful, even when such a thing looks like going to school wearing a clown suit.
Are there any alignment research communities/groups/events nearby?
HAIST is probably the best AI safety group in the country; they have office space quite near campus and several full-time organizers.
I'm confused why almost all of the comments seem to be from people donating to many charities. For small amounts that an individual would donate, I don't imagine that diminishing marginal returns would kick in, so shouldn't one donate entirely to the charity that has the highest EV on the current margin?
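Toy version of the argument, with made-up EV-per-dollar numbers (and assuming, as above, that marginal returns stay roughly constant at individual-donation scale):

```python
# Splitting a fixed budget vs. concentrating it, under constant marginal EV.
# The EV-per-dollar figures are made up purely for illustration.
budget = 1000
ev_per_dollar = {"charity_A": 3.0, "charity_B": 2.0, "charity_C": 1.5}

split = sum(ev * budget / len(ev_per_dollar) for ev in ev_per_dollar.values())
concentrated = max(ev_per_dollar.values()) * budget
print(f"split evenly: {split:.0f} EV; all to highest-EV charity: {concentrated:.0f} EV")
```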
AI operates in the single-minded pursuit of a goal that humans provide it. This goal is specified in something called the reward function.
It turns out the problem is a lot worse than this -- even if we knew of a safe goal to give AI, we would have no idea how to build an AI that pursues that goal!
See this post for more detail. Another way of saying this, using the inner/outer alignment framework: reward is the outer optimization target, but this does not automatically induce inner optimization in the same direction.
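To make the terms concrete, here's a minimal toy sketch of what "reward function" and "outer optimization target" refer to (a made-up gridworld example, not any particular system's actual reward):

```python
# The reward function is the outer, human-specified objective that an RL
# training process optimizes against. Toy illustrative example only.

def reward(state: dict) -> float:
    """Outer objective: +1 for reaching the goal cell, small step penalty."""
    if state["agent_pos"] == state["goal_pos"]:
        return 1.0
    return -0.01

# Training selects a policy that scores well on this reward during training.
# The inner/outer alignment point: the learned policy may end up internally
# pursuing some proxy that merely correlated with this reward in training,
# rather than the reward itself.
print(reward({"agent_pos": (3, 3), "goal_pos": (3, 3)}))  # 1.0
```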
AI scaling laws refer to a specific algorithm, and so are not relevant for arguing against algorithmic progress. For example, humans are much more sample-efficient than LLMs right now, and so are an existence proof for more sample-efficient algorithms. I'm also pretty sure that humans are far from the limits of intelligence: neuron firing speeds are on the order of 1-100 Hz, while computers can run much faster than this. Moreover, the human brain has all sorts of bottlenecks that an AI need not have, like needing to fit through a mother's birth canal, as well as all the biases that plague our reasoning.
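For rough intuition on the serial-speed gap alone (a crude comparison; clock speed isn't the same thing as useful cognition, but it shows the headroom):

```python
# Crude speed comparison: neuron firing rate vs. a modern processor clock.
# Ignores parallelism, memory bandwidth, and what each "step" accomplishes.
neuron_hz = 100      # upper end of typical neuron firing rates
chip_hz = 3e9        # ~3 GHz, a typical modern clock speed
print(f"~{chip_hz / neuron_hz:.0e}x faster serial steps")   # ~3e+07x
```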
Epoch estimates algorithmic improvements at 0.4 OOM/year currently, and I think it's hard to be confident either way about which direction this will go in the future. AI-assisted AI research could dramatically increase it, but on the other hand, as you say, scaling could hit a wall.
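For scale, naively extrapolating a constant 0.4 OOM/year (illustration only; the point above is that the future rate is uncertain in both directions):

```python
# 0.4 OOM/year of algorithmic efficiency gains, naively held constant.
rate_oom_per_year = 0.4
for years in (1, 3, 5):
    multiplier = 10 ** (rate_oom_per_year * years)
    print(f"{years} years: ~{multiplier:.1f}x effective-compute multiplier")
```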
I agree that the exponential won't hold forever; I expect the overall growth to look more like a sigmoid, as described here (though my best-guess parameters for this model are different from the default ones). Where I disagree is that I expect the sigmoid to top out far above human level.
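Schematically, the shape I have in mind is just a generic logistic curve; the ceiling, rate, and midpoint below are made-up placeholders, not the linked model's actual parameters:

```python
import math

# Generic logistic (sigmoid) growth in "capability" over time.
# The disagreement above is about how high the ceiling sits, not the shape.
def capability(t: float, ceiling: float = 100.0, rate: float = 1.0,
               midpoint: float = 5.0) -> float:
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))

for t in range(0, 11, 2):
    print(f"t={t:2d}: capability ≈ {capability(t):6.1f}")
```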