Michael Einhorn

7 karma · Joined Jul 2022


Any slowdown in alignment research is effectively the same risk as speeding up timelines by the same factor. I agree that this threat model exists, but this is a very steep tradeoff.

My primary threat model is that we simply did not do enough research, and did not know enough to make a serious attempt at aligning an AI.

The internet may not have been as economically impactful as industrialization, but that is an extremely high bar of comparison. Looking at rGDP growth rates in this way can hide the fact that a flat rate, or even a slightly reduced rate, is still a continuation of an exponential trend.

Current rGDP is 2.5 times larger than it was in 1985. I chose 1985 as the starting point because the wiki article says "Much of the productivity from 1985 to 2000 came in the computer and related industries."
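As a rough sanity check on the "continuation of an exponential trend" point, the 2.5x figure can be converted into an implied constant annual growth rate. This is only a sketch: the end year of 2022 is my assumption (the comment just says "current"), and `implied_annual_growth` is a hypothetical helper name, not something from the source.

```python
def implied_annual_growth(total_factor: float, years: int) -> float:
    """Constant annual growth rate that compounds to total_factor over `years` years."""
    return total_factor ** (1 / years) - 1


START_YEAR = 1985   # from the comment
END_YEAR = 2022     # ASSUMPTION: "current" is taken to mean roughly 2022
TOTAL_FACTOR = 2.5  # rGDP multiple quoted in the comment

rate = implied_annual_growth(TOTAL_FACTOR, END_YEAR - START_YEAR)
print(f"Implied average annual rGDP growth: {rate:.2%}")
```

Under that assumption the rate comes out at roughly 2.5% per year, which looks modest in any single year but still compounds to the full 2.5x.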

If computers didn't exist and we didn't invent anything else to take their place, the economy would have halted, unable to sustain exponential growth. That means our productive capacity would be 2.5x smaller than what it is today. This would have been cataclysmic, even though the 2.5x factor isn't as massive as industrialization.

For reference, the Great Depression shrank the economy by a factor of 1.36.

Imagine if we slowed down alignment research by that same 2.5x factor. (I would also argue that the internet benefits alignment research more than it benefits the average sector of the economy.)

10-year timelines effectively become 4-year timelines, and 5-year timelines become 2-year timelines.
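The arithmetic behind those numbers is just the nominal timeline divided by the slowdown factor. A minimal sketch, assuming a slowdown simply scales the amount of research done per calendar year (`effective_timeline` is a hypothetical name used for illustration):

```python
def effective_timeline(nominal_years: float, slowdown_factor: float) -> float:
    """Years of research actually achieved if progress runs slowdown_factor times slower."""
    return nominal_years / slowdown_factor


SLOWDOWN = 2.5  # the same factor as the rGDP comparison above

for nominal in (10, 5):
    effective = effective_timeline(nominal, SLOWDOWN)
    print(f"{nominal}-year timeline -> about {effective:g} years of research")
```

This treats the slowdown as uniform over the whole period, which is a simplification; a slowdown that compounds through slower growth of the field would be worse.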

Would it be a good idea to publish a capability advancement that would have the equivalent impact on timelines, even if it would make a specific threat model easier to deal with? Maybe it would be worth it if solving that particular threat was sufficient to solve alignment, rather than it just being one of the many ways a misaligned agent could manipulate or hack through us.

Although this number is a bit slapped together, it doesn't seem out of the ballpark to me when considering both a slowdown in the sharing of ideas within the field and a slowdown in the growth of the field. Intuitively, I think it is an underestimate.

I think this tradeoff is unacceptable. Instead, I think we need to be pushing hard in the other direction to accelerate alignment research. This means reaching out to as many people as possible to grow the community, using every tool we can to accelerate research progress and broadcast ideas faster.