Who Aligns the Alignment Researchers?

ben.smith

Comments 4

more better

[This comment is no longer endorsed by its author]

ben.smith

Can you describe exactly how much you think the average person, or average AI researcher, is willing to sacrifice on a personal level for a small chance at saving humanity? Are they willing to halve their income for the next ten years? Reduce it by 90%?

I think in a world where there was a top-down societal effort to reduce alignment risk, you might see different behavior. In the current world, I think the "personal choice" framework really is how it works, because (for better or worse) there are not (yet) strong moral or social values attached to capabilities vs. safety work.

NickLaing

What makes you think that the average person rates saving humanity highly enough to make it worth doing alignment research rather than capabilities? That seems like a pretty conservative statement from my experience. Most people I know would definitely take a small-to-moderate amount of extra money rather than do more valuable work for humanity. Building something could also feel more rewarding than safety work.

Maybe I'm missing something. What assumptions do you think that statement makes?

more better

Maybe my comment is off, since your article is specifically about AI alignment vs. capabilities research and I was taking the single sentence I quoted out of context. Will remove.

History

Incentives to research Alignment

So you don’t die

So humanity doesn’t die

For social reputation

Incentives to research Capabilities

Commercial opportunities

Social Reputation

Social impact

Funding opportunities

Organizational level

Global level

Across all levels

Do we need investment in alignment research?

Conclusion