Imagine writing a policy for an AI company.
The CEO trusts you and will approve your policy if you show it's based on these 8 premises:
- AI can be scaled up in capability. It can offer benefits and power to those who wield it, but it could also become unsafe for all of humanity.
- There are 'bad guys' scaling AI: people who keep scaling for temporary benefits or strategic power advantages, even if it leads to everyone dying.
- There are also 'good guys': people who are developing AI in a way that stays safe for humanity.
- If we pause scaling because we think AI is becoming less safe, the 'bad guys' will keep scaling regardless.
- If we fall behind, we lose the power to try to make the most advanced AI safer.
- If we keep scaling ahead of the 'bad guys', we can try to make and distribute a safer version of the most advanced AI than the one those 'bad guys' would make.
- But the 'bad guys' will learn from us and catch up, developing their own version of the most advanced AI faster than they otherwise would have.
- If anyone ever builds an existentially unsafe version, everyone dies.
The CEO wants their company to be the 'good guys'. They want a policy they can not only use themselves, but also advocate and lobby for hard, so that other companies start acting as the 'good guys' too.
What policy do you write?
