Want to win the AGI race? Solve alignment.

leopold

6 min read · Mar 29, 2023

Comments 5

deep

Like Akash, I agree with a lot of the object-level points here and disagree with some of the framing / vibes. I'm not sure I can articulate the framing concerns I have, but I do want to say I appreciate you articulating the following points:

- Society is waking up to AI risks, and will likely push for a bunch of restrictions on AI progress
  - Sydney and the ARC Captcha example have made AI safety stuff more salient.
  - There's opportunity for substantially more worry about AI risk to emerge after even mild warning events (e.g. AI-powered cyber events, crazier behavior emerging during evals)
- Society's response will be dumb and inefficient in a lot of ways, but could also end up getting pointed in some good directions
- The more an org's AI development / deployment abilities are constrained by safety considerations (whether their own concerns or other stakeholders'), the more safety looks like just another thing you need in order to deploy your powerful AI systems, so that safety work becomes a complement to capabilities work.

Agrippa

Given your position I am concerned about the arms race accelerationism messaging in this post. Substantively, the major claims of this post are "China AI progress poses a serious threat we must overcome via AI progress (that is, we are in an arms race)" and "society may regulate AI such that projects that don't meet a very high standard of safety will not be deployable". The argument is that pursuing safety follows from these premises, mostly the latter.

This can be interpreted in a number of ways, charitably or uncharitably. Independent of that, I do not think it is really a good idea to talk this way about AI, re: geopolitics. It has a very bad track record with other stuff such as nukes, and I'm not sure who the intended audience is (are capabilities CEOs China hawks who can only be convinced to slow down if framed in terms of beating China? big if true)

deep

There are other safety problems-- often ones that are more speculative-- that the market is not incentivizing companies to solve.

My personal response would be as follows:

1. As Leopold presents it, the key pressure here that keeps labs in check is societal constraints on deployment, not perceived ability to make money. The hope is that society's response has the following properties:
   1. thoughtful, prominent experts are attuned to these risks and demand rigorous responses
   2. policymakers are attuned to (thoughtful) expert opinion
   3. policy levers exist that provide policymakers with oversight / leverage over labs
2. If labs are sufficiently thoughtful, they'll notice that deploying models is in fact bad for them! Can't make profit if you're dead. *taps forehead knowingly*
   1. but in practice I agree that lots of people are motivated by the tastiness of progress, pro-progress vibes, etc., and will not notice the skulls.

Counterpoints to 1:

Good regulation of deployment is hard (though not impossible in my view).

- reasonable policy responses are difficult to steer towards
- attempts at raising awareness of AI risk could lead to policymakers getting too excited about the promise of AI while ignoring the risks
- experts will differ; policymakers might not listen to the right experts

Good regulation of development is much harder, and will eventually be necessary.

This is the really tricky one IMO. I think it requires pretty far-reaching regulations that would be difficult to get passed today, and would probably misfire a lot. But doesn't seem impossible, and I know people are working on laying groundwork for this in various ways (e.g. pushing for labs to incorporate evals in their development process).

Jeffrey Kursonis

Leopold, I love your thinking here, especially that society will arise to save itself. I sent you an email at For Our Posterity.

Ofer

Solving (scalable) alignment might be worth lots of $$$ and key to beating China.

I really don't want Xi Jinping Thought to rule the world

If you want to win the AGI race, if you want to beat China, [...]

Let’s not lose to China [...]

The "China is an opponent that we must beat in the AI race" framing is a classic talking point of AI companies in the US, used as an argument against regulation. Are you by any chance affiliated with an AI company, or an organization that is funded by one?

Though, for now, it seems that China is a few years behind, and the US AI chip export controls might considerably hamper them (see the great CSIS explainer on the export controls and the CSET report on why China might have a hard time catching up). So especially if timelines are short, we have a healthy lead for now.

Which risk is bigger, AI misalignment or "bad guys getting AGI first"? Cf. Holden Karnofsky on the "caution vs. competition" frame.

Or at least, it’s widely believed it has such a 10% chance.

Roon gets it right.

If this ends up being a big barrier to deploying your model in 50% of worlds, that 50% is enough to make alignment incredibly commercially valuable for you.

An interesting potential implication not discussed in the main post: if alignment techniques become incredibly commercially valuable/key competitive advantages, will these become trade secrets not shared publicly or with other labs?

Things are going to get crazy, and people will pay attention

The binding constraint on making AGI could be aligning it. You want an unambiguous solution, for which there is consensus that it’s safe.