Slowing down AI progress is an underexplored alignment strategy

Michael Huang

This is a linkpost for https://www.lesswrong.com/posts/b7JXJWY7R2jNtHerP/slowing-down-ai-progress-is-an-underexplored-alignment

LessWrong user "Norman Borlaug" has written an interesting post on reducing existential risk through deliberate overregulation of AI development:

In the latest Metaculus forecasts, we have 13 years left until some lab somewhere creates AGI, and perhaps far less than that until the blueprints to create it are published and nothing short of a full-scale nuclear war will stop someone somewhere from doing so. The community strategy (insofar as there even is one) is to bet everything on getting a couple of technical alignment folks onto the team at top research labs in the hopes that they will miraculously solve alignment before the mad scientists in the office next door turn on the doomsday machine.
While I admit there is at least a chance this might work, and it IS worth doing technical alignment research, the indications we have so far from the most respected people in the field are that this is an extremely hard problem and there is at least a non-zero chance it is fundamentally unsolvable.
There are a dozen other strategies we could potentially deploy to achieve alignment, but they all depend on someone not turning on the doomsday machine. But thus far we have almost completely ignored the class of strategies that might buy more time. The cutting edge of thought on this front seems to come from one grumpy former EA founder on Twitter who isn't even trying that hard.

From Kerry Vaughan's Twitter thread:

I've recently learned that this is a *spicy* take on AI Safety: AGI labs (eg OpenAI, DeepMind, and others) are THE CAUSE of the fundamental problem the AI Safety field faces.
I thought this was obvious until very recently. Since it's not, I should explain my position.
(I'll note that while I single out OpenAI and DeepMind here, that's only because they appear to be advancing the cutting edge the most. This critique applies to any company or academic researcher that spends their time working to solve the bottlenecks to building AGI.)
To vastly oversimplify the situation, you can think of AI Safety as a race. In one corner you have the AGI builders who are trying to create AGI as fast as possible. In the other corner, you have people trying to make sure AGI will be aligned with human goals once we build it. If AGI gets built before we know how to align it, it *might* be CATASTROPHIC.
Fortunately, aligning an AGI is unlikely to be impossible. So, given enough time and effort into the problem, we will eventually solve it.
This means the actual enemy is time. If we have enough time to both find capable people and have them work productively on the problem, we will eventually win. If not, we lose. I think the fundamental dynamic is really just that simple.
AGI labs like OAI and DeepMind have it as their MISSION to decrease the time we have. Their FOUNDING OBJECTIVE is to build AGI and they are very clearly and obviously trying *as hard as they can* to do just that. They raise money, hire talent, etc. all premised on this goal.
Every day an AGI engineer at OpenAI or DeepMind shows up to work and tries to solve the current bottlenecks in creating AGI, we lose just a little bit of time. Every day they show up to work, the odds of victory get a little bit lower. My very bold take is that THIS IS BAD
Now you might be thinking: "Demis Hassabis and Sam Altman are not psychopaths or morons. If they get close to AGI without solving alignment they can just not deploy the AGI." There are a number of problems with this, but the most obvious is: they're still robbing us of time.
Every. Single. Day. the AGI labs are steadily advancing the state of the art on building AGI. With every new study they publish, researcher they train, and technology they commercialize, they also make it easier for every other AGI lab to build and deploy an AGI.
So unless they can somehow refrain from deploying an unaligned AGI and stop EVERYONE ELSE from doing the same, they continue to be in the business of robbing humanity of valuable time.
They are the cause of the fundamental problem faced by the AI Safety community.
In conclusion: Stop building AGI you fucks.
Notably, a number of people in the AI Safety community basically agree with all of this but think I shouldn't be saying it. (Or, at least, that EA Bigwigs shouldn't say it.)
I obviously disagree. But it's a more complex question which I'll reserve for a future thread.

Elon Musk has been vocal on the need to regulate AI development, even if it includes regulating (and slowing down) his own companies:

All orgs developing advanced AI should be regulated, including Tesla


Comments (11)


Mau, Jul 13 2022 (karma: 24)

(Crossposting, with tweaks)

Thanks for the post - I think (unoriginally) there are some ways heavy regulation of AI could be very counterproductive or ineffective for safety:

If AI progress slows down enough in countries where safety-concerned people are especially influential, then these countries (and their companies) will fall behind internationally in AI development. This would eliminate much/most of safety-concerned people's opportunities for impacting AI's trajectory.
If China "catches up" to the US in AI (due to US over-regulation) when AI is looking increasingly economically and militarily important, that could motivate US policymakers to hit the gas on AI (which would at least undo some of the earlier slowing down of AI, and might spark an international race to the bottom on AI).

Also, you mention,

The community strategy (insofar as there even is one) is to bet everything on getting a couple of technical alignment folks onto the team at top research labs in the hopes that they will miraculously solve alignment before the mad scientists in the office next door turn on the doomsday machine.

From conversation, my understanding is some governance/policy folks fortunately have (somewhat) more promising ideas than that. (This doesn't show up much on this site, partly because these professionals tend to be busy and the ideas are fairly rough.) I hear there's some work aimed at posting about some of these ideas - until then, chatting with people (e.g., by reaching out to people) might be the best way to learn about these ideas.

Sam Clarke, Jul 13 2022 (karma: 8)

Another (unoriginal) way that heavy AI regulation could be counterproductive for safety: AGI alignment research probably increases in productivity as you get close to AGI. So, regulation in jurisdictions with the actors who are closest to AGI (currently, US/UK) would give those actors less time to do high-productivity AGI alignment research before the second-place actor catches up.

And within a jurisdiction, you might think that responsible actors are the most likely to comply with regulation, which would differentially slow them down.

Ofer, Jul 17 2022 (karma: 6)

If AI progress slows down enough in countries where safety-concerned people are especially influential, then these countries (and their companies) will fall behind internationally in AI development. This would eliminate much/most of safety-concerned people's opportunities for impacting AI's trajectory.

There's a country-agnostic version of that argument about self-regulation: "If AGI companies in which safety-concerned people are especially influential allow safety concerns to slow down their progress towards AGI, then these companies will fall behind. This would eliminate much/most of safety-concerned people's opportunities for impacting AI's trajectory".

Therefore, without any regulation, it's not clear to what extent the presence of safety-concerned people in AGI companies will matter.

Mau, Jul 17 2022 (karma: 2)

I'm mostly sympathetic - I'd add a few caveats:

Research has to slow down enough for an AI developer to fall behind; an AI developer that has some lead over their competition would have some slack, potentially enabling safety-concerned people to contribute. (That doesn't necessarily mean companies should try to get a lead though.)
It seems plausible for some useful regulation to take the form of industry self-regulation (which safety-concerned people at these companies could help advance).

Ofer, Jul 18 2022 (karma: 4)

It seems plausible for some useful regulation to take the form of industry self-regulation (which safety-concerned people at these companies could help advance).

Generally, I think self-regulation is usually promoted by industry actors in order to prevent actual regulation. Based on your username and a bit of internet research, you seem to be an AI Governance Research Contractor at a major AGI company. Is this correct? If so, I suggest that you disclose that affiliation on your profile bio (considering that you engage in the topic of AI regulation on this forum).

(To be clear, your comments here seem consistent with you acting in good faith and having the best intentions.)

Mau, Jul 18 2022 (karma: 2)

I'm still figuring out how I want to engage on this forum; for now, I generally, tentatively prefer to not disclose personal information on here. I'd encourage readers to conservatively assume I have conflicts of interest, and to assess my comments and posts based on their merits. (My vague sense is that this is a common approach to this forum--common enough that non-disclosure doesn't imply an absence of conflicts of interest--but maybe I've misread? I'm not confident about the approach I'm taking - feel free to message me on this forum if you'd like to discuss this further.)

On your other point, I agree that suspicion toward self-regulation is often warranted; I think my earlier point was sufficiently hedged ("plausible"; "some") to be compatible with such suspicion.

tamgent, Jan 26 2023 (karma: 1)

This was discussed here too.

Otto, Jul 20 2022 (karma: 13)

I agree that this strategy is underexplored. I would prioritize the following work in this direction as follows:

1. What kind of regulation would be sufficiently robust to slow down, or even pause, all AGI capabilities actors? This should include research/software regulation, hardware regulation, and data regulation. I think a main reason why many people think this strategy is unlikely to work is that they don't believe any practical regulation would be sufficiently robust. But to my knowledge, that key assumption has never been properly investigated. It's time we do so.
2. How could we practically implement sufficiently robust regulation? What would be required to do so?
3. How can we inform sufficiently large portions of society about AI xrisk to get robust regulation implemented? We are planning to do more research on this topic at the Existential Risk Observatory this year (we already have some first findings).

titotal, Jul 13 2022 (karma: 10)

You can make a pretty good case for regulating AI deployment even if you're an AI x-risk skeptic like myself. The simple point being that companies and sometimes even governments are deploying algorithms that they don't fully understand, and that the more power is being given over to these algorithms, the greater potential damage from the code going wrong. I would guess that AI "misalignment" already has a death toll, my go-to example being mass shooters whose radicalisation was aided by social media algorithms. Add to that the issues with algorithmic bias, the use of predictive policing and so on, and the case for some sort of regulation is pretty clear.

Henry Howard🔸, Jul 13 2022 (karma: -2)

In the scenario where AGI won't be malevolent, slowing progress is bad.

In the scenario where AGI will be malevolent but we can fix that with more research, slowing progress is very good.

In the scenario where AGI will be malevolent and research can't do anything about that, it's irrelevant.

What's the hedge position?

Will Payne, Jul 13 2022 (karma: 9)

In the scenario where AGI would 100% be malevolent, it seems like slowing progress is very good and all AIS people should pivot to slowing or stopping AI progress. Unless we’re getting into “is xrisk bad given the current state of the world” arguments, which become a lot stronger if there’s no safe AI utopia at the end of the tunnel. Either way, it seems like it’s not irrelevant.