CarlShulman

Glad to see this series up! Tons of great points here.

One thing I would add is that I think the analysis about fragility of value and intervention impact has a structural problem. Supposing that the value of the future is hyper-fragile as a combination of numerous multiplicative factors, you wind up thinking the output is extremely low value compared to the maximum, so there's more to gain. OK.

But a hypothesis of hyper-fragility along these lines also indicates that after whatever interventions you make you will still get numerous multiplicative factors wrong, so it will again be an extreme failure.

On this analysis it's the worlds where things are non-fragile (e.g. because of epistemic enhancement and improved bargaining and wealth driving systematically getting things right) that are far more valuable.

Maybe on the hyper-fragile aggregative story it's easier to 10x the value of the future, but after doing so it will still be a bunch of orders of magnitude off from the optimum. On the feasible convergent optimum story a win gets you the optimum, far better than going from 10^-10 to 10^-9 of the optimum.
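To make the comparison concrete, here is a toy calculation. All the numbers (ten factors, each at 10% of its best level, a one-point reduction in disruption risk) are illustrative assumptions, not estimates from the post or from me, and the function name is made up for the sketch.

```python
from math import prod

def fragile_value(factors):
    """Value as a fraction of the optimum when it is a product of factors."""
    return prod(factors)

# Hypothetical hyper-fragile world: 10 factors, each at 10% of its best level.
before = [0.1] * 10
# Assumed intervention: fully fixes one factor, leaving the other nine wrong.
after = [1.0] + [0.1] * 9

v_before = fragile_value(before)   # roughly 1e-10 of the optimum
v_after = fragile_value(after)     # roughly 1e-9 of the optimum
fragile_gain = v_after - v_before

# Hypothetical convergent world: the optimum (1.0) is reached unless a
# catastrophic disruption occurs. Both probabilities are assumptions.
p_disruption_before = 0.10
p_disruption_after = 0.09
convergent_gain = (1 - p_disruption_after) - (1 - p_disruption_before)

print(f"fragile world: {v_before:.1e} -> {v_after:.1e} (a 10x improvement)")
print(f"fragile gain as fraction of optimum:    {fragile_gain:.1e}")
print(f"convergent gain as fraction of optimum: {convergent_gain:.1e}")
```

Under these assumptions the 10x improvement in the fragile world is worth under a billionth of the optimum, while shaving one percentage point off disruption risk in the convergent world is worth about a hundredth of it.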

So there's a lot of oomph to be had averting a catastrophic disruption of an otherwise convergent win (e.g. preventing a nasty whimsical permanent dictatorship that does crazy things or has a very bad starting point, AI or human), but not so much messing around with the hyper-fragile cases.

What if we just…didn’t build AGI? An Argument Against Inevitability

CarlShulman 1y* 6

I think there's some talking past each other happening.

I am claiming that there are real coordination problems that lead even actors who believe in a large amount of AI risk to think that they need to undertake risky AI development (or riskier) for private gain or dislike of what others would do. I think that dynamic will likely result in future governments (and companies absent government response) taking on more risk than they otherwise would, even if they think it's quite a lot of risk.

I don't think that most AI companies or governments would want to create an indefinite global ban on AI absent coordination problems, because they think benefits exceed costs. That includes those who put 10%+ on catastrophic outcomes, like Elon Musk or Dario Amodei (e.g. I put 10%+ on disastrous outcomes from AI development but wouldn't want a permanent ban; even Eliezer Yudkowsky doesn't want a permanent ban on AI).

I do think most of the AI company leadership that actually believes they may succeed in creating AGI or ASI would want to be able to take a year or two for safety testing and engineering if they were approaching powerful AGI and ASI absent issues of commercial and geopolitical competition (and I would want that too). And I think future US and Chinese governments, faced with powerful AGI/ASI and evidence of AI misbehavior and misalignment, would want to do so save for geopolitical rivalry.

Faced with that competition, each actor imposing a given level of risk on the world doesn't internalize all of it, only the increment of risk over what their competitor would impose. And companies and states both get a big difference in value from being incrementally ahead vs behind competitors. For companies it can be huge profits vs bankruptcy. For states (and several AI CEOs who actually believe in AGI and worry about how others would develop and use it; I agree most of the hyperscaler CEOs and the like are looking at things from a pure business perspective and don't even believe in AGI/ASI) there is the issue of power (among other non-financial motives) as a reason to care about being first.

What if we just…didn’t build AGI? An Argument Against Inevitability

CarlShulman 1y 21

"Second, the primary benefits—higher incomes and earlier biomedical breakthroughs—are also broadly shared; they are not gated to the single lab that crosses the finish line first."

If you look at the leaders of major AI companies you see people like Elon Musk and others who are concerned with getting to AGI before others who they distrust and fear. They fear immense power in the hands of rivals with conflicting ideologies or in general.

OpenAI was founded and funded in significant part based on Elon Musk's fear of the consequences of the Google leadership having power over AGI (in particular in light of statements suggesting that producing AI that led to human extinction would be OK). States fear how immense AGI power will be used against them. For the competitive dynamics, power (including the power to coerce or harm others) and relative standing are more important than access to advanced medicine or broad prosperity.

In the shorter term, an AI company whose models are months behind may find that their APIs have negative margins while competitors earn 50% margins. Avoiding falling behind is increasingly a matter of institutional survival for AI companies, and a powerful motive to increment global risk a small amount to profit rather than going bankrupt.

The motive I see to take incremental risk is "if AI wipes out humanity I'm just as dead either way, and my competitors are similarly or more dangerous than me (self-serving bias plays into this) but there are huge ideological or relative position (including corporate survival) gains from control over powerful AGI that are only realized by being fast, so I should take a bit more risk of disaster conditional on winning to raise the chance of winning." This dynamic looks to apply to real AI company leaders who claim big risks of extinction while rushing forward.

With multiple players doing that, the baseline level of risk from another lab goes up, and the strategic appeal of incrementing it one more time for relative advantage continues. You can get up to very high levels of risk perceived by labs that way, accepting each small increment of risk as minor compared to the risk posed by other labs and the appeal of getting a lead, setting a new worse baseline for others to compete against.

And the epistemic variation makes it all worse, where the most unconcerned players set a higher baseline risk spontaneously.
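A toy ratchet model of this dynamic, purely as a sketch: the decision rule (a lab at or below the riskiest rival's level always accepts one more fixed increment), the increment size, and the starting risk levels are all hypothetical choices for illustration, not a claim about how any actual lab decides.

```python
def ratchet(initial_risks, increment, rounds):
    """Simulate labs taking turns raising the risk they accept.

    Assumed rule: a lab treats the riskiest rival as the baseline, and if it
    is at or below that baseline it accepts one more small increment on top.
    """
    risks = list(initial_risks)
    history = [tuple(risks)]
    for _ in range(rounds):
        for i in range(len(risks)):
            baseline = max(r for j, r in enumerate(risks) if j != i)
            if risks[i] <= baseline:
                # The increment looks minor next to the rivals' risk level,
                # but it sets a new, worse baseline for everyone else.
                risks[i] = min(1.0, baseline + increment)
        history.append(tuple(risks))
    return history

# Three labs with the same starting risk tolerance.
similar = ratchet([0.01, 0.01, 0.01], increment=0.01, rounds=5)
# Same setup, but one unconcerned lab starts at a higher risk level.
with_unconcerned = ratchet([0.01, 0.01, 0.05], increment=0.01, rounds=5)

for label, hist in [("similar labs", similar),
                    ("one unconcerned lab", with_unconcerned)]:
    print(label, "max risk per round:", [round(max(h), 2) for h in hist])
```

No single step in the simulation is larger than the assumed increment, yet the maximum accepted risk climbs every round, and it ends up higher when one player starts out less concerned.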

What if we just…didn’t build AGI? An Argument Against Inevitability

CarlShulman 1y 5

Right, those comments were about the big pause letter, which while nominally global in fact only applied at the time to the leading US lab, and even if voluntarily complied with would not affect the PRC's efforts to catch up in semiconductor technology, nor Chinese labs catching up algorithmically (as they have partially done).

What if we just…didn’t build AGI? An Argument Against Inevitability

CarlShulman 1y 2

Sure, these are possible. My view above was about expectations. #1 and #2 are possible, although they look less likely to me. There's some truth to #3, but the net effect is still gap closing, and the slowing tends to come earlier (when it is less impactful) rather than later.

What if we just…didn’t build AGI? An Argument Against Inevitability

CarlShulman 1y 43

On my view the OP's text citing me left out the most important argument from the section they linked: the closer and tighter an AI race is at the international level as the world reaches strong forms of AGI and ASI, the less slack there is for things like alignment. The US and Chinese governments have the power to prohibit their own AI companies from negligently (or willfully) racing to create AI that overthrows them, if they believed that was a serious risk and wanted to prioritize stopping it. That willingness will depend on scientific and political efforts, but even if those succeed enormously, the international cooperation between the US and China will pose additional challenges. The level of conviction in risks governments would need would be much higher than to rein in their own companies without outside competition, and there would be more political challenges.

Absent an agreement with enough backing it to stick, slowdown by the US tightens the international gap in AI and means less slack (and less ability to pause when it counts) and more risk of catastrophe in the transition to AGI and ASI. That's a serious catastrophe-increasing effect of unilateral early (and ineffectual at reducing risk) pauses. You can support governments having the power to constrain AI companies from negligently destroying them, and international agreements between governments to use those powers in a coordinated fashion (taking steps to assure each other in doing so), while not supporting unilateral pause to make the AI race even tighter.

I think there are some important analogies with nuclear weapons. I am a big fan of international agreements to reduce nuclear arsenals, but I oppose the idea of NATO immediately destroying all its nuclear weapons and then suffering nuclear extortion from Russia and China (which would also still leave the risk of nuclear war between the remaining nuclear states). Unilateral reductions as a gesture of good faith that still leave a deterrent can be great, but that's much less costly than evening up the AI race (minimal arsenals for deterrence are not that large).

"So, at least when you go to the bargaining table, if not here, we need to ask for fully what we want without pre-surrendering. “Pause AI!”, not “I know it’s not realistic to pause, but maybe you could tap the brakes?” What’s realistic is to some extent what the public says is realistic."

I would think your full ask should be the international agreement between states, and companies regulated by states in accord with that, not unilateral pause by the US (currently leading by a meaningful margin) until AI competition is neck-and-neck.

And people should consider both the possibilities of ultimate success and of failure with your advocacy, and be wary of intermediate goals that make things much worse if you ultimately fail with global arrangements but make them only slightly more likely to succeed. I think it is certainly possible some kind of inclusive (e.g. including all the P-5) international deal winds up governing and delaying the AGI/ASI transition, but it is also extremely plausible that it doesn't, and I wouldn't write off consequences in the latter case.

Carl Shulman on the moral status of current and future AI systems

CarlShulman 2y 12

I have two views in the vicinity. First, there's a general issue that human moral practice generally isn't just axiology, but also includes a number of elements that are built around interacting with other people with different axiologies, e.g. different ideologies coexisting in a liberal society, different partially selfish people or family groups coexisting fairly while preferring different outcomes. Most flavors of utilitarianism ignore those elements, and ceteris paribus would, given untrammeled power, call for outcomes that would be ruinous for ~all currently existing beings, and in particular existing societies. That could be classical hedonistic utilitarianism diverting the means of subsistence from all living things as we know them to fuel more hedonium, negative-leaning views wanting to be rid of all living things with any prospects for having or causing pain or dissatisfaction, or playing double-or-nothing with the universe until it is destroyed with probability 1.

So most people have reason to oppose any form of utilitarianism getting absolute power (and many utilitarianisms would have reason to self-efface into something less scary, less dangerous, and less prone to using power in such ways, which would have a better chance of realizing more of what they value by endangering other concerns less). I touch on this in an article with Elliott Thornley.

I have an additional objection to hedonic-only views in particular, in that they don't even take as inputs many of people's concerns, and so more easily wind up hostile to particular individuals supposedly for those individuals' sake. E.g. I would prefer to retain my memories and personal identity, knowledge and autonomy, rather than be coerced into forced administration of pleasure drugs. I also would like to achieve various things in the world in reality, and would prefer that to an experience machine. A normative scheme that doesn't even take those concerns as inputs is fairly definitely going to run roughshod over them, even if some theories that take them as inputs might do so too.

Carl Shulman on the moral status of current and future AI systems

CarlShulman 2y 6

Physicalists and illusionists mostly don't agree with the identification of 'consciousness' with magical stuff or properties bolted onto the psychological or cognitive science picture of minds. All the real feelings and psychology that drive our thinking, speech and action exist. I care about people's welfare, including experiences they like, but also other concerns they have (the welfare of their children, being remembered after they die), and that doesn't hinge on magical consciousness that we, the physical organisms having this conversation, would have no access to. The illusion is of the magical part.

Re desires, the main upshot of non-dualist views of consciousness I think is responding to arguments that invoke special properties of conscious states to say they matter but not other concerns of people. It's still possible to be a physicalist and think that only selfish preferences focused on your own sense impressions or introspection matter, it just looks more arbitrary.

I think this is important because it's plausible that many AI minds will have concerns mainly focused on the external world rather than their own internal states, and running roughshod over those values because they aren't narrowly mentally-self-focused seems bad to me.

CarlShulman

Posts 8

Comments 387
