AI Could Defeat All Of Us Combined

brb243

-7

TLDR: Gaining attention aggressively in EA may be suboptimal and can be avoided.

This post uses the following stylistic aspects for these purposes:

Vague language to motivate readers to accept specific propositions because they agree with some interpretation of the vague statement

improve the odds of things going well ... little to do with anything humans value ... or something like that ... Why would this be dangerous?

Suboptimally addressed fear to keep readers' attention while decreasing the critique of their engagement

A key focus of the new series will be the threat of misaligned AI

Appeal to unity in the fight against a common enemy, to reduce questioning of or disagreement with the propositions through readers' fear of the consequences of exclusion

part where humans win ...

Implying the threat of negative consequences, and an authority's non-acceptance of an alternative focus, to motivate readers' unquestioning attention

Many people have trouble taking this "misaligned AI" possibility seriously.

Us-and-them mentality to motivate decision-making based on an intuition to side with the most aggressive authority rather than on rational evaluation of the arguments

They might see the broad point that AI could be dangerous, but they instinctively imagine that the danger comes from ways humans might misuse it. ... AI could still defeat us all

Appeal to emotion to motivate readers to accept propositions due to an emotional state of acceptance.

I'm going to try to make this idea feel more serious and real.

Appeal to the joy of submission to motivate readers' emotional state of acceptance.

I mean a literal "defeat" in the sense that we could all be killed, enslaved or forcibly contained. ... overpowering advanced technologies

Appeal to own enjoyment of siding with an aggressive authority to motivate readers' similar focus

linger on the point that if such an attack happened, it could succeed against the combined forces of the entire world

Intrusion into readers' physical spaces to motivate reasoning under stress

don't have a lot of things that could end human civilization if they "tried" sitting around

Presumption of readers' communication with the author to motivate engagement by appeal to an authority's preference.

if you don't believe that AI could defeat all of humanity combined, I expect that we're going to be miscommunicating in pretty much any conversation about AI

Use of expressions traditionally used by historically stereotypical authorities to motivate acceptance and limited questioning

The kind of AI I worry about is the kind powerful enough that total civilizational defeat is a real possibility.

Emotional emphasis of the author's attention to this topic to motivate readers' attention by care for the author

I currently spend so much time planning around speculative future technologies

Exclusion of readers from the author's work to decrease readers' engagement and critique

(instead of working on evidence-backed, cost-effective ways of helping low-income people today - which I did for much of my career, and still think is one of the best things to work on)

Implication that readers seek to refrain from reading further but are compelled by the author, to increase readers' stress

Below:

Setting standards of any presumed competition when readers are stressed to motivate acceptance of these standards and limit their questioning

rival human civilization in terms of total population and resources.

Appeal to fear of being physically hurt by a larger number of individuals to motivate engagement under stress and attention with the intent of understanding means to avert this threat

At a high level, I think we should be worried if a huge (competitive with world population) and rapidly growing set of highly skilled humans on another planet was trying to take down civilization just by using the Internet.

Appeal to the notion that one is confident fighting physically weaker individuals, while reducing that confidence (motivating shame, anger, and willingness to prove one's physical strength) to motivate action

So we should be worried about a large set of disembodied AIs as well.

Focus on the body to make readers fear for their bodies or feel observed, to motivate repetition of the propositions to avoid these perceptions

How can AIs be dangerous without bodies?

Use of allusions to physical aggression that cannot be pinpointed with certainty to motivate trust by avoiding the disappointment of siding with a malevolent authority

nip it in the bud?

Offering the possibility that one can be aggressive to motivate attention with the purpose of understanding means to act aggressively in a way accepted due to the aggressor's power to hurt and limited consequences.

Isn't it fine or maybe good if AIs defeat us? They have rights too.

Offering the possibility that one can overpower others to motivate attention with the purpose of understanding the means

how unprecedented it would be to have something on our planet capable of overpowering us all

... I skipped to the last section.

Allusion to the confirmation that one can overpower others if they act fast to motivate impulsive action

humans move slowly and don't create many AIs?

Allusion to unwanted personal engagement to motivate internalization and repetition of propositions

"dry tinder everywhere, waiting for sparks."

Deflection from implied propositions by centralizing a current issue to motivate readers to internalize the implied propositions.

I think our concern should be any AI that is able to find enough security holes to attain that kind of freedom. Given the current state of cybersecurity, that seems like a big concern.

While I agree that readers should be aware of the implied propositions, I believe that if they are comfortable and motivated to engage critically, action that is more effective and better accepted by humans would take place.

Thus, I ask the author to use stylistic aspects that gain readers' attention and acceptance through stress, fear, aggression, us/them, etc. only when this use is acknowledged and the author distances themselves from it. If these aspects are difficult to discern, draft readers should be asked to observe their emotions, and the writing should be adjusted accordingly.

I am curious about the merits of this style beyond the notion that the attention of people in traditionally aggressive systems can be captured only aggressively. I would like to challenge this notion: systems perceived as traditionally aggressive are based on care (perhaps in narrower circles than in EA); presuming the benevolence of authorities (e.g. readers' superiors) and engaging them in developing solutions solves issues more efficiently and sustainably than excluding or 'othering' them; and people like to engage in systems that prevent or solve important issues rather than in those where people are preoccupied with aggression.

I would like to suggest that only a low % of this post focuses on solutions. What would be the effects of inverting the %s of solutions coverage and attention capture?

Assuming you accept other points made in the most important century series, e.g. that AI that can do most of what humans do to advance science and technology could be developed this century. ↩
See Superintelligence chapter 6. ↩
See the "Nanotechnology blue box," in particular. ↩
- The report estimates the amount of computing power it would take to train (create) a transformative AI system, and the amount of computing power it would take to run one. This is a bounding exercise and isn't supposed to be literally predicting that transformative AI will arrive in the form of a single AI system trained in a single massive run, but here I am interpreting the report that way for concreteness and simplicity.
- As explained in the next footnote, I use the report's figures for transformative AI arriving on the soon side (around 2036). Using its central estimates instead would strengthen my point, but we'd then be talking about a longer time from now; I find it helpful to imagine how things could go in a world where AI comes relatively soon. ↩
I assume that transformative AI ends up costing about 10^14 FLOP/s to run (this is about 1/10 the Bio Anchors central estimate, and well within its error bars) and about 10^30 FLOP to train (this is about 10x the Bio Anchors central estimate for how much will be available in 2036, and corresponds to about the 30th-percentile estimate for how much will be needed based on the "short horizon" anchor). That implies that the 10^30 FLOP needed to train a transformative model could run 10^16 seconds' worth of transformative AI models, or about 300 million years' worth. This figure would be higher if we use Bio Anchors's central assumptions, rather than assumptions consistent with transformative AI being developed on the soon side. ↩
They might also run fewer copies of scaled-up models or more copies of scaled-down ones, but the idea is that the total productivity of all the copies should be at least as high as that of several hundred million copies of a human-ish model. ↩
Intel, Google ↩
Working-age population: about 65% * 7.9 billion =~ 5 billion. ↩
Humans could rent hardware using money they made from running AIs, or - if AI systems were operating on their own - they could potentially rent hardware themselves via human allies or just via impersonating a customer (you generally don't need to physically show up in order to e.g. rent server time from Amazon Web Services). ↩
(I had a speculative, illustrative possibility here but decided it wasn't in good enough shape even for a footnote. I might add it later.) ↩
I don't go into detail about how AIs might coordinate with each other, but it seems like there are many options, such as by opening their own email accounts and emailing each other. ↩
Alien invasions seem unlikely if only because we have no evidence of one in millions of years. ↩
Here's a recent comment exchange I was in on this topic. ↩
E.g., individual AI systems may occasionally get caught trying to steal, lie or exploit security vulnerabilities, due to various unusual conditions including bugs and errors. ↩
E.g., see this list of high-stakes security breaches and a list of quotes about cybersecurity, both courtesy of Luke Muehlhauser. For some additional not-exactly-rigorous evidence that at least shows that "cybersecurity is in really bad shape" is seen as relatively uncontroversial by at least one cartoonist, see: https://xkcd.com/2030/ ↩
Purchases and contracts could be carried out by human allies, or just by AI systems themselves with humans willing to make deals with them (e.g., an AI system could digitally sign an agreement and wire funds from a bank account, or via cryptocurrency). ↩
See above note about my general assumption that today's cybersecurity has a lot of holes in it. ↩
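The arithmetic in the compute footnote above (10^30 FLOP to train, 10^14 FLOP/s to run) and in the working-age population footnote can be checked with a short sketch. The figures are the ones stated in those footnotes; the function and variable names are hypothetical, chosen only for illustration, and a 365.25-day year is an assumption.

```python
# Figures quoted in the footnotes; names below are illustrative only.
TRAIN_FLOP = 10**30          # assumed compute to train one transformative model
RUN_FLOP_PER_SEC = 10**14    # assumed compute to run one copy for one second
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def copy_seconds(train_flop: int, run_flop_per_sec: int) -> int:
    """Seconds of model runtime that the training compute could pay for."""
    return train_flop // run_flop_per_sec


def copy_years(train_flop: int, run_flop_per_sec: int) -> float:
    """The same quantity expressed in years of runtime."""
    return copy_seconds(train_flop, run_flop_per_sec) / SECONDS_PER_YEAR


def working_age_population(share: float, world_population: float) -> float:
    """Working-age headcount from a population share, as in the footnote."""
    return share * world_population


if __name__ == "__main__":
    seconds = copy_seconds(TRAIN_FLOP, RUN_FLOP_PER_SEC)
    years = copy_years(TRAIN_FLOP, RUN_FLOP_PER_SEC)
    workers = working_age_population(0.65, 7.9e9)
    print(f"runtime: {seconds:.0e} seconds, about {years / 1e6:.0f} million years")
    print(f"working-age population: about {workers / 1e9:.1f} billion")
```

Under these inputs the runtime comes out a little above 300 million copy-years, consistent with the footnote's "about 300 million years' worth", and the working-age population rounds to about 5 billion.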
