What's the exact way you predict probability of AI extinction?

jackchang110

I want to know how people estimate the probability of AI takeoff causing human extinction, and which details (such as humans' attitude toward AI safety, how AI gains physical access to the world, how good AI is at tricking humans...) people consider when making these predictions. I can only find estimation "results" on the EA Forum (mostly 2-10% in this century), but I don't know how you arrive at them. Did you use complex math models to calculate them? I know we should take these predictions with a pinch of salt, but I just want to know what people consider to be the important factors in AI risk.

1 Answer

Ben_West🔸

Jun 13, 2023

This talk has one approach.

Comments (6)


harfe

Jun 13, 2023

I think most of these probability estimates are subjective. There are usually no complicated math models behind them.

Some people do make models, but then make subjective probability estimates. The math is typically not that complicated for these models, often just multiplying different probabilities together (which is imo not a good class of models for this kind of problem).

My guess would be that even some of the people who make models have different probability estimates for human extinction than the one that the model spits out, because they realize that their models have flaws and try to correct for that.
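The "multiply different probabilities together" style of model mentioned above can be sketched in a few lines. This is only an illustration: the function name and every number below are hypothetical placeholders, not estimates taken from any published model.

```python
from math import prod

def chain_probability(conditional_probs):
    """Multiply step probabilities, each assumed conditional on the earlier steps."""
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(conditional_probs)

# Hypothetical inputs for a three-step chain (placeholders, not real estimates).
low_inputs = [0.5, 0.1, 0.1]
high_inputs = [0.9, 0.5, 0.5]

low_result = chain_probability(low_inputs)    # about 0.005
high_result = chain_probability(high_inputs)  # about 0.225
print(low_result, high_result)
```

With these made-up inputs, moderate disagreement at each step compounds into roughly a 45-fold gap in the final number, which is one reason the output of such a model is usually treated as a rough guide rather than a measurement.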

jackchang110

Jun 14, 2023

So the predictions experts make are all purely "subjective"? I would expect some logical arguments, or maybe something like a Fermi estimate, to explain how each expert arrives at their number, unless it's mostly intuition.

brunoparga

Jun 16, 2023

[This article](https://slatestarcodex.com/2013/05/02/if-its-worth-doing-its-worth-doing-with-made-up-statistics/) explores why it is useful to work with subjective, "made-up" statistics.

My own view hinges on the following:

- instrumental convergence: agents will tend to try and accumulate some kinds of resources like money, regardless of what their goals are;
- value-capabilities orthogonality (often known as just "the orthogonality thesis"): regardless of their capabilities, agents might have pretty much any kind of goal;
- the fact that most possible goals are incompatible with human thriving (we need a very specific set of conditions to survive, let alone thrive);
- the fact that current AI capabilities are growing, the growth rate seems to be increasing, and that there are strong economic incentives to keep pushing them forward.

These factors lead me to think we have significantly worse than even odds (that is, <50%) of surviving this century.

titotal

Jun 16, 2023

I'm also quite interested in how these estimates are being made, so can I ask you for more detail about how you got your estimate?

In particular, I'm interested in the "chain of events" involved. AI extinction involves several consecutive speculative events. What are your estimates for the following, conditional on the previous steps occurring?

1. at least one AGI is built this century
2. at least one of these AGIs is motivated to conquer and wipe out humanity
3. at least one of the rebellious AGIs successfully conquers and destroys humanity

Did your >50% estimate come from reasoning like this about each step?
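Written out, this step-by-step framing is the chain rule of probability. As a sketch, let $A$, $M$, and $S$ (labels introduced here for illustration, not used elsewhere in the thread) stand for "at least one AGI is built this century", "at least one of these AGIs is motivated to wipe out humanity", and "at least one of them succeeds":

```latex
P(A \cap M \cap S) = P(A)\, P(M \mid A)\, P(S \mid A \cap M)
```

Each factor is at most 1, so the overall estimate can never exceed the smallest of the individual step estimates.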

brunoparga

Jun 25, 2023

I think 1 is >95% likely. We're in an arms race dynamic for at least some of the components of AGI. This is conditional on us not having been otherwise wiped out (by war, pandemic, asteroid, etc).

I think 2 and 3 are the wrong way to think about the question. Was humankind "motivated to conquer" the dodo? Or did we just have a better use for its habitat, and its extinction was just a whoopsie in the process?

titotal

Jun 26, 2023

> I think 2 and 3 are the wrong way to think about the question. Was humankind "motivated to conquer" the dodo? Or did we just have a better use for its habitat, and its extinction was just a whoopsie in the process?

When I say "motivated to", I don't mean that it would be its primary motivation. I mean that it has motivations that, at some point, would lead to it having "perform actions that would kill all of humanity" as a sub-goal. And in order to get to the point where we were dodos to it, it would have to disempower humanity somehow.

Would you prefer the following restatement, each conditional on the previous step:

1. At least one AGI is built in our lifetimes
2. At least one of these AGIs has motivations that include "disempower humanity" as a sub-goal
3. At least one of these disempowerment attempts is successful

And then either:

4a: The process of disempowering humanity involves wiping out all of humanity

4b: After successfully disempowering humanity with some of humanity still intact, the AI ends up wiping out the rest of humanity anyway
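One way to read this restated chain as a calculation is sketched below. It assumes 4a and 4b are mutually exclusive routes to extinction once disempowerment has succeeded, which is an interpretation rather than something stated in the thread. The function name, parameter names, and numbers are all hypothetical placeholders, not anyone's actual estimates.

```python
from math import prod

def extinction_probability(p_steps, p_4a, p_4b_given_not_4a):
    """Combine the chain of steps with the two extinction branches.

    p_steps: probabilities for steps 1-3, each conditional on the previous ones.
    p_4a: P(disempowerment itself wipes out everyone | steps 1-3).
    p_4b_given_not_4a: P(survivors are wiped out later | steps 1-3, not 4a).
    """
    for p in (*p_steps, p_4a, p_4b_given_not_4a):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    p_disempowered = prod(p_steps)
    # 4a and 4b are treated as mutually exclusive, so their probabilities add.
    p_wiped_out = p_4a + (1.0 - p_4a) * p_4b_given_not_4a
    return p_disempowered * p_wiped_out

# Placeholder inputs chosen only to make the arithmetic easy to follow.
example = extinction_probability([0.5, 0.5, 0.5], p_4a=0.5, p_4b_given_not_4a=0.5)
print(example)  # 0.125 * 0.75 = 0.09375
```

Splitting the last step into branches only matters if the two branches get different estimates; if extinction is considered certain after disempowerment, the last factor is 1 and the result reduces to the three-step product.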
