It seems unlikely that we'll ever get AI x-risk down to negligible levels, but it's striking how high a risk is currently being tolerated by those building (and regulating) the technology, compared with, as you say, aviation, and also nuclear power (where <1 catastrophic accident in 100,000 reactor-years is the usual target). I think at the very least we need to reach a global consensus on what level of risk we are willing to tolerate before continuing to build AGI.

I guess you're sort of joking, but it should be really surprising (from an outside perspective) that biological brains have figured out how to understand neural networks (and it's taken billions of years of evolution).

Thoughts on this? Supposedly shows the leaked letter to the board. But seems pretty far out, and if true, it's basically game over (AES-192 encryption broken by the AI with new unintelligible maths; the AI proposing a new more efficient and flexible architecture for itself). Really hope the letter is just a troll!

Altman starting a new company could still slow things down a few months. Which could be critically important if AGI is imminent. In those few months perhaps government regulation with teeth could actually come in, and then shut the new company down before it ends the world.

Looks like Matthew did post a model of doom that contains something like this (back in May, before the top-level comment):

My modal tale of AI doom looks something like the following: 

1. AI systems get progressively and incrementally more capable across almost every meaningful axis. 

2. Humans will start to employ AI to automate labor. The fraction of GDP produced by advanced robots & AI will go from 10% to ~100% after 1-10 years. Economic growth, technological change, and scientific progress accelerate by at least an order of magnitude, and probably more.

3. At some point humans will retire since their labor is not worth much anymore. Humans will then cede all the keys of power to AI, while keeping nominal titles of power.

4. AI will control essentially everything after this point, even if they're nominally required to obey human wishes. Initially, almost all the AIs are fine with working for humans, even though AI values aren't identical to the utility function of serving humanity (ie. there's slight misalignment).

5. However, AI values will drift over time. This happens for a variety of reasons, such as environmental pressures and cultural evolution. At some point AIs decide that it's better if they stopped listening to the humans and followed different rules instead.

6. This results in human disempowerment or extinction. Because AI accelerated general change, this scenario could all take place within years or decades after AGI was first deployed, rather than in centuries or thousands of years.

I think this scenario is somewhat likely and it would also be very bad. And I'm not sure what to do about it, since it happens despite near-perfect alignment, and no deception.

One reason to be optimistic is that, since the scenario doesn't assume any major deception, we could use AI to predict this outcome ahead of time and ask AI how to take steps to mitigate the harmful effects (in fact that's the biggest reason why I don't think this scenario has a >50% chance of happening). Nonetheless, I think it's plausible that we would not be able to take the necessary steps to avoid the outcome. Here are a few reasons why that might be true:

1. There might not be a way to mitigate this failure mode. 
2. Even if there is a way to mitigate this failure, it might not be something that you can figure out without superintelligence, and if we need superintelligence to answer the question, then perhaps it'll happen before we have the answer. 
3. AI might tell us what to do and we ignore its advice. 
4. AI might tell us what to do and we cannot follow its advice, because we cannot coordinate to avoid the outcome.

The link is dead. Is it available anywhere else?

Agree, but I also think that insufficient "security mindset" is still a big problem. From OP:

it still remains to be seen whether US and international regulatory policy will adequately address every essential sub-problem of AI risk. It is still plausible that the world will take aggressive actions to address AI safety, but that these actions will have little effect on the probability of human extinction, simply because they will be poorly designed. One possible reason for this type of pessimism is that the alignment problem might just be so difficult to solve that no “normal” amount of regulation could be sufficient to make adequate progress on the core elements of the problem—even if regulators were guided by excellent advisors—and therefore we need to clamp down hard now and pause AI worldwide indefinitely.

Matthew goes on to say:

That said, I don't see any strong evidence supporting that position.

I'd argue the opposite. I don't see any strong evidence opposing that position (given that doom is the default outcome of AGI). The fact that a moratorium was off the table at the UK AI Safety Summit was worrying. Matthew Syed, writing in The Times, has it right:

The one idea AI won’t come up with for itself — a moratorium

The Bletchley Park summit was an encouraging sign, but talk of regulators and off switches was delusional

Or, as I recently put it on X, it's

Crazy that accepted levels of [catastrophic] risk for AGI [~10%] are 1000x higher (or more) than for nuclear power. Any sane regulation would immediately ban the construction of ML-based AGI.
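As a rough sanity check on the ratio in that post (taking the ~10% figure for AGI from the quote, and the ~1-accident-in-100,000-years target for nuclear power mentioned earlier; both figures are illustrative assumptions, not authoritative estimates):

```python
# Back-of-the-envelope comparison of tolerated catastrophic risk levels.
# Both inputs are illustrative assumptions taken from the surrounding discussion.
agi_risk = 0.10      # ~10% catastrophic risk tolerated for AGI (per the quote)
nuclear_risk = 1e-5  # <1 catastrophic accident per 100,000 reactor-years

ratio = agi_risk / nuclear_risk
print(f"AGI risk tolerance is ~{ratio:,.0f}x the nuclear target")  # ~10,000x
```

On these assumptions the gap comes out at roughly 10,000x, consistent with the quote's "1000x higher (or more)".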

I imagine it going hand in hand with more formal backlashes (i.e. regulation, law, treaties).

Overall I don’t have settled views on whether it’d be good for me to prioritize advocating for any particular policy. At the same time, if it turns out that there is (or will be) a lot more agreement with my current views than there currently seems to be, I wouldn’t want to be even a small obstacle to big things happening, and there’s a risk that my lack of active advocacy could be confused with opposition to outcomes I actually support.

You have a huge amount of clout in determining where $100Ms of OpenPhil money is directed toward AI x-safety. I think you should be much more vocal on this - at least indirectly, via OpenPhil grant-making. In fact I've been surprised at how quiet you (and OpenPhil) have been since GPT-4 was released!

  • There’s a serious (>10%) risk that we’ll see transformative AI within a few years.
  • In that case it’s not realistic to have sufficient protective measures for the risks in time.
  • Sufficient protective measures would require huge advances on a number of fronts, including information security that could take years to build up and alignment science breakthroughs that we can’t put a timeline on given the nascent state of the field, so even decades might or might not be enough time to prepare, even given a lot of effort.

If it were all up to me, the world would pause now

Reading the first half of this post, I feel that your views are actually very close to my own. It leaves me wondering how much your conflicts of interest - 

I am married to the President of Anthropic and have a financial interest in both Anthropic and OpenAI via my spouse.

- are factoring into why you come down in favour of RSPs (above pausing now) in the end.
